tiktoken primer
tiktoken is the byte-pair encoding tokenizer used by OpenAI and compatible Meridian routes. Counting tokens before you ship a prompt is the cheapest way to avoid surprise bills and mid-stream truncation. This recipe walks you through installing tiktoken, counting tokens for a chat payload, and choosing the right encoding for the model you are targeting.
1. Install the library
tiktoken ships as a native Python wheel. Install it from PyPI in the same virtualenv you use to call the Meridian gateway.
pip install tiktoken python -c "import tiktoken; print(tiktoken.__version__)"
2. Pick an encoding
Use cl100k_base for GPT-4 family and o200k_base for GPT-4o and reasoning models. Meridian routes accept either; mismatched counts only affect your local estimate.
import tiktoken
enc = tiktoken.get_encoding("o200k_base")
tokens = enc.encode("hello meridian")
print(len(tokens))3. Count a chat payload
Chat messages add a small per-message overhead. Sum the content tokens, add four tokens per message for role and separator framing, then add three for the priming of the assistant turn.
def count_chat(messages, enc):
total = 3
for m in messages:
total += 4 + len(enc.encode(m["content"]))
return total