tiktoken primer

tiktoken is the byte-pair encoding tokenizer used by OpenAI and compatible Meridian routes. Counting tokens before you ship a prompt is the cheapest way to avoid surprise bills and mid-stream truncation. This recipe walks you through installing tiktoken, counting tokens for a chat payload, and choosing the right encoding for the model you are targeting.

1. Install the library

tiktoken ships as a native Python wheel. Install it from PyPI in the same virtualenv you use to call the Meridian gateway.

pip install tiktoken
python -c "import tiktoken; print(tiktoken.__version__)"

2. Pick an encoding

Use cl100k_base for GPT-4 family and o200k_base for GPT-4o and reasoning models. Meridian routes accept either; mismatched counts only affect your local estimate.

import tiktoken
enc = tiktoken.get_encoding("o200k_base")
tokens = enc.encode("hello meridian")
print(len(tokens))

3. Count a chat payload

Chat messages add a small per-message overhead. Sum the content tokens, add four tokens per message for role and separator framing, then add three for the priming of the assistant turn.

def count_chat(messages, enc):
    total = 3
    for m in messages:
        total += 4 + len(enc.encode(m["content"]))
    return total

← Back to all recipes