Token-level observability

Logprobs + top tokens

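Every token returned by Meridian carries a log-probability score. Set logprobs=true to include these scores in the response, and set top_logprobs=5 to also return the five most likely candidates at each position, with their scores. The chosen token can itself appear among those candidates.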

Quick start

curl https://api.getnimbus.net/v1/chat/completions \
  -H "Authorization: Bearer $MERIDIAN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meridian-1",
    "messages": [{"role":"user","content":"Explain entropy"}],
    "logprobs": true,
    "top_logprobs": 5
  }'

The response payload includes a logprobs object on every choice. Its content array holds one entry per generated token, with that token's score and its top five candidates.

Confidence scoring

Aggregate token logprobs into a sequence-level confidence metric. Low-confidence spans can indicate hallucinations, ambiguous completions, or out-of-distribution inputs, which makes these metrics useful signals for guardrails and human-in-the-loop workflows.

  • Mean logprob: average over all generated tokens; suited to simple threshold gating.
  • Min logprob: the single least-certain token in a response.
  • Entropy gap: difference in logprob between the chosen token and the runner-up; narrow gaps signal indecision.
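
The three metrics above can be sketched in a few lines of JavaScript. This is a minimal illustration, not an official Meridian helper: the function names and the -1.0 threshold are hypothetical, and `content` is assumed to be the choices[0].logprobs.content array from the response.

```javascript
// Sketch: sequence-level confidence from per-token logprobs.
// `content` is assumed to be choices[0].logprobs.content from a response.

function meanLogprob(content) {
  if (content.length === 0) return NaN;
  const total = content.reduce((sum, t) => sum + t.logprob, 0);
  return total / content.length;
}

function minLogprob(content) {
  // Returns the least-certain token entry, or null for an empty response.
  return content.reduce(
    (worst, t) => (worst === null || t.logprob < worst.logprob ? t : worst),
    null
  );
}

function entropyGaps(content) {
  // Gap = chosen logprob minus the best alternative's logprob.
  // The chosen token may itself appear in top_logprobs, so it is filtered
  // out before picking the runner-up.
  return content.map((t) => {
    const runnerUp = (t.top_logprobs ?? [])
      .filter((alt) => alt.token !== t.token)
      .sort((a, b) => b.logprob - a.logprob)[0];
    return runnerUp ? t.logprob - runnerUp.logprob : Infinity;
  });
}

// Hypothetical gate: the -1.0 threshold is illustrative, not a recommended value.
function needsReview(content, meanThreshold = -1.0) {
  return meanLogprob(content) < meanThreshold;
}
```

A gap can be negative when sampling picked a token that was not the most likely one. Thresholds are best tuned on your own traffic rather than copied from this sketch.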

RAG re-ranking

When Meridian generates an answer grounded in retrieved chunks, logprobs show how confident the model was in each answer token. Re-rank candidate passages by the mean logprob of the tokens they contributed. This is a signal based on the generated answer itself, so it can complement embedding cosine similarity rather than replace it.

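// Sketch: re-rank chunks by the mean logprob of the answer tokens they contain.
// Assumes `chunks` is [{ text }] and `answerTokens` is choices[0].logprobs.content.
// Attribution by substring match is a simplifying heuristic, not a Meridian feature.
const scores = chunks.map((chunk, i) => {
  const text = chunk.text.toLowerCase();
  const hits = answerTokens.filter((t) => {
    const piece = t.token.trim().toLowerCase();
    return piece.length > 0 && text.includes(piece);
  });
  const lp = hits.reduce((sum, t) => sum + t.logprob, 0);
  // Chunks with no matching tokens sort last instead of dividing by zero.
  return { index: i, score: hits.length ? lp / hits.length : -Infinity };
});
scores.sort((a, b) => b.score - a.score);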

Response shape

{
  "choices": [{
    "logprobs": {
      "content": [
        {
          "token": " Ent",
          "logprob": -0.12,
          "top_logprobs": [
            { "token": " Ent", "logprob": -0.12 },
            { "token": " The", "logprob": -2.41 },
            { "token": " In",  "logprob": -3.10 },
            { "token": " A",   "logprob": -4.05 },
            { "token": " Sh",  "logprob": -5.20 }
          ]
        }
      ]
    }
  }]
}
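
To read this shape in code, it can help to convert logprobs into plain probabilities. The sketch below assumes the scores are natural logarithms, which this page does not state explicitly, so treat that as an assumption to verify. The function name tokenProbabilities is hypothetical.

```javascript
// Sketch: flatten a response into per-token probabilities.
// Assumption: logprob values are natural logs, so probability = exp(logprob).
function tokenProbabilities(response, choiceIndex = 0) {
  const content = response.choices?.[choiceIndex]?.logprobs?.content ?? [];
  return content.map((entry) => ({
    token: entry.token,
    probability: Math.exp(entry.logprob),
    alternatives: (entry.top_logprobs ?? []).map((alt) => ({
      token: alt.token,
      probability: Math.exp(alt.logprob),
    })),
  }));
}

// Usage with a trimmed-down payload in the shape shown above.
const exampleResponse = {
  choices: [{
    logprobs: {
      content: [{
        token: " Ent",
        logprob: -0.12,
        top_logprobs: [
          { token: " Ent", logprob: -0.12 },
          { token: " The", logprob: -2.41 },
        ],
      }],
    },
  }],
};
console.log(tokenProbabilities(exampleResponse));
```

A response without logprobs (for example, when logprobs=true was not sent) yields an empty array rather than an error.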

Streaming responses emit logprobs incrementally on each chunk — no buffering required.
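
Because logprobs arrive with each chunk, a running confidence score can be maintained while the response streams. The sketch below makes an assumption this page does not confirm: that each parsed stream chunk carries its new token entries under choices[0].logprobs.content, mirroring the non-streaming shape. The createLogprobTracker helper is hypothetical.

```javascript
// Sketch: running confidence over a streamed response.
// Assumption: each parsed chunk looks like
//   { choices: [{ logprobs: { content: [ ...entries for the new tokens... ] } }] }
// and chunks without logprobs (for example a final stop chunk) omit the field.
function createLogprobTracker() {
  let count = 0;
  let sum = 0;
  let min = Infinity;

  // Current totals; mean and min are NaN until at least one token is seen.
  const snapshot = () => ({
    count,
    mean: count ? sum / count : NaN,
    min: count ? min : NaN,
  });

  // Fold one parsed chunk into the running totals and return the new state.
  const push = (chunk) => {
    const entries = chunk.choices?.[0]?.logprobs?.content ?? [];
    for (const entry of entries) {
      count += 1;
      sum += entry.logprob;
      min = Math.min(min, entry.logprob);
    }
    return snapshot();
  };

  return { push, snapshot };
}
```

Call push once per parsed chunk; the returned snapshot can drive an early stop or a review flag before the stream finishes.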

Ready to calibrate your outputs?

Grab an API key and start sending logprobs=true today.
