Roadmap

Assistants API status

Meridian is built around blazing-fast, stateless chat completions. The Assistants API — with managed threads, tool use, and persistent context — is on our near-term roadmap. In the meantime, you can build equivalent functionality yourself with a few straightforward patterns.

What ships today

  • /v1/chat/completions — OpenAI-compatible endpoint with streaming, system prompts, temperature, top-p, and stop sequences.
  • Model routing across GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro — select via the model field.
  • Usage-based billing with per-token pricing. No seat licenses, no minimum commit.

Build threads yourself

The Assistants API is fundamentally a convenience layer over chat completions. You can replicate threads and persistence today:

  1. Store messages. Persist each user message and assistant response in your database keyed by thread_id.
  2. Reconstruct context. On each new request, fetch the full message history for that thread and pass it as the messages array.
  3. Truncate intelligently. When the context window fills, drop oldest messages or summarize them with a lightweight model call.
  4. Add tool use. Parse the assistant response for function-call delimiters, execute the function server-side, and feed the result back as a tool role message.

Roadmap

Managed threadsQ3 2025

Server-side thread storage with automatic context window management. No more rolling your own message persistence.

Built-in tool callingQ4 2025

Define tools in your API requests and let Meridian handle execution loops, retries, and result injection automatically.

File & vector storeTBD

Upload files, chunk them, embed them, and query via the Assistants API — all managed infrastructure.

Want to influence the roadmap or get early access? Email the team — we ship based on customer demand.