Assistants API status
Meridian is built around blazing-fast, stateless chat completions. The Assistants API — with managed threads, tool use, and persistent context — is on our near-term roadmap. In the meantime, you can build equivalent functionality yourself with a few straightforward patterns.
What ships today
/v1/chat/completions— OpenAI-compatible endpoint with streaming, system prompts, temperature, top-p, and stop sequences.- Model routing across GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro — select via the
modelfield. - Usage-based billing with per-token pricing. No seat licenses, no minimum commit.
Build threads yourself
The Assistants API is fundamentally a convenience layer over chat completions. You can replicate threads and persistence today:
- Store messages. Persist each user message and assistant response in your database keyed by
thread_id. - Reconstruct context. On each new request, fetch the full message history for that thread and pass it as the
messagesarray. - Truncate intelligently. When the context window fills, drop oldest messages or summarize them with a lightweight model call.
- Add tool use. Parse the assistant response for function-call delimiters, execute the function server-side, and feed the result back as a
toolrole message.
Roadmap
Server-side thread storage with automatic context window management. No more rolling your own message persistence.
Define tools in your API requests and let Meridian handle execution loops, retries, and result injection automatically.
Upload files, chunk them, embed them, and query via the Assistants API — all managed infrastructure.
Want to influence the roadmap or get early access? Email the team — we ship based on customer demand.