Recipe

Hugging Face primer

Hugging Face is the de-facto registry for open-weight models, datasets, and inference endpoints. This primer walks through pulling a model into your Meridian pipeline in under five minutes.

1. Authenticate

Create a read token at huggingface.co/settings/tokens and export it as an environment variable. Meridian reads HF_TOKEN automatically when it boots the gateway.

export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

2. Pull a model

Use the Meridian CLI to register a Hugging Face model as a routable alias. Once registered, the model is callable through the standard OpenAI-compatible chat endpoint at llm.getnimbus.net/v1. Cold-start latency is typically under three seconds for 7B parameter models.

3. Call from your app

Point the OpenAI SDK at the Meridian base URL and pass the alias as the model name. Streaming, tool calls, and JSON mode all work the same way they would against any first-party provider, with the added benefit of unified billing across every Hugging Face checkpoint you wire up.