Request / response shape
POST https://api.openai.com/v1/embeddings
Content-Type: application/json
Authorization: Bearer $OPENAI_API_KEY
{
"input": ["Your text here", "Another chunk"],
"model": "text-embedding-3-small",
"dimensions": 1536
}// Response
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [0.0012, -0.0045, ...] // length = dimensions
}
],
"model": "text-embedding-3-small",
"usage": { "prompt_tokens": 8, "total_tokens": 8 }
}Model comparison
| Model | Max dimensions | Default dims | MTEB score | Pricing |
|---|---|---|---|---|
| text-embedding-3-small | 1,536 | 1,536 | 62.3% | $0.02 / 1M tokens |
| text-embedding-3-large | 3,072 | 3,072 | 64.6% | $0.13 / 1M tokens |
| text-embedding-ada-002 | 1,536 | 1,536 | 61.0% | $0.10 / 1M tokens |
text-embedding-3-small supports a dimensions parameter (8–1536). Lower dimensions reduce storage and speed up similarity search with minimal accuracy loss. ada-002 is fixed at 1536 and does not accept the parameter.
Python
import openai
client = openai.OpenAI()
resp = client.embeddings.create(
model="text-embedding-3-small",
input=["Meridian is a vector database."],
dimensions=512, # optional; omit for full 1536
)
vector = resp.data[0].embedding
print(f"Dimension: {len(vector)}") # 512TypeScript
const resp = await fetch("https://api.openai.com/v1/embeddings", {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
},
body: JSON.stringify({
model: "text-embedding-3-small",
input: ["Meridian is a vector database."],
dimensions: 512,
}),
});
const json = await resp.json();
const vector: number[] = json.data[0].embedding;
console.log(vector.length); // 512Pooling and tokenization
OpenAI embeddings use a transformer encoder. The final embedding is produced by mean-pooling token-level hidden states across the sequence dimension, then applying an L2 normalization layer. You do not need to implement pooling yourself — the API returns a single vector per input string.
- Chunk size matters. The model has an 8,191-token context window for
text-embedding-3-smalland 2,046 forada-002. Inputs exceeding the limit are truncated. - Cosine similarity is standard. Since vectors are L2-normalized, dot product equals cosine similarity. Use it for nearest-neighbor search.
- Batch when possible. Send up to 2,048 input strings per request to reduce round-trips and improve throughput.