Embeddings
Dense vectors for retrieval, clustering, and classification. Batch up to 512 inputs per call.
Create embeddings
POST /v1/embeddings
OpenAI-compatible. Returns a dense vector per input. Batch up to 512 inputs per call.
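Because each call accepts at most 512 inputs, a larger corpus has to be split client-side before it is sent. The following is a minimal sketch of that splitting step; the helper name `chunk_inputs` and the constant `MAX_BATCH` are illustrative and not part of the API.

```python
from typing import Iterator, List

# Per-call input limit stated in this reference.
MAX_BATCH = 512


def chunk_inputs(texts: List[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each holding at most `size` items.

    Hypothetical helper for illustration; order is preserved so that results
    can be concatenated back in input order.
    """
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


if __name__ == "__main__":
    corpus = [f"document {i}" for i in range(1200)]
    batch_sizes = [len(batch) for batch in chunk_inputs(corpus)]
    print(batch_sizes)  # [512, 512, 176]
```

Each yielded batch can then be sent as the `input` array of one request. Note that this only enforces the input count; the per-input token cap still has to be respected separately.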
Body
model · string · Required
Model id. Defaults vary by deployment; bge-m3 and e5-mistral-7b are common.
input · string | string[] · Required
Single string or array of strings. Per-input cap: 8,192 tokens.
encoding_format · string · Optional
float (default) or base64.
dimensions · integer · Optional
Requested output vector size. Applies Matryoshka dimension reduction on models that support it.
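The body fields above can be assembled as in the sketch below, which also shows one way to decode a `base64` embedding. Two things here are assumptions rather than documented behavior: the helper names are hypothetical, and the base64 payload is assumed to be packed little-endian float32 values (the convention used by OpenAI's API). Verify the byte layout against your deployment before relying on it.

```python
import base64
import json
import struct
from typing import List, Optional, Union


def build_embedding_request(
    model: str,
    inputs: Union[str, List[str]],
    encoding_format: str = "float",
    dimensions: Optional[int] = None,
) -> dict:
    """Build the JSON body for POST /v1/embeddings (hypothetical helper)."""
    if encoding_format not in ("float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")
    body = {"model": model, "input": inputs, "encoding_format": encoding_format}
    if dimensions is not None:
        # Only meaningful for models that support Matryoshka reduction.
        body["dimensions"] = dimensions
    return body


def decode_base64_embedding(payload: str) -> List[float]:
    """Decode a base64 embedding string into a list of floats.

    ASSUMPTION: the bytes are little-endian float32 values.
    """
    raw = base64.b64decode(payload)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


if __name__ == "__main__":
    body = build_embedding_request("bge-m3", ["hello", "world"], dimensions=256)
    print(json.dumps(body))
```

Omitting `dimensions` when it is not needed keeps the request valid for models that do not support reduction.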
Response
```json
{
  "object": "list",
  "model": "bge-m3",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.012, -0.044, ...]}
  ],
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
```
Models
| Model | Dim | Context | Notes |
|---|---|---|---|
| bge-m3 | 1024 | 8,192 | Multilingual, default for RAG. |
| e5-mistral-7b | 4096 | 32,768 | Strong on long-document retrieval. |
| gte-small-en | 384 | 512 | Cheap, fast, English-only. |
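On the client side, the response shape and the dimensions in the table can be combined into a simple sanity check. The sketch below pulls vectors out of a `float`-format response in `index` order and verifies their length. The `MODEL_DIMS` mapping is copied from the table, and `extract_vectors` is a hypothetical helper; if a request set `dimensions`, pass that value as `expected_dim` instead of relying on the table.

```python
from typing import Dict, List, Optional

# Native output dimensions, copied from the models table.
MODEL_DIMS: Dict[str, int] = {
    "bge-m3": 1024,
    "e5-mistral-7b": 4096,
    "gte-small-en": 384,
}


def extract_vectors(response: dict, expected_dim: Optional[int] = None) -> List[List[float]]:
    """Return embeddings ordered by their `index` field.

    Raises ValueError if a vector's length differs from `expected_dim`, or from
    the table value for the response's model when `expected_dim` is not given.
    Models missing from MODEL_DIMS are not checked.
    """
    items = sorted(response["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    dim = expected_dim if expected_dim is not None else MODEL_DIMS.get(response["model"])
    if dim is not None:
        for position, vector in enumerate(vectors):
            if len(vector) != dim:
                raise ValueError(
                    f"embedding {position} has {len(vector)} dimensions, expected {dim}"
                )
    return vectors


if __name__ == "__main__":
    # Synthetic response for illustration only; values are not real model output.
    fake_response = {
        "object": "list",
        "model": "gte-small-en",
        "data": [
            {"object": "embedding", "index": 1, "embedding": [0.2] * 384},
            {"object": "embedding", "index": 0, "embedding": [0.1] * 384},
        ],
        "usage": {"prompt_tokens": 4, "total_tokens": 4},
    }
    vectors = extract_vectors(fake_response)
    print(len(vectors), len(vectors[0]))  # 2 384
```

Sorting by `index` is a defensive choice: it keeps vectors aligned with the input array even if a server returns items out of order.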