Embeddings

Dense vectors for retrieval, clustering, and classification. Batch up to 512 inputs per call.

Create embeddings

POST /v1/embeddings

OpenAI-compatible. Returns one dense vector per input and accepts up to 512 inputs per call.
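As a rough illustration of the request shape, the sketch below builds (but does not send) a call to this endpoint using only the Python standard library. The base URL, the API key, and the Bearer authorization header are assumptions borrowed from the usual OpenAI convention, not values documented on this page; substitute whatever your deployment requires.

```python
import json
import urllib.request

# Hypothetical values: replace with your deployment's base URL and key.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_embeddings_request(model, inputs, base_url=BASE_URL, api_key=API_KEY):
    """Build (but do not send) a POST /v1/embeddings request."""
    payload = {"model": model, "input": inputs}
    return urllib.request.Request(
        url=f"{base_url}/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer-token auth, as in the OpenAI convention.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_embeddings_request("bge-m3", ["hello world", "goodbye"])
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen` and parse the response body with `json.load`.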

Body

model (string, required)

Model id. Defaults vary by deployment; bge-m3 and e5-mistral-7b are common.

input (string | string[], required)

Single string or array of strings. Per-input cap: 8,192 tokens.
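Because a call accepts at most 512 inputs, larger collections have to be split client-side. The helper below is a minimal sketch of that chunking; the name `batched` and the constant `MAX_BATCH` are illustrative. It does not check the 8,192-token per-input cap, since token counts depend on the model's tokenizer.

```python
from typing import Iterator, List

MAX_BATCH = 512  # documented per-call input limit


def batched(inputs: List[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of `inputs`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(inputs), size):
        yield inputs[start:start + size]


docs = [f"document {i}" for i in range(1200)]
batch_sizes = [len(batch) for batch in batched(docs)]
print(batch_sizes)  # [512, 512, 176]
```

Each yielded batch can be sent as the `input` array of one request.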

encoding_format (string, optional)

float (default) or base64.
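This page does not specify the byte layout of a base64 embedding. The sketch below assumes the layout used by the OpenAI API, a packed array of little-endian float32 values; verify that assumption against your deployment before relying on it.

```python
import base64
import struct
from typing import List


def decode_embedding(b64: str) -> List[float]:
    """Decode a base64 embedding, ASSUMING packed little-endian float32 values."""
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that are exactly representable in float32.
sample = base64.b64encode(struct.pack("<3f", 0.5, -0.25, 1.0)).decode("ascii")
print(decode_embedding(sample))  # [0.5, -0.25, 1.0]
```

Under that assumption, base64 output costs 4 bytes per dimension before encoding, which is usually more compact on the wire than a JSON array of decimal floats.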

dimensions (integer, optional)

Requested output dimensionality. Applied as Matryoshka dimension reduction on models that support it.
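For intuition, Matryoshka-style reduction is commonly described as keeping the first k components of a vector and rescaling the result to unit length. The sketch below shows that operation client-side; it illustrates the general technique and is not a statement of how this API implements `dimensions`. The function name is hypothetical.

```python
import math
from typing import List


def truncate_and_normalize(vec: List[float], dims: int) -> List[float]:
    """Keep the first `dims` components and rescale to unit L2 norm."""
    if not 1 <= dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


full = [3.0, 4.0, 12.0, 84.0]
short = truncate_and_normalize(full, 2)
print(short)  # [0.6, 0.8]
```

Renormalizing matters when vectors are compared by dot product, since a truncated prefix no longer has unit length.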

Response

json
{
  "object": "list",
  "model": "bge-m3",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.012, -0.044, ...]}
  ],
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
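A response in this shape can be unpacked as sketched below. The payload uses short, made-up vectors for readability. Sorting by `index` assumes, as in the OpenAI convention, that `index` is the position of the corresponding input in the request; the function name is illustrative.

```python
import json
from typing import List


def extract_embeddings(response: dict) -> List[List[float]]:
    """Return the embeddings ordered by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Illustrative payload; real vectors are far longer than two components.
raw = """
{
  "object": "list",
  "model": "bge-m3",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.5, 0.25]},
    {"object": "embedding", "index": 0, "embedding": [0.012, -0.044]}
  ],
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
"""
response = json.loads(raw)
vectors = extract_embeddings(response)
print(vectors[0])  # [0.012, -0.044]
print(response["usage"]["total_tokens"])  # 8
```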

Models

| Model | Dimensions | Context (tokens) | Notes |
| --- | --- | --- | --- |
| bge-m3 | 1024 | 8,192 | Multilingual, default for RAG. |
| e5-mistral-7b | 4096 | 32,768 | Strong on long-document retrieval. |
| gte-small-en | 384 | 512 | Cheap, fast, English-only. |
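The dimension column translates directly into storage cost. As a back-of-the-envelope sketch, the snippet below estimates raw vector storage per million embeddings, assuming vectors are stored as uncompressed float32 (4 bytes per component) with no index overhead.

```python
# Dimensions taken from the models table above.
MODEL_DIMS = {"bge-m3": 1024, "e5-mistral-7b": 4096, "gte-small-en": 384}
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as uncompressed float32


def raw_storage_bytes(model: str, num_vectors: int) -> int:
    """Raw bytes needed to store `num_vectors` embeddings for `model`."""
    return MODEL_DIMS[model] * BYTES_PER_FLOAT32 * num_vectors


for name in MODEL_DIMS:
    mib = raw_storage_bytes(name, 1_000_000) / 2**20
    print(f"{name}: {mib:,.0f} MiB per million vectors")
```

The roughly tenfold gap between gte-small-en and e5-mistral-7b is one reason to weigh dimensionality alongside retrieval quality when choosing a model.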