
Chat completions

Drop-in OpenAI chat API with full support for streaming, tools, JSON mode, and structured outputs.

Create a chat completion

POST /v1/chat/completions

OpenAI-compatible. Full reference below; for migration notes see the compatibility guide.

Body

model (string, Required)

Model ID. See /v1/models for what is available on your key.

messages (Message[], Required)

Conversation history. Each message has role (system | user | assistant | tool) and content (string or content-parts array for multimodal models).

temperature (number, Optional)

0–2. Default 1. Lower = more deterministic.

top_p (number, Optional)

Nucleus sampling cutoff. Default 1.

max_tokens (integer, Optional)

Upper bound on completion length.

stream (boolean, Optional)

If true, tokens are delivered incrementally as server-sent events. Default false.

tools (Tool[], Optional)

Function schemas the model may call.

tool_choice (string | object, Optional)

auto (default), none, required, or {"type":"function","function":{"name":"..."}}.
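As a sketch, here is a request body that declares one tool and forces the model to call it. The get_weather function and its schema are illustrative, not part of the API:

```python
import json

# Illustrative body: one declared function, with tool_choice forcing a call
# to it. get_weather and its parameters are examples, not built-ins.
body = {
    "model": "qwen3.5-4b",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}

payload = json.dumps(body)
```

With tool_choice set to an object like this, the response's finish_reason will reflect a tool call rather than ordinary text.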

response_format (object, Optional)

{ type: text | json_object | json_schema }. Pass a schema to constrain output at decode time.
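A json_schema response_format, sketched below with an illustrative two-field schema (the sentiment name and fields are examples, not required by the API):

```python
# Illustrative response_format constraining the model to emit JSON matching
# a small schema. The "sentiment" schema here is an example payload only.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "sentiment",
        "schema": {
            "type": "object",
            "properties": {
                "label": {"type": "string"},
                "score": {"type": "number"},
            },
            "required": ["label", "score"],
        },
    },
}
```

With type set to json_object instead, the model emits valid JSON but is not held to any particular schema.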

seed (integer, Optional)

Deterministic sampling seed when supported by the model.

stop (string | string[], Optional)

Up to 4 stop sequences.
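Putting the body parameters together, a minimal non-streaming request can be built with the standard library alone. The base URL and API key below are placeholders for your deployment:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder: your deployment's base URL
API_KEY = "sk-..."                    # placeholder: your API key

# Minimal chat completion request body.
body = {
    "model": "qwen3.5-4b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    "temperature": 0.2,
    "max_tokens": 64,
}

req = urllib.request.Request(
    BASE_URL + "/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + API_KEY,
    },
)
# resp = urllib.request.urlopen(req)  # sends the request
# print(json.load(resp)["choices"][0]["message"]["content"])
```

The send itself is commented out so the sketch stays side-effect free; any HTTP client that can POST JSON with a Bearer token works the same way.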

Response

```json
{
  "id": "chatcmpl_01HZ...",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "qwen3.5-4b",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "..."},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 182,
    "completion_tokens": 56,
    "total_tokens": 238
  }
}
```
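Pulling fields out of a response shaped like the sample above is plain JSON handling; the content string here is replaced with a short illustrative value:

```python
import json

# A completion shaped like the sample response, with an illustrative
# content string substituted for the elided "...".
resp = json.loads("""
{
  "id": "chatcmpl_01HZ...",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "qwen3.5-4b",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello!"},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 182, "completion_tokens": 56, "total_tokens": 238}
}
""")

text = resp["choices"][0]["message"]["content"]   # assistant text
total = resp["usage"]["total_tokens"]             # billed token count
```

finish_reason is worth checking too: values other than "stop" (such as "length" or a tool-call indicator) mean the text is not a complete answer.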

Streaming response

```
data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":"The"}}]}
data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":" quick"}}]}
data: [DONE]
```
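A client reassembles the completion by concatenating the delta fragments until the [DONE] sentinel. A minimal sketch, operating on already-decoded SSE lines like the ones above:

```python
import json

def collect_stream(lines):
    """Concatenate delta.content fragments from SSE lines until [DONE]."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue                      # ignore non-data SSE lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break                         # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:            # role-only deltas carry no content
            parts.append(delta["content"])
    return "".join(parts)

events = [
    'data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":"The"}}]}',
    'data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":" quick"}}]}',
    'data: [DONE]',
]
print(collect_stream(events))  # The quick
```

In a real client the lines come from the HTTP response body as it arrives; the accumulation logic is the same.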