Chat completions

A drop-in replacement for the OpenAI chat completions API, with full support for streaming, tools, JSON mode, and structured outputs.

Package availability

Wordcab SDKs, CLI tools, Helm charts, model weights, and deployment packages are delivered directly to each customer for self-hosted installation. They are not publicly published package-manager artifacts, so install commands in these docs are placeholders until your Wordcab team provides your private package source or offline bundle.

Create a chat completion

POST /v1/chat/completions

OpenAI-compatible. Full reference below; for migration notes see the compatibility guide.

Body

model (string, required)

Model id. See /v1/models for what is available on your key.

messages (Message[], required)

Conversation history. Each message has role (system | user | assistant | tool) and content (string or content-parts array for multimodal models).

temperature (number, optional)

Range 0–2. Default 1. Lower values make output more deterministic.

top_p (number, optional)

Nucleus sampling cutoff. Default 1.

max_tokens (integer, optional)

Upper bound on completion length.

stream (boolean, optional)

If true, the response is delivered as server-sent events. Default false.

tools (Tool[], optional)

Function schemas the model may call.

tool_choice (string | object, optional)

auto (default), none, required, or {"type":"function","function":{"name":"..."}}.
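A minimal sketch of a request body that declares one tool and forces the model to call it. The get_weather tool, its parameters, and the force_tool helper are hypothetical; the tool schema layout is assumed to follow the OpenAI convention, matching the tool_choice object shown above.

```python
import json

# Hypothetical tool: the name, description, and parameters are illustrative only.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def force_tool(name):
    """Build a tool_choice object that pins the model to one function."""
    return {"type": "function", "function": {"name": name}}


body = {
    "model": "qwen3.5-4b",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [get_weather_tool],
    # Use "auto", "none", or "required" instead to leave the choice to the model.
    "tool_choice": force_tool("get_weather"),
}

print(json.dumps(body, indent=2))
```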

response_format (object, optional)

{ type: text | json_object | json_schema }. Pass a schema to constrain output at decode time.
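A sketch of a response_format value for structured output. This page only confirms the three type values; the nested json_schema key with name, schema, and strict fields is an assumption borrowed from the OpenAI layout, and the call_summary schema is hypothetical. Check the compatibility guide before relying on it.

```python
def json_schema_format(name, schema, strict=True):
    """Wrap a JSON Schema in a response_format object.

    Assumption: the nested "json_schema" layout mirrors the OpenAI API.
    """
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": strict},
    }


# Hypothetical schema for illustration.
call_summary_schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
    },
    "required": ["summary", "sentiment"],
    "additionalProperties": False,
}

response_format = json_schema_format("call_summary", call_summary_schema)

# For plain JSON mode without a schema, the value would simply be:
json_mode = {"type": "json_object"}
```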

seed (integer, optional)

Deterministic sampling seed when supported by the model.

stop (string | string[], optional)

Up to 4 stop sequences.
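The parameters above can be assembled into a request roughly as follows. This is a sketch using only the Python standard library, not an official client: BASE_URL and API_KEY are placeholders for your deployment, the Bearer authorization header is assumed from OpenAI compatibility, and build_chat_request is a hypothetical helper. The request is built but not sent.

```python
import json
import urllib.request

# Placeholders: substitute the host and key for your own deployment.
BASE_URL = "http://localhost:8000"
API_KEY = "YOUR_API_KEY"


def build_chat_request(model, messages, **options):
    """Return an unsent urllib Request for POST /v1/chat/completions."""
    # Client-side checks for the limits documented above.
    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer auth, as in the OpenAI API.
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = build_chat_request(
    "qwen3.5-4b",
    [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize this call in one sentence."},
    ],
    temperature=0.2,
    max_tokens=128,
    stop=["\n\n"],
)
```

To send it, pass the request to urllib.request.urlopen and decode the body with json.load.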

Response

{
  "id": "chatcmpl_01HZ...",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "qwen3.5-4b",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "..."},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 182,
    "completion_tokens": 56,
    "total_tokens": 238
  }
}
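A short sketch of reading the fields most callers need from a response shaped like the sample above. The summarize helper is hypothetical, and the raw string reuses the sample values.

```python
import json

# The sample response from above, as it would arrive on the wire.
raw = """
{
  "id": "chatcmpl_01HZ...",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "qwen3.5-4b",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 182, "completion_tokens": 56, "total_tokens": 238}
}
"""


def summarize(response):
    """Return (content, finish_reason, total_tokens) for the first choice."""
    choice = response["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


content, finish_reason, total_tokens = summarize(json.loads(raw))
print(finish_reason, total_tokens)  # prints "stop 238"
```

In the sample, total_tokens is the sum of prompt_tokens and completion_tokens (182 + 56 = 238).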

Streaming response

data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":"The"}}]}
data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":" quick"}}]}
data: [DONE]
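A minimal sketch of consuming the stream: read each data: line, stop at the [DONE] sentinel, and concatenate the delta content. The iter_content helper is hypothetical, and the sample list stands in for lines read from the HTTP response.

```python
import json


def iter_content(lines):
    """Yield content fragments from server-sent event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":"The"}}]}',
    "",
    'data: {"id":"chatcmpl_01HZ...","choices":[{"delta":{"content":" quick"}}]}',
    "data: [DONE]",
]

text = "".join(iter_content(sample))
print(text)  # prints "The quick"
```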
