Models

Every model this key can call, with metadata on context length, streaming, tool use, and hardware requirements.

List models

GET /v1/models

Returns every model this key can call. Each model object has the following fields:

json
{
  "id": "qwen3.5-4b",
  "object": "model",
  "type": "llm",
  "context_length": 131072,
  "streaming": true,
  "tool_use": true,
  "modalities": ["text"],
  "requirements": {"min_gpu": "L40S", "vram_gb": 22},
  "license": "Apache-2.0"
}
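As a rough illustration of how a client might consume one of these objects, the sketch below parses the sample above into a typed record and checks it against available GPU memory. The `ModelInfo` class and the `parse_model` and `fits_gpu` helpers are hypothetical names, not part of the API. The sketch assumes that `vram_gb` is the memory the model needs and that fields other than `id` and `type` may be absent for some model types.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    """Typed view of one model object (hypothetical client-side helper)."""
    id: str
    type: str
    context_length: Optional[int]
    streaming: bool
    tool_use: bool
    modalities: tuple
    min_gpu: Optional[str]
    vram_gb: Optional[int]
    license: Optional[str]


def parse_model(record: dict) -> ModelInfo:
    """Convert one model object into a ModelInfo, tolerating missing optional fields."""
    requirements = record.get("requirements") or {}
    return ModelInfo(
        id=record["id"],
        type=record["type"],
        context_length=record.get("context_length"),
        streaming=bool(record.get("streaming", False)),
        tool_use=bool(record.get("tool_use", False)),
        modalities=tuple(record.get("modalities", [])),
        min_gpu=requirements.get("min_gpu"),
        vram_gb=requirements.get("vram_gb"),
        license=record.get("license"),
    )


def fits_gpu(model: ModelInfo, available_vram_gb: int) -> bool:
    """Assumption: vram_gb is the VRAM the model needs, in gigabytes."""
    return model.vram_gb is not None and model.vram_gb <= available_vram_gb


# The sample object from this page, copied verbatim.
SAMPLE = json.loads("""
{
  "id": "qwen3.5-4b",
  "object": "model",
  "type": "llm",
  "context_length": 131072,
  "streaming": true,
  "tool_use": true,
  "modalities": ["text"],
  "requirements": {"min_gpu": "L40S", "vram_gb": 22},
  "license": "Apache-2.0"
}
""")

model = parse_model(SAMPLE)
```

With this sample, `fits_gpu(model, 24)` is true and `fits_gpu(model, 16)` is false. The shape of the full list response (for example, whether the objects are wrapped in an envelope) is not stated here, so the sketch works on a single object only.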

Types

  • llm — chat completions.
  • stt — transcription (batch + streaming).
  • tts — speech generation.
  • embedding — embedding models.
  • diarization — speaker segmentation models.
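One way to use the `type` field is to bucket the returned models by kind. The following is a minimal sketch under stated assumptions: `group_by_type` is a hypothetical helper, and every id except `qwen3.5-4b` is an invented placeholder rather than a real model.

```python
from collections import defaultdict

# The five type values listed above.
KNOWN_TYPES = ("llm", "stt", "tts", "embedding", "diarization")


def group_by_type(models):
    """Map each type to the ids of models of that type.

    Values outside KNOWN_TYPES go under "unknown" so that a type
    added later does not break the client.
    """
    groups = defaultdict(list)
    for record in models:
        model_type = record.get("type")
        key = model_type if model_type in KNOWN_TYPES else "unknown"
        groups[key].append(record["id"])
    return dict(groups)


models = [
    {"id": "qwen3.5-4b", "type": "llm"},
    {"id": "example-stt", "type": "stt"},          # hypothetical id
    {"id": "example-embed", "type": "embedding"},  # hypothetical id
    {"id": "example-new", "type": "vision"},       # hypothetical type, not documented
]

grouped = group_by_type(models)
```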

Retrieve a single model

GET /v1/models/{model_id}
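The request for a single model could be built roughly as follows. This is a sketch, not a confirmed client: the base URL `https://api.example.com` is a placeholder, the `Authorization: Bearer` header is an assumed authentication scheme that this page does not specify, and `build_model_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.example.com"  # placeholder; substitute the real API host


def build_model_request(model_id: str, api_key: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a GET request for /v1/models/{model_id}."""
    # Percent-encode the id so characters such as "/" cannot alter the path.
    path = "/v1/models/" + quote(model_id, safe="")
    return Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )


request = build_model_request("qwen3.5-4b", "YOUR_API_KEY")
```

Passing `request` to `urllib.request.urlopen` would send it. The response body is presumably a single model object like the one shown under List models, though this page does not confirm that.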