Test suites

Test suites are scripted cases with assertions that you can run against any agent. Use them to gate deployments on green runs.

Package availability

Wordcab SDKs, CLI tools, Helm charts, model weights, and deployment packages are delivered directly to each customer for self-hosted installation. They are not publicly published package-manager artifacts, so install commands in these docs are placeholders until your Wordcab team provides your private package source or offline bundle.

Create a test suite

POST /v1/test-suites

name string Required

description string Optional

cases TestCase[] Required

Each case has an input (an audio_url or a scripted list of turns) and an expected block (the expected tool calls and transcript assertions).

Case shape

json

{
  "id": "case_refund_01",
  "input": {"audio_url": "s3://internal/eval/refund_01.wav"},
  "expected": {
    "tool_calls": ["lookup_order", "start_refund"],
    "assertions": [
      {"type": "transcript_contains",     "value": "I can help with that"},
      {"type": "transcript_not_contains", "value": "I will personally"}
    ]
  }
}
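As a rough illustration of the case shape above, the following Python sketch assembles a request body for POST /v1/test-suites. The helper names make_case and build_suite_payload are hypothetical and not part of any Wordcab SDK. The client-side checks, including the rule that cases must be non-empty, are assumptions layered on the Required/Optional markers in the parameter list. Sending the request (base URL, authentication) is left out because this page does not specify it.

```python
import json


def make_case(case_id, audio_url, tool_calls, contains=(), not_contains=()):
    """Build one test case dict in the shape shown in the docs (hypothetical helper)."""
    assertions = [{"type": "transcript_contains", "value": v} for v in contains]
    assertions += [{"type": "transcript_not_contains", "value": v} for v in not_contains]
    return {
        "id": case_id,
        "input": {"audio_url": audio_url},
        "expected": {"tool_calls": list(tool_calls), "assertions": assertions},
    }


def build_suite_payload(name, cases, description=None):
    """Assemble the body for POST /v1/test-suites (hypothetical helper)."""
    if not name:
        raise ValueError("name is required")
    # Assumption: an empty cases list is treated as missing.
    if not cases:
        raise ValueError("cases is required")
    payload = {"name": name, "cases": list(cases)}
    # description is optional, so it is only included when provided.
    if description is not None:
        payload["description"] = description
    return payload


def to_json(payload):
    """Serialize the payload for use as an HTTP request body."""
    return json.dumps(payload, indent=2)
```

Building the payload separately from the HTTP call keeps the case definitions easy to review and to store in version control alongside the agent configuration.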

List / retrieve

GET /v1/test-suites

GET /v1/test-suites/{suite_id}

Run a suite

POST /v1/test-suites/{suite_id}/runs

agent_id string Required

Agent to test.

overrides object Optional

Per-run overrides: llm_model, stt_model, tts_model, temperature, tools.

Returns a run object with status = queued. Poll the run until it completes, or subscribe to the testrun.completed event.
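The polling option might look like the minimal Python sketch below. It is not an official client: wait_for_run is a hypothetical helper, and fetch_run stands for whatever callable you write to perform GET /v1/test-suites/{suite_id}/runs/{run_id} and return the parsed JSON. Only the queued and completed statuses appear on this page, so the default set of terminal statuses is an assumption you may need to extend.

```python
import time


def wait_for_run(fetch_run, interval_s=2.0, timeout_s=600.0,
                 terminal=("completed",), sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_run() until the run reaches a terminal status.

    fetch_run: hypothetical zero-argument callable returning the run object as a dict.
    terminal:  statuses that end polling (assumption: only "completed" is documented).
    """
    deadline = clock() + timeout_s
    while True:
        run = fetch_run()
        if run.get("status") in terminal:
            return run
        # Give up once the deadline has passed instead of polling forever.
        if clock() >= deadline:
            raise TimeoutError(f"run still {run.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
```

Passing sleep and clock as parameters makes the loop testable without real delays. A fixed interval is used here for simplicity; a backoff schedule would be a reasonable alternative for long-running suites.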

Retrieve a run

GET /v1/test-suites/{suite_id}/runs/{run_id}

json

{
  "id": "run_01HZ...",
  "suite_id": "suite_abc",
  "agent_id": "agent_abc",
  "status": "completed",
  "passed": 47,
  "total": 50,
  "cases": [
    {"id": "case_refund_01", "passed": true, "latency_ms": 2310, "failures": []},
    {"id": "case_refund_02", "passed": false, "failures": [
      {"assertion": "tool_calls", "expected": ["lookup_order","start_refund"], "actual": ["lookup_order"]}
    ]}
  ]
}
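To gate a deployment on a green run, a script can reduce the run object above to a pass/fail decision. The sketch below is one possible approach, using only the fields shown in the example response. summarize_run is a hypothetical helper, and treating a run as green only when it is completed and every case passed is an assumption about what "green" means.

```python
def summarize_run(run):
    """Reduce a run object to a gate decision (hypothetical helper)."""
    passed = run.get("passed", 0)
    total = run.get("total", 0)
    # Collect the ids of cases that are listed in the response as not passed.
    failed_ids = [c["id"] for c in run.get("cases", []) if not c.get("passed")]
    return {
        # Assumption: green means completed, non-empty, and all cases passed.
        "green": run.get("status") == "completed" and total > 0 and passed == total,
        "pass_rate": passed / total if total else 0.0,
        "failed_case_ids": failed_ids,
    }


def failure_lines(run):
    """Format one readable line per failed assertion for CI logs."""
    lines = []
    for case in run.get("cases", []):
        for failure in case.get("failures", []):
            lines.append(
                f"{case['id']}: {failure.get('assertion')} "
                f"expected={failure.get('expected')} actual={failure.get('actual')}"
            )
    return lines
```

In a CI job, exiting with a non-zero status when green is false would block the deployment step, and the output of failure_lines gives reviewers the expected and actual values for each failed assertion.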
