Introduction
Overview
The scope and API composition of Vurbo.ai API Service (VAS).
Vurbo.ai API Service (VAS) is a Task-centric speech processing API providing real-time speech recognition, translation, text-to-speech (TTS), summarization and broadcasting.
Service composition
| Channel | Purpose | Authentication |
|---|---|---|
| REST API | Task management, audio import, webhook configuration | API Key (X-API-Key header) |
| WebSocket | Real-time recording, speech recognition and translation | Ticket (exchanged from an API Key, short-lived and single-use) |
| SSE | Real-time captions for broadcast viewers, task history playback | API Key; broadcast viewers use a share token |
| Webhook | Task completion and failure event callbacks | — |
The API Key is a long-lived credential used for REST and SSE. WebSocket connections instead exchange the key for a short-lived, single-use Ticket, which keeps the API Key out of the connection URL. Broadcast viewers are identified by a separate share token and do not need an API Key.
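The credential rules above can be summarized in a minimal Python sketch. Only the `X-API-Key` header name comes from the table; the `Channel` enum, `credential_for`, and `rest_headers` are hypothetical helper names introduced for illustration, and the sketch does not show how a Ticket or share token is actually obtained or transmitted.

```python
from enum import Enum
from typing import Dict, Optional


class Channel(Enum):
    """Channels from the service composition table (names are illustrative)."""
    REST = "rest"
    WEBSOCKET = "websocket"
    SSE = "sse"
    SSE_BROADCAST_VIEWER = "sse_broadcast_viewer"
    WEBHOOK = "webhook"


def credential_for(channel: Channel) -> Optional[str]:
    """Return the kind of credential each channel expects, per the table."""
    mapping = {
        Channel.REST: "api_key",
        Channel.WEBSOCKET: "ticket",            # exchanged from an API Key, single-use
        Channel.SSE: "api_key",
        Channel.SSE_BROADCAST_VIEWER: "share_token",
        Channel.WEBHOOK: None,                  # the table lists no credential
    }
    return mapping[channel]


def rest_headers(api_key: str) -> Dict[str, str]:
    """Build the header a REST request carries; X-API-Key is named in the table."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {"X-API-Key": api_key}
```

A client could call `credential_for(Channel.WEBSOCKET)` before connecting to confirm that it needs a Ticket rather than the raw key.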
Recording types and use cases
| Type | Description | Participants / orientation | Scenario |
|---|---|---|---|
| transcribe | Speech to text | Single-device recording, one or multiple speakers | Meeting notes, interview records |
| conversation | Bilingual real-time interpretation | 1-to-1 (two people sharing one device) | Cross-language conversation, live interpretation |
| record | Plain recording | Single person | Voice memos, quick notes |
| broadcast | Broadcast / live stream | 1-to-many (one presenter → many viewers) | Lectures, talks, live streams |
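As a rough illustration, a client might validate the recording type before creating a task. The four type names below come from the table; the `RecordingType` enum and `parse_recording_type` helper are hypothetical, and how the type is actually passed to the API is not specified here.

```python
from enum import Enum


class RecordingType(str, Enum):
    """Recording types listed in the table above."""
    TRANSCRIBE = "transcribe"      # speech to text
    CONVERSATION = "conversation"  # bilingual real-time interpretation, 1-to-1
    RECORD = "record"              # plain recording
    BROADCAST = "broadcast"        # one presenter, many viewers


def parse_recording_type(value: str) -> RecordingType:
    """Normalize user input and reject anything not listed in the table."""
    try:
        return RecordingType(value.strip().lower())
    except ValueError:
        valid = ", ".join(t.value for t in RecordingType)
        raise ValueError(
            f"unknown recording type {value!r}; expected one of: {valid}"
        ) from None
```

Rejecting unknown values on the client side surfaces typos early, before any request is sent.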