Getting Started

Quickstart

Step 1: Get an API Key

  1. Log in to the Vurbo.ai Dashboard: https://vas-poc.vurbo.ai/dashboard
  2. Go to the "API Keys" page
  3. Click "Create New Key"
  4. Copy your API Key (format: vas_ + a 32-character string)
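
A quick client-side check can catch copy-paste mistakes before the first request. The sketch below tests only the documented shape (vas_ followed by 32 characters). It assumes those 32 characters contain no whitespace, which this guide does not actually specify, and looksLikeApiKey is a hypothetical helper name.

```javascript
// Hypothetical helper: checks only the documented shape (vas_ + 32 characters).
// Assumption: the 32 characters contain no whitespace; the real alphabet may be narrower.
function looksLikeApiKey(key) {
  return typeof key === 'string' && /^vas_\S{32}$/.test(key);
}
```

A key that fails this check was probably truncated or padded while copying; a key that passes may still be rejected by the server.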

Step 2: Your First REST API Call

Use your API Key to call the Tasks API and verify the connection:

curl -X GET "https://vas-poc.vurbo.ai/api/v1/tasks" \
  -H "X-API-Key: vas_YOUR_API_KEY_HERE"

Successful response:

{
  "tasks": []
}
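
The same call can be made from JavaScript. This is a minimal sketch using the standard fetch API; buildTasksRequest and listTasks are hypothetical helper names, and the error handling is an assumption, since this guide does not describe error responses.

```javascript
const BASE_URL = 'https://vas-poc.vurbo.ai';

// Builds the URL and fetch options for GET /api/v1/tasks.
function buildTasksRequest(apiKey) {
  return {
    url: `${BASE_URL}/api/v1/tasks`,
    options: { method: 'GET', headers: { 'X-API-Key': apiKey } },
  };
}

// Hypothetical wrapper: resolves to the parsed JSON body, or throws on a non-2xx status.
async function listTasks(apiKey) {
  const { url, options } = buildTasksRequest(apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Tasks request failed with HTTP ${res.status}`);
  return res.json();
}
```

Keeping the request description separate from the network call makes it easy to log or inspect what will be sent without contacting the server.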

Step 3: Establish a WebSocket Connection

The WebSocket endpoint uses Ticket-based authentication: you first exchange your API Key for a short-lived Ticket, then present that Ticket when connecting.

3.1 Get a Ticket

curl -X POST "https://vas-poc.vurbo.ai/api/v1/auth/ticket" \
  -H "X-API-Key: vas_YOUR_API_KEY_HERE"

Response:

{
  "ticket": "aBcDeFgHiJkLmNoPqRsTuVwXyZ012345",
  "expires_in": 60
}
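
Because the Ticket is short-lived, it can help to turn expires_in into an absolute deadline as soon as the response arrives. The sketch below assumes expires_in is expressed in seconds; parseTicketResponse, isTicketUsable, and the safetyMs margin are hypothetical and not part of the API.

```javascript
// Turns the ticket response body into a ticket plus an absolute expiry time.
// `receivedAtMs` is when the response arrived (milliseconds since the epoch).
function parseTicketResponse(body, receivedAtMs = Date.now()) {
  if (!body || typeof body.ticket !== 'string' || typeof body.expires_in !== 'number') {
    throw new Error('Unexpected ticket response shape');
  }
  // Assumption: expires_in is in seconds.
  return { ticket: body.ticket, expiresAtMs: receivedAtMs + body.expires_in * 1000 };
}

// True while the ticket can still be used. `safetyMs` is a hypothetical margin
// that leaves time for the WebSocket handshake.
function isTicketUsable(parsed, nowMs = Date.now(), safetyMs = 5000) {
  return nowMs + safetyMs < parsed.expiresAtMs;
}
```

If isTicketUsable returns false, requesting a fresh Ticket is likely simpler than attempting a connection that may be rejected.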

3.2 Connect Using the Ticket

// `ticket` is the value returned in step 3.1; connect within 60 seconds of obtaining it
const ws = new WebSocket('wss://vas-poc.vurbo.ai/ws', [`ticket.${ticket}`]);

ws.onopen = () => {
  console.log('WebSocket connected');
};

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  console.log('Message received:', msg);
};
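
The connection step can be wrapped in a small helper that also reports failures. This is a sketch only: connectWithTicket is a hypothetical name, the constructor is passed in so the helper can be exercised outside a browser, and the logged hint about expiry is a guess, because this guide does not say how a rejected Ticket is reported.

```javascript
// Opens the socket with the ticket passed as a WebSocket subprotocol ("ticket.<value>").
// `WebSocketCtor` defaults to the global WebSocket and can be replaced in tests.
function connectWithTicket(ticket, WebSocketCtor = WebSocket) {
  if (!ticket) throw new Error('A ticket is required before connecting');
  const ws = new WebSocketCtor('wss://vas-poc.vurbo.ai/ws', [`ticket.${ticket}`]);
  // Assumption: a failed handshake surfaces as an error and/or close event.
  ws.onerror = () => console.error('WebSocket error; the ticket may have expired');
  ws.onclose = (event) => console.log('WebSocket closed with code', event.code);
  return ws;
}
```

The caller can still assign onopen and onmessage on the returned socket exactly as shown in the snippet above.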

Step 4: Start Real-Time Voice Translation

4.1 Send the start Command

ws.send(JSON.stringify({
  type: 'voice-translation',
  data: {
    action: 'start',
    transcription_languages: ['zh-TW'],
    translation_languages: ['en-US'],
    type: 'transcribe',
    audio_format: 'pcm',
    summary_template: 'meeting'
  }
}));

Successful response:

{
  "type": "voice-translation",
  "data": {
    "action": "session_started",
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "recording_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
    "recording_type": "transcribe",
    "recognition_mode": "single",
    "message": "Speech recognition started"
  }
}
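
The start command and its acknowledgement can be handled with two small helpers. The field names below are copied from the messages above; buildStartMessage and parseSessionStarted are hypothetical names, and keeping type, audio_format, and summary_template fixed is a simplification made for this sketch.

```javascript
// Builds the start command with the field names used in step 4.1.
function buildStartMessage(transcriptionLanguages, translationLanguages) {
  return JSON.stringify({
    type: 'voice-translation',
    data: {
      action: 'start',
      transcription_languages: transcriptionLanguages,
      translation_languages: translationLanguages,
      type: 'transcribe',
      audio_format: 'pcm',
      summary_template: 'meeting',
    },
  });
}

// Returns the session identifiers if `raw` is a session_started message, else null.
function parseSessionStarted(raw) {
  const msg = JSON.parse(raw);
  if (msg.type !== 'voice-translation' || msg.data?.action !== 'session_started') return null;
  return { sessionId: msg.data.session_id, recordingId: msg.data.recording_id };
}
```

A client would typically wait for parseSessionStarted to return a non-null value before sending any audio.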

4.2 Send Audio

Capture audio using the browser's MediaRecorder or AudioWorklet, then Base64-encode it before sending:

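// Assume you already have an ArrayBuffer of PCM audio
function sendAudio(pcmBuffer) {
  const bytes = new Uint8Array(pcmBuffer);
  // Build the binary string in chunks: spreading a large buffer into
  // String.fromCharCode can exceed the engine's argument limit
  let binary = '';
  for (let i = 0; i < bytes.length; i += 0x8000) {
    binary += String.fromCharCode(...bytes.subarray(i, i + 0x8000));
  }
  ws.send(JSON.stringify({
    type: 'voice-translation',
    data: {
      action: 'audio',
      payload: btoa(binary)
    }
  }));
}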
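
An AudioWorklet delivers samples as Float32Array values in the range [-1, 1], so they usually need converting before they can be sent as PCM. The sketch below assumes the service expects 16-bit signed little-endian samples; this guide states only 'pcm', so the bit depth and byte order should be confirmed. floatTo16BitPCM is a hypothetical helper name.

```javascript
// Converts Float32 samples in [-1, 1] to 16-bit signed little-endian PCM.
// Assumption: the service expects this sample format for audio_format 'pcm'.
function floatTo16BitPCM(samples) {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range samples
    // Scale negative and positive halves separately so -1 and 1 map to the int16 limits
    view.setInt16(i * 2, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true);
  }
  return buffer;
}
```

The returned ArrayBuffer can be passed directly to a function that Base64-encodes and sends it.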

4.3 Receive Recognition and Translation Results

// Assigning onmessage again replaces the logging handler from step 3.2
ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);

  if (msg.type === 'voice-translation' && msg.data.action === 'result') {
    // Original text
    if (msg.data.origin) {
      console.log(`[Original] ${msg.data.origin.text}`);
    }

    // Translations
    if (msg.data.translations) {
      for (const [lang, translation] of Object.entries(msg.data.translations)) {
        console.log(`[${lang}] ${translation.text}`);
      }
    }
  }
};
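
Separating message parsing from output makes the handler easier to test and reuse in a UI. The sketch below reads the same fields as the handler above (data.origin.text and data.translations[lang].text); extractLines is a hypothetical helper name.

```javascript
// Returns one display line per text found in a result message, or [] for other messages.
function extractLines(msg) {
  if (msg?.type !== 'voice-translation' || msg.data?.action !== 'result') return [];
  const lines = [];
  if (msg.data.origin) lines.push(`[Original] ${msg.data.origin.text}`);
  for (const [lang, translation] of Object.entries(msg.data.translations ?? {})) {
    lines.push(`[${lang}] ${translation.text}`);
  }
  return lines;
}
```

Inside onmessage, the parsed message can be passed to extractLines and each returned line logged or appended to the page.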

Step 5: Stop Translation

ws.send(JSON.stringify({
  type: 'voice-translation',
  data: {
    action: 'stop'
  }
}));

After stopping, you will receive a task_complete event containing a task_id, which you can use for subsequent queries.
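
The task_id can be picked out of the incoming messages with a small helper. The exact shape of the task_complete event is not shown in this guide, so the sketch below assumes it uses the same envelope as the other messages, with data.action set to 'task_complete' and the identifier in data.task_id. extractTaskId is a hypothetical helper name.

```javascript
// Assumption: task_complete uses the same envelope as the other messages
// (data.action === 'task_complete', data.task_id). Verify against a real event.
function extractTaskId(msg) {
  if (msg?.data?.action !== 'task_complete') return null;
  return msg.data.task_id ?? null;
}
```

A client would usually keep the socket open after sending stop until this helper returns a non-null value.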


Full Flow Diagram

Get API Key
    │
    ▼
POST /api/v1/auth/ticket (get Ticket)
    │
    ▼
WebSocket connection (using Ticket)
    │
    ▼
send: start (start voice translation)
    │
    ▼
send: audio (keep sending audio) ──→ receive result (recognition/translation results)
    │
    ▼
send: stop (stop)
    │
    ▼
receive task_complete (get task_id)
    │
    ▼
GET /api/v1/tasks (view task list)
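
The ordering in the diagram can be enforced on the client with a small transition table, which helps catch mistakes such as sending audio before start. This is an illustrative sketch: the state and event names are hypothetical labels for the steps above, not values defined by the API.

```javascript
// Allowed client-side steps, taken from the flow diagram; names are hypothetical.
const TRANSITIONS = {
  idle: { ticket: 'has_ticket' },
  has_ticket: { connect: 'connected' },
  connected: { start: 'streaming' },
  streaming: { audio: 'streaming', stop: 'stopping' },
  stopping: { task_complete: 'done' },
};

// Returns the next state, or throws if the step is out of order.
function nextState(state, event) {
  const next = TRANSITIONS[state]?.[event];
  if (!next) throw new Error(`Cannot handle "${event}" while ${state}`);
  return next;
}
```

Calling nextState before each send or on each received event keeps the client in step with the flow shown in the diagram.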

Version: V1.5.7 Last Updated: 2026-05-20

Copyright © 2026