Summary Customization

Table of Contents

  1. Overview
  2. Prerequisites
  3. Two Modes
  4. Three Entry Points
  5. Transcript Record Fields
  6. Handling Sensitive Words and Profanity
  7. Security and Limits
  8. Complete Examples
  9. Error Codes
  10. Related Documents

Overview

VAS summary generation supports two mutually exclusive modes:

  • builtin — Applies a VAS built-in template (scenarios such as meetings, interviews, courses, and medical consultations). You only need to specify a template slug.
  • custom — You provide a complete prompt that replaces the built-in template rules. The system still automatically handles output language and plain-text post-processing.

Across the three entry points (REST / WebSocket / SSE), both modes share the same request field semantics; only the field naming differs (the WebSocket fields carry a summary_ prefix, and the SSE fields use camelCase).

When to Use Each Mode

| Mode | Use Case |
| --- | --- |
| builtin | General meetings, interviews, courses, medical consultations, and other scenarios that can directly use a built-in template |
| custom | You have your own prompt rules (specific output fields, custom formats, domain-specific structures, etc.) and need to fully customize the summary generation behavior |

Mutual Exclusion Rules

| Mode | template | prompt / prompt_slug |
| --- | --- | --- |
| builtin | Required | Not allowed |
| custom | Not allowed | Required (both are required) |

Violating the mutual exclusion rules returns summary_mode_field_mismatch (REST 400 / SSE 422).
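
The rules in the table can also be checked on the client before a request is sent. The following is a minimal sketch, not part of the VAS API: check_mode_fields is a hypothetical helper that returns the error code this guide associates with each violation, and the server remains the authority on what is accepted.

```python
from typing import Optional


def check_mode_fields(
    mode: str,
    template: Optional[str] = None,
    prompt: Optional[str] = None,
    prompt_slug: Optional[str] = None,
) -> Optional[str]:
    """Return the error code the request would likely trigger, or None if it looks valid."""
    if mode not in ("builtin", "custom"):
        return "summary_invalid_mode"
    if mode == "builtin":
        # builtin: template is required; prompt and prompt_slug are not allowed.
        if not template or prompt is not None or prompt_slug is not None:
            return "summary_mode_field_mismatch"
    else:
        # custom: prompt and prompt_slug are both required; template is not allowed.
        if template is not None or not prompt or not prompt_slug:
            return "summary_mode_field_mismatch"
    return None
```

For example, check_mode_fields("custom", prompt="...") reports a mismatch because prompt_slug is missing.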


Prerequisites

1. Obtain an API Key

See Authentication for details.

2. (builtin path) Browse Available Templates

Call GET /api/v1/summary-templates to retrieve the list of built-in templates (supports ?category=summary|medical|legal|all filtering).

To use a specific template as a design reference, call GET /api/v1/summary-templates/{slug} to retrieve its content.
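
As an illustration, the two template URLs can be built as follows. This is a sketch under assumptions: the host is the one used in the curl examples of this guide, and template_list_url / template_detail_url are hypothetical helper names. The response shape is not described here, so the sketch stops at URL construction.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://vas-poc.vurbo.ai"  # host taken from the examples in this guide
CATEGORIES = {"summary", "medical", "legal", "all"}  # values listed for ?category=


def template_list_url(category: str = "all") -> str:
    """Build the URL that lists built-in templates, filtered by category."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return f"{BASE_URL}/api/v1/summary-templates?{urlencode({'category': category})}"


def template_detail_url(slug: str) -> str:
    """Build the URL that returns one template's content."""
    # Percent-encode the slug so unusual characters cannot alter the path.
    return f"{BASE_URL}/api/v1/summary-templates/{quote(slug, safe='')}"
```

A GET on either URL would presumably carry the X-API-Key header used by the other REST endpoints; treat that as an assumption and confirm it against the Authentication page.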

3. (custom path) Design Your Own Prompt

When writing your prompt, you must include the following yourself:

  • Output language requirement (e.g., "Please output in Traditional Chinese")
  • Structure and field descriptions (which information to extract, section order)
  • Output format requirement (JSON / bullet points / paragraphs)

In custom mode, your prompt replaces the built-in template rules, so you must include these elements in the prompt yourself.
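
One way to make sure all three elements are present is to assemble the prompt from parts. The sketch below is illustrative only: build_custom_prompt and its parameters are hypothetical, the wording of the generated prompt is an example rather than a recommended template, and len() counts Unicode code points, which may not match how the server counts the 2000-character limit.

```python
from typing import Sequence

MAX_PROMPT_CHARS = 2000  # prompt limit stated under "Character and Length Limits"


def build_custom_prompt(
    role: str,
    fields: Sequence[str],
    output_format: str,
    language_instruction: str,
) -> str:
    """Assemble a prompt that states structure, output format, and output language."""
    numbered = "\n".join(f"{i}. {field}" for i, field in enumerate(fields, start=1))
    prompt = (
        f"{role}\n"
        f"Extract the following from the transcript:\n{numbered}\n"
        f"Output format: {output_format}.\n"
        f"{language_instruction}"
    )
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt is {len(prompt)} characters; the limit is {MAX_PROMPT_CHARS}")
    return prompt
```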


Two Modes

builtin mode

{
  "mode": "builtin",
  "template": "meeting",
  "language": "zh-TW",
  "plain_text": true
}

Effect:

  • Generates the summary using the built-in template for the specified slug
  • The backend record is written with summary_mode: "builtin" and summary_template: <slug> (retrievable via the init_summary event)

custom mode

{
  "mode": "custom",
  "prompt": "你是皮膚科專科助手。請從逐字稿萃取:1) 主訴 2) Fitzpatrick 分型 3) 過敏原史。輸出 JSON。",
  "prompt_slug": "skin-clinic-acme-v2",
  "language": "zh-TW",
  "plain_text": true
}

Effect:

  • Does not query a built-in template; your prompt replaces the template rules
  • The prompt_slug is written to the backend record (pass-through, for historical lookup)
  • The original prompt text is forcibly snapshotted into the summary_prompt_snapshot field (the sole basis for reconstruction, retrievable via the init_summary event)

We recommend including a version in prompt_slug (e.g., acme-v1, acme-v2) so that integrators can trace which prompt revision produced a summary.
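
If you adopt a -vN suffix, bumping the version whenever the prompt text changes can be automated. A minimal sketch, assuming that suffix convention; bump_slug_version is a hypothetical helper, and VAS itself treats the slug as an opaque pass-through value.

```python
import re

# Matches slugs that end in "-v<number>", e.g. "acme-v1".
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)-v(?P<num>\d+)$")


def bump_slug_version(slug: str) -> str:
    """Return the slug with its version suffix incremented by one."""
    match = _VERSION_SUFFIX.match(slug)
    if match is None:
        # No version suffix yet: start at v1.
        return f"{slug}-v1"
    return f"{match['base']}-v{int(match['num']) + 1}"
```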


Three Entry Points

The three entry points share the same semantics but use different field naming prefixes:

| Entry Point | Field Prefix | Purpose |
| --- | --- | --- |
| REST POST /api/v1/summary | (no prefix) mode / template / prompt / prompt_slug | Generate a summary for any transcript |
| WebSocket start action | summary_* | Automatically generated after a live recording ends |
| SSE regenerate/summary | (no prefix; SSE uses camelCase) mode / template / prompt / promptSlug | Regenerate a summary for an existing recording |
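
Because only the naming differs, one set of settings can be kept in the REST spelling and renamed per entry point. The sketch below is an assumption-laden illustration: to_entry_point_fields is a hypothetical helper, and it renames only the fields whose spellings appear in this guide (prompt_slug → promptSlug, plain_text → plainText for SSE); any other key is passed through unchanged.

```python
from typing import Any, Dict

# REST-style names whose SSE spelling differs (camelCase), as shown in this guide.
_SSE_NAMES = {"prompt_slug": "promptSlug", "plain_text": "plainText"}


def to_entry_point_fields(settings: Dict[str, Any], entry_point: str) -> Dict[str, Any]:
    """Rename REST-style summary settings for the given entry point."""
    if entry_point == "rest":
        return dict(settings)
    if entry_point == "websocket":
        # The WebSocket start action prefixes every summary field with "summary_".
        return {f"summary_{key}": value for key, value in settings.items()}
    if entry_point == "sse":
        return {_SSE_NAMES.get(key, key): value for key, value in settings.items()}
    raise ValueError(f"unknown entry point: {entry_point!r}")
```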

REST POST /api/v1/summary

Generates a summary for any transcript (including external sources); the response is sent segment by segment as an SSE stream.

Full specification: reference/rest/summary.md

curl -N -X POST "https://vas-poc.vurbo.ai/api/v1/summary" \
  -H "Authorization: Bearer vas_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "...transcript...",
    "mode": "custom",
    "prompt": "你是會議分析助手...",
    "prompt_slug": "acme-meeting-v1",
    "language": "zh-TW",
    "plain_text": true
  }'

⚠️ This endpoint authenticates with Authorization: Bearer <api_key>, which differs from the X-API-Key used by other REST endpoints.

WebSocket start action

The start action of a live recording carries the summary settings; after the session ends, the summary is generated automatically and you are notified via the summary_done event.

Full specification: reference/websocket/voice-translation.md

{
  "type": "voice-translation",
  "data": {
    "action": "start",
    "transcription_languages": ["zh-TW"],
    "translation_languages": ["en-US"],
    "type": "transcribe",
    "audio_format": "pcm",
    "summary_mode": "custom",
    "summary_prompt": "你是會議分析助手...",
    "summary_prompt_slug": "acme-meeting-v1",
    "summary_language": "zh-TW",
    "summary_plain_text": true
  }
}

Summary completion event:

{
  "type": "summary_done",
  "data": {
    "summary_mode": "custom",
    "summary_template": "acme-meeting-v1",
    "summary_plain_text": true,
    "tokens_used": 890,
    "summary_fallback_level": null,
    "summary_dropped_segments": null
  }
}

summary_template is the effective slug — in custom mode it returns your slug (i.e., the summary_prompt_slug you submitted).

SSE regenerate/summary

Regenerates a summary for an existing recording, split into two endpoints:

| Method | Purpose | Writes DB | Stores Transcript | Billed |
| --- | --- | --- | --- | --- |
| GET /api/v1/sse/regenerate/summary/{taskId} | Preview (trial run, compare different prompt results) | | | |
| POST /api/v1/sse/regenerate/summary/{taskId} | Persist (officially saved) | ✅ + bump revision | | |

Full specification: reference/sse/regenerate-summary.md

# Preview
curl -N "https://vas-poc.vurbo.ai/api/v1/sse/regenerate/summary/550e8400-...?mode=custom&prompt=...&promptSlug=acme-v2&plainText=true" \
  -H "X-API-Key: vas_..."

# Persist
curl -N -X POST "https://vas-poc.vurbo.ai/api/v1/sse/regenerate/summary/550e8400-..." \
  -H "X-API-Key: vas_..." \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "custom",
    "prompt": "...",
    "promptSlug": "acme-v2",
    "plainText": true
  }'

The done event carries persisted: bool — the client can determine directly from the payload whether this run was saved, without inferring it from the HTTP method.
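
A client-side check of that flag might look like the sketch below. It assumes the stream uses standard SSE framing (event: / data: lines, blank line between events) with JSON payloads, as in the Node.js example of this guide; iter_sse_events and was_persisted are hypothetical helpers, and any payload field other than persisted is illustrative.

```python
import json
from typing import Any, Dict, Iterator, Tuple


def iter_sse_events(raw: str) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, payload) pairs from SSE text whose data lines hold JSON."""
    for block in raw.split("\n\n"):
        event_name, data_lines = "message", []
        for line in block.splitlines():
            if line.startswith("event:"):
                event_name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            yield event_name, json.loads("\n".join(data_lines))


def was_persisted(raw: str) -> bool:
    """Return the persisted flag of the first done event, or False if there is none."""
    for name, payload in iter_sse_events(raw):
        if name == "done":
            return bool(payload.get("persisted"))
    return False
```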


Transcript Record Fields

After the summary completes, the following fields are written to the top-level of the transcript record (not nested under the summary object):

| Field | When Present | Description |
| --- | --- | --- |
| summary_mode | Always | "builtin" / "custom" |
| summary_template | Always | Effective slug: builtin → built-in slug; custom → your slug |
| summary_plain_text | Always | bool |
| summary_prompt_snapshot | custom mode only | The prompt content you passed in verbatim (forcibly snapshotted; the sole basis for reconstruction) |
| summary_fallback_level | Only when fallback is triggered | Value is 2 or 3, indicating which content-filtering fallback path this summary actually took. Omitted when standard mode succeeds directly |
| summary_dropped_segments | Only when summary_fallback_level=3 | The indices of the omitted transcript segments (integer array in original order) |

The init_summary event of GET /api/v1/sse/history/transcribe/{taskId} also carries the above fields (prompt_snapshot only in custom mode; fallback_level / dropped_segments only when fallback is triggered).

Roles of the Two Fields

  • summary_prompt_snapshot = your intent (the original prompt content)
  • summary_fallback_level = the actual execution path (standard / neutral / segment omission)

The two fields are complementary: the snapshot can be used for auditing and reconstruction; fallback_level can be used to inform the user of what actually happened during that summary.
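
To illustrate how the two fields combine, the sketch below turns a transcript record into a one-line audit entry. The helper names and the output format are hypothetical; the level names follow the "standard / neutral / segment omission" wording above, with a missing or null level treated as standard.

```python
from typing import Any, Dict

# Level names follow "standard / neutral / segment omission" in this guide.
_PATHS = {1: "standard", 2: "neutral", 3: "segment omission"}


def execution_path(record: Dict[str, Any]) -> str:
    """Name the path a summary took, based on summary_fallback_level."""
    # The field is omitted (or null) when standard mode succeeded directly.
    level = record.get("summary_fallback_level") or 1
    return _PATHS.get(level, "unknown")


def audit_line(record: Dict[str, Any]) -> str:
    """Summarize intent (mode, slug, snapshot) and outcome (path) in one line."""
    has_snapshot = "summary_prompt_snapshot" in record
    return (
        f"mode={record.get('summary_mode')} "
        f"slug={record.get('summary_template')} "
        f"path={execution_path(record)} "
        f"snapshot={'yes' if has_snapshot else 'no'}"
    )
```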


Handling Sensitive Words and Profanity

VAS handles sensitive words (profanity, emotional or sensitive content) via three different paths depending on the "source."

The API layer does not proactively reject requests containing sensitive words; they are handled at runtime by downgrading or masking. To block them beforehand, perform your own keyword checks in the frontend UI (see Frontend Pre-Validation (Optional)).

Quick Reference

| Source | Trigger Mechanism | Is a Summary Generated? | How the Client Is Informed |
| --- | --- | --- | --- |
| The customer prompt itself contains sensitive words | Automatically enters neutral mode | Yes | summary_done.summary_fallback_level === 2 |
| Transcript text (STT layer) | options.profanity_handling masks / removes | Yes | The transcript text is already processed; no additional notification |
| Transcript text (summary layer) | Automatically omits the triggering segments | Yes | summary_done.summary_fallback_level === 3 |
| Multiple paths all fail | summary_error | No | error_code: llm_content_filtered |

Source 1: Customer prompt contains sensitive words → neutral mode

When the system detects that the customer prompt content triggers content filtering:

Behavior:

  • Automatically generates the summary using neutral instructions instead; does not reject, does not error
  • The customer's original prompt is still stored as a summary_prompt_snapshot for auditing
  • The summary_done event carries summary_fallback_level: 2 to notify the client

Suggested UI message: "Your custom instructions contain terms that cannot be processed due to filtering; the summary was generated in neutral mode."

Source 2: Transcript text (STT-layer masking)

The options.profanity_handling field of the WebSocket start action controls how profanity in the speech recognition output is handled:

| Value | Behavior |
| --- | --- |
| mask (default) | Profanity is masked with *** |
| removed | Profanity is removed from the transcript |
| raw | The original text is kept unprocessed |

See the options field in WebSocket - Voice Translation for details.

This option only affects STT output (the transcript text). It has nothing to do with sensitive words in the customer prompt.

Source 3: Transcript text (summary-layer segment omission) → segment omission mode

If the STT-layer profanity_handling is set to raw (or filtering is still triggered after masking), the system automatically omits the segments that trigger filtering during summary generation:

Behavior:

  • The summary_done event carries summary_fallback_level: 3
  • summary_dropped_segments lists the indices of the omitted transcript segments (integer array in original order)
  • The client can use this to inform the user which segments were actually omitted

Suggested UI message: "The transcript contains N segments that cannot be processed; the summary was generated after omitting the related content." (N = summary_dropped_segments.length)
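
To show the user which segments were left out, the indices can be mapped back to the client's own copy of the transcript. This sketch assumes the indices are 0-based positions into a list of segments the client already holds in original order; this guide does not state the index base, so verify it before relying on this. omitted_segments is a hypothetical helper.

```python
from typing import List, Optional, Sequence


def omitted_segments(segments: Sequence[str], dropped: Optional[Sequence[int]]) -> List[str]:
    """Return the text of the segments listed in summary_dropped_segments."""
    if not dropped:
        # The field is absent or null unless summary_fallback_level is 3.
        return []
    # Assumption: indices are 0-based. Out-of-range indices are skipped, not raised.
    return [segments[i] for i in dropped if 0 <= i < len(segments)]
```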

Total Failure → llm_content_filtered

When the customer prompt and the transcript both contain excessive sensitive content, and automatic downgrading still cannot produce a valid summary:

Behavior:

  • Triggers the summary_error event with error_code: llm_content_filtered
  • The summary will not be saved

Suggested UI message: "A summary cannot be generated for this content (restricted by content-filtering rules)."

Billing Impact

When a fallback is triggered, the system may need to call the LLM service multiple times, and all calls are recorded in usage_logs for billing. We recommend treating "summaries that trigger a fallback" as normal but higher token-cost events.

Specification Coverage

| Path | Fallback Integration Status |
| --- | --- |
| WebSocket realtime summary (auto-generated after recording ends) | ✅ Since V1.5.5 |
| File import summary | ✅ Since V1.5.5 |
| SSE regenerate/summary endpoint | ⚠️ Not yet integrated: still returns llm_content_filtered directly when content filtering is triggered. If you need automatic downgrading, we recommend using the POST /api/v1/summary REST endpoint instead |

Corresponding Client Implementation

function renderSummary(summary, fallbackLevel, droppedSegments) {
  switch (fallbackLevel) {
    case 2:
      return summary + '\n\n(Your custom instructions contained filtered terms; the summary was generated in neutral mode)';
    case 3: {
      // summary_dropped_segments may be null or missing; fall back to an empty list.
      const count = (droppedSegments ?? []).length;
      return summary + `\n\n(The transcript contained ${count} segments that could not be processed; the summary was generated after omitting the related content)`;
    }
    default:
      // undefined, null, or 1 (standard mode), plus any level this client does not recognize.
      return summary;
  }
}

Frontend Pre-Validation (Optional)

If you want to block prompts containing sensitive words before sending the request (saving the extra token billing from fallbacks), you can add keyword checks to the frontend UI.

VAS does not provide an API-layer prompt pre-check endpoint — whether to perform pre-validation and which keyword list to use is entirely up to the client.
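
A pre-check can be as simple as a case-insensitive substring match against your own list. The sketch below is shown in Python, and the same logic ports directly to browser JavaScript; find_blocked_terms and the term list are hypothetical, and a substring match will produce false positives that a production check would need to handle.

```python
from typing import Iterable, List


def find_blocked_terms(prompt: str, blocked_terms: Iterable[str]) -> List[str]:
    """Return the blocked terms that occur in the prompt, ignoring case."""
    lowered = prompt.casefold()
    # Empty terms are skipped because "" is a substring of every string.
    return [term for term in blocked_terms if term and term.casefold() in lowered]
```

If the returned list is non-empty, the UI can ask the user to rephrase before any request is sent.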


Security and Limits

Built-in Safety Guards

In custom mode, the system automatically adds safety guards to the customer prompt, including:

  • Content neutralization guidance: Instructs the LLM to summarize the intent of any colloquial, emotional, or sensitive terms in the source text using neutral, objective language, avoiding verbatim quotation. The goal is to reduce the chance that standard mode triggers content filtering.
  • Instruction injection protection: Prevents instructions in the customer prompt from accidentally overriding system rules.

The safety guards are not exposed for client configuration; the customer's original prompt is still stored in the summary_prompt_snapshot field for auditing.

⚠️ The safety guards are not foolproof. Do not splice untrusted end-user input directly into the prompt; managing prompt injection risk is your own responsibility.
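
One way to keep untrusted text out of the prompt is to let end users choose among prompts your team has written, rather than type their own. A minimal sketch of that idea; PROMPT_CATALOGUE, its entries, and summary_fields_for are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical catalogue: prompt text is authored and reviewed by your team only.
PROMPT_CATALOGUE: Dict[str, Tuple[str, str]] = {
    "meeting": (
        "acme-meeting-v1",
        "You are a meeting analysis assistant. Extract decisions and action items. Output JSON.",
    ),
    "clinic": (
        "clinic-acme-v3",
        "You are a medical summary assistant. Extract the chief complaint and history. Output JSON.",
    ),
}


def summary_fields_for(choice: str) -> Dict[str, str]:
    """Map a user's menu choice to custom-mode fields; user text never enters the prompt."""
    try:
        slug, prompt = PROMPT_CATALOGUE[choice]
    except KeyError:
        raise ValueError(f"unknown prompt choice: {choice!r}") from None
    return {"mode": "custom", "prompt": prompt, "prompt_slug": slug}
```

The user-supplied value is only ever used as a dictionary key, so an unexpected string fails the lookup instead of reaching the LLM.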

Character and Length Limits

| Field | Limit |
| --- | --- |
| prompt / summary_prompt | ≤ 2000 characters |
| prompt_slug / summary_prompt_slug | ≤ 64 characters, Unicode, no control characters (\n / \r / \t / \0, etc.) |
| content (the transcript input for REST POST /api/v1/summary) | ≤ 100,000 characters |
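
The limits in the table above can be mirrored in a client-side check. This is a sketch under assumptions: check_limits and content_within_limit are hypothetical helpers, len() counts Unicode code points (the server may count differently), and "control characters" is interpreted here as Unicode category Cc.

```python
import unicodedata
from typing import Optional

MAX_PROMPT_CHARS = 2000
MAX_SLUG_CHARS = 64
MAX_CONTENT_CHARS = 100_000


def check_limits(prompt: str, prompt_slug: str) -> Optional[str]:
    """Return the error code a custom-mode request would likely trigger, or None."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return "summary_prompt_too_long"
    if len(prompt_slug) > MAX_SLUG_CHARS:
        return "summary_prompt_slug_too_long"
    # Category "Cc" covers \n, \r, \t, \0 and the other control characters.
    if any(unicodedata.category(ch) == "Cc" for ch in prompt_slug):
        return "summary_prompt_slug_invalid"
    return None


def content_within_limit(content: str) -> bool:
    """Check the transcript length for REST POST /api/v1/summary."""
    return len(content) <= MAX_CONTENT_CHARS
```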

Cross-Tenant Isolation

The customer prompt and summary results are fully isolated across tenants (session-scoped, with no memory persistence).

Server Log

The VAS server log does not record the prompt or the full transcript text (it only logs the length and slug). LLM error messages are sanitized so that the raw error is not exposed to the client; only the provider is indicated.


Complete Examples

Node.js — REST POST /api/v1/summary (custom mode)

const response = await fetch('https://vas-poc.vurbo.ai/api/v1/summary', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    content: transcriptText,
    mode: 'custom',
    prompt: `你是會議分析助手。請從逐字稿萃取:
1. 主要決議事項(每項一句話)
2. 待辦工作(含負責人、期限)
3. 風險與阻礙
輸出格式:JSON。請以繁體中文輸出。`,
    prompt_slug: 'acme-meeting-v2',
    language: 'zh-TW',
    plain_text: true,
  }),
});

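const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact when they span two chunks.
  buffer += decoder.decode(value, { stream: true });

  // SSE events are separated by a blank line. The last piece may be incomplete,
  // so keep it in the buffer until a later read completes it. This also ensures
  // each event is handled exactly once.
  const events = buffer.split('\n\n');
  buffer = events.pop();

  for (const event of events) {
    if (event.startsWith('event: chunk')) {
      const data = JSON.parse(event.split('\ndata: ')[1]);
      process.stdout.write(data.content);
    } else if (event.startsWith('event: done')) {
      const data = JSON.parse(event.split('\ndata: ')[1]);
      console.log('\n\n--- Done ---');
      console.log(`tokens: input=${data.tokens_used.input}, output=${data.tokens_used.output}`);
      console.log(`prompt_snapshot saved: ${data.prompt_snapshot ? 'yes' : 'no'}`);
    }
  }
}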

Python — SSE Regenerate Summary (custom mode: preview then persist)

import requests

BASE = "https://vas-poc.vurbo.ai/api/v1/sse/regenerate/summary"
TASK_ID = "550e8400-e29b-41d4-a716-446655440000"
HEADERS = {"X-API-Key": "vas_..."}

# 1. Preview first with GET
preview_params = {
    "mode": "custom",
    "prompt": "你是醫療摘要助手...",
    "promptSlug": "clinic-acme-v3",
    "plainText": "true",
}
with requests.get(f"{BASE}/{TASK_ID}", params=preview_params, headers=HEADERS, stream=True) as r:
    for line in r.iter_lines():
        if line.startswith(b"data: "):
            print(line[6:].decode())

# 2. Persist with POST after customer confirmation
persist_body = {
    "mode": "custom",
    "prompt": "你是醫療摘要助手...",
    "promptSlug": "clinic-acme-v3",
    "plainText": True,
}
with requests.post(f"{BASE}/{TASK_ID}", json=persist_body, headers=HEADERS, stream=True) as r:
    for line in r.iter_lines():
        if line.startswith(b"data: "):
            print(line[6:].decode())

WebSocket — Live Recording custom Summary

ws.send(JSON.stringify({
  type: 'voice-translation',
  data: {
    action: 'start',
    transcription_languages: ['zh-TW'],
    translation_languages: [],
    type: 'transcribe',
    audio_format: 'pcm',
    summary_mode: 'custom',
    summary_prompt: '你是法律會議助手。請輸出:1) 爭點 2) 立場 3) 結論。',
    summary_prompt_slug: 'legal-firm-acme-v1',
    summary_language: 'zh-TW',
    summary_plain_text: true,
  },
}));

ws.addEventListener('message', (evt) => {
  const msg = JSON.parse(evt.data);
  if (msg.type === 'summary_done') {
    console.log('Summary done', msg.data);
    if (msg.data.summary_fallback_level === 2) {
      showBanner('Your custom instructions contained filtered terms; the summary was generated in neutral mode');
    } else if (msg.data.summary_fallback_level === 3) {
      showBanner(`The transcript contained ${msg.data.summary_dropped_segments.length} segments that could not be processed and were omitted`);
    }
  } else if (msg.type === 'summary_error') {
    console.error('Summary failed', msg.data.error_code, msg.data.message);
  }
});

Error Codes

| Error Code | HTTP | Trigger Condition |
| --- | --- | --- |
| summary_invalid_mode | 422 (SSE) / 400 (others) | mode is not builtin / custom |
| summary_mode_field_mismatch | 422 / 400 | The mode and field combination do not match (required field missing / disallowed field provided) |
| summary_prompt_too_long | 422 / 400 | prompt exceeds 2000 characters |
| summary_prompt_slug_too_long | 422 / 400 | prompt_slug exceeds 64 characters |
| summary_prompt_slug_invalid | 422 / 400 | prompt_slug contains control characters (\n / \r / \t / \0, etc.) |
| template_not_found | 404 | (builtin mode) The template for the specified slug does not exist or has been disabled |
| llm_content_filtered | 400 | Blocked by content filtering (the customer prompt or transcript contains sensitive words), and the fallback chain failed entirely or the endpoint does not integrate fallback |
| summary_failed | 500 | Summary generation failed |
| summary_timeout | 504 | Generation timed out |

For a complete list of error codes, see Error Code Reference.
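
A client may want to group these codes by the action they call for. The grouping below is a sketch and an assumption, not VAS guidance: it treats the validation codes and template_not_found as request problems, llm_content_filtered as a content problem, and the 500 / 504 codes as candidates for a retry. classify_error and the category names are hypothetical.

```python
# Codes that indicate the request itself must be corrected before resending.
REQUEST_ERRORS = {
    "summary_invalid_mode",
    "summary_mode_field_mismatch",
    "summary_prompt_too_long",
    "summary_prompt_slug_too_long",
    "summary_prompt_slug_invalid",
    "template_not_found",
}
# Assumption: server-side failures and timeouts may succeed on a later attempt.
RETRY_CANDIDATES = {"summary_failed", "summary_timeout"}


def classify_error(error_code: str) -> str:
    """Group an error code into 'fix_request', 'content_blocked', 'retry', or 'unknown'."""
    if error_code in REQUEST_ERRORS:
        return "fix_request"
    if error_code == "llm_content_filtered":
        return "content_blocked"
    if error_code in RETRY_CANDIDATES:
        return "retry"
    return "unknown"
```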


Related Documents

| Document | Description |
| --- | --- |
| POST /api/v1/summary | Full specification for the REST summary endpoint |
| Summary Templates API | Built-in template list and template lookup |
| SSE - Regenerate Summary | Specification for the GET preview / POST persist endpoints |
| WebSocket - Voice Translation | The summary_* fields of the start action |
| SSE - History | The mode / template / prompt_snapshot fields of the init_summary event |
| Error Code Reference | Complete list of error codes |
| Changelog V1.5.5 | The version that introduced the builtin/custom mutual-exclusion spec and automatic content-filtering downgrade |

Version: V1.5.7 Last Updated: 2026-05-20

Copyright © 2026