Summary Customization
Table of Contents
- Overview
- Prerequisites
- Two Modes
- Three Entry Points
- Transcript Record Fields
- Handling Sensitive Words and Profanity
- Security and Limits
- Complete Examples
- Error Codes
- Related Documents
Overview
VAS summary generation supports two mutually exclusive modes:
- builtin: Applies a VAS built-in template (scenarios such as meetings, interviews, courses, and medical consultations). You only need to specify a template slug.
- custom: You provide a complete prompt that replaces the built-in template rules. The system still automatically handles output language and plain-text post-processing.
Across the three entry points (REST / WebSocket / SSE), both modes share the same request field semantics; the only difference is the field naming prefix.
When to Use Each Mode
| Mode | Use Case |
|---|---|
| builtin | General meetings, interviews, courses, medical consultations, and other scenarios that can directly use a built-in template |
| custom | You have your own prompt rules (specific output fields, custom formats, domain-specific structures, etc.) and need to fully customize the summary generation behavior |
Mutual Exclusion Rules
| Mode | template | prompt / prompt_slug |
|---|---|---|
| builtin | Required | Not allowed |
| custom | Not allowed | Required (both are required) |
Violating the mutual exclusion rules returns summary_mode_field_mismatch (REST 400 / SSE 422).
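The mutual exclusion rules can be checked on the client before a request is sent. The following is a minimal sketch, not the server's actual validation: the function name check_mode_fields is hypothetical, it uses the REST field names, and it assumes that an empty string counts as "not provided".

```python
from typing import Any, Mapping, Optional


def check_mode_fields(body: Mapping[str, Any]) -> Optional[str]:
    """Return the error code the rules above imply, or None if the body is consistent."""
    mode = body.get("mode")
    # Assumption: empty or missing values are both treated as "not provided".
    has_template = bool(body.get("template"))
    has_prompt = bool(body.get("prompt"))
    has_slug = bool(body.get("prompt_slug"))

    if mode == "builtin":
        # template is required; prompt and prompt_slug are not allowed.
        if not has_template or has_prompt or has_slug:
            return "summary_mode_field_mismatch"
        return None
    if mode == "custom":
        # prompt and prompt_slug are both required; template is not allowed.
        if has_template or not (has_prompt and has_slug):
            return "summary_mode_field_mismatch"
        return None
    return "summary_invalid_mode"
```

Running this check locally only saves a round trip; the server remains the authority on which combinations are accepted.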
Prerequisites
1. Obtain an API Key
See Authentication for details.
2. (builtin path) Browse Available Templates
Call GET /api/v1/summary-templates to retrieve the list of built-in templates (supports ?category=summary|medical|legal|all filtering).
To study how a specific template is designed, call GET /api/v1/summary-templates/{slug}, which returns the template content.
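As a rough illustration of the two template endpoints, the sketch below only builds the request URLs. The helper name templates_url is hypothetical, the host is the one used in this document's curl examples, and the response shape is not shown because this document does not specify it.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://vas-poc.vurbo.ai"  # host taken from the examples in this document
CATEGORIES = {"summary", "medical", "legal", "all"}


def templates_url(category: Optional[str] = None, slug: Optional[str] = None) -> str:
    """Build the URL for the template list, or for one template when slug is given."""
    root = f"{BASE_URL}/api/v1/summary-templates"
    if slug is not None:
        # Percent-encode the slug so it stays a single path segment.
        return f"{root}/{quote(slug, safe='')}"
    if category is None:
        return root
    if category not in CATEGORIES:
        raise ValueError(f"unsupported category: {category!r}")
    return f"{root}?{urlencode({'category': category})}"
```

The resulting URL can be passed to any HTTP client together with your API key; see Authentication for the header to use.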
3. (custom path) Design Your Own Prompt
When writing your prompt, you must include the following yourself:
- Output language requirement (e.g., "Please output in Traditional Chinese")
- Structure and field descriptions (which information to extract, section order)
- Output format requirement (JSON / bullet points / paragraphs)
In custom mode, your prompt replaces the built-in template rules, so you must include these elements in the prompt yourself.
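One way to make sure none of the three elements is forgotten is to assemble the prompt from named parts. This is only a sketch: build_prompt and its parameters are hypothetical, the wording of the generated prompt is an example, and the length check assumes the 2000-character limit is counted the same way as Python's len().

```python
from typing import Sequence

MAX_PROMPT_CHARS = 2000  # limit listed under Character and Length Limits


def build_prompt(role: str, fields: Sequence[str], output_format: str,
                 language_instruction: str) -> str:
    """Assemble a custom prompt containing structure, format, and language parts."""
    # Number the fields so the section order is explicit.
    numbered = "\n".join(f"{i}. {name}" for i, name in enumerate(fields, start=1))
    prompt = (
        f"{role}\n"
        f"Extract the following from the transcript, in this order:\n"
        f"{numbered}\n"
        f"Output format: {output_format}.\n"
        f"{language_instruction}"
    )
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt is {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}")
    return prompt
```

For example, build_prompt("You are a meeting analysis assistant.", ["Decisions", "Action items"], "JSON", "Please output in Traditional Chinese.") yields a prompt that states the fields, the format, and the output language.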
Two Modes
builtin mode
{
"mode": "builtin",
"template": "meeting",
"language": "zh-TW",
"plain_text": true
}
Effect:
- Generates the summary using the built-in template for the specified slug
- The template slug is written to the backend record
- The backend record is written with summary_mode: "builtin" and summary_template: <slug> (retrievable via the init_summary event)
custom mode
{
"mode": "custom",
"prompt": "你是皮膚科專科助手。請從逐字稿萃取:1) 主訴 2) Fitzpatrick 分型 3) 過敏原史。輸出 JSON。",
"prompt_slug": "skin-clinic-acme-v2",
"language": "zh-TW",
"plain_text": true
}
Effect:
- Does not query a built-in template; your prompt replaces the template rules
- The prompt_slug is written to the backend record (pass-through, for historical lookup)
- The original prompt text is always snapshotted into the summary_prompt_snapshot field (the sole basis for reconstruction, retrievable via the init_summary event)

We recommend including a version in prompt_slug (e.g., acme-v1 → acme-v2) so integrators can trace which prompt revision produced a summary.
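A versioned slug is easy to maintain with a small helper. The sketch below assumes a trailing -vN suffix as in the acme-v1 → acme-v2 example; the function name bump_slug_version is hypothetical, and VAS itself treats the slug as an opaque pass-through value.

```python
import re

MAX_SLUG_CHARS = 64  # limit listed under Character and Length Limits
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)-v(?P<num>\d+)$")


def bump_slug_version(slug: str) -> str:
    """Return the slug with its trailing -vN incremented, or with -v1 appended if absent."""
    match = _VERSION_SUFFIX.match(slug)
    if match is None:
        bumped = f"{slug}-v1"
    else:
        bumped = f"{match['base']}-v{int(match['num']) + 1}"
    if len(bumped) > MAX_SLUG_CHARS:
        raise ValueError(f"slug would exceed {MAX_SLUG_CHARS} characters")
    return bumped
```

Bumping the slug whenever the prompt text changes keeps each summary_prompt_snapshot paired with a distinct identifier.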
Three Entry Points
The three entry points share the same semantics but use different field naming prefixes:
| Entry Point | Field Prefix | Purpose |
|---|---|---|
| REST POST /api/v1/summary | (no prefix) mode / template / prompt / prompt_slug | Generate a summary for any transcript |
| WebSocket start action | summary_* | Automatically generated after a live recording ends |
| SSE regenerate/summary | (no prefix; SSE uses camelCase) mode / template / prompt / promptSlug | Regenerate a summary for an existing recording |
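The naming differences in the table can be expressed as a mechanical conversion from one canonical settings dictionary. This is a sketch under the assumption that the same rule (a summary_ prefix for WebSocket, camelCase for SSE) applies to every field; this document only demonstrates it for the fields in its examples, and the helper names are hypothetical.

```python
from typing import Any, Dict, Mapping


def to_camel(name: str) -> str:
    """Convert snake_case to camelCase, e.g. prompt_slug -> promptSlug."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def summary_fields(settings: Mapping[str, Any], entry_point: str) -> Dict[str, Any]:
    """Rename canonical (REST-style) summary settings for the given entry point."""
    if entry_point == "rest":
        return dict(settings)
    if entry_point == "websocket":
        # WebSocket start action: every field carries the summary_ prefix.
        return {f"summary_{key}": value for key, value in settings.items()}
    if entry_point == "sse":
        # SSE regenerate/summary: no prefix, camelCase names.
        return {to_camel(key): value for key, value in settings.items()}
    raise ValueError(f"unknown entry point: {entry_point!r}")
```

Keeping one canonical dictionary and converting at the edge avoids maintaining three hand-written copies of the same settings.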
REST POST /api/v1/summary
Generates a summary for any transcript (including external sources); the response is sent segment by segment as an SSE stream.
Full specification: reference/rest/summary.md
curl -N -X POST "https://vas-poc.vurbo.ai/api/v1/summary" \
-H "Authorization: Bearer vas_..." \
-H "Content-Type: application/json" \
-d '{
"content": "...transcript...",
"mode": "custom",
"prompt": "你是會議分析助手...",
"prompt_slug": "acme-meeting-v1",
"language": "zh-TW",
"plain_text": true
}'
⚠️ This endpoint authenticates with Authorization: Bearer <api_key>, which differs from the X-API-Key header used by other REST endpoints.
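Because the response arrives as an SSE stream, the client has to reassemble events that may be split across network chunks. The sketch below shows one way to do that in Python. It assumes the framing used by the Node.js example in this document (an event: line, a data: line with JSON, and a blank line between events, with \n line endings); the function name iter_sse_events is hypothetical.

```python
import codecs
import json
from typing import Any, Iterable, Iterator, Tuple


def iter_sse_events(chunks: Iterable[bytes]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, parsed_json_data) for each complete SSE event in a byte stream."""
    # The incremental decoder keeps multi-byte characters intact across chunk boundaries.
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        # Only text followed by a blank line is a complete event; the rest stays buffered.
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            event = "message"  # SSE default when no event: line is present
            data_lines = []
            for line in frame.split("\n"):
                if line.startswith("event:"):
                    event = line[len("event:"):].strip()
                elif line.startswith("data:"):
                    data_lines.append(line[len("data:"):].lstrip())
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
```

With the requests library, the chunks argument could be the iterator returned by iter_content(chunk_size=None) on a response opened with stream=True.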
WebSocket start action
The start action of a live recording carries the summary settings; after the session ends, the summary is generated automatically and you are notified via the summary_done event.
Full specification: reference/websocket/voice-translation.md
{
"type": "voice-translation",
"data": {
"action": "start",
"transcription_languages": ["zh-TW"],
"translation_languages": ["en-US"],
"type": "transcribe",
"audio_format": "pcm",
"summary_mode": "custom",
"summary_prompt": "你是會議分析助手...",
"summary_prompt_slug": "acme-meeting-v1",
"summary_language": "zh-TW",
"summary_plain_text": true
}
}
Summary completion event:
{
"type": "summary_done",
"data": {
"summary_mode": "custom",
"summary_template": "acme-meeting-v1",
"summary_plain_text": true,
"tokens_used": 890,
"summary_fallback_level": null,
"summary_dropped_segments": null
}
}
summary_template is the effective slug: in custom mode it returns your slug (i.e., the summary_prompt_slug you submitted).
SSE regenerate/summary
Regenerates a summary for an existing recording, split into two endpoints:
| Method | Purpose | Writes DB | Stores Transcript | Billed |
|---|---|---|---|---|
| GET /api/v1/sse/regenerate/summary/{taskId} | Preview (trial run, compare different prompt results) | ❌ | ❌ | ✅ |
| POST /api/v1/sse/regenerate/summary/{taskId} | Persist (officially saved) | ✅ | ✅ + bump revision | ✅ |
Full specification: reference/sse/regenerate-summary.md
# Preview
curl -N "https://vas-poc.vurbo.ai/api/v1/sse/regenerate/summary/550e8400-...?mode=custom&prompt=...&promptSlug=acme-v2&plainText=true" \
-H "X-API-Key: vas_..."
# Persist
curl -N -X POST "https://vas-poc.vurbo.ai/api/v1/sse/regenerate/summary/550e8400-..." \
-H "X-API-Key: vas_..." \
-H "Content-Type: application/json" \
-d '{
"mode": "custom",
"prompt": "...",
"promptSlug": "acme-v2",
"plainText": true
}'
The done event carries persisted: bool, so the client can determine directly from the payload whether this run was saved, without inferring it from the HTTP method.
Transcript Record Fields
After the summary completes, the following fields are written to the top-level of the transcript record (not nested under the summary object):
| Field | When Present | Description |
|---|---|---|
| summary_mode | Always | "builtin" / "custom" |
| summary_template | Always | Effective slug: builtin → built-in slug; custom → your slug |
| summary_plain_text | Always | bool |
| summary_prompt_snapshot | custom mode only | The prompt content you passed in, stored verbatim (always snapshotted; the sole basis for reconstruction) |
| summary_fallback_level | Only when fallback is triggered | Value is 2 or 3, indicating which content-filtering fallback path this summary actually took. Omitted when standard mode succeeds directly |
| summary_dropped_segments | Only when summary_fallback_level=3 | The indices of the omitted transcript segments (integer array in original order) |
The init_summary event of GET /api/v1/sse/history/transcribe/{taskId} also carries the above fields (prompt_snapshot only in custom mode; fallback_level / dropped_segments only when fallback is triggered).
Roles of the Two Fields
- summary_prompt_snapshot = your intent (the original prompt content)
- summary_fallback_level = the actual execution path (standard / neutral / segment omission)
The two fields are complementary: the snapshot can be used for auditing and reconstruction; fallback_level can be used to inform the user of what actually happened during that summary.
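To illustrate how the two fields complement each other, the sketch below turns a transcript record into a small audit view. The function name and the returned keys are hypothetical; the record is assumed to be a dictionary with the top-level fields from the table above, and level 1 is treated as the standard path, as the client sample in this document does.

```python
from typing import Any, Dict, Mapping

# 2 and 3 are the documented fallback values; a missing field means standard mode.
_PATHS = {None: "standard", 1: "standard", 2: "neutral", 3: "segment omission"}


def describe_summary_record(record: Mapping[str, Any]) -> Dict[str, Any]:
    """Summarize what was requested (intent) and what actually ran (execution path)."""
    level = record.get("summary_fallback_level")
    return {
        "mode": record["summary_mode"],
        "slug": record["summary_template"],
        # Intent: present only in custom mode.
        "prompt_snapshot": record.get("summary_prompt_snapshot"),
        # Actual execution path derived from the fallback level.
        "path": _PATHS.get(level, "unknown"),
        "dropped_segments": record.get("summary_dropped_segments") or [],
    }
```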
Handling Sensitive Words and Profanity
VAS handles sensitive words (profanity, emotional or sensitive content) via three different paths depending on the "source."
The API layer does not proactively reject requests containing sensitive words; handling happens at runtime through downgrading or masking. To block them beforehand, perform your own keyword checks in the frontend UI (see Frontend Pre-Validation (Optional)).
Quick Reference
| Source | Trigger Mechanism | Is a Summary Generated? | How the Client Is Informed |
|---|---|---|---|
| The customer prompt itself contains sensitive words | Automatically enters neutral mode | ✅ | summary_done.summary_fallback_level === 2 |
| Transcript text (STT layer) | options.profanity_handling masks / removes | ✅ | The transcript text is already processed; no additional notification |
| Transcript text (summary layer) | Automatically omits the triggering segments | ✅ | summary_done.summary_fallback_level === 3 |
| Multiple paths all fail | summary_error | ❌ | error_code: llm_content_filtered |
Source 1: Customer prompt contains sensitive words → neutral mode
When the system detects that the customer prompt content triggers content filtering:
| Behavior |
|---|
| Automatically generates the summary using neutral instructions instead — does not reject, does not error |
| The customer's original prompt is still stored as a summary_prompt_snapshot for auditing |
| The summary_done event carries summary_fallback_level: 2 to notify the client |
Suggested UI message: "Your custom instructions contain terms that cannot be processed due to filtering; the summary was generated in neutral mode."
Source 2: Transcript text (STT-layer masking)
The options.profanity_handling field of the WebSocket start action controls how profanity in the speech recognition output is handled:
| Value | Behavior |
|---|---|
| mask (default) | Profanity is masked with *** |
| removed | Profanity is removed from the transcript |
| raw | The original text is kept unprocessed |
See the options field in WebSocket - Voice Translation for details.
This option only affects STT output (the transcript text). It has nothing to do with sensitive words in the customer prompt.
Source 3: Transcript text (summary-layer segment omission) → segment omission mode
If the STT-layer profanity_handling is set to raw (or filtering is still triggered after masking), the system automatically omits the segments that trigger filtering during summary generation:
| Behavior |
|---|
| The summary_done event carries summary_fallback_level: 3 |
| summary_dropped_segments lists the indices of the omitted transcript segments (integer array in original order) |
| The client can use this to inform the user which segments were actually omitted |
Suggested UI message: "The transcript contains N segments that cannot be processed; the summary was generated after omitting the related content." (N = summary_dropped_segments.length)
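If the client keeps its own copy of the transcript segments, the dropped indices can be resolved back to text for display. The sketch below assumes the indices are zero-based positions in the client's segment list, which this document does not state; verify the indexing against real data before relying on it. The function name omitted_segments is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def omitted_segments(segments: Sequence[str],
                     dropped: Optional[Sequence[int]]) -> List[Tuple[int, str]]:
    """Pair each dropped index with its transcript segment, skipping out-of-range indices."""
    result = []
    # dropped is None when summary_dropped_segments is absent from the event.
    for index in dropped or []:
        # Assumption: indices are zero-based positions in the segments list.
        if 0 <= index < len(segments):
            result.append((index, segments[index]))
    return result
```

The length of the returned list can be used as N in the suggested UI message, and the texts can be shown in a details view.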
Total Failure → llm_content_filtered
When the customer prompt and the transcript both contain excessive sensitive content, and automatic downgrading still cannot produce a valid summary:
| Behavior |
|---|
| Triggers the summary_error event with error_code: llm_content_filtered |
| The summary will not be saved |
Suggested UI message: "A summary cannot be generated for this content (restricted by content-filtering rules)."
Billing Impact
When a fallback is triggered, the system may need to call the LLM service multiple times, and all calls are recorded in usage_logs for billing. We recommend treating "summaries that trigger a fallback" as normal but higher token-cost events.
Specification Coverage
| Path | Fallback Integration Status |
|---|---|
| WebSocket realtime summary (auto-generated after recording ends) | ✅ Since V1.5.5 |
| File import summary | ✅ Since V1.5.5 |
| SSE regenerate/summary endpoint | ⚠️ Not yet integrated: still returns llm_content_filtered directly when content filtering is triggered. If you need automatic downgrading, we recommend using the POST /api/v1/summary REST endpoint instead |
Corresponding Client Implementation
function renderSummary(summary, fallbackLevel, droppedSegments) {
  switch (fallbackLevel) {
    case 2:
      return summary + '\n\n(Your custom instructions contained filtered terms; the summary was generated in neutral mode)';
    case 3: {
      // summary_dropped_segments may be missing; fall back to an empty list
      const count = (droppedSegments ?? []).length;
      return summary + `\n\n(The transcript contained ${count} segments that could not be processed; the summary was generated after omitting the related content)`;
    }
    default:
      // undefined, null, or 1: standard mode succeeded directly.
      // Unrecognized levels also return the plain summary instead of undefined.
      return summary;
  }
}
Frontend Pre-Validation (Optional)
If you want to block prompts containing sensitive words before sending the request (saving the extra token billing from fallbacks), you can add keyword checks to the frontend UI.
VAS does not provide an API-layer prompt pre-check endpoint — whether to perform pre-validation and which keyword list to use is entirely up to the client.
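A client-side pre-check can be as simple as a case-insensitive lookup against your own list. The sketch below is one possible shape: find_blocked_terms is a hypothetical name, the term list is supplied by you, and plain substring matching is an assumption that can produce false positives (for example, a blocked term appearing inside a longer harmless word).

```python
from typing import Iterable, List


def find_blocked_terms(prompt: str, blocked_terms: Iterable[str]) -> List[str]:
    """Return the blocked terms found in the prompt (case-insensitive), sorted."""
    lowered = prompt.casefold()
    # Empty terms are skipped because "" is a substring of every string.
    return sorted({term for term in blocked_terms if term and term.casefold() in lowered})
```

If the returned list is non-empty, the UI can ask the user to revise the prompt before any request is sent.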
Security and Limits
Built-in Safety Guards
In custom mode, the system automatically adds safety guards to the customer prompt, including:
- Content neutralization guidance: Instructs the LLM to summarize the intent of any colloquial, emotional, or sensitive terms in the source text using neutral, objective language, avoiding verbatim quotation. The goal is to reduce the chance that standard mode triggers content filtering.
- Instruction injection protection: Prevents instructions in the customer prompt from accidentally overriding system rules.
The safety guards are not exposed for client configuration; the customer's original prompt is still stored in the summary_prompt_snapshot field for auditing.
⚠️ The safety guards are not foolproof. You should not splice untrusted end-user input directly into prompt; prompt injection risk is your own responsibility.
Character and Length Limits
| Field | Limit |
|---|---|
| prompt / summary_prompt | ≤ 2000 characters |
| prompt_slug / summary_prompt_slug | ≤ 64 characters, Unicode, no control characters (\n / \r / \t / \0, etc.) |
| content (the transcript input for REST POST /api/v1/summary) | ≤ 100,000 characters |
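These limits can be checked locally before sending a request. The sketch below makes several assumptions: characters are counted as Unicode code points (Python's len()), "control characters" means the Unicode Cc category, and the checks run in the order shown. The function name check_limits is hypothetical; the returned strings are the error codes listed in this document.

```python
import unicodedata
from typing import Optional

MAX_PROMPT_CHARS = 2000
MAX_SLUG_CHARS = 64


def check_limits(prompt: str, prompt_slug: str) -> Optional[str]:
    """Return the matching error code for a limit violation, or None if within limits."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return "summary_prompt_too_long"
    if len(prompt_slug) > MAX_SLUG_CHARS:
        return "summary_prompt_slug_too_long"
    # Category Cc covers \n, \r, \t, \0 and the other C0/C1 control characters.
    if any(unicodedata.category(ch) == "Cc" for ch in prompt_slug):
        return "summary_prompt_slug_invalid"
    return None
```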
Cross-Tenant Isolation
The customer prompt and summary results are fully isolated across tenants (session-scoped, with no memory persistence).
Server Log
The VAS server log does not record the prompt or the full transcript text (it only logs the length and slug). LLM error messages are sanitized so that the raw error is not exposed to the client; only the provider is indicated.
Complete Examples
Node.js — REST POST /api/v1/summary (custom mode)
const response = await fetch('https://vas-poc.vurbo.ai/api/v1/summary', {
method: 'POST',
headers: {
'Authorization': `Bearer ${API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
content: transcriptText,
mode: 'custom',
prompt: `你是會議分析助手。請從逐字稿萃取:
1. 主要決議事項(每項一句話)
2. 待辦工作(含負責人、期限)
3. 風險與阻礙
輸出格式:JSON。請以繁體中文輸出。`,
prompt_slug: 'acme-meeting-v2',
language: 'zh-TW',
plain_text: true,
}),
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });
  // Only events followed by a blank line are complete; keep the trailing
  // partial event in the buffer so nothing is parsed twice or too early
  const events = buffer.split('\n\n');
  buffer = events.pop();
  for (const event of events) {
    if (event.startsWith('event: chunk')) {
      const data = JSON.parse(event.split('\ndata: ')[1]);
      process.stdout.write(data.content);
    } else if (event.startsWith('event: done')) {
      const data = JSON.parse(event.split('\ndata: ')[1]);
      console.log('\n\n--- Done ---');
      console.log(`tokens: input=${data.tokens_used.input}, output=${data.tokens_used.output}`);
      console.log(`prompt_snapshot saved: ${data.prompt_snapshot ? 'yes' : 'no'}`);
    }
  }
}
Python — SSE Regenerate Summary (custom mode: preview then persist)
import requests

BASE = "https://vas-poc.vurbo.ai/api/v1/sse/regenerate/summary"
TASK_ID = "550e8400-e29b-41d4-a716-446655440000"
HEADERS = {"X-API-Key": "vas_..."}

# 1. Preview first with GET
preview_params = {
    "mode": "custom",
    "prompt": "你是醫療摘要助手...",
    "promptSlug": "clinic-acme-v3",
    "plainText": "true",
}
with requests.get(f"{BASE}/{TASK_ID}", params=preview_params, headers=HEADERS, stream=True) as r:
    for line in r.iter_lines():
        if line.startswith(b"data: "):
            print(line[6:].decode())

# 2. Persist with POST after customer confirmation
persist_body = {
    "mode": "custom",
    "prompt": "你是醫療摘要助手...",
    "promptSlug": "clinic-acme-v3",
    "plainText": True,
}
with requests.post(f"{BASE}/{TASK_ID}", json=persist_body, headers=HEADERS, stream=True) as r:
    for line in r.iter_lines():
        if line.startswith(b"data: "):
            print(line[6:].decode())
WebSocket — Live Recording custom Summary
ws.send(JSON.stringify({
type: 'voice-translation',
data: {
action: 'start',
transcription_languages: ['zh-TW'],
translation_languages: [],
type: 'transcribe',
audio_format: 'pcm',
summary_mode: 'custom',
summary_prompt: '你是法律會議助手。請輸出:1) 爭點 2) 立場 3) 結論。',
summary_prompt_slug: 'legal-firm-acme-v1',
summary_language: 'zh-TW',
summary_plain_text: true,
},
}));
ws.addEventListener('message', (evt) => {
const msg = JSON.parse(evt.data);
if (msg.type === 'summary_done') {
console.log('Summary done', msg.data);
if (msg.data.summary_fallback_level === 2) {
showBanner('Your custom instructions contained filtered terms; the summary was generated in neutral mode');
} else if (msg.data.summary_fallback_level === 3) {
showBanner(`The transcript contained ${msg.data.summary_dropped_segments.length} segments that could not be processed and were omitted`);
}
} else if (msg.type === 'summary_error') {
console.error('Summary failed', msg.data.error_code, msg.data.message);
}
});
Error Codes
| Error Code | HTTP | Trigger Condition |
|---|---|---|
| summary_invalid_mode | 422 (SSE) / 400 (others) | mode is not builtin / custom |
| summary_mode_field_mismatch | 422 / 400 | The mode and field combination do not match (required field missing / disallowed field provided) |
| summary_prompt_too_long | 422 / 400 | prompt exceeds 2000 characters |
| summary_prompt_slug_too_long | 422 / 400 | prompt_slug exceeds 64 characters |
| summary_prompt_slug_invalid | 422 / 400 | prompt_slug contains control characters (\n / \r / \t / \0, etc.) |
| template_not_found | 404 | (builtin mode) The template for the specified slug does not exist or has been disabled |
| llm_content_filtered | 400 | Blocked by content filtering (the customer prompt or transcript contains sensitive words), and the fallback chain failed entirely or the endpoint does not integrate fallback |
| summary_failed | 500 | Summary generation failed |
| summary_timeout | 504 | Generation timed out |
For a complete list of error codes, see Error Code Reference.
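The status column can be encoded as a lookup for use in client tests or mocks. The sketch below reads "422 / 400" as "422 on SSE, 400 elsewhere", following the first row of the table; the function name expected_http_status and the entry-point labels are hypothetical.

```python
# Validation errors: 422 on the SSE entry point, 400 on the others.
_VALIDATION_CODES = {
    "summary_invalid_mode",
    "summary_mode_field_mismatch",
    "summary_prompt_too_long",
    "summary_prompt_slug_too_long",
    "summary_prompt_slug_invalid",
}

# Errors whose status does not depend on the entry point.
_FIXED_STATUS = {
    "template_not_found": 404,
    "llm_content_filtered": 400,
    "summary_failed": 500,
    "summary_timeout": 504,
}


def expected_http_status(error_code: str, entry_point: str) -> int:
    """Return the HTTP status the table lists for an error code on an entry point."""
    if error_code in _VALIDATION_CODES:
        return 422 if entry_point == "sse" else 400
    # Raises KeyError for codes that are not in the table above.
    return _FIXED_STATUS[error_code]
```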
Related Documents
| Document | Description |
|---|---|
| POST /api/v1/summary | Full specification for the REST summary endpoint |
| Summary Templates API | Built-in template list and template lookup |
| SSE - Regenerate Summary | Specification for the GET preview / POST persist endpoints |
| WebSocket - Voice Translation | The summary_* fields of the start action |
| SSE - History | The mode / template / prompt_snapshot fields of the init_summary event |
| Error Code Reference | Complete list of error codes |
| Changelog V1.5.5 | The version that introduced the builtin/custom mutual-exclusion spec and automatic content-filtering downgrade |
Version: V1.5.7 Last Updated: 2026-05-20