Speakers
Overview
The recording speaker editing API is used to manage speakers in multi-speaker conversation mode. It provides three operations: global rename, single-sentence reassignment, and speaker merge.
V1.4.1 Naming Unification: The rename and reassign endpoints are available under both the `tasks` and `recordings` paths, which are equivalent (same Controller, same UUID). The merge endpoint is available only under the `tasks` path.

| Recommended (since V1.4.1) | Deprecated (removed in V1.6.0) |
|---|---|
| PATCH /api/v1/tasks/{taskId}/speakers/rename | PATCH /api/v1/recordings/{recordingId}/speakers/rename |
| PATCH /api/v1/tasks/{taskId}/speakers/reassign | PATCH /api/v1/recordings/{recordingId}/speakers/reassign |
| PATCH /api/v1/tasks/{taskId}/speakers/merge | (none) |

Use the `tasks` path for new integrations; existing integrations may continue to use the `recordings` path until V1.6.0. `taskId` and `recordingId` are the same UUID.
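As a rough illustration of the path mapping above, the following sketch builds the recommended `tasks` URL and rewrites a deprecated `recordings` path to its equivalent. The helper names `speaker_endpoint` and `migrate_deprecated_path` are hypothetical and not part of the API; the default base URL is taken from the request examples in this document.

```python
from urllib.parse import quote

# Operations documented on this page.
SPEAKER_OPERATIONS = ("rename", "reassign", "merge")


def speaker_endpoint(task_id: str, operation: str,
                     base_url: str = "https://vas-poc.vurbo.ai") -> str:
    """Build the recommended /tasks/ URL for a speaker operation (hypothetical helper)."""
    if operation not in SPEAKER_OPERATIONS:
        raise ValueError(f"unknown speaker operation: {operation!r}")
    # Percent-encode the ID so it cannot alter the path structure.
    safe_id = quote(task_id, safe="")
    return f"{base_url.rstrip('/')}/api/v1/tasks/{safe_id}/speakers/{operation}"


def migrate_deprecated_path(path: str) -> str:
    """Rewrite a deprecated /recordings/.../speakers/... path to /tasks/...

    taskId and recordingId are the same UUID, so only the prefix changes.
    Paths that are not speaker endpoints are returned unchanged.
    """
    prefix = "/api/v1/recordings/"
    if path.startswith(prefix) and "/speakers/" in path:
        return "/api/v1/tasks/" + path[len(prefix):]
    return path
```

A one-time rewrite like this may be enough to move an existing integration off the deprecated alias before V1.6.0.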
PATCH /api/v1/tasks/{taskId}/speakers/rename
Deprecated alias: PATCH /api/v1/recordings/{recordingId}/speakers/rename (removed in V1.6.0)
Description
Renames a speaker across every sentence attributed to them in a recording, in a single call. This is useful for replacing system-generated speaker names (such as Guest-1) with real names.
Authentication
Header: X-API-Key (see Authentication)
Request Parameters
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| taskId | string | Yes | Task ID (UUID) |
Body Parameters (JSON)
| Parameter | Type | Required | Description |
|---|---|---|---|
| speaker_id | string | Yes | Original speaker ID (such as "Guest-1"); also accepts the current display label (speaker_label) for consecutive renaming; maximum 100 characters |
| new_label | string | Yes | New display label; maximum 100 characters, must not contain control characters (\x00-\x1F, \x7F) or line breaks (it will be written into the transcript blob, SSE events, and TXT/SRT/CSV exports) |
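The constraints in the table above can be checked on the client before sending a request. This is a minimal sketch, not the server's validation logic: the function name `validate_rename_body` is hypothetical, and it assumes "characters" means Unicode code points (the document does not say whether the limit is counted in code points or bytes).

```python
from __future__ import annotations

import re

MAX_FIELD_LENGTH = 100  # documented maximum for speaker_id and new_label
# Control characters \x00-\x1F and \x7F; this range includes \n and \r.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def validate_rename_body(speaker_id: str, new_label: str) -> list[str]:
    """Return a list of problems; an empty list means the body looks valid."""
    problems: list[str] = []
    if not speaker_id:
        problems.append("speaker_id is required")
    elif len(speaker_id) > MAX_FIELD_LENGTH:
        problems.append("speaker_id exceeds 100 characters")
    if not new_label:
        problems.append("new_label is required")
    else:
        if len(new_label) > MAX_FIELD_LENGTH:
            problems.append("new_label exceeds 100 characters")
        if _CONTROL_CHARS.search(new_label):
            problems.append("new_label must not contain control characters or line breaks")
    return problems
```

Rejecting such input locally avoids a round trip that would otherwise end in `validation_failed` (422).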
Request Example
curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/rename" \
-H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
-H "Content-Type: application/json" \
-d '{
"speaker_id": "Guest-1",
"new_label": "Manager Wang"
}'
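For integrations that do not shell out to curl, the same request can be sketched with Python's standard library. The function name `build_rename_request` is hypothetical; it only constructs the request object and does not send it, so the API key and task ID shown in the usage comment are placeholders.

```python
import json
import urllib.request


def build_rename_request(task_id: str, api_key: str, speaker_id: str, new_label: str,
                         base_url: str = "https://vas-poc.vurbo.ai") -> urllib.request.Request:
    """Build (but do not send) the PATCH request shown in the curl example."""
    body = json.dumps({"speaker_id": speaker_id, "new_label": new_label}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/tasks/{task_id}/speakers/rename",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="PATCH",
    )


# Sending the request performs a real network call, so it is left commented out:
# req = build_rename_request("rec_abc123", "<your API key>", "Guest-1", "Manager Wang")
# with urllib.request.urlopen(req) as resp:
#     payload = json.load(resp)
```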
Success Response
HTTP 200
{
"data": {
"speaker_id": "Guest-1",
"new_label": "Manager Wang",
"affected_sids": [1, 3, 5]
}
}
Response Field Descriptions
| Field | Type | Description |
|---|---|---|
| data.speaker_id | string | The resolved original speaker ID (even if the request sent a display label, the response is still the original ID) |
| data.new_label | string | New display label |
| data.affected_sids | array<int> | List of affected sentence SIDs |
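A client that caches the transcript can use `affected_sids` to update only the sentences that changed. The sketch below assumes a local cache shaped as a list of dictionaries with `sid`, `speaker_id`, and `speaker_label` keys; that shape and the name `apply_rename_response` are illustrative assumptions, not something the API prescribes.

```python
def apply_rename_response(sentences: list, response: dict) -> list:
    """Return a copy of the cached sentences with the new label applied.

    Only sentences whose sid appears in data.affected_sids are changed;
    speaker_id is left as-is because rename affects the display label only.
    """
    data = response["data"]
    affected = set(data["affected_sids"])
    return [
        {**s, "speaker_label": data["new_label"]} if s["sid"] in affected else s
        for s in sentences
    ]
```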
Specific Error Codes
| Error Code | HTTP Status | Description | Suggested Action |
|---|---|---|---|
| recording_not_found | 404 | Recording not found | Verify that taskId is correct |
| recording_unauthorized | 403 | Unauthorized to operate on this recording | Verify that the recording belongs to this user |
| validation_failed | 422 | Request validation failed | Verify that both speaker_id and new_label are provided, do not exceed 100 characters, and that new_label contains no control characters |
PATCH /api/v1/tasks/{taskId}/speakers/reassign
Deprecated alias: PATCH /api/v1/recordings/{recordingId}/speakers/reassign (removed in V1.6.0)
Description
Reassigns the speaker of a specific sentence to another speaker. This is useful for correcting errors produced by automatic recognition (Speaker Diarization).
Authentication
Header: X-API-Key (see Authentication)
Request Parameters
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| taskId | string | Yes | Task ID (UUID) |
Body Parameters (JSON)
| Parameter | Type | Required | Description |
|---|---|---|---|
| sid | integer | Yes | Sentence ID |
| target_speaker_id | string | Yes | Target speaker's original ID (taken from init_sentence.speaker_id; reassign does not accept display labels, you must send the original ID); maximum 100 characters |
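Because reassign accepts only original IDs, a client that shows display labels to users needs to map a label back to its ID before calling the endpoint. The sketch below assumes the client keeps a local alias map of original ID to display label and a set of known original IDs; both structures and the helper names are hypothetical.

```python
def to_original_speaker_id(value: str, aliases: dict, known_ids: set) -> str:
    """Resolve a user-facing value to an original speaker ID.

    aliases maps original ID -> display label (assumed local structure).
    Raises ValueError when the value is unknown or the label is ambiguous.
    """
    if value in known_ids:
        return value
    matches = [speaker_id for speaker_id, label in aliases.items() if label == value]
    if len(matches) == 1:
        return matches[0]
    raise ValueError(f"cannot resolve {value!r} to a single original speaker ID")


def build_reassign_body(sid: int, target: str, aliases: dict, known_ids: set) -> dict:
    """Build the JSON body for reassign, always sending the original ID."""
    return {"sid": sid, "target_speaker_id": to_original_speaker_id(target, aliases, known_ids)}
```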
Request Example
curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/reassign" \
-H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
-H "Content-Type: application/json" \
-d '{
"sid": 3,
"target_speaker_id": "Guest-2"
}'
Success Response
HTTP 200
{
"data": {
"sid": 3,
"old_speaker_id": "Guest-1",
"new_speaker_id": "Guest-2",
"new_speaker_label": "Director Li"
}
}
Response Field Descriptions
| Field | Type | Description |
|---|---|---|
| data.sid | integer | The modified sentence ID |
| data.old_speaker_id | string | Original ID of the speaker the sentence was assigned to before reassignment |
| data.new_speaker_id | string | Original ID of the speaker the sentence is assigned to after reassignment |
| data.new_speaker_label | string | Display label after reassignment (after applying speaker_aliases; equals new_speaker_id when there is no alias) |
Specific Error Codes
| Error Code | HTTP Status | Description | Suggested Action |
|---|---|---|---|
| recording_not_found | 404 | Recording not found | Verify that taskId is correct |
| recording_unauthorized | 403 | Unauthorized to operate on this recording | Verify that the recording belongs to this user |
| validation_failed | 422 | Request validation failed | Verify that both sid and target_speaker_id are provided and correctly formatted |
PATCH /api/v1/tasks/{taskId}/speakers/merge
Aligned with the WebSocket `merge_speakers` action, providing merge capability for historical recordings.
Description
Reassigns all sentences of the source speaker to the target speaker; the source's alias (if any) is transferred to the target (if the target has no alias yet). This is useful when the diarization model misidentifies the same person as two separate speakers (for example, Guest-1 and Guest-2 are actually the same person).
vs. reassign: `reassign` changes only a single sentence; `merge` changes all sentences of the speaker.
vs. rename: `rename` changes only the display name (alias) without touching the speaker ID; `merge` consolidates multiple speakers into one.
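The merge semantics described above can be modeled locally, which may help when previewing the effect of a merge in a UI. This is a sketch of the documented behavior, not the server implementation: the data shapes (sentences as dictionaries, aliases as original ID to label) and the name `simulate_merge` are assumptions. The document does not say what happens to the source's alias when the target already has one; this sketch simply drops it.

```python
def simulate_merge(sentences: list, aliases: dict, source_id: str, target_id: str):
    """Model a merge: move all source sentences to the target and transfer the alias.

    Returns (new_sentences, new_aliases, affected_sids) without mutating the inputs.
    """
    if source_id == target_id:
        raise ValueError("merge_speakers_same_id")

    # Every sentence of the source speaker is reassigned to the target.
    affected_sids = [s["sid"] for s in sentences if s["speaker_id"] == source_id]
    new_sentences = [
        {**s, "speaker_id": target_id} if s["speaker_id"] == source_id else s
        for s in sentences
    ]

    # The source's alias moves to the target only if the target has none yet.
    new_aliases = dict(aliases)
    source_alias = new_aliases.pop(source_id, None)  # assumption: source entry is dropped
    if source_alias is not None and target_id not in new_aliases:
        new_aliases[target_id] = source_alias
    return new_sentences, new_aliases, affected_sids
```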
Authentication
Header: X-API-Key (see Authentication)
Request Parameters
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| taskId | string | Yes | Task ID (UUID) |
Body Parameters (JSON)
| Parameter | Type | Required | Description |
|---|---|---|---|
| source_speaker_id | string | Yes | The original speaker ID to be merged, or the current display label (such as Guest-2 or Manager Wang); maximum 100 characters |
| target_speaker_id | string | Yes | The original ID or current display label of the merge target speaker (such as Guest-1); maximum 100 characters |
Request Example
curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/merge" \
-H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
-H "Content-Type: application/json" \
-d '{
"source_speaker_id": "Guest-2",
"target_speaker_id": "Guest-1"
}'
Success Response
HTTP 200
{
"data": {
"source_speaker_id": "Guest-2",
"target_speaker_id": "Guest-1",
"target_speaker_label": "Manager Wang",
"affected_sids": [3, 5, 7]
}
}
Response Field Descriptions
| Field | Type | Description |
|---|---|---|
| data.source_speaker_id | string | The original speaker ID that was merged (resolved back to the original ID, even if a display label was sent in the request) |
| data.target_speaker_id | string | The original speaker ID of the merge target |
| data.target_speaker_label | string | The target speaker's display label (after applying speaker_aliases; equals the original ID when there is no alias) |
| data.affected_sids | array<int> | List of affected sentence SIDs |
Specific Error Codes
| Error Code | HTTP Status | Description | Suggested Action |
|---|---|---|---|
| merge_speakers_same_id | 400 | source and target resolve to the same speaker | Provide different speaker IDs |
| speaker_name_empty | 400 | source or target is an empty string | Provide a valid speaker ID |
| speaker_not_found | 404 | source or target does not exist in this recording | Verify that the speaker ID is correct |
| recording_not_found | 404 | Recording not found | Verify that taskId is correct |
| speaker_diarization_required | 422 | This recording is not in multi-speaker conversation mode | This feature is only available for recognition_mode: multi_speaker recordings |
| validation_failed | 422 | Request validation failed | Verify that both source_speaker_id and target_speaker_id are provided and do not exceed 100 characters |
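Some of the errors in the table above can be anticipated on the client. The sketch below returns the error code a request would likely trigger, or `None` if no problem is detected locally. The name `precheck_merge` is hypothetical, the order in which the server evaluates these checks is not documented, and the same-speaker check here catches only literally identical strings, whereas the server compares after resolving display labels.

```python
from typing import Optional


def precheck_merge(source_speaker_id: str, target_speaker_id: str) -> Optional[str]:
    """Return the likely error code for an obviously invalid merge body, else None."""
    for value in (source_speaker_id, target_speaker_id):
        if value == "":
            return "speaker_name_empty"
        if len(value) > 100:
            return "validation_failed"
    # The server also rejects an ID and a label that resolve to the same speaker;
    # that case needs the alias map and is not covered by this local check.
    if source_speaker_id == target_speaker_id:
        return "merge_speakers_same_id"
    return None
```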
Notes
- Alias input supported: If you have already changed the display name with `rename` (for example, renaming Guest-1 to "Manager Wang"), you can send "Manager Wang" directly as the source or target during merge, and the system will automatically resolve it back to the original ID
- Alias transfer: If the source has an alias but the target does not, after the merge the target inherits the source's alias
- Full-file rewrite + optimistic locking: Each merge updates the `revision`; concurrent edits must handle `transcript_revision_conflict` (409)
- Irreversible: After a merge, the source ID no longer has any corresponding sentences in this recording; to revert, you must use `reassign` to change them back one sentence at a time
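Since a merge can only be undone one sentence at a time, it may be worth saving the merge response so the revert can be planned from `affected_sids`. The following sketch turns a saved merge response into the list of reassign request bodies; the function name `build_revert_bodies` is hypothetical, and the sketch does not restore an alias that was transferred to the target during the merge.

```python
def build_revert_bodies(merge_response: dict) -> list:
    """Build one reassign body per affected sentence, pointing back at the source speaker.

    Uses data.source_speaker_id, which the merge response reports as an original ID,
    so it satisfies reassign's requirement that display labels are not sent.
    """
    data = merge_response["data"]
    return [
        {"sid": sid, "target_speaker_id": data["source_speaker_id"]}
        for sid in data["affected_sids"]
    ]
```

Each body would then be sent to PATCH /api/v1/tasks/{taskId}/speakers/reassign in turn; because every edit updates the revision, a client should be prepared to handle `transcript_revision_conflict` (409) between calls.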
Related Resources
- Speaker Management - Complete guide to renaming, reassigning, and merging speakers
Version: V1.5.7 Last Updated: 2026-05-20