REST API

Speakers

Overview

The recording speaker editing API is used to manage speakers in multi-speaker conversation mode. It provides three operations: global rename, single-sentence reassignment, and speaker merge.

V1.4.1 Naming Unification: the rename and reassign endpoints are each available under two equivalent paths, one under tasks and one under recordings (same Controller, same UUID). The merge endpoint is available only under the tasks path. Use the tasks path for new integrations; existing integrations may continue to use the recordings path until it is removed in V1.6.0. taskId and recordingId are the same UUID.

Recommended (since V1.4.1)	Deprecated (removed in V1.6.0)
`PATCH /api/v1/tasks/{taskId}/speakers/rename`	`PATCH /api/v1/recordings/{recordingId}/speakers/rename`
`PATCH /api/v1/tasks/{taskId}/speakers/reassign`	`PATCH /api/v1/recordings/{recordingId}/speakers/reassign`
`PATCH /api/v1/tasks/{taskId}/speakers/merge`	—
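For clients that still hold deprecated paths, the mapping in the table can be applied mechanically. The following is a minimal sketch, not part of the API; the helper name `to_tasks_path` is hypothetical, and it relies only on the statement above that taskId and recordingId are the same UUID.

```python
import re

# Matches the two deprecated speaker-editing paths listed in the table.
# merge has no recordings variant, so it is deliberately not matched.
_DEPRECATED = re.compile(
    r"^/api/v1/recordings/(?P<id>[^/]+)/speakers/(?P<op>rename|reassign)$"
)


def to_tasks_path(path: str) -> str:
    """Return the recommended tasks path for a deprecated recordings path.

    Any other path is returned unchanged.
    """
    match = _DEPRECATED.match(path)
    if match is None:
        return path
    # taskId and recordingId are the same UUID, so the ID is reused as-is.
    return f"/api/v1/tasks/{match['id']}/speakers/{match['op']}"
```

Running stored paths through a helper like this before V1.6.0 avoids hand-editing each call site.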

PATCH /api/v1/tasks/{taskId}/speakers/rename

Deprecated alias: PATCH /api/v1/recordings/{recordingId}/speakers/rename (removed in V1.6.0)

Description

Renames a given speaker across all of that speaker's sentences in a recording at once. This is useful for replacing system-generated speaker names (such as Guest-1) with real names.

Authentication

Header: X-API-Key (see Authentication)

Request Parameters

Path Parameters

Parameter	Type	Required	Description
`taskId`	string	Yes	Task ID (UUID)

Body Parameters (JSON)

Parameter	Type	Required	Description
`speaker_id`	string	Yes	Original speaker ID (such as `"Guest-1"`); the current display label (`speaker_label`) is also accepted, so a speaker that has already been renamed can be renamed again; maximum 100 characters
`new_label`	string	Yes	New display label; maximum 100 characters; must not contain control characters (`\x00-\x1F`, `\x7F`) or line breaks, because the label is written into the transcript blob, SSE events, and TXT/SRT/CSV exports
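The body constraints above can be checked on the client before sending, which avoids a round trip that would end in `validation_failed`. This is a minimal sketch under assumptions: the function name `rename_body_problems` is hypothetical, "Required" is taken to mean non-empty, and length is counted in Python code points, which may differ from how the server counts characters.

```python
import re

MAX_LEN = 100  # documented maximum for speaker_id and new_label

# Documented forbidden range for new_label: \x00-\x1F and \x7F.
# Line breaks (\n, \r) fall inside \x00-\x1F.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def rename_body_problems(speaker_id: str, new_label: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, value in (("speaker_id", speaker_id), ("new_label", new_label)):
        if not value:
            # Assumption: a required field must also be non-empty.
            problems.append(f"{name} is required")
        elif len(value) > MAX_LEN:
            problems.append(f"{name} exceeds {MAX_LEN} characters")
    if _CONTROL_CHARS.search(new_label):
        problems.append("new_label contains a control character or line break")
    return problems
```

The server remains the authority; a check like this only catches the documented cases early.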

Request Example

curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/rename" \
  -H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
  -H "Content-Type: application/json" \
  -d '{
    "speaker_id": "Guest-1",
    "new_label": "Manager Wang"
  }'

Success Response

HTTP 200

{
  "data": {
    "speaker_id": "Guest-1",
    "new_label": "Manager Wang",
    "affected_sids": [1, 3, 5]
  }
}
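The same call can be prepared from Python with the standard library. The sketch below only builds the request and parses a response body, so it can be inspected without network access; the function names are hypothetical, and the host is the one used in the request example.

```python
import json
import urllib.request

BASE_URL = "https://vas-poc.vurbo.ai"  # host taken from the request example


def build_rename_request(task_id: str, api_key: str,
                         speaker_id: str, new_label: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH request for speakers/rename."""
    body = json.dumps({"speaker_id": speaker_id, "new_label": new_label})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/tasks/{task_id}/speakers/rename",
        data=body.encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="PATCH",
    )


def parse_rename_response(raw: bytes) -> tuple[str, list[int]]:
    """Return (resolved original speaker ID, affected sentence SIDs)."""
    data = json.loads(raw)["data"]
    return data["speaker_id"], data["affected_sids"]
```

To send it, pass the request to `urllib.request.urlopen` and feed the response bytes to `parse_rename_response`; a non-2xx status raises `urllib.error.HTTPError`.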

Response Field Descriptions

Field	Type	Description
`data.speaker_id`	string	The resolved original speaker ID (the response returns the original ID even if the request sent a display label)
`data.new_label`	string	New display label
`data.affected_sids`	array<int>	List of affected sentence SIDs

Specific Error Codes

Error Code	HTTP Status	Description	Suggested Action
`recording_not_found`	404	Recording not found	Verify that taskId is correct
`recording_unauthorized`	403	Unauthorized to operate on this recording	Verify that the recording belongs to this user
`validation_failed`	422	Request validation failed	Verify that both `speaker_id` and `new_label` are provided, do not exceed 100 characters, and that `new_label` contains no control characters

PATCH /api/v1/tasks/{taskId}/speakers/reassign

Deprecated alias: PATCH /api/v1/recordings/{recordingId}/speakers/reassign (removed in V1.6.0)

Description

Reassigns a specific sentence to another speaker. This is useful for correcting errors made by automatic speaker diarization.

Authentication

Header: X-API-Key (see Authentication)

Request Parameters

Path Parameters

Parameter	Type	Required	Description
`taskId`	string	Yes	Task ID (UUID)

Body Parameters (JSON)

Parameter	Type	Required	Description
`sid`	integer	Yes	Sentence ID
`target_speaker_id`	string	Yes	Target speaker's original ID (taken from `init_sentence.speaker_id`); reassign does not accept display labels, so you must send the original ID; maximum 100 characters
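Because reassign accepts only original IDs, a client that shows display labels to users has to map a label back to its ID before calling the endpoint. The sketch below assumes the client keeps its own dictionary from original ID to display label; that dictionary shape and the name `to_original_id` are illustrative assumptions, not part of the API.

```python
def to_original_id(value: str, aliases: dict[str, str]) -> str:
    """Map a display label back to the original speaker ID.

    `aliases` maps original speaker ID -> display label (assumed shape,
    maintained by the client).
    """
    if value in aliases:
        return value  # already an original ID that has an alias
    matches = [sid for sid, label in aliases.items() if label == value]
    if len(matches) > 1:
        # Two speakers share this label, so the label alone is ambiguous.
        raise ValueError(f"label {value!r} matches several speakers: {matches}")
    if matches:
        return matches[0]
    # Assumption: an unknown value is an original ID without an alias.
    return value
```

For example, with `{"Guest-1": "Manager Wang"}`, the label `"Manager Wang"` resolves to `"Guest-1"`, which is the value to send as `target_speaker_id`.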

Request Example

curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/reassign" \
  -H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
  -H "Content-Type: application/json" \
  -d '{
    "sid": 3,
    "target_speaker_id": "Guest-2"
  }'

Success Response

HTTP 200

{
  "data": {
    "sid": 3,
    "old_speaker_id": "Guest-1",
    "new_speaker_id": "Guest-2",
    "new_speaker_label": "Director Li"
  }
}

Response Field Descriptions

Field	Type	Description
`data.sid`	integer	The modified sentence ID
`data.old_speaker_id`	string	Original ID of the speaker the sentence was assigned to before reassignment
`data.new_speaker_id`	string	Original ID of the speaker the sentence is assigned to after reassignment
`data.new_speaker_label`	string	Display label after reassignment (after applying `speaker_aliases`; equals `new_speaker_id` when there is no alias)

Specific Error Codes

Error Code	HTTP Status	Description	Suggested Action
`recording_not_found`	404	Recording not found	Verify that taskId is correct
`recording_unauthorized`	403	Unauthorized to operate on this recording	Verify that the recording belongs to this user
`validation_failed`	422	Request validation failed	Verify that both `sid` and `target_speaker_id` are provided and correctly formatted

PATCH /api/v1/tasks/{taskId}/speakers/merge

Description

This endpoint mirrors the WebSocket merge_speakers action and provides merge capability for historical recordings.

Reassigns all sentences of the source speaker to the target speaker. If the source has an alias and the target does not yet have one, the source's alias is transferred to the target. This is useful when the diarization model misidentifies the same person as two separate speakers (for example, Guest-1 and Guest-2 are actually the same person).

Compared with reassign: reassign changes only a single sentence, whereas merge changes all sentences of the speaker.
Compared with rename: rename changes only the display name (alias) without touching the speaker ID, whereas merge consolidates multiple speakers into one.
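The difference between the three operations can be illustrated with a small in-memory model. This is only a sketch of the semantics described here, not the server implementation: the sentence and alias structures are assumed, and dropping the source alias when the target already has one is an assumption the text does not confirm.

```python
def rename(aliases: dict, speaker_id: str, new_label: str) -> None:
    """rename: only the display label changes; speaker IDs are untouched."""
    aliases[speaker_id] = new_label


def reassign(sentences: list, sid: int, target_speaker_id: str) -> None:
    """reassign: exactly one sentence changes speaker."""
    for sentence in sentences:
        if sentence["sid"] == sid:
            sentence["speaker_id"] = target_speaker_id


def merge(sentences: list, aliases: dict, source: str, target: str) -> list:
    """merge: every sentence of `source` moves to `target`.

    Returns the affected sentence SIDs.
    """
    affected = []
    for sentence in sentences:
        if sentence["speaker_id"] == source:
            sentence["speaker_id"] = target
            affected.append(sentence["sid"])
    # Alias transfer: the target inherits the source alias only if it has none.
    # Assumption: otherwise the source alias is simply discarded.
    source_alias = aliases.pop(source, None)
    if source_alias is not None and target not in aliases:
        aliases[target] = source_alias
    return affected
```

In this model, merging an aliased `Guest-2` into an unaliased `Guest-1` leaves every sentence on `Guest-1` and moves the alias to it.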

Request Parameters

Path Parameters

Parameter	Type	Required	Description
`taskId`	string	Yes	Task ID (UUID)

Body Parameters (JSON)

Parameter	Type	Required	Description
`source_speaker_id`	string	Yes	The original speaker ID to be merged, or the current display label (such as `Guest-2` or `Manager Wang`); maximum 100 characters
`target_speaker_id`	string	Yes	The original ID or current display label of the merge target speaker (such as `Guest-1`); maximum 100 characters

Request Example

curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/merge" \
  -H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
  -H "Content-Type: application/json" \
  -d '{
    "source_speaker_id": "Guest-2",
    "target_speaker_id": "Guest-1"
  }'

Success Response

HTTP 200

{
  "data": {
    "source_speaker_id": "Guest-2",
    "target_speaker_id": "Guest-1",
    "target_speaker_label": "Manager Wang",
    "affected_sids": [3, 5, 7]
  }
}

Response Field Descriptions

Field	Type	Description
`data.source_speaker_id`	string	The original speaker ID that was merged (resolved back to the original ID, even if a display label was sent in the request)
`data.target_speaker_id`	string	The original speaker ID of the merge target
`data.target_speaker_label`	string	The target speaker's display label (after applying `speaker_aliases`; equals the original ID when there is no alias)
`data.affected_sids`	array<int>	List of affected sentence SIDs

Specific Error Codes

Error Code	HTTP Status	Description	Suggested Action
`merge_speakers_same_id`	400	source and target resolve to the same speaker	Provide different speaker IDs
`speaker_name_empty`	400	source or target is an empty string	Provide a valid speaker ID
`speaker_not_found`	404	source or target does not exist in this recording	Verify that the speaker ID is correct
`recording_not_found`	404	Recording not found	Verify that taskId is correct
`speaker_diarization_required`	422	This recording is not in multi-speaker conversation mode	This feature is only available for `recognition_mode: multi_speaker` recordings
`validation_failed`	422	Request validation failed	Verify that both `source_speaker_id` and `target_speaker_id` are provided and do not exceed 100 characters
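Several of these errors can be anticipated on the client. The sketch below returns the error code a request would likely trigger, or `None` when no problem is detected. It assumes a client-side dictionary from original ID to display label, and the order of the checks is an assumption; the server may evaluate them differently. The name `precheck_merge` is hypothetical.

```python
from typing import Optional

MAX_LEN = 100  # documented maximum for both fields


def precheck_merge(source: str, target: str,
                   aliases: dict[str, str]) -> Optional[str]:
    """Return the error code the request would likely trigger, or None.

    `aliases` maps original speaker ID -> display label (assumed shape).
    """
    if source == "" or target == "":
        return "speaker_name_empty"
    if len(source) > MAX_LEN or len(target) > MAX_LEN:
        return "validation_failed"

    label_to_id = {label: sid for sid, label in aliases.items()}

    def resolve(value: str) -> str:
        # Original IDs pass through; known labels map back to their ID.
        return value if value in aliases else label_to_id.get(value, value)

    if resolve(source) == resolve(target):
        return "merge_speakers_same_id"
    return None
```

`speaker_not_found`, `recording_not_found`, and `speaker_diarization_required` depend on server-side state and still have to be handled from the response.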

Notes

Alias input supported: If you have already changed the display name with rename (for example, renaming Guest-1 to "Manager Wang"), you can send "Manager Wang" directly as the source or target during merge, and the system automatically resolves it back to the original ID.
Alias transfer: If the source has an alias but the target does not, the target inherits the source's alias after the merge.
Full-file rewrite with optimistic locking: Each merge rewrites the whole file and updates the revision, so clients making concurrent edits must handle transcript_revision_conflict (409).
Irreversible: After a merge, the source ID no longer has any corresponding sentences in this recording. To revert, you must use reassign to change the sentences back one at a time.
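The last two notes suggest two client-side helpers, sketched here under assumptions. The first turns a saved merge response into the reassign bodies needed to move the sentences back; it works because `data.source_speaker_id` is always the original ID, which is what reassign requires. The second retries a call on HTTP 409; retrying is one possible way to handle the conflict, not a documented requirement, and a real client may need to refresh its state first. Both function names are hypothetical.

```python
from typing import Callable


def revert_payloads(merge_data: dict) -> list[dict]:
    """Build one reassign body per sentence moved by a merge.

    `merge_data` is the `data` object of a merge response saved earlier.
    """
    return [
        {"sid": sid, "target_speaker_id": merge_data["source_speaker_id"]}
        for sid in merge_data["affected_sids"]
    ]


def send_with_conflict_retry(send: Callable[[], int],
                             max_attempts: int = 3) -> int:
    """Call send() until it returns a status other than 409.

    `send` performs one request and returns the HTTP status code.
    `max_attempts` must be at least 1; the last status is returned.
    """
    status = send()
    for _ in range(max_attempts - 1):
        if status != 409:
            break
        status = send()
    return status
```

Reverting this way restores sentence ownership only; whether an alias transferred during the merge also needs a separate rename is not covered by this sketch.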

Speaker Management - Complete guide to renaming, reassigning, and merging speakers

Version: V1.5.7 Last Updated: 2026-05-20
