REST API

Speakers

Overview

The recording speaker editing API is used to manage speakers in multi-speaker conversation mode. It provides three operations: global rename, single-sentence reassignment, and speaker merge.

V1.4.1 Naming Unification: The rename and reassign endpoints are each available under two equivalent paths, tasks and recordings (same Controller, same UUID):

Recommended (since V1.4.1) | Deprecated (removed in V1.6.0)
PATCH /api/v1/tasks/{taskId}/speakers/rename | PATCH /api/v1/recordings/{recordingId}/speakers/rename
PATCH /api/v1/tasks/{taskId}/speakers/reassign | PATCH /api/v1/recordings/{recordingId}/speakers/reassign
PATCH /api/v1/tasks/{taskId}/speakers/merge | (none)

Use the tasks path for new integrations; existing integrations may continue to use the recordings path until V1.6.0. taskId and recordingId are the same UUID.
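
For existing integrations, the path migration can be sketched as a small rewrite helper. This is a minimal sketch, not part of the API: the function name to_tasks_path is hypothetical, and it assumes the path carries no query string. It rewrites only rename and reassign, because the table above lists no recordings alias for merge.

```python
import re

# Matches the deprecated recordings alias for rename and reassign only.
# Assumption: the path ends right after the operation name (no query string).
_DEPRECATED = re.compile(
    r"/api/v1/recordings/(?P<id>[^/]+)/speakers/(?P<op>rename|reassign)$"
)


def to_tasks_path(path: str) -> str:
    """Rewrite a deprecated recordings path to the recommended tasks path."""
    match = _DEPRECATED.search(path)
    if match is None:
        # Already a tasks path, or not one of the aliased speaker endpoints.
        return path
    prefix = path[: match.start()]  # keeps any scheme and host untouched
    return f"{prefix}/api/v1/tasks/{match.group('id')}/speakers/{match.group('op')}"
```

Because taskId and recordingId are the same UUID, the identifier is carried over unchanged.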


PATCH /api/v1/tasks/{taskId}/speakers/rename

Deprecated alias: PATCH /api/v1/recordings/{recordingId}/speakers/rename (removed in V1.6.0)

Description

Renames a given speaker across all of that speaker's sentences in a recording in one call. This is useful for replacing system-generated speaker names (such as Guest-1) with real names.

Authentication

Header: X-API-Key (see Authentication)

Request Parameters

Path Parameters

Parameter | Type | Required | Description
taskId | string | Yes | Task ID (UUID)

Body Parameters (JSON)

Parameter | Type | Required | Description
speaker_id | string | Yes | Original speaker ID (such as "Guest-1"). The current display label (speaker_label) is also accepted, so a speaker that has already been renamed can be renamed again by its current label. Maximum 100 characters
new_label | string | Yes | New display label. Maximum 100 characters; must not contain control characters (\x00-\x1F, \x7F) or line breaks, because the label is written into the transcript blob, SSE events, and TXT/SRT/CSV exports
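
The body constraints above can be checked on the client before sending, which avoids a round trip that would end in validation_failed. The following is a minimal sketch: the function name rename_body_errors is hypothetical, and it assumes that "100 characters" means Unicode code points, which the server may count differently.

```python
import re

MAX_LEN = 100  # limit stated for both speaker_id and new_label

# Control characters rejected in new_label: \x00-\x1F and \x7F.
# Line breaks (\n and \r) already fall inside \x00-\x1F.
_CONTROL = re.compile(r"[\x00-\x1f\x7f]")


def rename_body_errors(speaker_id: str, new_label: str) -> list[str]:
    """Return a list of problems with a rename body; empty means it looks valid."""
    errors = []
    if not speaker_id:
        errors.append("speaker_id is required")
    elif len(speaker_id) > MAX_LEN:
        errors.append("speaker_id exceeds 100 characters")
    if not new_label:
        errors.append("new_label is required")
    elif len(new_label) > MAX_LEN:
        errors.append("new_label exceeds 100 characters")
    elif _CONTROL.search(new_label):
        errors.append("new_label contains a control character")
    return errors
```

A passing local check does not guarantee acceptance; the server remains the authority on validation.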

Request Example

curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/rename" \
  -H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
  -H "Content-Type: application/json" \
  -d '{
    "speaker_id": "Guest-1",
    "new_label": "Manager Wang"
  }'

Success Response

HTTP 200

{
  "data": {
    "speaker_id": "Guest-1",
    "new_label": "Manager Wang",
    "affected_sids": [1, 3, 5]
  }
}

Response Field Descriptions

Field | Type | Description
data.speaker_id | string | The resolved original speaker ID (the original ID is returned even if the request sent a display label)
data.new_label | string | New display label
data.affected_sids | array<int> | List of affected sentence SIDs
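
The same request and response can be handled from Python using only the standard library. This is a sketch under stated assumptions: the helper names build_rename_request and parse_rename_response are hypothetical, and the host is the one shown in the curl example. The request object is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://vas-poc.vurbo.ai"  # host taken from the curl example


def build_rename_request(
    task_id: str, api_key: str, speaker_id: str, new_label: str
) -> urllib.request.Request:
    """Build (but do not send) the PATCH request for the rename endpoint."""
    body = json.dumps({"speaker_id": speaker_id, "new_label": new_label})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/tasks/{task_id}/speakers/rename",
        data=body.encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="PATCH",
    )


def parse_rename_response(raw: bytes) -> tuple[str, str, list[int]]:
    """Extract (speaker_id, new_label, affected_sids) from a success body."""
    data = json.loads(raw)["data"]
    return data["speaker_id"], data["new_label"], list(data["affected_sids"])
```

To send the request, pass it to urllib.request.urlopen and hand the bytes from response.read() to parse_rename_response.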

Specific Error Codes

Error Code | HTTP Status | Description | Suggested Action
recording_not_found | 404 | Recording not found | Verify that taskId is correct
recording_unauthorized | 403 | Unauthorized to operate on this recording | Verify that the recording belongs to this user
validation_failed | 422 | Request validation failed | Verify that both speaker_id and new_label are provided, that neither exceeds 100 characters, and that new_label contains no control characters

PATCH /api/v1/tasks/{taskId}/speakers/reassign

Deprecated alias: PATCH /api/v1/recordings/{recordingId}/speakers/reassign (removed in V1.6.0)

Description

Reassigns the speaker of a specific sentence to another speaker. This is useful for correcting errors produced by automatic recognition (Speaker Diarization).

Authentication

Header: X-API-Key (see Authentication)

Request Parameters

Path Parameters

Parameter | Type | Required | Description
taskId | string | Yes | Task ID (UUID)

Body Parameters (JSON)

Parameter | Type | Required | Description
sid | integer | Yes | Sentence ID
target_speaker_id | string | Yes | Target speaker's original ID, taken from init_sentence.speaker_id. reassign does not accept display labels; you must send the original ID. Maximum 100 characters
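
Because reassign rejects display labels, a client that shows labels to its users has to map them back to original IDs before building the body. The sketch below assumes the client keeps a dictionary from original ID to display label; that dictionary shape and the helper names are hypothetical, not part of the API.

```python
def to_original_id(value: str, aliases: dict[str, str]) -> str:
    """Map an original ID or display label to the original speaker ID.

    `aliases` maps original ID -> display label (assumed client-side cache).
    """
    if value in aliases:
        return value  # already an original ID that has an alias
    matches = [sid for sid, label in aliases.items() if label == value]
    if len(matches) > 1:
        raise ValueError(f"label {value!r} is shared by several speakers")
    if matches:
        return matches[0]
    # No alias known for this value: assume it is already an original ID.
    return value


def reassign_body(sid: int, target: str, aliases: dict[str, str]) -> dict:
    """Build the JSON body for reassign, always sending the original ID."""
    return {"sid": sid, "target_speaker_id": to_original_id(target, aliases)}
```

For example, with aliases {"Guest-2": "Director Li"}, both "Director Li" and "Guest-2" produce a body whose target_speaker_id is "Guest-2".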

Request Example

curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/reassign" \
  -H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
  -H "Content-Type: application/json" \
  -d '{
    "sid": 3,
    "target_speaker_id": "Guest-2"
  }'

Success Response

HTTP 200

{
  "data": {
    "sid": 3,
    "old_speaker_id": "Guest-1",
    "new_speaker_id": "Guest-2",
    "new_speaker_label": "Director Li"
  }
}

Response Field Descriptions

Field | Type | Description
data.sid | integer | The modified sentence ID
data.old_speaker_id | string | Original ID of the speaker the sentence belonged to before reassignment
data.new_speaker_id | string | Original ID of the speaker the sentence belongs to after reassignment
data.new_speaker_label | string | Display label after reassignment (after applying speaker_aliases; equals new_speaker_id when there is no alias)

Specific Error Codes

Error Code | HTTP Status | Description | Suggested Action
recording_not_found | 404 | Recording not found | Verify that taskId is correct
recording_unauthorized | 403 | Unauthorized to operate on this recording | Verify that the recording belongs to this user
validation_failed | 422 | Request validation failed | Verify that both sid and target_speaker_id are provided and correctly formatted

PATCH /api/v1/tasks/{taskId}/speakers/merge

This endpoint is aligned with the WebSocket merge_speakers action and provides merge capability for historical recordings.

Description

Reassigns all sentences of the source speaker to the target speaker; the source's alias (if any) is transferred to the target (if the target has no alias yet). This is useful when the diarization model misidentifies the same person as two separate speakers (for example, Guest-1 and Guest-2 are actually the same person).

  • vs. reassign: reassign changes only a single sentence; merge changes all sentences of the speaker.
  • vs. rename: rename changes only the display name (alias) without touching the speaker ID; merge consolidates multiple speakers into one.
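
The difference between the three operations can be illustrated with a small in-memory model. This is a sketch of the semantics described above, not the server implementation: the Transcript class and its field layout are hypothetical, and dropping the source alias when the target already has one is an assumption the text does not settle.

```python
from dataclasses import dataclass, field


@dataclass
class Transcript:
    speakers: dict[int, str]  # sid -> original speaker ID
    aliases: dict[str, str] = field(default_factory=dict)  # original ID -> label

    def _sids_of(self, speaker_id: str) -> list[int]:
        return sorted(s for s, sp in self.speakers.items() if sp == speaker_id)

    def rename(self, speaker_id: str, new_label: str) -> list[int]:
        """Change only the display label; speaker IDs stay untouched."""
        self.aliases[speaker_id] = new_label
        return self._sids_of(speaker_id)

    def reassign(self, sid: int, target_id: str) -> str:
        """Move a single sentence to another speaker; returns the old ID."""
        old = self.speakers[sid]
        self.speakers[sid] = target_id
        return old

    def merge(self, source_id: str, target_id: str) -> list[int]:
        """Move every sentence of the source to the target."""
        affected = self._sids_of(source_id)
        for sid in affected:
            self.speakers[sid] = target_id
        # The source alias moves to the target only if the target has none.
        # Assumption: otherwise the source alias is simply discarded.
        alias = self.aliases.pop(source_id, None)
        if alias is not None and target_id not in self.aliases:
            self.aliases[target_id] = alias
        return affected
```

In this model, merging Guest-2 (aliased "Manager Wang") into an unaliased Guest-1 leaves every sentence on Guest-1 and gives Guest-1 the label "Manager Wang".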

Authentication

Header: X-API-Key (see Authentication)

Request Parameters

Path Parameters

Parameter | Type | Required | Description
taskId | string | Yes | Task ID (UUID)

Body Parameters (JSON)

Parameter | Type | Required | Description
source_speaker_id | string | Yes | The original ID or current display label of the speaker to be merged (such as Guest-2 or Manager Wang); maximum 100 characters
target_speaker_id | string | Yes | The original ID or current display label of the merge target speaker (such as Guest-1); maximum 100 characters

Request Example

curl -X PATCH "https://vas-poc.vurbo.ai/api/v1/tasks/rec_abc123/speakers/merge" \
  -H "X-API-Key: vas_aB3dE5fG7hI9jK1lM3nO5pQ7rS9tU1vW" \
  -H "Content-Type: application/json" \
  -d '{
    "source_speaker_id": "Guest-2",
    "target_speaker_id": "Guest-1"
  }'

Success Response

HTTP 200

{
  "data": {
    "source_speaker_id": "Guest-2",
    "target_speaker_id": "Guest-1",
    "target_speaker_label": "Manager Wang",
    "affected_sids": [3, 5, 7]
  }
}

Response Field Descriptions

Field | Type | Description
data.source_speaker_id | string | The original speaker ID that was merged (resolved back to the original ID, even if a display label was sent in the request)
data.target_speaker_id | string | The original speaker ID of the merge target
data.target_speaker_label | string | The target speaker's display label (after applying speaker_aliases; equals the original ID when there is no alias)
data.affected_sids | array<int> | List of affected sentence SIDs

Specific Error Codes

Error Code | HTTP Status | Description | Suggested Action
merge_speakers_same_id | 400 | source and target resolve to the same speaker | Provide different speaker IDs
speaker_name_empty | 400 | source or target is an empty string | Provide a valid speaker ID
speaker_not_found | 404 | source or target does not exist in this recording | Verify that the speaker ID is correct
recording_not_found | 404 | Recording not found | Verify that taskId is correct
speaker_diarization_required | 422 | This recording is not in multi-speaker conversation mode | This feature is available only for recordings with recognition_mode: multi_speaker
validation_failed | 422 | Request validation failed | Verify that both source_speaker_id and target_speaker_id are provided and that neither exceeds 100 characters
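
Several of these errors can be anticipated on the client when it already knows the speakers in the recording. The sketch below returns the error code a request would probably trigger, or None. The function name, the speaker_ids set, and the aliases dictionary (original ID to display label) are hypothetical, and the order in which the server performs its checks is an assumption. recording_not_found and speaker_diarization_required depend on server state and are not covered.

```python
from typing import Optional

MAX_LEN = 100  # limit stated for both body parameters


def precheck_merge(
    source: str,
    target: str,
    speaker_ids: set[str],
    aliases: dict[str, str],
) -> Optional[str]:
    """Return the likely error code for a merge request, or None if it looks fine."""

    def resolve(value: str) -> Optional[str]:
        # Accept either an original ID or a current display label.
        if value in speaker_ids:
            return value
        for original_id, label in aliases.items():
            if label == value and original_id in speaker_ids:
                return original_id
        return None

    if source == "" or target == "":
        return "speaker_name_empty"
    if len(source) > MAX_LEN or len(target) > MAX_LEN:
        return "validation_failed"
    source_id, target_id = resolve(source), resolve(target)
    if source_id is None or target_id is None:
        return "speaker_not_found"
    if source_id == target_id:
        return "merge_speakers_same_id"
    return None
```

Note that the same-speaker check runs on resolved IDs, so sending a label as the source and its own original ID as the target is caught as well.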

Notes

  • Alias input supported: If you have already changed the display name with rename (for example, renaming Guest-1 to "Manager Wang"), you can send "Manager Wang" directly as the source or target during merge, and the system automatically resolves it back to the original ID.
  • Alias transfer: If the source has an alias but the target does not, the target inherits the source's alias after the merge.
  • Full-file rewrite + optimistic locking: Each merge rewrites the full transcript file and updates its revision; clients making concurrent edits must handle transcript_revision_conflict (HTTP 409).
  • Irreversible: After a merge, the source ID no longer has any corresponding sentences in this recording; to revert, you must use reassign to change them back one sentence at a time.
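
The last two notes can be sketched as client helpers: one retries a call that returns HTTP 409, the other turns a merge response into the reassign bodies needed to undo it. Both function names are hypothetical. Simply resending after a conflict is an assumption (a client may need to refresh its state first), and the send callable is injected so the sketch does not depend on a particular HTTP library.

```python
import time
from typing import Callable


def send_with_conflict_retry(
    send: Callable[[], tuple[int, dict]],
    attempts: int = 3,
    delay: float = 0.0,
) -> tuple[int, dict]:
    """Call send() until it returns a status other than 409 or attempts run out.

    send() must return (http_status, parsed_json_body).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = send()
        if status != 409:
            break
        if attempt < attempts - 1:
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    return status, body


def revert_plan(merge_data: dict) -> list[dict]:
    """Build one reassign body per affected sentence, pointing back at the source.

    merge_data is the "data" object of a successful merge response.
    """
    source_id = merge_data["source_speaker_id"]
    return [
        {"sid": sid, "target_speaker_id": source_id}
        for sid in merge_data["affected_sids"]
    ]
```

Keeping the affected_sids from the merge response is what makes a later revert possible, since the source ID has no sentences left to query. This plan does not attempt to restore an alias that was transferred during the merge.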


Version: V1.5.7
Last Updated: 2026-05-20

Copyright © 2026