Speaker selection
Speaker selection helps you target the right face when a clip contains multiple people. You can either let Sync auto-detect the active speaker or provide a user-selected point from your UI and forward it via the active_speaker_detection DTO on /v2/generate.
When to use what
- Auto-detect: fastest setup; best for single/obvious speaker clips. Set auto_detect: true and skip manual fields.
- Manual selection: best for multiple people or when you want deterministic control. Provide a reference frame and a point on the speaker’s face, or supply bounding boxes if you already have detections for that frame.
Workflow: selecting a speaker in your UI
Capture a reference frame
Seek the video to a frame where the target speaker’s face is visible. Keep track of the frame index you show in the UI.
Collect a point on the face
Record the [x, y] coordinates (in the same coordinate system/pixels as your extracted frame) for the clicked point on the speaker’s face. Keep the frame index and coordinates paired.
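The two workflow steps above can be sketched as a small client-side helper. This is only an illustration under stated assumptions: the helper names (`toFrameCoordinates`, `toFrameNumber`, `selectSpeaker`) are hypothetical and not part of the Sync SDK, the video is assumed to fill its element with no letterboxing, the frame rate is assumed constant, and frame indices are assumed to start at 0. Adjust these to match how your player extracts frames.

```typescript
// Hypothetical helper types; the names are illustrative, not part of the Sync API.
interface DisplayedFrame {
  displayWidth: number;  // rendered width of the video element, in CSS pixels
  displayHeight: number; // rendered height of the video element, in CSS pixels
  frameWidth: number;    // intrinsic width of the extracted frame, in pixels
  frameHeight: number;   // intrinsic height of the extracted frame, in pixels
}

interface SpeakerSelection {
  frameNumber: number;
  coordinates: [number, number];
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Scale a click position inside the video element to frame pixel coordinates.
// Assumes the video fills the element exactly (no letterboxing or cropping).
function toFrameCoordinates(
  clickX: number,
  clickY: number,
  frame: DisplayedFrame
): [number, number] {
  const x = Math.round((clickX / frame.displayWidth) * frame.frameWidth);
  const y = Math.round((clickY / frame.displayHeight) * frame.frameHeight);
  return [clamp(x, 0, frame.frameWidth - 1), clamp(y, 0, frame.frameHeight - 1)];
}

// Convert the playback position to a frame index.
// Assumes a constant frame rate and zero-based frame indices.
function toFrameNumber(currentTimeSeconds: number, fps: number): number {
  return Math.max(0, Math.floor(currentTimeSeconds * fps));
}

// Keep the frame index and the coordinates paired in one object.
function selectSpeaker(
  clickX: number,
  clickY: number,
  currentTimeSeconds: number,
  fps: number,
  frame: DisplayedFrame
): SpeakerSelection {
  return {
    frameNumber: toFrameNumber(currentTimeSeconds, fps),
    coordinates: toFrameCoordinates(clickX, clickY, frame),
  };
}
```

Returning the frame index and the point as a single object makes it harder for the two values to drift apart if the user seeks again before submitting.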
ActiveSpeaker DTO fields
See the full API reference for active_speaker_detection.
- auto_detect (boolean, default false): let Sync pick the active speaker automatically.
- frame_number (number): frame index that corresponds to the provided coordinates.
- coordinates ([x, y]): reference point on the speaker’s face in frame_number.
- bounding_boxes ((number[] | null)[], optional): per-frame array of bounding boxes across the video. Each entry corresponds to that frame: set to [x1, y1, x2, y2] (x1, y1 = top-left; x2, y2 = bottom-right) for the detected face, or null if no box for that frame. Use this instead of frame_number + coordinates when you already run detection over the clip.
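The field list above could be mirrored in client code roughly as follows. This is a sketch, not the SDK's own type: the `ActiveSpeakerDetection` interface and `validateActiveSpeakerDetection` function are hypothetical names, and the checks (for example, that frame_number is a non-negative integer) are assumptions about sensible input rather than documented server-side rules.

```typescript
// [x1, y1, x2, y2], where (x1, y1) is top-left and (x2, y2) is bottom-right.
type BoundingBox = [number, number, number, number];

// Hypothetical client-side mirror of the fields listed above.
interface ActiveSpeakerDetection {
  auto_detect?: boolean;                   // defaults to false when omitted
  frame_number?: number;                   // frame index for `coordinates`
  coordinates?: [number, number];          // [x, y] point on the speaker's face
  bounding_boxes?: (BoundingBox | null)[]; // one entry per frame, null if no box
}

// Returns a list of human-readable problems; an empty list means the object
// looks consistent with the field descriptions. These are local sanity checks
// only (an assumption); the API may accept or reject inputs differently.
function validateActiveSpeakerDetection(dto: ActiveSpeakerDetection): string[] {
  const problems: string[] = [];
  const boxes = dto.bounding_boxes;
  const hasPoint = dto.frame_number !== undefined || dto.coordinates !== undefined;

  if (dto.auto_detect === true) {
    if (hasPoint || boxes !== undefined) {
      problems.push("auto_detect is true, so manual fields should be omitted");
    }
    return problems;
  }

  if (boxes !== undefined) {
    boxes.forEach((box, i) => {
      if (box === null) return; // no detection for this frame
      const [x1, y1, x2, y2] = box;
      if (!(x1 < x2) || !(y1 < y2)) {
        problems.push(`bounding_boxes[${i}] must have top-left before bottom-right`);
      }
    });
    return problems;
  }

  if (dto.frame_number === undefined || dto.coordinates === undefined) {
    problems.push("provide frame_number and coordinates together, or bounding_boxes, or set auto_detect to true");
  } else if (!Number.isInteger(dto.frame_number) || dto.frame_number < 0) {
    problems.push("frame_number should be a non-negative integer");
  }
  return problems;
}
```

Running a check like this before submitting can catch a half-filled selection (a frame without a point, or an inverted box) in the UI instead of after a request has been sent.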
Request examples
TypeScript SDK
cURL (HTTP)
TypeScript SDK (bounding boxes instead of coordinates)
If you prefer auto-detection, omit manual fields and set auto_detect to true. For manual control, either provide frame_number + coordinates from your UI selection, or supply bounding_boxes for each frame if you already ran detection (no frame_number/coordinates needed in that case).
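As a rough sketch of the three options just described, the helper below builds the active_speaker_detection value for each case. The `SpeakerChoice` type and `buildActiveSpeakerDetection` function are hypothetical and not part of the Sync SDK, and the code deliberately stops short of sending a request: where this object sits in the /v2/generate body, and how the request is authenticated, should be taken from the API reference. It also assumes zero-based frame indices when expanding sparse detections into the per-frame bounding_boxes array.

```typescript
// [x1, y1, x2, y2], where (x1, y1) is top-left and (x2, y2) is bottom-right.
type BoundingBox = [number, number, number, number];

// Hypothetical description of what the user (or your detector) chose.
type SpeakerChoice =
  | { mode: "auto" }
  | { mode: "point"; frameNumber: number; point: [number, number] }
  | { mode: "boxes"; totalFrames: number; detections: Map<number, BoundingBox> };

// Builds the value to forward as active_speaker_detection.
function buildActiveSpeakerDetection(choice: SpeakerChoice): Record<string, unknown> {
  switch (choice.mode) {
    case "auto":
      // Auto-detection: no manual fields.
      return { auto_detect: true };
    case "point":
      // Manual selection from the UI: frame index plus a point on the face.
      return {
        auto_detect: false,
        frame_number: choice.frameNumber,
        coordinates: choice.point,
      };
    case "boxes": {
      // Expand sparse detections into one entry per frame, using null
      // for frames that have no box (frame indices assumed zero-based).
      const boxes: (BoundingBox | null)[] = [];
      for (let i = 0; i < choice.totalFrames; i++) {
        boxes.push(choice.detections.get(i) ?? null);
      }
      return { auto_detect: false, bounding_boxes: boxes };
    }
  }
}

// Example: a UI selection on frame 42 at pixel (960, 540).
const fromUi = buildActiveSpeakerDetection({
  mode: "point",
  frameNumber: 42,
  point: [960, 540],
});
console.log(JSON.stringify(fromUi));
```

Modeling the choice as a discriminated union keeps the three modes mutually exclusive, so a payload cannot mix auto_detect: true with manual fields by accident.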

