Segments Guide
Overview
The segments feature lets you sync multiple video segments with different audio inputs in a single generation. Using segments, you can:
- Lipsync different audio clips to different parts of your video
- Use a specific portion of an audio input to lipsync a segment for precise timing
- Combine audio and text-to-speech inputs to lipsync multiple segments in a single generation
Basic Concepts
To use the segments feature, provide a top-level segments array. Each item defines a video time range (a segment) and its own audio configuration.
Segment
Each segment item takes the following properties:
- startTime: Segment start time in seconds
- endTime: Segment end time in seconds
- audioInput: Audio configuration with refId and optional cropping
audioInput
Each segment requires exactly one audioInput. audioInput takes the following properties:
- refId: Reference ID of the audio/text-to-speech input to use for this segment
- startTime: Optional start time (in seconds) to crop the referenced audio. When specified, endTime must also be provided
- endTime: Optional end time (in seconds) to crop the referenced audio. When specified, startTime must also be provided
The specified audioInput is used to lipsync the video segment between the segment's startTime and endTime.
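The properties above can be sketched as a small builder. This is only an illustration: make_segment is a hypothetical helper, not part of any SDK, and the positive-duration check on the segment is an assumption rather than a documented rule.

```python
from typing import Optional, Tuple


def make_segment(start: float, end: float, ref_id: str,
                 crop: Optional[Tuple[float, float]] = None) -> dict:
    """Build one item for the top-level segments array (hypothetical helper)."""
    if not ref_id:
        raise ValueError("audioInput.refId must be a non-empty string")
    if start >= end:
        # Assumption: a segment needs a positive duration.
        raise ValueError("segment startTime must be less than endTime")

    audio_input = {"refId": ref_id}
    if crop is not None:
        crop_start, crop_end = crop
        if crop_start >= crop_end:
            raise ValueError("crop startTime must be less than crop endTime")
        # Both crop bounds are always set together, as the guide requires.
        audio_input["startTime"] = crop_start
        audio_input["endTime"] = crop_end

    return {"startTime": start, "endTime": end, "audioInput": audio_input}


# Lipsync video seconds 0-5 using seconds 2-7 of the input tagged "intro-audio".
segment = make_segment(0, 5, "intro-audio", crop=(2, 7))
```

Passing the crop as a single tuple makes it impossible to supply only one of the two bounds.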
API Usage Examples
Single Segment with Single Audio
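A minimal sketch of what such a request body might look like. Only the segments array and the refId link come from this guide; the model and input fields, their type values, and the example.com URLs are placeholders assumed for illustration.

```python
import json

# Hypothetical request body. "model", "input", and "type" are assumed field
# names; the URLs are placeholders.
payload = {
    "model": "<model-name>",
    "input": [
        {"type": "video", "url": "https://example.com/video.mp4"},
        {"type": "audio", "url": "https://example.com/audio.wav", "refId": "audio-1"},
    ],
    # One segment: lipsync video seconds 0-10 with the whole of "audio-1".
    "segments": [
        {"startTime": 0, "endTime": 10, "audioInput": {"refId": "audio-1"}},
    ],
}

print(json.dumps(payload, indent=2))
```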
Multiple Segments with Single Audio
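A sketch of several segments sharing one audio input, each cropping a different portion of it. As before, the model and input fields, their type values, and the URLs are assumptions for illustration; only segments, audioInput, refId, startTime, and endTime come from this guide.

```python
import json

# Hypothetical request body with one audio input reused by two segments.
payload = {
    "model": "<model-name>",
    "input": [
        {"type": "video", "url": "https://example.com/video.mp4"},
        {"type": "audio", "url": "https://example.com/audio.wav", "refId": "audio-1"},
    ],
    "segments": [
        # Video seconds 0-5 use seconds 0-5 of the audio.
        {"startTime": 0, "endTime": 5,
         "audioInput": {"refId": "audio-1", "startTime": 0, "endTime": 5}},
        # Video seconds 10-15 use seconds 20-25 of the same audio.
        {"startTime": 10, "endTime": 15,
         "audioInput": {"refId": "audio-1", "startTime": 20, "endTime": 25}},
    ],
}

print(json.dumps(payload, indent=2))
```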
Multiple Segments with Multiple Audio
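A sketch of segments that reference different inputs, here one audio file and one text-to-speech input. The model and input fields, the "text" type value, and the URLs are assumptions for illustration, and the text-to-speech input's provider-specific fields are left out because this guide does not describe them.

```python
import json

# Hypothetical request body mixing an audio input and a text-to-speech input.
payload = {
    "model": "<model-name>",
    "input": [
        {"type": "video", "url": "https://example.com/video.mp4"},
        {"type": "audio", "url": "https://example.com/intro.wav", "refId": "intro-audio"},
        # Text-to-speech input; its remaining fields are omitted on purpose.
        {"type": "text", "refId": "outro-tts"},
    ],
    "segments": [
        {"startTime": 0, "endTime": 8, "audioInput": {"refId": "intro-audio"}},
        {"startTime": 8, "endTime": 15, "audioInput": {"refId": "outro-tts"}},
    ],
}

print(json.dumps(payload, indent=2))
```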
Best Practices
Planning Your Segments
- Map your timeline: Identify video segments and corresponding audio needs
- Prepare audio files: Ensure audio quality and appropriate duration
- Test segment boundaries: Verify smooth transitions between segments
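One way to check segment boundaries before submitting is to look for overlapping time ranges. The sketch below assumes segments should not overlap, which this guide does not state explicitly; find_overlaps is a hypothetical helper.

```python
def find_overlaps(segments):
    """Return (earlier, later) pairs of segments whose time ranges overlap."""
    ordered = sorted(segments, key=lambda s: s["startTime"])
    overlaps = []
    latest = None  # segment with the furthest endTime seen so far
    for seg in ordered:
        if latest is not None and seg["startTime"] < latest["endTime"]:
            overlaps.append((latest, seg))
        if latest is None or seg["endTime"] > latest["endTime"]:
            latest = seg
    return overlaps


segments = [
    {"startTime": 0, "endTime": 5},
    {"startTime": 4, "endTime": 8},   # starts before the first segment ends
    {"startTime": 8, "endTime": 10},  # touches the previous end: not an overlap
]
overlaps = find_overlaps(segments)
```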
Audio Preparation
- Use consistent audio quality across all segments and the video’s audio.
- For best results, ensure proper timing alignment with video segments. If segment duration and corresponding audio duration don’t match, rely on sync_mode to determine how to handle the mismatch.
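To spot timing mismatches early, a segment's duration can be compared with the duration of the audio it will use. This is a rough sketch: duration_mismatch is a hypothetical helper, and for uncropped audio the caller has to supply the audio length, since the segment alone does not carry it.

```python
def duration_mismatch(segment, audio_duration=None):
    """Return audio length minus segment length, in seconds.

    A positive result means the audio is longer than the video segment.
    """
    segment_length = segment["endTime"] - segment["startTime"]
    audio = segment["audioInput"]
    if "startTime" in audio and "endTime" in audio:
        # Cropped audio: the crop range defines the audio length.
        audio_length = audio["endTime"] - audio["startTime"]
    elif audio_duration is not None:
        audio_length = audio_duration
    else:
        raise ValueError("audio length unknown: pass audio_duration")
    return audio_length - segment_length


cropped = {"startTime": 0, "endTime": 5,
           "audioInput": {"refId": "audio-1", "startTime": 2, "endTime": 9}}
difference = duration_mismatch(cropped)
```

A non-zero result does not mean the request will fail; it only indicates that sync_mode will decide how the difference is handled.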
Troubleshooting
Common Errors
"Multiple audio inputs are only allowed when using multi-segments"
Provide a top-level segments array when using multiple audio or text inputs.
"Unable to resolve audio input URL"
Ensure all audio inputs have valid url or assetId values and that referenced refId values exist in your audio or text inputs.
"Segment at index X is missing a valid audioInput.refId"
This error occurs when a segment’s audio_input is missing a refId or the refId is empty. Each segment must reference a valid audio or text input through its refId.
"Segment at index X references unknown refId"
This error occurs when a segment references a refId that doesn’t exist in your audio or text inputs. Ensure all referenced refId values match exactly with those defined in your inputs.
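The two refId errors above can be caught locally before sending a request. The sketch below is a client-side check only: check_ref_ids is a hypothetical helper, the messages it returns simply mirror the wording in this guide, and the service's own validation may differ.

```python
def check_ref_ids(inputs, segments):
    """Return a list of refId problems found in the segments."""
    known = {item["refId"] for item in inputs if item.get("refId")}
    errors = []
    for index, segment in enumerate(segments):
        ref_id = (segment.get("audioInput") or {}).get("refId")
        if not ref_id:
            errors.append(f"Segment at index {index} is missing a valid audioInput.refId")
        elif ref_id not in known:
            errors.append(f"Segment at index {index} references unknown refId")
    return errors


inputs = [{"refId": "audio-1"}]
segments = [
    {"startTime": 0, "endTime": 5, "audioInput": {"refId": "audio-1"}},
    {"startTime": 5, "endTime": 9, "audioInput": {"refId": "audio-2"}},  # unknown
    {"startTime": 9, "endTime": 12, "audioInput": {}},                   # missing
]
errors = check_ref_ids(inputs, segments)
```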
"Invalid audio_input crop range"
Ensure both start_time and end_time are provided for audio cropping, and verify start_time < end_time for all crop ranges.
"When using multi-segments, please provide at least one audio or text input"
Ensure you have at least one audio input or text input with a valid refId when using segments.