Media Formats Support

Supported Media Formats

Video Formats

The Sync API accepts the following video file formats:

| MIME Type | File Extension | Format |
| --- | --- | --- |
| video/mp4 | .mp4 | MP4 |
| video/quicktime | .mov | QuickTime |
| video/webm | .webm | WebM |
| video/x-msvideo | .avi | AVI |

Audio Formats

The Sync API accepts the following audio file formats:

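| MIME Type | File Extension | Format |
| --- | --- | --- |
| audio/wav | .wav | WAV |
| audio/mpeg | .mp3 | MP3 |
| audio/ogg | .ogg | OGG |
| audio/x-m4a | .m4a | M4A |
| audio/x-m3a | .m3a | M3A |
| audio/aac | .aac | AAC |
| audio/x-ms-wma | .wma | WMA |
| audio/flac | .flac | FLAC |
| audio/mp4 | .mp4 | MP4 Audio |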

File Format Recommendation: While multiple formats are supported, we recommend using MP4 for video and WAV or MP3 for audio to ensure optimal compatibility and processing performance.
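The tables above can be turned into a small client-side lookup so that unsupported files are caught before upload. The following is a minimal sketch, not part of the Sync API: the names `VIDEO_MIME_TYPES`, `AUDIO_MIME_TYPES`, and `guess_mime_type` are hypothetical, and the check is based on the file extension only, not on the file's actual contents.

```python
from pathlib import Path
from typing import Optional

# Extension -> MIME type, copied from the video table above.
VIDEO_MIME_TYPES = {
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".webm": "video/webm",
    ".avi": "video/x-msvideo",
}

# Extension -> MIME type, copied from the audio table above.
AUDIO_MIME_TYPES = {
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".ogg": "audio/ogg",
    ".m4a": "audio/x-m4a",
    ".m3a": "audio/x-m3a",
    ".aac": "audio/aac",
    ".wma": "audio/x-ms-wma",
    ".flac": "audio/flac",
    ".mp4": "audio/mp4",
}


def guess_mime_type(filename: str, kind: str) -> Optional[str]:
    """Return the MIME type for a supported file, or None if unsupported.

    `kind` must be "video" or "audio". It is required because the .mp4
    extension appears in both tables with different MIME types.
    """
    table = {"video": VIDEO_MIME_TYPES, "audio": AUDIO_MIME_TYPES}[kind]
    # Compare case-insensitively so "CLIP.MOV" is treated like "clip.mov".
    return table.get(Path(filename).suffix.lower())
```

For example, `guess_mime_type("voice.mp4", "audio")` resolves to `audio/mp4`, while the same file name with `kind="video"` resolves to `video/mp4`.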

Output Quality

Video Processing Overview

The Sync video pipeline uses the H.264 codec for internal processing, and all videos are re-encoded. While we strive to preserve the input video’s quality and properties, this process may change properties like the original codec, bitrate, and frame rate.

A Note on HDR Video: 10-bit (HDR) videos are not fully supported. HDR input is normalized to 8-bit color (SDR), which may change the color grading in the output.

A Note on Alpha Transparency: Alpha channels are not preserved in the output. The Sync pipeline uses the H.264 codec and processes video in RGB color space, neither of which carries an alpha channel. If your input video contains alpha transparency (such as WebM videos with transparency), the alpha channel will be removed and replaced with a solid background.

Video

| Property | Recommended Value |
| --- | --- |
| Codec | H.264 (High Profile) |
| Resolution | 1920x1080 |
| Average Bitrate | 50 Mbps |
| Frame Rate (FPS) | 24, 25, or 30 fps constant |
| Color Space | 8-bit (SDR) |
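One way to prepare a file that matches the recommended values is to re-encode it locally with FFmpeg before uploading. The sketch below only builds the argument list. It assumes FFmpeg with the `libx264` encoder is installed, that the source is already 16:9 (otherwise `scale=1920:1080` would distort it), and that selecting the `yuv420p` pixel format is an acceptable way to get 8-bit output (it does not tone-map HDR). The function name `build_h264_command` is hypothetical and not part of the Sync API.

```python
from typing import List


def build_h264_command(src: str, dst: str, fps: int = 30) -> List[str]:
    """Build an FFmpeg command targeting the recommended video properties."""
    if fps not in (24, 25, 30):
        raise ValueError("fps must be 24, 25, or 30")
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", "libx264",         # H.264 encoder
        "-profile:v", "high",      # High Profile
        "-pix_fmt", "yuv420p",     # 8-bit 4:2:0 pixel format
        "-vf", "scale=1920:1080",  # assumes a 16:9 source
        "-r", str(fps),            # fixed output frame rate
        "-b:v", "50M",             # target average bitrate of 50 Mbps
        dst,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.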

Maximum Resolution Limit: Input videos above 4K (4096 x 2160 pixels) are not supported and will be rejected. If you need to process higher resolution content, downscale your video to 4K or below before uploading.
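The resolution limit can be checked, and a downscale target computed, before uploading. This is a minimal sketch under one assumption the text above does not spell out: the limit is treated as per-dimension (width at most 4096 and height at most 2160). The helper name `fit_within_4k` is hypothetical.

```python
from typing import Tuple

MAX_WIDTH, MAX_HEIGHT = 4096, 2160  # 4K limit stated above


def fit_within_4k(width: int, height: int) -> Tuple[int, int]:
    """Return dimensions that fit inside 4096x2160, preserving aspect ratio.

    Inputs already within the limit are returned unchanged. Downscaled
    results are rounded down to even numbers, which 4:2:0 H.264 requires.
    """
    if width <= MAX_WIDTH and height <= MAX_HEIGHT:
        return width, height
    # Integer cross-multiplication decides which dimension is the binding
    # constraint without floating-point rounding.
    if width * MAX_HEIGHT >= height * MAX_WIDTH:
        new_w = MAX_WIDTH
        new_h = height * MAX_WIDTH // width
    else:
        new_h = MAX_HEIGHT
        new_w = width * MAX_HEIGHT // height
    # Round down to the nearest even value.
    return new_w - new_w % 2, new_h - new_h % 2
```

For an 8K input of 7680x4320, the height is the binding constraint, so the target becomes 3840x2160.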

Audio

For the best results, use a sampling rate of 44.1 kHz or 48 kHz. Audio with a higher sampling rate is downsampled to 48 kHz during lipsync, which can result in quality loss.

If an input file contains multiple audio streams, only the first stream is processed. All other streams are discarded.
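To control which audio stream is used and how it is resampled, you can extract the stream yourself before uploading. The sketch below builds an FFmpeg command that selects one audio stream and resamples it. It assumes FFmpeg is installed; the function name `build_audio_command` and its parameters are hypothetical and not part of the Sync API.

```python
from typing import List


def build_audio_command(
    src: str, dst: str, sample_rate: int = 48000, stream_index: int = 0
) -> List[str]:
    """Build an FFmpeg command that extracts a single audio stream."""
    if sample_rate not in (44100, 48000):
        raise ValueError("sample_rate must be 44100 or 48000")
    return [
        "ffmpeg",
        "-i", src,
        "-vn",                          # drop any video stream
        "-map", f"0:a:{stream_index}",  # pick one audio stream from input 0
        "-ar", str(sample_rate),        # resample to the target rate
        dst,
    ]
```

Passing `stream_index=1` would select the second audio stream, which the pipeline would otherwise discard.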

Input Video Codec Comparison

Processing speed is similar for all codecs because every input is transcoded to a standard format. However, some codecs experience greater quality loss during this process.

The following results are from our internal testing, where quality was measured using VMAF.

| Input Codec | Output Quality |
| --- | --- |
| H.264 | Best (Least quality loss) |
| MPEG-2 | Good (Up to 15% quality loss) |
| H.265 | Good (Up to 15% quality loss) |
| VP9 | Fair (Up to 20% quality loss) |
| AV1 | Fair (Over 20% quality loss) |