At sync, we’re building foundational models to understand and manipulate humans in video. Our suite of lipsync models lets you edit the lip movements of any speaker in any video to match a target audio track. Explore and compare the capabilities of the different models below.

| Feature | lipsync-2 | lipsync-1.9.0-beta | lipsync-1.8.0 | lipsync-1.7.1 |
|---|---|---|---|---|
| Description | Our most natural lipsync model yet, and the first that can preserve the unique speaking style of every speaker. Best across all kinds of video content. | Our fastest lipsync model. Standard, general-purpose, accurate lipsync. | Slow legacy model, suited to budget-constrained tasks. Use lipsync-1.9.0 and later for best results. | Fast legacy model, best suited to simple low-resolution avatar videos. |
| Price / min @ 25 fps | $2.40 – $3.00 | $1.20 – $1.50 | $0.80 – $1.00 | $0.80 – $1.00 |
| Accuracy | – | – | – | – |
| Speed | – | – | – | – |
| Style | Lip movements in the unique style of the speaker | Standard generic lip movements | Standard generic lip movements | Standard generic lip movements |
| Identity Preservation | – | – | – | – |
| Teeth | – | – | – | – |
| Face Detection | – | – | – | – |
| Face Blending | – | – | – | – |
| Pose Robustness | – | – | – | – |
| Beard | – | – | – | – |
| Face Resolution | 512×512 | 512×512 | 512×512 | 256×256 |
| Best for | All kinds of videos; outperforms every other lipsync model across all key attributes. | Simpler avatar-style use cases. | Legacy model; use lipsync-1.9.0 and above instead. | Legacy model; may work for low-quality videos. |
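Since pricing is quoted per minute of video at 25 fps, a short sketch can turn a clip length into a cost range. The model names and price ranges below come from the table above; the `estimate_cost` helper itself is illustrative, not part of any official SDK.

```python
# Illustrative sketch: estimate lipsync processing cost from the published
# per-minute price ranges (USD, at 25 fps). The helper name is ours, not
# an official API.

PRICE_PER_MIN = {
    "lipsync-2": (2.40, 3.00),
    "lipsync-1.9.0-beta": (1.20, 1.50),
    "lipsync-1.8.0": (0.80, 1.00),
    "lipsync-1.7.1": (0.80, 1.00),
}

def estimate_cost(model: str, duration_seconds: float) -> tuple[float, float]:
    """Return the (low, high) cost estimate in USD for a clip at 25 fps."""
    low, high = PRICE_PER_MIN[model]
    minutes = duration_seconds / 60
    return (round(low * minutes, 2), round(high * minutes, 2))

# A 90-second clip processed with lipsync-2:
print(estimate_cost("lipsync-2", 90))  # (3.6, 4.5)
```

Note that videos at other frame rates may be billed differently, since pricing is stated at 25 fps.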

All models are available in both Playground and API.