React models

react-1 introduces the first performance control primitive for video editing. It can synchronize lip movements, facial expressions, and head movements to match a target audio while following an emotion prompt. The workflows described below are only possible with react-1.

Key Features

  • Model Modes: choose which facial region to edit: just the mouth, the facial expressions, or the head movements as well
  • Expressive Lipsync: react-1 operates on a much larger facial region, giving you the most expressive mouth movements that match the speech
  • Facial Expressions: facial expressions are rewritten using an emotion prompt, and every tiny micro-expression stays perfectly in sync with the speech
  • Head Movements: react-1 can also synchronize your head movements to match the pacing, prosody, and intonation of the new dialogue.

Model modes

The model can be controlled with three modes of operation: lips, face, and head.

This lets you specify the spatial region you want to edit: you can opt for lipsync only, or also include facial expressions or head movements. The default is face.

Mode    Lipsync    Facial Expressions    Head Movements
lips    ✓          ✗                     ✗
face    ✓          ✓                     ✗
head    ✓          ✓                     ✓

Emotion prompts

You can guide the facial expressions by specifying an emotion prompt. You can also choose not to specify one, in which case the model will follow the emotional context of the input video. See the usage section below for more details.

Usage

react-1 is available through the same /v2/generate API endpoint used for standard lipsync, with additional parameters to control the emotional and movement effects.

To use react-1 with the API, set the model parameter to react-1 and configure the additional options:

from sync import Sync
from sync.common import Audio, Video, GenerationOptions

sync = Sync()

response = sync.generations.create(
    input=[
        Video(url="https://assets.sync.so/docs/example-video.mp4"),
        Audio(url="https://assets.sync.so/docs/example-audio.wav")
    ],
    model="react-1",
    options=GenerationOptions(
        prompt="happy",
        model_mode="face"
    )
)
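
If you call the /v2/generate endpoint directly instead of going through the SDK, the same fields carry over. The request below is only an illustrative translation of the SDK call above; the base URL, the x-api-key header name, and the exact JSON layout are assumptions, so verify them against the API reference:

import requests

# Illustrative direct call to /v2/generate. The base URL, header name, and
# JSON layout below are assumptions mirroring the SDK parameters above.
response = requests.post(
    "https://api.sync.so/v2/generate",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "model": "react-1",
        "input": [
            {"type": "video", "url": "https://assets.sync.so/docs/example-video.mp4"},
            {"type": "audio", "url": "https://assets.sync.so/docs/example-audio.wav"},
        ],
        "options": {"prompt": "happy", "model_mode": "face"},
    },
)
response.raise_for_status()
print(response.json())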

API Parameters

model

Set to react-1 to use the react-1 model.

options.model_mode

Controls the edit region and movement scope for the model. Available options:

  • lips: Only lipsync using react-1 (minimal facial changes)
  • face (default): Lipsync + facial expressions without head movements
  • head: Lipsync + facial expressions + natural talking head movements

The model_mode parameter only works with the react-1 model. For other models, this parameter is ignored.
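
For example, a lips-only edit that keeps the original expressions and head motion largely untouched could look like the sketch below, which simply reuses the example call from the Usage section with a different model_mode and no emotion prompt:

from sync import Sync
from sync.common import Audio, Video, GenerationOptions

sync = Sync()

# Restrict the edit region to the mouth; the emotion prompt is omitted here (it is optional).
response = sync.generations.create(
    input=[
        Video(url="https://assets.sync.so/docs/example-video.mp4"),
        Audio(url="https://assets.sync.so/docs/example-audio.wav")
    ],
    model="react-1",
    options=GenerationOptions(model_mode="lips")
)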

options.prompt

Emotion prompt for the generation. Currently supports single-word emotions only.

Available options:

  • happy
  • angry
  • sad
  • neutral
  • disgusted
  • surprised

The prompt parameter only works with the react-1 model. For other models, this parameter is ignored.
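
For a quick side-by-side comparison, the sketch below requests one generation per supported emotion prompt, reusing the example assets from above (omit the prompt entirely if you want the model to follow the input video's own emotional context instead):

from sync import Sync
from sync.common import Audio, Video, GenerationOptions

sync = Sync()

# One face-mode generation per supported single-word emotion prompt.
emotions = ["happy", "angry", "sad", "neutral", "disgusted", "surprised"]

responses = {
    emotion: sync.generations.create(
        input=[
            Video(url="https://assets.sync.so/docs/example-video.mp4"),
            Audio(url="https://assets.sync.so/docs/example-audio.wav")
        ],
        model="react-1",
        options=GenerationOptions(prompt=emotion, model_mode="face")
    )
    for emotion in emotions
}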

react-1 is available in Sync Studio with an intuitive interface for controlling emotional expressions and head movements.

2. Select your assets

Upload or select your video and audio inputs. Remember that react-1 supports inputs up to 15 seconds in duration.

3. Choose react-1 model

Select react-1 from the model dropdown in the generation settings.

4. Configure model mode

Choose your desired model mode:

  • Lips: For lipsync only
  • Face: For lipsync with facial expressions (default)
  • Head: For lipsync with facial expressions and natural head movements

5. Change expression

Select an expression from the emotion wheel in the video player controls.

6. Generate

Click generate to create your lipsync with emotional expressions and optional head movements.

Best Practices

Choose the Right Mode

  • Use lips mode when you only need lipsync without emotional changes
  • Use face mode (default) for most use cases where you want natural expressions
  • Use head mode when you want the most dynamic and natural talking head movements

Select Appropriate Emotions

Choose emotion prompts that match the tone and context of your audio. The model will generate facial expressions that align with the selected emotion throughout the generation.

Input Duration

Keep your inputs under 15 seconds. For longer content, break your video into segments and process them separately, or use the standard lipsync models for longer durations.
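
A minimal sketch of that approach, assuming you have already split the video and audio into matching ≤15-second pieces and that the segment URLs below are placeholders for your own files:

from sync import Sync
from sync.common import Audio, Video, GenerationOptions

sync = Sync()

# Placeholder URLs for pre-split, <=15-second video/audio pairs.
segments = [
    ("https://example.com/clip-part1.mp4", "https://example.com/audio-part1.wav"),
    ("https://example.com/clip-part2.mp4", "https://example.com/audio-part2.wav"),
]

# Submit one react-1 generation per segment pair.
responses = [
    sync.generations.create(
        input=[Video(url=video_url), Audio(url=audio_url)],
        model="react-1",
        options=GenerationOptions(prompt="happy", model_mode="face")
    )
    for video_url, audio_url in segments
]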

Current Limitations

The following features are not yet supported for react-1:

  • Input Duration: inputs longer than 15 seconds are not supported. For longer content, consider breaking your video into segments.
  • Segments: Multi-segment generation with different audio inputs is not available. Process each segment separately if needed.
  • Speaker Selection: The active_speaker_detection option is not supported, including both automatic detection (auto_detect) and manual selection via bounding box or frame number. Ensure your input video contains a single, clearly visible speaker.
  • Occlusion detection: The occlusion_detection_enabled option for handling partially hidden faces is not available for react-1.

When to use react-1

react-1 is ideal for:

  • Short-form content (≤ 15 seconds) requiring emotional expressions
  • Videos where natural head movements enhance the result
  • Content that benefits from emotion-aware facial expressions
  • Projects where you want more dynamic and expressive lipsync results

For longer content (> 15 seconds) or when you only need standard lipsync, consider using lipsync-2 or lipsync-2-pro instead.