The Decoder, October 23, 2024

Smartphone video could reshape facial animation with Act-One

RunwayML has introduced Act-One, an AI model that transfers an actor’s video and voice performance to animated characters. The company says the model can work from smartphone footage and apply one performance across different character designs, including photorealistic avatars.

Act-One is built to make facial animation easier to produce. Instead of relying on complex facial animation workflows and specialized equipment, the system is designed to move an actor’s performance from video and voice recordings onto an animated character.

The core idea is simple: capture the performance once, then apply it to a character. For film production, games, animated movies, and avatar-based narrative content, that could change who can create expressive digital performances and how quickly they can move from acting to animation.

What Act-One is designed to do

Act-One focuses on facial performance transfer. RunwayML says the model can take an actor’s expressions and voice from an input video and carry them over to an animated character. The company also says a smartphone is enough to record the source performance.

That matters because traditional facial animation can require complex production steps and dedicated gear. Act-One is presented as a way to reduce that setup. The actor performs, the video and audio provide the source material, and the model applies that performance to a target character.

RunwayML says the model can capture and transfer subtle details in the performance. In facial animation, those details are often central to whether a character feels believable. Small shifts in expression, timing, and delivery can carry emotional meaning, especially in close-up shots or dialogue-heavy scenes.

The model is not described only as a tool for matching a human face to a similar digital face. RunwayML says Act-One can be used with different reference images, which means the same source performance can be mapped onto characters with different visual designs.

Why a smartphone input changes the workflow

The source article frames Act-One as a simplification of facial animation. The key production change is the input requirement: video and voice recordings, potentially captured on a smartphone, instead of a more specialized motion capture process.

That does not mean the creative work disappears. A convincing animated performance still depends on acting, character design, direction, and how the final scene is assembled. But the capture step becomes more accessible if the performance can begin with consumer-grade video rather than a dedicated capture setup.

For creators, that could make iteration easier. A performer can try a reading, adjust expression and timing, and create another version without needing the same kind of technical environment associated with more complex facial animation methods.

For production teams, the implication is speed and flexibility. If the model can preserve subtle performance details from simple footage, teams may be able to test character performances earlier in the creative process. That could be useful when exploring tone, casting, dialogue, or the feel of a scene before committing to a more complete production pass.

One actor, multiple digital roles

RunwayML highlights Act-One’s ability to apply a single input performance to multiple character designs and styles. In the company’s demonstration, one actor’s source video is transferred to different characters, showing how the same performance can be reused across varied visual targets.

The company also presents a more specific storytelling use case: one actor performing multiple virtual avatars in the same scene. The source describes this as a way to create narrative content with a consumer-grade camera and a single actor reading and performing different characters from a script.

That use case is important because it moves Act-One beyond a narrow animation utility. It suggests a workflow where performance, character switching, and scene creation can happen with fewer people and less equipment than traditional approaches may require.

The obvious applications include games and animated movies, where human facial expressions and voice can be transferred to animated characters. RunwayML also points to photorealistic avatars, which opens another path: using acting performances to drive virtual humans or lifelike digital characters.

How flexible the model is claimed to be

RunwayML says Act-One is designed to keep facial expressions realistic even when the target character’s proportions are different from the original video source. That is a significant claim because animation often has to bridge the gap between a real actor and a stylized or differently shaped character.

The company also says the model produces cinematic, realistic results. In the demonstration described by the source, Act-One is shown working across different camera angles and focal lengths. Those details matter because production footage is rarely limited to one fixed framing or one type of shot.

Several capabilities stand out from the source:

It transfers an actor’s video and voice performance to an animated character.
It can use smartphone footage as the input.
It can apply the same performance to different reference images.
It is designed to handle characters with different proportions from the source actor.
It is shown working with different camera angles and focal lengths.
It can be used for animated characters as well as photorealistic avatars.

Together, those points describe a model aimed at making facial animation more direct. The promise is not only that a face can be animated, but that a performance can move across character designs while retaining the expressive qualities of the original actor.

Availability and the bigger production question

RunwayML says access to Act-One is being phased in for users now and will soon be available to everyone. The source does not provide a broader release schedule, pricing, technical requirements, or detailed limitations.

That leaves some practical questions open. The source does not explain how Act-One performs across every style of character, how much manual cleanup may be needed, or how it fits into full production pipelines. It also does not give technical details about the model’s training or deployment.

Still, the direction is clear. Act-One is positioned as a tool that lowers the barrier between acting and facial animation. If RunwayML’s claims hold up in everyday production use, the model could make expressive character work more accessible to creators working with simpler capture setups.

For film production, the source suggests implications beyond motion capture. For games and animated movies, the immediate value is easier performance transfer. For avatar-driven content, the model points toward scenes where one performer can drive multiple digital characters from recorded video and voice.