Veo 3.1: native audio and reference controls

by ¶.ai
On a mission to make AI more accessible, practical, and human-centric by bridging the gap between technical capabilities and real human needs.
November 12, 2025

Veo is Google's video generation model, its latest attempt to teach computers how to make videos from scratch. Now in version 3.1, it's available in paid preview through either Google AI Studio or Vertex AI. You can choose between the regular version and a faster one, depending on how impatient you are.

So what's new? Veo can now add its own audio straight into the videos it creates. That means you get everything from people talking to sound effects that actually match what's happening on screen, all in one go. No more stitching together separate audio and video files. On top of that, Veo has gotten better at understanding what makes a video look cinematic, and it does a much better job of sticking to your prompts and keeping characters looking the same from scene to scene.

There are also three new ways to control how Veo makes videos. If you want a character or an object to look the same every time, you can now give Veo up to three reference images. It will use these as a guide, so your videos don't end up with a hero who changes hair color halfway through.
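To make the three-image limit concrete, here is a minimal sketch of a client-side check you might run before sending anything to Veo. It is not Google's SDK: `VideoRequest`, `reference_images`, and `MAX_REFERENCE_IMAGES` are hypothetical names made up for illustration, and the only fact taken from the announcement is the cap of three reference images.

```python
from dataclasses import dataclass, field

# Limit stated in the article: up to three reference images per request.
MAX_REFERENCE_IMAGES = 3


@dataclass
class VideoRequest:
    """Hypothetical request container; not part of any Google SDK."""

    prompt: str
    # Paths to reference images (assumed representation for this sketch).
    reference_images: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject empty prompts and oversized reference sets up front.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if len(self.reference_images) > MAX_REFERENCE_IMAGES:
            raise ValueError(
                f"at most {MAX_REFERENCE_IMAGES} reference images allowed, "
                f"got {len(self.reference_images)}"
            )
```

Checking locally like this means a request with four images fails immediately with a clear message instead of after a round trip to the service.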

Veo can also extend your videos, attaching new scenes to the end of your last clip so the story keeps flowing. And if you want a smooth transition between two images, you give Veo the first and last frame, and it fills in the gap with video and matching audio. Individual clips run four, six, or eight seconds, and you can string them together for something much longer.
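If you are planning a longer piece out of those clip lengths, a little arithmetic helps. The sketch below is only an illustration of that arithmetic, not a description of how Veo's extension feature works internally: `plan_clips` is a hypothetical helper, and it assumes you simply want a sequence of four-, six-, and eight-second clips whose total covers a target length with as little overshoot as possible.

```python
# Clip lengths named in the article, longest first.
CLIP_LENGTHS = (8, 6, 4)


def plan_clips(target_seconds: int) -> list[int]:
    """Hypothetical helper: pick clip lengths that cover target_seconds.

    All clip lengths are even, so the total is the target rounded up
    to the next even number, with a floor of four seconds.
    """
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")

    remaining = max(4, target_seconds + target_seconds % 2)
    clips: list[int] = []
    while remaining > 0:
        for length in CLIP_LENGTHS:
            rest = remaining - length
            # Take the longest clip that leaves either nothing
            # or enough room for at least one more clip.
            if rest == 0 or rest >= 4:
                clips.append(length)
                remaining = rest
                break
    return clips


print(plan_clips(20))  # [8, 8, 4]
print(plan_clips(10))  # [6, 4]
```

A ten-second target comes out as a six-second clip followed by a four-second one, because starting with eight seconds would leave a two-second gap that no clip length can fill.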

Why does any of this matter? Well, studios like Promise are already using Veo to help directors sketch out stories before a single frame is shot, and companies like Latitude are letting people turn their own stories into videos in seconds. If you're building anything that needs to keep characters looking the same, scenes flowing smoothly, or audio and video in sync, Veo takes care of the hard parts for you. Instead of juggling a bunch of different tools, you just tell Veo what you want, and it does the rest.

Read the API announcement on the Google Developers Blog
