Google’s Veo 3 AI video model is a league above its competitors for one important reason: sound. You can prompt not just what you see on screen, but also what you hear.
Built by Google’s DeepMind lab, the first Veo model debuted in May 2024, and each new generation has added more functionality. Compared with rivals, it has always performed well on motion accuracy and understanding of physics, but the addition of sound was the game changer.
You can use it for a short commercial, a scene from a film you are writing, or even a music video. But one use I have seen more than any other is ASMR (autonomous sensory meridian response): those soft tapping, whispering and similar sounds that produce a tingling sensation for some people.
To see how far it can go, I came up with a series of ASMR food prompts, each designed to produce a video with matching sounds built around a particular food.
Prompting Veo 3 in the Gemini app
Veo 3 is now available in the Gemini app. Just select the video option when starting a new prompt, type what you want, and it generates an 8-second clip.
Although Gemini is not necessarily the best way to access Veo 3 (I would recommend Freepik, Fal, Higgsfield or Google Flow), it is easy to use and gets the job done.
One of the key advantages of using Gemini directly is that it automatically interprets and enhances your prompts. So if you ask for “a cool ASMR video featuring lasagna”, that is what you get.
You can be more specific by using what are called structured prompts. But unless you need exact control, a simple paragraph (also known as a narrative prompt) is usually more effective.
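To make the difference between the two prompt styles concrete, here is a minimal Python sketch. The field names (`subject`, `action`, `camera`, `audio`, `avoid`) and the helper `to_narrative` are hypothetical and chosen only for illustration; they are not an official Veo 3 or Gemini schema. The sketch only builds text that you could paste into the Gemini app.

```python
import json

# Hypothetical field names for illustration; not an official Veo 3 schema.
structured_prompt = {
    "subject": "a slice of lasagna on a white plate",
    "action": "a fork presses slowly into the layers",
    "camera": "macro close-up, shallow depth of field",
    "audio": ["soft squish of the fork", "gentle clink on the plate"],
    "avoid": ["music", "voiceover"],
}


def to_narrative(p: dict) -> str:
    """Flatten the structured fields into one plain-English paragraph."""
    sounds = " and ".join(p["audio"])
    avoid = " or ".join(p["avoid"])
    return (
        f"ASMR food video of {p['subject']}. {p['action'].capitalize()}. "
        f"Shot as a {p['camera']}. We hear the {sounds}. No {avoid}."
    )


# Structured style: every element is a separate, explicit field.
structured_text = json.dumps(structured_prompt, indent=2)

# Narrative style: the same details written as a short paragraph.
narrative_text = to_narrative(structured_prompt)

print(structured_text)
print(narrative_text)
```

The structured version makes each element easy to change in isolation, while the narrative version reads like the kind of paragraph the article recommends for most cases.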
The prompts
The first job in any AI project is thinking about your prompts. The models are getting better at interpreting intent, but if you know what you want, it is better to be specific.
I knew I wanted ASMR food videos, so I started with a test: “ASMR Food Video with sound.”
The result? Decent: it gave me lasagna, which I must have had on my mind. Then I refined it, introducing specific types of food, adding sound details, and even trying a structured prompt for a fizzy drink with ice.
Most of the time, a narrative prompt works best. Just describe what you want to see, how the video should flow, and how it should sound.
1. Lasagna
The first prompt, “ASMR Food Video with Sound”, created a wonderful clip of a fork going into a piece of lasagna. As the fork goes in, you hear the squish, then the clink when it hits the plate. This is one case where I wish Veo 3 had an “extend clip” button.
I gave no other instructions, so I had no control over what the food would be, how it would sound, or whether the sound would work at all. That is why it is important to be specific when prompting AI models, even in chatbots like Gemini.
2. Cooking and food
After that, I got more specific: a long, narrative-style prompt asking Veo 3 to generate a close-up of a chef preparing satisfying food in a well-lit kitchen.
I asked for slow-motion visuals, the sizzle of butter melting in the pan, and a crunch as the chef chops the ingredients.
I also added this line: “Emphasize the audio quality: clean, layered ASMR soundscape without music”, to guide not only the sound itself but also its style and what I did not want to hear.
3. Popcorn popping
For the final prompt I started with an image. I used Midjourney v7 to create a picture of a woman looking at rainbow popcorn, then added the prompt “ASMR Food” in Gemini.
Visually, the result was amazing, but for some reason the woman says in a voiceover, “It is delicious, it’s rainbow popcorn.” That is on me: I did not make it clear whether she should speak, or what she should say.
A simple fix: put the speech you want in quotation marks. For example, I could have used “I like to watch my popcorn pop”, with emphasis on the word pop. I could also have made it clear that she was talking to the camera, and Veo 3 would have synced the lip movement to match.
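The fix described here, quoting the intended speech and stating who is speaking and to whom, can be sketched as a small prompt-building helper. This is only an illustration under stated assumptions: `build_prompt` and its parameters are hypothetical names, the spoken line is an example, and the output is plain text to paste into Gemini rather than a call to any Veo 3 API.

```python
def build_prompt(scene, speaker=None, line=None, to_camera=False, audio=None):
    """Assemble a narrative prompt; all parameter names are illustrative."""
    parts = [scene.rstrip(".") + "."]
    if line:
        who = speaker or "The subject"
        target = " directly to the camera" if to_camera else ""
        spoken = line.rstrip(".") + "."
        # Put the exact words in quotation marks so the speech is unambiguous.
        parts.append(f'{who} says{target}: "{spoken}"')
    else:
        # State explicitly that nobody should talk, to avoid a surprise voiceover.
        parts.append("No one speaks and there is no voiceover.")
    if audio:
        parts.append(f"Audio: {audio.rstrip('.')}.")
    return " ".join(parts)


talking = build_prompt(
    "ASMR food video of a woman watching rainbow popcorn pop",
    speaker="The woman",
    line="I like to watch my popcorn pop",
    to_camera=True,
    audio="crisp popping sounds, no music",
)
silent = build_prompt("ASMR food video of rainbow popcorn popping")

print(talking)
print(silent)
```

Either way, the prompt says outright whether anyone speaks, which is the detail that was missing from the original “ASMR Food” prompt.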
Conclusion
Overall, Veo 3 delivers impressive results, especially when it comes to generating high-quality sound that accurately reflects the visuals. While there are some quirks to navigate, such as unintended voiceovers or slightly underbaked lasagna, they are easily resolved with more specific prompts.


