Google Begins Rolling Out Gemini 2.5 Native Audio Dialogue and Controllable Speech Generation in Preview

Google introduced new audio generation capabilities for its Gemini 2.5 models at Google I/O 2025. The Mountain View-based tech giant is now letting developers and individuals test these features on its platforms. The two new capabilities are Gemini 2.5 Flash preview with native audio dialogue and controllable text-to-speech (TTS). The former can generate human-like audio when responding to user prompts, while the latter can turn any script into conversational speech. These features are currently available to developers through the application programming interface (API).

Google Gemini 2.5 Flash’s audio output capabilities

In a blog post, the tech giant described the features of these two audio generation methods, highlighting how developers can use them to build new experiences for people. Currently, native audio dialogue can be tested in the Stream tab of Google AI Studio, while the TTS feature can be tested in the Generate Media tab inside AI Studio.

Native audio dialogue with the Gemini 2.5 Flash preview is designed for real-time conversation between a human user and the AI. The user can either type a prompt or speak it, and the AI responds verbally. The model generates audio directly instead of first producing text and then converting it into speech.

This approach has several benefits. It supports affective dialogue, which means that when Gemini 2.5 Flash responds to the user's voice, it can recognise the emotions behind the words. It can tell when the user is scared, angry, or surprised, and respond accordingly.

In addition, the audio generation feature can judge when to speak, can adopt different accents and speaking styles, can access tools such as Google Search, and supports more than 24 languages.

Coming to the controllable TTS capability, it offers multi-speaker dialogue generation, can produce the emotions and tone described in the script, lets users control delivery pace, emphasis, and accent, and supports the same 24 languages as well as mixing between languages.
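As a rough illustration of what a multi-speaker script with tone and pace directions might look like, here is a minimal Python sketch. The `Line` class, the `build_tts_prompt` helper, and the "Speaker (style): text" layout are all hypothetical choices made for this example and are not taken from Google's documentation. The sketch only assembles the script text and does not call any Gemini API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    """One spoken line in a script (hypothetical structure for illustration)."""
    speaker: str
    text: str
    style: str = ""  # optional delivery note, e.g. "excited, fast"


def build_tts_prompt(lines: List[Line], overall_direction: str = "") -> str:
    """Join script lines into a single plain-text prompt.

    The layout is an assumption: an optional overall direction on the first
    line, followed by one "Speaker (style): text" line per utterance.
    """
    parts = []
    if overall_direction:
        parts.append(overall_direction.strip())
    for line in lines:
        note = f" ({line.style})" if line.style else ""
        parts.append(f"{line.speaker}{note}: {line.text}")
    return "\n".join(parts)


script = [
    Line("Host", "Welcome back to the show.", "warm, unhurried"),
    Line("Guest", "Thanks, glad to be here!", "excited, fast"),
]
prompt = build_tts_prompt(script, "Read this as a relaxed podcast conversation.")
print(prompt)
```

A script assembled this way could then be pasted into the TTS interface in AI Studio or sent through the API, where the model would be expected to interpret the per-speaker style notes.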

Google says these capabilities were evaluated for potential risks throughout the development process. The company used both internal and external red teaming to find and fix any weaknesses. It also highlighted that all audio outputs from these models are embedded with its watermarking technology.
