{"id":21054,"date":"2025-12-12T17:26:56","date_gmt":"2025-12-12T17:26:56","guid":{"rendered":"https:\/\/scannn.com\/gemini-2-5-native-audio-upgrade-plus-text-to-speech-model-updates\/"},"modified":"2025-12-12T17:26:56","modified_gmt":"2025-12-12T17:26:56","slug":"gemini-2-5-native-audio-upgrade-plus-text-to-speech-model-updates","status":"publish","type":"post","link":"https:\/\/scannn.com\/lv\/gemini-2-5-native-audio-upgrade-plus-text-to-speech-model-updates\/","title":{"rendered":"Gemini 2.5 Native Audio upgrade, plus text-to-speech model updates"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<h3 data-block-key=\"61bfj\">What customers are saying<\/h3>\n<p data-block-key=\"c8rur\">Google Cloud customers are already using Gemini\u2019s native audio capabilities to drive real business results, from mortgage processing to customer calls.<\/p>\n<ul>\n<li data-block-key=\"f9d5j\"><i>\u201cUsers often forget they\u2019re talking to AI within a minute of using Sidekick, and in some cases have thanked the bot after a long chat\u2026New Live API AI capabilities offered through Gemini [2.5 Flash Native Audio] empower our merchants to win.\u201d<\/i> \u2013 David Wurtz, VP of Product, Shopify<\/li>\n<li data-block-key=\"cpoqr\"><i>&#8220;By integrating the Gemini 2.5 Flash Native Audio model\u2026we&#8217;ve significantly enhanced Mia&#8217;s capabilities since launching in May 2025. 
This powerful combination has enabled us to generate over 14,000 loans for our broker partners.&#8221;<\/i> \u2013 Jason Bressler, Chief Technology Officer, United Wholesale Mortgage (UWM)<\/li>\n<li data-block-key=\"5gvgc\"><i>\u201cWorking with the Gemini 2.5 Flash Native Audio model through Vertex AI allows Newo.ai AI Receptionists to achieve unmatched conversational intelligence &#8230; They can identify the main speaker even in noisy settings, switch languages mid-conversation, and sound remarkably natural and emotionally expressive.\u201d<\/i> \u2013 David Yang, Co-founder, Newo.ai<\/li>\n<\/ul>\n<h2 data-block-key=\"7lcen\">Live Speech Translation<\/h2>\n<p data-block-key=\"9k6cr\">Gemini now natively supports new live speech-to-speech translation capabilities designed to handle both continuous listening and two-way conversation.<\/p>\n<p data-block-key=\"38f3\">With continuous listening, Gemini automatically translates speech in multiple languages into a single target language. This allows you to put headphones in and hear the world around you in your language.<\/p>\n<p data-block-key=\"eq38s\">For two-way conversation, Gemini\u2019s live speech translation handles translation between two languages in real time, automatically switching the output language based on who is speaking. 
For example, if you speak English and want to chat with a Hindi speaker, you\u2019ll hear English translations in real time in your headphones, while your phone broadcasts Hindi when you\u2019re done speaking.<\/p>\n<p data-block-key=\"86q6c\">Gemini\u2019s live speech translation has a number of key capabilities that help in the real world:<\/p>\n<ul>\n<li data-block-key=\"2afoq\"><b>Language coverage:<\/b> Translates speech in over 70 languages and 2,000 language pairs by combining the Gemini model\u2019s world knowledge and multilingual capabilities with its native audio capabilities.<\/li>\n<li data-block-key=\"di844\"><b>Style transfer:<\/b> Captures the nuance of human speech, preserving the speaker\u2019s intonation, pacing and pitch so the translation sounds natural.<\/li>\n<li data-block-key=\"bdrko\"><b>Multilingual input:<\/b> Understands multiple languages simultaneously in a single session, helping you follow multilingual conversations without needing to fiddle around with language settings.<\/li>\n<li data-block-key=\"alfj9\"><b>Auto detection:<\/b> Identifies the spoken language and begins translation, so you don\u2019t even need to know what language is being spoken to start translating.<\/li>\n<li data-block-key=\"4j5i0\"><b>Noise robustness:<\/b> Filters out ambient noise so you can converse comfortably even in loud outdoor environments.<\/li>\n<\/ul>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/blog.google\/products\/gemini\/gemini-audio-model-updates\/\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>What customers are saying Google Cloud customers are already using Gemini\u2019s native audio capabilities to drive real business results, from mortgage processing to customer calls. 
\u201cUsers often forget they\u2019re talking to AI within a minute of using Sidekick, and in some cases have thanked the bot after a long chat\u2026New Live API AI capabilities offered [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":21055,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[100],"tags":[],"class_list":["post-21054","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-google"],"_links":{"self":[{"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/posts\/21054","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/comments?post=21054"}],"version-history":[{"count":0,"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/posts\/21054\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/media\/21055"}],"wp:attachment":[{"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/media?parent=21054"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/categories?post=21054"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/scannn.com\/lv\/wp-json\/wp\/v2\/tags?post=21054"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}