AI News, April 14
DolphinGemma: Google AI model understands dolphin chatter

Google has developed DolphinGemma, an AI model designed to decipher the complex communication of dolphins. Built in collaboration with the Georgia Institute of Technology and the Wild Dolphin Project, the model analyses the clicks, whistles, and pulse signals dolphins produce in order to understand how they communicate. DolphinGemma can identify patterns in dolphin vocalisations and even generate dolphin-like sound sequences. In parallel, researchers are exploring the CHAT system, which aims to enable two-way interaction by establishing a simple shared vocabulary. Google plans to release DolphinGemma as an open model to accelerate research on dolphin communication worldwide and advance the possibility of interspecies communication.

🐬 **The birth of DolphinGemma:** Google, working with the Georgia Institute of Technology and the Wild Dolphin Project, developed DolphinGemma, an AI model dedicated to understanding dolphin sounds. Based on Google's Gemma family of models, it can analyse the structure of dolphin vocalisations and generate new sound sequences.

🗣️ **Categorising dolphin sounds:** The Wild Dolphin Project (WDP) has studied dolphin vocalisations for decades, classifying them into several types, including signature whistles that act as identity markers, burst-pulse sounds associated with conflict, and click "buzzes" heard during courtship. This research provides the essential foundation of labelled data for training DolphinGemma.

📱 **AI meets mobile technology:** Google Pixel phones play a key role in the research, processing high-fidelity audio data in real time and powering the CHAT system, which seeks to enable interaction by associating specific synthetic whistles with objects the dolphins enjoy. The next-generation CHAT system will use a Pixel 9 for improved performance.

🤝 **An open model ahead:** Google plans to release DolphinGemma as an open model this summer so that researchers worldwide can use it to analyse their own acoustic datasets, accelerating the understanding of dolphin communication and advancing interspecies-communication research.

Google has developed an AI model called DolphinGemma to decipher how dolphins communicate and one day facilitate interspecies communication.

The intricate clicks, whistles, and pulses echoing through the underwater world of dolphins have long fascinated scientists. The dream has been to understand and decipher the patterns within their complex vocalisations.

Google, collaborating with engineers at the Georgia Institute of Technology and leveraging the field research of the Wild Dolphin Project (WDP), has unveiled DolphinGemma to help realise that goal.

Announced around National Dolphin Day, the foundational AI model represents a new tool in the effort to comprehend cetacean communication. Trained specifically to learn the structure of dolphin sounds, DolphinGemma can even generate novel, dolphin-like audio sequences.

Over decades, the Wild Dolphin Project – operational since 1985 – has run the world's longest continuous underwater study of dolphins to develop a deep understanding of context-specific sounds, such as:

- Signature whistles that act as identity markers
- Burst-pulse sounds associated with conflict
- Click "buzzes" heard during courtship

WDP’s ultimate goal is to uncover the inherent structure and potential meaning within these natural sound sequences, searching for the grammatical rules and patterns that might signify a form of language.

This long-term, painstaking analysis has provided the essential grounding and labelled data crucial for training sophisticated AI models like DolphinGemma.

DolphinGemma: The AI ear for cetacean sounds

Analysing the sheer volume and complexity of dolphin communication is a formidable task ideally suited for AI.

DolphinGemma, developed by Google, employs specialised audio technologies to tackle this. It uses the SoundStream tokeniser to efficiently represent dolphin sounds, feeding this data into a model architecture adept at processing complex sequences.

Based on insights from Google’s Gemma family of lightweight, open models (which share technology with the powerful Gemini models), DolphinGemma functions as an audio-in, audio-out system.

Fed with sequences of natural dolphin sounds from WDP’s extensive database, DolphinGemma learns to identify recurring patterns and structures. Crucially, it can predict the likely subsequent sounds in a sequence—much like human language models predict the next word.
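
To make the audio-in, audio-out idea concrete, here is a minimal sketch of next-token prediction over discrete acoustic tokens. The tokeniser stand-in, vocabulary size, and model dimensions are illustrative assumptions, not published details of DolphinGemma, which uses a learned SoundStream codec and a Gemma-based architecture.

```python
# Minimal sketch of the audio-in, audio-out idea: dolphin audio becomes
# discrete tokens, and a causal sequence model learns to predict the next
# token. All sizes and the tokeniser are illustrative stand-ins.
import torch
import torch.nn as nn

VOCAB_SIZE = 1024   # assumed codebook size of a SoundStream-style tokeniser
CONTEXT = 256       # assumed context length in tokens

def tokenise_audio(waveform: torch.Tensor) -> torch.Tensor:
    """Placeholder for a SoundStream-style neural audio codec.

    Here we simply bucket samples into discrete ids so the example runs;
    the real tokeniser is a learned encoder/quantiser.
    """
    return ((waveform.clamp(-1, 1) + 1) / 2 * (VOCAB_SIZE - 1)).long()

class TinyAudioLM(nn.Module):
    """A small causal model over acoustic tokens (stand-in for Gemma)."""
    def __init__(self, vocab=VOCAB_SIZE, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):
        hidden, _ = self.rnn(self.embed(tokens))
        return self.head(hidden)  # logits for the next token at each step

# Train on next-token prediction, exactly like a text language model.
model = TinyAudioLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
waveform = torch.rand(1, CONTEXT + 1) * 2 - 1   # fake dolphin audio clip
tokens = tokenise_audio(waveform)
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1)
)
loss.backward()
opt.step()
print("next-token loss:", loss.item())
```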

With around 400 million parameters, DolphinGemma is optimised to run efficiently, even on the Google Pixel smartphones WDP uses for data collection in the field.

As WDP begins deploying the model this season, it promises to accelerate research significantly. By automatically flagging patterns and reliable sequences previously requiring immense human effort to find, it can help researchers uncover hidden structures and potential meanings within the dolphins’ natural communication.

The CHAT system and two-way interaction

While DolphinGemma focuses on understanding natural communication, a parallel project explores a different avenue: active, two-way interaction.

The CHAT (Cetacean Hearing Augmentation Telemetry) system – developed by WDP in partnership with Georgia Tech – aims to establish a simpler, shared vocabulary rather than directly translating complex dolphin language.

The concept relies on associating specific, novel synthetic whistles (created by CHAT, distinct from natural sounds) with objects the dolphins enjoy interacting with, like scarves or seaweed. Researchers demonstrate the whistle-object link, hoping the dolphins’ natural curiosity leads them to mimic the sounds to request the items.
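
Conceptually, the interaction reduces to a small shared vocabulary: a lookup from each synthetic whistle to the object it names, consulted whenever a mimic is detected. The sketch below is only an illustrative outline under that reading; the whistle ids and objects are hypothetical, and the real CHAT system is not published as code.

```python
# Illustrative outline of the CHAT interaction loop. The whistle ids and
# objects below are hypothetical stand-ins, not the real CHAT vocabulary.
from typing import Optional

SYNTHETIC_WHISTLES = {
    "whistle_A": "scarf",
    "whistle_B": "seaweed",
}

def on_mimic_detected(whistle_id: str) -> Optional[str]:
    """Return the object to present when a dolphin mimics a known whistle."""
    obj = SYNTHETIC_WHISTLES.get(whistle_id)
    if obj is not None:
        print(f"Mimic of {whistle_id} detected; present the {obj}.")
    return obj

on_mimic_detected("whistle_B")  # researcher hands over the seaweed
```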

As more natural dolphin sounds are understood through work with models like DolphinGemma, these could potentially be incorporated into the CHAT interaction framework.

Google Pixel enables ocean research

Underpinning both the analysis of natural sounds and the interactive CHAT system is crucial mobile technology. Google Pixel phones serve as the brains for processing the high-fidelity audio data in real-time, directly in the challenging ocean environment.

The CHAT system, for instance, relies on Google Pixel phones to:

- Listen for dolphin mimics of the synthetic whistles amid background ocean noise
- Identify, in real time, which whistle a mimic matches
- Alert the researcher to the object the dolphin appears to be requesting

This allows the researcher to respond quickly with the correct object, reinforcing the learned association. While a Pixel 6 initially handled this, the next generation CHAT system (planned for summer 2025) will utilise a Pixel 9, integrating speaker/microphone functions and running both deep learning models and template matching algorithms simultaneously for enhanced performance.
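
One common way to implement the template-matching half of that pipeline is normalized cross-correlation of spectrograms. The sketch below takes that approach with numpy/scipy as an assumption about the general technique; it is not WDP's actual detector, and the sample rate and whistle templates are made up.

```python
# Hedged sketch: decide which synthetic whistle a recorded sound most
# resembles via cross-correlation of normalised spectrograms. This is a
# generic template-matching approach, not the actual CHAT/WDP algorithm.
import numpy as np
from scipy import signal

SAMPLE_RATE = 96_000  # assumed; real hydrophone rates vary

def spectrogram(audio: np.ndarray) -> np.ndarray:
    _, _, spec = signal.spectrogram(audio, fs=SAMPLE_RATE, nperseg=512)
    return spec / (np.linalg.norm(spec) + 1e-9)  # normalise total energy

def best_matching_whistle(recording: np.ndarray,
                          templates: dict[str, np.ndarray]) -> str:
    """Return the template id whose spectrogram best matches the recording."""
    rec_spec = spectrogram(recording)
    scores = {}
    for name, template in templates.items():
        tmpl_spec = spectrogram(template)
        # Slide the template over the recording in time and take the peak
        # correlation as the match score.
        corr = signal.correlate2d(rec_spec, tmpl_spec, mode="valid")
        scores[name] = corr.max()
    return max(scores, key=scores.get)

# Toy usage with synthetic chirps standing in for whistle templates.
t = np.linspace(0, 0.5, int(SAMPLE_RATE * 0.5))
templates = {
    "whistle_scarf": signal.chirp(t, f0=8_000, f1=15_000, t1=0.5),
    "whistle_seaweed": signal.chirp(t, f0=15_000, f1=8_000, t1=0.5),
}
recording = templates["whistle_scarf"] + 0.3 * np.random.randn(t.size)
print(best_matching_whistle(recording, templates))  # likely "whistle_scarf"
```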

Using smartphones like the Pixel dramatically reduces the need for bulky, expensive custom hardware. It improves system maintainability, lowers power requirements, and shrinks the physical size. Furthermore, DolphinGemma’s predictive power integrated into CHAT could help identify mimics faster, making interactions more fluid and effective.

Recognising that breakthroughs often stem from collaboration, Google intends to release DolphinGemma as an open model later this summer. While trained on Atlantic spotted dolphins, its architecture holds promise for researchers studying other cetaceans, potentially requiring fine-tuning for different species' vocal repertoires.

The aim is to equip researchers globally with powerful tools to analyse their own acoustic datasets, accelerating the collective effort to understand these intelligent marine mammals. We are shifting from passive listening towards actively deciphering patterns, bringing the prospect of bridging the communication gap between our species perhaps just a little closer.

