
Google launches DolphinGemma, a new LLM to help understand what dolphins are saying

DolphinGemma: How Google AI is helping decode dolphin communication

Days after launching Deep Research powered by Gemini 2.5 Pro Experimental, Google is back at it again with a new model, DolphinGemma. This large language model is meant to help scientists "study how dolphins communicate" and "hopefully find out what they're saying, too."

The company is working with researchers at Georgia Tech and the Wild Dolphin Project (WDP), led by its founder, Dr. Denise Herzing. The WDP's primary mission, as you can probably guess, is to observe, document, and report on the natural behaviors, social structures, communication patterns, and habitats of wild dolphins, specifically the Atlantic spotted dolphin (Stenella frontalis), through "non-invasive, long-term field research."

Over the years, the WDP has collected data that allows it to correlate certain dolphin sounds with behaviors. For example:

  • Signature whistles (unique names) that can be used by mothers and calves to reunite
  • Burst-pulse "squawks" often seen during fights
  • Click "buzzes" often used during courtship or chasing sharks

According to Google, "analyzing dolphins' natural, complex communication is a monumental task, and WDP's vast, labeled dataset provides a unique opportunity for cutting-edge AI."

That's where DolphinGemma comes in. Simply put, it's an AI model developed by Google and trained on the WDP's dataset. It uses Google's own SoundStream tokenizer to break dolphin vocalizations down into more manageable audio units.

These are then run through a specialized model architecture designed to make sense of complex sequences. The whole setup sits at around 400 million parameters, making it lightweight enough to run natively on Pixel phones, which the WDP researchers carry with them out in the field.

Whistles (left) and burst pulses (right) generated during early testing of DolphinGemma.

Now, unlike traditional machine learning models, DolphinGemma doesn’t deal in words or images; it’s strictly audio-in, audio-out. It takes in sequences of natural dolphin vocalizations, processes them using an approach inspired by how large language models understand human speech, and predicts the most likely next sound in a sequence.

Dr. Denise Herzing compares it to autocomplete but for dolphin whistles, burst pulses, and click trains. It’s trained to identify patterns, structure, and progression in these sounds, just like how a text-based model predicts the next word in a sentence based on context.
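To make that "autocomplete" analogy concrete, here is a minimal, purely illustrative sketch in Python. It is not DolphinGemma or SoundStream code: it assumes the audio has already been converted into discrete integer token IDs, and a simple bigram counter stands in for the real model just to show the next-sound-prediction idea.

```python
# Illustrative sketch only -- not DolphinGemma or SoundStream code.
# Assumes dolphin audio has already been tokenized into integer IDs
# (the role SoundStream plays in the real pipeline). A toy bigram
# counter stands in for the actual model to show the
# "autocomplete for whistles" idea.
from collections import Counter, defaultdict

def train_bigram_model(token_sequences):
    """Count how often each audio token follows another."""
    counts = defaultdict(Counter)
    for seq in token_sequences:
        for current_tok, next_tok in zip(seq, seq[1:]):
            counts[current_tok][next_tok] += 1
    return counts

def predict_next(counts, current_tok):
    """Return the most likely next audio token, like autocomplete."""
    if current_tok not in counts:
        return None
    return counts[current_tok].most_common(1)[0][0]

# Hypothetical token IDs standing in for tokenized whistles and clicks.
recordings = [
    [3, 7, 7, 2, 9],
    [3, 7, 2, 9, 9],
    [1, 3, 7, 7, 2],
]

model = train_bigram_model(recordings)
print(predict_next(model, 7))  # most common token after 7 in this toy data
```

The real model replaces the bigram counts with a roughly 400-million-parameter network that looks at much longer context, but the input and output are the same kind of thing: sequences of audio tokens in, a prediction of the most likely next token out.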

Before Google came along with DolphinGemma, the team of researchers at the WDP had been using CHAT (Cetacean Hearing Augmentation Telemetry) to explore the possibility of two-way communication with dolphins. The goal with CHAT wasn’t to crack the full complexity of dolphin language but to build a simpler, shared vocabulary for interaction.

The system works by associating new, synthetic whistles, created by CHAT, with specific objects the dolphins seem to enjoy. Think things like sargassum, seagrass, or even scarves the researchers use.

The hope was that by repeatedly associating these synthetic whistles with objects, the dolphins would start mimicking the sounds to "ask" for those items.
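As a rough illustration of that idea, and definitely not the actual CHAT software, the sketch below maps a few made-up synthetic whistles to objects and checks whether a heard sound matches one of them closely enough to count as a "request." The whistle strings, object names, and matching threshold are all hypothetical.

```python
# Illustrative sketch only -- not the actual CHAT software.
# Maps hypothetical synthetic whistles (written here as token strings)
# to objects, then checks whether an incoming sound matches one of
# them closely enough to count as a request for that object.
from difflib import SequenceMatcher

whistle_to_object = {
    "hi-lo-hi-hi": "sargassum",
    "lo-lo-hi":    "seagrass",
    "hi-hi-lo-lo": "scarf",
}

def match_request(heard_whistle, threshold=0.8):
    """Return the object whose synthetic whistle best matches what was heard."""
    best_obj, best_score = None, 0.0
    for whistle, obj in whistle_to_object.items():
        score = SequenceMatcher(None, heard_whistle, whistle).ratio()
        if score > best_score:
            best_obj, best_score = obj, score
    return best_obj if best_score >= threshold else None

print(match_request("hi-lo-hi-hi"))  # -> "sargassum"
print(match_request("lo-hi-lo"))     # -> None (no close enough match)
```

In the field, the matching runs on real audio features in real time rather than strings, but the basic associate-and-match loop is what the sketch is meant to convey.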

CHAT ran off a Google Pixel 6, which handled high-quality audio analysis in real time. Using off-the-shelf phones meant the team didn’t need custom gear. It made things smaller, cheaper, more efficient, and easier to maintain in the open ocean.

For the upcoming season, they’re upgrading to the Pixel 9, which adds better speaker and microphone capabilities and has enough power to run deep learning models and pattern matching at the same time.

A Google Pixel 9 inside the latest CHAT system hardware.

Like other Gemma models, DolphinGemma will be released as an open model this summer, which Google hopes will "give researchers worldwide the tools to mine their own acoustic datasets, accelerate the search for patterns and collectively deepen our understanding of these intelligent marine mammals."

Gemma is a family of lightweight large language models developed by Google. The latest addition to the family is Gemma 3, available in four sizes: 1 billion, 4 billion, 12 billion, and 27 billion parameters.
