
"Proactive listening" AI headphonesHu et al./EMNLP
People talk over music, clattering plates, and dozens of overlapping conversations in crowded rooms. For many, tuning into one voice takes mental effort.
For people with hearing challenges, that effort can be overwhelming.
A group of researchers at the University of Washington now says it has built a way to cut through that noise.
Their new AI-powered smart headphones can automatically separate a user’s conversation partners from the surrounding chaos.
Smarter hearing tech
Unlike existing speech-isolating devices, the prototype does not wait for manual input. The headphones detect who is part of the conversation and silence voices that don’t match the rhythm of turn-taking speech.
One AI model analyzes timing patterns, and another filters out unrelated sounds.
The system identifies conversation partners within two to four seconds.
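To make the turn-taking idea concrete, here is a minimal sketch (not the authors' actual model) of ranking candidate speakers by how well their speech alternates with the wearer's, assuming diarization-style (start, end) segments per speaker. Names such as `turn_taking_score` and `select_partners` are illustrative only.

```python
# Minimal sketch: rank candidate speakers by turn-taking behavior with the wearer.
# Assumes each speaker is represented by diarization-style (start, end) segments
# in seconds. All names here are illustrative, not from the published system.

def overlap_seconds(segs_a, segs_b):
    """Total time during which both speakers are talking at once."""
    total = 0.0
    for a_start, a_end in segs_a:
        for b_start, b_end in segs_b:
            total += max(0.0, min(a_end, b_end) - max(a_start, b_start))
    return total

def turn_taking_score(wearer_segs, candidate_segs):
    """Higher score = the candidate's speech alternates with the wearer's rather than overlapping."""
    cand_total = sum(end - start for start, end in candidate_segs)
    if cand_total == 0.0:
        return 0.0
    overlap = overlap_seconds(wearer_segs, candidate_segs)
    return 1.0 - overlap / cand_total  # 1.0 = never overlaps, 0.0 = always overlaps

def select_partners(wearer_segs, speakers, threshold=0.7):
    """Keep speakers whose speech mostly alternates with the wearer's."""
    return [
        spk for spk, segs in speakers.items()
        if turn_taking_score(wearer_segs, segs) >= threshold
    ]
```

In this toy version, a bystander whose speech constantly overlaps the wearer's scores low and is filtered out, while someone trading turns with the wearer scores high.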
The team shared the work on November 7 in Suzhou, China, during the Conference on Empirical Methods in Natural Language Processing (EMNLP). The underlying code is open source.
The researchers believe the technology could support future hearing aids, earbuds, and smart glasses.
Senior author Shyam Gollakota said earlier approaches to the problem have been far more invasive.
“Existing approaches to identifying who the wearer is listening to predominantly involve electrodes implanted in the brain to track attention,” he said.
He noted that natural patterns in dialogue offer a better path.
“Our insight is that when we’re conversing with a specific group of people, our speech naturally follows a turn-taking rhythm. And we can train AI to predict and track those rhythms using only audio, without the need for implanting electrodes.”
How it behaves in real use
The system activates when the wearer starts speaking. The first model runs a “who spoke when” check and looks for low overlap between speakers.
The second model cleans the signal and feeds real-time isolated audio back to the user.
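Taken together, the described behavior resembles a two-stage loop: wait for the wearer's voice, run a "who spoke when" model to pick conversation partners, then pass the mix through a separation model that keeps only those voices. The sketch below assumes hypothetical `diarize`, `separate`, and `wearer_is_speaking` components and reuses the `select_partners` heuristic from the earlier sketch; it is not the released code.

```python
# Illustrative two-stage loop, assuming hypothetical components:
#   wearer_is_speaking(frame) -> bool          (voice activity for the wearer)
#   diarize(buffer) -> dict[speaker, segments] ("who spoke when" model)
#   separate(frame, keep) -> audio             (filters out non-partner voices)
# select_partners() is the turn-taking heuristic sketched earlier.

def proactive_listening_loop(mic_frames, diarize, separate, wearer_is_speaking,
                             playback, window_seconds=4.0):
    buffer = []          # rolling audio history used for speaker analysis
    partners = None      # speakers identified as conversation partners

    for frame in mic_frames:                     # frames arrive in real time
        buffer.append(frame)
        # Keep roughly the last few seconds of audio (frame.duration is assumed).
        keep_frames = max(1, int(window_seconds / frame.duration))
        buffer = buffer[-keep_frames:]

        # Stage 1: once the wearer starts talking, identify partners from
        # turn-taking patterns in the recent audio window.
        if partners is None and wearer_is_speaking(frame):
            speakers = diarize(buffer)           # speaker -> (start, end) segments
            wearer_segs = speakers.pop("wearer", [])
            partners = select_partners(wearer_segs, speakers)

        # Stage 2: stream isolated audio back to the wearer.
        if partners:
            playback(separate(frame, keep=partners))
        else:
            playback(frame)   # pass audio through until partners are known
```

This mirrors the reported behavior, where partner identification settles within a few seconds of the wearer speaking and the filtered audio then streams back continuously.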
The prototype currently supports conversations involving the wearer and up to four other speakers without noticeable lag. The researchers tested the experience with 11 participants, who rated clarity, noise suppression, and comprehension with and without the filtering. The filtered version scored more than twice as high.
The project builds on earlier experiments from Gollakota’s team. Previous prototypes required looking at a person to isolate their voice or adjusting distance-based audio “bubbles.”
Lead author Guilin Hu said the new design removes those steps. “Everything we’ve done previously requires the user to manually select a specific speaker or a distance within which to listen, which is not great for user experience,” Hu said.
He added that the new system reacts automatically. “What we’ve demonstrated is a technology that’s proactive — something that infers human intent noninvasively and automatically.”
Chaotic speech still poses problems: people interrupting, talking over each other, or joining mid-conversation can confuse the tracking.
Still, the early results impressed the team. The models were trained on English, Mandarin, and Japanese. Other languages may require adjustments.
The current version uses commercial over-ear headphones and basic circuitry.
Gollakota expects the tech to shrink into earbuds or hearing aids.
In related work presented at MobiCom 2025, the same team showed that similar AI models can already run on hearing-aid-sized chips.
The study is published in the ACL Anthology as part of the EMNLP 2025 proceedings.