A team of Stanford scientists claims to have tested a new brain-computer interface (BCI) that can decode speech at up to 62 words per minute, 3.4 times faster than the previous record.
That'd be a massive step towards real-time speech conversion at the pace of natural human conversation.
Max Hodak, who founded BCI company Neuralink alongside Elon Musk but wasn't involved in the study, called the research "a meaningful step change in the utility of implanted BCIs" in an email to Futurism.
As detailed in a yet-to-be-peer-reviewed paper, the team of Stanford scientists found that they only needed to analyze brain activity in a relatively small region of the cortex to convert it into coherent speech using a machine learning algorithm.
The goal is to give a voice back to those who can no longer speak due to ALS or stroke. While keyboard-based solutions have allowed people with paralysis to communicate again to a certain degree, a brain-based speech interface could speed up communication significantly.
"Here, we demonstrated a speech BCI that can decode unconstrained sentences from a large vocabulary at a speed of 62 words per minute, the first time that a BCI has far exceeded the communication rates that alternative technologies can provide for people with paralysis, e.g. eye tracking," the researchers write.
In an experiment, the team recorded neural activity from two small areas in the brain of an ALS patient who can still move their mouth but has difficulty forming words.
Using a recurrent neural network (RNN) decoder that can predict text, the researchers then turned these signals into words, and at a surprisingly fast pace.
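For illustration, here is a minimal sketch of the kind of RNN decoder the article describes, written in PyTorch. The GRU architecture, the layer sizes, the number of input channels, and the phoneme-level CTC output are assumptions made for the sake of the example, not details confirmed by the study.

```python
# Illustrative sketch only: the architecture, sizes, and phoneme-level CTC
# output are assumptions, not the study's actual model.
import torch
import torch.nn as nn

class SpeechDecoder(nn.Module):
    def __init__(self, n_channels=256, hidden_size=512, n_phonemes=40):
        super().__init__()
        # A GRU reads binned neural features (e.g., spike counts per electrode).
        self.rnn = nn.GRU(n_channels, hidden_size, num_layers=2, batch_first=True)
        # A linear readout maps hidden states to per-timestep phoneme logits,
        # with one extra class for the CTC "blank" symbol.
        self.readout = nn.Linear(hidden_size, n_phonemes + 1)

    def forward(self, neural_features):
        # neural_features: (batch, time, channels)
        hidden, _ = self.rnn(neural_features)
        return self.readout(hidden)  # (batch, time, n_phonemes + 1)

# CTC loss lets the network align variable-length neural recordings to
# phoneme sequences without frame-by-frame labels; the decoded phonemes
# would then be assembled into words, e.g. by a language model.
model = SpeechDecoder()
loss_fn = nn.CTCLoss(blank=40)
```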
They found that the neural activity associated with these attempted orofacial movements was "likely strong enough to support a speech BCI, despite paralysis and narrow coverage of the cortical surface," according to the paper.
But the system wasn't perfect. The error rate of the researchers' RNN decoder was still about 20 percent, meaning roughly one in five words was decoded incorrectly.
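That figure reflects the word error rate commonly used to score speech decoders: the edit distance between the decoded and reference word sequences, divided by the number of reference words. A minimal sketch of that standard metric (not code from the study):

```python
# Word error rate: edit distance between decoded and reference words,
# divided by the reference length.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance (substitutions, insertions, deletions).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(ref)][len(hyp)] / len(ref)

# One wrong word out of four gives a 25 percent word error rate.
print(word_error_rate("the quick brown fox", "the quick brown box"))  # 0.25
```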
"Our demonstration is a proof of concept that decoding attempted speaking movements from intracortical recordings is a promising approach, but it is not yet a complete, clinically viable system," the researchers admitted in their paper.
To bring that error rate down, the scientists propose recording from more areas of the brain while also optimizing the algorithm.