AI models designed to closely simulate a person’s voice are making it easier for bad actors to mimic loved ones and scam vulnerable people out of thousands of dollars, The Washington Post reported.
The software is evolving quickly in sophistication: some AI voice-generating tools can convincingly reproduce the sound and emotional tone of a speaker's voice from just a few sentences of audio, while others need as little as three seconds. For those targeted, often the elderly, the Post reported, it can be increasingly difficult to detect when a voice is inauthentic, even when the emergency circumstances described by scammers seem implausible.
Tech advancements seemingly make it easier to prey on people’s worst fears and spook victims who told the Post they felt “visceral horror” hearing what sounded like direct pleas from friends or family members in dire need of help. One couple sent $15,000 through a bitcoin terminal to a scammer after believing they had spoken to their son. The AI-generated voice told them that he needed legal fees after being involved in a car accident that killed a US diplomat.
According to the Federal Trade Commission, so-called impostor scams are extremely common in the United States; they were the most frequently reported type of fraud in 2022 and generated the second-highest losses for victims. Out of 36,000 reports, more than 5,000 victims were scammed out of $11 million over the phone.
Because these impostor scams can be run from anywhere in the world, it’s extremely challenging for authorities to crack down on them and reverse the worrying trend, the Post reported. Not only is it hard to trace calls, identify scammers, and retrieve funds, but it’s also sometimes challenging to decide which agencies have jurisdiction to investigate individual cases when scammers are operating out of different countries. Even when it’s obvious which agency should investigate, some agencies are currently ill-equipped to handle the rising number of impersonations.
Ars could not immediately reach the FTC for comment. Will Maxson, an assistant director at the FTC's division of marketing practices, told the Post that raising awareness of scams relying on AI voice simulators is currently likely consumers' best defense. Treat any request for cash with skepticism, and before sending funds, try to contact the person who seems to be asking for help through a method other than a voice call.
Safeguards against AI voice impersonation
AI voice-modeling tools have been used to improve text-to-speech generation, create new possibilities for speech editing, and expand movie magic by cloning famous voices like Darth Vader’s. But the power of easily producing convincing voice simulations has already caused scandals, and no one knows who’s to blame when the tech is misused.
Earlier this year, there was backlash when some 4chan members made deepfake voices of celebrities making racist, offensive, or violent statements. At that point, it became clear that companies needed to consider adding more safeguards to prevent misuse of the technology, Vice reported—or potentially risk being held liable for causing substantial damage, like ruining the reputations of famous people.
The courts have not yet decided when, or whether, companies will be held liable for harms caused by deepfake voice technology, or by any of the other increasingly popular AI technologies, like ChatGPT, where defamation and misinformation risks seem to be rising.
There may be increasing pressure on courts and regulators to get AI in check, though, as many companies seem to be releasing AI products without fully knowing the risks involved.
Right now, some companies seem unwilling to slow down releases of popular AI features, including controversial ones that allow users to emulate celebrity voices. Most recently, Microsoft rolled out a new feature during its Bing AI preview that can be used to emulate celebrities, Gizmodo reported. With this feature, Microsoft seems to be attempting to dodge any scandals by limiting what impostor celebrity voices can be prompted to say.
Microsoft did not respond to Ars’ request for comment on how well safeguards currently work to prevent the celebrity voice emulator from generating offensive speech. Gizmodo pointed out that, like many companies eager to benefit from the widespread fascination with AI tools, Microsoft relies on its millions of users to beta test its “still-dysfunctional AI,” which can seemingly still be used to generate controversial speech by presenting it as parody. Time will tell how effective any early solutions are in mitigating risks.
In 2021, the FTC released AI guidance telling companies that products should “do more good than harm” and that companies should be prepared to hold themselves accountable for the risks of their products. More recently, last month, the FTC told companies, “You need to know about the reasonably foreseeable risks and impact of your AI product before putting it on the market.”