If some among us have stopped picking up the phone because of scam calls, it’s understandable, as voice deepfakes make it increasingly hard to figure out if there’s a real person on the other end of the line.
Deepfake detection is booming, but so is Gen AI-assisted fraud, and the effort to find an airtight anti-deepfake solution is ongoing. Tools like defensive AI and liveness detection can help – though some think it’s time to make robots sound more like robots to begin with.
‘Managing phone calls has become increasingly challenging’: Hiya
Seattle-based voice security services provider Hiya has released Hiya AI Phone, an AI call assistant mobile app. A release says the app screens phone calls, protects users from scams, and takes notes during calls.
The firm says its data shows Americans spend an average of 40 minutes each month screening 13 unwanted spam calls. It also found that in 2024, one in three Americans received at least one deepfake scam call, with 34 percent of those targeted losing an average of $7,200.
Hiya AI Phone works like a digital secretary: it answers unknown calls, asks callers to state their name and purpose, and evaluates whether to connect them to you. Its intelligent call screening detects synthetic voices, continuously analyzes call audio in real time, and sends alerts if it identifies potential scams. It also transcribes calls and takes notes.
“Hiya AI Phone represents a groundbreaking evolution in how people interact with their phones,” says Hiya CEO and founder Alex Algard. “For years, phone apps have stagnated, offering little innovation since their introduction on smartphones. The launch of Hiya AI Phone changes that by introducing the first call assistant purpose-built to navigate today’s challenging call landscape – dominated by robocalls, spam, deepfake scams, and other interruptions.”
Last year, Hiya acquired Loccus.ai, a company specializing in deepfake voice detection systems. That acquisition enabled the rebrand to Hiya AI voice detection and the integration of the deepfake voice software into its AI app.
Hiya AI Phone is free to download for Android and iPhone users.
On-device deepfake detection from LG Uplus to be integrated into AI agent
LG Uplus and Pindrop are among firms mustering against the audio deepfake threat. Maeil Business Newspaper reports that LG Uplus recently announced the development of a “voice anti-spoofing (fake voice discrimination)” tool that generates unique voiceprints. It is to be integrated into an AI call agent called Exio within the first half of this year.
LG’s contribution to security is that the tool operates in an “on-device environment that does not separately transmit voice information to a server.” The announcement claims LG Uplus is “the first in the world to develop and commercialize voice anti-spoofing technology with on-device technology.”
The company says detection remains effective even when a speaker falls outside voiceprint recognition, or when the speed, accent, and tone of the speech differ.
“Our AI can detect fake voices even during real-time phone calls,” says Park Ji-woong, head of speech technology at LG Uplus, claiming an accuracy rate of over 95 percent.
Pindrop Pulse liveness detection addresses synthetic voice problem
Pindrop, meanwhile, says the key to flagging audio deepfakes is a layered approach that includes liveness detection.
A blog post from the company says primitive types of synthetic voice techniques like generative adversarial networks (GANs) and auto-encoders can be easily recognized as unnatural, but more advanced models based on neural networks present a bigger challenge.
WaveNet, a technique developed by Google DeepMind, “uses neural networks to produce high-quality speech by predicting waveforms.” Text-to-speech (TTS) synthesis “transforms written text into speech while adjusting elements like speed, pitch, and tone to make the voice sound natural.”
Future-proofing against deepfake impersonation means “adopting advanced detection technologies and fostering an adaptive and layered security approach that grows alongside the threat landscape.”
Audio deepfake detection tools, multi-factor authentication (MFA) for voice-based systems using factors such as behavioral analysis or device-based authentication, and cloud-based AI systems that enable near real-time data analysis at scale can all help shore up defenses.
Want to solve deepfakes? Turn AI into a paranoid android
Deepfake detection, say some, is all well and good – but what if we made it much easier, by requiring AI to sound like a droid from Forbidden Planet?
IEEE Spectrum has an article on “a simple way to identify who, or what, is talking to us.” The piece argues that “AIs and robots should sound robotic.”
“You can’t just label AI-generated speech,” say authors Barath Raghavan and Bruce Schneier. “It will come in many different forms. So we need a way to recognize AI that works no matter the modality. It needs to work for long or short snippets of audio, even just a second long. It needs to work for any language, and in any cultural context. At the same time, we shouldn’t constrain the underlying system’s sophistication or language complexity.”
Their answer? A ring modulator – a device that multiplies two audio signals, typically a voice and a steady carrier tone, to produce a single output with an oscillating sound. Before digital audio workstations, it was how sound designers made voices sound robotic. Think the Daleks from classic Doctor Who (which used 30 Hz ring modulation).
Making it mandatory to apply a ring modulator to synthetic voices, they say, “is computationally simple, can be applied in real-time, does not affect the intelligibility of the voice, and – most importantly – is universally ‘robotic sounding’ because of its historical usage for depicting robots.”
“Responsible AI companies that provide voice synthesis or AI voice assistants in any form should add a ring modulator of some standard frequency (say, between 30-80 Hz) and of a minimum amplitude (say, 20 percent). That’s it. People will catch on quickly.”
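For readers curious what that would look like in practice, here is a minimal Python sketch of ring modulation applied to a block of audio samples. It is an illustration, not the authors' implementation: the function name `ring_modulate`, the parameter names, and the reading of "amplitude" as a wet/dry blend (20 percent modulated signal, 80 percent original) are all assumptions made for this example.

```python
import math


def ring_modulate(samples, sample_rate, carrier_hz=30.0, depth=0.2):
    """Blend a signal with a ring-modulated copy of itself.

    samples     -- sequence of float audio samples, e.g. in [-1.0, 1.0]
    sample_rate -- samples per second
    carrier_hz  -- carrier frequency (the article suggests 30-80 Hz)
    depth       -- share of the modulated signal in the output; treating
                   the quoted "amplitude" as this blend is an assumption
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")

    out = []
    for n, x in enumerate(samples):
        # Sine carrier evaluated at the time of sample n.
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        # Ring modulation itself: multiply the input by the carrier.
        modulated = x * carrier
        # Mix the untouched signal with the modulated one.
        out.append((1.0 - depth) * x + depth * modulated)
    return out


# Usage: one second of a 220 Hz test tone standing in for synthetic speech.
sample_rate = 8000
voice = [math.sin(2.0 * math.pi * 220.0 * n / sample_rate)
         for n in range(sample_rate)]
marked = ring_modulate(voice, sample_rate, carrier_hz=50.0, depth=0.2)
```

The whole effect is one multiplication and one blend per sample, which is consistent with the authors' point that it is computationally simple and can run in real time.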
Article: No longer taking phone calls? An AI assistant or liveness detection could help