Voice deepfakes: how they are generated, used, misused and detected
- Recently, several users of a social media platform used a “speech synthesis” and “voice cloning” service to make voice deepfakes of celebrities.
- These deepfake audio clips contained racist, abusive, and violent remarks.
- Making deepfake voices to impersonate others without their consent is a serious concern that could have devastating consequences.
What are voice deepfakes?
- A voice deepfake is a synthetic voice that closely mimics a real person’s voice.
- It can accurately replicate the tonality, accent, cadence, and other unique characteristics of the target person.
- People use AI and robust computing power to generate such voice clones or synthetic voices.
- Sometimes it can take weeks to produce such voices, according to Speechify, a text-to-speech conversion app.
How are voice deepfakes created?
- To create deepfakes, one needs high-end computers with powerful graphics cards, or access to cloud computing power.
- Powerful hardware accelerates the rendering process, which can take hours, days, or even weeks, depending on its complexity.
- Besides specialised tools and software, generating deepfakes requires training data to be fed to AI models.
- This data usually consists of original recordings of the target person’s voice. The AI uses it to render an authentic-sounding clone, which can then be made to say anything.
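To make the training-data step above concrete, here is a minimal Python sketch of how raw voice recordings might be normalized and sliced into fixed-length frames before being fed to a model. The function name, frame length, and sample rate are illustrative assumptions, not part of any particular tool; real cloning pipelines also extract spectral features such as mel spectrograms.

```python
import numpy as np

def prepare_training_frames(waveform, sample_rate, frame_seconds=1.0):
    """Normalize a raw recording and slice it into fixed-length frames,
    the typical shape of training data fed to a voice model.
    (Illustrative sketch only, not a specific tool's pipeline.)"""
    # Peak-normalize so recordings made at different volumes are comparable.
    peak = np.max(np.abs(waveform))
    if peak > 0:
        waveform = waveform / peak
    # Slice into non-overlapping frames, dropping the trailing remainder.
    frame_len = int(sample_rate * frame_seconds)
    n_frames = len(waveform) // frame_len
    return waveform[: n_frames * frame_len].reshape(n_frames, frame_len)

# Example: 3.5 seconds of synthetic "recording" at 16 kHz yields 3 one-second frames.
recording = np.random.default_rng(0).normal(size=int(3.5 * 16000))
frames = prepare_training_frames(recording, 16000)
print(frames.shape)  # (3, 16000)
```

In practice, many short, clean frames like these are what allow a model to learn the target speaker's characteristics.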
What are the threats arising from the use of voice deepfakes?
- Attackers are using such technology to defraud users, steal identities, and engage in various other illegal activities such as phone scams and posting fake videos on social media platforms.
- Voice deepfakes used in filmmaking have also raised ethical concerns about the use of the technology.
- Gathering clear recordings of people’s voices is getting easier; they can be obtained through recorders, online interviews, and press conferences.
- Voice capture technology is also improving, making the data fed to AI models more accurate and leading to more believable deepfake voices.
- This could lead to scarier situations, Speechify highlighted in their blog.
What tools are used for voice cloning?
- Microsoft’s VALL-E, My Own Voice, Resemble, Descript, Respeecher, and iSpeech are some of the tools that can be used for voice cloning.
- Respeecher is the software Lucasfilm used to recreate Luke Skywalker’s voice in The Mandalorian.
What are the ways to detect voice deepfakes?
- Detecting voice deepfakes requires highly advanced technologies, software, and hardware to break down speech patterns, background noise, and other elements.
- Cybersecurity tools have yet to create foolproof ways to detect audio deepfakes.
- Researchers have developed a technique to measure acoustic and fluid-dynamic differences between voice samples recorded from real humans and those generated synthetically by computers.
- They estimated the arrangement of the human vocal tract during speech and showed that deepfakes often model anatomical arrangements that are impossible or highly unlikely in real speakers.
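To illustrate the general idea of breaking speech into acoustic measurements and flagging physically implausible values, here is a toy Python sketch that computes one simple feature, the spectral centroid, per frame and flags frames falling outside a band typical of human speech. This is an illustrative stand-in for the approach, not the researchers' vocal-tract technique; the function names and thresholds are hypothetical.

```python
import numpy as np

def spectral_centroid(frame, sample_rate):
    """Frequency 'center of mass' of one audio frame, a simple acoustic
    feature; real detectors combine many such measurements."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = spectrum.sum()
    return float((freqs * spectrum).sum() / total) if total > 0 else 0.0

def flag_implausible(frames, sample_rate, lo=100.0, hi=4000.0):
    """Toy plausibility check: flag frames whose centroid falls outside a
    band broadly typical of human speech. The thresholds are illustrative
    placeholders, not values from the cited research."""
    centroids = [spectral_centroid(f, sample_rate) for f in frames]
    return [not (lo <= c <= hi) for c in centroids]

# Example: a 300 Hz tone (speech-like) vs. a 6 kHz tone (implausibly bright).
sr = 16000
t = np.arange(sr) / sr
frames = [np.sin(2 * np.pi * 300 * t), np.sin(2 * np.pi * 6000 * t)]
print(flag_implausible(frames, sr))  # [False, True]
```

Real detection systems apply the same pattern at much finer granularity, modelling what a human vocal tract could physically have produced.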
