Voice deepfakes: how they are generated, used, misused and differentiated

  • Recently, several users of a social media platform used a “speech synthesis” and “voice cloning” service provider to make voice deepfakes of celebrities.
  • The deepfake audio clips contained racist, abusive, and violent comments.
  • Making deepfake voices to impersonate others without their consent is a serious concern that could have devastating consequences.

What are voice deepfakes?

  • A voice deepfake is a synthetic voice that closely mimics a real person’s voice.
  • The voice can accurately replicate tonality, accents, cadence, and other unique characteristics of the target person.
  • People use AI and robust computing power to generate such voice clones or synthetic voices.
  • Sometimes it can take weeks to produce such voices, according to Speechify, a text-to-speech conversion app.

How are voice deepfakes created?

  • Creating deepfakes requires high-end computers with powerful graphics cards, often leveraging cloud computing power.
  • Powerful computing hardware speeds up rendering, which can take hours, days, or even weeks, depending on the process.
  • Besides specialised tools and software, generating deepfakes requires training data to be fed to AI models.
  • This data is often made up of original recordings of the target person’s voice. AI can use this data to render an authentic-sounding voice, which can then be used to say anything, as illustrated in the sketch after this list.
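For illustration only, the rough sketch below shows how such a cloning pipeline might be driven from code. It assumes the open-source Coqui TTS library and a hypothetical reference recording of the target speaker (reference.wav); the article itself does not describe any particular tool or API.

    # Minimal voice-cloning sketch (illustrative assumption: Coqui TTS, XTTS v2 model)
    from TTS.api import TTS

    # Load a multilingual model that supports zero-shot voice cloning.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

    # "reference.wav" is a hypothetical clip of the target speaker's voice;
    # the model conditions on it to mimic tone, accent, and cadence.
    tts.tts_to_file(
        text="This sentence will be spoken in the cloned voice.",
        speaker_wav="reference.wav",
        language="en",
        file_path="cloned_output.wav",
    )

In practice, commercial services wrap a workflow like this behind a web interface, which is why only a short clean recording of the target is needed to produce convincing output.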

What are the threats arising from the use of voice deepfakes?

  • Attackers are using such technology to defraud users, steal their identity, and to engage in various other illegal activities like phone scams and posting fake videos on social media platforms.
  • Voice deepfakes used in filmmaking have also raised ethical concerns about the use of the technology.
  • Gathering clear recordings of people’s voices is getting easier; they can be obtained through recorders, online interviews, and press conferences.
  • Voice capture technology is also improving, making the data fed to AI models more accurate and leading to more believable deepfake voices.
  • This could lead to scarier situations, Speechify highlighted in its blog.

What tools are used for voice cloning?

  • Microsoft’s VALL-E, My Own Voice, Resemble, Descript, ReSpeecher, and iSpeech are some of the tools that can be used for voice cloning.
  • ReSpeecher is the software used by Lucasfilm to recreate Luke Skywalker’s voice in The Mandalorian.

What are the ways to detect voice deepfakes?

  • Detecting voice deepfakes requires highly advanced technologies, software, and hardware to break down speech patterns, background noise, and other elements (a simplified illustration follows this list).
  • Cybersecurity tools have yet to create foolproof ways to detect audio deepfakes.
  • Researchers developed a technique to measure acoustic and fluid dynamic differences between original voice samples of humans and those generated synthetically by computers.
  • They estimated the arrangement of the human vocal tract during speech generation and showed that deepfakes often model impossible or highly unlikely anatomical arrangements.
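As a simplified illustration of how speech can be broken down into features and classified, and not the researchers’ acoustic and fluid-dynamics technique, the sketch below extracts standard speech features (MFCCs) from short clips and trains a basic classifier. The real/ and fake/ folders, the clip format, and the model choice are all assumptions made for the example.

    # Toy deepfake-audio classifier sketch (not a production detector).
    # Assumes two hypothetical folders of short WAV clips: real/ and fake/.
    import glob
    import numpy as np
    import librosa
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    def mfcc_features(path):
        # Summarise each clip as the mean of its MFCCs, a common speech feature.
        audio, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=20)
        return mfcc.mean(axis=1)

    X, y = [], []
    for label, folder in enumerate(["real", "fake"]):
        for path in glob.glob(f"{folder}/*.wav"):
            X.append(mfcc_features(path))
            y.append(label)

    X_train, X_test, y_train, y_test = train_test_split(
        np.array(X), np.array(y), test_size=0.3
    )
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("Held-out accuracy:", clf.score(X_test, y_test))

A toy classifier like this is easy to fool; the cited research goes further by checking whether the estimated vocal-tract shape behind a voice is anatomically plausible at all.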
