Nvidia Shows Off Next-Level Voice AI That Almost Sounds Too Human
By Mikelle Leow, 01 Sep 2021
Photo 111329508 © Boydz1980 | Dreamstime.com
Admit it: you have, on occasion, struck up a casual conversation with Siri or Alexa as you would with a friend. In time, your voice assistant could truly sound like a human confidant. Just don’t mention anything you wouldn’t want Big Brother to hear.
After deepfaking its CEO into a keynote without anyone noticing, Nvidia proves that the choppy robot voice could be a thing of the past. At speech processing conference Interspeech, the AI graphics processing giant detailed a new artificial voice system that can speak in a way that’s virtually indistinguishable from a human voice.
To create narrations that are as realistic as possible, Nvidia approaches speech in one of two ways, Digital Trends notes. In the first, it trains a text-to-speech algorithm on a recording of human speech, after which the algorithm can vocalize any text input. In the second, a human-recorded audio file is converted into an artificial version with similar speaking patterns and intonations.
When training AI to speak like a human, Nvidia treats speech as an extension of music, with its own pitches, rhythms, and timbres.
Putting its technology into practice, Nvidia now has its AI naturally narrate a series of I Am AI videos demonstrating where machine learning is headed. The technology has also been shared in the open-source NeMo toolkit for anyone running an Nvidia GPU to experiment with.
Aside from making virtual assistants sound more convincing, Nvidia stresses that the voice tools could assist people with vocal impairments or naturally translate speech into other languages using a person’s voice.
Disclaimer: an AI voiced this:
[via Digital Trends, cover photo 111329508 © Boydz1980 | Dreamstime.com]