Meta Introduces Open-Source Speech AI That Can ‘Speak’ Over 1,000 Languages
By Alexa Heah, 23 May 2023
It’s no secret Meta is fully on board the artificial intelligence hype train. In April, the company released an open-source project that helps artists turn doodles into animations, while earlier this month, it debuted the ‘ImageBind’ software that learns to mimic human perception.
Now, the technology giant is back with a new AI speech model. Dubbed the Massively Multilingual Speech (MMS) Project, the impressive system can identify over 4,000 spoken languages from around the globe, and “speaks,” or produces speech, in more than 1,100 of them.
As with its other AI systems, Meta will make MMS available as an open-source tool in hopes of preserving the diversity of languages in the world and allowing users to access information in their preferred language for a more inclusive experience with technology.
According to Meta, this speech technology can be used in anything from virtual and augmented reality programs to messaging services so that everyone can understand and be understood. The firm is welcoming researchers to build on its database.
Collecting audio data for thousands of languages was the first challenge of the project, considering that the largest existing speech datasets covered 100 languages at most.
To create a system that could identify 40 times more languages, the team decided to turn to religious texts, such as the Bible, that have been translated into a wide range of languages and are often used in language translation research.
By using publicly available recordings of these holy scriptures, Meta was able to create a dataset of readings of the New Testament in more than 1,100 languages, with an average of 32 hours of audio per language.
That’s not all. By adding other unlabeled recordings, the number of languages MMS can recognize ballooned to more than 4,000. And while most of the data used were read by male speakers, analysis shows the software performs equally well for both male and female voices.
Going forward, the team hopes to increase the system’s coverage further to support even more languages and, in particular, dialects from around the globe, which have long proved a challenge for existing speech technology.
[via Engadget and Technology Review, images via various sources]