Meta's AI translates an oral language for the first time
Meta used Mandarin as a bridge to translate Hokkien (Photo credit: Meta)

Oct 21, 2022
12:05 am

What's the story

Languages without a written script are the final frontier for machine learning-based translation systems. Meta appears to have found a way to cross it. The company has developed an AI-based speech-to-speech translator that works with languages that don't have a script. The work is part of Meta's Universal Speech Translator (UST) program, and Meta has open-sourced it.

Context

Why does this story matter?

Nearly half of the world's languages don't have a script. If you're a fan of Star Trek and have dreamt of owning a Personal Universal Translator (PUT), primarily oral languages are a deal-breaker, since text-based translation cannot handle them. Meta's speech-to-speech translator could be the solution. Although the company has a patchy record when it comes to AI-based language systems, this might be a game changer.

Issue

Speech-to-text translations require languages with script

AI-based speech translation systems rely on the availability of extensive transcriptions. This is where primarily oral languages become a problem: without a script, there is little transcribed speech to train on, and producing text as the translated output doesn't work. This is why Meta opted for speech-to-speech translation. It picked Hokkien, a primarily oral language spoken by the Chinese diaspora, to develop the system.

Working

How does Meta's speech-to-speech system work?

To make speech-to-speech translation of Hokkien work, Meta first translated the input speech into a sequence of acoustic units using speech-to-unit (S2U) translation. It then generated waveforms from those units. To train the AI, Meta used Mandarin as an intermediate language between English and Hokkien: Hokkien or English speech was first translated to Mandarin text, which was then translated into English or Hokkien.
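For readers who think in code, here is a toy Python sketch of the two ideas in this section: a two-stage speech-to-unit-to-waveform pipeline, and pivoting through Mandarin. It is not Meta's implementation. Every name in it (`TOY_S2U`, `speech_to_units`, `units_to_waveform`, `pivot_translate`) and every value is a hypothetical stand-in, with lookup tables in place of the trained models a real system would use.

```python
import math
from typing import Dict, List

# Hypothetical stand-in for a trained S2U model: utterance ID -> acoustic units.
# A real system would run a neural network on audio, not a lookup table.
TOY_S2U: Dict[str, List[int]] = {"english_utterance_1": [3, 3, 7, 1]}


def speech_to_units(utterance_id: str) -> List[int]:
    """Stage 1 (toy): map source speech to a sequence of acoustic units."""
    return TOY_S2U[utterance_id]


def units_to_waveform(units: List[int], samples_per_unit: int = 4) -> List[float]:
    """Stage 2 (toy vocoder): turn each unit into a short sine burst."""
    wave: List[float] = []
    for unit in units:
        for n in range(samples_per_unit):
            # The unit value sets the frequency of the burst.
            wave.append(math.sin(2 * math.pi * (unit + 1) * n / samples_per_unit))
    return wave


def pivot_translate(source: str, to_pivot: Dict[str, str],
                    from_pivot: Dict[str, str]) -> str:
    """Pivoting (toy): source -> Mandarin text -> target, by composing two steps."""
    return from_pivot[to_pivot[source]]


# Placeholder tables; the strings are labels, not real translations.
ENGLISH_TO_MANDARIN = {"<english: thank you>": "<mandarin text: thank you>"}
MANDARIN_TO_HOKKIEN = {"<mandarin text: thank you>": "<hokkien speech: thank you>"}

units = speech_to_units("english_utterance_1")
waveform = units_to_waveform(units)
pivoted = pivot_translate("<english: thank you>", ENGLISH_TO_MANDARIN, MANDARIN_TO_HOKKIEN)
```

The point of the sketch is the shape of the pipeline: no text in the target language is ever needed, because the first stage outputs units and the second stage turns units into audio. The pivot step only helps build training pairs when direct English-Hokkien data is scarce.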

Information

It can be used to translate one sentence at a time

Meta's new speech-to-speech translation system is far from perfect. For now, it can translate only one sentence at a time. Mark Zuckerberg, the company's CEO, believes that the system can be used to translate more languages in real time.

Availability

Meta is open-sourcing the translating system

Meta is open-sourcing the Hokkien translation models, evaluation datasets, and research so that other researchers can build on the work. It is also introducing a speech-to-speech translation benchmark based on a Hokkien speech corpus called 'Taiwanese Across Taiwan.' In addition, the company is releasing SpeechMatrix, a collection of speech-to-speech translations mined with LASER, Meta's language processing toolkit.