In the Massively Multilingual Speech (MMS) project, we overcome some of these challenges by combining wav2vec 2.0, our pioneering work in self-supervised learning, and a new dataset that provides labeled data for over 1,100 languages and unlabeled data for nearly 4,000 languages. Some of these, such as the Tatuyo language, have only a few hundred speakers, and for most of these languages, no prior speech technology exists. Our results show that the Massively Multilingual Speech models outperform existing models and cover 10 times as many languages. Meta is focused on multilinguality in general: For text, the NLLB project scaled multilingual translation to 200 languages, and the Massively Multilingual Speech project scales speech technology to many more languages.
Meta continues to impress.