Meta AI Introduces Omnilingual ASR, Advancing Automatic Speech Recognition Across More Than 1,600 Languages

11-Nov-2025 mpost.io

Meta AI, the research division of Meta specializing in artificial intelligence and augmented reality, announced the release of the Meta Omnilingual Automatic Speech Recognition (ASR) system.

This suite of models delivers automatic speech recognition for over 1,600 languages, achieving high-quality performance at an unprecedented scale. In addition, Meta AI is open-sourcing Omnilingual wav2vec 2.0, a self-supervised, massively multilingual speech representation model with 7 billion parameters, designed to support a variety of downstream speech tasks.

Alongside these tools, the organization is also releasing the Omnilingual ASR Corpus, a curated collection of transcribed speech from 350 underserved languages, developed in partnership with global collaborators.

Automatic speech recognition has advanced in recent years, achieving near-perfect accuracy for many widely spoken languages. Expanding coverage to less-resourced languages, however, has remained challenging due to the high data and computational demands of existing AI architectures. The Omnilingual ASR system addresses this limitation by scaling the wav2vec 2.0 speech encoder to 7 billion parameters, creating rich multilingual representations from raw, untranscribed speech. Two decoder variants map these representations into character tokens: one using connectionist temporal classification (CTC) and another using a transformer-based approach similar to those in large language models.

This LLM-inspired ASR approach achieves state-of-the-art performance across more than 1,600 languages, with character error rates under 10% for 78% of them, and introduces a more flexible method for adding new languages.
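Character error rate (CER), the metric cited above, is the character-level edit distance between the reference transcript and the system's hypothesis, divided by the reference length. A minimal implementation for context (not Meta's evaluation code):

```python
def cer(ref: str, hyp: str) -> float:
    """Character error rate: Levenshtein edit distance / reference length, in %."""
    m, n = len(ref), len(hyp)
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution (or match)
        prev = cur
    return 100.0 * prev[n] / m

# One deleted character out of an 11-character reference ≈ 9.1% CER.
print(cer("omnilingual", "omnilngual"))
```

A CER under 10% thus means fewer than one character in ten is inserted, deleted, or substituted relative to the reference transcript.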

Unlike traditional systems that require expert fine-tuning, Omnilingual ASR can incorporate a previously unsupported language using only a few paired audio-text examples, enabling transcription without extensive data, specialized expertise, or high-end compute. While zero-shot results do not yet match fully trained systems, this method provides a scalable way to bring underserved languages into the digital ecosystem.

Meta AI To Advance Speech Recognition With Omnilingual ASR Suite And Corpus 

The research division has released a comprehensive suite of models and a dataset designed to advance speech technology for any language. Building on FAIR’s prior research, Omnilingual ASR includes two decoder variants, ranging from lightweight 300M models for low-power devices to 7B models offering high accuracy across diverse applications. The general-purpose wav2vec 2.0 speech foundation model is also available in multiple sizes, enabling a wide range of speech-related tasks beyond ASR. All models are provided under an Apache 2.0 license, and the dataset is available under CC-BY, allowing researchers, developers, and language advocates to adapt and expand speech solutions using FAIR’s open-source fairseq2 framework in the PyTorch ecosystem.

Omnilingual ASR is trained on one of the largest and most linguistically diverse ASR corpora ever assembled, combining publicly available datasets with community-sourced recordings. To support languages with limited digital presence, Meta AI partnered with local organizations to recruit and compensate native speakers in remote or under-documented regions, creating the Omnilingual ASR Corpus, the largest ultra-low-resource spontaneous ASR dataset to date. Additional collaborations through the Language Technology Partner Program brought together linguists, researchers, and language communities worldwide, including partnerships with Mozilla Foundation’s Common Voice and Lanfrica/NaijaVoices. These efforts provided deep linguistic insight and cultural context, ensuring the technology meets local needs while empowering diverse language communities globally.

