Meta's Fundamental AI Research (FAIR) team has announced a new program aimed at enhancing and expanding machine translation and speech recognition, particularly for underserved languages. In collaboration with UNESCO, Meta is expanding its support for linguistic diversity through open-source AI models and research.
Meta's New Initiative for Linguistic Diversity
Meta announced on Friday the launch of the Language Technology Partner Programme, which seeks partners to collaborate on advancing and expanding its open-source language technologies, including AI translation technologies. The effort focuses particularly on underserved languages and supports UNESCO's work as part of the International Decade of Indigenous Languages.
Partner Contributions
Partners will contribute speech recordings, transcriptions, and translated text to help improve AI-driven speech recognition and machine translation models. The Government of Nunavut, Canada, has already joined the initiative, providing data for the Inuit languages Inuktitut and Inuinnaqtun. Participants will also gain access to technical workshops led by Meta's researchers.
"We are looking for partners who can contribute 10+ hours of speech recordings with transcriptions, large amounts of written text (200+ sentences) and sets of translated sentences in diverse languages," Meta said on February 7, 2025. The company added that partners will work with its teams to help integrate these languages into AI-driven speech recognition and machine translation models, which, when released, will be open source and freely available to the community.
Open Source Translation Benchmark
Additionally, Meta is launching an Open Source Translation Benchmark, a standardised test designed by linguistic experts that Meta says will help evaluate the performance of machine translation models. The benchmark is available in seven languages, and contributed translations will be made open source and available to others, Meta said.
Meta said the announcement is part of its long-term commitment to supporting underserved languages. In 2022, Meta released the No Language Left Behind (NLLB) project, an open-source machine translation engine that, according to the company, was the first neural machine translation model for many languages and laid the foundation for future research and development.
Meta Massively Multilingual Speech Project
More recently, Meta introduced the Meta Massively Multilingual Speech (MMS) project, which scaled speech recognition to over 1,100 languages. In 2024, the project added new capabilities, including zero-shot transcription, which allows the model to transcribe languages it was never trained on.
"Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background," Meta said in a blog post.