Sarvam AI Launches Sarvam-1, New Language Model Optimised for Indian Languages

The model was trained on Yotta’s Shakti cluster, utilising 1,024 GPUs over a five-day period, with Nvidia's NeMo framework facilitating the training process.

Highlights

  • Supports ten Indian languages, including Bengali, Gujarati, and Hindi, alongside English.
  • Built on a 2-trillion-token dataset spread across ten languages, with Hindi accounting for about 20 percent.
  • Trained on Yotta's Shakti cluster using 1,024 GPUs over five days.

Bengaluru-based Sarvam AI has launched a new large language model (LLM), Sarvam-1. This 2-billion-parameter model is optimised to support ten major Indian languages alongside English, namely Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu, the official release said. The model addresses the technological gap faced by billions of speakers of Indic languages, which have largely been underserved by existing LLMs.

Key Features and Performance Enhancements

Sarvam-1 was built from the ground up to improve two critical areas: token efficiency and data quality. According to the company, traditional multilingual models exhibit high token fertility (the number of tokens needed per word) for Indic scripts, often requiring 4-8 tokens per word compared to 1.4 for English. In contrast, Sarvam-1's tokeniser achieves improved efficiency, with token fertility rates of just 1.4-2.1 across all supported languages.
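Token fertility is straightforward to measure: divide the token count of a text by its word count. A minimal sketch, using a toy character-chunk tokenizer (an assumption standing in for a real subword tokenizer) to show how fragmenting words into many pieces drives fertility up:

```python
def token_fertility(tokenize, text):
    """Average number of tokens produced per whitespace-delimited word."""
    words = text.split()
    tokens = tokenize(text)
    return len(tokens) / len(words)

def chunk_tokenizer(text, chunk=2):
    """Toy tokenizer: splits each word into fixed-size character chunks,
    mimicking how a tokenizer fragments scripts it was not trained on."""
    return [w[i:i + chunk] for w in text.split() for i in range(0, len(w), chunk)]

# 3 words, 7 chunks -> fertility of about 2.33
print(token_fertility(chunk_tokenizer, "bharat ki bhasha"))
```

A tokenizer whose vocabulary covers a script well emits close to one token per word (fertility near 1), which is the efficiency gap Sarvam-1's tokeniser targets for Indic scripts.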

Sarvam-2T Corpus

A significant challenge in developing effective language models for Indian languages has been the lack of high-quality training data. "While web-crawled Indic language data exists, it often lacks depth and quality," Sarvam AI noted.

To address this, the team created Sarvam-2T, a training corpus of approximately 2 trillion tokens distributed roughly evenly across the ten languages, with Hindi taking a larger share at about 20 percent of the data. Using advanced synthetic-data-generation techniques, the company has developed a high-quality corpus specifically for these Indic languages.

"The Sarvam 1 model is the first example of an LLM trained from scratch with data, research, and compute being fully in India", said Pratyush Kumar, Co-Founder, Sarvam. He added; "We expect it to power a range of use cases including voice and messaging agents. This is the beginning of our mission to build full stack sovereign AI. We are deeply excited to be working together with Nvidia towards this mission."

"Enterprises are seeking to leverage generative AI to accelerate innovation and tackle complex challenges at scale," said Kari Briski, vice president of AI software, models and services at Nvidia. "Sarvam AI's multilingual model, developed using Nvidia's full-stack AI platform including NeMo and Hopper GPUs, showcases how tailored AI solutions can address linguistic diversity and drive inclusive technological growth in regions like India."

Edge Device Deployment

According to the company, Sarvam-1 has demonstrated exceptional performance on standard benchmarks, outperforming comparable models like Gemma 2 2B and Llama 3.2 3B, while achieving similar results to Llama 3.1 8B. Its compact size allows for 4-6x faster inference, making it particularly suitable for practical applications, including edge device deployment.
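The claimed 4-6x speedup is consistent with a back-of-envelope estimate: decoder inference cost scales roughly with parameter count, so a 2B model should decode around 4x faster than an 8B baseline. This is a simplifying assumption (real speedups also depend on architecture, memory bandwidth, and quantization):

```python
def rough_speedup(baseline_params_b, model_params_b):
    """Rough inference speedup assuming cost scales linearly with parameters."""
    return baseline_params_b / model_params_b

# Sarvam-1 (2B) vs Llama 3.1 8B -> roughly 4x
print(rough_speedup(8, 2))
```

The smaller parameter count also shrinks the memory footprint, which is what makes edge deployment plausible in the first place.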


Key Improvements

Key improvements in Sarvam-2T include twice the average document length compared to existing datasets, a threefold increase in high-quality samples, and a balanced representation of scientific and technical content.

Sarvam claims Sarvam-1 is the first Indian language LLM.
