India’s technology sector was abuzz in February 2026 as the India AI Impact Summit 2026 got underway in New Delhi.


The reason? The unveiling of new indigenous and sovereign AI models from Sarvam, Gnani.ai, and BharatGen. Spanning real-world, voice, and language interfaces, these models are a huge step towards building homegrown alternatives to global AI systems dominated by Big Tech. Their launch signals a shift from being a user of global artificial intelligence tools to designing, building, and using domestic AI infrastructure aimed at education, healthcare, agriculture, and government services at population scale.

Part of the larger IndiaAI Mission, the objective behind investing in and building this indigenous LLM ecosystem is to address concerns related to bias and ensure that Indian AI systems are trained primarily on data related to Indians. That being said, how effective will these models be in tackling this bias and what are the concerns we need to address during their deployment?

All About India’s Indigenous AI Models

First, there are the two much-anticipated large language models (LLMs) from Sarvam AI: a 105-billion-parameter model and a 30-billion-parameter one, both trained from scratch in India. According to Sarvam, the larger model can outperform the likes of Google’s Gemini Flash and DeepSeek R1 on multiple benchmarks, while using a mixture-of-experts (MoE) architecture to reduce inference costs.

Moreover, Sarvam is positioning efficiency as central to scaling AI for population-level use, with the models aiming at agentic AI, programming, and complex reasoning workloads.
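To see why an MoE design can lower inference costs, here is a minimal sketch of top-k expert routing in Python. It is an illustration of the general technique only, not Sarvam’s implementation: the function names, the router scores, and the parameter counts are all hypothetical, since the company’s actual configuration is not described here.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs, with the weights
    renormalised so that they sum to 1 across the chosen experts.
    """
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def active_fraction(shared_params, params_per_expert, num_experts, k):
    """Share of all parameters that one token actually passes through."""
    total = shared_params + params_per_expert * num_experts
    active = shared_params + params_per_expert * k
    return active / total


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
choices = route_top_k([0.1, 2.0, -1.0, 1.5], k=2)
fraction = active_fraction(shared_params=10, params_per_expert=10,
                           num_experts=8, k=2)
```

In this toy setup, each token is sent to only two of eight experts, so roughly a third of the model’s parameters do work for any given token. That is the general reason MoE models can be large on paper yet comparatively cheap to run.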

Next, there’s Gnani.ai, which unveiled Vachana TTS, a text-to-speech model that can clone human voices across as many as 12 Indian languages using less than 10 seconds of reference audio. According to Gnani.ai, Vachana not only preserves voice characteristics such as speaking style, pitch, and tone, but also allows the same voice to operate across multiple languages.

Designed for high-volume usage and low-bandwidth conditions, this model is targeted at large-scale enterprise deployments, customer support systems, and government services, with all models and datasets hosted within India.

Finally, in the big leagues, there’s BharatGen’s 17-billion-parameter multilingual foundational model, BharatGen Param2 17B MoE, developed by an IIT Bombay-led consortium. Param2 17B is an MoE model optimised for Indic languages, with a focus on enabling AI adoption across enterprise, agriculture, healthcare, education, and governance use cases.

The consortium also plans to release the open-source model, along with its documentation and post-training workflows, via its Hugging Face repository, allowing enterprises, startups, and developers to design, fine-tune, and deploy India-centric AI applications.

With nearly Rs 900 crore in IndiaAI Mission funding directed towards it, BharatGen has certainly been the biggest beneficiary of India’s sovereign LLM initiative so far.

The Need For Indigenous Ecosystems

As GenAI (generative AI) technologies reshape everything from human interaction to industries globally, India has been gradually carving a place for itself in this evolving landscape. There’s been a surge in the popularity of Large Language Models (LLMs), with Gemini and ChatGPT gaining ground in India. As of the third week of February 2026, ChatGPT has 100 million weekly users in India, easily making us its largest user base.

In fact, India uses Google Gemini for learning and education more than any other country in the world.

However, despite these ostensibly high adoption rates, a Google-Kantar report from April 2025 showed that merely 31% of Indians have used GenAI platforms. The main issue with foreign LLMs is the inherent bias in their responses and their inability to cater effectively to India’s contextual realities, socio-cultural diversity, and multilingual population. Because their training datasets are not local, their outputs are often culturally inaccurate, which effectively prevents last-mile adoption.

For instance, when asked to generate the image of an Indian in May 2024, MetaAI displayed a peculiar predisposition to generating a ‘man with a turban’, almost four out of five times. This is inherent bias, and it ignores India’s cultural and demographic diversity. Furthermore, low formal and digital literacy creates additional barriers to adoption, and this cultural mismatch only deepens the trust deficit.

With India being the world’s second-largest generator of digital data, we have immense potential to provide high-quality datasets to train models and make AI tools more accessible to populations that are underserved and below the poverty line. One of the best starting points is the government’s high-quality data capture initiative and dataset platform, AIKosh, where institutions like IIT Bombay contribute more than 16 datasets.

Furthermore, it works in tandem with Bhashini, the government’s Indic translation initiative, whose objective is to overcome India’s literacy, digital, and linguistic barriers.

What Are The Challenges?

India’s wide-ranging regional, cultural, and linguistic diversity means that AI bias could manifest in ways that might not be obvious in Western contexts. For instance, AI systems trained on urban contexts or English datasets might perform poorly for those who speak regional dialects or live in rural areas.

Likewise, bias safeguards in Western AI models are designed largely around race, but in India caste is a reality, and ignoring it risks embedding invisible bias into sovereign models. Indigenous AI systems can also inadvertently perpetuate biases related to socio-economic background, language, and gender.

Furthermore, AI models can absorb systemic disparities and historical biases that are embedded in the data they are trained on. Hence, LLM developers in India need to build bias mitigation into these models and ensure that they don’t throw up insensitive results when confronted with challenging prompts.
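One simple way developers might check for this kind of disparity is to compare a model’s accuracy across user groups on the same evaluation set. The sketch below is a minimal illustration under assumed inputs: the group labels and sample results are hypothetical, not drawn from any real evaluation of the models discussed here.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy per group from (group, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        if is_correct:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


def largest_gap(accuracies):
    """Difference between the best-served and worst-served groups."""
    return max(accuracies.values()) - min(accuracies.values())


# Hypothetical evaluation results: 10 prompts per group.
sample = (
    [("english_urban", True)] * 9 + [("english_urban", False)] * 1
    + [("regional_dialect", True)] * 6 + [("regional_dialect", False)] * 4
)

accuracies = accuracy_by_group(sample)
gap = largest_gap(accuracies)
```

In this made-up sample the model answers 9 of 10 prompts correctly for one group but only 6 of 10 for the other, a gap of about 30 percentage points. A check like this does not fix bias on its own, but it makes the disparity visible enough to act on.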

What Lies Ahead

With the IndiaAI Mission and sustained government support, India’s GenAI journey has really taken off. Once they’re fully deployed, Indic LLMs could offer what foreign LLMs currently cannot: unbiased, accurate communication in Indian languages. However, successful execution is imperative, especially with the Indic GenAI sector navigating talent retention, ethical and data governance issues, and evolving computing supply chains.

The real challenge is whether these homegrown LLMs reach the underserved and those below the poverty line, and whether they can be used to transform the lives of citizens across the Global South.


Malavika Madgula is a writer and coffee lover from Mumbai, India, with a post-graduate degree in finance and an interest in the world. She can usually be found reading dystopian fiction cover to cover. Currently, she works as a travel content writer and hopes to write her own dystopian novel one day.

© Copyright Sify Technologies Ltd, 1998-2022. All rights reserved