This AI text-to-speech model supports 11 Indian languages

By Dwaipayan Roy

May 07, 2025

08:11 pm

What's the story

Bengaluru-based artificial intelligence start-up Sarvam AI has launched Bulbul-V2, its latest text-to-speech (TTS) model. According to the company, the new model supports 11 Indian languages and promises authentic accents that reflect India's diversity. Per Sarvam's LinkedIn post, the voice it generates sounds real rather than robotic or rehearsed.

Model features

It sets new benchmarks for speech AI in India

Bulbul-V2 is Sarvam's flagship TTS model, designed specifically for Indian languages and accents. The company says the model delivers natural-sounding speech with human-like prosody and multiple voice personalities. It supports multilingual and code-mixed text as well as real-time synthesis, and offers fine-grained control over pitch, pace, and loudness.

Customization options

Bulbul-V2 offers customizable voice characteristics

The Bulbul-V2 model offers multiple output sample rates, ranging from 8kHz to 24kHz. It is designed for low latency and is priced cost-effectively, making it an efficient alternative to many global counterparts. Sarvam AI claims the model can help create the exact voice style needed for various applications.

Achievements

Sarvam AI: A pioneer in India's AI landscape

Sarvam AI has come a long way in the world of artificial intelligence. It was the first start-up selected by the Indian government to develop India's sovereign large language model (LLM) under the broader IndiaAI Mission. The company hopes to democratize AI access across the country with lower-latency models and India-first pricing for API access.