Sarvam AI is a Bengaluru-based AI company founded in 2023 by Vivek Raghavan and Pratyush Kumar. It builds foundation models and APIs optimized for Indian languages — LLMs (Sarvam-30B, Sarvam-105B), speech (Saaras v3, Bulbul v3), translation (Sarvam-Translate, Mayura), and document intelligence (Sarvam Vision). Models are trained on IndiaAI Mission compute and available via API at docs.sarvam.ai.

What is the difference between Sarvam-30B and Sarvam-105B?

Sarvam-30B is a 30B MoE model with 2.4B active parameters, 64K context, optimized for real-time chat and voice-agent pipelines — it powers Samvaad. Sarvam-105B is the 105B+ flagship with MLA attention and 128K context, scoring 98.6 on Math500 and 49.5 on BrowseComp — it powers Indus. Both are Apache 2.0 open-source. Use 30B for throughput; 105B for complex reasoning and agentic workflows.

How do I use the Sarvam AI API?

Sign up at dashboard.sarvam.ai, get an api-subscription-key, install the SDK (pip install sarvamai or npm install sarvamai), and call endpoints. Chat uses OpenAI-compatible POST /v1/chat/completions with model sarvam-30b or sarvam-105b. Speech, translation, TTS, and document intelligence have dedicated REST endpoints. New users receive ₹100 free credits.

Which Sarvam model should I use for translation?

Use sarvam-translate:v1 for all 22 official Indian languages with formal style (2,000 chars/request). Use mayura:v1 for colloquial, code-mixed, or script-controlled translation across 11 languages (1,000 chars/request). Mayura supports modes like modern-colloquial and code-mixed with output_script control (roman, fully-native, spoken-form-in-native).

What languages does Sarvam AI support?

Chat LLMs support the 10 most-spoken Indian languages plus English, and accept native-script, romanized, and code-mixed input. Saaras v3 ASR and Sarvam-Translate cover 23 languages (22 Indian + English). Sarvam Vision document intelligence supports all 22 official Indian languages plus English. Bulbul v3 TTS supports 11 languages with 30+ speaker voices.

Is Sarvam AI competitive with GPT or Claude?

On Indian-language benchmarks, Sarvam 105B wins ~90% of pairwise comparisons against frontier models. On global English-centric benchmarks like Artificial Analysis Intelligence Index, it trails GPT and Claude frontier models. Sarvam 105B scores 98.6 on Math500, 88.3 on AIME 25 (96.7 with tools), and 68.3 on Tau2 agentic benchmark — competitive with models of its class globally, but not frontier-tier on all English tasks.

Sarvam AI Capabilities: Models, API & Indian Language Stack (2026) | explainx.ai Blog

Sarvam AI is building India's sovereign AI stack — not a single model, but a full product layer spanning chat LLMs, speech recognition, text-to-speech, translation, and document intelligence, all optimized for the way Indian languages are actually used: native script, romanized WhatsApp Hindi, code-mixed Hinglish, and 22 scheduled languages.

In March 2026, Sarvam open-sourced Sarvam-30B and Sarvam-105B — MoE reasoning models trained from scratch on IndiaAI Mission compute. Both are already in production: Sarvam 30B powers Samvaad (conversational agent platform), Sarvam 105B powers Indus (AI assistant for complex reasoning and agentic workflows).

This guide maps every model, API, pricing tier, and integration path — so you can pick the right Sarvam capability for your use case without reading six separate doc pages.

Quick reference: the Sarvam stack

Model	ID	What it does	Languages	Best for
Sarvam-105B	`sarvam-105b`	Flagship chat LLM (MoE + MLA)	10 Indic + English	Reasoning, agents, long docs
Sarvam-30B	`sarvam-30b`	Efficient chat LLM (MoE + GQA)	10 Indic + English	Voice agents, high-throughput chat
Saaras v3	`saaras:v3`	Speech-to-text	23 (22 Indic + English)	Call analytics, voice agents, telephony
Bulbul v3	`bulbul:v3`	Text-to-speech	11 (10 Indic + English)	IVR, narration, voice agents
Sarvam-Translate	`sarvam-translate:v1`	Formal translation	All 22 official Indic + English	Official docs, 22-language coverage
Mayura	`mayura:v1`	Colloquial translation	11 (10 Indic + English)	Code-mixed, conversational text
Sarvam Vision	`sarvam-vision`	Document intelligence (OCR)	23 (22 Indic + English)	Scanned archives, table extraction

Deprecated: sarvam-m (24B hybrid) — migrate to sarvam-30b or sarvam-105b.

SDK: pip install sarvamai · npm install sarvamai · docs.sarvam.ai

Free credits: ₹100 on signup · Pricing

Company and positioning

Sarvam AI was founded in 2023 in Bengaluru by Vivek Raghavan and Pratyush Kumar. It was selected under India's IndiaAI Mission to build the country's first homegrown LLM stack — trained entirely on Indian compute with datasets emphasizing Indian languages, code-mixed text, and culturally grounded content.

The strategic bet: unified multimodal models from Western labs treat Indian languages as secondary. Sarvam builds specialized foundations for multilingual India — the same thesis Ideogram applies to design typography, applied here to speech, script diversity, and romanized colloquial usage.

For sovereign AI policy context, see our India Sovereign AI Status 2026 post. This guide focuses on product capabilities and developer integration.

Chat LLMs: Sarvam-30B and Sarvam-105B

Both models are reasoning models trained from scratch — not fine-tunes of Mistral, Qwen, or Llama. Architecture: Mixture-of-Experts Transformer with 128 sparse experts, sigmoid-based routing, and in-house RL (async GRPO with CISPO-inspired policy optimization).
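
To make "sigmoid-based routing" concrete, here is a minimal sketch of one common formulation: every expert gets an independent sigmoid gate score, the highest-scoring experts are kept, and their gates are renormalized into mixing weights. It is an illustration only, not Sarvam's implementation. The article does not say how many experts are active per token or whether gates are renormalized, so `top_k=4` and the normalization step are assumptions.

```python
import math
import random

NUM_EXPERTS = 128  # from the article; everything else here is illustrative


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def route_token(router_logits: list[float], top_k: int = 4) -> dict[int, float]:
    """Pick top_k experts by sigmoid gate score and renormalize their weights.

    top_k and the renormalization are assumptions made for illustration.
    """
    gates = [sigmoid(z) for z in router_logits]
    # Indices of the top_k highest gate scores.
    chosen = sorted(range(len(gates)), key=lambda i: gates[i], reverse=True)[:top_k]
    total = sum(gates[i] for i in chosen)
    return {i: gates[i] / total for i in chosen}


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
weights = route_token(logits)
print(weights)  # maps 4 expert indices to mixing weights that sum to 1
```

Unlike a softmax router, each sigmoid gate is scored independently of the others, so raising one expert's logit does not by itself lower another expert's gate.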

Sarvam-105B (flagship)

Spec	Value
Total parameters	105B+ MoE
Attention	Multi-head Latent Attention (MLA)
Active params	~10B per token
Context window	128K tokens
Pre-training	12T tokens
License	Apache 2.0
Powers	Indus AI assistant

Benchmark highlights (from Sarvam's blog):

Benchmark	Sarvam-105B
Math500	98.6
AIME 25 (w/ tools)	88.3 (96.7)
MMLU	90.6
LiveCodeBench v6	71.7
BrowseComp	49.5
Tau2 (avg.)	68.3 (highest in comparison set)
SWE-Bench Verified	45.0
Indian language win rate	~90% pairwise

Sarvam 105B leads on agentic benchmarks — BrowseComp and Tau2 — reflecting training on tool interaction, web search, and multi-step environments. On Indian-language pairwise evals, it wins ~90% of comparisons across fluency, script correctness, usefulness, and verbosity.

Sarvam-30B (efficient)

Spec	Value
Total parameters	30B MoE
Active params	2.4B per token
Attention	Grouped Query Attention (GQA)
Context window	64K tokens
Pre-training	16T tokens
License	Apache 2.0
Powers	Samvaad conversational platform
Inference	H100, L40S, Apple Silicon (MXFP4)

Benchmark	Sarvam-30B
Math500	97.0
HumanEval	92.1
LiveCodeBench v6	70.0
AIME 25 (w/ tools)	80.0 (96.7)
BrowseComp	35.5
Indian language win rate	~89% pairwise

Sarvam 30B is optimized for real-time deployment: Sarvam reports 3–6× the throughput of a Qwen3 baseline on H100, and the model runs locally on a MacBook Pro M3 via MXFP4.

Choosing between them

Need	Model
Voice-agent pipeline, low latency	Sarvam-30B
Multi-step reasoning, tool use, long docs	Sarvam-105B
Local/edge inference on laptop	Sarvam-30B (MXFP4)
Maximum Indian-language quality	Sarvam-105B
Cost-sensitive high-volume chat	Sarvam-30B (₹2.5/1M input vs ₹4)
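
The table can be turned into a small routing helper. The sketch below is one possible encoding, not an official selection rule: the `Workload` fields are hypothetical, and the exact token counts behind "64K" and "128K" are assumed to be 64,000 and 128,000.

```python
from dataclasses import dataclass

# Context limits as quoted in the article; the exact token counts behind
# "64K" and "128K" are assumed here to be 64,000 and 128,000.
CONTEXT_LIMIT = {"sarvam-30b": 64_000, "sarvam-105b": 128_000}


@dataclass
class Workload:
    prompt_tokens: int
    needs_tool_use: bool = False        # multi-step agents, web search
    needs_deep_reasoning: bool = False  # long multi-step reasoning


def pick_chat_model(w: Workload) -> str:
    if w.prompt_tokens > CONTEXT_LIMIT["sarvam-105b"]:
        raise ValueError("prompt exceeds the 128K context of either model")
    if (
        w.prompt_tokens > CONTEXT_LIMIT["sarvam-30b"]
        or w.needs_tool_use
        or w.needs_deep_reasoning
    ):
        return "sarvam-105b"
    # Default to the cheaper, lower-latency model.
    return "sarvam-30b"


print(pick_chat_model(Workload(prompt_tokens=2_000)))   # sarvam-30b
print(pick_chat_model(Workload(prompt_tokens=90_000)))  # sarvam-105b
```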

API integration (OpenAI-compatible)

from sarvamai import SarvamAI

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

response = client.chat.completions(
    model="sarvam-105b",
    messages=[
        {"role": "user", "content": "Explain GST impact on Indian MSMEs in Hindi."}
    ],
    temperature=0.5,
    max_tokens=2000,
)

print(response.choices[0].message.content)

curl -X POST https://api.sarvam.ai/v1/chat/completions \
  -H "api-subscription-key: $SARVAM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Explain GST impact on Indian MSMEs."}],
    "model": "sarvam-105b",
    "temperature": 0.5,
    "max_tokens": 2000
  }'

Streaming is supported. Reasoning mode is on by default (reasoning_effort: low), and reasoning tokens count toward max_tokens. If answers come back truncated, increase max_tokens, or set reasoning_effort=None to disable reasoning.
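
Because reasoning tokens share the max_tokens budget with the visible answer, it can help to size the two parts separately. The sketch below builds, but does not send, the same request as the curl example using only the standard library. The `answer_tokens` and `reasoning_headroom` parameters and their default values are hypothetical, not documented guidance.

```python
import json
import urllib.request

CHAT_URL = "https://api.sarvam.ai/v1/chat/completions"  # from the curl example


def build_chat_request(api_key, prompt, model="sarvam-30b",
                       answer_tokens=1000, reasoning_headroom=1500):
    """Build (but do not send) a chat request with room for reasoning tokens."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Reasoning tokens count toward max_tokens, so budget for both.
        "max_tokens": answer_tokens + reasoning_headroom,
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "api-subscription-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("YOUR_SARVAM_API_KEY", "Summarize GST for MSMEs.")
print(json.loads(req.data)["max_tokens"])  # 2500
```

Passing the request to `urllib.request.urlopen(req)` would send it; that step is left out so the sketch runs without a key.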

Weights: Hugging Face 30B · Hugging Face 105B · run with Transformers, vLLM, or SGLang.

Speech: Saaras v3 (ASR)

Saaras v3 is Sarvam's speech-to-text model — state-of-the-art ASR for Indian accents, code-mixed speech, and telephony audio (8 kHz).

Spec	Value
Model ID	`saaras:v3`
Languages	23 (22 Indic + English), auto-detect
REST limit	30 seconds per request
Batch limit	Up to 2 hours per file
Protocols	REST, Batch, WebSocket streaming
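
The limits above suggest a simple way to route audio by length. This is a sketch under the limits quoted in the table; the function name and the `live` flag are hypothetical.

```python
REST_MAX_SECONDS = 30          # REST limit from the table
BATCH_MAX_SECONDS = 2 * 3600   # Batch limit: up to 2 hours per file


def choose_stt_protocol(duration_seconds: float, live: bool = False) -> str:
    """Return "websocket", "rest", or "batch" for a clip of the given length."""
    if live:
        return "websocket"  # audio streamed as it is captured
    if duration_seconds <= REST_MAX_SECONDS:
        return "rest"
    if duration_seconds <= BATCH_MAX_SECONDS:
        return "batch"
    raise ValueError("file exceeds the 2-hour batch limit; split it first")


print(choose_stt_protocol(12))    # rest
print(choose_stt_protocol(1800))  # batch
```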

Five output modes

Mode	Output
`transcribe`	Text in source language
`translate`	Translated text (typically to English)
`verbatim`	Word-for-word including fillers
`translit`	Transliterated script
`codemix`	Preserves code-mixed structure

Example

from sarvamai import SarvamAI

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Pass an open binary file handle rather than a path string.
with open("audio.wav", "rb") as audio_file:
    response = client.speech_to_text.transcribe(
        file=audio_file,
        model="saaras:v3",
        language_code="hi-IN",
        mode="transcribe",
        with_timestamps=True,
    )

print(response.transcript)

Batch API supports speaker diarization (diarized_transcript) — ideal for call center analytics, meetings, and long-form media.

Pricing: ₹30/hour (transcribe) · ₹45/hour (with diarization) · billed per second, rounded up.
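
Per-second billing makes costs easy to estimate ahead of time. The sketch below assumes "rounded up" means rounding each file's duration up to the next whole second; the function name is hypothetical.

```python
import math

RATE_PER_HOUR = {"transcribe": 30.0, "diarization": 45.0}  # INR, from the article


def stt_cost_inr(duration_seconds: float, kind: str = "transcribe") -> float:
    # Assumption: each file is billed per whole second, rounded up.
    billed_seconds = math.ceil(duration_seconds)
    return billed_seconds * RATE_PER_HOUR[kind] / 3600


# A 90.2-second clip is billed as 91 seconds.
print(round(stt_cost_inr(90.2), 4))
print(stt_cost_inr(3600, "diarization"))  # 45.0
```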

Best for: Voice agents, IVR analytics, 8 kHz telephony, Hinglish/code-mixed call recordings.

Speech: Bulbul v3 (TTS)

Bulbul v3 converts text to natural-sounding speech across Indian languages.

Spec	Value
Model ID	`bulbul:v3`
Languages	11 (10 Indic + English)
Speakers	30+ voices (Shubh, Priya, Aditya, Ritu, Anand, …)
Max chars	2,500 per REST request
Sample rates	8–48 kHz (48 kHz REST/WebSocket only)
Pace control	0.5×–2.0×

Example

from sarvamai import SarvamAI
from sarvamai.play import play

client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

response = client.text_to_speech.convert(
    text="आपका ऑर्डर confirm हो गया है।",
    target_language_code="hi-IN",
    model="bulbul:v3",
    speaker="priya",
    speech_sample_rate=24000,
)

play(response)

Critical limitation: Romanized Indic input degrades quality significantly. Always use native script for Indic words — e.g. "आपका order confirm हो गया है" not "Aapka order confirm ho gaya hai".
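
A pre-flight check can catch romanized input before it reaches the TTS call. The heuristic below is a rough sketch for Hindi only: it counts words written purely in Latin script and flags text where they dominate. The function name and the 0.5 threshold are hypothetical, and other Indic scripts would need their own Unicode ranges.

```python
import re

# Devanagari Unicode block (used by Hindi, Marathi, and others).
DEVANAGARI = re.compile(r"[\u0900-\u097F]")
LATIN = re.compile(r"[A-Za-z]")


def looks_romanized_hindi(text: str, max_latin_ratio: float = 0.5) -> bool:
    """Heuristic: flag Hindi-bound text that is mostly Latin-script words."""
    words = text.split()
    if not words:
        return False
    latin_only = sum(
        1 for w in words if LATIN.search(w) and not DEVANAGARI.search(w)
    )
    return latin_only / len(words) > max_latin_ratio


print(looks_romanized_hindi("आपका order confirm हो गया है"))     # False
print(looks_romanized_hindi("Aapka order confirm ho gaya hai"))  # True
```

Code-mixed text with a few English words passes, which matches the article's own example of acceptable input.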

Pricing: ₹30/10K characters (v3 beta) · ₹15/10K (v2 legacy).

Protocols: REST, HTTP streaming, WebSocket — for real-time voice agent pipelines pair Bulbul (TTS) + Saaras (ASR) + Sarvam-30B (LLM).

Translation: Sarvam-Translate vs Mayura

Two translation models serve different styles:

	Sarvam-Translate	Mayura
Model ID	`sarvam-translate:v1`	`mayura:v1`
Languages	All 22 official Indic + English	11 (10 Indic + English)
Max input	2,000 characters	1,000 characters
Style	Formal only	formal, modern-colloquial, classic-colloquial, code-mixed
Script control	No	roman, fully-native, spoken-form-in-native
Best for	Government docs, legal, all-language coverage	WhatsApp-style Hinglish, conversational UI
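
The comparison above can be sketched as a chooser plus a splitter that keeps each request under the model's character limit. Both function names are hypothetical. The sketch assumes the limits count Unicode code points, and it splits on whitespace for simplicity; splitting on sentence boundaries would likely translate better.

```python
MAX_CHARS = {"sarvam-translate:v1": 2000, "mayura:v1": 1000}  # per request


def pick_translation_model(style: str = "formal",
                           needs_script_control: bool = False) -> str:
    # Only Mayura offers non-formal styles and output_script control.
    if style != "formal" or needs_script_control:
        return "mayura:v1"
    return "sarvam-translate:v1"


def split_for_model(text: str, model: str) -> list[str]:
    """Greedily pack whitespace-separated words into chunks within the limit."""
    limit = MAX_CHARS[model]
    chunks, current = [], ""
    for word in text.split():
        if len(word) > limit:
            raise ValueError("single word longer than the request limit")
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


model = pick_translation_model(style="code-mixed")
parts = split_for_model("word " * 600, model)
print(model, len(parts), max(len(p) for p in parts))  # mayura:v1 3 999
```

The chooser ignores language coverage: a target language outside Mayura's supported set would still need Sarvam-Translate regardless of style.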

Sarvam-Translate (formal, 22 languages)

response = client.text.translate(
    input="भारत एक महान देश है।",
    source_language_code="hi-IN",
    target_language_code="gu-IN",
    model="sarvam-translate:v1",
)

Open weights available on Hugging Face under Apache 2.0.

Mayura (colloquial + code-mixed)

response = client.text.translate(
    input="Your EMI of Rs. 3000 is pending",
    source_language_code="en-IN",
    target_language_code="hi-IN",
    model="mayura:v1",
    mode="modern-colloquial",
    output_script="fully-native",
    numerals_format="international",  # "native" would render the digits as ३०००
)
# → "आपका रु. 3000 का ई.एम.ऐ. पेंडिंग है।"

Also available: /transliterate (script conversion without translation) and /detect-language (language ID across all major Indian languages).

Pricing: ₹20/10K characters (translate/transliterate) · ₹3.5/10K (language ID).

Document intelligence: Sarvam Vision

Sarvam Vision is a 3B parameter vision-language model built for Indian-language OCR and document parsing — where global VLMs treat Indic scripts as secondary.

Spec	Value
Model ID	`sarvam-vision`
Parameters	3B (state-space VLM)
Languages	23 (22 Indic + English)
Input	PDF, PNG, JPG, ZIP
Output	HTML, Markdown, JSON (structured page data)
Max pages	10 per job
Max file size	200 MB

Capabilities

Text extraction with layout and reading order preserved
Complex tables — merged cells, multi-level headers, invisible borders → clean HTML/Markdown
End-to-end Indic — Marathi PDF → Marathi structured output (no forced English translation)

Example

job = client.document_intelligence.create_job(
    language="hi-IN",
    output_format="md",
)
job.upload_file("document.pdf")
job.start()
job.wait_until_complete()
job.download_output("./output.zip")

Pricing: ₹0.5/page · max 10 pages per job.
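
With a 10-page cap per job, longer documents have to be split before upload. The sketch below only plans the page ranges and the cost. The function name is hypothetical, and cutting the PDF itself would need a separate PDF library.

```python
MAX_PAGES_PER_JOB = 10    # from the article
PRICE_PER_PAGE_INR = 0.5  # from the article


def plan_jobs(total_pages: int) -> list[range]:
    """Split a document into 1-indexed page ranges of at most 10 pages each."""
    return [
        range(start, min(start + MAX_PAGES_PER_JOB, total_pages + 1))
        for start in range(1, total_pages + 1, MAX_PAGES_PER_JOB)
    ]


jobs = plan_jobs(23)
print([(r.start, r.stop - 1) for r in jobs])  # [(1, 10), (11, 20), (21, 23)]
print(23 * PRICE_PER_PAGE_INR)                # 11.5
```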

Best for: Digitizing government records, Indic academic archives, financial reports with complex tables, scanned legal documents.

API pricing summary (INR)

All prices from Sarvam's pricing page:

Service	Price	Unit
Sarvam-105B chat	₹4 / ₹2.5 / ₹16	input / cached / output per 1M tokens
Sarvam-30B chat	₹2.5 / ₹1.5 / ₹10	input / cached / output per 1M tokens
Speech-to-text	₹30	per hour of audio
STT + diarization	₹45	per hour
STT + translate	₹30	per hour
Sarvam-Translate	₹20	per 10K characters
Mayura translate	₹20	per 10K characters
Language ID	₹3.5	per 10K characters
Bulbul v3 TTS	₹30	per 10K characters
Document digitization	₹0.5	per page
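
To turn the chat rows of the table above into a per-request estimate, a small calculator is enough. This is a sketch under one assumption that the pricing page may define differently: cached tokens are treated as a subset of the input tokens and billed at the cached rate instead of the input rate. The function name `chat_cost_inr` is illustrative.

```python
# (input, cached input, output) prices in INR per 1M tokens, copied from the table above.
CHAT_PRICES_INR = {
    "sarvam-105b": (4.0, 2.5, 16.0),
    "sarvam-30b": (2.5, 1.5, 10.0),
}


def chat_cost_inr(
    model: str,
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int = 0,
) -> float:
    """Estimate the cost of one chat request in INR.

    Assumption: cached_tokens is the part of input_tokens billed at the cached rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    in_price, cached_price, out_price = CHAT_PRICES_INR[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * in_price
        + cached_tokens * cached_price
        + output_tokens * out_price
    )
    return total / 1_000_000
```

At these rates, one million input tokens plus one million output tokens on Sarvam-30B comes to ₹12.5.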

Rate limits: Starter 60 req/min · Pro 200 · Business 1,000 · Enterprise custom.
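
To stay under a per-minute limit, a client-side throttle can space requests out before they reach the API. The sketch below is a generic sliding-window limiter, not something from the Sarvam SDK. How Sarvam actually counts its window is not documented here, so treat this as a conservative approximation; the class name and parameters are illustrative.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds interval."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent: deque[float] = deque()  # timestamps of recent requests

    def acquire(self) -> None:
        """Block until a request slot is free, then record the request."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # Wait until the oldest request leaves the window.
            self._sleep(self._window - (now - self._sent[0]))
```

On the Starter tier this would be created as `SlidingWindowLimiter(60)` and `acquire()` called before every API request. The clock and sleep functions are injectable so the limiter can be tested without real waiting.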

Free tier: ₹100 credits on signup to explore all APIs.

Products built on Sarvam models

Product	Model	Description
Indus	Sarvam-105B	AI assistant for complex reasoning and agentic workflows
Samvaad	Sarvam-30B	Conversational agent platform for real-time multilingual chat

Both are live in production — the open-source release is not a research preview; these models serve real users today.

Sarvam Startup Program (March 2026): Selected early-stage companies receive 6–12 months of API credits, priority engineering support, and production infrastructure access.

Building a voice agent pipeline

The most common production pattern stacks three Sarvam APIs:

User speech → Saaras v3 (ASR) → Sarvam-30B (LLM) → Bulbul v3 (TTS) → Audio response

Why Sarvam-30B for the LLM layer: 2.4B active parameters = low latency; 64K context handles conversation history; trained on code-mixed Indian language input natively.

For agentic voice (tool calling, web search): swap in Sarvam-105B — 49.5 BrowseComp and 68.3 Tau2 scores reflect strong tool-use training.

For document-heavy workflows (scan a form, extract fields, respond in voice): add Sarvam Vision upstream of the LLM.
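
The pipeline described above can be sketched as a thin orchestration layer that keeps conversation history and picks the model ID. To keep the example runnable without network access, the three stages are injected as plain callables; `VoiceAgent`, `transcribe`, `generate`, and `synthesize` are hypothetical names, and in practice each would wrap the corresponding Saaras v3, chat, and Bulbul v3 call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class VoiceAgent:
    transcribe: Callable[[bytes], str]           # audio -> text (e.g. wraps Saaras v3)
    generate: Callable[[str, list[dict]], str]   # (model ID, messages) -> reply text
    synthesize: Callable[[str], bytes]           # text -> audio (e.g. wraps Bulbul v3)
    needs_tools: bool = False                    # True for tool calling or web search
    history: list[dict] = field(default_factory=list)

    @property
    def model(self) -> str:
        # Follows the guidance above: 30B for latency, 105B for agentic turns.
        return "sarvam-105b" if self.needs_tools else "sarvam-30b"

    def handle_turn(self, audio: bytes) -> bytes:
        """Run one user turn through ASR, the LLM, and TTS."""
        user_text = self.transcribe(audio)
        self.history.append({"role": "user", "content": user_text})
        reply = self.generate(self.model, self.history)
        self.history.append({"role": "assistant", "content": reply})
        return self.synthesize(reply)
```

Keeping the stages as injected callables makes it easy to swap a stage (for instance, adding a Sarvam Vision step that turns a scanned form into text before the LLM call) and to unit-test the turn logic with stubs.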

Honest benchmark framing

Sarvam's strength is structural (Indic-language quality, tokenizer efficiency, and speech/OCR coverage), not frontier dominance across every benchmark:

Where Sarvam leads:

Indian-language pairwise evals (~90% win rate for 105B)
Agentic benchmarks in its class (Tau2, BrowseComp)
Math/reasoning at model scale (Math500 98.6, AIME 96.7 w/ tools)
Tokenizer efficiency for Indic scripts (lower cost per Indic token)
Speech/translation/OCR for 22+ languages

Where Sarvam trails:

English-centric global frontier benchmarks (Artificial Analysis Intelligence Index ~18 for 105B)
TerminalBench Hard (~1.5% for 105B vs GLM-4.5-Air ~20%)
SWE-Bench Verified (45% — competitive but below top coding models)

The honest use case: Indian-language applications, voice agents, document digitization, and sovereign deployment — not replacing Claude Fable 5 for English-only frontier coding.

MCP and agent integration

Sarvam publishes an MCP server at https://docs.sarvam.ai/_mcp/server for Claude Code, Cursor, and other MCP hosts — plus a Meta Prompt in their docs to guide any chat model on using Sarvam APIs effectively.

For wiring into agent harnesses, Sarvam-105B's tool-use training (BrowseComp 49.5, Tau2 68.3) makes it a strong backend for Indian-language agent loops. See our Agent Harness guide for loop architecture.
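
As a rough illustration of such a loop, the sketch below runs a tool-calling cycle over OpenAI-style message dicts. It assumes the chat endpoint returns tool calls in the OpenAI-compatible shape (`tool_calls` entries with an `id` and a `function` holding `name` and JSON-encoded `arguments`); that is an assumption to verify against Sarvam's docs. `run_agent` and the injected `chat` callable are hypothetical names, with `chat` standing in for a call to Sarvam-105B.

```python
import json
from typing import Any, Callable


def run_agent(
    chat: Callable[[list[dict]], dict],
    tools: dict[str, Callable[..., Any]],
    messages: list[dict],
    max_steps: int = 5,
) -> str:
    """Call the model until it answers without requesting a tool.

    chat(messages) must return an OpenAI-style assistant message dict.
    """
    for _ in range(max_steps):
        reply = chat(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply.get("content") or ""
        for call in tool_calls:
            fn = call["function"]
            args = json.loads(fn["arguments"] or "{}")
            handler = tools.get(fn["name"])
            if handler is None:
                result = {"error": f"unknown tool {fn['name']}"}
            else:
                result = handler(**args)
            # Feed the tool result back so the model can continue.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result, ensure_ascii=False),
            })
    raise RuntimeError("agent did not finish within max_steps")
```

The `max_steps` cap is a deliberate guard: a model that keeps requesting tools would otherwise loop, and spend tokens, indefinitely.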

Getting started checklist

Sign up at dashboard.sarvam.ai — ₹100 free credits
Install SDK: pip install sarvamai
Pick your model from the stack table above
Test in Playground at docs.sarvam.ai before production
For self-hosted LLM: download weights from Hugging Face, run with vLLM/SGLang
For voice agents: Saaras → Sarvam-30B → Bulbul pipeline
For documents: Sarvam Vision batch API (split PDFs >10 pages)

Summary

Sarvam AI is the most complete India-first AI product stack available in 2026 — not just LLMs, but speech, translation, TTS, and document intelligence trained on Indian compute with open weights on the flagship models.

Three things to remember:

Two LLMs, two jobs: Sarvam-30B for speed and voice pipelines; Sarvam-105B for reasoning, agents, and maximum quality.
Translation has two models: Sarvam-Translate for formal translation across all 22 official Indian languages; Mayura for colloquial and code-mixed text such as Hinglish.
The moat is Indic depth (a ~90% win rate in Indian-language pairwise evals for 105B, native-script OCR, and code-mixed speech), not English frontier parity.

Sarvam AI: Full Capabilities Guide — Models, API, Speech, Vision & How to Run (2026)

