Model Catalog

Indic-native and multilingual models hosted on Indian infrastructure. All accessible via OpenAI-compatible API.

Sarvam-M

24B·ctx 32K

Indic-native

Instruction-tuned for 10 Indian languages. Strong for regional customer support and voice agent backends.

ChatDefaultApache 2.0

hibntatekngumrmlorpaen

Sarvam-30B

32.2B (MoE)·ctx 8K

Indic-native

Flagship Indic chat model with extensive code-mixing support across 22 Indian languages.

ChatDefaultApache 2.0

hibntatekngumrmlpaorasen+10

Qwen3-8B

8B·ctx 128K

High-throughput multilingual chat. Best economy option for English-first workloads with broad Indic coverage.

ChatEconomyApache 2.0

enhi+100 languages

Gemma 4 E4B

8B (4.5B eff.)·ctx 128K

Google's latest multimodal model. Low-cost serving with native support for major Indian languages.

ChatEconomyApache 2.0

enhibntateknml+133 languages

DeepSeek-R1 Distill 14B

14B·ctx 128K

R1 chain-of-thought reasoning distilled from Qwen2.5-14B. Fits single A100 comfortably at FP16.

ChatDefaultMIT

enzhmultilingual

Gemma 4 26B-A4B

25.2B (3.8B active)·ctx 256K

MoE design delivers frontier-quality reasoning at economy-tier throughput cost. 256K context window.

ChatDefaultApache 2.0

enhibntateknml+133 languages

Qwen3-32B

32B·ctx 128K

Strong coding and reasoning with hybrid thinking mode. High-context agentic and RAG workloads.

ChatDefaultApache 2.0

enhi+100 languages

DeepSeek-R1 Distill 32B

32B·ctx 128K

Best open reasoning model under 70B. Outperforms o1-mini on AIME 2024, MATH-500, and LiveCodeBench.

ChatDefaultMIT

enzhmultilingual

Sarvam-105B

106B·ctx 8K

Indic-native

Highest-quality Indic model available. Dedicated GPU deployment — contact us to discuss fit.

ChatPremiumApache 2.0

hibntatekngumrmlpaorasen+10

IndicConformer

600M

Indic-native

Fast, accurate ASR across all 22 scheduled Indian languages including low-resource variants.

Speech-to-TextEconomyMIT

22 Indian languages

Whisper Large V3 (Hindi)

1.5B

Whisper V3 fine-tuned on Vaani Hindi dataset. Handles diverse accents, noisy audio, and code-mixed speech.

Speech-to-TextEconomyApache 2.0

Qwen3-ASR-1.7B

1.7B

State-of-the-art open ASR with unified offline and streaming inference. Language identification included.

Speech-to-TextEconomyApache 2.0

hi+30 languages

Indic Parler TTS

937M

Indic-native

Expressive TTS with 500+ speaker voices across 18 Indian languages. Prompt-controlled voice style.

Text-to-SpeechEconomyApache 2.0

18 Indian languages

Indic-Mio

0.6B

Indic-native

44kHz TTS with zero-shot voice cloning and code-mixed text support across all 22 scheduled Indian languages.

Text-to-SpeechEconomyApache 2.0

22 Indian languages

Qwen3-Embedding-0.6B

0.6B·ctx 32K

MTEB-leading multilingual embeddings. Best for RAG over Indic and multilingual document corpora.

EmbeddingsEconomyApache 2.0

119 languages

Granite Embedding 311M

311M·ctx 32K

IBM's multilingual retrieval model with Matryoshka truncation and ONNX export. Minimal serving overhead.

EmbeddingsEconomyApache 2.0

200+ languages

Qwen3-Reranker-0.6B

0.6B·ctx 32K

Lightweight cross-encoder reranker for improving retrieval precision in multilingual RAG pipelines.

RerankerEconomyApache 2.0

multilingual

GTE Multilingual Reranker

306M·ctx 8K

Encoder-only reranker with ~10x throughput advantage over decoder-based alternatives. Proven in production.

RerankerEconomyApache 2.0

70+ languages

Need a specific model?

We evaluate new models continuously. Tell us what you need and we will check if it fits our serving infrastructure.

Request a Model