Production-grade inference for curated open-weight and Indic-native models. India-hosted. OpenAI-compatible.
Drop-in replacement for the OpenAI API. Keep using the OpenAI SDK: change your base URL and keep your existing workflow.
Track token usage per client, project, or team. Set budgets and rate limits per API key.
Choose how long prompts and completions are stored. Default: 30 days. Configurable down to zero. Your data is never used for training.
Every API call logged with timestamp, model, token count, and API key ID. Exportable for compliance reviews.
Choose the right tradeoff between cost, latency, and capability.
Smaller models, lowest latency. Best for high-volume tasks, prototyping, and cost-sensitive workloads.
Qwen3-8B, Krutrim-2 12B
Balanced performance and cost. Indic-native and multilingual models for most production workloads.
Sarvam-30B, Qwen3-32B
Largest models, highest quality. Complex reasoning, enterprise use cases, and frontier capabilities.
DeepSeek-R1, Llama 3.3 70B, Sarvam-105B
If you use the OpenAI SDK today, migration takes one line. Pay in INR with proper GST invoices — no forex markups, no currency conversions.
from openai import OpenAI

# Point base_url at this endpoint and supply your key; everything else stays the same
client = OpenAI(
    base_url="https://api.indicstack.ai/v1",
    api_key="isk_your_api_key",
)

# Standard chat completion request, unchanged from the OpenAI SDK
response = client.chat.completions.create(
    model="sarvam-30b",
    messages=[
        {"role": "user", "content": "Explain DPDP Act in simple terms"}
    ],
)

# Print the text of the first completion
print(response.choices[0].message.content)
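
The per-client and per-project usage tracking mentioned earlier happens on the platform side, but a client-side tally can be a useful cross-check. The following is a minimal sketch, not the platform's own mechanism. It assumes responses carry a `usage` object with a `total_tokens` field, as OpenAI-format chat completions normally do. The `UsageTracker` class, the project name, and the budget figure are hypothetical, and a stand-in object replaces a real `response.usage` so the example runs without a network call.

```python
from collections import defaultdict
from types import SimpleNamespace


class UsageTracker:
    """Hypothetical client-side tally of tokens per project, with an optional budget."""

    def __init__(self, budget_tokens=None):
        self.budget_tokens = budget_tokens
        self.totals = defaultdict(int)

    def record(self, project, usage):
        """Add one response's total_tokens to the project's running total."""
        self.totals[project] += usage.total_tokens
        return self.totals[project]

    def over_budget(self, project):
        """Return True once the project's total exceeds the budget, if one is set."""
        if self.budget_tokens is None:
            return False
        return self.totals[project] > self.budget_tokens


# Stand-in for response.usage (assumed OpenAI-format fields).
fake_usage = SimpleNamespace(prompt_tokens=12, completion_tokens=88, total_tokens=100)

tracker = UsageTracker(budget_tokens=250)
tracker.record("client-a", fake_usage)
tracker.record("client-a", fake_usage)
print(tracker.totals["client-a"], tracker.over_budget("client-a"))  # 200 False

tracker.record("client-a", fake_usage)
print(tracker.totals["client-a"], tracker.over_budget("client-a"))  # 300 True
```

In real use, you would pass `response.usage` from each call to `tracker.record(...)` in place of the stand-in object.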