Managed LLM API

Production-grade inference for curated open-weight and Indic-native models. India-hosted. OpenAI-compatible.

Request Early Access | View Models

Built for production

# Before

base_url="https://api.openai.com/v1"

# After

base_url="https://api.indicstack.ai/v1"

chat.completions · embeddings · models

OpenAI-compatible

Drop-in replacement for the OpenAI SDK. Change your base URL and keep your existing workflow.
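As a rough sketch of what the compatibility claim could mean beyond chat completions, the helpers below call the `models` and `embeddings` endpoints through the OpenAI Python SDK. They assume those endpoints behave as they do against OpenAI. The helper names are illustrative, and the embedding model ID in the usage comment is a placeholder, not a confirmed model on this platform.

```python
from typing import Any, List


def make_client(api_key: str) -> Any:
    """Build an OpenAI SDK client pointed at the India-hosted base URL."""
    # Imported lazily so the helpers below work with any compatible client.
    from openai import OpenAI

    return OpenAI(base_url="https://api.indicstack.ai/v1", api_key=api_key)


def list_model_ids(client: Any) -> List[str]:
    """Return the model IDs reported by the models endpoint."""
    return [model.id for model in client.models.list()]


def embed_texts(client: Any, texts: List[str], model: str) -> List[List[float]]:
    """Embed a batch of texts and return one vector per input."""
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]


# Usage (requires network access and a valid key):
#   client = make_client("isk_your_api_key")
#   print(list_model_ids(client))
#   vectors = embed_texts(client, ["hello"], model="your-embedding-model")  # placeholder ID
```

Passing the client in as an argument keeps the helpers independent of the provider, which is the point of an OpenAI-compatible API.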

Client            Model       Tokens
axis-claims-bot   sarvam-30b  12,847
hdfc-kyc-assist   qwen3-32b    8,231
navi-support      qwen3-8b     3,104

Client-level metering

Track token usage per client, project, or team. Set budgets and rate limits per API key.
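The metering described here happens on the platform side. If you also want a local view, one option is to tally the `usage.total_tokens` field that the OpenAI SDK returns on each chat completion. The `UsageMeter` class below is a hypothetical client-side sketch, not part of any SDK, and the budget figure is invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UsageMeter:
    """Local tally of tokens per client, with optional per-client budgets."""

    budgets: Dict[str, int] = field(default_factory=dict)
    used: Dict[str, int] = field(default_factory=lambda: defaultdict(int))

    def record(self, client_id: str, total_tokens: int) -> None:
        """Add the token count from one response to the client's total."""
        self.used[client_id] += total_tokens

    def remaining(self, client_id: str) -> Optional[int]:
        """Tokens left in the budget (negative if exceeded), or None if no budget is set."""
        budget = self.budgets.get(client_id)
        if budget is None:
            return None
        return budget - self.used[client_id]

    def over_budget(self, client_id: str) -> bool:
        """True only when a budget exists and has been exceeded."""
        left = self.remaining(client_id)
        return left is not None and left < 0


# Usage, assuming `response` came from client.chat.completions.create(...):
#   meter = UsageMeter(budgets={"axis-claims-bot": 15_000})  # illustrative budget
#   meter.record("axis-claims-bot", response.usage.total_tokens)
```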

prompt_retention       30 days → 0 days
completion_retention   30 days
metadata_retention     90 days
training_opt_in        false

Configurable retention

Choose how long prompts and completions are stored. Default: 30 days. Configurable down to zero. Your data is never used for training.
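To make the retention settings concrete, here is a small sketch that models them as a validated value object and computes when a stored item would expire. The field names mirror the settings shown above with a `_days` suffix added, the 90-day metadata default is taken from the panel, and the class itself is hypothetical. How the platform actually accepts these settings is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetentionPolicy:
    """Retention windows in days; 0 means the item is not retained."""

    prompt_retention_days: int = 30
    completion_retention_days: int = 30
    metadata_retention_days: int = 90

    def __post_init__(self) -> None:
        # Reject negative windows early instead of producing expiries in the past.
        for name, value in vars(self).items():
            if value < 0:
                raise ValueError(f"{name} must be >= 0, got {value}")

    def expires_at(self, kind: str, stored_at: datetime) -> datetime:
        """Return the expiry time for 'prompt', 'completion', or 'metadata'."""
        days = {
            "prompt": self.prompt_retention_days,
            "completion": self.completion_retention_days,
            "metadata": self.metadata_retention_days,
        }[kind]
        return stored_at + timedelta(days=days)


# Zero prompt retention: a prompt stored now expires immediately.
policy = RetentionPolicy(prompt_retention_days=0)
now = datetime.now(timezone.utc)
assert policy.expires_at("prompt", now) == now
```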

10:42:15  POST  /v1/chat        200    812 tok
10:42:11  POST  /v1/chat        200  1,247 tok
10:42:03  POST  /v1/embeddings  200    256 tok
10:41:58  POST  /v1/chat        429      — tok

Audit logs

Every API call is logged with timestamp, model, token count, and API key ID. Logs are exportable for compliance reviews.
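The export format is not described here, so the following is only a sketch of what a compliance-review workflow might do with such records: write them to CSV and total the tokens. The column names and the in-memory record shape are assumptions based on the sample entries above. A request rejected with 429 has no token count, represented as `None`.

```python
import csv
import io
from typing import Any, Dict, List

FIELDS = ["time", "method", "path", "status", "tokens"]  # assumed column names

# Records transcribed from the sample log entries above.
SAMPLE: List[Dict[str, Any]] = [
    {"time": "10:42:15", "method": "POST", "path": "/v1/chat", "status": 200, "tokens": 812},
    {"time": "10:42:11", "method": "POST", "path": "/v1/chat", "status": 200, "tokens": 1247},
    {"time": "10:42:03", "method": "POST", "path": "/v1/embeddings", "status": 200, "tokens": 256},
    {"time": "10:41:58", "method": "POST", "path": "/v1/chat", "status": 429, "tokens": None},
]


def to_csv(rows: List[Dict[str, Any]]) -> str:
    """Serialize audit records to CSV text, leaving missing values blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({name: "" if row.get(name) is None else row[name] for name in FIELDS})
    return buffer.getvalue()


def total_tokens(rows: List[Dict[str, Any]]) -> int:
    """Sum token counts, treating rejected requests as zero tokens."""
    return sum(row["tokens"] or 0 for row in rows)


print(to_csv(SAMPLE))
print(total_tokens(SAMPLE))
```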

Model tiers

Choose the right tradeoff between cost, latency, and capability.

Economy

Smaller models, lowest latency. Best for high-volume tasks, prototyping, and cost-sensitive workloads.

Qwen3-8B, Krutrim-2 12B

Default

Premium

Largest models, highest quality. Complex reasoning, enterprise use cases, and frontier capabilities.

DeepSeek-R1, Llama 3.3-70B, Sarvam-105B

Migration

If you use the OpenAI SDK today, migration means pointing the client at a new base URL and swapping in your new API key. Billing is in INR with GST invoices, so there is no forex markup or currency conversion.

migration.py

from openai import OpenAI

# Point the client at the new base URL and key; everything else stays the same
client = OpenAI(
    base_url="https://api.indicstack.ai/v1",
    api_key="isk_your_api_key"
)

response = client.chat.completions.create(
    model="sarvam-30b",
    messages=[
        {"role": "user", "content": "Explain DPDP Act in simple terms"}
    ]
)

print(response.choices[0].message.content)

Get started

We review requests based on pilot fit and rollout capacity.

Request Early Access
