IndicStack
IndicStack Consultancy Services LLP — Built for Indian AI builders.

Layer 2: Managed Inference

Managed LLM API

Production-grade inference for curated open-weight and Indic-native models. India-hosted. OpenAI-compatible.


Built for production

# Before
base_url="https://api.openai.com/v1"
# After
base_url="https://api.indicstack.ai/v1"
chat.completions · embeddings · models

OpenAI-compatible

Drop-in replacement for the OpenAI SDK. Change your base URL and keep your existing workflow.

Client            Model        Tokens
axis-claims-bot   sarvam-30b   12,847
hdfc-kyc-assist   qwen3-32b     8,231
navi-support      qwen3-8b      3,104

Client-level metering

Track token usage per client, project, or team. Set budgets and rate limits per API key.
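
As a rough illustration of per-client metering, the sketch below aggregates token counts on the caller's side and compares them with a budget. It assumes each response exposes an OpenAI-style `usage.total_tokens` value. The `UsageMeter` class and the budget figures are hypothetical and are not part of the IndicStack API; the platform's own metering happens server-side.

```python
from collections import defaultdict


class UsageMeter:
    """Hypothetical in-process token meter keyed by client label."""

    def __init__(self, budgets):
        # budgets: mapping of client label -> maximum tokens allowed
        self.budgets = dict(budgets)
        self.used = defaultdict(int)

    def record(self, client, total_tokens):
        """Add the tokens from one response (e.g. response.usage.total_tokens)."""
        self.used[client] += total_tokens

    def remaining(self, client):
        """Tokens left in the client's budget, or None if no budget is set."""
        if client not in self.budgets:
            return None
        return self.budgets[client] - self.used[client]

    def over_budget(self, client):
        """True only when a budget exists and has been exceeded."""
        left = self.remaining(client)
        return left is not None and left < 0


# Budgets are made-up numbers; token counts mirror the table above.
meter = UsageMeter({"axis-claims-bot": 20_000, "navi-support": 3_000})
meter.record("axis-claims-bot", 12_847)
meter.record("navi-support", 3_104)
```

A meter like this is useful for local alerts or dashboards, but a limit that has to be enforced should be set on the API key itself.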

prompt_retention       30 days → 0 days
completion_retention   30 days
metadata_retention     90 days
training_opt_in        false

Configurable retention

Choose how long prompts and completions are stored. Default: 30 days. Configurable down to zero. Your data is never used for training.
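
One way to reason about these settings is to model them as a small policy object. The sketch below is an assumption-laden illustration: the field names mirror the settings shown above, but the `RetentionPolicy` class and its `prompt_expiry` helper are hypothetical, and this page does not document how retention is actually configured.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical local model of the retention settings shown above."""

    prompt_retention_days: int = 30
    completion_retention_days: int = 30
    metadata_retention_days: int = 90
    training_opt_in: bool = False

    def __post_init__(self):
        # Zero is allowed (store nothing); negative values make no sense.
        for name in (
            "prompt_retention_days",
            "completion_retention_days",
            "metadata_retention_days",
        ):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be >= 0")

    def prompt_expiry(self, stored_at: datetime) -> datetime:
        """Moment after which a prompt stored at `stored_at` should be gone."""
        return stored_at + timedelta(days=self.prompt_retention_days)


stored = datetime(2025, 1, 1, tzinfo=timezone.utc)
default_policy = RetentionPolicy()
zero_policy = RetentionPolicy(prompt_retention_days=0)
```

With zero-day retention the expiry equals the storage time, which is the "nothing is kept" case.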

10:42:15  POST  /v1/chat        200    812 tok
10:42:11  POST  /v1/chat        200  1,247 tok
10:42:03  POST  /v1/embeddings  200    256 tok
10:41:58  POST  /v1/chat        429      — tok

Audit logs

Every API call is logged with its timestamp, model, token count, and API key ID. Logs are exportable for compliance reviews.
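
For a compliance review, exported logs usually need to be summarized. The sketch below assumes a simple whitespace-separated export of the form `time method path status tokens`, modeled on the sample entries above. The real export format is not documented on this page, so `AuditEntry`, `parse_line`, and `summarize` are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholders that may stand in for a missing token count (hyphen or U+2014).
MISSING = {"-", "\u2014"}


@dataclass
class AuditEntry:
    time: str
    method: str
    path: str
    status: int
    tokens: Optional[int]


def parse_line(line: str) -> AuditEntry:
    """Parse one line of the assumed 'time method path status tokens' format."""
    time, method, path, status, tokens = line.split()[:5]
    count = None if tokens in MISSING else int(tokens.replace(",", ""))
    return AuditEntry(time, method, path, int(status), count)


def summarize(lines):
    """Return (tokens on 2xx calls, number of 429 responses)."""
    entries = [parse_line(raw) for raw in lines if raw.strip()]
    successful_tokens = sum(e.tokens or 0 for e in entries if 200 <= e.status < 300)
    throttled = sum(1 for e in entries if e.status == 429)
    return successful_tokens, throttled


sample = [
    "10:42:15 POST /v1/chat 200 812",
    "10:42:11 POST /v1/chat 200 1,247",
    "10:42:03 POST /v1/embeddings 200 256",
    "10:41:58 POST /v1/chat 429 -",
]
successful_tokens, throttled = summarize(sample)
```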

Model tiers

Choose the right tradeoff between cost, latency, and capability.

Economy

Smaller models, lowest latency. Best for high-volume tasks, prototyping, and cost-sensitive workloads.

Qwen3-8B, Krutrim-2 12B

Default

Most popular

Balanced performance and cost. Indic-native and multilingual models for most production workloads.

Sarvam-30B, Qwen3-32B

Premium

Largest models, highest quality. Complex reasoning, enterprise use cases, and frontier capabilities.

DeepSeek-R1, Llama 3.3-70B, Sarvam-105B
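
The tier descriptions above can be turned into a simple rule of thumb. The sketch below is illustrative only: `TIERS` uses the display names listed on this page (the identifiers accepted by the API may differ), and `choose_tier` and `tier_of` are hypothetical helpers, not an official selection algorithm.

```python
# Tier contents as listed on this page (display names; API ids may differ).
TIERS = {
    "economy": ["Qwen3-8B", "Krutrim-2 12B"],
    "default": ["Sarvam-30B", "Qwen3-32B"],
    "premium": ["DeepSeek-R1", "Llama 3.3-70B", "Sarvam-105B"],
}


def choose_tier(complex_reasoning: bool = False, cost_sensitive: bool = False) -> str:
    """Hypothetical rule of thumb derived from the tier descriptions."""
    if complex_reasoning:
        return "premium"   # largest models, highest quality
    if cost_sensitive:
        return "economy"   # smaller models, lowest latency
    return "default"       # balanced performance and cost


def tier_of(model: str):
    """Look up a display name's tier, case-insensitively; None if unlisted."""
    wanted = model.lower()
    for tier, models in TIERS.items():
        if wanted in (m.lower() for m in models):
            return tier
    return None
```

Here capability takes priority over cost when both flags are set; a team with strict budgets might reverse that order.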

Migration

If you use the OpenAI SDK today, migration means pointing base_url at IndicStack and swapping in your IndicStack API key. Pay in INR with proper GST invoices: no forex markups, no currency conversions.

migration.py
from openai import OpenAI

# Change the base URL and API key; everything else stays the same
client = OpenAI(
    base_url="https://api.indicstack.ai/v1",
    api_key="isk_your_api_key"
)

response = client.chat.completions.create(
    model="sarvam-30b",
    messages=[
        {"role": "user", "content": "Explain DPDP Act in simple terms"}
    ]
)

print(response.choices[0].message.content)

Get started

We review requests based on pilot fit and rollout capacity.

Request Early Access