
Mira. Built for speed.

Sub-400ms time to first token, high-volume throughput, and pricing that works at scale. Mira is the model for the moments where latency matters more than depth.

Try Mira · API reference →

128K context · 4K max output · Text only · 26+ languages · <400ms first token · ₹13/1M input · ₹68/1M output

Capabilities

Mira is built for scale.

Real-time chat

Sub-400ms to first token means Mira feels instant. Build chatbots, support agents, and conversational interfaces where any perceptible lag breaks the experience.
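One way to check a first-token budget like this is to time the wait for the first chunk of a streamed response. The sketch below is illustrative only: `time_to_first_token` and `fake_stream` are hypothetical names, and the generator stands in for whatever streaming client you use, since this page does not describe the Mira API itself.

```python
import time
from typing import Iterable, Iterator, Optional, Tuple


def time_to_first_token(stream: Iterable[str]) -> Tuple[Optional[str], float]:
    """Return the first chunk of a stream and the seconds spent waiting for it."""
    start = time.perf_counter()
    first = next(iter(stream), None)  # blocks until the first chunk arrives
    return first, time.perf_counter() - start


def fake_stream() -> Iterator[str]:
    # Stand-in for a streaming response (assumption: not a real Mira client).
    time.sleep(0.05)  # simulated delay before the first chunk
    yield "Hello"
    yield " world"


first, elapsed = time_to_first_token(fake_stream())
within_budget = elapsed < 0.4  # the 400 ms figure quoted on this page
print(f"first chunk {first!r} after {elapsed * 1000:.0f} ms, within budget: {within_budget}")
```

With a real client, the request would need to be issued inside the timed region so that network and queueing time are counted too.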

High-volume classification

Tagging, routing, moderation, sentiment, intent: Mira handles the small decisions that happen millions of times a day. At ₹13 per million input tokens, running Mira at scale stays affordable.

Autocomplete and suggestion

Fast enough to live inside an editor or search box, Mira powers the inline AI experiences where response time is the product.

Pricing

Priced to run at scale.

Input

₹13 per 1M tokens

For prompts at real-time-chat volume. $0.15 per 1M in USD.

Output

₹68 per 1M tokens

For generated responses, at any volume. $0.80 per 1M in USD.

At these prices, a million input tokens costs about as much as a cup of chai.
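To see what the listed rates mean for a particular workload, the arithmetic is simple to script. The sketch below uses only the two rupee prices from this page; the function name and the workload numbers (one million classification requests, 200 input tokens and 10 output tokens each) are hypothetical and chosen purely for illustration.

```python
# Listed Mira prices, in rupees per 1M tokens.
INPUT_INR_PER_M = 13
OUTPUT_INR_PER_M = 68


def estimate_cost_inr(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in rupees for a given number of input and output tokens."""
    total = input_tokens * INPUT_INR_PER_M + output_tokens * OUTPUT_INR_PER_M
    return total / 1_000_000


# Hypothetical workload: 1M requests, 200 input and 10 output tokens each.
requests = 1_000_000
cost = estimate_cost_inr(requests * 200, requests * 10)
print(f"₹{cost:,.2f}")  # ₹3,280.00
```

Output tokens cost roughly five times as much as input tokens, so short, structured outputs keep high-volume workloads cheapest.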

Choose well

When to pick Mira.

Pick Mira when speed and cost are the product: real-time chat, classification, autocomplete, anything that happens thousands of times per minute.

Pick Vaani when you need reasoning quality or vision, and the workload isn't latency-critical. Most interactive apps fit Mira, not Vaani.

Pick Kavi when the task is rare, hard, and accuracy-critical.
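The guidance above can be sketched as a small routing rule. This is one possible reading, not an official selector: the function and its flags are hypothetical, the returned strings are display names rather than confirmed API identifiers, and the precedence (Kavi first, then vision, then reasoning) is an assumption.

```python
def pick_model(
    *,
    latency_critical: bool,
    needs_vision: bool = False,
    needs_reasoning: bool = False,
    rare_hard_accuracy_critical: bool = False,
) -> str:
    """Map coarse workload traits to a model name, following the guidance on this page."""
    if rare_hard_accuracy_critical:
        return "Kavi"
    if needs_vision:
        # Mira is listed as text only, so vision work goes to Vaani (assumed precedence).
        return "Vaani"
    if needs_reasoning and not latency_critical:
        return "Vaani"
    # Default: speed and cost dominate.
    return "Mira"


print(pick_model(latency_critical=True))                         # Mira
print(pick_model(latency_critical=False, needs_reasoning=True))  # Vaani
print(pick_model(latency_critical=False, rare_hard_accuracy_critical=True))  # Kavi
```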

Built for volume.

When latency matters, Mira answers first.
