Complete LLM API Pricing Table 2026
All prices are per 1 million tokens (input / output):
| Model | Provider | Input / 1M | Output / 1M | Context | Best For |
|---|---|---|---|---|---|
| Gemini 3 Flash-Lite | Google | $0.05 | $0.20 | 1M tokens | Highest volume, cheapest |
| Gemini 3 Flash | Google | $0.075 | $0.30 | 1M tokens | Balanced budget option |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K tokens | Cost-effective general AI |
| Claude 3.5 Haiku | Anthropic | $0.80 | $4.00 | 200K tokens | Fast, structured tasks |
| Gemini 3 Pro | Google | $0.35 | $1.05 | 1M tokens | Mid-range capability |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K tokens | General purpose, coding |
| Claude 4 Sonnet | Anthropic | $3.00 | $15.00 | 200K tokens | Long docs, analysis, writing |
| GPT-5 | OpenAI | $10.00 | $40.00 | 128K tokens | Maximum capability |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 64K tokens | Open-weight alternative |
| DeepSeek R1 | DeepSeek | $0.55 | $2.20 | 64K tokens | Reasoning model |
How to Use This Calculator
- Enter your input tokens: Typical prompt + context length
- Enter your output tokens: Expected response length
- Read the comparison: See costs ranked from cheapest to most expensive
- Note the savings: Compare cheapest vs most expensive option
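The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the prices are copied from the pricing table above, and the names `PRICES`, `request_cost`, and `rank_models` are hypothetical.

```python
# Prices in USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    "Gemini 3 Flash-Lite": (0.05, 0.20),
    "Gemini 3 Flash": (0.075, 0.30),
    "GPT-4o mini": (0.15, 0.60),
    "DeepSeek V3": (0.27, 1.10),
    "Gemini 3 Pro": (0.35, 1.05),
    "DeepSeek R1": (0.55, 2.20),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "GPT-4o": (2.50, 10.00),
    "Claude 4 Sonnet": (3.00, 15.00),
    "GPT-5": (10.00, 40.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_models(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: request_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    ranked = rank_models(input_tokens=300, output_tokens=150)
    for model, cost in ranked:
        print(f"{model:<22} ${cost:.6f} per request")
    (cheap_name, cheap_cost), (top_name, top_cost) = ranked[0], ranked[-1]
    print(f"{cheap_name} costs {top_cost / cheap_cost:.0f}x less than {top_name}")
```

Keeping prices in one dictionary means a price change only needs to be made in one place.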
Real-World Cost Scenarios
Scenario 1: Chatbot (100K requests/day)
Request: 300 input + 150 output tokens
Daily volume: 100,000 requests
| Model | Daily Cost | Monthly Cost |
|---|---|---|
| Gemini 3 Flash-Lite | $4.50 | $135 |
| GPT-4o mini | $13.50 | $405 |
| Claude 4 Sonnet | $315.00 | $9,450 |
| GPT-5 | $900.00 | $27,000 |
Monthly costs assume 30 days. Gemini 3 Flash-Lite saves $26,865/month vs GPT-5 for the same volume.
Scenario 2: Document Processing (10,000 documents/day)
Request: 5,000 input + 1,000 output tokens per document
| Model | Per Document | Daily Cost | Monthly Cost |
|---|---|---|---|
| Gemini 3 Flash-Lite | $0.00045 | $4.50 | $135 |
| GPT-4o mini | $0.00135 | $13.50 | $405 |
| Claude 4 Sonnet | $0.03000 | $300 | $9,000 |
At this scale, choosing Gemini 3 Flash-Lite over Claude 4 Sonnet saves $8,865/month.
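To check figures like these yourself, the per-document, daily, and monthly costs can be derived from the per-1M-token prices. The sketch below assumes a 30-day month. The function name `project_costs` and its parameters are illustrative, not part of any provider's API.

```python
def project_costs(input_tokens, output_tokens, requests_per_day,
                  input_price, output_price, days_per_month=30):
    """Return (per_request, daily, monthly) cost in USD.

    input_price and output_price are USD per 1M tokens.
    days_per_month=30 is an assumption, not a billing rule.
    """
    per_request = (input_tokens * input_price
                   + output_tokens * output_price) / 1_000_000
    daily = per_request * requests_per_day
    monthly = daily * days_per_month
    return per_request, daily, monthly


if __name__ == "__main__":
    # Claude 4 Sonnet ($3.00 / $15.00) on the document-processing scenario.
    per_doc, daily, monthly = project_costs(
        input_tokens=5_000, output_tokens=1_000, requests_per_day=10_000,
        input_price=3.00, output_price=15.00,
    )
    print(f"per document: ${per_doc:.5f}, daily: ${daily:,.2f}, monthly: ${monthly:,.2f}")
```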
How to Choose the Right Model
- Maximum cost savings: Use Gemini 3 Flash-Lite or GPT-4o mini for 90% of tasks
- Long documents: Claude 4 Sonnet excels with 200K context and superior document understanding
- Code generation: GPT-4o and GPT-5 have the strongest code performance
- Complex reasoning: Claude 4 Opus or o3 for multi-step problem solving
- Privacy/compliance: Self-hosted DeepSeek V3 for data-sensitive environments
- Hybrid approach: Route by task complexity — cheap models for simple queries, premium for complex ones
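The hybrid approach can be illustrated with a small router. This is only a sketch under loud assumptions: the keyword list, the 200-word threshold, and the choice of cheap and premium models are all hypothetical placeholders. A production router would more likely use a trained classifier or signals from your own traffic.

```python
CHEAP_MODEL = "Gemini 3 Flash-Lite"    # assumed default for simple queries
PREMIUM_MODEL = "Claude 4 Sonnet"      # assumed fallback for complex queries

# Hypothetical phrases that suggest a multi-step or analytical task.
COMPLEX_HINTS = ("analyze", "compare", "refactor", "prove", "step by step")
LONG_PROMPT_WORDS = 200                # hypothetical length threshold


def estimate_complexity(prompt):
    """Classify a prompt as 'simple' or 'complex' with a crude heuristic."""
    text = prompt.lower()
    is_long = len(text.split()) > LONG_PROMPT_WORDS
    has_hint = any(hint in text for hint in COMPLEX_HINTS)
    return "complex" if is_long or has_hint else "simple"


def pick_model(prompt):
    """Route simple prompts to the cheap model and complex ones to the premium model."""
    if estimate_complexity(prompt) == "complex":
        return PREMIUM_MODEL
    return CHEAP_MODEL


if __name__ == "__main__":
    for prompt in ("What are your opening hours?",
                   "Analyze this contract and compare the two liability clauses."):
        print(f"{pick_model(prompt):<20} <- {prompt}")
```

Because routing mistakes cost either money or quality, it helps to log which model handled each request and review a sample regularly.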
Frequently Asked Questions
Is Gemini really cheaper than GPT-4o?
Yes. Gemini 3 Flash-Lite costs $0.05/$0.20 per 1M tokens vs GPT-4o's $2.50/$10.00, which is 50x cheaper on both input and output. Gemini 3 Flash at $0.075/$0.30 is about 33x cheaper than GPT-4o. The main trade-off is that Google's models have different capability profiles, so test them on your own tasks first, though Flash handles most general tasks well.
What about DeepSeek V3?
DeepSeek V3 at $0.27/$1.10 per 1M tokens is an excellent open-weight option with strong performance. DeepSeek R1 ($0.55/$2.20) is purpose-built for reasoning tasks. Both are available through the DeepSeek API, Groq, and other providers at competitive rates.
How much can I save by switching models?
Switching from GPT-5 ($50 for 1M input plus 1M output tokens) to Gemini 3 Flash-Lite ($0.25 for the same volume) saves 99.5% per token. For a workload of 1B input and 1B output tokens per month, that's the difference between $50,000 and $250. However, always validate output quality with the cheaper model before switching production workloads.
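The savings calculation can be reproduced for any pair of models and any token mix. The sketch below uses the GPT-5 and Gemini 3 Flash-Lite prices from the pricing table. The helper names `monthly_cost` and `savings_percent` are illustrative.

```python
def monthly_cost(input_tokens, output_tokens, input_price, output_price):
    """Return USD cost for a month's tokens; prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def savings_percent(expensive_cost, cheap_cost):
    """Return the percentage saved by paying cheap_cost instead of expensive_cost."""
    if expensive_cost <= 0:
        raise ValueError("expensive_cost must be positive")
    return (1 - cheap_cost / expensive_cost) * 100


if __name__ == "__main__":
    tokens_in = tokens_out = 1_000_000_000  # 1B input and 1B output tokens
    gpt5 = monthly_cost(tokens_in, tokens_out, 10.00, 40.00)
    flash_lite = monthly_cost(tokens_in, tokens_out, 0.05, 0.20)
    print(f"GPT-5: ${gpt5:,.2f}  Gemini 3 Flash-Lite: ${flash_lite:,.2f}")
    print(f"Savings: {savings_percent(gpt5, flash_lite):.1f}%")
```

Real workloads rarely split evenly between input and output, so plugging in your own token counts gives a more realistic estimate than a single combined figure.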