LLM API Cost Comparison 2026

Compare token costs across GPT-4o, Claude 3.5/4, Gemini 2.0/3, DeepSeek, and more. Side-by-side pricing for every major LLM provider.

Last updated: May 2026 · Pricing from OpenAI, Anthropic, Google, DeepSeek

Enter your typical request size to compare costs across all models.

Full Model Comparison

Sorted by total cost per request
🥇 Gemini 3 Flash-Lite $0.000225
🥈 Gemini 3 Flash $0.000300
🥉 GPT-4o mini $0.000435
4. DeepSeek V3 $0.000735
5. DeepSeek R1 $0.001925
6. Claude 3.5 Haiku $0.003200
7. Gemini 3 Pro $0.003850
8. Gemini 3 Ultra $0.004333
9. GPT-4o $0.007500
10. Claude 4 Sonnet $0.010500
11. GPT-5 mini $0.010875
12. GPT-5 $0.030000
Cheapest model: Gemini 3 Flash-Lite
Savings vs. most expensive: 99.3%

Complete LLM API Pricing Table 2026

All prices are per 1 million tokens (input / output):

Model | Provider | Input / 1M | Output / 1M | Context | Best For
Gemini 3 Flash-Lite | Google | $0.05 | $0.20 | 1M tokens | Highest volume, cheapest
Gemini 3 Flash | Google | $0.075 | $0.30 | 1M tokens | Balanced budget option
GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K tokens | Cost-effective general AI
Claude 3.5 Haiku | Anthropic | $0.80 | $4.00 | 200K tokens | Fast, structured tasks
Gemini 3 Pro | Google | $0.35 | $1.05 | 1M tokens | Mid-range capability
GPT-4o | OpenAI | $2.50 | $10.00 | 128K tokens | General purpose, coding
Claude 4 Sonnet | Anthropic | $3.00 | $15.00 | 200K tokens | Long docs, analysis, writing
GPT-5 | OpenAI | $10.00 | $40.00 | 128K tokens | Maximum capability
DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 64K tokens | Open-weight alternative
DeepSeek R1 | DeepSeek | $0.55 | $2.20 | 64K tokens | Reasoning model

How to Use This Calculator

  1. Enter your input tokens: Typical prompt + context length
  2. Enter your output tokens: Expected response length
  3. Read the comparison: See costs ranked from cheapest to most expensive
  4. Note the savings: Compare cheapest vs most expensive option
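
The arithmetic behind these steps can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: it assumes the per-million prices from the pricing table above (only a subset of models is included), and the names `PRICES`, `request_cost`, and `rank_models` are made up for this example.

```python
# Per-million-token prices (input, output) in USD, copied from the pricing table.
# Only a subset of models is shown; extend the dict as needed.
PRICES = {
    "Gemini 3 Flash-Lite": (0.05, 0.20),
    "GPT-4o mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 4 Sonnet": (3.00, 15.00),
    "GPT-5": (10.00, 40.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_models(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = [(m, request_cost(m, input_tokens, output_tokens)) for m in PRICES]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Arbitrary example request size: 2,000 input tokens, 500 output tokens.
    for model, cost in rank_models(input_tokens=2000, output_tokens=500):
        print(f"{model:<22} ${cost:.6f}")
```

Because the cost is linear in token counts, the ranking only changes between models whose input-to-output price ratios differ.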

Real-World Cost Scenarios

Scenario 1: Chatbot (100K daily users)

Request: 300 input + 150 output tokens

Daily volume: 100,000 requests

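Model | Daily Cost | Monthly Cost
Gemini 3 Flash-Lite | $4.50 | $135
GPT-4o mini | $13.50 | $405
Claude 4 Sonnet | $315.00 | $9,450
GPT-5 | $900.00 | $27,000

Costs are computed from the per-1M prices in the pricing table; monthly figures assume 30 days. Gemini 3 Flash-Lite saves $26,865/month vs GPT-5 for the same volume.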
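
A scenario like this one can be reproduced with a short helper. The sketch below is illustrative only: `workload_cost` is a hypothetical function name, prices are passed in per 1M tokens as listed in the pricing table, and a 30-day month is assumed.

```python
def workload_cost(input_price, output_price, input_tokens, output_tokens,
                  requests_per_day, days_per_month=30):
    """Return (daily, monthly) USD cost; prices are per 1M tokens."""
    per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    daily = per_request * requests_per_day
    return daily, daily * days_per_month


# Chatbot scenario: 300 input + 150 output tokens, 100,000 requests per day,
# priced at GPT-5's table rates of $10.00 / $40.00 per 1M tokens.
daily, monthly = workload_cost(10.00, 40.00, 300, 150, 100_000)
print(f"daily=${daily:,.2f} monthly=${monthly:,.2f}")
```

Swapping in another model's input and output prices gives that model's row for the same traffic.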

Scenario 2: Document Processing (10,000 documents/day)

Request: 5,000 input + 1,000 output tokens per document

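Model | Per Document | Daily Cost | Monthly Cost
Gemini 3 Flash-Lite | $0.00045 | $4.50 | $135
GPT-4o mini | $0.00135 | $13.50 | $405
Claude 4 Sonnet | $0.03000 | $300.00 | $9,000

At this scale, choosing Gemini 3 Flash-Lite over Claude 4 Sonnet saves $8,865/month (30-day month, prices from the pricing table).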

How to Choose the Right Model

  • Maximum cost savings: Use Gemini 3 Flash-Lite or GPT-4o mini for routine, high-volume tasks
  • Long documents: Claude 4 Sonnet excels with 200K context and superior document understanding
  • Code generation: GPT-4o and GPT-5 have the strongest code performance
  • Complex reasoning: Claude 4 Opus or o3 for multi-step problem solving
  • Privacy/compliance: Self-hosted DeepSeek V3 for data-sensitive environments
  • Hybrid approach: Route by task complexity, using cheap models for simple queries and premium models for complex ones
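
The hybrid approach in the last bullet could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Route` tiers, the thresholds, and the toy `estimate_complexity` heuristic are hypothetical and would need tuning against real traffic and quality checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str
    max_complexity: float  # highest complexity score this tier should handle


# Hypothetical tiers, cheapest first; thresholds are placeholders to tune.
ROUTES = [
    Route("Gemini 3 Flash-Lite", 0.2),
    Route("GPT-4o mini", 0.45),
    Route("Claude 4 Sonnet", 1.0),
]

# Hypothetical keywords that hint at multi-step or analytical work.
HARD_KEYWORDS = ("prove", "analyze", "refactor", "step by step")


def estimate_complexity(prompt: str) -> float:
    """Toy heuristic: longer prompts and reasoning keywords score higher (0.0 to 1.0)."""
    score = min(len(prompt) / 4000, 0.5)
    if any(word in prompt.lower() for word in HARD_KEYWORDS):
        score += 0.5
    return min(score, 1.0)


def pick_model(prompt: str) -> str:
    """Return the cheapest model whose tier covers the estimated complexity."""
    complexity = estimate_complexity(prompt)
    for route in ROUTES:
        if complexity <= route.max_complexity:
            return route.model
    return ROUTES[-1].model


print(pick_model("What time is it?"))
print(pick_model("Analyze this contract step by step"))
```

In practice the complexity estimate might come from a small classifier or from the cheap model itself; a keyword heuristic is used here only to keep the example self-contained.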

Frequently Asked Questions

Is Gemini really cheaper than GPT-4o?

Yes. Gemini 3 Flash-Lite costs $0.05/$0.20 per 1M tokens vs GPT-4o's $2.50/$10.00, which is 50x cheaper on both input and output. Gemini 3 Flash at $0.075/$0.30 is about 33x cheaper than GPT-4o. The main trade-off is capability: Google's models have a somewhat different capability profile, though Flash handles most general tasks well.

What about DeepSeek V3?

DeepSeek V3 at $0.27/$1.10 per 1M tokens is an excellent open-weight option with strong performance. DeepSeek R1 ($0.55/$2.20) is purpose-built for reasoning tasks. Both are available through the DeepSeek API, Groq, and other providers at competitive rates.

How much can I save by switching models?

Switching from GPT-5 ($50 for 1M input plus 1M output tokens) to Gemini 3 Flash-Lite ($0.25 for the same volume) saves 99.5% per token. For a workload of 1B input and 1B output tokens per month, that's the difference between $50,000 and $250. However, always validate output quality with the cheaper model before switching production workloads.
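
The savings percentage can be checked with a one-line formula. The helper below is a small sketch (the name `savings_percent` is invented for this example), using the combined input and output prices per 1M tokens from the pricing table.

```python
def savings_percent(expensive_cost: float, cheap_cost: float) -> float:
    """Return the percentage saved by moving from expensive_cost to cheap_cost."""
    if expensive_cost <= 0:
        raise ValueError("expensive_cost must be positive")
    return (1 - cheap_cost / expensive_cost) * 100


# Combined price for 1M input + 1M output tokens, from the pricing table.
gpt5_total = 10.00 + 40.00        # GPT-5
flash_lite_total = 0.05 + 0.20    # Gemini 3 Flash-Lite
print(f"{savings_percent(gpt5_total, flash_lite_total):.1f}% saved")
```

The same function works on per-request or monthly totals, as long as both costs cover the same workload.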