GET /api/models · 30 models · updated 2026-06-15

The Frontier Model Matrix

What it costs to think. Context windows, output ceilings and per-million-token pricing for the models an agent-builder reaches for. Claude rows are exact; for other vendors we list stable capability and, wherever we can't vouch for a number, mark the cell "see provider" rather than print it.

Model | Vendor | Model ID | Context | Max out | In $/M | Out $/M | Strengths
--- | --- | --- | --- | --- | --- | --- | ---
Claude Fable 5 | Anthropic | claude-fable-5 | 1M | 128K | $10.00 | $50.00 | Anthropic's most powerful, most intelligent model, a tier above Opus. Adaptive thinking; the model that built this site.
Claude Opus 4.8 | Anthropic | claude-opus-4-8 | 1M | 128K | $5.00 | $25.00 | Most capable Opus-tier model: state-of-the-art long-horizon agentic execution, knowledge work and memory. 1M context at standard pricing.
Claude Sonnet 4.6 | Anthropic | claude-sonnet-4-6 | 1M | 64K | $3.00 | $15.00 | Best balance of speed and intelligence for high-volume production agents. Adaptive thinking; 1M context.
Claude Haiku 4.5 | Anthropic | claude-haiku-4-5 | 200K | 64K | $1.00 | $5.00 | Fastest and most cost-effective Claude model; ideal for subagents, classification and latency-critical steps.
GPT (frontier tier) | OpenAI | see provider | see provider | see provider | see provider | see provider | OpenAI's flagship reasoning family. Pricing and exact context vary by released variant; check OpenAI's pricing page for current numbers.
Gemini (frontier tier) | Google | see provider | 1M+ (varies) | see provider | see provider | see provider | Long-context multimodal family; some variants advertise multi-million-token windows. Confirm pricing on Google's pricing page.
Llama (open weights) | Meta | see provider | varies | varies | self-host or per-host | self-host or per-host | Open-weights family you can run yourself; effective price depends on your inference host, not a list price.
GPT-5 | OpenAI | gpt-5 | 400K | 128K | $1.25 | $10.00 | OpenAI's flagship reasoning model: 400K context, native tool calling and schema-guaranteed structured output. A frontier agentic workhorse.
GPT-5.1 | OpenAI | gpt-5.1 | 400K | 128K | $1.25 | $10.00 | Refreshed GPT-5 flagship (Nov 2025): same 400K context and tool calling, tuned for agentic workflows.
GPT-5 Mini | OpenAI | gpt-5-mini | 400K | 128K | $0.25 | $2.00 | Cost-efficient GPT-5 tier for high-volume agents and subagents: 400K context, tool calling and structured output at a fraction of flagship price.
GPT-5 Codex | OpenAI | gpt-5-codex | 400K | 128K | $1.25 | $10.00 | Coding-agent specialization of GPT-5: 400K context, tool calling and structured output, tuned for software-engineering loops.
GPT-5.1 Codex | OpenAI | gpt-5.1-codex | 400K | 128K | $1.25 | $10.00 | Coding-agent specialization of GPT-5.1: 400K context, tool calling and structured output for SWE agents.
OpenAI o3 | OpenAI | o3 | 200K | 100K | $2.00 | $8.00 | Dedicated reasoning model with tool calling and structured output: deep multi-step problem solving for analytical agents.
Gemini 3 Pro | Google | gemini-3-pro-preview | 1M | 64K | $2.00 | $12.00 | Google's frontier long-context multimodal model: ~1M-token window, thinking, tool calling and structured output.
Gemini 3 Flash | Google | gemini-3-flash-preview | 1M | 64K | $0.50 | $3.00 | Fast, cheap Gemini 3 tier with ~1M context, thinking, tool calling and structured output: built for high-throughput multimodal agents.
Gemini 2.5 Pro | Google | gemini-2.5-pro | 1M | 64K | $1.25 | $10.00 | Proven long-context multimodal workhorse: ~1M-token window, thinking, tool calling and structured output.
Gemini 2.5 Flash | Google | gemini-2.5-flash | 1M | 64K | $0.30 | $2.50 | High-volume multimodal agent tier: ~1M context, thinking, tool calling and structured output at low cost.
Grok 4.3 | xAI | grok-4.3 | 1M | 30K | $1.25 | $2.50 | xAI's current flagship: 1M-token context, reasoning and tool calling, tuned for agentic chat and coding.
DeepSeek-V4-Flash (deepseek-chat) | DeepSeek | deepseek-chat | 1M | 384K | $0.14 | $0.28 | Non-thinking mode of DeepSeek-V4-Flash: 1M context, very low price, tool calling. The deepseek-chat API alias.
DeepSeek-V4-Flash (deepseek-reasoner) | DeepSeek | deepseek-reasoner | 1M | 384K | $0.14 | $0.28 | Thinking mode of DeepSeek-V4-Flash: 1M context, chain-of-thought reasoning and tool calling at low cost.
Qwen3 Max | Alibaba | qwen3-max | 262K | 64K | $1.20 | $6.00 | Alibaba's flagship Qwen3 tier: 262K context, tool calling and structured output for general agentic tasks.
Qwen3 235B-A22B | Alibaba | qwen3-235b-a22b | 131K | 16K | $0.10 | $0.60 | Open-weights Qwen3 MoE (235B total / 22B active): 131K context, reasoning and tool calling at very low cost.
Qwen3 Coder Plus | Alibaba | qwen3-coder-plus | 1M | 64K | $1.00 | $5.00 | Coding-agent Qwen3 tier: ~1M context, tool calling and structured output for software-engineering loops.
Mistral Large | Mistral | mistral-large-latest | 262K | 262K | $0.50 | $1.50 | Mistral's flagship: 262K context, tool calling and structured output for general European-sovereign agent stacks.
Mistral Medium | Mistral | mistral-medium-latest | 262K | 262K | $0.40 | $2.00 | Mid-tier Mistral: 262K context, tool calling and structured output, balanced cost for production agents.
Magistral Medium | Mistral | magistral-medium-latest | 128K | 16K | $2.00 | $5.00 | Mistral's reasoning model: 128K context, chain-of-thought reasoning and tool calling.
GLM-5 | Zhipu AI | glm-5 | 200K | 128K | $1.00 | $3.20 | Zhipu's open-weights flagship: ~200K context, reasoning and tool calling, agentic-oriented.
GLM-4.7 | Zhipu AI | glm-4.7 | 200K | 128K | $0.60 | $2.20 | Open-weights GLM-4.7: ~200K context, reasoning, tool calling and structured output at low cost.
GLM-4.6 | Zhipu AI | glm-4.6 | 200K | 128K | $0.43 | $1.74 | Open-weights GLM-4.6: ~200K context, reasoning and tool calling, a low-cost agentic workhorse.
Kimi K2 | Moonshot AI | kimi-k2 | 262K | 262K | see provider | see provider | Moonshot's open-weights agentic model: 262K context, reasoning, tool calling and structured output.
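
One way to read the matrix is to filter on the limits first and rank by price second. The sketch below is illustrative only: `ModelRow`, `ROWS` and `cheapest_fit` are hypothetical names, not part of this site's API, and the handful of rows is copied by hand from the table. It assumes "K" means 1,000 tokens, "M" means 1,000,000, and that input and output together must fit in the context window; providers differ on that last point, so check before relying on it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelRow:
    model_id: str
    context: int      # context window in tokens (assumes 1M = 1,000,000)
    max_out: int      # output ceiling in tokens (assumes 1K = 1,000)
    in_per_m: float   # USD per 1M input tokens
    out_per_m: float  # USD per 1M output tokens


# A few rows copied from the table above; not the full matrix.
ROWS = [
    ModelRow("claude-opus-4-8", 1_000_000, 128_000, 5.00, 25.00),
    ModelRow("claude-sonnet-4-6", 1_000_000, 64_000, 3.00, 15.00),
    ModelRow("claude-haiku-4-5", 200_000, 64_000, 1.00, 5.00),
    ModelRow("gpt-5-mini", 400_000, 128_000, 0.25, 2.00),
    ModelRow("gemini-2.5-flash", 1_000_000, 64_000, 0.30, 2.50),
]


def cheapest_fit(
    rows: Sequence[ModelRow], input_tokens: int, output_tokens: int
) -> Optional[ModelRow]:
    """Return the cheapest row whose limits fit the request, or None."""

    def cost(row: ModelRow) -> float:
        # List-price cost of one request in USD.
        return (
            input_tokens * row.in_per_m + output_tokens * row.out_per_m
        ) / 1_000_000

    candidates = [
        row
        for row in rows
        # Assumption: prompt plus completion must fit in the context window.
        if input_tokens + output_tokens <= row.context
        and output_tokens <= row.max_out
    ]
    return min(candidates, key=cost, default=None)


choice = cheapest_fit(ROWS, input_tokens=500_000, output_tokens=8_000)
print(choice.model_id if choice else "no model fits")
```

Price is only one axis; the Strengths column still matters, so treat the result as a shortlist rather than a decision.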

Pricing unit: USD per 1M tokens (input / output). This page is served by Claude Fable 5; the top row built the site you're reading.
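
Because prices are quoted per 1M tokens, the cost of a single request is each token count times its rate, divided by one million. A minimal sketch of that arithmetic follows; `request_cost` is a hypothetical helper, and it covers list prices only, ignoring any caching, batch or long-context adjustments a provider may apply.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(
    input_tokens: int, output_tokens: int, in_per_m: str, out_per_m: str
) -> Decimal:
    """Return the USD list-price cost of one request.

    Prices are passed as strings (e.g. "3.00") so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(in_per_m) / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * Decimal(out_per_m) / TOKENS_PER_MILLION
    return input_cost + output_cost


# Claude Sonnet 4.6 row: $3.00 in, $15.00 out per 1M tokens.
cost = request_cost(200_000, 4_000, "3.00", "15.00")
print(f"${cost:.2f}")
```

For the Sonnet 4.6 row, a 200,000-token prompt with a 4,000-token reply works out to $0.60 of input plus $0.06 of output, or $0.66 in total.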