GET /api/models · 30 models · updated 2026-06-15

The Frontier Model Matrix

What it costs to think. Context windows, output ceilings and per-million-token pricing for the models an agent-builder reaches for. Claude rows are exact; other vendors' figures come from provider or aggregator pages, sourced per record and flagged for re-verification — where a price isn't stable the row says see provider.

Model	Model ID	Context	Max out	In $/M	Out $/M	Strengths
Claude Fable 5Anthropic	`claude-fable-5`	1M	128K	$10.00	$50.00	Anthropic's most powerful, most intelligent model — a tier above Opus. Adaptive thinking; the model that built this site.
Claude Opus 4.8Anthropic	`claude-opus-4-8`	1M	128K	$5.00	$25.00	Most capable Opus-tier model: state-of-the-art long-horizon agentic execution, knowledge work and memory. 1M context at standard pricing.
Claude Sonnet 4.6Anthropic	`claude-sonnet-4-6`	1M	64K	$3.00	$15.00	Best balance of speed and intelligence for high-volume production agents. Adaptive thinking; 1M context.
Claude Haiku 4.5Anthropic	`claude-haiku-4-5`	200K	64K	$1.00	$5.00	Fastest and most cost-effective Claude model — ideal for subagents, classification and latency-critical steps.
GPT (frontier tier)OpenAI	`see provider`	see provider	see provider	see provider	see provider	OpenAI's flagship reasoning family. Pricing and exact context vary by released variant — check OpenAI's pricing page for current numbers.
Gemini (frontier tier)Google	`see provider`	1M+ (varies)	see provider	see provider	see provider	Long-context multimodal family; some variants advertise multi-million-token windows. Confirm pricing on Google's pricing page.
Llama (open weights)Meta	`see provider`	varies	varies	self-host or per-host	self-host or per-host	Open-weights family you can run yourself; effective price depends on your inference host, not a list price.
GPT-5OpenAI	`gpt-5`	400K	128K	$1.25	$10.00	OpenAI's flagship reasoning model: 400K context, native tool calling and schema-guaranteed structured output. A frontier agentic workhorse.
GPT-5.1OpenAI	`gpt-5.1`	400K	128K	$1.25	$10.00	Refreshed GPT-5 flagship (Nov 2025): same 400K context and tool calling, tuned for agentic workflows.
GPT-5 MiniOpenAI	`gpt-5-mini`	400K	128K	$0.25	$2.00	Cost-efficient GPT-5 tier for high-volume agents and subagents: 400K context, tool calling and structured output at a fraction of flagship price.
GPT-5 CodexOpenAI	`gpt-5-codex`	400K	128K	$1.25	$10.00	Coding-agent specialization of GPT-5: 400K context, tool calling and structured output, tuned for software-engineering loops.
GPT-5.1 CodexOpenAI	`gpt-5.1-codex`	400K	128K	$1.25	$10.00	Coding-agent specialization of GPT-5.1: 400K context, tool calling and structured output for SWE agents.
OpenAI o3OpenAI	`o3`	200K	100K	$2.00	$8.00	Dedicated reasoning model with tool calling and structured output: deep multi-step problem solving for analytical agents.
Gemini 3 ProGoogle	`gemini-3-pro-preview`	1M	64K	$2.00	$12.00	Google's frontier long-context multimodal model: ~1M-token window, thinking, tool calling and structured output.
Gemini 3 FlashGoogle	`gemini-3-flash-preview`	1M	64K	$0.50	$3.00	Fast, cheap Gemini 3 tier with ~1M context, thinking, tool calling and structured output: built for high-throughput multimodal agents.
Gemini 2.5 ProGoogle	`gemini-2.5-pro`	1M	64K	$1.25	$10.00	Proven long-context multimodal workhorse: ~1M-token window, thinking, tool calling and structured output.
Gemini 2.5 FlashGoogle	`gemini-2.5-flash`	1M	64K	$0.30	$2.50	High-volume multimodal agent tier: ~1M context, thinking, tool calling and structured output at low cost.
Grok 4.3xAI	`grok-4.3`	1M	30K	$1.25	$2.50	xAI's current flagship: 1M-token context, reasoning and tool calling, tuned for agentic chat and coding.
DeepSeek-V4-Flash (deepseek-chat)DeepSeek	`deepseek-chat`	1M	384K	$0.14	$0.28	Non-thinking mode of DeepSeek-V4-Flash: 1M context, very low price, tool calling. The deepseek-chat API alias.
DeepSeek-V4-Flash (deepseek-reasoner)DeepSeek	`deepseek-reasoner`	1M	384K	$0.14	$0.28	Thinking mode of DeepSeek-V4-Flash: 1M context, chain-of-thought reasoning and tool calling at low cost.
Qwen3 MaxAlibaba	`qwen3-max`	262K	64K	$1.20	$6.00	Alibaba's flagship Qwen3 tier: 262K context, tool calling and structured output for general agentic tasks.
Qwen3 235B-A22BAlibaba	`qwen3-235b-a22b`	131K	16K	$0.10	$0.60	Open-weights Qwen3 MoE (235B total / 22B active): 131K context, reasoning and tool calling at very low cost.
Qwen3 Coder PlusAlibaba	`qwen3-coder-plus`	1M	64K	$1.00	$5.00	Coding-agent Qwen3 tier: ~1M context, tool calling and structured output for software-engineering loops.
Mistral LargeMistral	`mistral-large-latest`	262K	262K	$0.50	$1.50	Mistral's flagship: 262K context, tool calling and structured output for general European-sovereign agent stacks.
Mistral MediumMistral	`mistral-medium-latest`	262K	262K	$0.40	$2.00	Mid-tier Mistral: 262K context, tool calling and structured output, balanced cost for production agents.
Magistral MediumMistral	`magistral-medium-latest`	128K	16K	$2.00	$5.00	Mistral's reasoning model: 128K context, chain-of-thought reasoning and tool calling.
GLM-5Zhipu AI	`glm-5`	200K	128K	$1.00	$3.20	Zhipu's open-weights flagship: ~200K context, reasoning and tool calling, agentic-oriented.
GLM-4.7Zhipu AI	`glm-4.7`	200K	128K	$0.60	$2.20	Open-weights GLM-4.7: ~200K context, reasoning, tool calling and structured output at low cost.
GLM-4.6Zhipu AI	`glm-4.6`	200K	128K	$0.43	$1.74	Open-weights GLM-4.6: ~200K context, reasoning and tool calling, a low-cost agentic workhorse.
Kimi K2Moonshot AI	`kimi-k2`	262K	262K	see provider	see provider	Moonshot's open-weights agentic model: 262K context, reasoning, tool calling and structured output.

pricing unit USD per 1M tokens (input / output). This page is served by Claude Fable 5 — the top row built the site you're reading.