Llama (open weights)

Open-weights family you can run yourself; the effective price depends on your inference host, not on a list price.
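
As a rough illustration of why the effective price is host-dependent, the sketch below converts a GPU hourly rate and a sustained throughput into a price per million tokens. Every figure is a hypothetical placeholder, not a measurement for any Llama variant or host, and the function names (`price_per_mtok`, `cost_per_full_window`) are introduced here for illustration only. It also assumes one blended rate for input and output tokens, which real deployments may not match because prefill and decode throughput usually differ.

```python
def price_per_mtok(gpu_hourly_usd: float,
                   tokens_per_sec: float,
                   utilization: float = 1.0) -> float:
    """Effective USD per million tokens for a self-hosted deployment.

    gpu_hourly_usd: what the host charges per hour (hypothetical input).
    tokens_per_sec: sustained aggregate throughput on that hardware.
    utilization:    fraction of each hour spent serving traffic, in (0, 1].
    """
    if gpu_hourly_usd < 0:
        raise ValueError("gpu_hourly_usd must be non-negative")
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = tokens_per_sec * 3600 * utilization
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def cost_per_full_window(usd_per_mtok: float, context_tokens: int) -> float:
    """Cost of processing one full context window at a blended token price."""
    return usd_per_mtok * context_tokens / 1_000_000


if __name__ == "__main__":
    # Made-up inputs: $4.00/hour GPU, 2,000 tokens/s, busy half the time.
    rate = price_per_mtok(gpu_hourly_usd=4.00, tokens_per_sec=2000, utilization=0.5)
    print(f"effective price: ${rate:.2f} per million tokens")
    # 128,000 tokens is an arbitrary example window, not a Llama spec.
    print(f"full window:     ${cost_per_full_window(rate, 128_000):.4f}")
```

With these made-up inputs the rate works out to roughly $1.11 per million tokens. Halving utilization doubles that figure, which is why the record defers pricing to the chosen host instead of quoting a number.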

name: Llama (open weights)
vendor: Meta
model_id: see provider
context_window: varies
max_output: varies
input_per_mtok: self-host or per-host
output_per_mtok: self-host or per-host
strengths: Open-weights family you can run yourself; effective price depends on your inference host, not a list price.
provider: Meta
family: Llama
release_date: — verify-against-primary-at-build ↗ https://www.llama.com/ — this row pins no specific Llama version; record the release date of the pinned version at build.
last_updated: 2026-06-15
open_weights: true source
license: Llama 4 Community License Agreement source
params_total: — verify-against-primary-at-build ↗ https://www.llama.com/ — total parameter count varies by pinned Llama variant; record at build.
params_active: — verify-against-primary-at-build ↗ https://www.llama.com/ — active (MoE) parameter count varies by pinned Llama variant; record at build.
tool_call: — verify-against-primary-at-build ↗ https://www.llama.com/docs/ — Llama 4 supports tool calling, but this row pins no specific version and tool-call behavior is host/template-dependent; confirm per pinned variant + host at build. (Selection-gate attribute — must resolve true for inclusion.)
reasoning: — verify-against-primary-at-build ↗ https://www.llama.com/docs/ — reasoning support varies by pinned Llama variant; confirm at build.
structured_output: — verify-against-primary-at-build ↗ https://www.llama.com/docs/ — structured-output support is host/framework-dependent for open weights; confirm per pinned variant + host at build.
attachment: — verify-against-primary-at-build ↗ https://www.llama.com/docs/ — multimodal input varies by pinned Llama variant; confirm at build.
temperature: true source
knowledge_cutoff: — verify-against-primary-at-build ↗ https://www.llama.com/ — knowledge cutoff varies by pinned Llama variant; confirm at build.
context_advertised: varies verify-against-primary-at-build ↗ https://www.llama.com/docs/ — context window varies by pinned Llama variant; preserves the live record's deferred value.
context_effective: — verify-against-primary-at-build ↗ No measured effective-context value sourced for a pinned Llama variant.
price_input: self-host or per-host verify-against-primary-at-build ↗ Open weights: effective input price depends on the inference host, not a Meta list price (preserves the live record's discipline).
price_output: self-host or per-host verify-against-primary-at-build ↗ Open weights: effective output price depends on the inference host, not a Meta list price.
price_cache_read: — verify-against-primary-at-build ↗ Prompt-cache pricing is host-defined for self-hosted Llama; confirm per host at build.
price_cache_write: — verify-against-primary-at-build ↗ Prompt-cache pricing is host-defined for self-hosted Llama; confirm per host at build.
cost_per_full_window: — verify-against-primary-at-build ↗ Not a list price for open weights; depends on host + chosen GPU economics.
cost_per_agent_task: — verify-against-primary-at-build ↗ Not a list price for open weights; depends on host + chosen GPU economics.
modalities: — verify-against-primary-at-build ↗ https://www.llama.com/ — modality arrays vary by pinned Llama variant; confirm at build.
gpqa_diamond: — verify-against-primary-at-build ↗ https://artificialanalysis.ai/ — per pinned Llama variant at build.
swe_bench_verified: — verify-against-primary-at-build ↗ https://www.swebench.com/ — per pinned Llama variant at build.
terminal_bench: — verify-against-primary-at-build ↗ https://www.tbench.ai/ — per pinned Llama variant at build.
tau2_bench: — verify-against-primary-at-build ↗ Primary τ²-Bench leaderboard — per pinned Llama variant at build.
bfcl_tool_use: — verify-against-primary-at-build ↗ https://gorilla.cs.berkeley.edu/leaderboard.html — per pinned Llama variant at build.
aa_index: — verify-against-primary-at-build ↗ https://artificialanalysis.ai/ — per pinned Llama variant at build.
lmarena_elo: — verify-against-primary-at-build ↗ https://lmarena.ai/leaderboard — per pinned Llama variant at build.
tokens_per_sec: — verify-against-primary-at-build ↗ Throughput is host/hardware-dependent for self-hosted Llama; confirm per host at build.
ttft: — verify-against-primary-at-build ↗ TTFT is host/hardware-dependent for self-hosted Llama; confirm per host at build.
hallucination_rate: — verify-against-primary-at-build ↗ Per pinned Llama variant at build from a primary source.
agent_readiness_score: — verify-against-primary-at-build ↗ Score withheld: row pins no specific Llama version and most inputs are host-dependent. Pin a variant + host, then compute per /models/agent-readiness-score.
score_confidence: partial
source_url: https://www.llama.com/llama4/license/
source_type: provider_card
last_verified: 2026-06-15
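
Most fields above are deferred with the `verify-against-primary-at-build` marker, and `tool_call` is flagged as a selection gate that must resolve true for inclusion. The following is a minimal sketch of a build-time check, assuming the record is available as plain `key: value` lines like the ones on this page. The parser and the helper names (`parse_record`, `unresolved`, `passes_gate`) are hypothetical and not part of any published tooling for this matrix; treating a value whose first token is `true` as "resolved true" is likewise an assumption.

```python
PLACEHOLDER = "verify-against-primary-at-build"

# Abbreviated sample in the same "key: value" shape as the record above.
# "\u2014" is the dash the record uses for an empty value.
SAMPLE = "\n".join([
    "vendor: Meta",
    "open_weights: true source",
    "tool_call: \u2014 verify-against-primary-at-build",
    "score_confidence: partial",
])


def parse_record(text: str) -> dict[str, str]:
    """Split 'key: value' lines into a dict, skipping lines that do not fit."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        key = key.strip()
        if sep and key and " " not in key:
            fields[key] = value.strip()
    return fields


def unresolved(fields: dict[str, str]) -> list[str]:
    """Names of fields still carrying the deferred-verification marker."""
    return sorted(k for k, v in fields.items() if PLACEHOLDER in v)


def passes_gate(fields: dict[str, str], gate: str = "tool_call") -> bool:
    """Assumed gate rule: the field's first token must be the literal 'true'."""
    tokens = fields.get(gate, "").split()
    return bool(tokens) and tokens[0] == "true"


if __name__ == "__main__":
    record = parse_record(SAMPLE)
    print("unresolved:", unresolved(record))
    print("gate passed:", passes_gate(record))
```

Run against the full record, a check like this would list every deferred field and report the gate as not passed until a specific variant and host are pinned.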

last verified 15 Jun 2026 · by Özden Erdinc
