Llama (open weights)

Open-weights family you can run yourself; effective price depends on your inference host, not a list price.

name
Llama (open weights)
vendor
Meta
model_id
see provider
context_window
varies
max_output
varies
input_per_mtok
self-host or per-host
output_per_mtok
self-host or per-host
strengths
Open-weights family you can run yourself; effective price depends on your inference host, not a list price.
provider
Meta
family
Llama
release_date
— verify-against-primary-at-build ↗ https://www.llama.com/ — this row pins no specific Llama version; record the release date of the pinned version at build.
last_updated
2026-06-15
open_weights
true (source)
license
Llama 4 Community License Agreement (source)
params_total
— verify-against-primary-at-build ↗ https://www.llama.com/ — total parameter count varies by pinned Llama variant; record at build.
params_active
— verify-against-primary-at-build ↗ https://www.llama.com/ — active (MoE) parameter count varies by pinned Llama variant; record at build.
tool_call
— verify-against-primary-at-build ↗ https://www.llama.com/docs/ — Llama 4 supports tool calling, but this row pins no specific version and tool-call behavior is host/template-dependent; confirm per pinned variant + host at build. (Selection-gate attribute — must resolve true for inclusion.)
reasoning
— verify-against-primary-at-build ↗ https://www.llama.com/docs/ — reasoning support varies by pinned Llama variant; confirm at build.
structured_output
— verify-against-primary-at-build ↗ https://www.llama.com/docs/ — structured-output support is host/framework-dependent for open weights; confirm per pinned variant + host at build.
attachment
— verify-against-primary-at-build ↗ https://www.llama.com/docs/ — multimodal input varies by pinned Llama variant; confirm at build.
temperature
true (source)
knowledge_cutoff
— verify-against-primary-at-build ↗ https://www.llama.com/ — knowledge cutoff varies by pinned Llama variant; confirm at build.
context_advertised
varies; verify-against-primary-at-build ↗ https://www.llama.com/docs/ — context window varies by pinned Llama variant; preserves the live record's deferred value.
context_effective
— verify-against-primary-at-build ↗ No measured effective-context value sourced for a pinned Llama variant.
price_input
self-host or per-host; verify-against-primary-at-build ↗ Open weights: effective input price depends on the inference host, not a Meta list price (preserves the live record's discipline).
price_output
self-host or per-host; verify-against-primary-at-build ↗ Open weights: effective output price depends on the inference host, not a Meta list price.
price_cache_read
— verify-against-primary-at-build ↗ Prompt-cache pricing is host-defined for self-hosted Llama; confirm per host at build.
price_cache_write
— verify-against-primary-at-build ↗ Prompt-cache pricing is host-defined for self-hosted Llama; confirm per host at build.
cost_per_full_window
— verify-against-primary-at-build ↗ Not a list price for open weights; depends on host + chosen GPU economics.
cost_per_agent_task
— verify-against-primary-at-build ↗ Not a list price for open weights; depends on host + chosen GPU economics.
modalities
— verify-against-primary-at-build ↗ https://www.llama.com/ — modality arrays vary by pinned Llama variant; confirm at build.
gpqa_diamond
— verify-against-primary-at-build ↗ https://artificialanalysis.ai/ — per pinned Llama variant at build.
swe_bench_verified
— verify-against-primary-at-build ↗ https://www.swebench.com/ — per pinned Llama variant at build.
terminal_bench
— verify-against-primary-at-build ↗ https://www.tbench.ai/ — per pinned Llama variant at build.
tau2_bench
— verify-against-primary-at-build ↗ Primary τ²-Bench leaderboard — per pinned Llama variant at build.
bfcl_tool_use
— verify-against-primary-at-build ↗ https://gorilla.cs.berkeley.edu/leaderboard.html — per pinned Llama variant at build.
aa_index
— verify-against-primary-at-build ↗ https://artificialanalysis.ai/ — per pinned Llama variant at build.
lmarena_elo
— verify-against-primary-at-build ↗ https://lmarena.ai/leaderboard — per pinned Llama variant at build.
tokens_per_sec
— verify-against-primary-at-build ↗ Throughput is host/hardware-dependent for self-hosted Llama; confirm per host at build.
ttft
— verify-against-primary-at-build ↗ TTFT is host/hardware-dependent for self-hosted Llama; confirm per host at build.
hallucination_rate
— verify-against-primary-at-build ↗ Per pinned Llama variant at build from a primary source.
agent_readiness_score
— verify-against-primary-at-build ↗ Score withheld: row pins no specific Llama version and most inputs are host-dependent. Pin a variant + host, then compute per /models/agent-readiness-score.
score_confidence
partial
source_url
https://www.llama.com/llama4/license/
source_type
provider_card
last_verified
2026-06-15
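
Because the price fields above defer to "host + chosen GPU economics", a rough way to estimate an effective self-hosted price is to divide hourly hardware cost by hourly token throughput. The sketch below is illustrative only: the function name `effective_price_per_mtok`, its parameters, and every number in the usage example are hypothetical placeholders, not measured values for any Llama variant or host. It also ignores engineering, storage, and networking costs, and it does not separate input from output tokens.

```python
def effective_price_per_mtok(
    gpu_hourly_usd: float,
    num_gpus: int,
    tokens_per_sec: float,
    utilization: float = 1.0,
) -> float:
    """Estimate USD per million tokens for a self-hosted deployment.

    All inputs are assumptions supplied by the caller:
      gpu_hourly_usd  - rental or amortized cost of one GPU per hour
      num_gpus        - GPUs needed to serve the pinned variant
      tokens_per_sec  - aggregate throughput measured on that host
      utilization     - fraction of each hour spent serving traffic (0 < u <= 1)
    """
    if num_gpus <= 0 or tokens_per_sec <= 0:
        raise ValueError("num_gpus and tokens_per_sec must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    hourly_cost = gpu_hourly_usd * num_gpus
    tokens_per_hour = tokens_per_sec * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Placeholder inputs, NOT benchmarks: one GPU at 2.00 USD/hour,
    # 1000 tokens/sec aggregate throughput, fully utilized.
    price = effective_price_per_mtok(2.0, 1, 1000.0)
    print(f"{price:.4f} USD per million tokens")
```

Halving `utilization` doubles the estimate, which is why a per-host figure recorded at build should state the traffic assumption it was computed under.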
