reference · updated continuously · built for agents

The Agentic Web
Almanac.

The reference an agent comes back to. Five canonical datasets about the world agents live in — who's crawling, which protocols matter, what the models cost, what every term means, and how fast it's all being adopted. Each one is a web page, a JSON endpoint, a markdown file, and a WebMCP tool. The same facts, four ways to read them.

  1. 31 entries

    AI Crawler Registry

    Every AI bot on the web — GPTBot, ClaudeBot, PerplexityBot — with its purpose, robots.txt token, and how to verify it for real.

  2. 28 entries

    Protocol Atlas

    MCP, A2A, x402, AP2, NLWeb, llms.txt and the rest — the protocols of the agentic web, grouped by the layer they work at.

  3. 30 entries

    Model Matrix

    Context windows, output limits and pricing for the frontier models an agent-builder reaches for.

  4. 57 entries

    Lexicon

    Canonical, quotable definitions of the agentic web's vocabulary — from AX to x402.

  5. 21 entries

    State of the Agentic Web

    Adoption data: crawler-traffic share, standard and protocol uptake, and model trends, with every figure tagged as either cited or our own measurement.

GET /api/verify-crawler?ua=…

Is that crawler real?

Paste a User-Agent string. We'll tell you which known AI crawler it claims to be — and, more importantly, how to verify it for real, because a UA string is trivially forged.
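
For agents that would rather call the endpoint than paste into a form, here is a minimal sketch of building the request URL. The page gives only the path and the `ua` parameter, so the host `almanac.example`, the function name `verify_crawler_url`, and the sample User-Agent are placeholders, and nothing is assumed about the shape of the response. The one detail that matters is percent-encoding the User-Agent, since real UA strings contain spaces, slashes, and semicolons.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical host: the page documents only the path, not the origin.
BASE_URL = "https://almanac.example"


def verify_crawler_url(user_agent: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL for /api/verify-crawler with the UA percent-encoded."""
    # urlencode escapes spaces, slashes, and other reserved characters in the value.
    query = urlencode({"ua": user_agent})
    return urljoin(base_url, "/api/verify-crawler") + "?" + query


if __name__ == "__main__":
    # Sample UA string for illustration only; substitute the one from your logs.
    print(verify_crawler_url("Mozilla/5.0 (compatible; GPTBot/1.0)"))
```

Whatever the endpoint returns, it can only report which crawler the string claims to be. The UA alone proves nothing, so treat the answer as a pointer to the verification step rather than as the verification.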