Content Signals

A robots.txt extension that lets a site declare how its content may be used after access — search, ai-input, and ai-train — as machine-readable preferences.

name
Content Signals
full_name
Cloudflare Content Signals Policy
layer
licensing
creator
Cloudflare
status
live (2025)
year
2025
one_liner
A robots.txt extension that lets a site declare how its content may be used after access — search, ai-input, and ai-train — as machine-readable preferences.
spec_url
https://blog.cloudflare.com/content-signals-policy/
snippet
# robots.txt
Content-Signal: search=yes, ai-input=yes, ai-train=no
abbreviation
Content Signals
also_known_as
Content Signals Policy, Content-Signal
canonical_spec_url
https://contentsignals.org
entity_uri
https://blog.cloudflare.com/content-signals-policy/
taxonomy_layer
licensing
sub_layer
content-use-preferences
protocol_type
declaration
central_problem
Lets a site express, in robots.txt, how crawlers may use its content after access — for search, AI input (RAG/grounding), or AI training — as clear machine-readable preferences.
maintainer
Cloudflare (Content Signals Policy; published openly under CC0)
governance_body
vendor (Cloudflare); spec released under CC0
license
CC0 (the policy text is released under a CC0 license for open adoption)
maturity_tag
emerging
current_spec_version
— verify-against-primary-at-build ↗ https://contentsignals.org
spec_date
2025-09-24
launch_date
2025-09-24
last_verified
2026-06-15
transport
robots.txt extension (Content-Signal directive: search / ai-input / ai-train)
core_mechanism
The policy adds a short human-readable block plus a machine-readable Content-Signal line to robots.txt declaring per-use preferences with yes/no values: search (use in a search index), ai-input (use to ground/answer queries, e.g. RAG), and ai-train (use to train models). Cloudflare's managed robots.txt defaults to search=yes, ai-train=no. Signals express preferences, not technical enforcement.
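A minimal sketch of how a crawler might read the machine-readable line described above. The function name `parse_content_signal` and the choice to report an absent signal as `None` (no preference stated) are assumptions made for illustration; they are not taken from the policy text.

```python
from typing import Dict, Optional

# The three signals named in the policy.
KNOWN_SIGNALS = ("search", "ai-input", "ai-train")


def parse_content_signal(line: str) -> Dict[str, Optional[bool]]:
    """Parse one 'Content-Signal: ...' line into {signal: True/False/None}.

    None means the line states no preference for that signal
    (an assumption of this sketch).
    """
    result: Dict[str, Optional[bool]] = {name: None for name in KNOWN_SIGNALS}
    line = line.split("#", 1)[0]  # drop a trailing robots.txt comment
    field, sep, value = line.partition(":")
    if not sep or field.strip().lower() != "content-signal":
        raise ValueError("not a Content-Signal line")
    for pair in value.split(","):
        key, eq, val = pair.partition("=")
        key, val = key.strip().lower(), val.strip().lower()
        if not eq or key not in result or val not in ("yes", "no"):
            continue  # skip unknown or malformed pairs
        result[key] = (val == "yes")
    return result


print(parse_content_signal("Content-Signal: search=yes, ai-input=yes, ai-train=no"))
```

Because signals are preferences rather than enforcement, a parser like this only reports what the site asked for; honoring it is up to the crawler.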
discovery_endpoint
robots.txt Content-Signal directive
settlement_type
adoption_metric
Auto-applied to Cloudflare's managed robots.txt (Cloudflare states 3.8M+ domains) with default Content-Signal: search=yes, ai-train=no
notable_adopters
{"value":"Cloudflare (creator; applied across its managed robots.txt fleet)","source":"https://blog.cloudflare.com/content-signals-policy/"}
relationships
{"predicate":"complements","target":"rsl","note":"Content Signals expresses non-binding use preferences in robots.txt; RSL declares enforceable machine-readable licensing terms. The two are paired Layer-6 declarations."}
{"predicate":"extends","target":"agents-json","note":"Content Signals is a robots.txt extension in the same Layer-1/6 declaration family as the other root-file declarations; it adds an after-access use dimension robots.txt's allow/disallow lacks."}
ideal_use_case
A site that wants to state, in robots.txt, that search use is welcome but AI training is not — without standing up enforcement.
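For the use case above, the directive can be produced from a small preference map. This is a sketch only: `render_content_signal` and `SIGNAL_ORDER` are hypothetical names, and the fixed output order (search, ai-input, ai-train) is a choice of this example rather than a documented requirement.

```python
from typing import Mapping

# Output order used by this sketch (an assumption, not a spec requirement).
SIGNAL_ORDER = ("search", "ai-input", "ai-train")


def render_content_signal(prefs: Mapping[str, bool]) -> str:
    """Render a Content-Signal line from {signal: allowed?} preferences.

    Signals missing from prefs are left out, i.e. no preference is stated.
    """
    unknown = set(prefs) - set(SIGNAL_ORDER)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    pairs = [
        f"{name}={'yes' if prefs[name] else 'no'}"
        for name in SIGNAL_ORDER
        if name in prefs
    ]
    return "Content-Signal: " + ", ".join(pairs)


# Search welcome, AI training not; ai-input deliberately left unstated.
print(render_content_signal({"search": True, "ai-train": False}))
# prints: Content-Signal: search=yes, ai-train=no
```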
when_to_use
When you want to communicate after-access content-use preferences (search vs AI-input vs AI-train) to crawlers in a standard, machine-readable line.
when_not_to_use
When you need binding, enforceable licensing or payment (use RSL or Pay Per Crawl) — Content Signals are preferences, not controls.
code_example
# robots.txt
# Content usage preferences (Cloudflare Content Signals Policy)
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Allow: /
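The example places the directive inside a `User-agent` group. The sketch below collects signals per group from a whole robots.txt file. It assumes a `Content-Signal` line applies to the user-agent group it appears in, and it ignores a `Content-Signal` line that appears before any `User-agent` line; both are simplifications of this example, and `signals_by_user_agent` is a hypothetical helper name.

```python
from typing import Dict, List


def signals_by_user_agent(robots_txt: str) -> Dict[str, Dict[str, bool]]:
    """Map each user-agent token (lowercased) to its declared signals."""
    groups: Dict[str, Dict[str, bool]] = {}
    current_agents: List[str] = []
    in_rules = False  # True once the current group has a non-User-agent line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), {})
            continue
        in_rules = True
        if field != "content-signal":
            continue
        prefs: Dict[str, bool] = {}
        for pair in value.split(","):
            key, eq, val = pair.partition("=")
            val = val.strip().lower()
            if eq and val in ("yes", "no"):
                prefs[key.strip().lower()] = (val == "yes")
        for agent in current_agents:
            groups[agent].update(prefs)
    return groups


example = """\
# robots.txt
# Content usage preferences (Cloudflare Content Signals Policy)
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Allow: /
"""
print(signals_by_user_agent(example))
```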
source
Launch date (2025-09-24), the three signals (search / ai-input / ai-train), robots.txt extension, CC0 release, Cloudflare managed robots.txt default (search=yes, ai-train=no), and the contentsignals.org reference: https://blog.cloudflare.com/content-signals-policy/. Listed for addition in research §2.
agent_readiness_link
access-economics
layer_legacy
content
