Content Signals
A robots.txt extension that lets a site declare how its content may be used after access — search, ai-input, and ai-train — as machine-readable preferences.
- name
- Content Signals
- full_name
- Cloudflare Content Signals Policy
- layer
- licensing
- creator
- Cloudflare
- status
- live (2025)
- year
- 2025
- one_liner
- A robots.txt extension that lets a site declare how its content may be used after access — search, ai-input, and ai-train — as machine-readable preferences.
- spec_url
- https://blog.cloudflare.com/content-signals-policy/
- snippet
- # robots.txt
  Content-Signal: search=yes, ai-input=yes, ai-train=no
- abbreviation
- Content Signals
- also_known_as
- Content Signals Policy; Content-Signal
- canonical_spec_url
- https://contentsignals.org
- entity_uri
- https://blog.cloudflare.com/content-signals-policy/
- taxonomy_layer
- licensing
- sub_layer
- content-use-preferences
- protocol_type
- declaration
- central_problem
- Lets a site express, in robots.txt, how crawlers may use its content after access — for search, AI input (RAG/grounding), or AI training — as clear machine-readable preferences.
- maintainer
- Cloudflare (Content Signals Policy; published openly under CC0)
- governance_body
- vendor (Cloudflare); spec released under CC0
- license
- CC0 (the policy text is released under a CC0 license for open adoption)
- maturity_tag
- emerging
- current_spec_version
- Not recorded; verify against the primary source at build time: https://contentsignals.org
- spec_date
- 2025-09-24
- launch_date
- 2025-09-24
- last_verified
- 2026-06-15
- transport
- robots.txt extension (Content-Signal directive: search / ai-input / ai-train)
- core_mechanism
- The policy adds a short human-readable block plus a machine-readable Content-Signal line to robots.txt declaring per-use preferences with yes/no values: search (use in a search index), ai-input (use to ground/answer queries, e.g. RAG), and ai-train (use to train models). Cloudflare's managed robots.txt defaults to search=yes, ai-train=no. Signals express preferences, not technical enforcement.
- discovery_endpoint
- robots.txt Content-Signal directive
- settlement_type
- —
- adoption_metric
- Auto-applied to Cloudflare's managed robots.txt (Cloudflare states 3.8M+ domains) with default Content-Signal: search=yes, ai-train=no
- notable_adopters
- {"value":"Cloudflare (creator; applied across its managed robots.txt fleet)","source":"https://blog.cloudflare.com/content-signals-policy/"}
- relationships
- {"predicate":"complements","target":"rsl","note":"Content Signals expresses non-binding use preferences in robots.txt; RSL declares enforceable machine-readable licensing terms. The two are paired Layer-6 declarations."}
- {"predicate":"extends","target":"agents-json","note":"Content Signals is a robots.txt extension in the same Layer-1/6 declaration family as the other root-file declarations; it adds an after-access use dimension robots.txt's allow/disallow lacks."}
- ideal_use_case
- A site that wants to state, in robots.txt, that search use is welcome but AI training is not — without standing up enforcement.
- when_to_use
- When you want to communicate after-access content-use preferences (search vs AI-input vs AI-train) to crawlers in a standard, machine-readable line.
- when_not_to_use
- When you need binding, enforceable licensing or payment (use RSL or Pay Per Crawl) — Content Signals are preferences, not controls.
- code_example
- # robots.txt
  # Content usage preferences (Cloudflare Content Signals Policy)
  User-agent: *
  Content-Signal: search=yes, ai-input=yes, ai-train=no
  Allow: /
- source
- Launch 2025-09-24, three signals (search / ai-input / ai-train), robots.txt extension, CC0, Cloudflare managed-robots default search=yes/ai-train=no, contentsignals.org: https://blog.cloudflare.com/content-signals-policy/. Listed for addition in research §2.
- agent_readiness_link
- access-economics
- layer_legacy
- content
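
To illustrate the core_mechanism field, here is a minimal Python sketch of how a crawler might read the Content-Signal line from a robots.txt body. It is not taken from the Content Signals Policy text: the function name `parse_content_signal` and the constant `KNOWN_SIGNALS` are hypothetical. The sketch assumes the simple comma-separated `name=yes|no` form shown in this entry, reads only the first Content-Signal line, and ignores User-agent grouping. A signal that is absent is returned as `None`, meaning no preference was expressed for that use.

```python
from typing import Dict, Optional

# The three signals named in this entry (hypothetical constant name).
KNOWN_SIGNALS = ("search", "ai-input", "ai-train")


def parse_content_signal(robots_txt: str) -> Dict[str, Optional[bool]]:
    """Return yes/no preferences from the first Content-Signal line.

    True means "yes", False means "no", None means the signal was not
    declared. Assumption: only the simple form
    `Content-Signal: search=yes, ai-train=no` is handled.
    """
    prefs: Dict[str, Optional[bool]] = {name: None for name in KNOWN_SIGNALS}
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "content-signal":
            continue
        for item in value.split(","):
            key, eq, flag = item.partition("=")
            key, flag = key.strip().lower(), flag.strip().lower()
            # Unknown signal names and malformed values are skipped.
            if eq and key in prefs and flag in ("yes", "no"):
                prefs[key] = flag == "yes"
        break  # simplification: later Content-Signal lines are ignored
    return prefs


example = """\
# robots.txt
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Allow: /
"""
print(parse_content_signal(example))
```

Because the signals are preferences rather than controls, a parser like this only tells a crawler what the site has asked for; honoring the result is up to the crawler.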