
Preparing Your Website for the AI Agentic Internet

The web once learned to speak to browsers, then to search engines. Now it must speak to AI agents. This is a general walkthrough — with live, interactive mini-demos — for making any website ready for the agentic Internet with Cloudflare.

Context: Cloudflare's Agents Week 2026 roundup frames this shift as the emerging agentic web. This walkthrough focuses on the website layer: control what bots can access, package content for agents, and measure whether your site is ready.

David Tofan

April 19, 2026 14 min read · 12 sections

01 Where we are

The Internet's next historic phase

This isn't incremental. The web is learning a new audience the way it once learned browsers and search engines — and the old search-to-click bargain is breaking as agents read more than they refer.

Packet-switched networks stitched universities and labs together.
HTTP and HTML gave browsers a universal document interface.
Responsive design taught sites to fit pocket-sized screens.
APIs became the real product; the browser became optional.
Now the web must speak to autonomous AI — on its own terms.

The web has always had to adapt to new standards. It learned to speak to web browsers, and then it learned to speak to search engines. Now, it needs to speak to AI agents.
— Cloudflare, Agent Readiness

Audit your site on isitagentready.com

In 1994, robots.txt taught the web to talk to search crawlers. Thirty-two years later, almost nothing on your site is ready for the crawlers that matter now: AI agents and the humans driving them. The numbers are mind-blowing — see AI Search Crawl Refer Ratio on Radar and the crawl-to-click insights. What follows is Cloudflare's original five-pillar Agent Readiness framework, plus two additional pillars I added here: Performance and Security.

02 The framework

What "agent-ready" actually means

Being "agent-ready" is less about adding AI to your site and more about letting AI reliably and respectfully read, act on, and transact with it. Cloudflare scanned the top 200,000 domains — here is a baseline:

78%

have a robots.txt

the 1994 standard, still the entrypoint

declare Content Signals

stating AI usage preferences

<15

expose an MCP Server Card

out of the top 200,000 domains

4.6%

support markdown negotiation

~80% token savings when they do

Source: Cloudflare Radar — Adoption of AI agent standards (top 200,000 domains), as of April 2026.

Star tool · built by Cloudflare

Audit your site — isitagentready.com

The original Cloudflare rubric this walkthrough starts from: 12 checks across Discoverability, Content, Bot control, Capabilities, and Commerce. This article then extends that model with Performance and Security. Most sites score 2 of 12 today — where does yours stand?

Framework note: isitagentready.com scores the original five Cloudflare pillars. This walkthrough extends that rubric with two operational pillars — Performance and Security — so the full checklist here spans seven categories.

Discoverability

robots.txt, sitemap, Link headers — so agents can find your surfaces.

Content

Serve markdown when agents ask for it. Drop 80% of tokens on the wire.

Bot Access Control

Content Signals, AI Crawl Control, and bot-provider identity via Web Bot Auth — declare and enforce.

Capabilities

OAuth discovery, API Catalog, MCP Server Cards, Agent Skills.

Commerce

Pay Per Crawl and x402 — monetize, or block, but stop leaving bytes on the table.

Performance

Core Web Vitals, caching, compression, and markdown delivery — so agents do not time out.

Security

Every /.well-known/ is both an invitation and an attack surface.

03 Pillar 1 · Discoverability

robots.txt, sitemap, and Link headers

Most sites already have a robots.txt, but few have prepared it for agents. The first three isitagentready.com checks fall under this pillar.

Most useful for: any public website, docs portal, blog, or product site that wants search engines and AI agents to reliably discover crawlable pages and machine-readable resources.

robots.txt builder

Selected AI crawlers

This simplified view focuses on the bots most likely to matter for crawl or training control. Allowed bots do not need explicit entries here; robots.txt only needs the blocks you actually want to emit.

Content signals: ai-input, ai-train

Output (/robots.txt): allowed crawlers are implicit; only explicit Disallow blocks are emitted.
Tip: on Cloudflare, flip Security Settings → Bot traffic → robots.txt to auto-generate and maintain this for you.
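
As a rough illustration of what such a builder emits, here is a minimal Python sketch. The function name `build_robots_txt`, the bot names, and the sitemap URL are illustrative assumptions, not Cloudflare's generator; the `Content-Signal` line follows the comma-separated `name=yes|no` form published at contentsignals.org.

```python
def build_robots_txt(blocked_bots, signals, sitemap_url=None):
    """Render a robots.txt that blocks the given bots and declares Content Signals."""
    lines = []
    # One explicit Disallow group per blocked crawler; allowed bots stay implicit.
    for bot in blocked_bots:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    # Default group: Content Signals plus the /cdn-cgi/ exclusion.
    signal_value = ", ".join(
        f"{name}={'yes' if allowed else 'no'}" for name, allowed in signals.items()
    )
    lines += [
        "User-agent: *",
        f"Content-Signal: {signal_value}",
        "Allow: /",
        "Disallow: /cdn-cgi/",
    ]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


# Hypothetical choices: block two training crawlers, permit search and ai-input.
robots = build_robots_txt(
    blocked_bots=["GPTBot", "CCBot"],
    signals={"search": True, "ai-input": True, "ai-train": False},
    sitemap_url="https://example.com/sitemap.xml",
)
print(robots)
```

If you rely on /cdn-cgi/image/ URLs, the blanket `Disallow: /cdn-cgi/` line in this sketch would need to be narrowed.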

Why `Disallow: /cdn-cgi/` matters

Cloudflare reserves the /cdn-cgi/ path for internal features (challenge pages, email obfuscation, etc.). Crawling it produces noise in Search Console. Disallow it, but if you use Cloudflare Image Transformations (/cdn-cgi/image/), scope the rule so you do not block your own image variants. Cloudflare documents the SEO (Search Engine Optimization) impact in more detail.

Link response headers for agent discovery

Return Link: headers on your homepage so agents can find machine-readable resources without parsing HTML. The syntax is specified by RFC 8288; RFC 9727 §3 registers api-catalog. Common companion relations — service-desc, service-doc, describedby — come from RFC 8631 and appear in RFC 9727 Appendix A.1 as usage examples.

Link: </.well-known/api-catalog>; rel="api-catalog"
Link: </openapi.json>; rel="service-desc"; type="application/json"

Multiple Link: headers or a single comma-separated value are both valid.

On Cloudflare: add these via Transform Rules → Response Header Modification or a Worker — no origin changes required. The AI Crawl Control + Transform Rules guide has a practical licensing terms example.
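
For reference, the two headers above can be produced as a single comma-separated value. This is a small Python sketch of the serialization only; `build_link_header` is a hypothetical helper, and it assumes parameter values contain no double quotes.

```python
def build_link_header(links):
    """Join (target, params) pairs into one comma-separated Link header value."""
    parts = []
    for target, params in links:
        # Each parameter is rendered as ; name="value" after the <target>.
        rendered = "".join(f'; {key}="{value}"' for key, value in params.items())
        parts.append(f"<{target}>{rendered}")
    return ", ".join(parts)


link_value = build_link_header([
    ("/.well-known/api-catalog", {"rel": "api-catalog"}),
    ("/openapi.json", {"rel": "service-desc", "type": "application/json"}),
])
print("Link:", link_value)
```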

04 Pillar 2 · Content

Markdown for Agents — content that isn't wasted on them

Agents parsing HTML burn tokens on nav, scripts, and chrome. Markdown is the right wire format for LLMs (Large Language Models). Toggle the Accept header below to see Cloudflare's edge conversion in action.

Most useful for: publishers of articles, documentation, knowledge bases, changelogs, and other text-heavy pages where the real content is buried under lots of HTML chrome.

Content negotiation demo

Request

curl https://yourdomain.com/docs \
  -H "Accept: text/markdown"

Token count: HTML 16,180 tokens → Markdown 3,150 tokens (↓ 80.5%)

Enable with one toggle on Cloudflare: Markdown for Agents.

Markdown for Agents performs edge-side HTML→Markdown conversion on the fly — no new .md files required. Cloudflare's own docs reported up to ~80% token reduction — see Agent Readiness · content accessibility. The response also adds x-markdown-tokens, vary: accept, and a content-signal declaration.
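
The demo's figures (16,180 HTML tokens versus 3,150 Markdown tokens) work out as follows. The helper name `token_reduction` is illustrative.

```python
def token_reduction(html_tokens, markdown_tokens):
    """Fraction of tokens saved by serving Markdown instead of HTML."""
    return 1 - markdown_tokens / html_tokens


reduction = token_reduction(16_180, 3_150)
print(f"{reduction:.1%}")  # prints 80.5%
```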

In practice only a handful of coding agents (Claude Code, OpenCode, Cursor) are known to send Accept: text/markdown by default, but supporting it costs you nothing.
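
If you handle negotiation at your own origin rather than at the edge, the decision could look roughly like this Python sketch. It is deliberately simplified: it only checks whether text/markdown is listed with a non-zero q-value and does not rank it against other media types. Both function names are hypothetical.

```python
def wants_markdown(accept_header):
    """Return True when text/markdown is listed with a non-zero q-value."""
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != "text/markdown":
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        return quality > 0
    return False


def choose_response_headers(accept_header):
    """Pick the content type and always vary on Accept so caches keep both variants."""
    if wants_markdown(accept_header):
        content_type = "text/markdown; charset=utf-8"
    else:
        content_type = "text/html; charset=utf-8"
    return {"content-type": content_type, "vary": "accept"}
```

Setting `vary: accept` on both variants matters because the same URL now has two representations.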

For everyone else, add a URL fallback: make pages available at /index.md relative to the canonical URL. Cloudflare documents this pattern by combining a URL Rewrite Rule that strips /index.md back to the base path with a Request Header Transform Rule that matches on raw.http.request.uri.path and injects accept: text/markdown. That gives agents a deterministic Markdown URL even when they never negotiate on headers.
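
The effect of that rule pair can be sketched in plain Python. This only models the transformation; it is not the Rules-language expression itself. `rewrite_markdown_url` is a hypothetical name, and keeping a trailing slash on the base path is an assumption.

```python
def rewrite_markdown_url(path, headers):
    """Map /docs/index.md to /docs/ and request Markdown via the Accept header."""
    suffix = "/index.md"
    if not path.endswith(suffix):
        # Not a Markdown fallback URL: leave path and headers untouched.
        return path, dict(headers)
    base_path = path[: -len(suffix)] + "/"
    new_headers = dict(headers)
    new_headers["accept"] = "text/markdown"
    return base_path, new_headers


print(rewrite_markdown_url("/docs/index.md", {"accept": "*/*"}))
```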

Flip side: if you're building the agent, Cloudflare's Browser Run Markdown endpoint gives you a one-call API to pull clean Markdown from any URL (or raw HTML) — useful for ingesting sites that don't yet serve it natively.

05 Pillar 3 · Bot access control

From passive disclosure to active enforcement

Content Signals declare intent. AI Crawl Control + WAF (Web Application Firewall) enforce it. Web Bot Auth adds cryptographic identity when a bot provider signs requests.

Most useful for: site owners who need to decide which AI crawlers may read, train on, pay for, or be blocked from their content. Within this pillar, Web Bot Auth mainly matters to bot providers or operators who sign traffic; site owners mostly consume the verification result.

Cloudflare's Moving past bots vs. humans is the broader framing for this pillar: site owners do not just need to know whether a request is automated; they need to understand intent, proportional load, accountability, and whether the client should be allowed, limited, charged, or blocked.

Content Signals launched in September 2025 under a CC0 (Creative Commons Zero) license and was submitted to the IETF (Internet Engineering Task Force) AIPREF (AI Preferences) working group. The initial individual draft has since expired, so track the working group for the current revision. It introduces three preferences: search, ai-input, and ai-train. These are preferences plus a reservation of rights under Article 4 of the European Union (EU) Directive 2019/790 on Copyright and Related Rights in the Digital Single Market (DSM).

The three layers

AI Crawl Control — per-bot allow / block / charge, with a dashboard view of every AI category (AI Crawler, AI Search, AI Assistant, Archiver). Free and self-serve plans detect by user-agent; Enterprise plans with Enterprise Bot Management use full Detection IDs.
Redirects for AI Training: one toggle converts <link rel="canonical"> tags into HTTP 301s for verified AI training crawlers. On Cloudflare's own docs, 100% of training crawler requests for deprecated pages were redirected in the first week.
Verified Bots + Web Bot Auth — cryptographic identification via Ed25519 HTTP Message Signatures (RFC 9421), public keys at /.well-known/http-message-signatures-directory. This is primarily for bot providers/operators: they publish keys and sign requests, while site owners mostly consume the verification result at the edge. Cloudflare uses Web Bot Auth for both verified bots and signed agents, but a bot can only be registered as one classification.

Web Bot Auth · Ed25519 handshake

Participants: Bot operator, Public key directory, Cloudflare edge

1 Publish JWKS

GET /.well-known/http-message-signatures-directory
JWKS hosted: kid + Ed25519 public key

2 Signed request

GET /api/data
Signature-Agent: "https://claude.ai"
Signature-Input: sig1=("@authority" "signature-agent");created=1752953825;expires=1752957425;keyid="poqkLGiymh_W0uP6...";alg="ed25519";tag="web-bot-auth"
Signature: sig1=:3NxHWBjJUw...:

Arrives at edge with headers per RFC 9421 (listed in web-bot-auth registry)

3 Verify

fetch JWKS → key
verify Ed25519 sig
cf.bot_management.verified_bot = true

4 Enforce

allow → origin
charge → HTTP 402
block → WAF action

Operators publish an Ed25519 public key; Cloudflare verifies the signature per RFC 9421 on every request. Identity becomes cryptographic, so user-agent spoofing stops working.
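
To make the Signature-Input header from step 2 more concrete, the following Python sketch splits it into its parts and checks the created/expires window. It does not perform the Ed25519 verification itself, which would need a cryptography library and the operator's published key. The function names are hypothetical, and `example-key-id` is a placeholder rather than a real key identifier.

```python
import re


def parse_signature_input(value):
    """Split one Signature-Input member into label, covered components, and parameters."""
    label, _, rest = value.partition("=")
    components_text, _, params_text = rest.partition(")")
    components = re.findall(r'"([^"]+)"', components_text)
    params = {}
    for param in params_text.strip(";").split(";"):
        name, _, raw = param.strip().partition("=")
        if name:
            params[name] = raw.strip('"')
    return label.strip(), components, params


def is_fresh(params, now):
    """Check the created/expires window only; the signature is NOT verified here."""
    return int(params["created"]) <= now < int(params["expires"])


header = (
    'sig1=("@authority" "signature-agent");created=1752953825;'
    'expires=1752957425;keyid="example-key-id";alg="ed25519";tag="web-bot-auth"'
)
label, components, params = parse_signature_input(header)
print(label, components, params["tag"], is_fresh(params, now=1752953900))
```

A request outside the window should be rejected before any signature work is attempted, which also limits replay of captured headers.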

Signals declare. Enforcement executes. Ship both.

Content Signals publish intent; Managed robots.txt + AI Crawl Control + WAF publish consequences. Web Bot Auth adds cryptographic identity when incoming bot traffic is signed by a bot provider. Without the enforcement layer, signals are wishes — not policy.

06 Pillar 4 · Capabilities

Give agents things to do

Beyond reading content, agents should be able to authenticate, discover APIs, and call tools. Most sites have zero presence here — the upside is huge.

Most useful for: organizations exposing APIs, SaaS actions, search endpoints, MCP servers, or other machine-callable services. If your site is purely informational, you likely need only a subset of these surfaces.

The capability well-knowns, in order of the isitagentready.com checklist:

/.well-known/api-catalog — RFC 9727, a Linkset of your APIs. Mainly useful for organizations or people who expose one or more API services and want agents to discover them systematically.
/.well-known/oauth-authorization-server — RFC 8414 / OIDC (OpenID Connect) discovery. Useful when you operate your own authorization server for APIs, MCP servers, or agent actions.
/.well-known/oauth-protected-resource — RFC 9728, tells MCP (Model Context Protocol) clients which AS (Authorization Server) to use and which scopes to request. Useful for protected APIs or MCP endpoints that require scoped access.
/.well-known/mcp/server-card.json — MCP SEP-2127, describes your server's tools, transport, and auth. Useful for teams operating an MCP server and wanting clients to discover its capabilities cleanly.
/.well-known/agent-skills/index.json — Agent Skills, Anthropic's directory convention. Useful when you publish reusable agent workflows, prompts, or skills as first-class assets.
WebMCP — W3C Community Group draft, the browser API surface around navigator.modelContext and methods like registerTool() for exposing in-page tools to agents. Mainly useful for browser applications that want to expose live, in-page actions or context directly to agents.
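
As an example of the first item, an api-catalog document is a Linkset (RFC 9264) with one entry per API. The Python sketch below builds one; the URLs and the `build_api_catalog` helper are illustrative assumptions.

```python
import json


def build_api_catalog(apis):
    """Build a Linkset document listing each API with its description and docs links."""
    linkset = []
    for api in apis:
        entry = {"anchor": api["anchor"]}
        if "openapi" in api:
            entry["service-desc"] = [{"href": api["openapi"], "type": "application/json"}]
        if "docs" in api:
            entry["service-doc"] = [{"href": api["docs"], "type": "text/html"}]
        linkset.append(entry)
    return {"linkset": linkset}


# Hypothetical API; replace with your own endpoints.
catalog = build_api_catalog([
    {
        "anchor": "https://api.example.com/v1",
        "openapi": "https://api.example.com/v1/openapi.json",
        "docs": "https://example.com/docs/api",
    }
])
print(json.dumps(catalog, indent=2))
```

Serve the result at /.well-known/api-catalog with the application/linkset+json media type.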

Build your first one below — a minimal MCP Server Card. Host it on a Cloudflare Worker with AI Search as the backing retrieval.

MCP server card demo (SEP-2127 · draft)

Editable fields: server name, title, endpoint, transport, auth required, tool name, tool description.

Host this via a Cloudflare Worker + Durable Object and pair it with Workers OAuth Provider for scoped, RFC 9728-compliant auth.
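
A card assembled from those fields might look like the Python sketch below. The key names simply mirror the demo's form fields and are assumptions: SEP-2127 is a draft, so check the current proposal for the actual schema before publishing. The server name, endpoint, and tool are hypothetical.

```python
import json


def build_server_card(name, title, endpoint, transport, auth_required, tools):
    """Assemble a server card dict; key names are illustrative, not the SEP-2127 schema."""
    return {
        "name": name,
        "title": title,
        "endpoint": endpoint,
        "transport": transport,
        "auth": {"required": auth_required},
        "tools": [{"name": tool, "description": text} for tool, text in tools],
    }


card = build_server_card(
    name="example-docs",
    title="Example Docs Search",
    endpoint="https://mcp.example.com/mcp",
    transport="streamable-http",
    auth_required=False,
    tools=[("search_docs", "Search the documentation and return matching passages.")],
)
print(json.dumps(card, indent=2))
```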

Enterprise pattern: if you operate multiple MCP servers, put them behind an MCP Server Portal and use Code Mode so clients do not need every upstream tool schema in context. Cloudflare describes this as the pattern behind its internal AI engineering stack.

Note: Content Signals, MCP Server Card, WebMCP, and Agent Skills are all drafts or community reports — adopting them is a forward bet. Publish drafts on non-critical paths, and keep versions pinned.

07 Pillar 5 · Commerce

Pay Per Crawl and x402 — monetize, or block

HTTP 402 has existed since 1997; x402 and Pay Per Crawl are reviving it for agent and crawler payments. The handshake below is what actually travels between crawler and origin.

Most useful for: publishers, data providers, and API operators with high-value content or actions that should be monetized, rate-limited, or blocked for bots instead of treated like free traffic.

HTTP 402 · Pay Per Crawl handshake

Participants: Crawler, CF Edge, Publisher, Settlement

1 Request
```
GET /article
Signature: :MEUCIQD...
```

2 402

402 Payment Required
crawler-price: USD 0.05

3 Retry

GET /article
crawler-max-price: USD 0.10

4 Verify

Signature ok, price match, then fetch origin HTML

5 200
```
200 OK
crawler-charged: USD 0.05
```

6 Charge

Record ledger
MoR: Cloudflare pays publisher

The 402 Payment Required status has existed since HTTP/1.1 in 1997 — it finally has a use. Cloudflare acts as Merchant of Record; signatures verified via Web Bot Auth.
In practice, the crawler provider signs these requests and the publisher consumes the verification result. This diagram shows the common 402-then-retry flow: Step 2 returns crawler-price, and Step 3 retries with crawler-max-price. If a crawler already knows the site's policy, it can send crawler-max-price on the first request; after a 402, it can also retry with crawler-exact-price: USD 0.05. The open-standard version — x402 — settles on-chain.

Pay Per Crawl lets publishers choose Allow, Charge, or Block per crawler. Charge typically starts with 402 plus crawler-price; crawlers then retry with either crawler-max-price or crawler-exact-price, and successful responses include crawler-charged. All payment headers must be Web Bot Auth-signed by the crawler/provider. The open-standard version is x402, governed by the x402 Foundation (Coinbase + Cloudflare) with stablecoin settlement.
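
The price logic in that handshake can be sketched in a few lines of Python. This is a toy model of the decision only, not Cloudflare's implementation: it skips Web Bot Auth signature verification and settlement entirely, and `parse_price` and `respond` are hypothetical names.

```python
from decimal import Decimal


def parse_price(value):
    """Split a header value such as 'USD 0.05' into (currency, amount)."""
    currency, amount = value.split()
    return currency, Decimal(amount)


def respond(configured_price, request_headers):
    """Return (status, headers) for a crawler request against a configured price."""
    currency, price = parse_price(configured_price)
    offer = request_headers.get("crawler-exact-price")
    if offer is not None:
        # An exact offer must match the configured price precisely.
        accepted = parse_price(offer) == (currency, price)
    else:
        offer = request_headers.get("crawler-max-price")
        accepted = False
        if offer is not None:
            offer_currency, offer_amount = parse_price(offer)
            # A max offer is accepted when it covers the configured price.
            accepted = offer_currency == currency and offer_amount >= price
    if accepted:
        return 200, {"crawler-charged": configured_price}
    return 402, {"crawler-price": configured_price}


print(respond("USD 0.05", {}))
print(respond("USD 0.05", {"crawler-max-price": "USD 0.10"}))
```

Note that the crawler is charged the configured price, not its stated maximum, which matches the diagram's crawler-charged: USD 0.05 after a crawler-max-price: USD 0.10 retry.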

The broader commerce layer — Visa's Trusted Agent Protocol (built on Cloudflare Web Bot Auth), Mastercard Agent Pay, Google AP2 (Agent Payments Protocol) — treats agent authentication and signed intent as first-class primitives.

Try it live: playground.x402.cloudflare.com. Issue a signed payment, watch the 402 handshake, inspect the charge receipt.

08 Added pillar · Performance

Performance matters (for agents, too)

Slow sites don't just frustrate humans — they time out agents and blow token budgets. Cloudflare's web performance stack doubles as your agent-readiness layer.

Most useful for: any public site, but especially docs portals, blogs, template-heavy pages, ecommerce catalogs, and other experiences that agents may fetch repeatedly or summarize under time and token constraints.

Request path · performance features grouped by stage

Path: User (eyeball) → Cloudflare edge → Tiered cache · R2 → Origin

Request: GET /
Reduce latency · connection: Global DNS, HTTP/3 · QUIC, TLS 1.3 · 0-RTT, HSTS, Early Hints, Cloudflare Fonts, Speed Brain
CDN network footprint: Anycast · 330+ cities, Nearest colo, Tiered Cache
URL & traffic handling: URL Normalization, Redirect Rules, Waiting Room, Custom Errors
Edge processing: Cache Rules, Cache Response Rules, Prefetch URLs, Zaraz, Google Tag Gateway
Reduce latency · caching (Cache HIT: R2 · Tiered): Smart Tiered Caching, Cache Reserve, Cloud Connector
Reduce origin latency (Cache MISS: fetch origin): Argo Smart Routing, HTTP/2 to origin, Connection reuse, Load Balancing, Dedicated CDN Egress IPs
Reduce size: Brotli · Gzip · Zstd, Polish · Images, Image Transformations, Markdown for Agents, Shared Dictionaries
Response: 200 OK · LCP < 1.5s

Every stage is a toggle — measure with Speed Observatory (Lighthouse + RUM at p75), enable features, re-measure.
For agents specifically, Markdown for Agents is the cheapest compression you can ship.

measure

Speed Observatory

Synthetic Lighthouse + RUM (Real User Monitoring) at p75 for LCP (Largest Contentful Paint), INP (Interaction to Next Paint), CLS (Cumulative Layout Shift), TTFB (Time to First Byte). Recommendations map to Image Transformations, Argo, Brotli, HTTP/3, Early Hints and more.

optimize

Shared Dictionaries

Phase 1 shipping April 30, 2026. Significant payload compression for repeat fetches — valuable when agents crawl template-heavy pages.

agent-specific

Serve Markdown

Markdown is a perf optimization: fewer bytes, fewer tokens, faster turnarounds. The cheapest perf win you'll ship this quarter.
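
To show what "RUM at p75" means in practice, here is a small Python sketch that takes a handful of LCP samples and reports the 75th percentile. The sample values are made up, the nearest-rank method is one of several percentile definitions (monitoring tools may interpolate differently), and the 2,500 ms cut-off is the "good" LCP threshold published on web.dev.

```python
import math


def percentile_nearest_rank(samples, percentile):
    """Return the nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(percentile / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Hypothetical LCP measurements in milliseconds from real-user sessions.
lcp_ms = [1200, 900, 3100, 1500, 2000, 1100, 1800, 2500]
p75 = percentile_nearest_rank(lcp_ms, 75)
rating = "good" if p75 <= 2500 else "needs work"
print(p75, rating)  # prints: 2000 good
```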

09 Added pillar · Security

AI Security implications

Seven concrete risks — one-line mitigation each — mapped to OWASP (Open Worldwide Application Security Project) LLM Top 10 (2025), Agentic Top 10 (2026), and the MCP Authorization Spec.
Each card below pairs a risk with its fix. For AI-related use cases, see the Cloudflare AI security demo.

Most useful for: any organization exposing /.well-known/ endpoints, APIs, MCP servers, auth flows, or other machine-readable surfaces. The more agent-facing capability you publish, the less optional this pillar becomes.

OWASP LLM01 · Indirect prompt injection
Risk: Content agents read becomes instructions an attacker controls.
Mitigation: Never inject user content into MCP tool descriptions. Strip remote image markdown from untrusted sources. Enforce human-in-the-loop for destructive tools.
Cloudflare: AI Gateway, WAF

OWASP LLM02 · Sensitive information disclosure
Risk: /.well-known/, sitemaps, MCP cards leak staging paths and runbooks.
Mitigation: Curate these files explicitly. Gate private trees with Cloudflare Access. Add X-Robots-Tag: noindex on ingestion endpoints you don't want indexed.
Cloudflare: Cloudflare Access

OWASP LLM03 · Supply-chain (MCP)
Risk: Third-party MCP updates can introduce poisoned tool descriptions.
Mitigation: Pin versions. Diff release notes. Central allowlist. Sandbox on Workers + Durable Objects. Log every tool call via AI Gateway.
Cloudflare: AI Gateway, Logpush, Cloudflare Access

OWASP LLM06 · Excessive agency (confused deputy)
Risk: MCP servers pass tokens upstream without audience validation.
Mitigation: OAuth 2.1 + PKCE, RFC 8707 Resource Indicators, RFC 9728 Protected Resource Metadata. Least-privilege scopes per tool. Never passthrough tokens.
Cloudflare: Workers OAuth Provider, Cloudflare Access

OWASP LLM07 · Agent impersonation
Risk: UA strings and residential proxies defeat simple allowlists.
Mitigation: Require cryptographic identity from bot providers: Web Bot Auth (HTTP Message Signatures, RFC 9421), Cloudflare Verified Bots, reverse DNS verification.
Cloudflare: Verified Bots, Web Bot Auth, Cloudflare Access

OWASP LLM10 · Unbounded consumption
Risk: A 28.7K:1 crawl-to-refer ratio torches your egress bill.
Mitigation: AI Crawl Control + Rate Limiting + Bot Management. Use HTTP 402 / Pay Per Crawl for bots you'd rather charge than block.
Cloudflare: AI Crawl Control, Rate Limiting, Bot Management, AI Gateway, Pay Per Crawl

OWASP Agentic · Agentic commerce fraud
Risk: Replay, impersonation, and unbounded spend by autonomous agents.
Mitigation: Trusted Agent Protocol. For operators running bots, Web Bot Auth with nonce + created + expires. Spend caps enforced outside the LLM loop.
Cloudflare: Trusted Agent Protocol

Every /.well-known/ endpoint you publish is both an invitation and an attack surface. Treat them accordingly.
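
One of the mitigations above, a spend cap enforced outside the LLM loop, can be as small as the Python sketch below. The `SpendCap` class is hypothetical; the point is that the check lives in ordinary code the model cannot talk its way around, and that a real deployment would persist the running total rather than hold it in memory.

```python
from decimal import Decimal


class SpendCap:
    """Track cumulative spend and refuse any payment that would exceed the limit."""

    def __init__(self, limit):
        self.limit = Decimal(limit)
        self.spent = Decimal("0")

    def authorize(self, amount):
        amount = Decimal(amount)
        # Reject non-positive amounts and anything that would cross the cap.
        if amount <= 0 or self.spent + amount > self.limit:
            return False
        self.spent += amount
        return True


cap = SpendCap("0.10")
decisions = [cap.authorize("0.05"), cap.authorize("0.05"), cap.authorize("0.01")]
print(decisions, cap.spent)
```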

Observability: log Rules-language fields — cf.bot_management.verified_bot, cf.verified_bot_category, cf.bot_management.ja4, cf.bot_management.score, the Signature-Agent request header — and tool-call telemetry to your SIEM (Security Information and Event Management) via Logpush. Alert on per-agent anomalies, not just global thresholds.

10 For website owners

The 30-year search bargain is broken

If you're not the developer shipping this, this section is for you.

"The web is being stripmined by AI crawlers with content creators seeing almost no traffic and therefore almost no value."

— Matthew Prince, Cloudflare, July 2025

Crawl-to-refer ratio · April 2026 (log scale · lower = better)

Anthropic	28.7K:1
OpenAI	1.1K:1
Perplexity	133:1
Microsoft	37.7:1
Mistral	21.2:1
Yandex	19.4:1
Google	7.8:1
Baidu	2.7:1
ByteDance	2:1
DuckDuckGo	0.99:1

Read: for every single visitor Anthropic's crawler refers back, it has read ~28.7K pages.
Live data on Cloudflare Radar — AI Insights.
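
The ratio itself is simple division: pages crawled per visitor referred. The Python sketch below shows the arithmetic with made-up counts; `crawl_to_refer` is a hypothetical helper and Radar's own methodology may differ.

```python
def crawl_to_refer(crawls, referrals):
    """Format crawled pages per referred visit in the chart's N:1 style."""
    ratio = crawls / referrals
    if ratio >= 1000:
        return f"{ratio / 1000:.1f}K:1"
    return f"{ratio:g}:1"


# Hypothetical counts from your own logs.
print(crawl_to_refer(57_400, 2))  # prints 28.7K:1
print(crawl_to_refer(266, 2))     # prints 133:1
```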

What changed

Around 75% of mobile Google queries now resolve without a click, and training drives ~80% of AI bot activity — see the purpose & industry breakdown and the crawl-to-refer ratio on Radar.

Your website isn't just optimizing for "search ranking → click → conversion" anymore. It's optimizing for "agent answer → brand mention → qualified action".

Four owner-level moves

Measure — run your domain through isitagentready.com and review Cloudflare Radar AI Insights.
Decide — declare intent via Content Signals. Which uses do you permit (search, ai-input, ai-train)?
Enforce — turn on AI Crawl Control and Managed robots.txt.
Monetize or block — choose Pay Per Crawl for high-value content.

If you do nothing: your content still gets scraped, your brand still gets summarized in AI answers — you just give up both (potential) compensation and the ability to correct what the model says about you. This is your choice.

11 Ship it

The developer workflow with Cloudflare

The fast path from plan to deployed. Everything below is official Cloudflare tooling.

cli

Wrangler + Local Explorer

Deploy Workers, configure content_converter, run wrangler dev. Local Explorer for debugging.

mcp

MCP Servers for Cloudflare

17 official remote MCP servers — docs, bindings, builds, Radar, AI Gateway, AI Search, Logs. Drop in with one config entry.

agent

Agent Lee

Cloudflare's agent for dashboard interaction. Handy for non-destructive account introspection while you build.

AI Search (AutoRAG)

Build a knowledge base or add search to your site. MCP endpoint at autorag.mcp.cloudflare.com/mcp.

Public demos — play with the primitives

LLM playground

playground.ai

Chat with Workers AI models routed through AI Gateway. Compare latency, tokens, cost side by side.

multi-modal

multi-modal.ai

Image + audio + text inference on Workers AI — drop in a picture, get structured captions back.

Send a signed HTTP request, watch the 402 handshake, inspect the charge receipt. End-to-end x402 flow.

12 Close the loop

The agent-ready checklist, mapped to Cloudflare


#	What to ship	Spec	Cloudflare
01	robots.txt with AI UAs + Disallow: /cdn-cgi/	RFC 9309	Managed robots.txt
02	sitemap.xml with canonicals	sitemaps.org	Managed robots.txt
03	Link: headers for resource discovery	RFC 8288	Transform Rules
04	Serve markdown on Accept: text/markdown (easy win)	Cloudflare spec	Markdown for Agents
05	Content Signals in robots.txt	contentsignals.org	Content Signals Policy
06	AI-training redirects via canonical (easy win)	Cloudflare	Redirects for AI Training
07	Verified-bot enforcement	Web Bot Auth (draft)	Verified Bots · Web Bot Auth
08	/.well-known/api-catalog	RFC 9727	Workers
09	OAuth / OIDC discovery	RFC 8414 / OIDC	Workers OAuth Provider
10	/.well-known/oauth-protected-resource	RFC 9728	Workers OAuth Provider
11	MCP Server Card	SEP-2127 (draft)	Workers + Durable Objects
12	Agent Skills index	agentskills.io	Static hosting / R2
13	WebMCP (navigator.modelContext)	W3C CG draft	Any site
14	HTTP 402 / Pay Per Crawl (easy win)	Cloudflare	Pay Per Crawl
15	x402 for content and MCP tools	x402.org	Agents SDK
16	Core Web Vitals + RUM	web.dev	Speed Observatory · Shared Dictionaries
17	Bot auth + scoped OAuth + tool allowlists	OWASP LLM / Agentic	AI Gateway · Access · WAF · Logpush

Build it with AI tools

Cloudflare publishes a style guide for AI-assisted development — curated prompts, llms.txt indexes, IDE (Integrated Development Environment) setup, and the docs MCP server so your agent pulls verified Cloudflare references while it writes code.

developers.cloudflare.com/llms.txt — curated index of every docs page, agent-readable.
llms-full.txt — every page inlined as markdown in a single file for bulk ingestion.
Cloudflare Docs MCP server — https://docs.mcp.cloudflare.com/mcp. Add once to Claude Code, Cursor, or VS Code and your agent can search the docs directly.
Workers prompting guide + cloudflare/templates — starter prompts for Workers, AI Gateway, AI Search.

A grounding counterpoint

Read all of the above alongside Stop shipping AI files nobody reads by Walshy. The core argument: most AI-specific files (llms.txt and friends) largely go unread, and the durable win is to make your site human-friendly — you get agent-friendly for free. I personally agree with these findings, especially two of them:

"Write clean HTML, use semantic markup, write content humans want to read."
"If you really want to do something agent-specific, the right shape is content negotiation, not a separate file."

That lands squarely on Pillar 2 of this walkthrough: the strongest agent-readiness move isn't a pile of bespoke files to maintain — it's clean, semantic content plus content negotiation (Accept: text/markdown), so one canonical URL serves humans and agents alike. Treat the well-knowns and capability surfaces here as deliberate, high-value additions, not a checklist of files to litter your site with.

This is a non-exhaustive practitioner's intro to making your website AI agent-ready. Educational purposes only — the standards above will keep shifting, and that's the point. Audit your site at isitagentready.com, read your crawl-to-refer ratio on Cloudflare Radar AI Insights, then decide consciously what you want to expose, monetize, or block. Properly inform yourself, keep learning, keep testing — and, as always, secure before you ship.

IsItAgentReady assessment result for davidtofan.com — Live check: isitagentready.com/davidtofan.com
