TechCrunch AI July 1, 2026 TERMINATOR

New Cloudflare defaults put AI crawlers on a pay-or-separate path

Cloudflare says that starting on September 15, 2026, its default settings will block mixed-use crawlers from ad-supported pages unless site owners choose otherwise. The policy is meant to push AI companies to separate search crawling from agent use and training, while giving publishers more ways to control and monetize content access.

WTF Index: Terminator 1, Idiocracy 0

The story mildly leans Terminator because it concerns AI crawlers, agentic services, and training systems gaining access to publisher content without clear control or consent.

Cloudflare is changing the default rules for how AI crawlers reach publisher websites, and the shift is aimed squarely at bots that combine traditional search, AI agents, and model training in the same crawling activity.

Starting on September 15, 2026, Cloudflare says its default settings will block mixed-use crawlers from pages that host ads. Site owners can adjust those settings, but the default position will no longer be open access for crawlers that do not clearly separate search from AI agent use and training.

What Cloudflare is changing

The new policy applies to crawlers that blend several purposes at once: search discovery, agentic services, and AI training. Cloudflare is drawing a line between web crawling for traditional search products, like Google Search, and crawling that supports AI systems.

Under the change, mixed-use crawlers will be blocked by default from crawling ad-supported pages unless the website owner changes the configuration. Cloudflare says these default changes will apply to new Cloudflare customers, new sites created by existing customers, and all existing free customers.
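The default described above can be read as a small decision rule. The following is a minimal sketch of that rule, not Cloudflare's implementation: the names `Crawler`, `is_mixed_use`, `blocked_by_default`, and `owner_allows` are hypothetical, and the definition of "mixed-use" (search combined with agent use or training) is an assumption drawn from the article's wording.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the three crawl purposes named in the article.
SEARCH, AGENT, TRAINING = "search", "agent", "training"


@dataclass(frozen=True)
class Crawler:
    """Hypothetical crawler record; not a Cloudflare data structure."""
    name: str
    purposes: frozenset  # purposes the crawler is known or declared to serve


def is_mixed_use(crawler: Crawler) -> bool:
    # Assumption: "mixed-use" means search combined with agent use or training.
    return SEARCH in crawler.purposes and bool(crawler.purposes & {AGENT, TRAINING})


def blocked_by_default(crawler: Crawler, page_has_ads: bool,
                       owner_allows: Optional[bool] = None) -> bool:
    """Return True if the crawler would be blocked from the page."""
    # An explicit site-owner setting always wins over the default.
    if owner_allows is not None:
        return not owner_allows
    # Default: block mixed-use crawlers on ad-supported pages only.
    return page_has_ads and is_mixed_use(crawler)


mixed = Crawler("mixed-bot", frozenset({SEARCH, TRAINING}))
search_only = Crawler("search-bot", frozenset({SEARCH}))

print(blocked_by_default(mixed, page_has_ads=True))                     # True
print(blocked_by_default(search_only, page_has_ads=True))               # False
print(blocked_by_default(mixed, page_has_ads=True, owner_allows=True))  # False
```

The sketch leaves out which accounts the default applies to (new customers, new sites from existing customers, and existing free customers) and how a crawler's purposes would be determined in practice.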

The practical effect is that AI model providers may face a more controlled route to publisher content. If a crawler wants access, Cloudflare is signaling that it should identify its purpose more clearly, separate its functions, or work through commercial arrangements that publishers can understand and approve.

Why the policy matters for publishers

Cloudflare says most website owners want to remain visible through search. Many also want their content to appear through AI services. The conflict is that publishers do not want their intellectual property to be taken and reused without protection or payment.

That tension has become sharper as AI companies use web content to train models and power agentic services. Publishers depend on discoverability, but they also need control over how their work is accessed, reused, and monetized.

Cloudflare frames the new defaults as a way to shift leverage back toward website owners. Instead of forcing publishers to choose between being found and being scraped, the company is trying to make crawler intent more transparent.

The source article also notes Cloudflare’s criticism of the “world’s largest search engine,” which it describes as having access to about “2x more information” than other AI companies, because customers find it difficult to stay discoverable in search without also having their content used for AI. Google has pushed back on that kind of characterization in the past, pointing to Google-Extended, a robots.txt control that lets site owners opt out of having content used for training and for AI products and services like Gemini Apps and Vertex API without affecting Google Search inclusion. At the same time, Googlebot itself crawls for Search, which includes AI features like AI Overviews and AI Mode.
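The opt-out Google points to works through robots.txt rules keyed on a user-agent token. The following sketch uses Python's standard `urllib.robotparser` to show how such rules evaluate per token; the robots.txt content and the `example.com` URL are illustrative assumptions, not taken from the article or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: opt out of the Google-Extended token,
# leave every other crawler (including Googlebot) allowed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Evaluate the robots.txt rules for one user-agent token (helper name is hypothetical)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.com/article"
print(can_fetch("Google-Extended", page))  # False
print(can_fetch("Googlebot", page))        # True
```

Because the rules here leave Googlebot allowed, crawling for Search, and for the AI features the article says sit inside Search, is unaffected by the Google-Extended rule. That gap is the one Cloudflare's criticism targets.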

Cloudflare’s broader AI content strategy

The crawler policy is part of a larger set of Cloudflare tools focused on publisher control in the AI era. The company has launched tools to combat AI bots and created a marketplace that lets websites charge AI bots for scraping, called Pay Per Crawl.

That product is now evolving into Pay Per Use. Cloudflare says the idea is to let publishers charge AI companies when their content creates value, not only when it is fetched by a crawler.

This distinction matters because scraping is only one part of the AI content pipeline. A publisher may care not just that a bot accessed a page, but that the content later appeared in an AI search result or helped power a premium AI experience. Pay Per Use is designed around that broader value exchange.

Cloudflare is initially working with Ceramic.ai and You.com to put the model into action. Publishers that opt in are paid when their content appears in Ceramic’s AI search results or when You.com accesses a piece of their premium content.

Cloudflare says other AI companies can customize the model for how they work. That leaves room for different commercial arrangements, while keeping the basic premise the same: publishers should have visibility and a path to compensation when AI systems benefit from their content.

The bandwidth and transparency argument

Cloudflare is also presenting the change as a technical efficiency issue. According to the source article, Cloudflare’s data suggests that more than 50% of traffic from AI crawlers goes to re-fetching pages that have not changed.

For publishers, that means AI crawling can consume bandwidth and compute resources even when the crawler is not collecting anything new. If AI companies can access content through clearer, more structured arrangements, Cloudflare argues that publishers may be able to conserve those resources.
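To make the re-fetching figure concrete, here is a minimal sketch of how a site could estimate the share of crawler fetches that returned content identical to the previous fetch of the same URL. The function name `unchanged_refetch_share` and the sample log are hypothetical, and this is not how Cloudflare computed its number.

```python
import hashlib


def unchanged_refetch_share(fetches):
    """Share of fetches whose body matches the previous fetch of the same URL.

    `fetches` is an iterable of (url, body) pairs in chronological order.
    """
    last_hash = {}  # url -> digest of the most recently served body
    total = 0
    unchanged = 0
    for url, body in fetches:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        total += 1
        if last_hash.get(url) == digest:
            unchanged += 1  # crawler fetched a page that had not changed
        last_hash[url] = digest
    return unchanged / total if total else 0.0


# Hypothetical crawl log: /a is fetched four times without changing,
# /b is fetched twice and changes in between.
log = [
    ("/a", "v1"), ("/a", "v1"), ("/a", "v1"),
    ("/b", "x"), ("/b", "y"),
    ("/a", "v1"),
]
print(unchanged_refetch_share(log))  # 0.5
```

Standard HTTP conditional requests (an `If-None-Match` header answered with `304 Not Modified`) are one existing way to avoid resending unchanged bodies. The article does not say whether Cloudflare's arrangements rely on them.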

The company also connects the policy to a wider change in internet traffic. Cloudflare co-founder and CEO Matthew Prince said, “Now that the majority of traffic on the Internet is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,” referring to the milestone where bots surpassed human traffic online for the first time. The source article notes that this shift was not expected to occur until next year.

Prince also said, “Cloudflare’s new tools and partnerships give website owners increased visibility and commercial opportunities and benefit AI companies that have bots with clear and transparent intent. We hope that our proposed default changes encourage mixed-use crawlers to separate out search from agent use and training.”

What happens next

The deadline gives AI companies time to decide how they want their crawlers to behave before September 15, 2026. Crawlers that continue to mix search, agent use, and training may find themselves blocked by default on eligible Cloudflare-protected sites.

For publishers, the change offers a more assertive default posture. They can still choose to allow access, but Cloudflare is making the starting point more protective for ad-supported pages.

For AI companies, the message is direct: crawler intent needs to be clearer, and content access may increasingly come with commercial expectations. Cloudflare is not only blocking mixed-use crawlers by default in certain cases; it is also building payment models that could turn publisher content into a licensed input for AI search, agents, and related services.