What is Cloudflare doing to AI crawlers, and why should a small business care?
Cloudflare is about to block certain AI crawlers by default. On September 15, 2026, Cloudflare will change its default settings to automatically block AI crawlers that, in its words, do not distinguish their intent. Site owners can opt out or customize the rules in their Cloudflare Security settings before that date. It’s a big deal because a huge share of the internet sits behind Cloudflare, so a default flip touches a lot of small business websites at once.
Here’s why you should care even if you’ve never logged into Cloudflare in your life. The tools people increasingly use to find a local business, such as ChatGPT, Perplexity, and Google’s AI answers, rely on crawlers reaching your website. Block the wrong ones, and you can quietly vanish from the exact answers where customers are deciding who to call.
Say you run a plumbing business. Someone types “best emergency plumber near me” into an AI assistant. If the crawler that powers that answer can’t fetch your site, you’re not in the running. You didn’t lose to a better plumber. You lost to a checkbox.
Will blocking AI crawlers stop AI from recommending your business?
It can, and this is the part that trips people up. Not all AI bots do the same job, and treating them as one blob is the mistake.
There are two very different kinds of AI crawlers:
- Search and answer crawlers. These fetch or index your page so an assistant can cite and recommend it inside an answer. This group includes OAI-SearchBot and ChatGPT-User (OpenAI), PerplexityBot (Perplexity), and Claude-SearchBot and Claude-User (Anthropic). These are how you show up when someone asks an AI for a recommendation. Google-Extended is a special case: it is a robots.txt control token rather than a separate crawler, and Google’s documentation says it governs use of your content for Gemini training and grounding without affecting inclusion in Google Search.
- Training scrapers. These ingest content to help train the underlying models. This group includes GPTBot (OpenAI), ClaudeBot (Anthropic), CCBot from Common Crawl, and Applebot-Extended (Apple). Blocking these does not remove you from live AI answers, because they’re not the ones fetching your page mid-conversation.
The danger is a blunt “block AI bots” setting that scoops up both. You might think you’re just opting out of model training, and accidentally cut off the answer crawlers that were sending you customers. That’s the digital equivalent of unplugging your phone to stop telemarketers and then wondering why nobody’s booking appointments. (Yes, that’s the dad joke. It only rings once.)
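If you or whoever manages your site is comfortable with a little scripting, the difference between a targeted block and a blunt one can be sketched in a few lines of Python. This is only an illustration: the bot names are a subset of the ones listed above, and the function name `audit_block_list` and the sample block lists are hypothetical, not part of any Cloudflare tool.

```python
# Illustrative subsets of the two groups described above.
ANSWER_CRAWLERS = {"OAI-SearchBot", "ChatGPT-User", "PerplexityBot"}
TRAINING_SCRAPERS = {"GPTBot", "CCBot", "Applebot-Extended"}


def audit_block_list(blocked):
    """Report which known bots a proposed block list would catch.

    `blocked` is a list of bot names you plan to block (hypothetical input).
    Matching is case-insensitive and by exact name.
    """
    blocked_lower = {name.lower() for name in blocked}
    known_lower = {n.lower() for n in ANSWER_CRAWLERS | TRAINING_SCRAPERS}
    return {
        # Answer crawlers here are the ones you probably did not mean to block.
        "answer_crawlers_blocked": sorted(
            n for n in ANSWER_CRAWLERS if n.lower() in blocked_lower
        ),
        "training_scrapers_blocked": sorted(
            n for n in TRAINING_SCRAPERS if n.lower() in blocked_lower
        ),
        # Names this sketch does not recognize; check them by hand.
        "unrecognized": sorted(
            name for name in blocked if name.lower() not in known_lower
        ),
    }


targeted = audit_block_list(["GPTBot", "CCBot"])
blunt = audit_block_list(["GPTBot", "PerplexityBot", "OAI-SearchBot"])
print("Targeted block catches answer crawlers:", targeted["answer_crawlers_blocked"])
print("Blunt block catches answer crawlers:", blunt["answer_crawlers_blocked"])
```

An empty list under `answer_crawlers_blocked` is the outcome you want if your goal is only to opt out of training.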
How much does being crawlable actually matter for AI visibility?
Being reachable is the price of admission. If an AI answer engine can’t fetch your page, it can’t quote you, and being quoted is what puts your name in front of a ready customer.
According to GEO: Generative Engine Optimization (Aggarwal et al., 2024), adding statistics and citing credible sources on a page can lift that page’s visibility in AI-generated answers by up to roughly 40 percent [1]. But that entire upside assumes one thing: the crawler can actually read the page. Optimize all you want. If the door is locked, none of it counts.
For a plumbing business, that means the goal isn’t just a good website. It’s a good website that answer engines are allowed to open.
Which AI crawlers should most local businesses keep allowed?
For most local service businesses, the practical answer is to keep the search and answer crawlers allowed and make a deliberate choice about training scrapers. Here’s a simple way to think about the common bots.
| Crawler | Type | What blocking it does |
|---|---|---|
| OAI-SearchBot, ChatGPT-User | Search / answer (OpenAI) | Can remove you from ChatGPT's live answers and citations |
| PerplexityBot | Search / answer (Perplexity) | Can remove you from Perplexity's cited results |
| Google-Extended | Control token (Google Gemini) | Opts your content out of Gemini training and grounding; per Google, does not affect inclusion in Google Search |
| Claude-SearchBot, Claude-User | Search / answer (Anthropic) | Can remove you from Claude's cited responses |
| GPTBot, ClaudeBot, CCBot, Applebot-Extended | Training scrapers | Blocks model training use; does not remove you from live answers |
The short version: blocking the training scrapers is a values call you’re allowed to make. Blocking the answer crawlers is usually an accident you don’t want.
What should you actually do before September 15, 2026?
Don’t panic, and don’t blanket-block. A few practical steps cover most of it.
- Check your AI crawl setting. If you’re on Cloudflare, open your dashboard and look for AI Crawl Control or the bot settings under Security. On another host, look for a similar AI bot control.
- Don’t flip a block-everything switch. If you want to opt out of training, target the training scrapers specifically instead of every AI bot.
- Keep your robots.txt permissive for answer engines. Make sure you’re not disallowing the search and answer crawlers there either. Two locked doors are no better than one.
- Confirm you’re actually crawlable. Being reachable is a prerequisite for being recommended. This is the whole point of the discipline people call GEO (generative engine optimization) or AEO (answer engine optimization): getting your business surfaced by answer engines.
- Strengthen your trust signals. Reviews, a complete profile, and clear service pages are what tip an AI toward recommending you once it can read you.
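For the robots.txt step, here is an optional sketch of how the rules could be sanity-checked with Python’s standard-library `urllib.robotparser`. The sample robots.txt, the `example.com` URL, and the helper name `crawl_permissions` are assumptions made for illustration; swap in your own file and page. It only evaluates robots.txt rules, so it cannot tell you whether Cloudflare or another firewall is blocking a bot at the network level.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of training scrapers, leave everything else open.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: CCBot
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""

# Subsets of the bots discussed in this article.
ANSWER_CRAWLERS = ["OAI-SearchBot", "ChatGPT-User", "PerplexityBot"]
TRAINING_SCRAPERS = ["GPTBot", "CCBot", "Applebot-Extended"]


def crawl_permissions(robots_txt, page_url, user_agents):
    """Return {bot name: True if robots.txt lets that bot fetch page_url}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, page_url) for ua in user_agents}


permissions = crawl_permissions(
    SAMPLE_ROBOTS_TXT,
    "https://example.com/services",  # placeholder page, not a real business site
    ANSWER_CRAWLERS + TRAINING_SCRAPERS,
)
for bot, allowed in permissions.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

With this sample file, the answer crawlers come back allowed and the training scrapers come back blocked, which matches the "target the training scrapers specifically" approach described in the list above.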
None of this requires a developer. It’s mostly a matter of knowing the setting exists and not treating “AI bots” as one big scary category.
How does Rhody Reviews fit into all this?
Rhody Reviews helps on the half of this you control every day: your trust signals. Once the answer crawlers can reach your site, what makes them pick you over the plumber down the street is a strong, current reputation. Rhody Reviews helps you build a steady flow of genuine reviews and keep your profile complete and accurate, which is exactly what AI answer engines lean on when they choose who to recommend.
Rhody Reviews also gives you a free AI Visibility Check so you can see how your business shows up right now when someone asks an AI assistant for a recommendation. It’s a fast way to catch a problem, including an accidental crawler block, before it quietly costs you calls.
Curious where you stand today? Run your free AI Visibility Check and start a free trial to build the review base that gets you recommended.
Sources
- [1] GEO: Generative Engine Optimization (Aggarwal et al., 2024). https://arxiv.org/abs/2311.09735