Recipe

robots.txt Patterns

Control crawler access with precision — from open-door indexing to surgical disallow rules.

Allow Everything

The simplest policy. Every bot can crawl every path.

User-agent: *
Allow: /

Disallow Everything

Block all compliant crawlers from the entire site.

User-agent: *
Disallow: /
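
The two blanket policies above can be sanity-checked offline before you deploy them. This is a minimal sketch using Python's standard urllib.robotparser; the bot name ExampleBot and the product URL are hypothetical placeholders, and the helper build_parser is introduced here for illustration only.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory; no network request is made."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


allow_all = build_parser("User-agent: *\nAllow: /\n")
block_all = build_parser("User-agent: *\nDisallow: /\n")

# ExampleBot and this URL are hypothetical placeholders.
url = "https://yoursite.com/products/widget"

print(allow_all.can_fetch("ExampleBot", url))  # True: every path is open
print(block_all.can_fetch("ExampleBot", url))  # False: every path is closed
```

The check only tells you what a compliant parser would conclude; it cannot stop a crawler that ignores robots.txt.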

Target Specific Bots

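Block GPTBot while allowing Googlebot. Groups do not cascade: a crawler obeys only the group whose User-agent line matches it most specifically and ignores every other group. Here Googlebot follows its own group alone, so the /admin and /api rules under * do not apply to it; repeat them in the Googlebot group if it should skip those paths too.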

User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /admin
Disallow: /api
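
To see which group each crawler ends up following, the file above can be fed to Python's standard urllib.robotparser. This is a sketch under assumptions: ExampleBot stands in for any crawler without its own group, and the helper allowed is hypothetical. Note that urllib.robotparser picks the first group whose name matches rather than the most specific one; for this file the two approaches give the same answers.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /admin
Disallow: /api
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on the (placeholder) site."""
    return parser.can_fetch(agent, "https://yoursite.com" + path)


# ExampleBot is a hypothetical crawler that falls through to the * group.
for agent, path in [
    ("GPTBot", "/"),          # blocked by its own group
    ("Googlebot", "/admin"),  # allowed: the * group is never consulted
    ("ExampleBot", "/admin"), # blocked by the * group
    ("ExampleBot", "/blog"),  # allowed: no rule in the * group matches
]:
    print(f"{agent:<11} {path:<7} {allowed(agent, path)}")
```

The Googlebot row is the one to watch: because its group contains only Allow: /, the /admin and /api restrictions never reach it.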

Sitemap + Crawl-Delay

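Point crawlers to your sitemap and throttle polite bots. Crawl-delay is a nonstandard directive: Bing honors it, while Googlebot ignores it. The Sitemap line sits outside any User-agent group and takes an absolute URL.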

User-agent: *
Disallow: /checkout
Disallow: /account
Crawl-delay: 10

Sitemap: https://yoursite.com/sitemap.xml
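
A crawler you write yourself can read both values with Python's standard urllib.robotparser. The sketch below assumes Python 3.8 or later (site_maps() was added in 3.8) and uses ExampleBot as a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout
Disallow: /account
Crawl-delay: 10

Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# ExampleBot is a hypothetical name; it falls through to the * group.
delay = parser.crawl_delay("ExampleBot")  # delay declared for this agent, or None
sitemaps = parser.site_maps()             # list of Sitemap URLs, or None

print(delay)
print(sitemaps)
```

A polite crawler would pause for the returned delay between requests and seed its queue from the sitemap URLs; both calls return None when the file declares nothing.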

Wildcard & Path Matching

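Use * to match any sequence of characters and $ to anchor a rule to the end of the URL. Rules match by prefix, so the trailing * in /tmp/* and /search?* is optional.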

User-agent: *
Disallow: /*.pdf$
Disallow: /tmp/*
Disallow: /search?*
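
Python's urllib.robotparser does not interpret * or $, so the patterns above need a small matcher of their own if you want to test them locally. The following is a minimal sketch, not a full robots.txt engine: rule_to_regex and rule_matches are hypothetical helpers that translate one rule into a regular expression, and the sample paths are made up for illustration.

```python
import re


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a robots.txt path rule into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the rule is a prefix match.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def rule_matches(rule: str, path: str) -> bool:
    """Return True if `path` (including any query string) matches `rule`."""
    return rule_to_regex(rule).match(path) is not None


print(rule_matches("/*.pdf$", "/docs/report.pdf"))        # True
print(rule_matches("/*.pdf$", "/docs/report.pdf?dl=1"))   # False: $ anchors the end
print(rule_matches("/tmp/*", "/tmp/cache/a.txt"))         # True
print(rule_matches("/search?*", "/search?q=robots"))      # True
print(rule_matches("/search?*", "/search"))               # False: no query string
```

The second line shows why the $ anchor matters: a PDF URL with a query string slips past /*.pdf$, so drop the $ if those URLs should be blocked as well.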

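Pro tip: Place robots.txt at the root of each host you serve, for example https://yoursite.com/robots.txt; a subdomain needs its own file. Crawlers fetch it before anything else, so serve it directly: no redirects, no auth walls. Test with Google Search Console or Bing Webmaster Tools before deploying.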
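
Because the file must live at the root, its location can be derived from any page URL on the same host. A small sketch using Python's standard urllib.parse; robots_url is a hypothetical helper and the URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs `page_url`.

    Scheme, host, and port are kept; path, query, and fragment are dropped.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://yoursite.com/blog/post?id=1"))
print(robots_url("https://shop.yoursite.com:8443/cart"))
```

The second call illustrates that a subdomain or a nonstandard port resolves to a different robots.txt than the main site.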