robots.txt Checker — Validate robots.txt Rules

Validate your robots.txt file and check which crawlers are allowed or blocked. Syntax error detection included.

About robots.txt Checker — Validate robots.txt Rules

Robots.txt Checker validates your robots.txt file instantly. It detects syntax errors, Disallow/Allow rule conflicts, missing Sitemap directives, and full-site blocking mistakes. Unique feature: AI Crawler Block Audit — see whether GPTBot, ClaudeBot, Google-Extended, Bytespider, and 11 more AI bots are blocked or allowed, displayed in a clear visual matrix. Enter a URL to auto-fetch, or paste your robots.txt directly. Free, no sign-up.

How to Use

  1. To check by URL, select the "URL" tab, enter your domain (e.g. https://example.com), and click "Check." The tool fetches your robots.txt automatically.
  2. To check by text, select the "Text" tab and paste your robots.txt content directly.
  3. Review the syntax check results. Errors (red), warnings (yellow), and OK (green) indicators show the status of each rule.
  4. Check the AI Crawler Block Matrix. It shows whether GPTBot, ClaudeBot, Google-Extended and 12 more AI bots are blocked or allowed.
  5. Use the URL Path Test to check if a specific path (e.g. /admin/login) is crawlable by a specific User-agent.
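
For readers who want to reproduce the auto-fetch in step 1 themselves, the first part is working out where robots.txt lives for a given address. The sketch below is only an illustration of that step, not the tool's actual code; the function name robots_url and the choice to default bare domains to HTTPS are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(site: str) -> str:
    """Return the robots.txt URL for the site that `site` belongs to."""
    # Assumption: accept bare domains such as "example.com" by defaulting to HTTPS.
    if "://" not in site:
        site = "https://" + site
    parts = urlsplit(site)
    # robots.txt sits at the root of the scheme + host (+ port);
    # any path or query string in the input is discarded.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=1"))
print(robots_url("example.com"))
```

Both calls resolve to the same root-level file, which is why entering any page URL from a site is enough to locate its robots.txt.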

Features

  • Enter a URL and the tool auto-fetches your robots.txt — no copy-paste needed
  • AI Crawler Block Audit: see block status for 15 AI bots (GPTBot, ClaudeBot, Google-Extended, Bytespider, and more)
  • Instantly detect the most dangerous mistake: Disallow: / blocking your entire site from Google
  • Find contradictions between Allow and Disallow rules that silently hurt your SEO
  • Free, no sign-up, works in your browser. Found a problem? Fix it instantly with our robots.txt Generator.

robots.txt Syntax Rules and Common Error Patterns

A robots.txt file follows simple syntax rules, but small mistakes can prevent crawlers from behaving as intended. Understanding common error patterns helps you catch problems before they affect your SEO.

Syntax Rules: User-agent, Disallow, Allow, and Sitemap Format

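Each line in robots.txt follows the format "Directive: value." The commonly used directives are User-agent, Disallow, Allow, Sitemap, and Crawl-delay. Crawl-delay is not part of the Robots Exclusion Protocol standard (RFC 9309), and some crawlers, including Googlebot, ignore it. Directive names are case-insensitive by specification, but the convention is to capitalize the first letter (User-agent, not user-agent). Path values are case-sensitive and must start with a forward slash (/); an empty Disallow value is also valid and blocks nothing. By convention, blank lines separate User-agent blocks. Comments start with #.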
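
The line format described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the checker's implementation: the helper name parse_line is hypothetical, and the sketch normalizes directive names to lowercase to reflect their case-insensitivity.

```python
from typing import Optional, Tuple


def parse_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one robots.txt line into (directive, value).

    Returns None for blank lines and comment-only lines.
    """
    # Drop a trailing "# comment", then surrounding whitespace.
    content = line.split("#", 1)[0].strip()
    if not content:
        return None
    # Split on the first colon only, so URL values keep their own colons.
    directive, sep, value = content.partition(":")
    if not sep:
        raise ValueError(f"missing ':' in line: {line!r}")
    # Directive names are case-insensitive; values are left untouched.
    return directive.strip().lower(), value.strip()


print(parse_line("Disallow: /admin/  # private area"))
print(parse_line("Sitemap: https://example.com/sitemap.xml"))
```

Splitting on the first colon matters for Sitemap lines, whose values are full URLs.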

Common Syntax Errors Detected by This Tool

This checker detects: (1) Unknown or misspelled directives (Disalow, User_agent), (2) Missing leading slash on paths (Disallow: admin/ instead of /admin/), (3) Disallow/Allow lines before any User-agent declaration, (4) Empty User-agent blocks with no rules, (5) BOM or non-UTF-8 encoding issues. Each error is flagged with a line number and a specific fix suggestion.
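
To make these error classes concrete, here is a simplified lint pass. It is a sketch under assumptions, not the tool's real logic: the function name lint and the message strings are invented for illustration, paths starting with a * wildcard are accepted, only a trailing rule-less User-agent block is reported, and non-UTF-8 detection is left out because it needs the raw bytes rather than decoded text.

```python
KNOWN = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint(text: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for a few common mistakes."""
    problems = []
    if text.startswith("\ufeff"):
        problems.append((1, "file starts with a byte order mark (BOM)"))
        text = text.lstrip("\ufeff")
    seen_agent = False      # has any User-agent line appeared yet?
    open_agent_line = None  # User-agent line still waiting for its first rule
    for number, raw in enumerate(text.splitlines(), start=1):
        content = raw.split("#", 1)[0].strip()
        if not content:
            continue
        name, _, value = content.partition(":")
        name, value = name.strip().lower(), value.strip()
        if name not in KNOWN:
            problems.append((number, f"unknown directive {name!r}"))
        elif name == "user-agent":
            seen_agent = True
            if open_agent_line is None:
                open_agent_line = number
        elif name in ("allow", "disallow"):
            if not seen_agent:
                problems.append((number, "rule appears before any User-agent"))
            # Assumption: a leading "*" wildcard is tolerated alongside "/".
            if value and not value.startswith(("/", "*")):
                problems.append((number, "path should start with '/'"))
            open_agent_line = None
    if open_agent_line is not None:
        problems.append((open_agent_line, "User-agent block has no rules"))
    return problems


sample = (
    "Disallow: /early\n"
    "User_agent: *\n"
    "User-agent: *\n"
    "Disalow: /x\n"
    "Disallow: admin/\n"
    "User-agent: Foo\n"
)
for line_number, message in lint(sample):
    print(f"line {line_number}: {message}")
```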

Full-Site Block Detection: The Most Dangerous Mistake

The combination of User-agent: * and Disallow: / tells every compliant crawler that has no more specific group of its own to stay out of your entire site. This tool automatically detects this pattern and displays a critical red warning. While this may be intentional for staging environments, on a production site it stops Google from crawling your pages, which over time can drop them from search results or leave them listed without a description.
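
A check for this pattern can be sketched as follows. The function name blocks_entire_site is hypothetical, and the sketch assumes that any non-empty Allow rule in the same group counts as counteracting the block, which may be looser or stricter than what the tool actually does.

```python
def blocks_entire_site(text: str) -> bool:
    """True if the `User-agent: *` group has `Disallow: /` and no Allow rule."""
    agents = []       # user-agents named by the group currently being read
    in_rules = False  # True once the current group has at least one rule
    star_disallow_all = False
    star_has_allow = False
    for raw in text.splitlines():
        content = raw.split("#", 1)[0].strip()
        name, sep, value = content.partition(":")
        if not sep:
            continue  # blank, comment-only, or malformed line
        name, value = name.strip().lower(), value.strip()
        if name == "user-agent":
            if in_rules:
                # A User-agent line that follows rules starts a new group.
                agents, in_rules = [], False
            agents.append(value)
        elif name in ("allow", "disallow"):
            in_rules = True
            if "*" in agents:
                if name == "disallow" and value == "/":
                    star_disallow_all = True
                elif name == "allow" and value:
                    star_has_allow = True
    return star_disallow_all and not star_has_allow


print(blocks_entire_site("User-agent: *\nDisallow: /\n"))
print(blocks_entire_site("User-agent: *\nDisallow: /\nAllow: /public/\n"))
```

A group that names only a specific bot, such as GPTBot, does not trigger the check, because it blocks that bot alone.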

Testing robots.txt Rules and AI Crawler Audit

Validating syntax is only the first step. You also need to verify that your rules produce the intended crawl behavior, especially for AI training bots that have become a major concern since 2024.

URL Path Test: Check If Specific Pages Are Crawlable

The URL Path Test lets you enter any path (e.g. /admin/login) and a User-agent (e.g. Googlebot) to check whether that combination is allowed or blocked. Rules are evaluated using prefix matching, where longer (more specific) paths take precedence. This mirrors the longest-match rule in RFC 9309, which major search engine crawlers such as Googlebot follow.
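
The longest-prefix rule is easy to get wrong by hand, so a small sketch may help. It assumes the rules for the relevant User-agent group have already been collected into (kind, prefix) pairs, ignores the * and $ wildcards, and resolves ties in favor of Allow as RFC 9309 recommends; whether the tool handles ties the same way is an assumption. The name is_allowed is hypothetical.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Decide whether `path` is crawlable under one group's rules.

    `rules` holds ("allow" | "disallow", path_prefix) pairs.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty values and non-matching prefixes are skipped
        is_allow = kind == "allow"
        # The longer prefix wins; on a tie, prefer Allow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, allowed = len(prefix), is_allow
    return allowed


rules = [("disallow", "/admin/"), ("allow", "/admin/login")]
for path in ("/admin/login", "/admin/users", "/blog"):
    print(path, "allowed" if is_allowed(rules, path) else "blocked")
```

Here /admin/login is allowed because the Allow prefix is longer than the Disallow prefix that also matches it, while /admin/users matches only the Disallow rule.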

AI Crawler Block Audit: 15 Bots at a Glance

Since 2024, controlling AI training crawlers has become as important as traditional SEO. This tool checks block status for GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended, Bytespider (ByteDance), PerplexityBot, and 10 more AI bots. The visual matrix lets you see at a glance which bots can access your content. Google Search Console cannot show this information.
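
A rough version of such an audit can be built on Python's standard urllib.robotparser module. This is a sketch, not the tool's implementation: the bot list below contains only the five names mentioned in the text (the full list of 15 is not given), the function name audit is hypothetical, and the standard-library parser's rule matching may differ from longest-match evaluation in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Only the bots named in the text; the tool's full list of 15 is not given.
AI_BOTS = ("GPTBot", "ClaudeBot", "Google-Extended", "Bytespider", "PerplexityBot")


def audit(robots_txt: str, bots=AI_BOTS) -> dict[str, bool]:
    """Map each bot name to True if it may fetch the site root, else False."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, "/") for bot in bots}


sample = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

# Print a plain-text stand-in for the visual matrix.
for bot, allowed in audit(sample).items():
    print(f"{bot:<16} {'allowed' if allowed else 'blocked'}")
```

Testing only the root path keeps the sketch short; a bot blocked from specific directories but not from / would show as allowed here.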

When to Use This Tool vs. Google Search Console

Google Search Console's robots.txt tester is the authoritative tool for Googlebot-specific behavior and requires site ownership verification. This checker covers all crawlers (including AI bots), requires no account, and works instantly. For production sites, use both: this tool for broad coverage and AI audit, and GSC for Googlebot-specific verification.

FAQ

What does this robots.txt checker validate?
It checks for syntax errors (invalid directives, typos, missing slashes), Disallow/Allow contradictions, full-site blocking (Disallow: / under User-agent: *), missing Sitemap directives, and AI crawler block status for 15 bots including GPTBot, ClaudeBot, and Google-Extended.

Can I check robots.txt just by entering a URL?
Yes. Enter your domain URL (e.g. https://example.com) and the tool automatically fetches /robots.txt from the server and analyzes it. You can also paste robots.txt text directly in the Text tab.

What happens if my site has no robots.txt?
If robots.txt returns a 404, the tool reports that no robots.txt was found. This means all crawlers can access your entire site freely. A link to the robots.txt Generator is provided so you can create one.

How is the AI crawler block status displayed?
A visual matrix shows 15 AI crawlers (GPTBot, ClaudeBot, Google-Extended, Bytespider, PerplexityBot, etc.) with blocked (red) or allowed (green) status icons. You can see at a glance which AI bots can scrape your content.

Can it detect the Disallow: / full-site block mistake?
Yes. If User-agent: * has Disallow: / with no counteracting Allow rules, the tool displays a critical red warning stating your entire site is blocked from all crawlers. This is the most dangerous robots.txt configuration error.

How is this different from Google Search Console's robots.txt tester?
Google Search Console only tests Googlebot and requires site ownership verification. This tool tests all crawlers including 15 AI bots, requires no account, and works instantly with just a URL. For comprehensive coverage, use both tools together.

What should I do if problems are found?
Click the "Fix with robots.txt Generator" button to create a corrected robots.txt file using our visual builder. Each error message also includes specific guidance on what to fix.
