SEO · Feb 8, 2026

robots.txt: The Complete Guide for SEO

robots.txt is the first file search engines look at when they visit your site. A misconfigured robots.txt can tank your SEO or accidentally expose private pages. Here's how to get it right.

What is robots.txt?

robots.txt is a plain text file at the root of your website (e.g., https://example.com/robots.txt) that tells search engine crawlers which pages to crawl and which to skip. It follows the Robots Exclusion Protocol, a convention dating back to 1994 (formalized as RFC 9309 in 2022) that every major search engine still honors.

Important: robots.txt is a suggestion, not enforcement. Well-behaved crawlers (Google, Bing, etc.) respect it. Malicious bots ignore it. Never use robots.txt to hide sensitive pages — use authentication or access control instead.

Basic syntax

# Allow all crawlers to access everything
User-agent: *
Allow: /

# Block all crawlers from /admin
User-agent: *
Disallow: /admin

# Point to your sitemap
Sitemap: https://example.com/sitemap.xml
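To see how a crawler would interpret rules like these, you can evaluate them locally with Python's standard-library urllib.robotparser — a quick sketch using the example paths above (example.com and /products are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Parse the example rules directly -- no network call needed
rules = [
    "User-agent: *",
    "Disallow: /admin",
]

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://example.com/products"))     # True
print(rp.can_fetch("*", "https://example.com/admin/login"))  # False
```

One caveat: Python's parser applies the first matching rule, while Google resolves conflicts by the longest matching rule, so results can differ when Allow and Disallow rules overlap.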

Common mistakes that hurt SEO

Blocking your entire site

User-agent: *
Disallow: /

This tells all crawlers not to crawl any page. It usually happens when a dev/staging robots.txt ships to production unchanged. Your pages will drop out of search results — already-indexed URLs may linger for a while, but without titles or descriptions.
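A small pre-deploy check can catch this mistake before it goes live. The function name and the example.com URL below are hypothetical, for illustration only:

```python
from urllib.robotparser import RobotFileParser

def blocks_everything(robots_txt: str, agent: str = "Googlebot") -> bool:
    """Return True if these rules block the given crawler from the homepage."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return not rp.can_fetch(agent, "https://example.com/")

# A leftover staging file: every crawler locked out
staging = "User-agent: *\nDisallow: /"
print(blocks_everything(staging))  # True
```

Wire a check like this into CI and fail the build if the homepage is unreachable to crawlers.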

Blocking CSS and JavaScript

User-agent: *
Disallow: /static/
Disallow: /assets/

Google needs to render your pages to understand them. Blocking CSS/JS files means Google sees a broken version of your site, which hurts rankings. Only block files you genuinely don't want crawled.
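If part of a directory genuinely must stay out of search, carve out the render-critical assets instead of blocking the whole tree. The paths below are hypothetical; per RFC 9309, Google resolves conflicting rules by the longest match, so the more specific Allow wins:

```
# Block the directory, but keep the CSS/JS Google needs for rendering
User-agent: *
Disallow: /static/
Allow: /static/css/
Allow: /static/js/
```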

No sitemap reference

Always include a Sitemap: directive in your robots.txt. This helps crawlers find your XML sitemap faster, especially for new sites.

Using robots.txt for security

Blocking /admin in robots.txt actually tells attackers exactly where your admin panel is. Use proper authentication instead — robots.txt is public.

Best practices

  • Keep robots.txt as permissive as possible — only block what's truly unnecessary for search
  • Always include a Sitemap directive
  • Don't block CSS, JavaScript, or image files unless you have a specific reason
  • Use Google Search Console's robots.txt report (the standalone tester tool was retired) or a third-party validator to check changes before deploying
  • Check your robots.txt after every deployment — framework migrations can reset it
  • Remember that Disallow doesn't remove pages already indexed — use a noindex meta tag, and keep the page crawlable so Google can actually see the tag
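The last point trips people up: a page blocked by Disallow can't be recrawled, so Google never sees a noindex on it. To deindex a page, leave it crawlable and serve the directive on the page itself:

```
<!-- In the <head> of the page you want removed from the index -->
<meta name="robots" content="noindex">
```

For non-HTML resources (PDFs, images), the equivalent is the X-Robots-Tag HTTP response header.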

Analyze any site's robots.txt

Use our Robots.txt Analyzer to instantly fetch and parse any site's robots.txt — see which paths are blocked, which user-agents have rules, and whether a sitemap is declared.
