robots.txt: The Complete Guide for SEO
robots.txt is the first file search engines look at when they visit your site. A misconfigured robots.txt can tank your SEO or accidentally expose private pages. Here's how to get it right.
What is robots.txt?
robots.txt is a plain text file at the root of your website (e.g., https://example.com/robots.txt) that tells search engine crawlers which pages to crawl and which to skip. It follows the Robots Exclusion Protocol, a convention dating from 1994 (formalized as RFC 9309 in 2022) that every major search engine still honors.
Important: robots.txt is a suggestion, not enforcement. Well-behaved crawlers (Google, Bing, etc.) respect it. Malicious bots ignore it. Never use robots.txt to hide sensitive pages — use authentication or access control instead.
Basic syntax
# Allow all crawlers to access everything
User-agent: *
Allow: /
# Block all crawlers from /admin
User-agent: *
Disallow: /admin
# Point to your sitemap
Sitemap: https://example.com/sitemap.xml
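Rules like these can be sanity-checked with Python's standard-library parser. The sketch below parses an inline rule set instead of fetching over the network; the rules and URLs are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules — parse from a string rather than fetching a live site.
rules = """\
User-agent: *
Disallow: /admin
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("*", "https://example.com/admin/login"))  # False
print(parser.can_fetch("*", "https://example.com/blog/post"))    # True
```

The same parser can fetch a live file with `parser.set_url(...)` followed by `parser.read()`.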
Common mistakes that hurt SEO
Blocking your entire site
User-agent: *
Disallow: /

This tells all crawlers not to crawl any page. It's a common mistake after a dev/staging site goes live with the old robots.txt still in place. Your pages won't appear in search results.
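One way to catch this before it ships is a small pre-deploy check. The function below is a sketch using Python's standard-library parser; the example.com URL is just a placeholder for your homepage:

```python
from urllib.robotparser import RobotFileParser

def blocks_everything(robots_txt: str) -> bool:
    """Return True if these rules disallow all crawling for a generic crawler."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # If even the homepage is unfetchable for "*", assume a blanket block.
    return not parser.can_fetch("*", "https://example.com/")

staging_rules = "User-agent: *\nDisallow: /"
print(blocks_everything(staging_rules))  # True — fail the deploy
```

Wired into CI, a `True` result on the production robots.txt can fail the build before the bad file reaches crawlers.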
Blocking CSS and JavaScript
Disallow: /static/
Disallow: /assets/

Google needs to render your pages to understand them. Blocking CSS/JS files means Google sees a broken version of your site, which hurts rankings. Only block files you genuinely don't want crawled.
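If you must disallow a directory that also contains render-critical assets, more specific Allow rules can carve them back out: under RFC 9309 (which Google follows), the rule with the longest matching path wins. The paths below are hypothetical:

```
User-agent: *
Disallow: /internal/
# Longer (more specific) paths win, so these re-allow render assets
Allow: /internal/css/
Allow: /internal/js/
```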
No sitemap reference
Always include a Sitemap: directive in your robots.txt. This helps crawlers find your XML sitemap faster, especially for new sites.
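A declared sitemap is also machine-readable. Python's standard-library parser exposes it via `site_maps()` (available since Python 3.8); the rules below are illustrative:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())
print(parser.site_maps())  # ['https://example.com/sitemap.xml']
```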
Using robots.txt for security
Blocking /admin in robots.txt actually tells attackers exactly where your admin panel is. Use proper authentication instead — robots.txt is public.
Best practices
- Keep robots.txt as permissive as possible — only block what's truly unnecessary for search
- Always include a Sitemap directive
- Don't block CSS, JavaScript, or image files unless you have a specific reason
- Validate changes before deploying, e.g. with Google Search Console's robots.txt report, which shows how Google fetched and parsed your file
- Check your robots.txt after every deployment — framework migrations can reset it
- Remember that Disallow doesn't remove pages that are already indexed — use a noindex meta tag instead (and leave the page crawlable so crawlers can see the tag)
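For reference, removing an already-indexed page is done on the page itself rather than in robots.txt. A meta tag like the following (or, for non-HTML resources, an `X-Robots-Tag: noindex` response header) tells crawlers to drop the page:

```html
<!-- In the page's <head>. The page must stay crawlable,
     or the crawler never sees this tag. -->
<meta name="robots" content="noindex">
```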
Analyze any site's robots.txt
Use our Robots.txt Analyzer to instantly fetch and parse any site's robots.txt — see which paths are blocked, which user-agents have rules, and whether a sitemap is declared.