Sitemap Checker
Analyze a website's XML sitemap. Check URL count, last modified dates, and common issues.
What is an XML Sitemap?
An XML sitemap is a file that lists all the important URLs on your website. It acts as a roadmap
for search engine crawlers like Googlebot and Bingbot, helping them discover and index your pages
more efficiently. Sitemaps are typically located at the root of a domain, e.g. example.com/sitemap.xml.
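For reference, a minimal sitemap follows the standard sitemaps.org format: a urlset element containing one url entry per page (the URLs here are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-05-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>
```

Only the loc element is required for each URL; lastmod is optional but recommended.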
Why do sitemaps matter for SEO?
- Help search engines discover new and updated pages faster
- Provide metadata like last modified date and change frequency for smarter crawling
- Essential for large sites, new sites with few backlinks, and sites with dynamic content
- Allow you to signal relative page importance via the priority tag (note that Google has stated it ignores priority and changefreq, so treat these as optional hints)
- Sitemap indexes let you organize thousands of URLs across multiple files
Common sitemap issues
- Missing sitemap entirely -- search engines rely on crawling alone
- No lastmod dates -- crawlers cannot tell which pages changed recently
- Duplicate URLs -- wastes crawl budget and can confuse indexing
- Broken child sitemaps in a sitemap index -- pages in those files are not discovered
- Exceeding the 50,000 URL or 50 MB limit per sitemap file
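Several of the checks above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sitemap XML has already been fetched as a string; the check_sitemap function is hypothetical, not part of any library:

```python
# Minimal sitemap check sketch: count URLs, flag duplicates, and flag
# entries missing a lastmod date. Assumes the sitemap XML is already in hand.
import xml.etree.ElementTree as ET
from collections import Counter

# The standard sitemap namespace from the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def check_sitemap(xml_text):
    """Return (url_count, duplicate_urls, urls_missing_lastmod)."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS)]
    counts = Counter(locs)
    duplicates = sorted(u for u, n in counts.items() if n > 1)
    missing_lastmod = [
        url.find("sm:loc", NS).text.strip()
        for url in root.findall("sm:url", NS)
        if url.find("sm:lastmod", NS) is None
    ]
    return len(locs), duplicates, missing_lastmod

# Example sitemap with one duplicate URL and two entries lacking lastmod.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

count, dupes, no_lastmod = check_sitemap(sample)
print(count, dupes, no_lastmod)
# → 3 ['https://example.com/about'] ['https://example.com/about', 'https://example.com/about']
```

A real checker would also fetch each child sitemap in an index and verify it returns HTTP 200, and warn when the URL count exceeds 50,000.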
Frequently Asked Questions
Do I need a sitemap for SEO?
While not mandatory, sitemaps are strongly recommended — especially for new sites with few backlinks, large sites, and sites with dynamic or frequently updated content. Google uses sitemaps to prioritize and schedule crawling.
How large can a sitemap be?
A single sitemap can contain at most 50,000 URLs and must be no larger than 50 MB uncompressed. For larger sites, use a sitemap index file that references multiple individual sitemap files.
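A sitemap index follows the same sitemaps.org format but lists child sitemap files instead of pages (the file names here are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2024-05-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
</sitemapindex>
```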
How do I submit my sitemap to Google?
Submit your sitemap in Google Search Console under Indexing > Sitemaps. Enter the full URL and click Submit. You can also reference it in your robots.txt: Sitemap: https://example.com/sitemap.xml
Should I include every page in my sitemap?
Only include canonical, indexable pages with real value. Exclude pages with noindex directives, paginated pages (unless specifically needed), thank-you pages, internal search results, and any page already blocked in robots.txt.