Sitemap Checker

Analyze a website's XML sitemap. Check URL count, last modified dates, and common issues.

What is an XML Sitemap?

An XML sitemap is a file that lists all the important URLs on your website. It acts as a roadmap for search engine crawlers like Googlebot and Bingbot, helping them discover and index your pages more efficiently. Sitemaps are typically located at the root of a domain, e.g. https://example.com/sitemap.xml.
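Reading a sitemap programmatically is straightforward because the format is plain XML with a single standard namespace. Here is a minimal sketch using only Python's standard library; the sample document and URLs are illustrative, and a real checker would fetch the XML over HTTP first:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def parse_sitemap(xml_text):
    """Return a list of (loc, lastmod) pairs from sitemap XML."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc.strip(), lastmod))
    return entries

# Example sitemap with two URLs, one of which has no lastmod date
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

entries = parse_sitemap(sample)
```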

Why do sitemaps matter for SEO?

  • Help search engines discover new and updated pages faster
  • Provide metadata like last modified date and change frequency for smarter crawling
  • Essential for large sites, new sites with few backlinks, and sites with dynamic content
  • Allow you to signal which pages are most important via the optional priority tag (note that Google states it ignores this value)
  • Sitemap indexes let you organize thousands of URLs across multiple files
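Putting that metadata together, a single sitemap entry looks like the following sketch (the URL, date, and values are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/pricing</loc>
    <lastmod>2024-05-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

Only loc is required; lastmod, changefreq, and priority are optional hints for crawlers.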

Common sitemap issues

  • Missing sitemap entirely -- search engines rely on crawling alone
  • No lastmod dates -- crawlers cannot tell which pages changed recently
  • Duplicate URLs -- wastes crawl budget and can confuse indexing
  • Broken child sitemaps in a sitemap index -- pages in those files are not discovered
  • Exceeding the 50,000 URL or 50 MB limit per sitemap file
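Several of the issues above can be detected mechanically once the sitemap has been parsed into (loc, lastmod) pairs. A rough sketch of such checks, assuming a list of pairs like the one a parser would produce:

```python
from collections import Counter

MAX_URLS = 50_000  # per-file limit from the sitemap protocol

def find_issues(entries):
    """entries: list of (loc, lastmod) pairs. Returns a list of issue strings."""
    issues = []
    locs = [loc for loc, _ in entries]
    # Duplicate URLs waste crawl budget
    dupes = [loc for loc, n in Counter(locs).items() if n > 1]
    if dupes:
        issues.append(f"duplicate URLs: {dupes}")
    # Missing lastmod dates hide recency from crawlers
    missing = sum(1 for _, lastmod in entries if not lastmod)
    if missing:
        issues.append(f"{missing} URL(s) missing lastmod")
    # More than 50,000 URLs requires a sitemap index instead
    if len(entries) > MAX_URLS:
        issues.append(f"{len(entries)} URLs exceeds the {MAX_URLS} per-file limit")
    return issues

issues = find_issues([
    ("https://example.com/", "2024-05-01"),
    ("https://example.com/", "2024-05-01"),   # duplicate
    ("https://example.com/about", None),      # no lastmod
])
```

This only covers format-level problems; checking for broken child sitemaps would additionally require fetching each referenced file.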

Frequently Asked Questions

Do I need a sitemap for SEO?

While not mandatory, sitemaps are strongly recommended — especially for new sites with few backlinks, large sites, and sites with dynamic or frequently updated content. Google uses sitemaps to prioritize and schedule crawling.

How large can a sitemap be?

A single sitemap can contain at most 50,000 URLs and must be no larger than 50 MB uncompressed. For larger sites, use a sitemap index file that references multiple individual sitemap files.
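A sitemap index that splits a large site across multiple files looks like this sketch (file names are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2024-05-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
</sitemapindex>
```

Each child file is itself an ordinary sitemap subject to the same 50,000 URL / 50 MB limits.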

How do I submit my sitemap to Google?

Submit your sitemap in Google Search Console under Indexing > Sitemaps. Enter the full sitemap URL and click Submit. You can also reference it in your robots.txt with a line like:

Sitemap: https://example.com/sitemap.xml
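Because the robots.txt reference is just a plain-text field, a checker can discover a site's sitemaps without guessing the URL. A minimal sketch, assuming the robots.txt body has already been fetched (the field name is matched case-insensitively):

```python
def sitemap_urls_from_robots(robots_txt):
    """Extract sitemap URLs declared in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own "https:" survives
        field, _, value = line.partition(":")
        if field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls

robots = """User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

found = sitemap_urls_from_robots(robots)
```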

Should I include every page in my sitemap?

Only include canonical, indexable pages with real value. Exclude pages with noindex directives, paginated pages (unless specifically needed), thank-you pages, internal search results, and any page already blocked in robots.txt.