XML Sitemaps: The Complete Guide to Getting Indexed Faster
A well-structured XML sitemap tells search engines exactly what to crawl, when pages were updated, and which pages matter most. Here's everything you need to know.
What is an XML sitemap?
An XML sitemap is a file (usually at /sitemap.xml) that lists URLs on your website along with optional metadata — when they were last modified, how frequently they change, and their relative importance. It's written in XML format specifically for search engine crawlers, not human readers.
Think of it as a table of contents for your website that you hand directly to Google and Bing.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/tools/ssl-checker</loc>
    <lastmod>2026-02-16</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
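To make the structure concrete, here is a minimal Python sketch of how a tool might read a sitemap like the one above. It uses only the standard library; the `parse_sitemap` function name, the `SAMPLE` constant, and the `sm` prefix are illustrative choices, not part of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol; "sm" is an arbitrary local alias.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/tools/ssl-checker</loc>
    <lastmod>2026-02-16</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>"""


def parse_sitemap(xml_text):
    """Return one dict per <url> entry; optional fields are None when absent."""
    # Pass bytes so the parser applies the encoding named in the XML declaration.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


for entry in parse_sitemap(SAMPLE):
    print(entry["loc"], entry["lastmod"])
```

Every element lives in the sitemap namespace, so lookups without the namespace mapping would silently return nothing.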
Does Google actually use sitemaps?
Yes, but with nuance. Google's John Mueller has confirmed that sitemaps are one of the primary ways Googlebot discovers new URLs. However, having a sitemap doesn't guarantee indexing — Google still evaluates page quality, duplicate content, and crawl budget.
Sitemaps are most valuable when:
- Your site is new with few inbound links
- You have a large site with thousands of pages
- Pages are added or updated frequently
- Your site has poor internal linking
- You have rich media content (images, video) that needs discovery
Types of sitemaps
Standard XML sitemap
Lists page URLs with optional lastmod, changefreq, and priority elements. The most common type. A single sitemap file can hold up to 50,000 URLs or 50 MB uncompressed, whichever limit is reached first.
Sitemap index file
A sitemap that references other sitemaps. Used when your site exceeds 50,000 URLs. The index lives at /sitemap.xml and points to individual sitemap files like /sitemap-blog.xml and /sitemap-products.xml.
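As a rough illustration of the split, the Python sketch below chunks a URL list at the 50,000-URL limit and builds an index pointing at the resulting files. The `chunk_urls` and `build_sitemap_index` helpers and the `/sitemap-N.xml` naming are assumptions made for this example, and it does not check the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_locs):
    """Serialize a <sitemapindex> that points at each child sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locs:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with 120,000 pages, which needs three child sitemaps.
all_urls = [f"https://example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)
child_locs = [f"https://example.com/sitemap-{i + 1}.xml" for i in range(len(chunks))]
index_xml = build_sitemap_index(child_locs)
print(len(chunks), "child sitemaps")
```

Each chunk would then be written out as its own standard sitemap file at the matching URL.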
Image and video sitemaps
Extend the standard sitemap with image or video-specific tags. Help Google discover media content, especially if the images aren't directly linked in HTML.
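For illustration, the Python sketch below emits an image sitemap using Google's image extension namespace. The `build_image_sitemap` helper and the example URLs are hypothetical, and only the basic `image:image` and `image:loc` tags are shown.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Make the serializer emit the conventional "image:" prefix instead of "ns0:".
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """`pages` maps a page URL to the list of image URLs shown on that page."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url, image_urls in pages.items():
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical gallery page with two images.
image_sitemap_xml = build_image_sitemap({
    "https://example.com/gallery": [
        "https://example.com/img/a.jpg",
        "https://example.com/img/b.jpg",
    ],
})
print(image_sitemap_xml)
```

The images are nested inside the `url` entry of the page that displays them, which is how the extension associates media with a page.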
Common sitemap mistakes
- Including non-canonical URLs: only include the canonical version of each page. If /page and /page/ both work, pick one and only include it.
- Including noindex pages: never put pages with a noindex meta tag in your sitemap. It sends conflicting signals to Google.
- Stale lastmod dates: if you update lastmod without actually changing the page, Googlebot learns to ignore it. Only update lastmod when content genuinely changes.
- Wrong priority values: priority only signals relative importance within your own site, not compared to other sites. Setting everything to 1.0 defeats the purpose.
- Not submitting to Google Search Console: adding your sitemap URL in GSC allows Google to report crawl errors, indexing status, and warnings directly to you.
- Forgetting to reference in robots.txt: add Sitemap: https://example.com/sitemap.xml at the bottom of robots.txt so any crawler can find it automatically.
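Some of these mistakes can be caught mechanically. The Python sketch below checks for trailing-slash duplicates, a sitemap where every priority is 1.0, and a missing Sitemap line in robots.txt. The `lint_sitemap` and `sitemaps_in_robots` helpers are hypothetical names for this example. Detecting noindex pages or inaccurate lastmod dates would require fetching each page, which is out of scope here.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lint_sitemap(xml_text):
    """Return human-readable warnings for a few common sitemap mistakes."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    urls = root.findall("sm:url", NS)
    warnings = []

    # Group URLs that differ only by a trailing slash (/page vs /page/).
    variants = {}
    for url in urls:
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        variants.setdefault(loc.rstrip("/"), set()).add(loc)
    for base, seen in sorted(variants.items()):
        if len(seen) > 1:
            warnings.append(f"trailing-slash duplicates for {base}")

    # Flag a sitemap where every entry claims the maximum priority.
    # Assumes priority values are numeric; float() raises ValueError otherwise.
    priorities = [url.findtext("sm:priority", namespaces=NS) for url in urls]
    if len(urls) > 1 and all(p is not None and float(p) == 1.0 for p in priorities):
        warnings.append("every URL has priority 1.0")

    return warnings


def sitemaps_in_robots(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt body (may be empty)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []  # site_maps() needs Python 3.8+


BAD_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/page</loc><priority>1.0</priority></url>
  <url><loc>https://example.com/page/</loc><priority>1.0</priority></url>
</urlset>"""

ROBOTS_TXT = "User-agent: *\nAllow: /\nSitemap: https://example.com/sitemap.xml\n"

for warning in lint_sitemap(BAD_SITEMAP):
    print("warning:", warning)
print(sitemaps_in_robots(ROBOTS_TXT))
```

An empty list from `sitemaps_in_robots` means the robots.txt file has no Sitemap line for crawlers to follow.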
How to generate a sitemap
Most modern frameworks and CMS platforms can auto-generate sitemaps:
- WordPress: Yoast SEO and Rank Math both generate sitemaps automatically
- SvelteKit / Next.js: use a sitemap generation library or a sitemap.xml route that outputs XML dynamically
- Static sites: tools like next-sitemap, Astro's sitemap integration, or the sitemap npm package
- Any site: crawl-based tools like Screaming Frog can generate sitemaps from an existing site
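If none of these fit, a sitemap is simple enough to generate by hand. The Python sketch below builds one from a list of pages using only the standard library. The `build_sitemap` helper and the page list are hypothetical; a real site would pull its URLs and modification dates from its CMS, router, or build output.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in pages:
        url = ET.SubElement(root, "url")
        # Special characters such as & are escaped automatically on output.
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    ET.indent(root)  # pretty-print; requires Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page list for illustration.
pages = [
    ("https://example.com/", date(2026, 2, 16)),
    ("https://example.com/blog?tag=seo&page=2", date(2026, 1, 5)),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Building the document with an XML library instead of string concatenation keeps URLs containing characters like `&` valid.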