Sitemap.xml Guide 2026: Create, Submit & Optimize

A sitemap.xml tells search engines which pages exist on your site and when they were last updated. This complete guide covers creating the right sitemap, what to include and exclude, how to submit it to Google and Bing, and how to troubleshoot common sitemap errors.

SEO Tip: A properly structured sitemap can significantly speed up Google indexing of new pages. After submitting your sitemap, use PageGuard to verify your pages have the correct SEO meta tags, canonical URLs, and structured data that improve your indexing rate.

1. What Is a Sitemap.xml?

A sitemap.xml is an XML-formatted file that serves as a roadmap of your website for search engine crawlers. It lists the URLs you want indexed, along with optional metadata about each URL: when it was last modified, how often it changes, and its relative importance.

The Sitemaps Protocol was originally developed by Google in 2005 and is now supported by all major search engines including Google, Bing, Yahoo, and DuckDuckGo. While search engines can discover pages through link crawling alone, sitemaps give you direct control over which pages get submitted for indexing.

Minimum valid sitemap structure:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/page</loc>
    <lastmod>2026-03-01</lastmod>
  </url>
</urlset>

The only required element is <loc> — the canonical URL of the page. All other elements are optional.
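To audit an existing file, it can help to read the entries back out. The sketch below uses only Python's standard library; the function name `parse_sitemap` is illustrative and not part of any tool mentioned in this guide. It assumes a plain `<urlset>` sitemap (not a sitemap index) and raises an error when an entry lacks the required `<loc>`.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps Protocol; "sm" is just a local prefix.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_data):
    """Return a list of (loc, lastmod) tuples; lastmod is None when absent."""
    root = ET.fromstring(xml_data)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default=None, namespaces=NS)
        if not loc or not loc.strip():
            # <loc> is the only required child of <url>.
            raise ValueError("every <url> entry needs a <loc>")
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc.strip(), lastmod.strip() if lastmod else None))
    return entries
```

Passing the file contents as bytes (for example from `open(path, "rb").read()`) lets the parser honor the encoding declared in the XML header.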

2. When Do You Need a Sitemap?

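Google's official guidance says a sitemap is particularly valuable if your site meets any of these criteria:

  • The site is large, so crawlers may overlook new or recently updated pages
  • The site is new and has few external links pointing to it
  • The site has a lot of rich media content (video, images) or appears in Google News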

Even for small, well-linked sites, having a sitemap costs nothing and provides useful signals to Google. There is no downside to having one.

3. Sitemap XML Elements Explained

Understanding each XML element helps you build an effective sitemap:

| Element | Required | Description |
|---|---|---|
| <loc> | Yes | Absolute URL of the page. Must use HTTPS if available. Must match the canonical URL. |
| <lastmod> | Optional | Date of last modification in W3C Datetime format (YYYY-MM-DD). Keep it accurate: Google uses the value only when it is consistently and verifiably accurate, and may ignore it otherwise. |
| <changefreq> | Optional | How often the page changes: always, hourly, daily, weekly, monthly, yearly, never. Ignored by Google; Bing may use it. |
| <priority> | Optional | Relative importance from 0.0 to 1.0 (default 0.5). Ignored by Google; Bing may use it for crawl scheduling. |

Note: Google officially states it ignores <changefreq> and <priority>. Focus on accurate <lastmod> dates instead.

4. How to Create a Sitemap by Platform

The right approach depends on your website platform:

WordPress

Install Yoast SEO or RankMath. Both auto-generate and continuously update your sitemap at yourdomain.com/sitemap_index.xml. They create separate sitemaps for posts, pages, categories, tags, and custom post types. Configure which post types to include in the plugin settings.

Shopify / WooCommerce

Shopify automatically generates a sitemap at /sitemap.xml that includes products, collections, pages, and blog posts. WooCommerce with Yoast SEO includes product and product category pages in the sitemap automatically.

Next.js / React

Use the next-sitemap package. It generates sitemaps at build time or via an API route. Configure next-sitemap.config.js to exclude private pages like /admin/* and /dashboard/*.

Hugo / Jekyll / Eleventy

All three static site generators can produce a sitemap, but only Hugo does so out of the box: it generates sitemap.xml automatically. Jekyll relies on the jekyll-sitemap gem, and Eleventy on a plugin such as eleventy-plugin-sitemap.

Custom / Headless

Generate sitemaps programmatically using your CMS API. Fetch all published URLs, filter out private or duplicate pages, and output valid XML. For large sites (50,000+ URLs), use a sitemap index file that references multiple sitemap files.
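As a minimal sketch of the programmatic approach, the function below turns a list of (URL, last-modified) pairs into sitemap XML using Python's standard library. The name `build_sitemap` and the input shape are assumptions for illustration; fetching the published URLs from your CMS API is left out because it depends on the CMS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes for an iterable of (loc, lastmod) pairs.

    lastmod may be None, in which case the element is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in text when serializing.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

Writing the returned bytes to a file served at /sitemap.xml is enough for a small site; filtering out private or duplicate pages should happen before the list reaches this function.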

Squarespace / Webflow / Wix

These platforms auto-generate sitemaps. Squarespace: /sitemap.xml. Webflow: generated automatically once the auto-generate sitemap option is enabled in the site's SEO settings. Wix: /sitemap.xml is generated automatically.

5. What to Include in Your Sitemap

Only include pages that you want Google to index and that represent unique, valuable content:

✓ Include These

  • Homepage and main section pages
  • Blog posts and articles
  • Product and service pages
  • Category and collection pages
  • Landing pages and guides
  • Contact, About, FAQ pages
  • Pages with canonical pointing to themselves
  • Pages returning HTTP 200

✗ Exclude These

  • Pages with noindex meta tag
  • Login / dashboard / admin pages
  • Thank-you / confirmation pages
  • Paginated duplicates (page 2, 3...)
  • Filtered / sorted URL variants
  • Pages blocked in robots.txt
  • 301 redirect URLs (use destination)
  • 404 and error pages
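The include and exclude rules above can be expressed as a single predicate. This is a sketch only: the `Page` record and its fields are hypothetical and stand in for whatever your crawler or CMS export provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical crawl record; field names are illustrative."""
    url: str
    status: int                      # HTTP status returned for the URL
    canonical: Optional[str] = None  # value of rel="canonical", if any
    noindex: bool = False            # page carries a noindex directive
    blocked_by_robots: bool = False  # URL is disallowed in robots.txt


def belongs_in_sitemap(page: Page) -> bool:
    """Apply the rules: HTTP 200, indexable, crawlable, self-canonical."""
    if page.status != 200:
        return False  # drops redirects, 404s and other error pages
    if page.noindex or page.blocked_by_robots:
        return False
    if page.canonical is not None and page.canonical != page.url:
        return False  # canonical points elsewhere, so list that URL instead
    return True
```

Rules that depend on URL patterns, such as paginated or filtered variants, are usually caught by the canonical check when those pages canonicalize to the main listing; otherwise they need a site-specific pattern filter.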

6. How to Submit Your Sitemap to Google

  1. Add Sitemap to robots.txt

    Add Sitemap: https://yourdomain.com/sitemap.xml to your robots.txt file. Any crawler that reads your robots.txt will automatically find your sitemap.

  2. Open Google Search Console

    Go to search.google.com/search-console and select your property. If you haven't verified your site yet, complete verification first (the HTML meta tag method is one option).

  3. Navigate to Sitemaps

    In the left sidebar, under Indexing, click Sitemaps. You'll see a list of any previously submitted sitemaps and their status.

  4. Enter and Submit Your Sitemap URL

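    In the "Add a new sitemap" field, enter your sitemap location. For a URL-prefix property the field already shows your site address, so type only the path (sitemap.xml); for a Domain property, enter the full URL. Click Submit, and GSC will queue the sitemap for processing.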

  5. Submit to Bing Webmaster Tools

    Don't forget Bing. Go to bing.com/webmasters, verify your site, and submit your sitemap under Sitemaps. Bing drives 6–10% of search traffic in the US and is the default search engine on Edge and many AI assistants.
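To confirm that step 1 worked, you can check which sitemaps a robots.txt file actually declares. The sketch below uses `urllib.robotparser` from Python's standard library (its `site_maps()` method requires Python 3.8 or later); the helper name `sitemaps_declared` is illustrative, and the robots.txt content is passed in as text so the check works offline.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared(robots_txt: str) -> list[str]:
    """Return the URLs listed in 'Sitemap:' lines of a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []
```

For example, a robots.txt containing `Sitemap: https://yourdomain.com/sitemap.xml` yields a one-item list with that URL, and a file without the directive yields an empty list.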

7. Sitemap Index Files for Large Sites

A single sitemap file can contain a maximum of 50,000 URLs and must be under 50MB uncompressed. For large sites, use a sitemap index file that references multiple sitemap files:

Sitemap index structure:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://yourdomain.com/sitemap-pages.xml</loc>
    <lastmod>2026-03-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://yourdomain.com/sitemap-posts.xml</loc>
    <lastmod>2026-03-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://yourdomain.com/sitemap-products.xml</loc>
    <lastmod>2026-03-01</lastmod>
  </sitemap>
</sitemapindex>

Group your sitemaps logically by content type (pages, posts, products, images) or by date range for archives. Submit only the index file to Google Search Console — it will process all referenced sitemaps automatically.
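A rough sketch of the splitting step, using only the standard library: `chunk_urls` cuts a URL list into groups that respect the 50,000-URL limit, and `build_sitemap_index` writes the index that points at the resulting files. Both names, and the file naming you choose for the child sitemaps, are assumptions for illustration. The 50MB size limit is not checked here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # protocol limit per sitemap file


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of urls, each with at most `size` items."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Return sitemap index XML (bytes) referencing each child sitemap URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)
```

Each chunk would be written to its own file (for instance sitemap-1.xml, sitemap-2.xml), and the list of those file URLs passed to `build_sitemap_index`.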

8. Image and Video Sitemaps

Extend your sitemap with image and video extensions to make media content eligible for Google's image and video search results:

Image Sitemap Extension

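<?xml version="1.0" encoding="UTF-8"?>
<!-- The image: prefix must be declared on <urlset>. -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://yourdomain.com/blog/post</loc>
    <image:image>
      <!-- Google deprecated image:title and image:caption in 2022; image:loc is what it reads. -->
      <image:loc>https://yourdomain.com/images/hero.jpg</image:loc>
    </image:image>
  </url>
</urlset>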

Video Sitemap Extension

For video content, include video:video extensions with video:thumbnail_loc, video:title, video:description, and video:content_loc or video:player_loc. This makes your videos eligible for video carousels in search results.

News Sitemaps

News publishers should use the Google News sitemap extension with news:news, news:publication, news:publication_date, and news:title to help Google News discover their content. Only include articles published within the last 2 days.
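The two-day window is easy to enforce when the news sitemap is generated. The sketch below assumes each article is a (URL, publication time) pair with a timezone-aware datetime; `recent_articles` is an illustrative name, not part of any news sitemap tool.

```python
from datetime import datetime, timedelta, timezone

NEWS_WINDOW = timedelta(days=2)  # articles older than this are left out


def recent_articles(articles, now=None):
    """Return URLs of articles published within the last two days.

    articles: iterable of (url, published_at) with timezone-aware datetimes.
    now: optional reference time, useful for testing; defaults to current UTC.
    """
    now = now or datetime.now(timezone.utc)
    return [
        url
        for url, published_at in articles
        if timedelta(0) <= now - published_at <= NEWS_WINDOW
    ]
```

Articles dated in the future are excluded as well, since a negative age usually signals a scheduling or clock error.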

9. Common Sitemap Errors and How to Fix Them

❌ "Couldn't fetch" error in GSC

Google can't access your sitemap file. Check: (1) The sitemap URL is publicly accessible (no login required); (2) Your server returns a 200 status for the sitemap URL; (3) The file isn't blocked in robots.txt; (4) Your server's User-Agent policy doesn't block Googlebot.

❌ URLs discovered but not indexed

Google found your pages but chose not to index them. Common causes: thin content, duplicate content, quality issues, noindex tags, incorrect canonicals, or low page authority. Improve content quality, verify no accidental noindex tags, and build internal links to affected pages.

❌ Invalid URL in sitemap

Special characters must be XML-escaped in sitemaps: &amp; for &, &lt; for <, &gt; for >, &apos; for apostrophes, and &quot; for double quotes. The file must use UTF-8 encoding. Spaces and non-ASCII characters in URLs should be percent-encoded.
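Both steps, percent-encoding and then XML-escaping, can be sketched with the standard library. The helper name `sitemap_loc` is illustrative. It assumes the input is a full URL and leaves existing percent-escapes untouched, which means a literal % that is not part of an escape sequence would need separate handling.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

# Characters that keep their meaning in a URL and must not be percent-encoded.
# "%" is included so already-encoded sequences are not encoded twice.
URL_SAFE = ":/?#[]@!$&'()*+,;=%~"


def sitemap_loc(raw_url: str) -> str:
    """Percent-encode a URL, then escape it for use inside <loc>."""
    percent_encoded = quote(raw_url, safe=URL_SAFE)  # spaces, non-ASCII as UTF-8
    return escape(percent_encoded, {"'": "&apos;", '"': "&quot;"})
```

If the sitemap is written with an XML library, that library already handles the escaping step, so only the percent-encoding is needed; this helper is for code that assembles the XML as plain strings.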

❌ noindex URL in sitemap

Including noindex pages in your sitemap is a contradiction and wastes crawl budget. GSC will report these as errors. Audit your sitemap regularly to ensure every included URL is indexable and returns 200.

❌ Sitemap contains redirect URLs

Always use the final destination URL in your sitemap, not the redirecting URL. Update your sitemap whenever you change URL structure and implement 301 redirects. Submitting redirect URLs wastes crawl budget and confuses Google about your canonical URL structure.

❌ Invalid date format for lastmod

Use W3C Datetime format: YYYY-MM-DD (e.g., 2026-03-04) or the full datetime format YYYY-MM-DDThh:mm:ss+00:00. Common mistake: using MM/DD/YYYY or DD-MM-YYYY which causes GSC to report invalid date errors.
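A small validator can catch bad dates before the sitemap is published. This sketch accepts only the two forms named above (date only, or full datetime with a UTC offset or Z) and checks that the calendar date exists; it does not range-check the time portion. The function names are illustrative.

```python
import re
from datetime import date, datetime, timezone

# Date only, or date + time with seconds and an offset (Z or +hh:mm / -hh:mm).
LASTMOD_PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2}))?$"
)


def is_valid_lastmod(value: str) -> bool:
    """Return True if value looks like YYYY-MM-DD or a full W3C datetime."""
    if not LASTMOD_PATTERN.match(value):
        return False
    try:
        date.fromisoformat(value[:10])  # rejects dates such as 2026-02-30
    except ValueError:
        return False
    return True


def format_lastmod(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC W3C datetime string."""
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")
```

Note that `format_lastmod` expects a timezone-aware datetime; Python treats a naive one as local time when converting, which may not be what you intend.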

10. Sitemap Best Practices

11. Sitemap vs. robots.txt — What's the Difference?

These two files serve complementary but distinct purposes:

| | sitemap.xml | robots.txt |
|---|---|---|
| Purpose | Tell crawlers which pages exist and should be indexed | Tell crawlers which pages NOT to crawl |
| Function | Discovery aid: helps find and index pages | Access control: restricts crawl access |
| Format | XML with URL list | Plain text with Allow/Disallow directives |
| Location | Any path (commonly /sitemap.xml) | Must be at /robots.txt (root only) |
| Indexing effect | Suggests indexing (not a guarantee) | Disallow prevents crawling but not indexing if linked |

Use both together: robots.txt restricts crawler access to private sections; sitemap.xml promotes your public content pages for indexing.
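Because the two files work together, it is worth checking that they do not contradict each other. The sketch below flags sitemap URLs that robots.txt disallows, using Python's standard `urllib.robotparser`. The function name is illustrative, and one caveat applies: the standard-library parser matches rules by simple path prefix, so wildcard patterns that Google supports are not evaluated the same way.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, sitemap_urls, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]
```

Any URL this returns should either be removed from the sitemap or unblocked in robots.txt, depending on whether you want it indexed.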

12. Monitor Sitemap Performance with PageGuard

Submitting your sitemap is the first step — but you also need to ensure the pages in your sitemap are technically sound enough for Google to actually index them. Pages with missing canonical tags, incorrect meta tags, or structured data errors are often deprioritized or skipped by Google's indexer even if they're in your sitemap.

PageGuard scans individual pages and checks technical SEO signals that can affect how Google crawls and indexes them: correct canonical URLs, valid meta tags, and structured data. It also reports on Core Web Vitals and accessibility compliance.

Frequently Asked Questions

What is a sitemap.xml and do I need one?

A sitemap.xml is an XML file that lists the URLs on your website you want search engines to discover and index. While Google can find pages through links alone, a sitemap helps make sure every important page is discovered, especially for new sites, large sites, or pages with few internal links. It does not guarantee crawling or indexing. Even if your site is small and well-linked, having a sitemap costs nothing and is generally recommended.

How do I create a sitemap.xml?

The method depends on your platform: WordPress uses Yoast SEO or RankMath plugins; Shopify, Squarespace, and Wix generate sitemaps automatically; Next.js uses the next-sitemap package; Hugo generates a sitemap out of the box, while Jekyll and Eleventy rely on plugins (jekyll-sitemap and eleventy-plugin-sitemap). For custom sites, generate XML programmatically from your URL list and deploy at /sitemap.xml.

How do I submit a sitemap to Google?

Go to Google Search Console → Indexing → Sitemaps → enter 'sitemap.xml' in the Add a new sitemap field → click Submit. Google will process your sitemap and show discovered vs. indexed URL counts. Also add 'Sitemap: https://yourdomain.com/sitemap.xml' to your robots.txt file so all crawlers can find it automatically.

What pages should I exclude from my sitemap?

Exclude: pages with noindex meta tags, login and admin pages, thank-you/confirmation pages, duplicate paginated pages, filtered URL variants, pages blocked in robots.txt, and redirect URLs (use the destination URL instead). Only include pages with unique, valuable content that return HTTP 200 and have a canonical pointing to themselves.

How often should I update my sitemap?

Update your sitemap whenever you add, remove, or significantly change important pages. For CMS-based sites, automate sitemap regeneration on publish. Use accurate lastmod dates: Google relies on lastmod only when it is consistently accurate, so setting every date to today to force re-crawling is likely to get the values ignored. Resubmit in Google Search Console after major content additions.
