About

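Sitemaps is a protocol that lets a site owner tell search engines which URLs of a site are available for crawling. The protocol is documented at http://www.sitemaps.org/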

Syntax

A Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL:

  • when it was last updated,
  • how often it usually changes,
  • how important it is, relative to other URLs in the site

A simple example from the protocol documentation, showing the syntax:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

   <url>
      <loc>http://www.example.com/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority> <!-- 0 to 1 with a default of 0.5 -->
   </url>

</urlset> 
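
As a rough illustration, a file like the one above could be generated with Python's standard library. This is a minimal sketch, not part of the protocol documentation: the function name build_sitemap and the dictionary-based input format are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example above.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes.

    `entries` is a list of dicts (hypothetical input format) with a
    required "loc" key and optional "lastmod", "changefreq", "priority".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the same order as the example above.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        {
            "loc": "http://www.example.com/",
            "lastmod": "2005-01-01",
            "changefreq": "monthly",
            "priority": 0.8,
        }
    ])
    print(xml_bytes.decode("utf-8"))
```

ElementTree escapes special characters such as & in URLs when it serializes the text, which is easy to forget when building the file by string concatenation.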

Usage

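  • Web crawlers usually discover pages from links within the site and from other sites. A Sitemap supplements this by listing the URLs directly, so that crawlers that support the protocol can pick up all of them along with their metadata.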
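
To illustrate how a crawler might read a Sitemap, the sketch below parses the XML and returns each URL with its metadata. It is only an assumption about how a consumer could be written; the function name list_urls and the returned dictionary shape are hypothetical. The default priority of 0.5 comes from the syntax example in this page.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used for the element lookups below.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_urls(sitemap_xml):
    """Return one dict per <url> element found in the sitemap (bytes or str)."""
    root = ET.fromstring(sitemap_xml)
    results = []
    for url in root.findall("sm:url", NS):
        priority_text = url.findtext("sm:priority", default=None, namespaces=NS)
        results.append({
            "loc": url.findtext("sm:loc", default="", namespaces=NS).strip(),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", default=None, namespaces=NS),
            # Fall back to the documented default when <priority> is absent.
            "priority": float(priority_text) if priority_text else 0.5,
        })
    return results


if __name__ == "__main__":
    sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
</urlset>"""
    for entry in list_urls(sample):
        print(entry)
```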

How to advertise it

Submit to an individual search engine

Submit the sitemap to Google at: https://search.google.com/search-console/sitemaps

Robots.txt

The location of a sitemap file can be declared in the robots.txt file with a Sitemap directive.

You can specify more than one:

Sitemap: http://www.example.com/sitemap-host1.xml
Sitemap: http://www.example.com/sitemap-host2.xml
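
The directives above can be read back with a few lines of Python. This is a minimal sketch under the assumption that each directive sits on its own line in the form shown; the function name sitemap_urls is hypothetical, and treating the directive name as case-insensitive is a choice made for this example.

```python
def sitemap_urls(robots_txt):
    """Return the URLs of all Sitemap directives found in robots.txt text."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments. Assumption: sitemap URLs contain no '#'.
        line = line.split("#", 1)[0].strip()
        # Split on the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    robots = """User-agent: *
Disallow: /private/

Sitemap: http://www.example.com/sitemap-host1.xml
Sitemap: http://www.example.com/sitemap-host2.xml
"""
    for url in sitemap_urls(robots):
        print(url)
```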

Search engine HTTP request (Ping)

A sitemap can also be announced to a search engine via an HTTP GET request:

<searchengine_URL>/ping?sitemap=sitemap_url
# e.g., with the sitemap URL percent-encoded
<searchengine_URL>/ping?sitemap=http%3A%2F%2Fwww.yoursite.com%2Fsitemap.gz

Search Engine   Endpoint
google          http://www.google.com/webmasters/sitemaps/ping?sitemap=<encoded_sitemap_url>
microsoft       http://www.bing.com/webmaster/ping.aspx?siteMap=<encoded_sitemap_url>
yandex          http://blogs.yandex.ru/pings/?status=success&url=<encoded_sitemap_url>
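
The percent-encoding step of the ping request can be sketched in Python as follows. The function name build_ping_url and the host searchengine.example are placeholders invented for this example. The sketch only builds the URL in the generic /ping?sitemap= form and sends nothing; whether a given search engine still accepts such requests is not confirmed here.

```python
from urllib.parse import quote


def build_ping_url(search_engine_url, sitemap_url):
    """Build a ping URL in the generic <searchengine_URL>/ping?sitemap=... form."""
    # safe="" makes quote() also encode "/" (and ":"), as the query value requires.
    encoded = quote(sitemap_url, safe="")
    return f"{search_engine_url}/ping?sitemap={encoded}"


if __name__ == "__main__":
    # "searchengine.example" is a placeholder host, not a real endpoint.
    print(build_ping_url("http://searchengine.example",
                         "http://www.yoursite.com/sitemap.gz"))
    # prints http://searchengine.example/ping?sitemap=http%3A%2F%2Fwww.yoursite.com%2Fsitemap.gz
```

The endpoints listed for specific engines use different paths and parameter names (for example siteMap or url), so a real client would need one URL template per engine rather than this single generic form.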