Search Engine - Sitemap



A Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL:

  • when it was last updated,
  • how often it usually changes,
  • how important it is, relative to other URLs in the site

A simple example from the protocol documentation, showing the syntax:

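<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
      <priority>0.8</priority> <!-- 0 to 1 with a default of 0.5 -->
   </url>
</urlset>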
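A sitemap in this format could be generated programmatically. The following is a minimal sketch in Python using only the standard library; the function name `build_sitemap` and the dictionary keys are illustrative assumptions, not part of the protocol or of any particular tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string.

    `entries` is a list of dicts (a hypothetical input shape): each needs
    a "loc" key and may carry "lastmod", "changefreq" and "priority".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>.
        ET.SubElement(url, "loc").text = entry["loc"]
        # The remaining tags are optional metadata.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([{"loc": "http://www.example.com/", "priority": 0.8}]))
```

The XML declaration is prepended by hand so that the declared encoding does not depend on the local environment.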



  • Web crawlers usually discover pages by following links within the site and from other sites. A sitemap supplements this by listing the URLs directly.

How to advertise it

Submit to individual engine

Submit the sitemap to Google through the Google Search Console.


The location of the sitemap can be declared in the robots.txt file with a Sitemap directive.

You can specify more than one sitemap.
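As an illustration, the sketch below reads the sitemap declarations out of a robots.txt file with Python's standard `urllib.robotparser` module (the `site_maps()` method requires Python 3.8 or later). The robots.txt content, the example.com URLs and the helper name `sitemaps_from_robots` are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content declaring two sitemaps.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-news.xml
"""


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in sitemaps_from_robots(ROBOTS_TXT):
        print(url)
```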


Search engine HTTP request (Ping)

The sitemap can also be submitted to a search engine via an HTTP GET request.

# ie
Search Engine Endpoint
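A ping URL could be assembled as in the sketch below. It assumes the `<searchengine_URL>/ping?sitemap=<sitemap_url>` pattern described in the sitemaps.org protocol, with the sitemap URL percent-encoded; the host `searchengine.example` and the function name `build_ping_url` are placeholders, and an individual engine may use a different endpoint or not accept pings at all.

```python
from urllib.parse import quote


def build_ping_url(search_engine_url, sitemap_url):
    """Build a ping URL of the form <engine>/ping?sitemap=<encoded sitemap URL>."""
    # safe="" percent-encodes ":" and "/" so the sitemap URL
    # survives as a single query-string value.
    encoded = quote(sitemap_url, safe="")
    return f"{search_engine_url.rstrip('/')}/ping?sitemap={encoded}"


if __name__ == "__main__":
    # Placeholder engine host, not a real endpoint.
    print(build_ping_url("https://searchengine.example",
                         "https://www.example.com/sitemap.xml"))
```

The request itself is left out here; it would be a plain GET on the resulting URL, for instance with `urllib.request.urlopen`.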

