HTML - Document


An HTML document is a well-formed HTML string (i.e. one that contains the html root element).

Because JavaScript can make a single document show different text, a web page is a logical representation of a document.


The HTML textual representation can be stored:

When the browser processes this page, it creates an in-memory (code) representation, and the HTML page becomes a DOM document.


The URL associated with a Document is the document's address.


Elements represent things; that is, they have intrinsic meaning, also known as semantics. For example, an img element represents an image.


<!-- tells the browser what language it's reading -->
<!DOCTYPE html>
<!-- Starts the HTML document -->
<html>
  <title>Sample page</title>
  <h1>Sample page</h1>
  <p>This is a <a href="demo.html">simple</a> sample.</p>
  <!-- this is a comment -->
<!-- Ends the HTML document -->
</html>

There are always two parts to the document:

  • the head, which holds metadata about the web page, such as its title,
  • and the body, whose content is what is visible on the actual page.


See DOM - HTML vs XML document


To display a page, a browser:

  • fetches the HTML document over the network (generally with an HTTP GET request),
  • parses the HTML,
  • and then converts it into an in-memory (code) representation called a DOM document (DOM stands for Document Object Model, a tree of nodes).

This is described in detail on this page: Web - Timeline of a page load (Page Speed|Page Latency)
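
As a rough illustration of the outcome of these steps, the sketch below assumes the page is opened in a browser with the developer console visible. The inline script is an illustrative addition, not part of the original sample. It only reads properties of the DOM document that the parser has already built.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Sample page</title>
  </head>
  <body>
    <h1>Sample page</h1>
    <script>
      // When this script runs, the parser has already turned the markup
      // above it into DOM nodes reachable from the global `document` object.
      console.log(document.documentElement.nodeName);        // root element: "HTML"
      console.log(document.title);                           // "Sample page"
      console.log(document.querySelector("h1").textContent); // "Sample page"
      console.log(document.URL);                             // the document's address
    </script>
  </body>
</html>
```

The script is placed after the h1 element on purpose: nodes that appear later in the markup do not exist yet in the tree while the script executes.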


See Resource (Web Page) - Type (Structured Data | Metadata )


The identifier of an HTML document is its URL.

But because an HTML page may be rendered at, or moved to, another URL, the rel=canonical link relation was introduced to represent a universal id (as described in RFC 6596).

To complicate things, each metadata schema may also define its own property to store this information, for instance og:url for Open Graph.
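
A minimal sketch of how both identifiers might be declared in the head of a page. The https://example.com/sample-page URL is a placeholder chosen for illustration, not a value from this article.

```html
<head>
  <title>Sample page</title>
  <!-- Canonical link relation (RFC 6596): the preferred URL of this document -->
  <link rel="canonical" href="https://example.com/sample-page">
  <!-- Open Graph property carrying the same information for Open Graph consumers -->
  <meta property="og:url" content="https://example.com/sample-page">
</head>
```

Keeping both values identical avoids giving consumers two competing identifiers for the same document.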


On Facebook, if you move a page to a new URL, you can use the old URL as the canonical source for the new URL, retaining likes, comments and shares for the object. Ref Ref2 - How do I move a page to a different URL?

HTML document inside HTML document

Note that an HTML document can contain other documents through the frame and iframe elements.
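
A minimal sketch of one document embedding another, assuming a second file named demo.html (the name is borrowed from the sample above) sits next to the outer page. Only iframe is shown, since frame is obsolete in current HTML.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Outer document</title>
  </head>
  <body>
    <h1>Outer document</h1>
    <!-- The embedded page is parsed into its own, separate DOM document -->
    <iframe src="demo.html" title="Embedded document"></iframe>
  </body>
</html>
```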
