HTML - (Document) Parser


HTML documents consist of a tree of elements and text. The specification defines the set of elements that can be used in HTML, along with rules about the ways in which the elements can be nested.

If a document is transmitted with the text/html MIME type, Web browsers process it as an HTML document.

HTML user agents (e.g. Web browsers) parse the markup, turning it into a DOM (Document Object Model) tree.
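As a rough illustration of the markup-to-tree step, the sketch below uses Python's standard-library html.parser to build a small tree from a string. The Node and TreeBuilder classes and the outline helper are hypothetical names made up for this example. It is not the tree-construction algorithm of the HTML specification, and it has none of the error recovery that a browser applies.

```python
from html.parser import HTMLParser

# A subset of the HTML void elements (elements that never have an end tag).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """A tiny stand-in for a DOM node (hypothetical, not the real DOM API)."""

    def __init__(self, name, parent=None):
        self.name = name        # tag name, or "#text" for text nodes
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Turns the start tag / end tag / text events into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_ELEMENTS:
            self.current = node  # descend: following content is a child

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this name and close it.
        node = self.current
        while node is not self.root and node.name != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():  # ignore whitespace-only text in this sketch
            text = Node("#text", self.current)
            text.text = data.strip()
            self.current.children.append(text)


def outline(node, depth=0):
    """Return the tree as a list of indented lines, one per node."""
    label = node.name if node.name != "#text" else f'#text "{node.text}"'
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>")
builder.close()
print("\n".join(outline(builder.root)))
# #document
#   html
#     body
#       h1
#         #text "Title"
#       p
#         #text "Hello"
#         b
#           #text "world"
```

The printed outline shows the same nesting that a browser's DOM inspector would show for this markup, although a real DOM also keeps whitespace text nodes and attributes.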

HTML documents represent a media-independent description of interactive content. They might be rendered to a screen, through a speech synthesizer, or on a braille display. Authors can influence how that rendering takes place by using a styling language such as CSS.

Parsing of HTML files happens incrementally, meaning that the parser can pause at any point to let scripts run.
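The incremental part can be sketched with Python's standard-library html.parser, which accepts input in chunks through feed(). This is only an analogy: html.parser does not run scripts, and the EventLog class and the chunk boundaries are assumptions chosen for the example. It shows that the parser reports everything it can after each chunk and holds back a tag that is cut in half until the rest arrives.

```python
from html.parser import HTMLParser


class EventLog(HTMLParser):
    """Records every parse event so progress can be inspected between chunks."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


# The second <li> start tag is deliberately split across the two chunks.
chunks = ["<ul><li>one</li><l", "i>two</li></ul>"]

parser = EventLog()
progress = []
for chunk in chunks:
    parser.feed(chunk)
    # Other code could run here, between chunks, before parsing resumes.
    progress.append(len(parser.events))
parser.close()

print(progress)  # [4, 8]
```

After the first chunk, four events have already been reported (the opening ul, the first li, its text and its end tag), while the partial `<l` stays buffered until the second chunk completes it.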


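You can't parse [X]HTML with regular expressions. Elements can nest to an arbitrary depth, and a regular expression cannot keep track of that nesting, so a real parser is needed.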
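A small sketch of where a regular expression goes wrong, compared with an event-based parser that tracks nesting depth. The markup strings and the DepthTracker class are hypothetical, made up for this example, and html.parser is again Python's standard-library parser rather than a browser parser.

```python
import re
from html.parser import HTMLParser

nested = "<div>outer <div>inner</div> tail</div>"
siblings = "<div>a</div><div>b</div>"

# A non-greedy pattern stops at the first </div>, cutting the outer element short.
nongreedy = re.search(r"<div>(.*?)</div>", nested).group(1)
print(nongreedy)  # outer <div>inner

# A greedy pattern runs to the last </div>, swallowing the sibling element.
greedy = re.search(r"<div>(.*)</div>", siblings).group(1)
print(greedy)  # a</div><div>b


class DepthTracker(HTMLParser):
    """Counts open <div> elements, which is what the patterns above cannot do."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth >= 1:  # text anywhere inside the outermost div
            self.text_parts.append(data)


tracker = DepthTracker()
tracker.feed(nested)
tracker.close()
print(tracker.max_depth)            # 2
print("".join(tracker.text_parts))  # outer inner tail
```

Neither pattern is right for both inputs, because choosing the matching end tag depends on how many elements are currently open. The parser keeps that count explicitly.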


