How Rich Text Editors in HTML Are Made (Principles and Demo)

About

A rich text editor is a component that lets a user edit text and make it rich, i.e. style it with editing buttons.

Basic Example

This code is explained in the HTML Editor section.

Ref: Example adapted from the Editing section of the spec

How it works

HTML Editor

Editable element

The simplest implementations use JavaScript to directly manipulate the DOM, the syntax tree of the HTML document.

The content is made editable with the contentEditable attribute, and the user edits that text.

In other words, the HTML DOM itself is edited: when the user bolds a word, a b HTML element is added.

The styling buttons are implemented with:

  • the execCommand function. It is meant to be replaced by Content Editable and Input Events, but they are still not fully implemented.

The Basic Example uses execCommand.
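As a minimal sketch of how a styling button can call execCommand: the `CommandTarget` interface, the `StyleCommand` type, the `applyStyle` helper and the element ids in the comments are assumptions made for illustration, not names from the original example. In a browser, `document` would be passed as the target.

```typescript
// Minimal shape of the object that runs editing commands.
// In a browser, `document` satisfies this interface.
interface CommandTarget {
  execCommand(commandId: string, showUI?: boolean, value?: string): boolean;
}

// Hypothetical subset of commands exposed as toolbar buttons.
type StyleCommand = "bold" | "italic" | "underline";

// Applies a style to the current selection of the editable element.
// Returns false when the command is unsupported or disabled.
function applyStyle(target: CommandTarget, command: StyleCommand): boolean {
  return target.execCommand(command, false);
}

// Possible browser wiring, assuming this (hypothetical) markup:
//   <div id="editor" contenteditable="true">Edit me</div>
//   <button id="bold-button">B</button>
//
//   document.getElementById("bold-button")?.addEventListener("mousedown", (event) => {
//     event.preventDefault(); // keep the text selection inside the editor
//     applyStyle(document, "bold");
//   });
```

Passing the target as a parameter keeps the helper usable outside a browser, for instance in a unit test with a fake target.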

Canvas

If you need text that is constrained to a fixed format such as an A4 page, you may want to use a canvas instead. Google Docs does so with its Kix editor.
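One consequence of drawing on a canvas is that the editor has to lay out the text itself, since a canvas draws a string at a position and does not wrap it. The sketch below shows a greedy line-wrapping step under that assumption. The `wrapLines` name and the injected `measure` function are hypothetical; in a browser, `measure` could be `(t) => context.measureText(t).width` on a 2D canvas context.

```typescript
// Greedy word wrapping: put as many words as fit on each line.
// `measure` returns the rendered width of a string in pixels.
function wrapLines(
  text: string,
  maxWidth: number,
  measure: (text: string) => number,
): string[] {
  const lines: string[] = [];
  let current = "";
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  for (const word of words) {
    const candidate = current === "" ? word : `${current} ${word}`;
    if (current !== "" && measure(candidate) > maxWidth) {
      // The word does not fit: close the current line and start a new one.
      lines.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") lines.push(current);
  return lines;
}

// Example with a fake monospace font where every character is 10px wide.
const monospaceWidth = (text: string): number => text.length * 10;
const pageLines = wrapLines("rich text on a fixed page", 100, monospaceWidth);
// pageLines is ["rich text", "on a fixed", "page"]
```

Each resulting line would then be drawn at its own vertical offset, which is also where a page break for a fixed format could be decided.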

Other Language Editor

If you want to create an editor for another language, there are two possible architectures, in increasing order of complexity:

  • a unidirectional editor, the easier one, where you translate your language to HTML
  • or a bidirectional editor, the harder one, where you also need to translate the HTML back to your language.

Translation

To translate your language to HTML, you can do it:

  • with a function directly (one pass)
  • or via a parser that builds a syntax tree

Example of parser:
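As a minimal sketch of the parser route, the code below parses a made-up language in which `*text*` means bold, builds a small syntax tree, and renders that tree to HTML. The syntax, the `MarkupNode` type and the `parse`, `render` and `toHtml` functions are all hypothetical and only illustrate the two steps.

```typescript
// Syntax tree of the hypothetical language: plain text and bold spans.
type MarkupNode =
  | { type: "text"; value: string }
  | { type: "bold"; children: MarkupNode[] };

// Step 1: parse the source into a tree.
function parse(source: string): MarkupNode[] {
  const nodes: MarkupNode[] = [];
  let position = 0;
  while (position < source.length) {
    if (source[position] === "*") {
      const close = source.indexOf("*", position + 1);
      if (close !== -1) {
        const inner = source.slice(position + 1, close);
        nodes.push({ type: "bold", children: [{ type: "text", value: inner }] });
        position = close + 1;
        continue;
      }
    }
    // Plain text runs until the next star, or to the end of the source.
    // An unclosed star ends up here and is kept as literal text.
    const next = source.indexOf("*", position + 1);
    const end = next === -1 ? source.length : next;
    nodes.push({ type: "text", value: source.slice(position, end) });
    position = end;
  }
  return nodes;
}

// Escape the characters that would otherwise be read as markup.
function escapeText(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Step 2: render the tree to HTML.
function render(nodes: MarkupNode[]): string {
  return nodes
    .map((node) =>
      node.type === "text" ? escapeText(node.value) : `<b>${render(node.children)}</b>`,
    )
    .join("");
}

function toHtml(source: string): string {
  return render(parse(source));
}

// toHtml("Hello *world* <3") returns "Hello <b>world</b> &lt;3"
```

Keeping the tree as an intermediate step costs more code than a one-pass function, but the same tree can later feed other outputs or a translation back from HTML.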

Virtual DOM

The HTML DOM may be, and generally is, virtual. Editors may use an existing virtual DOM implementation, and some implement their own.

The advantage of a virtual DOM is that the page does not need to be entirely refreshed: the virtual DOM computes the diff and applies the update transparently.
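To illustrate the diff idea, here is a deliberately naive sketch that compares two virtual trees position by position and returns only the places that changed. The `VNode` and `Patch` types and the `diff` function are assumptions for this example. Real virtual DOM libraries also handle attributes, keys and reordering.

```typescript
// A virtual node is either a text string or an element with children.
type VNode = string | { tag: string; children: VNode[] };

// A patch says: at this path of child indexes, put this node
// (or remove what is there when replacement is null).
interface Patch {
  path: number[];
  replacement: VNode | null;
}

function diff(
  oldNode: VNode | undefined,
  newNode: VNode | undefined,
  path: number[] = [],
): Patch[] {
  if (newNode === undefined) {
    return oldNode === undefined ? [] : [{ path, replacement: null }];
  }
  if (oldNode === undefined) {
    return [{ path, replacement: newNode }];
  }
  if (typeof oldNode === "string" || typeof newNode === "string") {
    return oldNode === newNode ? [] : [{ path, replacement: newNode }];
  }
  if (oldNode.tag !== newNode.tag) {
    return [{ path, replacement: newNode }];
  }
  // Same element: compare the children one by one.
  const patches: Patch[] = [];
  const length = Math.max(oldNode.children.length, newNode.children.length);
  for (let index = 0; index < length; index++) {
    patches.push(...diff(oldNode.children[index], newNode.children[index], [...path, index]));
  }
  return patches;
}

const before: VNode = { tag: "p", children: ["Hello ", { tag: "b", children: ["world"] }] };
const after: VNode = { tag: "p", children: ["Hello ", { tag: "b", children: ["there"] }] };
const patches = diff(before, after);
// patches is [{ path: [1, 0], replacement: "there" }]
```

Only the text inside the b element is reported, so only that node of the real DOM would need to be touched.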

Keep in mind that the DOM implementation chosen will limit the expressiveness of your language.

It's easy to add an inline feature such as bold, but it's:

  • less easy to define hierarchical components such as:
    • a table,
    • or an image gallery
  • even less easy to introduce templating features such as:
    • the list of the last 10 pages published

Debounce

To smooth the experience, you should apply a debounce. It reduces overhead by not rendering on every keystroke: a keystroke schedules the rendering, and a timer triggers it once the user pauses.
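A debounce can be sketched as follows. The `debounce` function, the `Scheduler` interface and the 300 ms delay in the usage comment are assumptions chosen for the example. The scheduler is injectable only so the behavior can be checked without waiting for real timers.

```typescript
// Abstraction over setTimeout/clearTimeout so the timer can be faked.
interface Scheduler {
  set(callback: () => void, delayMs: number): unknown;
  clear(handle: unknown): void;
}

const timerScheduler: Scheduler = {
  set: (callback, delayMs) => setTimeout(callback, delayMs),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a function that delays `fn` until `delayMs` have passed
// without a new call. Each call cancels the previously scheduled one.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  delayMs: number,
  scheduler: Scheduler = timerScheduler,
): (...args: Args) => void {
  let handle: unknown;
  return (...args: Args) => {
    if (handle !== undefined) scheduler.clear(handle);
    handle = scheduler.set(() => {
      handle = undefined;
      fn(...args);
    }, delayMs);
  };
}

// Possible usage in an editor (renderPreview is a hypothetical function):
//   const renderLater = debounce((source: string) => renderPreview(source), 300);
//   textarea.addEventListener("input", () => renderLater(textarea.value));
```

With this wiring, typing a word quickly results in a single render with the latest text instead of one render per character.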

Sanitization / Security

To avoid XSS attacks, the input should be sanitized to remove any scriptable or callable content. Do this on the server as well, because client code can never be trusted.
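The simplest form of this protection is escaping: untrusted text is turned into plain text before it is placed in HTML, so nothing in it can be interpreted as markup. The sketch below shows only that step, and the `escapeHtml` name is an assumption. An editor that must keep some HTML elements needs an allowlist-based sanitizer instead, which is better taken from a maintained library than written by hand.

```typescript
// Characters that have a special meaning in HTML text and attributes.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces every special character with its HTML entity.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (character) => HTML_ESCAPES[character] ?? character);
}

const escaped = escapeHtml('<img src=x onerror="alert(1)">');
// escaped is '&lt;img src=x onerror=&quot;alert(1)&quot;&gt;'
```

The same function can run on the server, for example in Node.js, which is where the check has to happen in the end.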

Editors

This section groups:

  • the rich text editors
  • but also the code editors, because if you are creating a markup language, you may just want to go this route.

Rich Text Editor

SlateJs

https://www.slatejs.org/ - our favorite.

SlateJs provides:

  • an interface to manipulate your own AST (a JSON tree structure based on block and text nodes)
  • a renderer for the AST, built with React
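To give an idea of what such a JSON tree can look like, here is a sketch loosely modeled on the block and text node shape, written without importing Slate. The `type` and `bold` properties, the interfaces and the `toPlainText` helper are assumptions for illustration and are not the Slate API.

```typescript
// Leaf node: a run of text with optional formatting flags.
interface TextNode {
  text: string;
  bold?: boolean;
}

// Block node: a typed container of blocks or text runs.
interface BlockNode {
  type: string;
  children: Array<BlockNode | TextNode>;
}

type EditorNode = BlockNode | TextNode;

const documentTree: BlockNode[] = [
  {
    type: "paragraph",
    children: [
      { text: "Rich text is " },
      { text: "styled", bold: true },
      { text: " text." },
    ],
  },
];

function isText(node: EditorNode): node is TextNode {
  return "text" in node;
}

// Walks the tree and concatenates the text of every leaf.
function toPlainText(node: EditorNode): string {
  return isText(node) ? node.text : node.children.map((child) => toPlainText(child)).join("");
}

const plain = toPlainText(documentTree[0]);
// plain is "Rich text is styled text."
```

Because the document is plain JSON, it can be stored, validated on the server, and rendered by something other than the editor itself.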

Lexical

https://lexical.dev/

Gutenberg Wordpress

https://github.com/WordPress/gutenberg

React-contenteditable

react-contenteditable - Example with:

  • two-way updates between the HTML text and the React HTML DOM
  • an edit button powered by execCommand to modify the HTML

Draft

Draft.js - Framework for building rich text editors in React.

QuillJs

https://quilljs.com/

(Used by purpose)

Prosemirror

https://prosemirror.net/ uses its own kind of DOM, with one important difference: it stops being recursive at the inline/paragraph node level.

[Figure: ProseMirror paragraph DOM compared with the HTML (XML) paragraph DOM]

The reason is that a flat structure:

  • allows positions in a paragraph to be represented as a character offset rather than as a path in a tree,
  • makes it easier to perform operations such as splitting or restyling content without tree manipulation.

Example:
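The flat paragraph model described above can be sketched as a list of text spans, each carrying its own set of marks. The `Span` type and the `addMark` function are hypothetical and are not the ProseMirror API. They only show how a style can be applied to a character range by splitting spans, with no tree involved.

```typescript
// A paragraph is a flat sequence of spans. Styles are marks on a span,
// not wrapper elements nested inside each other.
interface Span {
  text: string;
  marks: string[];
}

// Adds `mark` to the characters in [from, to), where both positions
// are character offsets from the start of the paragraph.
function addMark(spans: Span[], from: number, to: number, mark: string): Span[] {
  const result: Span[] = [];
  let offset = 0;
  for (const span of spans) {
    const start = offset;
    const end = offset + span.text.length;
    offset = end;
    // Part of this span that overlaps the requested range.
    const cutStart = Math.max(from, start);
    const cutEnd = Math.min(to, end);
    if (cutStart >= cutEnd) {
      result.push(span); // no overlap: keep the span as it is
      continue;
    }
    const beforeText = span.text.slice(0, cutStart - start);
    const insideText = span.text.slice(cutStart - start, cutEnd - start);
    const afterText = span.text.slice(cutEnd - start);
    const marks = span.marks.indexOf(mark) !== -1 ? span.marks : [...span.marks, mark];
    if (beforeText !== "") result.push({ text: beforeText, marks: span.marks });
    result.push({ text: insideText, marks });
    if (afterText !== "") result.push({ text: afterText, marks: span.marks });
  }
  return result;
}

const paragraph: Span[] = [{ text: "Hello world", marks: [] }];
const styled = addMark(paragraph, 6, 11, "bold");
// styled is [{ text: "Hello ", marks: [] }, { text: "world", marks: ["bold"] }]
```

Applying a second mark over an overlapping range works the same way, whereas nested HTML elements would require restructuring the tree.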

Squire

https://github.com/neilj/Squire - Widely used. The HTML (the DOM) remains the source of truth.

Syntax tree to HTML

Medium Editor

Medium Editor - Simple project, buggy in its extensions, that provides a Medium-like editor

Pell

https://github.com/jaredreich/pell

Primrose

Text editor based on canvas.

https://github.com/capnmidnight/Primrose

CkEditor

https://ckeditor.com/ - JavaScript rich text editor (free + commercial)

RoosterJs

The structure manipulated is the DOM, which means that creating components can be very difficult (e.g. try to add a code block formatted by Prism and you will understand).

https://github.com/microsoft/roosterjs

Code Editor

An editor may be specialized for editing code. See What are the code editor options available in HTML ?

See also

Language Server Protocol

  • Language Server Protocol - Interface protocol used between an editor or IDE and a language server that provides language features like auto complete, go to definition, find all references etc.

Template Editor / Page Builder

WYSIWYG editors are good for content editing but inappropriate for creating HTML structures.
