About
This article is about character representation and manipulation in JavaScript (i.e. code points).
They:
- are all UTF-16 encoded Unicode characters
- are elements of a string, indexed starting at 0
- may have a length of two (for Unicode characters above the 16-bit code unit range), as the sketch below shows
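A minimal sketch of these properties (the string and variable names are illustrative):
let s = 'a😀';
console.log(s[0]);     // 'a': the element at index 0
console.log(s.length); // 3, not 2: the emoji alone occupies two code units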
You don't need JavaScript to show a Unicode character in HTML. See: HTML - How to show a Unicode Character in HTML
Creation
from Literal
From a literal
- A character is just a string with one character:
let char = 'a';
console.log(`The character a: ${char}`);
- Example with the Unicode hexadecimal escape notation and the High Five character (U+270B):
console.log('\u270B');
- For characters above 16 bits, such as the Grinning Face (U+1F600), you need to use a surrogate pair (\uD83D\uDE00):
console.log('\uD83D\uDE00');
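In an ES2015+ environment (an assumption about your runtime), you can also let the engine compute the surrogate pair for you with the code point escape syntax:
console.log('\u{1F600}'); // same output as the surrogate pair above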
from String
let foo = "foo \u270B";
let character = foo.charAt(foo.length - 1); // the character at the last index
console.log(character);
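charAt works here because \u270B fits in a single 16-bit code unit; for a character above that range it would return only half of a surrogate pair. A minimal, code-point-aware alternative using the spread operator (the variable names are illustrative):
let bar = 'bar \u{1F600}';
let chars = [...bar]; // spreading a string iterates by code point, not by code unit
console.log(chars[chars.length - 1]); // the whole Grinning Face character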
from Code Point (number)
From a code point (i.e. the index of the character in the character set).
Example with the High Five character (U+270B):
let hexa = '270B';
let codePoint = parseInt(hexa, 16);
let character = String.fromCodePoint(codePoint);
console.log(`The character with the code point (${codePoint}) is ${character}`);
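The reverse direction, from a character back to its code point, goes through the codePointAt method (reusing the character variable from the snippet above):
console.log(character.codePointAt(0)); // 9995, the decimal value of 0x270B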
Length
For a single character, you may get a length of one or two in JavaScript.
Why? Because:
- JavaScript uses UTF-16 as its character encoding, so Unicode characters with a code point above the 16-bit range (i.e. above 65535) cannot be represented by a single code unit. They therefore use two code units, known as a surrogate pair. See Unicode - Surrogate pair (UTF-16)
- the JavaScript length property returns the number of code units, so you may get a value of two for one character.
Example with the Grinning Face (U+1F600):
console.log('😀'.length); // 2
Other examples of characters encoded with two code units:
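Any code point above U+FFFF behaves the same way; the two below are just illustrative picks:
console.log('\u{1D11E}'.length); // 2: Musical Symbol G Clef (U+1D11E)
console.log('\u{1F4A9}'.length); // 2: Pile of Poo emoji (U+1F4A9)
To count characters rather than code units, you can spread the string first:
console.log([...'😀'].length); // 1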