Text - UTF-16 Character Set

1 - About

UTF-16 is a variable-length encoding of Unicode: each code point is encoded as either one or two 16-bit code units. The size in memory of a string of n code points therefore depends on the particular code points it contains.
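
As a small illustration of the variable length, the sketch below counts how many 16-bit code units each code point of a string occupies. The helper name codeUnitsPerCodePoint and the sample string are assumptions made for this example. It relies on JavaScript strings being sequences of UTF-16 code units, while for...of iterates them by code point.

```javascript
// Hypothetical helper: for each code point of a string, report how many
// 16-bit code units it occupies in UTF-16 (1 or 2).
function codeUnitsPerCodePoint(text) {
  const result = [];
  // for...of walks a string code point by code point, not unit by unit
  for (const character of text) {
    result.push({ character, codeUnits: character.length });
  }
  return result;
}

// 'a', U+00F8, U+20AC and U+1F600 (the last one lies above U+FFFF)
const sample = 'a\u00F8\u20AC\u{1F600}';
console.log(sample.length);      // 5 code units
console.log([...sample].length); // 4 code points
console.log(codeUnitsPerCodePoint(sample));
```

The string holds four code points but five code units, because only the last code point needs two units.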

Finding the nth code point of a string is no longer a constant-time operation as in UCS-2: it generally requires scanning from the beginning of the string.
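
A minimal sketch of such a scan, assuming a 0-based position counted in code points. The function name nthCodePoint is hypothetical; codePointAt() is the standard String method.

```javascript
// Hypothetical helper: return the code point at position n (0-based, counted
// in code points) by scanning from the start of the string.
function nthCodePoint(text, n) {
  let index = 0; // current position, counted in code units
  for (let count = 0; index < text.length; count++) {
    const codePoint = text.codePointAt(index);
    if (count === n) {
      return codePoint;
    }
    // Code points above U+FFFF take two code units, all others take one
    index += codePoint > 0xFFFF ? 2 : 1;
  }
  return undefined; // n is past the end of the string
}

// U+1F600 (two code units), then 'a', then U+00F8
const mixed = '\u{1F600}a\u00F8';
console.log(nthCodePoint(mixed, 1).toString(16)); // 61
console.log(mixed.charCodeAt(1).toString(16));    // de00 (a low surrogate, not 'a')
```

The cost grows with n, since every earlier code point has to be inspected to know whether it took one or two code units.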

3 - Management

3.1 - How to represent characters above 16 bits

Unicode defines code points beyond the 16-bit range (up to U+10FFFF). To represent these additional characters, UTF-16 combines two 16-bit code units into what is called a Unicode - Surrogate pair (UTF-16).

For more, see Unicode - Surrogate pair (UTF-16)
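
The arithmetic behind a surrogate pair can be sketched as follows. The function name toSurrogatePair is hypothetical; the constants 0x10000, 0xD800 and 0xDC00 are those of the UTF-16 encoding form.

```javascript
// Hypothetical helper: split a code point above U+FFFF into its
// UTF-16 surrogate pair (high surrogate first, then low surrogate).
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError('code point must be in U+10000..U+10FFFF');
  }
  const offset = codePoint - 0x10000;    // a 20-bit value
  const high = 0xD800 + (offset >> 10);  // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1F600);
console.log(high.toString(16), low.toString(16)); // d83d de00
```

Passing both code units to String.fromCharCode(high, low) gives back the original character.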

4 - Example

4.1 - Javascript

The charCodeAt() method returns the UTF-16 code unit (an integer between 0 and 65535) at the given index.


// 'ø' is U+00F8, which fits in a single UTF-16 code unit
const codePointDecimal = 'ø'.charCodeAt(0);
console.log(codePointDecimal); // 248
const codePointHexa = codePointDecimal.toString(16);
console.log(codePointHexa); // f8

In the Windows Character Map, you can search for this code point and verify that it is the right character.
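
For a character above U+FFFF, charCodeAt() returns only one half of the surrogate pair. The short sketch below contrasts it with the standard codePointAt() method; the variable name emoji and the chosen character U+1F600 are assumptions for illustration.

```javascript
const emoji = '\u{1F600}'; // a code point above U+FFFF, stored as two code units
console.log(emoji.charCodeAt(0).toString(16));  // d83d (high surrogate)
console.log(emoji.charCodeAt(1).toString(16));  // de00 (low surrogate)
console.log(emoji.codePointAt(0).toString(16)); // 1f600 (the full code point)
```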

