BOM (byte order mark)


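The byte order mark (BOM) is the Unicode character U+FEFF BYTE ORDER MARK, used as a magic number (header) at the start of a text stream.

It is not part of the text content. On disk it appears as a short byte sequence at the beginning of the file, and the exact bytes depend on the encoding (for example, EF BB BF in UTF-8, FE FF in big-endian UTF-16, FF FE in little-endian UTF-16).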

It can be found at the start of a text file and indicates:

  • The byte order, or endianness, of the text file;
  • The fact that the text stream's encoding is Unicode, to a high level of confidence;
  • Which Unicode encoding the text stream is encoded as.
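
The three points above can be illustrated with a minimal sketch that inspects the first bytes of a file. The function name detect_bom and the BOMS table are made up for this example; only the BOM constants come from Python's standard codecs module. A match is a strong hint rather than a proof, since arbitrary binary data can start with the same bytes.

```python
import codecs

# Candidate marks, longest first: the UTF-32 LE mark (FF FE 00 00) begins with
# the UTF-16 LE mark (FF FE), so UTF-32 has to be tested before UTF-16.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

The returned name carries both the encoding and, for UTF-16 and UTF-32, the byte order. The returned length tells the caller how many bytes to skip before decoding the rest of the data.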


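If you pass the right character set to a file reader, you usually don't need to handle the BOM yourself. Whether the reader consumes it depends on the library and on the charset name you give: some decoders strip the BOM, others return it as a leading U+FEFF character in the text.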
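
As a hedged illustration in Python (other languages and readers behave differently), the sketch below writes a small UTF-8 file that starts with a BOM and reads it back twice. The helper name read_text and the temporary file are assumptions made for this example. The "utf-8-sig" codec consumes a leading BOM, while the plain "utf-8" codec leaves it in the decoded text.

```python
import codecs
import os
import tempfile


def read_text(path, encoding="utf-8-sig"):
    """Read a whole text file with the given charset (hypothetical helper)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Create a temporary UTF-8 file whose first three bytes are the BOM.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + "héllo".encode("utf-8"))

with_sig = read_text(path, "utf-8-sig")  # BOM is consumed by the decoder
plain = read_text(path, "utf-8")         # BOM survives as a leading U+FEFF

os.remove(path)
```

Here with_sig holds only the text, while plain starts with the character U+FEFF, which the caller would have to strip before comparing or parsing the content.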
