Lexical Analysis - (Token|Lexical unit|Lexeme|Symbol|Word)



A token is a symbol of the vocabulary of the language.

Each token is a single atomic unit of the language.

The token syntax is typically a regular language, so a finite state automaton constructed from a regular expression can be used to recognize it.
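As a minimal sketch of that idea, the automaton below recognizes one kind of token, an identifier, assuming identifiers follow the regular expression `[A-Za-z_][A-Za-z0-9_]*`. That pattern, the state names, and the function name `accepts_identifier` are assumptions made for illustration; a real language defines its own rules.

```python
import string

# Assumed character classes for an identifier: [A-Za-z_][A-Za-z0-9_]*
FIRST_CHARS = set(string.ascii_letters + "_")
NEXT_CHARS = set(string.ascii_letters + string.digits + "_")


def accepts_identifier(text: str) -> bool:
    """Run a two-state automaton over text and report whether it accepts."""
    state = "START"
    for ch in text:
        if state == "START" and ch in FIRST_CHARS:
            # The first character must be a letter or an underscore.
            state = "IN_IDENTIFIER"
        elif state == "IN_IDENTIFIER" and ch in NEXT_CHARS:
            # Later characters may also be digits; stay in the same state.
            state = "IN_IDENTIFIER"
        else:
            # No transition is defined for this character: reject.
            return False
    # IN_IDENTIFIER is the only accepting state, so empty input is rejected.
    return state == "IN_IDENTIFIER"
```

With these assumed rules, `accepts_identifier("sum")` is accepted while `accepts_identifier("3sum")` is rejected, because no transition leaves the start state on a digit.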

A token is:

The process of finding and categorizing tokens from an input stream is called “tokenizing” and is performed by a Lexer (Lexical analyzer).

A token is the result of breaking a document down into the atomic elements of a language.

Lexeme Type

A token might be:


Consider the following programming expression:

sum = 3 + 2;

It is tokenized as shown in the following table:

Lexeme    Lexeme type
sum       Identifier
=         Assignment operator
3         Integer literal
+         Addition operator
2         Integer literal
;         End of statement
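The table above can be reproduced with a small regular-expression-based lexer. This is only a sketch: the patterns in `TOKEN_SPEC`, the `Whitespace` category, and the function name `tokenize` are assumptions chosen to cover this one expression, not the rules of any particular language.

```python
import re

# Assumed lexical rules, tried in order at each position of the input.
TOKEN_SPEC = [
    ("Identifier", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("Integer literal", re.compile(r"[0-9]+")),
    ("Assignment operator", re.compile(r"=")),
    ("Addition operator", re.compile(r"\+")),
    ("End of statement", re.compile(r";")),
    ("Whitespace", re.compile(r"\s+")),
]


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source into (lexeme, lexeme type) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        for lexeme_type, pattern in TOKEN_SPEC:
            match = pattern.match(source, pos)
            if match:
                if lexeme_type != "Whitespace":
                    tokens.append((match.group(), lexeme_type))
                pos = match.end()
                break
        else:
            # No rule matched at this position.
            raise ValueError(f"Unexpected character {source[pos]!r} at position {pos}")
    return tokens


for lexeme, lexeme_type in tokenize("sum = 3 + 2;"):
    print(lexeme, lexeme_type)
```

Whitespace is matched so that the lexer can advance past it, but it is not emitted as a token, which is why it does not appear in the table.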


Terminal / Non terminal


A token that has a name is called an identifier.

Symbol Table

A symbol table is a table of all tokens that have a name (i.e., identifiers).
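A minimal sketch of such a table, assuming the tokens are already available as (lexeme, lexeme type) pairs like those in the `sum = 3 + 2;` example. The function name `build_symbol_table` and the choice to record token positions are illustrative assumptions; real compilers usually store more per entry, such as type and scope.

```python
def build_symbol_table(tokens: list[tuple[str, str]]) -> dict[str, list[int]]:
    """Map each identifier name to the positions where it occurs in the token list."""
    symbol_table: dict[str, list[int]] = {}
    for position, (lexeme, lexeme_type) in enumerate(tokens):
        # Only tokens with a name (identifiers) get an entry.
        if lexeme_type == "Identifier":
            symbol_table.setdefault(lexeme, []).append(position)
    return symbol_table


# Tokens of the expression `sum = 3 + 2;`, written out by hand.
tokens = [
    ("sum", "Identifier"),
    ("=", "Assignment operator"),
    ("3", "Integer literal"),
    ("+", "Addition operator"),
    ("2", "Integer literal"),
    (";", "End of statement"),
]

print(build_symbol_table(tokens))
```

Here only `sum` ends up in the table, since it is the only token with a name.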

