Compiler - Lexer rule (Token names, Lexical rule, Token name)



A lexer rule is a rule written specifically for the lexer: it defines a token.

The lexer creates tokens from the parts of the input text that match these rules.

A lexer rule is also known as:

  • lexical rule
  • token specification
  • or token name

In ANTLR, lexer rule names begin with an uppercase letter, whereas parser rule names begin with a lowercase letter.


Example of lexer rules in the ANTLR grammar syntax:

ID      : [a-zA-Z]+ ;        // match one or more letters, a-z or A-Z
INT     : [0-9]+ ;           // match a series of digits from 0 to 9
DIGITS  : [0-9]+ ;           // same pattern as INT
NEWLINE : '\r'? '\n' ;       // match/return newlines to the parser (end-of-statement signal)
WS      : [ \t]+ -> skip ;   // discard spaces and tabs
HEX     : ('%' [a-fA-F0-9] [a-fA-F0-9])+ ;   // one or more '%' followed by two hexadecimal digits
STRING  : ([a-zA-Z~] | HEX) ([a-zA-Z0-9.-] | HEX)* ;   // a lexer rule can use other lexer rules
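
To illustrate how a lexer might turn text into tokens with such rules, here is a minimal Python sketch. It is not how ANTLR is implemented: the names `RULES` and `tokenize` are hypothetical, and Python regular expressions only approximate the ANTLR patterns. It assumes the ANTLR 4 behaviour that the longest match wins and that, on a tie, the rule listed first wins.

```python
import re

# Hypothetical translation of the grammar rules above into Python regexes.
HEX = r"(?:%[a-fA-F0-9][a-fA-F0-9])+"
STRING = rf"(?:[a-zA-Z~]|{HEX})(?:[a-zA-Z0-9.-]|{HEX})*"

# (token name, pattern, skip?) in the same order as the grammar.
RULES = [
    ("ID", re.compile(r"[a-zA-Z]+"), False),
    ("INT", re.compile(r"[0-9]+"), False),
    ("DIGITS", re.compile(r"[0-9]+"), False),  # same pattern as INT, so it never wins
    ("NEWLINE", re.compile(r"\r?\n"), False),
    ("WS", re.compile(r"[ \t]+"), True),       # mimics '-> skip'
    ("HEX", re.compile(HEX), False),
    ("STRING", re.compile(STRING), False),
]


def tokenize(text):
    """Return a list of (token name, matched text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None  # (name, end position, skip?)
        for name, regex, skip in RULES:
            m = regex.match(text, pos)
            # Strictly longer matches replace the current best,
            # so the first rule listed wins a tie.
            if m and (best is None or m.end() > best[1]):
                best = (name, m.end(), skip)
        if best is None:
            raise ValueError(f"no lexer rule matches at position {pos}")
        name, end, skip = best
        if not skip:
            tokens.append((name, text[pos:end]))
        pos = end
    return tokens
```

In this sketch, `tokenize("abc")` yields an `ID` token because `ID` and `STRING` match the same length and `ID` is listed first, whereas `tokenize("x1")` yields a `STRING` token because `STRING` matches more characters than `ID`.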


You can apply lexer rules conditionally with a lexical mode.
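
The idea of a lexical mode can be sketched in Python as a stack of modes, where only the rules of the mode on top of the stack are tried. Everything here is a hypothetical illustration: the mode names (`DEFAULT`, `INSIDE`), the token names, and the `push:`/`pop`/`skip` action strings are assumptions made for this example, not ANTLR syntax.

```python
import re

# Hypothetical rule sets, one per mode: (token name, pattern, action).
MODES = {
    "DEFAULT": [
        ("OPEN", re.compile(r"<"), "push:INSIDE"),  # enter the INSIDE mode
        ("TEXT", re.compile(r"[^<]+"), None),       # anything up to the next '<'
    ],
    "INSIDE": [
        ("CLOSE", re.compile(r">"), "pop"),         # return to the previous mode
        ("NAME", re.compile(r"[a-zA-Z]+"), None),
        ("WS", re.compile(r"[ \t]+"), "skip"),
    ],
}


def tokenize_with_modes(text):
    """Tokenize text, applying only the rules of the current mode."""
    tokens = []
    stack = ["DEFAULT"]
    pos = 0
    while pos < len(text):
        best = None  # (name, end position, action)
        for name, regex, action in MODES[stack[-1]]:
            m = regex.match(text, pos)
            if m and (best is None or m.end() > best[1]):
                best = (name, m.end(), action)
        if best is None:
            raise ValueError(f"no rule in mode {stack[-1]} matches at position {pos}")
        name, end, action = best
        if action != "skip":
            tokens.append((name, text[pos:end]))
        if action == "pop":
            stack.pop()
        elif action is not None and action.startswith("push:"):
            stack.append(action.split(":", 1)[1])
        pos = end
    return tokens
```

With the input `id <id> id`, the same characters `id` are emitted as part of a `TEXT` token in the `DEFAULT` mode but as a `NAME` token in the `INSIDE` mode, which is the conditional application that a lexical mode provides.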

