About
Grammar in the context of Antlr.
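The Antlr documentation calls the lexical conventions of a grammar file (comments, identifiers, literals, actions, keywords) the grammar lexicon. The part of a grammar that is used by the lexer is called the lexer grammar.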
See:
- antlr/grammars-v4 for a list of existing grammars.
Name
The file containing grammar X must be named X.g4
File
The file name must match the grammar name and end with the .g4 extension.
The grammar name must be declared at the top of the file.
A grammar file can contain:
- only lexer rules: it is then called a lexer grammar
- only parser rules: it is then called a parser grammar
- or both lexer and parser rules: it is then called a combined grammar (default). A combined grammar in one file is the most common approach.
Syntax
Structure of a grammar source file. See antlr/antlr4/blob/master/doc/grammars.md
/** Optional javadoc style comment */
grammar_type? grammar Name;
options {...}
import ... ;
tokens {...}
channels {...} // Only lexer grammars can contain custom channels specifications
@actionName {...}
rule1 // parser and lexer rules, possibly intermingled
...
ruleN
where:
- grammar_type is optional and may be:
  - parser for a parser grammar
  - lexer for a lexer grammar
  - omitted for a combined grammar
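As an illustration of the three grammar types, here is a minimal sketch. The names Expr, ExprLexer and ExprParser and their rules are hypothetical, chosen only for this example. Each "File:" comment marks a separate file, since a file holds exactly one grammar.

```antlr
// File: ExprLexer.g4 (lexer grammar: lexer rules only)
lexer grammar ExprLexer;
INT  : [0-9]+ ;
PLUS : '+' ;
WS   : [ \t\r\n]+ -> skip ;

// File: ExprParser.g4 (parser grammar: parser rules only)
parser grammar ExprParser;
options { tokenVocab = ExprLexer; }   // take the token types from ExprLexer
sum : INT (PLUS INT)* ;

// File: Expr.g4 (combined grammar: no grammar type before the keyword grammar)
grammar Expr;
sum : INT ('+' INT)* ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

The split version and the combined version are meant to describe the same language. A separate lexer grammar is typically chosen when the lexer needs features that a combined grammar does not allow, such as custom channels.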
Tokens
The tokens section declares the token types that a grammar needs but for which there is no associated lexer rule. These tokens are added to the overall set of tokens.
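A possible use of the tokens section is sketched below. The grammar name Blocks and all its rules are hypothetical. BEGIN and END have no lexer rule of their own; other lexer rules assign these types with the type lexer command.

```antlr
grammar Blocks;

// BEGIN and END are declared here, not defined by a lexer rule
tokens { BEGIN, END }

// The parser only sees the BEGIN and END token types
block : BEGIN ID* END ;

// Two spellings are mapped to each declared token type
LBRACE   : '{'     -> type(BEGIN) ;
BEGIN_KW : 'begin' -> type(BEGIN) ;
RBRACE   : '}'     -> type(END) ;
END_KW   : 'end'   -> type(END) ;

ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
```

With this sketch, the parser rule block should accept both { a b } and begin a b end, because the lexer emits the same token types for both spellings.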
Rule
There are two kinds of rules:
Name starts with | Type | Description | Example from the getting started |
---|---|---|---|
uppercase letter | lexer rule | Also known as a token name. Lexer rules define the tokens that the lexer produces. | ID : [a-z]+ ; defines an ID token made of letters from a to z |
lowercase letter | parser rule | Parser rules define how the tokens relate to each other and therefore how the parse tree is built. | r : 'hello' ID ; defines a pattern made of the word hello followed by the ID token defined just above |
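Put together, the two example rules of the table form the small grammar below, stored in a file named Hello.g4. The WS rule is added here so that the spaces between the tokens are ignored.

```antlr
// File: Hello.g4
grammar Hello;

r  : 'hello' ID ;          // parser rule (lowercase): the word hello followed by an ID token
ID : [a-z]+ ;              // lexer rule (uppercase): one or more letters from a to z
WS : [ \t\r\n]+ -> skip ;  // lexer rule: discard spaces, tabs and newlines
```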
Channels
Only lexer grammars can contain custom channel specifications. Combined grammars cannot.
channels {
WHITESPACE_CHANNEL,
COMMENTS_CHANNEL
}
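Once declared, a channel is used with the channel lexer command. The sketch below assumes a hypothetical lexer grammar named ChannelDemo and reuses the two channel names declared above.

```antlr
lexer grammar ChannelDemo;

channels {
  WHITESPACE_CHANNEL,
  COMMENTS_CHANNEL
}

ID : [a-z]+ ;

// These tokens are kept in the token stream but sent to a custom channel
WS           : [ \t\r\n]+    -> channel(WHITESPACE_CHANNEL) ;
LINE_COMMENT : '//' ~[\r\n]* -> channel(COMMENTS_CHANNEL) ;
```

Unlike skip, which throws the text away, a custom channel keeps whitespace and comments available to tools that need them, while a parser reading the default channel does not see them.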
Action
Actions are code blocks written in the target language.
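The sketch below shows named actions and an action embedded in a rule. It assumes a Java target. The grammar name ActionDemo and the field names are hypothetical.

```antlr
grammar ActionDemo;

// Named action: in a combined grammar, injected into the generated parser and lexer
@header {
import java.util.ArrayList;
import java.util.List;
}

// Named action restricted to the parser: adds a field to the generated parser class
@parser::members {
List<String> names = new ArrayList<>();
}

// Embedded action: Java code executed when the parser has matched ID in this rule
r : 'hello' ID { names.add($ID.text); } ;

ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
```

Because the code inside the braces is copied into the generated sources, a grammar with actions is tied to one target language.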
Keywords / Reserved words
- import: for grammar import
- fragment
- lexer
- parser
- grammar
- returns
- locals
- throws
- catch
- finally
- mode
- options
- tokens
Further, do not use:
- the word rule as a rule name.
- any keyword of the target language as a token, label, or rule name. For example, a rule named if would result in a generated function called if, which would not compile.
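A common way around this restriction is to give the rule a longer name, as in the sketch below. It assumes a Java target, and the grammar name Stmt and its rules are hypothetical.

```antlr
grammar Stmt;

// A rule named if would be generated as a method named if(), which Java rejects.
// The name ifStatement avoids the clash.
ifStatement : IF ID ;

IF : 'if' ;                // the token name IF is not a Java keyword (Java is case sensitive)
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
```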