Syntactic equations - Backus Naur Form (BNF), Extended BNF (EBNF) and ABNF

About

This grammar notation was introduced in 1960 by J. Backus and P. Naur. It is therefore called Backus Naur Form (BNF) (Naur, 1960).

For each EBNF construct there exists a translation rule which yields to a program fragment.

BNF grammars are pretty easy to read (Just replace the ::= sign with is or matches)

Each lexer rule is either matched or not, so every BNF expression is a boolean expression.

Example

Let:

  • non-terminal symbols be identifiers as we know them from programming languages, that is, as sequences of letters (and possibly digits), for example, expression, term.
  • terminal symbols be character sequences enclosed in quotes (strings), for example, “=”, “|”.

Definition of the structure of these equations in

EBF

syntax = production syntax | ∅.
production = identifier "=" expression "." .
expression = term | expression "|" term.
term = factor | term factor.
factor = identifier | string.
identifier = letter | identifier letter | identifier digit.
string = stringhead """.
stringhead = """ | stringhead character.

letter = "A" | ... | "Z".
digit = "0" | ... | "9".

BNF

From bnf-syntax

- definition
    =
    :=
    ::=
- concatenation
    ,
    <whitespace>
- termination
    ;
- alternation
    |
- option
    [ ... ]
    ?
- repetition
    { ... } => 0..N
    expression* => 0..N
    expression+ => 1..N
    <digits> * expression => <digits>...<digits>
    <digits> * [expression] => <0>...<digits>
    <digits> * expression? => <0>...<digits>
- grouping
    ( ... )
- literal
    " ... " or ' ... '
- special characters
    (? ... ?)
- comments
    (* ... *)

EBNF

Using recursion to express simple repetitions is rather detrimental to readability.

The extension of BNF called EBNF (Wirth, 1977):

  • adds two constructs to express repetition and optionality.
  • allows expressions to be enclosed within parentheses.
syntax = {production}.
production = identifier "=" expression "." .
expression = term {"|" term}.
term = factor {factor}.
factor = identifier | string | "(" expression ")" | "[" expression "]" | "{" expression "}".
identifier = letter {letter | digit}.
string = """ {character} """.
letter = "A" | ... | "Z".
digit = "0" | ... | "9".

where

  • A factor of the form {x} is equivalent to an arbitrarily long sequence of x, including the empty
  • A factor of the form [x] is equivalent to “x or nothing”, that is, it expresses optionality.

Hence, the need for the special symbol ∅ for the empty sequence vanishes.

Operators

  • from regular expressions - quantifier
    • ?: which means that the preceding symbol (or group of symbols) is optional. That is, it may appear once or not at all.
    • *: Means that the preceding symbol (or group of symbols) may appear zero or more times.
    • +: Means that the preceding symbol (or group of symbols) may appear one or more times.
  • Parentheses may be used to group symbols together.

ABNF

The differences between standard BNF and ABNF (Augmented BNF) involve naming rules, repetition, alternatives, order-independence, and value ranges.

Augmented BNF for Syntax Specifications: ABNF, D. Crocker, P. Overell. IETF.

Rfc 5324 - ABNF

Used in HTTP specification

Characters specifications

'%'  ( 'b = binary' | 'd = decimal' | 'x = hexadecimal' ) 'value' ( '. = concatenation' 'value' )?

For example in ASCII:

  • a carriage return is specified by:
    • %d13 in decimal
    • or %x0D in hexadecimal
  • a carriage return followed by a line feed may be specified with concatenation as:
    • %d13.10

Grammar Dictionary

Library

Symbol

They may have different signification for each different implementation.

Symbol Description
< > Angle brackets are used to surround the name of a syntactic element (BNF nonterminal) of the language.
::= The definition operator is used to provide definitions of the element appearing on the left side of the operator in a production rule.
[ ] Square brackets are used to indicate optional elements in a formula. Optional elements can be specified or omitted.
{ } Braces group elements in a formula. Repetitive elements (zero or more elements) can be specified within brace symbols.
| The alternative operator indicates that the portion of the formula following the bar is an alternative to the portion preceding it.
Ellipsis indicates that the element can be repeated any number of times. If ellipsis appears after grouped elements, the grouped elements enclosed with braces can be repeated any number of times. If ellipsis appears after a single element, only this element can be repeated any number of times.
!! Introduces normal English text. This is used when the definition of a syntactic element is not expressed in BNF.

Documentation / Reference


Powered by ComboStrap