Language - Compiler compilers or (lexer|parser) generators

About

A compiler-compiler splits the work into a lexer and a parser, and generates both of them from a language description file called a grammar.

Parsers and lexical analysers are long and complex components. A software engineer writing an efficient lexical analyser or parser directly has to carefully consider the interactions between the rules.
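To make the lexer/parser split concrete, here is a minimal hand-written sketch of the two components for a tiny arithmetic language. The token names, the grammar and the functions `lex` and `parse` are assumptions made up for this illustration; they are not the output of any of the tools listed on this page.

```python
import re

# Hypothetical token rules for a tiny arithmetic language.
TOKEN_SPEC = [("NUMBER", r"\d+"), ("PLUS", r"\+"), ("TIMES", r"\*"), ("SKIP", r"\s+")]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def lex(text):
    """Lexer: turn characters into (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":          # whitespace is dropped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

def parse(tokens):
    """Parser: build a nested-tuple tree from the tokens.

    expr := term (PLUS term)*
    term := NUMBER (TIMES NUMBER)*
    """
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def number():
        nonlocal pos
        if peek() != "NUMBER":
            raise SyntaxError(f"expected NUMBER, found {peek()}")
        value = int(tokens[pos][1])
        pos += 1
        return value

    def term():
        nonlocal pos
        node = number()
        while peek() == "TIMES":
            pos += 1
            node = ("*", node, number())
        return node

    def expr():
        nonlocal pos
        node = term()
        while peek() == "PLUS":
            pos += 1
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() != "EOF":
        raise SyntaxError(f"unexpected token {peek()}")
    return tree

print(parse(lex("1 + 2 * 3")))   # ('+', 1, ('*', 2, 3))
```

Even in this small case the two rules of the parser have to agree on who consumes which token, which is the kind of interaction a generator handles for you.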

Application

Build process

Because a compiler-compiler has to generate the lexer and parser, the build process becomes a little more complicated:

  • First, the compiler-compiler must generate the lexer and parser from the grammar file.
  • Then, you can use the generated lexer and parser to compile the code.
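The two steps can be sketched with a toy generator. Everything here is an assumption for illustration: the grammar format (one `NAME pattern` rule per line), the function `generate_lexer_source` and the generated `tokenize` function are hypothetical and do not correspond to any real tool. A real build would normally write the generated source to a file and compile it with the rest of the project; this sketch loads it with `exec` to stay self-contained.

```python
# Hypothetical grammar description: one "NAME pattern" rule per line.
GRAMMAR = """
NUMBER [0-9]+
IDENT  [a-z]+
"""

def generate_lexer_source(grammar):
    """Step 1 (build time): turn the grammar text into Python source code."""
    rules = [line.split(None, 1) for line in grammar.strip().splitlines()]
    alternatives = "|".join(
        f"(?P<{name}>{pattern.strip()})" for name, pattern in rules
    )
    return (
        "import re\n"
        f"_PATTERN = re.compile({alternatives!r})\n"
        "def tokenize(text):\n"
        "    # Characters that match no rule are silently skipped in this toy.\n"
        "    return [(m.lastgroup, m.group()) for m in _PATTERN.finditer(text)]\n"
    )

# Step 2 (after generation): load the generated code and use it.
generated_source = generate_lexer_source(GRAMMAR)
namespace = {}
exec(generated_source, namespace)
tokenize = namespace["tokenize"]

print(tokenize("x 42"))   # [('IDENT', 'x'), ('NUMBER', '42')]
```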

Error handling

The lexical analyser and parser are also responsible for generating error messages when the input does not conform to the lexical or syntactic rules of the language.
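As a rough sketch of what this means in practice, the hypothetical lexer below reports a lexical error with a line and column, and the hypothetical checker reports a syntactic error when the token order breaks the rule `NUMBER (PLUS NUMBER)*`. The names `LexicalError`, `tokenize` and `check_sum`, and the message formats, are assumptions for this example only.

```python
import re

TOKEN = re.compile(r"(?P<NUMBER>\d+)|(?P<PLUS>\+)|(?P<WS>[ \t]+)|(?P<NEWLINE>\n)")

class LexicalError(Exception):
    """Raised when a character matches no lexical rule."""

def tokenize(text):
    tokens, pos, line, line_start = [], 0, 1, 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            column = pos - line_start + 1
            raise LexicalError(
                f"line {line}, column {column}: unexpected character {text[pos]!r}"
            )
        if m.lastgroup == "NEWLINE":
            line += 1
            line_start = m.end()       # the next line starts after the newline
        elif m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

def check_sum(tokens):
    """Syntactic rule: NUMBER (PLUS NUMBER)*"""
    expected = "NUMBER"
    for index, (kind, lexeme) in enumerate(tokens):
        if kind != expected:
            raise SyntaxError(
                f"token {index + 1}: expected {expected}, found {kind} {lexeme!r}"
            )
        expected = "PLUS" if kind == "NUMBER" else "NUMBER"
    if expected == "NUMBER":
        raise SyntaxError("unexpected end of input: expected NUMBER")

check_sum(tokenize("1 + 2"))           # valid input: no error is raised
```

Calling `tokenize("1 +\n2 $")` raises a `LexicalError` that points at the `$` on the second line, while `check_sum(tokenize("1 2"))` raises a `SyntaxError` because a `PLUS` was expected between the two numbers.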

Tools

ANTLR, JavaCC, SableCC and Lex are not parsers or lexical analyzers themselves but generators. Each one outputs a lexical analyzer, a parser, or both, according to a specification (the grammar) that it reads from a file.

Flex

JFlex

https://www.jflex.de/ - JFlex is a lexical analyzer generator (also known as scanner generator) for Java, written in Java.

JFlex lexers are based on deterministic finite automata (DFAs).
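To illustrate the idea of a DFA-driven lexer (this is not the code JFlex generates, only a sketch of the technique), the scanner below walks a hand-written transition table and emits the longest run of characters that the automaton accepts. The states, character classes and token names are assumptions for this example.

```python
# Hypothetical DFA for two token kinds:
#   NUMBER = [0-9]+        IDENT = [a-z][a-z0-9]*

def char_class(ch):
    if "0" <= ch <= "9":
        return "digit"
    if "a" <= ch <= "z":
        return "letter"
    return "other"

# (current state, character class) -> next state
TRANSITIONS = {
    ("start", "digit"): "number",
    ("start", "letter"): "ident",
    ("number", "digit"): "number",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
}

# Accepting states and the token kind they produce.
ACCEPTING = {"number": "NUMBER", "ident": "IDENT"}

def scan(text):
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] == " ":           # spaces separate tokens
            pos += 1
            continue
        state, end = "start", pos
        # Follow transitions for as long as one exists (longest match).
        while end < len(text) and (state, char_class(text[end])) in TRANSITIONS:
            state = TRANSITIONS[(state, char_class(text[end]))]
            end += 1
        if state not in ACCEPTING:
            raise ValueError(f"no token matches at offset {pos}")
        tokens.append((ACCEPTING[state], text[pos:end]))
        pos = end
    return tokens

print(scan("x1 42"))   # [('IDENT', 'x1'), ('NUMBER', '42')]
```

Because the automaton is deterministic, each input character is examined once and leads to exactly one next state, which is what makes this style of lexer fast.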

JFlex is designed to work together with the parser generators CUP and BYacc/J. It can also be used together with other parser generators like ANTLR or as a standalone tool.

For JFlex file editing support, see Grammar Kit.

REx Parser Generator

http://bottlecaps.de/rex/

Lex / Yacc

wiki/Lex_(software), wiki/Yacc

Installation - Cygwin includes lex and yacc

Yacc is used by PHP. See bnf yacc grammar

ANTLR

See ANTLR

Babelfish

Dead project Babelfish

Chevrotain

Parser Building Toolkit for JavaScript

https://github.com/Chevrotain/chevrotain

JavaCC

JavaCC is a tool used in many applications. It is much like ANTLR, with a few feature differences here and there. However, it generates only Java code.

Used by: BeanShell

Janino

Java : http://janino-compiler.github.io/janino/

used by Calcite (Farrago, Optiq)

SableCC

SableCC is a compiler-compiler tool for the Java environment. It handles LALR(1) grammars (for those who remember their grammar categories). In other words, it generates bottom-up parsers (unlike JavaCC and ANTLR, which generate top-down parsers).

SableCC takes an unconventional and interesting approach: it uses object-oriented methodology to construct parsers, which results in easy-to-maintain code for the generated parser. However, there are some performance issues at this point in time. It generates output in both C++ and Java.
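The bottom-up idea can be sketched with a naive shift-reduce loop: tokens are shifted onto a stack, and whenever the top of the stack matches the right-hand side of a rule, it is reduced to the rule's left-hand side. This is only an illustration on a hypothetical two-rule grammar; it is not the LALR(1) table construction that a tool such as SableCC performs, and the greedy reduction used here works only because the grammar is so small.

```python
# Hypothetical grammar:  E -> E "+" n | n
RULES = [
    ("E", ["E", "+", "n"]),   # the longer rule is tried first
    ("E", ["n"]),
]

def shift_reduce(tokens):
    stack = []                # list of (symbol, tree) pairs
    for token in tokens:
        symbol = "+" if token == "+" else "n"
        stack.append((symbol, token))                      # shift
        reduced = True
        while reduced:                                     # reduce while possible
            reduced = False
            for head, body in RULES:
                top = stack[-len(body):]
                if [s for s, _ in top] == body:
                    children = [tree for _, tree in top]
                    del stack[-len(body):]
                    tree = children[0] if len(children) == 1 else tuple(children)
                    stack.append((head, tree))
                    reduced = True
                    break
    if [s for s, _ in stack] != ["E"]:
        raise SyntaxError("input is not a valid sum")
    return stack[0][1]

print(shift_reduce(["1", "+", "2", "+", "3"]))
# (('1', '+', '2'), '+', '3')
```

The tree is built from the leaves upward: `1 + 2` is reduced first and then becomes the left child of the outer sum. A top-down parser would instead start from `E` and predict which rule to expand.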

Unified

https://unified.js.org/

unified is an interface for processing text using syntax trees. It’s what powers remark, retext, and rehype, but it also allows for processing between multiple syntaxes.

unified enabled new exciting projects like Gatsby to pull in markdown, MDX to embed JSX, and Prettier to format it. It’s used to check code for Storybook, debugger.html (Mozilla), and opensource.guide (GitHub).

Acorn (Javascript)

https://github.com/acornjs/acorn

Jison (Deprecated)

Jison is used by mermaid. Unfortunately, it is no longer updated.
