Calcite (Farrago, Optiq)


Calcite is a Java SQL processing engine in which data storage is provided by plugins (adapters).

Calcite is an open-source, cost-based query optimizer and query execution framework.

Getting Started


  • Catalog: metadata and namespaces
  • SQL parser: parses the SQL string into a SqlNode (an abstract syntax tree)
  • SQL validator: validates the SQL tree against the catalog
  • SQL-to-Rel converter: transforms the validated SQL tree into a relational expression (RelNode)
  • Query optimizer: optimizes/rewrites the logical plan (relational expression); the output is called a physical plan
  • SQL generator: converts a relational expression back to SQL

Key Concept

Relational Algebra

Row Expression

  • Row Expression - RexNode (roughly equivalent to Spark's Column), used for:
    • Projection Fields
    • Filter Condition
    • Join Condition
    • Sort fields


  • Input Column Ref - RexInputRef
  • Literal - RexLiteral
  • Struct Field access - RexFieldAccess
  • Function call - RexCall
  • Window expression - RexOver
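
To make the row expression kinds above concrete, here is a minimal sketch of an expression tree in plain Java. It does not use Calcite: the types Rex, InputRef, Literal and Call are hypothetical stand-ins that only mirror the roles of RexNode, RexInputRef, RexLiteral and RexCall, and the evaluation method is an assumption added for illustration (only integer "+" and ">" are supported).

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these types are hypothetical and are NOT Calcite classes.
class ToyRexDemo {
    // Common supertype, playing the role of RexNode.
    interface Rex {
        Object eval(List<Object> row);
    }

    // Plays the role of RexInputRef: refers to an input column by 0-based position.
    static final class InputRef implements Rex {
        final int index;
        InputRef(int index) { this.index = index; }
        @Override public Object eval(List<Object> row) { return row.get(index); }
        @Override public String toString() { return "$" + index; }
    }

    // Plays the role of RexLiteral: a constant value.
    static final class Literal implements Rex {
        final Object value;
        Literal(Object value) { this.value = value; }
        @Override public Object eval(List<Object> row) { return value; }
        @Override public String toString() { return String.valueOf(value); }
    }

    // Plays the role of RexCall: an operator applied to two integer operands.
    static final class Call implements Rex {
        final String op;
        final Rex left;
        final Rex right;
        Call(String op, Rex left, Rex right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        @Override public Object eval(List<Object> row) {
            int l = (Integer) left.eval(row);
            int r = (Integer) right.eval(row);
            switch (op) {
                case "+": return l + r;
                case ">": return l > r;
                default: throw new IllegalArgumentException("Unsupported operator: " + op);
            }
        }
        @Override public String toString() { return op + "(" + left + ", " + right + ")"; }
    }

    public static void main(String[] args) {
        // A filter condition: (column 1 + 10) > 25
        Rex condition = new Call(">",
            new Call("+", new InputRef(1), new Literal(10)),
            new Literal(25));
        List<Object> row = Arrays.<Object>asList("alice", 20);
        System.out.println(condition + " -> " + condition.eval(row));
    }
}
```

The same tree shape can serve as a projection field, a filter condition, a join condition or a sort key; only the operator that holds it differs.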


Rules - RelOptRule (abstract class), used to modify the query plan

  • Planners - RelOptPlanner
  • Programs - Program
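
As a rough illustration of how rules and a planner cooperate, the sketch below rewrites a small plan in plain Java. It is a toy and does not use Calcite: Node, Rule, the two rules and optimize are hypothetical names, and the planner simply applies rules bottom-up until none fires. Calcite's real rules match operand patterns and register transformations through RelOptRule.onMatch, and its planners may also use cost, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical types, NOT Calcite's RelOptRule or RelOptPlanner.
class ToyPlannerDemo {
    // A plan node with an operator name, a detail string and at most one input.
    static final class Node {
        final String op;
        final String detail;
        final Node input;
        Node(String op, String detail, Node input) {
            this.op = op;
            this.detail = detail;
            this.input = input;
        }
        @Override public String toString() {
            String self = op + "(" + detail + ")";
            return input == null ? self : self + " <- " + input;
        }
    }

    // A rule returns the rewritten node, or null when its pattern does not match.
    interface Rule {
        Node apply(Node node);
    }

    // Filter(TRUE) keeps every row, so it can be replaced by its input.
    static final Rule REMOVE_TRUE_FILTER = node ->
        node.op.equals("Filter") && node.detail.equals("TRUE") ? node.input : null;

    // Two stacked filters become a single filter with an AND condition.
    static final Rule MERGE_FILTERS = node -> {
        if (node.op.equals("Filter") && node.input != null && node.input.op.equals("Filter")) {
            Node lower = node.input;
            return new Node("Filter", lower.detail + " AND " + node.detail, lower.input);
        }
        return null;
    };

    static final List<Rule> RULES = Arrays.asList(REMOVE_TRUE_FILTER, MERGE_FILTERS);

    // Optimizes the input first, then applies rules to this node until none matches.
    static Node optimize(Node node, List<Rule> rules) {
        if (node == null) {
            return null;
        }
        Node current = new Node(node.op, node.detail, optimize(node.input, rules));
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                Node rewritten = rule.apply(current);
                if (rewritten != null) {
                    current = rewritten;
                    changed = true;
                    break;
                }
            }
        }
        return current;
    }

    static Node samplePlan() {
        Node scan = new Node("Scan", "emp", null);
        Node f1 = new Node("Filter", "dept_id = 10", scan);
        Node f2 = new Node("Filter", "age > 30", f1);
        Node f3 = new Node("Filter", "TRUE", f2);
        return new Node("Project", "name", f3);
    }

    public static void main(String[] args) {
        Node plan = samplePlan();
        System.out.println("before: " + plan);
        System.out.println("after:  " + optimize(plan, RULES));
    }
}
```

Each rule is a small, local rewrite; the planner decides where and in what order to try them.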

Documentation / Reference

Query to relational Operator

Every query is represented as a tree of relational operators.

You can:

  • translate from SQL to relational algebra,
  • or build the tree directly.
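
The second option, building the operator tree directly, can be sketched in plain Java as follows. This is a toy model and does not use Calcite: Rel, Scan, Filter and Project are hypothetical classes, and the in-memory emp rows are invented sample data. The tree corresponds to the query SELECT name FROM emp WHERE dept_id = 10.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: hypothetical types, NOT Calcite's RelNode hierarchy.
class ToyRelTreeDemo {
    // Every operator can produce rows and describe itself as part of a plan.
    interface Rel {
        List<List<Object>> execute();
        String explain(String indent);
    }

    // Leaf operator: returns the rows of an in-memory table.
    static final class Scan implements Rel {
        final String table;
        final List<List<Object>> rows;
        Scan(String table, List<List<Object>> rows) {
            this.table = table;
            this.rows = rows;
        }
        @Override public List<List<Object>> execute() { return rows; }
        @Override public String explain(String indent) {
            return indent + "Scan(" + table + ")\n";
        }
    }

    // Keeps only the input rows that satisfy the condition.
    static final class Filter implements Rel {
        final Rel input;
        final String label;
        final Predicate<List<Object>> condition;
        Filter(Rel input, String label, Predicate<List<Object>> condition) {
            this.input = input;
            this.label = label;
            this.condition = condition;
        }
        @Override public List<List<Object>> execute() {
            List<List<Object>> out = new ArrayList<>();
            for (List<Object> row : input.execute()) {
                if (condition.test(row)) {
                    out.add(row);
                }
            }
            return out;
        }
        @Override public String explain(String indent) {
            return indent + "Filter(" + label + ")\n" + input.explain(indent + "  ");
        }
    }

    // Keeps only the listed column positions of each input row.
    static final class Project implements Rel {
        final Rel input;
        final int[] columns;
        Project(Rel input, int... columns) {
            this.input = input;
            this.columns = columns;
        }
        @Override public List<List<Object>> execute() {
            List<List<Object>> out = new ArrayList<>();
            for (List<Object> row : input.execute()) {
                List<Object> projected = new ArrayList<>();
                for (int c : columns) {
                    projected.add(row.get(c));
                }
                out.add(projected);
            }
            return out;
        }
        @Override public String explain(String indent) {
            return indent + "Project(" + Arrays.toString(columns) + ")\n"
                + input.explain(indent + "  ");
        }
    }

    // Builds the tree for: SELECT name FROM emp WHERE dept_id = 10
    static Rel samplePlan() {
        List<List<Object>> emp = Arrays.asList(
            Arrays.<Object>asList("alice", 10, 42),
            Arrays.<Object>asList("bob", 20, 35),
            Arrays.<Object>asList("carol", 10, 29));
        Rel scan = new Scan("emp", emp);
        Rel filter = new Filter(scan, "$1 = 10", row -> Integer.valueOf(10).equals(row.get(1)));
        return new Project(filter, 0);
    }

    public static void main(String[] args) {
        Rel plan = samplePlan();
        System.out.print(plan.explain(""));
        System.out.println(plan.execute());
    }
}
```

The root of the tree is the last operator applied (the projection), and the leaves are table scans, which matches the way plans are usually printed top-down.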


Schemas are defined as a list of tables, each containing at minimum a table name and a URL.

  • HTML page and file adapter: if a page has more than one table, you can include selector and index fields in the table definition to specify the desired table. If there is no table specification, the file adapter chooses the largest table on the page.


  • A JSON model of a simple Calcite schema:

    {
      "version": "1.0",
      "defaultSchema": "SALES",
      "schemas": [
        {
          "name": "SALES",
          "type": "custom",
          "factory": "org.apache.calcite.adapter.csv.CsvSchemaFactory",
          "operand": {
            "directory": "sales"
          }
        }
      ]
    }


Adapters can be built programmatically using the Schema SPI. See Calcite Schema SPI.


SELECT and DML are standardized, but DDL tends to be database-specific, so Calcite's policy is that DDL extensions are made outside of Calcite. See CALCITE-609 for an example.

Drill and Phoenix have already extended Calcite’s core parser for DDL, so you could reuse that work.



Dataset: Database - HyperSQL DataBase (HSQLDB)



