Calcite (Farrago, Optiq)
About
Calcite is a Java SQL processing engine in which data storage is provided by plugins (adapters).
Calcite is an open-source, cost-based query optimizer and query execution framework.
- Data Federator
Component
- Catalog: metadata and namespace
- SQL parser: parses the SQL string into a SqlNode (abstract syntax tree)
- SQL validator: validates the SqlNode tree against the catalog
- SQL to Rel converter: transforms the validated SQL tree into a relational expression
- Query optimizer: optimizes (rewrites) the logical plan, which is a relational expression. The output is called a physical plan.
- SQL generator: converts a relational expression back to SQL
Key Concept
Row Expression
A row expression (RexNode, roughly equivalent to Spark's Column) describes a computation on a single row. It is used for:
- Projection fields
- Filter conditions
- Join conditions
- Sort fields
Types of row expression:
- Input Column Ref - RexInputRef
- Literal - RexLiteral
- Struct Field access - RexFieldAccess
- Function call - RexCall
- Window expression - RexOver
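To make the idea of a row expression tree concrete, here is a minimal toy model in plain Java. It does not use Calcite: the nested types InputRef, Literal and Call are hypothetical stand-ins that only mimic the roles of RexInputRef, RexLiteral and RexCall, and the sample row is invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

class RowExprDemo {
    // Toy counterpart of a row expression: something evaluated against one row.
    interface RowExpr {
        Object eval(List<Object> row);
    }

    // Reference to an input column by position (role of RexInputRef).
    static final class InputRef implements RowExpr {
        final int index;
        InputRef(int index) { this.index = index; }
        public Object eval(List<Object> row) { return row.get(index); }
        public String toString() { return "$" + index; }
    }

    // Constant value (role of RexLiteral).
    static final class Literal implements RowExpr {
        final Object value;
        Literal(Object value) { this.value = value; }
        public Object eval(List<Object> row) { return value; }
        public String toString() { return String.valueOf(value); }
    }

    // Binary operator call on two integer operands (role of RexCall).
    static final class Call implements RowExpr {
        final String op;
        final RowExpr left;
        final RowExpr right;
        Call(String op, RowExpr left, RowExpr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public Object eval(List<Object> row) {
            int l = (Integer) left.eval(row);
            int r = (Integer) right.eval(row);
            switch (op) {
                case ">": return l > r;
                case "+": return l + r;
                default: throw new IllegalArgumentException("Unsupported operator: " + op);
            }
        }
        public String toString() { return op + "(" + left + ", " + right + ")"; }
    }

    public static void main(String[] args) {
        // Filter condition "column 1 > 10" evaluated on a sample row.
        RowExpr condition = new Call(">", new InputRef(1), new Literal(10));
        List<Object> row = Arrays.<Object>asList("widget", 42);
        System.out.println(condition + " -> " + condition.eval(row)); // >($1, 10) -> true
    }
}
```

The same tree shape can serve as a projection field, a filter condition or a join condition; only the place where the operator uses it differs.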
Rules
Rules - RelOptRule (an abstract class) are used to modify the query plan
- Planners - RelOptPlanner
- Programs - Program
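The following sketch illustrates what a plan-rewriting rule does, using a toy model in plain Java rather than Calcite itself. The Node, Rule and MergeFilters types, the optimize helper, and the EMP table with its columns are all hypothetical names chosen for illustration. The rule matches two stacked filters and replaces them with a single equivalent filter.

```java
class RuleDemo {
    // Toy plan nodes; hypothetical stand-ins, not Calcite's relational operators.
    static abstract class Node {
        final Node input;
        Node(Node input) { this.input = input; }
    }

    static final class Scan extends Node {
        final String table;
        Scan(String table) { super(null); this.table = table; }
        public String toString() { return "Scan(" + table + ")"; }
    }

    static final class Filter extends Node {
        final String condition;
        Filter(Node input, String condition) { super(input); this.condition = condition; }
        public String toString() { return "Filter(" + condition + ", " + input + ")"; }
    }

    // Toy counterpart of a planner rule: a pattern test plus a rewrite.
    interface Rule {
        boolean matches(Node node);
        Node apply(Node node);
    }

    // Rewrites Filter(Filter(x)) into one Filter(x) with both conditions ANDed.
    static final class MergeFilters implements Rule {
        public boolean matches(Node node) {
            return node instanceof Filter && node.input instanceof Filter;
        }
        public Node apply(Node node) {
            Filter top = (Filter) node;
            Filter bottom = (Filter) top.input;
            return new Filter(bottom.input,
                "(" + bottom.condition + ") AND (" + top.condition + ")");
        }
    }

    // Toy "planner": applies the rule at the root until it no longer matches.
    static Node optimize(Node root, Rule rule) {
        Node current = root;
        while (rule.matches(current)) {
            current = rule.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Node plan = new Filter(new Filter(new Scan("EMP"), "SAL > 1000"), "DEPTNO = 10");
        System.out.println("before: " + plan);
        System.out.println("after:  " + optimize(plan, new MergeFilters()));
    }
}
```

A real planner decides which rules to fire and in which order, and a cost-based planner also compares the cost of the alternatives; this sketch only shows the match-and-rewrite step.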
Documentation / Reference
https://www.slideshare.net/JordanHalterman/introduction-to-apache-calcite
Query to relational Operator
Every query is represented as a tree of relational operators.
You can:
- translate from SQL to relational algebra,
- or build the tree directly.
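As an illustration of building the operator tree directly instead of going through SQL, here is a toy model in plain Java. It is not the Calcite API: the Operator, Scan, Filter and Project types and the sample employee rows are hypothetical. The tree corresponds roughly to a query that selects the name of every employee whose salary is above 1000.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class OperatorTreeDemo {
    // Toy relational operator: every node yields rows (lists of values).
    interface Operator {
        List<List<Object>> execute();
    }

    // Leaf operator that returns the rows of an in-memory table.
    static final class Scan implements Operator {
        final List<List<Object>> rows;
        Scan(List<List<Object>> rows) { this.rows = rows; }
        public List<List<Object>> execute() { return rows; }
    }

    // Keeps only the input rows that satisfy the condition.
    static final class Filter implements Operator {
        final Operator input;
        final Predicate<List<Object>> condition;
        Filter(Operator input, Predicate<List<Object>> condition) {
            this.input = input;
            this.condition = condition;
        }
        public List<List<Object>> execute() {
            List<List<Object>> out = new ArrayList<>();
            for (List<Object> row : input.execute()) {
                if (condition.test(row)) {
                    out.add(row);
                }
            }
            return out;
        }
    }

    // Keeps only the listed column positions of each input row.
    static final class Project implements Operator {
        final Operator input;
        final int[] fields;
        Project(Operator input, int... fields) {
            this.input = input;
            this.fields = fields;
        }
        public List<List<Object>> execute() {
            List<List<Object>> out = new ArrayList<>();
            for (List<Object> row : input.execute()) {
                List<Object> projected = new ArrayList<>();
                for (int field : fields) {
                    projected.add(row.get(field));
                }
                out.add(projected);
            }
            return out;
        }
    }

    // Builds the tree Project(Filter(Scan)) by hand, root last.
    static Operator samplePlan() {
        List<List<Object>> emp = Arrays.asList(
            Arrays.<Object>asList("Alice", 1200),
            Arrays.<Object>asList("Bob", 800));
        Operator scan = new Scan(emp);
        Operator filter = new Filter(scan, row -> (Integer) row.get(1) > 1000);
        return new Project(filter, 0);
    }

    public static void main(String[] args) {
        System.out.println(samplePlan().execute()); // [[Alice]]
    }
}
```

In Calcite itself, trees are built directly with the RelBuilder class, and the resulting tree is handed to the planner rather than executed as is.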
Schema
Schemas are defined as a list of tables, each containing at minimum a table name and a URL.
- HTML page and file adapter: if a page has more than one table, you can include selector and index fields in the table definition to specify the desired table. If there is no table specification, the file adapter chooses the largest table on the page.
Jdbc
jdbc:calcite:model=target/test-classes/model.json
// or
jdbc:calcite:schemaFactory=org.apache.calcite.adapter.druid.DruidSchemaFactory;schema.url=http://localhost:8082;schema.coordinatorUrl=http://localhost:8081
- A JSON model of a simple Calcite schema:
{
  "version": "1.0",
  "defaultSchema": "SALES",
  "schemas": [
    {
      "name": "SALES",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.csv.CsvSchemaFactory",
      "operand": {
        "directory": "sales"
      }
    }
  ]
}
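where:
- defaultSchema: the schema used when a query does not qualify a table name
- type: custom means that the schema is created by the class named in factory
- factory: the schema factory class, here the CSV adapter's CsvSchemaFactory
- operand: the parameters passed to the factory, here the directory that holds the CSV files
An adapter can also be built programmatically using the Schema SPI. See Calcite Schema SPI.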
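A connect string such as the model-based one above can be used from standard JDBC code. The sketch below uses only java.sql classes; the class name CalciteJdbcDemo and the helper modelUrl are hypothetical, and running main assumes that the Calcite JDBC driver and the adapter referenced by the model file are on the classpath. It lists the tables of the SALES schema instead of querying a specific table, because the table names depend on the files in the model's directory.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

class CalciteJdbcDemo {
    // Builds a connect string of the form jdbc:calcite:model=<path to model file>.
    static String modelUrl(String modelPath) {
        return "jdbc:calcite:model=" + modelPath;
    }

    public static void main(String[] args) throws SQLException {
        String url = modelUrl("target/test-classes/model.json");
        // Assumption: the Calcite driver and the adapter named in the model
        // are on the classpath; otherwise no suitable driver is found.
        try (Connection connection = DriverManager.getConnection(url)) {
            DatabaseMetaData metaData = connection.getMetaData();
            // List the tables that the model exposes in the SALES schema.
            try (ResultSet tables = metaData.getTables(null, "SALES", "%", null)) {
                while (tables.next()) {
                    System.out.println(tables.getString("TABLE_NAME"));
                }
            }
        }
    }
}
```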
DDL
SELECT and DML are standardized, but DDL tends to be database-specific, so Calcite's policy is that DDL extensions are made outside of Calcite. See CALCITE-609 for an example.
You could copy work that has already been done in Drill and Phoenix in extending Calcite’s core parser for DDL.
Test
VM:
Dataset: HyperSQL DataBase (HSQLDB)
Planner
- Eigenbase: the project where Calcite’s initial IP came from
Build
Stream
Documentation / Reference
- Parser Extension/Implementation in Phoenix. A 'commit' was added.