About
An Aggregate table is a persistence unit of aggregate data.
Aggregate tables are also known as summary table (from summary)
A materialized view can be used to implement aggregate data
A aggregate is a table that contains the result of a query (generally an aggregated one but not always …).
Articles Related
Usage
Query Performance
Data - Cache - Pre-aggregating and storing additive information is the standard practice for improving the query performance of databases.
An aggregate table is for an analytics system what an index is for an OLTP system.
When users request information at a high grain of aggregation, Pre-aggregating and storing additive information is the standard practice for improving the query performance of relational databases.
Query Rewriting
When a user or an application asks for a column in a fact table, a sql engine that is aware of the aggregate table is capable of rewrite the query against/using the aggregate table. See SQL Engine - Query Rewrite
When data becomes bigger
Parallel
To improve the throughput, you can run the process in parallel by adding more
- hardware (more cpu, memory, disk)
- node (ie computer) - in distributed computing such as Hadoop.
Incremental Processing
Stream - How to capture change of state - Change Data Capture (CDC)
Implementation
You can implement an aggregate table with:
- a full load or incremental dataflow that executes or implements an aggregate sql insert into … select dim1, dim2, sum(add1) from fact group by dim1, dim2)
- or a materialized view