Relational Data Modeling - Aggregate Table (Summary Table)

1 - About

An Aggregate table is a persistence unit of aggregate data.

Aggregate tables are also known as summary table (from summary)

A materialized view can be used to implement aggregate data

A aggregate is a table that contains the result of a query (generally an aggregated one but not always …).

3 - Usage

3.1 - Query Performance

Data - Cache - Pre-aggregating and storing additive information is the standard practice for improving the query performance of databases.

An aggregate table is for an analytics system what an index is for an OLTP system.

When users request information at a high grain of aggregation, Pre-aggregating and storing additive information is the standard practice for improving the query performance of relational databases.

3.2 - Query Rewriting

When a user or an application asks for a column in a fact table, a sql engine that is aware of the aggregate table is capable of rewrite the query against/using the aggregate table. See SQL Engine - Query Rewrite

4 - When data becomes bigger

4.1 - Parallel

To improve the throughput, you can run the process in parallel by adding more

  • hardware (more cpu, memory, disk)
  • node (ie computer) - in distributed computing such as Hadoop.

4.2 - Incremental Processing

5 - Implementation

You can implement an aggregate table with:

  • a full load or incremental dataflow that executes or implements an aggregate sql insert into … select dim1, dim2, sum(add1) from fact group by dim1, dim2)

6 - Documentation / Reference


Data Science
Data Analysis
Statistics
Data Science
Linear Algebra Mathematics
Trigonometry

Powered by ComboStrap