About
A materialized view is an auxiliary data structure that persist the result of a query in a table format managed by the cache system of the database.
The refresh of the data is defined in criteria:
- at time interval,
- or incrementally via the the transaction log
- …
A materialized view creates a real table, and this table may be indexed, analyzed, and so on.
When the queries are known in advance and predictable, materialized views are very helpful.
Articles Related
Example
Oracle
create materialized view EmpSummary as
select deptno, COUNT(*) as c, SUM(sal) as s
from Emp
group by deptno
Usage
Materialized Views are used to:
- cache expensive queries in a data warehouse environment with summarized table. It's then a cache or pre-compute operations engine.
- replicate data to non-master sites in a replication environment
- enable query rewriting
Summary table
Materialized views are the equivalent of a summary table.
In a Online Analytical Processing (Olap) approach, each of the elements of a dimension could be summarized using a hierarchy.
The end user queries the tables and views in the database. The query rewrite mechanism in a database automatically rewrites the SQL query to use this summary tables. See also: Relational Data Modeling - View selection problem (recommending the best aggregation tables) - Data Warehousing
This mechanism reduces response time for returning results from the query. Materialized views within the data warehouse are transparent to the end user or to the database application.
This is relatively straightforward and is answered in a single word - performance. By calculating the answers to the really hard questions up front (and once only), we will greatly reduce the load on the machine, We will experience:
- Less physical reads - There is less data to scan through.
- Less writes - We will not be sorting/aggregating as frequently.
- Decreased CPU consumption - We will not be calculating aggregates and functions on the data, as we will have already done that.
- Markedly faster response times - Our queries will return incredibly quickly when a summary is used, as opposed to the details. This will be a function of the amount of work we can avoid by using the materialized view, but many orders of magnitude is not out of the question.
Materialized views will increase your need for one resource - more permanently allocated disk. We need extra storage space to accommodate the materialized views, of course, but for the price of a little extra disk space, we can reap a lot of benefit.
Also notice that we may have created a materialized view, but when we ANALYZE, we are analyzing a table. A materialized view creates a real table, and this table may be indexed, analyzed, and so on.
Replica
A materialized view is a replica of a target master from a single point in time. The master can be either a master table at a master site or a master materialized view at a materialized view site. Whereas in multimaster replication tables are continuously updated by other master sites, materialized views are updated from one or more masters through individual batch updates, known as a refreshes, from a single master site or master materialized view site.
Database Implementation
- materialized query tables (DB2)
- indexed views (SQL Server)
Algorithm
Given a database with 30 columns, 10M rows. Find X summary tables with under Y rows that improve query response time the most. AdaptiveMonteCarlo algorithm:
- Based on research [2]
- Greedy algorithm that takes a combination of summary tables and tries to find the table that yields the greatest cost/benefit improvement
- Models “benefit” of the table as query time saved over simulated query load
- The “cost” of a table is its size