Businesses today require information in a timely fashion. It is no longer acceptable to produce a report once a day, now some people need it hourly or even every few minutes. To be responsive to the business need for analysing an ever growing set of business data, requires performing any analysis in a timely manner. This is an increasing challenge when the analysis is over vast quantities of data and can involve time consuming operations such as joins and aggregations and performing other analytical functions and comparisons. Being able to pre-compute these operations can deliver fast query response times and quicker analysis.
This paper discusses the two approaches available in Oracle Database 11g for pre-computing these query operations and improving query response times:
- And cubes in Analytic Workspaces with Oracle Olap.
Attribute Presence For Querying
What happens if our query refers to an attribute that is not present in the materialized view or analytic workspace? The relational query returns results because a query rewrite mechanism known as join back can automatically join the existing materialized view back to the base table to access the missing data attribute. Even if no viable materialized view were available then the optimizer allows the query to return results by operating against the base tables where all the attributes and fact data are present.
In Oracle Database 11g, the use of query rewrite to access the analytic workspace means that the queries can now also take advantage of the join back rewrite mechanism so that even if attributes are missing from the analytic workspace, they can still be derived from the base tables. Previously in Oracle Database 10g, query rewrite was not available to access the analytic workspace and if the attribute had not been loaded into the analytic workspace dimension then it was not even possible to construct the query and thus no results could be returned. To resolve this, the dimension would have to have been rebuilt to add the missing attribute information.
Data Completeness and Sparsity
In relational tables, records are only present for data that exists, whereas in analytic workspace cubes, an empty value is used where no data exists. The cube stores the real data values and whereas the empty values are easily addressable and queryable, they are not actually stored in the cube. For example, you can easily refer to last month’s data value in a query even if that month has no data, but the empty value is not stored.
This highlights the very important feature of analytic workspace cubes in that they operate as if they are fully populated with both actual data and zero data for the full combination of their dimensions’ values. This can have a very important benefit because it makes defining calculations easier as it can be assumed that all data points are present in the cube. For example, the following formula can be defined even if there is no actual value, or physical storage used, for Tents in Feb2002:
nvl(sales('Feb2002','Tents'),0) - nvl(sales('Jan2002', 'Tents'), 0)
Analytic workspaces have significant improvements to data loading, aggregation performance and query performance resulting from efficient handling of the sparsity due to industry leading compression technology in Oracle Database 11g. If your reporting requirements need to process null value rows, serious consideration should be given to using analytic workspaces because they can handle this information by default rather than having to write a more complex SQL statement to generate the same result.
Supporting Join Performance
By pre-computing the join results in advance, materialized views are excellent at improving the performance of relational joins between any relational tables in the database. In contrast, a single analytic workspace cube only pre-computes and stores the join between that cubes measure data and its dimensions.
However, even though analytic workspaces do not pre-join cubes together in advance, due to their internal storage structure, they are very efficient at performing this type of join operation on the fly at query time.
In addition, the ability to define this join is significantly easier and less complex in the analytic workspace than when defining it relationally. We look at this operation in the analytic workspace in more detail in the section Writing SQL Queries Against The Analytic Workspace Cube.
Maintaining aggregates, such as the sum and average, requires time and resources and if possible it is advantageous to minimize the number that must be maintained.
If aggregates are available at a monthly level then it is useful to be able to use these and roll them up at query time to the quarterly or yearly level rather than explicitly maintaining the quarterly or yearly totals themselves at load time.
Aggregate rollup in both analytic workspaces and as used by query rewrite against materialized views, are equally powerful and efficient in using the metadata in their dimension definitions to perform this rollup operation. Both approaches can utilize pre-computed lower level aggregates to answer user queries requiring aggregates at a higher level in the dimension hierarchy.
Adhoc Analytic Functions
The multi dimensional calculation engine in Oracle OLAP that operates on the cube data provides a very rich set of analytical functionality. In addition, because of the specialized storage structures of the cubes, adding a new derived data value, known as a calculated measure, does not require the cube to be rebuilt. To achieve the same performance relationally, it might be necessary to create a new materialized view which would take additional time to build thus delaying the time to generate the report and increasing the time required to maintain the materialized views. Contrast that with the analytic workspace, because the analytic function is already present there is practically no reduction in performance compared to accessing the cube base data values themselves.
The Sparsity Advisor in AWM 10.2.0.3 will analyze the dimensions for the cube to determine which are sparse and which are dense to be able to recommend the order to be used in the cube construction, compression and whether or not cube partitioning is required. Correctly specifying this order can have a significant impact on the time that it takes to load and query the cube.
Advisors in AWM 11g The Sparisty Advisor has been replaced with three new in Oracle Database 11g:
- The Cube Partitioning Advisor makes recommendations for partitioning the cube based on the partitioning of its source table and the sparsity of the data in its dimensions.
- The Cube Storage Advisor analyzes an existing cube and its dimensions and makes recommendations about how the cube could be more optimally rebuilt. e.g. the Advisor will examine which dimensions in the cube are dense or sparse in order to determine their correct storage order in the cube and whether or not compression should be used.
- The Relational Schema Advisor generates a SQL script for creating all of the necessary database objects and constraints required for query rewrite to operate against cube materialized views.
Enhanced Reporting Via Dimensions
Supporting different types of hierarchies in the dimensional data, and allowing more flexibility in how hierarchies are defined, can enable a wider variety of business scenarios to be modeled. The SQL Dimension, which is used by materialized views, only supports level based type dimensions, whereas analytic workspaces can use both value and level based, as illustrated below.
Within the analytic workspace dimensions, ragged hierarchies can also be implemented, which enables a different number of levels to be used when different paths in the hierarchy are traversed from the root level to the bottom of the hierarchy. Use of ragged hierarchies enables more flexibility in the data modeling and can increase the number of business scenarios that the dimension can model.
Frequently, analyzing the historical data is required so that businesses can predict future trends. This is an area that analytic workspaces do very well because it is not easily accomplished relationally.
In Oracle Database 10g, forecasting models are implemented using Calculation Plans defined using a wizard. The wizard takes the user step-by-step through the process of defining the forecasting model to be used to and how the model operates for each dimension of the cube.
A number of different models are provided, or, an automatic method can be selected where the historical data is analyzed for you to determine the best fit model to use. Once the calculation plan has been defined it must be executed to generate the forecast data. Defining the forecast functionality will be available in a future version of AWM 11g but can still be easily performed in the current version by using the analytic workspace programming language OLAP DML.
In comparison, materialized views pre-compute joins and aggregations using actual historical data present in the tables and do not have the capability to forecast trends.
Relationally this can be done using the SQL MODEL Clause, which was introduced in Oracle Database 10g, however there are no wizards available to define and manage the forecasts in the same way that AWM does for analytic workspaces.
The overhead for attaching the analytic workspace has been significantly reduced and is negligible in Oracle Database 11g, where as in Oracle Database 10g, when the first relational query against an analytic workspace in a database session is executed, an overhead of approximately 20 seconds is incurred as the analytic workspace is attached to the session for the first time. There are also overheads as part of the first aggregation operations as the analytic workspace caches the aggregation results. Subsequent queries against the analytic workspace will not suffer these overheads.
There is no equivalent overhead when performing a relational query against tables and using query rewrite to use materialized views.