SQL Engine - Query Optimizer (Query Optimization)

1 - About

Algebraic Optimization in a SQL Engine.

A SQL statement can be executed in many different ways, such as:

  • full table scans,
  • index scans,
  • nested loops,
  • hash joins.

The query optimizer determines the most efficient way to execute a SQL statement after considering many factors related to the objects referenced and the conditions specified in the query.

The Query optimizer determines the most efficient way to execute a SQL statement after considering many factors including the Optimizer Goal.

This determination is an step in the processing of any SQL statement and can greatly affect execution time.

The optimizer might not make the same decisions from one version of Database to the next. In recent versions, the optimizer might make different decisions, because better information is available.

The output from the optimizer is an execution_plan that describes an optimum method of execution. The plans shows the combination of the steps Oracle Database uses to execute a SQL statement. Each step either retrieves rows of data physically from the database or prepares them in some way for the user issuing the statement.

In general the databases are going to correctly optimize queries. Queries are rarely incorrectly optimized when the statistics are good.

3 - Understanding the optimizer

The query optimizer performs the following steps:

  • 1- The query transformer transform the original query (if it is advantageous)
  • 2- The optimizer generates a set of potential plans for the SQL statement based on available access paths and hints.
  • 3- The optimizer estimates the cost of each plan based on statistics in the data dictionary for the data distribution and storage characteristics of the tables, indexes, and partitions accessed by the statement.
  • 4- The optimizer compares the costs of the plans and chooses the one with the lowest cost.

4 - Influence the optimizer's choices

You can influence the optimizer's choices by

5 - Type

5.1 - CBO

A CBO will try to minimize the expected cost of executing a query

5.2 - RBO

A RBO (Rule based Optimizer) will apply a set of rules in order.

It's then sensitive to the order of the tables in the FROM clause and will use the order in which we enter them to choose a Nested Loop - Driving Table for the query if none of the predicates do so (in the event of a “tie”, the RBO will look at the order the developer typed in table names to pick what table to use first!).

6 - Optimizer and Joins

Much work in database-systems has aimed at efficient implementation of joins, because relational systems commonly call for joins, yet face difficulties in optimising their efficient execution. The problem arises because inner joins operate both commutatively and associatively. In practice, this means that the user merely supplies the list of tables for joining and the join conditions to use, and the database system has the task of determining the most efficient way to perform the operation. A query optimizer determines how to execute a query containing joins.

Join algorithm Implementation

7 - Example

7.1 - Index Optimization

  • Data: If the data is an Unordered Set of tuples with 100 fields each.
  • Select Pattern: One specific attribute is always selected for a couple of specific rows identified by the 1st attribute
  • Optimization: Rewrite the data into a 99-Tuple of Maps with the key as the 1st attribute

8 - Reference

Data Science
Data Analysis
Data Science
Linear Algebra Mathematics

Powered by ComboStrap