Nested Loop - Driving Table

Relational Algebra Between Sql And Query Plan

Example

select COMPANY.Name 
from COMPANY, SALES
where COMPANY.Company_ID = SALES.Company_ID
and SALES.Period_ID =3
and SALES.Sales_Total>1000;

Execution Plan

NESTED LOOPS
TABLE ACCESS FULL SALES
TABLE ACCESS BY ROWID COMPANY
INDEX UNIQUE SCAN COMPANY_PK

Implications

The key to the performance of a NESTED LOOPS join is the order in which the tables are joined. The selection of the driving table, the first table in the join, is critical.

The amount of repetition in the nested loop is the product of the previous result set and the current accessed table. For example, if there are 100 rows in SALES, and they join to 100 rows in COMPANY, then the total number of accesses through the COMPANY_PK index will be 100*100 = 10,000.

If more tables are used in the join, the selection of the driving table becomes even more critical, since the driving set (join between the driving table and the first driven table) of records is used for each successive join. As a result, the time needed to perform a join can grow exponentially as tables are added to the join unless the driving set starts with very selective criteria.

In the previous example, SALES was used as the driving table. There are several important points to note about the selection of the driving table:

  • Although all of the SALES table’s primary key columns were specified in the query, the SALES_PK index was not used. The SALES_PK index was not used because there was not a limiting condition on the leading column (the Company_ID column) of the SALES_PK index. The only condition on SALES.Company_ID is a join condition.
  • The SQL Engine - Query Optimizer (Query Optimization) could have selected either table as the driving table. If COMPANY was the driving table, it would have had a full table scan performed on it.
  • In rule-based optimization, if there is equal chance of using an index regardless of the choice of the driving table, the driving table will be the one that is listed last in the FROM clause.
  • In cost-based optimization, the optimizer will consider the size of the tables and the selectivity of the indexes when selecting a driving table.

When selecting a driving table, the SQL Engine - Query Optimizer (Query Optimization) ranks all the tables in the FROM clause based on the limiting conditions and the join conditions. The optimizer ranks each table as a potential driving table. For each table, it evaluates the possible access paths that could be used during the query (unique index scans, non-unique index range scans, or table scans). The optimizer will choose to drive the query from the table that has the poorest access path, as defined by its index structure and the query’s limiting conditions.

Starting from Nonselective Criteria

NESTED LOOPS is a directional operation; if you join two tables via a NESTED LOOPS operation, you will get different performance depending on which of the tables is the driving table.

Consider SALES and COMPANY again. What if COMPANY has 5 records, and SALES has 100,000 records?

If SALES is used as the driving table, then the execution path will require a full table scan of SALES, plus repeated accesses to the same rows in COMPANY. There would be 100,000 rows read from SALES, plus 100,000 accesses of COMPANY, for a total of 200,000 accesses.

However, if COMPANY were used as the driving table, then how many accesses would be necessary? There would be 5 accesses of COMPANY, plus 100,000 accesses of SALES - a total of 100,005 accesses. Thus, changing the choice of the driving table in this example reduces the number of accesses performed by almost 50 percent. As more and more tables are added to the query, the size of the driving set passed to each successive join has a great impact on the performance of the query.





Discover More
Relational Algebra Between Sql And Query Plan
Nested Loop - Adding More Tables

When a nested loops join uses four tables, the optimizer performs the following steps: Select a driving table. Perform a NESTED LOOPS join between the driving table and the second table. Perform...
Relational Algebra Between Sql And Query Plan
Nested Loop - Operations Involved

A NESTED LOOPS operation joins two data sources. It indicates that an index is available for use during the join. As a row operation, NESTED LOOPS returns each row to the next operation as it is processed...
Bi Server Architecture With Client
OBIEE - Stitch Join (Intern Join Operations)

Stitch Join are join operations performed by OBIEE with its intern engine: the Oracle BI Server’s join engine. The BI Server creates TMP (temporary) files in the /tmp directory The Oracle BI Server’s...
Bi Server Architecture With Client
OBIEE 10G/11G - Driving Table

You can specify a Driving table parameter in a complex join. Driving tables are for use in optimizing the manner in which the Oracle BI Server processes cross-database joins when one table is very small...
Relational Algebra Between Sql And Query Plan
Relation Operator - Nested Loop Join

Nested loop joins are join operator. They are useful when small subsets of data are being joined and if the join condition is an efficient way of accessing the second table. NESTED LOOPS is a row operation....
Data System Architecture
SQL - Semijoins

A semijoin returns rows that match an EXISTS subquery without duplicating rows from the left side of the predicate when multiple rows on the right side satisfy the criteria of the subquery. Semijoin and...
Query Optimizer Process
SQL Engine - Query Optimizer (Query Optimization)

in a SQL Engine. A SQL statement can be executed in many different ways, such as: full table scans, index scans, nested loops, hash joins. The query optimizer determines the most efficient...



Share this page:
Follow us:
Task Runner