Oracle Database - Selectivity

Table of Contents

1 - About

The first measure of the plan_estimator, selectivity, represents a fraction of rows from a row set. The row set can be a base table, a view, or the result of a join or a GROUP BY operator.

<MATH> \text{Selectivity} = \frac{\text{Numbers of rows satisfying a predicate}}{\text{Total number of rows}} </MATH>

The selectivity is tied to a query predicate. The selectivity of a predicate indicates how many rows from a row set will pass the predicate test.

Selectivity lies in a value range from 0.0 to 1.0 :

  • 0.0 means that no rows will be selected from a row set,
  • 1.0 means that all rows will be selected.

If no statistics are available then the optimizer either uses:

  • dynamic sampling (if enabled)
  • or an internal default value, Different internal defaults are used, depending on the predicate type. For example, the internal default for an equality predicate (last_name = 'Smith') is lower than the internal default for a range predicate (last_name > 'Smith'). The estimator makes this assumption because an equality predicate is expected to return a smaller fraction of rows than a range predicate.

When statistics are available, the estimator uses them to estimate selectivity.

For example:

  • for an equality predicate (last_name = 'Smith'), selectivity is set to the reciprocal of the number n of distinct values of last_name, because the query selects rows that all contain one out of n distinct values. If a histogram is available on the last_name column, then the estimator uses it instead of the number of distinct values. The histogram captures the distribution of different values in a column, so it yields better selectivity estimates. Having histograms on columns that contain skewed data greatly helps the query optimizer generate good selectivity estimates.

Data Science
Data Analysis
Data Science
Linear Algebra Mathematics

Powered by ComboStrap