About
The first measure of the plan_estimator, selectivity, represents a fraction of rows from a row set. The row set can be a base table, a view, or the result of a join or a GROUP BY operator.
<MATH>
\text{Selectivity} = \frac{\text{Numbers of rows satisfying a predicate}}{\text{Total number of rows}}
</MATH>
The selectivity is tied to a query predicate. The selectivity of a predicate indicates how many rows from a row set will pass the predicate test.
Selectivity lies in a value range from 0.0 to 1.0 :
If no statistics are available then the optimizer either uses:
-
or an internal default value, Different internal defaults are used, depending on the predicate type. For example, the internal default for an equality predicate (last_name = 'Smith') is lower than the internal default for a range predicate (last_name > 'Smith'). The estimator makes this assumption because an equality predicate is expected to return a smaller fraction of rows than a range predicate.
When statistics are available, the estimator uses them to estimate selectivity.
For example:
-
for an equality predicate (last_name = 'Smith'), selectivity is set to the reciprocal of the number n of distinct values of last_name, because the query selects rows that all contain one out of n distinct values. If a
histogram is available on the last_name column, then the estimator uses it instead of the number of distinct values. The histogram captures the distribution of different values in a column, so it yields better selectivity estimates. Having histograms on columns that contain
skewed data greatly helps the query optimizer generate good selectivity estimates.