Oracle Database - Statistics - Density (Column Selectivity)

1 - About

Density is a column statistics that store a decimal number between 0 and 1 that measures the selectivity of a column. Values close to 1 indicate that the column is

Density provides selectivity estimates for:

  • equi-join predicates (e.g. A.COL1 = B.COL1)
  • and equality predicates (e.g. COL1 = 5).

The more selective a column, the less rows are likely to be returned by a query referencing this column in its predicate list.

The density is expressed as a decimal number between 0 and 1.

  • Values close to 1 indicate that this column is unselective
  • Values close to 0 indicate that this column is highly selective

3 - Calculation

3.1 - Without Histogram

Without histogram: <MATH> \text{Density} = \frac{1}{\text{Number of distinct NON null values}} </MATH>

3.2 - With Histogram

Since Oracle 7.3, histogram are used in the calculation.

The calculation i splits by the type of popularity:

  • A non-popular value is one that does not span multiple bucket end points.
  • A popular value is one that spans multiple end points.

For non-popular values the density is calculated as the number of non-popular values divided by the total number of values. Formula: <MATH> \text{Density} = \frac{\text{Number of non-popular values}}{\text{Total number of values}} </MATH>

Popular values calculate the selectivity of a particular column values by using histograms.

The Selectivity (Density ?) for popular values is calculated as: <MATH> \text{Selectivity} = \frac{\text{Number of end points spanned by this value}}{\text{Total number of end points}} </MATH>

4 - Documentation / Reference

Data Science
Data Analysis
Data Science
Linear Algebra Mathematics

Powered by ComboStrap