Table of Contents

Oracle Database - The (index) Clustering Factor Statistics

About

The (index) Clustering Factor Statistics is a index statistics that indicates the amount of order of the rows in the table based on the values of the index :

We could also view the clustering factor as :

How the clustering factor can affect cost.

Example Effects of Clustering Factor on cost to retrieve data from a Oracle Database - Index Scans

Assume the following situation:

Case 1: The index clustering factor is low for the rows as they are arranged in the following diagram.

Block 1     Block 2       Block 3 
-------     -------       -------- 
A  A  A     B  B  B       C  C  C 

This is because the rows that have the same indexed column values for c1 are located within the same physical blocks in the table. The cost of using a range scan to return all of the rows that have the value A is low, because only one block in the table needs to be read.

Case 2: If the same rows in the table are rearranged so that the index values are scattered across the table blocks (rather than collocated), then the index clustering factor is higher.

Block 1       Block 2        Block 3 
-------       -------        --------
A  B  C       A  B  C        A  B  C

This is because all three blocks in the table must be read in order to retrieve all rows with the value A in col1.

Example

Index Num Rows Clustering Factor
COLOCATED_PK 100,000 1190
DISORGANIZED_PK 100,000 99932

So the database is saying, :

Reason

The reason for the large difference is that as Oracle range scans through the index structure, if it discovers the next row in the index is on the same database block as the prior row, it does not perform an other I/O to get the table block from the buffer cache.

However, if the next row is not on the same block, the it will release that block and perform another I/O into the buffer cache.

I want a good clustering factor ?

Many factors influence the use of an index by the optimizer, including physical data layout.

You might be tempted, therefore , to run out and try to rebuild all of your tables now to make all indexes have a good clustering factor but that would be a waste of time.

You must keep in mind that :