The implementation of btree index in the Oracle database.
To get to the leaf block to retrieve the first row of a query of the form
SELECT INDEXED-COL FROM T WHERE INDEX_COL = :X
will take the same number of I/Os regardless of the value :X that is used. In other words, the index is height balanced
Most of B*Tree indexes will have a height of 2 or 3 even for millions of record. This means that it will take in general, two or three I/Os to find your key in the index which is not too bad.
Example
nico@ORCL>select index_name,blevel, num_rows
2 from user_indexes
3 where table_name = 'BIG_TABLE';
INDEX_NAME BLEVEL NUM_ROWS
------------------------------ ---------- ----------
BIG_TABLE_PK 2 9542757
nico@ORCL>set autotrace on;
nico@ORCL>select id from big_table where id = 42;
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 2 (0)| 00:00:01 |
|* 1 | INDEX UNIQUE SCAN| BIG_TABLE_PK | 1 | 6 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
3 consistent gets
1 rows processed
nico@ORCL>select id from big_table where id = 5000000;
Execution Plan
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 2 (0)| 00:00:01 |
|* 1 | INDEX UNIQUE SCAN| BIG_TABLE_PK | 1 | 6 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
3 consistent gets
1 rows processed
Whenever the value selected 42 of 5000000, you will have always the same statistics :
In a nonunique index, Oracle simply stores the rowid by appending it to the key as an extra column with a length byte to make the key unique. For example, an index such as
CREATE INDEX I ON T(X,Y)
is conceptually
CREATE UNIQUE INDEX I ON T(X,Y, ROWID)
In a nonunique index, you will find that the data is sorted first by index key values (in the order of the index key) and then by rowid ascending. In a unique index, the data is sorted only by the index key values.
This is not compression in the same manner that ZIP files are compressed; rather, this is compression that removes redundancies from concatenated (multi-column) indexes.
The basic concept behind a compressed key index is that every entry is broken into two pieces :
Index block with owner colomn repeated
Sys.Package.Dbms_Alert
Sys.Package.Dbms_Application_info
Sys.Package.Dbms_Aq
Sys.Package.Dbms_Aqadm
Sys.Package.Dbms_Aqadm_Sys
Sys.Package.Dbms_Alert_Syscalls
...
become
Index block with owner colomn factored out
**Sys** //<- prefix//
Package.Dbms_Alert //<- suffix ...//
Package.Dbms_Application_info
Package.Dbms_Aq
Package.Dbms_Aqadm
Package.Dbms_Aqadm_Sys
Package.Dbms_Alert_Syscalls
...
nico@ORCL>create table t
2 as
3 select * from all_objects;
Table created.
nico@ORCL>create index t_idx on t(owner, object_type, object_name);
Index created.
nico@ORCL>analyze index t_idx validate structure;
Index analyzed.
nico@ORCL>create table idx_stats
2 as
3 select 'noncompressed' what, a.*
4 from index_stats a;
Table created.
nico@ORCL>define 1=1;
nico@ORCL>drop index t_idx;
Index dropped.
nico@ORCL>create index t_idx on t(owner,object_type,object_name) compress &1;
old 1: create index t_idx on t(owner,object_type,object_name) compress &1
new 1: create index t_idx on t(owner,object_type,object_name) compress 1
Index created.
nico@ORCL>analyze index t_idx validate structure;
Index analyzed.
nico@ORCL>insert into idx_stats select 'compress &1', a.* from index_stats a;
old 1: insert into idx_stats select 'compress &1', a.* from index_stats a
new 1: insert into idx_stats select 'compress 1', a.* from index_stats a
1 row created.
........................................
For compress 2 and 3 same procedure ....
........................................
nico@ORCL>select what, height, lf_blks, br_blks, btree_space, opt_cmpr_count, opt_cmpr_pctsave
2 from idx_stats;
WHAT HEIGHT LF_BLKS BR_BLKS BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
------------- ---------- ---------- ---------- ----------- -------------- ----------------
noncompressed 3 407 3 3280096 2 29
compress 1 3 354 3 2854680 2 19
compress 2 3 287 3 2318948 2 0
compress 3 3 455 3 3662276 2 36
We see for COMPRESS 1 :
Further, we see for COMPRESS 2 :
The ANALYZE command against the noncompressed index populated the OPT_CMPR_PCTSAVE/OPT_CMPR_COUNT columns and estimated a 29 percent saving with COMPRESS 2.
nico@ORCL>select 3280096*(1-0.29) from dual;
3280096*(1-0.29)
----------------
2328868.16
Not so bad compare to the value of 2318948
What happen for COMPRESS 3 :
This is due to the fact that each repeated prefix we remove saves the space of N copies, but add 4 bytes of overhead on the leaf block as part of the compression scheme. By adding the OBJECT_NAME column to the compressed key, we made that key unique, meaning there were no duplicate copies to factor out. Therefore, we ended up adding 4 bytes to every single index key entry and factoring out no repeating data.
The column OPT_CMPR_COUNT is dead accurate at providing the best compression count to be used and OPT_CMPR_PCTSAVE will tell you exactly how much saving to expect.
The compressed index structure is more complex than it used to be. Oracle will spend more time processing the data in this structure, both :
What we search is the tradeoff between CPU time and I/O time.
With compression :
BUT
B*Tree index has several subtypes :