Data Partitions (Clustering of data)


A partition cut out the storage in several part according to a predicate.

You can have two types of partition :

Partition can be located:

  • on different system (related to cutting between system)
  • on the same system

See also: Data Mining - Clustering (Function|Model)

Partitioning Large Fact Tables

Taking advantage of partitioning for fact tables reduces index and statistics maintenance time during ETL processes as well as improves query performance. Because the majority of inserts and updates impact the last partition(s), you only need to disable indexes on a few impacted partitions, and then rebuild disabled indexes after the load and compute statistics on updated partitions only.

Query should also render results faster, since the optimizer builds more efficient execution plans using partitioning elimination logic.

Large fact tables, with more than 20 million rows, can be suitable for partitioning. To build an optimal partitioned table with reasonable data distribution, you can consider partitioning by month, quarter, year, and so on.

The recommended partitioning range for most implementations is a month; however, you may want to consider implementing quarterly or yearly partitioning ranges.

Task Runner