Data Partitions (Clustering of data)
About
A partition cut out the storage in several part according to a predicate.
You can have two types of partition :
- horizontal (sharding) (related to a cutting by row)
- vertical (related to a cutting by column)
Partition can be located:
- on different system (related to cutting between system)
- on the same system
See also: Data Mining - Clustering (Function|Model)
Articles Related
Partitioning Large Fact Tables
Taking advantage of partitioning for fact tables reduces index and statistics maintenance time during ETL processes as well as improves query performance. Because the majority of inserts and updates impact the last partition(s), you only need to disable indexes on a few impacted partitions, and then rebuild disabled indexes after the load and compute statistics on updated partitions only.
Query should also render results faster, since the optimizer builds more efficient execution plans using partitioning elimination logic.
Large fact tables, with more than 20 million rows, can be suitable for partitioning. To build an optimal partitioned table with reasonable data distribution, you can consider partitioning by month, quarter, year, and so on.
The recommended partitioning range for most implementations is a month; however, you may want to consider implementing quarterly or yearly partitioning ranges.