Data Partition - Row or Column Data Store

The problem

Most data warehouses either use a star schema (cube), the central fact table in such a schema usually has many attributes. 50 attributes is very common and 200 is not unusual. So assume a fact table with 60 attributes.

However, the query from a business analyst invariably only accesses 3 or 4 of these 60 attributes.

Physical

A row store will read all 60 attributes because the attributes needed are intermixed with all of the others.

In contrast, a column store will read only the relevant attributes. This will result in a factor of 15-20 less I/O activity.

In addition, a column store has other benefits. Compression is more effective in a column store than a row store, because any storage block only contains one kind of thing. In contrast a storage block contains many different kinds of objects. Obviously, it is easier to compress one thing than many things. A column compression is typically a factor of 3 better than a row compression. This further lowers I/O activity in a column store.

In a columnar data stores, the physical storage components vertically partition the entire model, there is no physical concept of “table” just a logical one where a table structure is made up of columns.

Reference

https://www.slideshare.net/mobile/julienledem/strata-ny-2016-the-future-of-columnoriented-data-processing-with-arrow-and-parquet
Interview with Mike Stonebraker
Column-Store Databases and DW Appliances: How to Make the Right Choice. With data volumes exploding, conventional enterprise data warehouses are fast running out of headroom. Data warehouse appliances are starting to fill the gap, but the emerging category of column-oriented databases may offer a better option. The key to success is matching your application to the right product.
Column-oriented DBMS