Data Warehousing - Data Loading (Fast Refresh)

Data System Architecture

About

Fast Refresh refer to the better way to quickly load data modifications of a data source.

Several techniques exist:

In fact, the load process is often the primary consideration in choosing the partitioning scheme of data warehouse tables and indexes. In very common scenario, the data warehouse is being loaded by time. You should them partition by time first.

Bulk loading data rates

Bulk loading data rates are governed by the following operations and hardware resources:

  • How fast can the data be read
  • How fast can data be written out
  • How much CPU power is available

Structural Database

  • Little number of rows per block
  • Too many indexes

Flat File loading

The database operations can be simplified to:

  • Read lines from flat files
  • Process lines into columns, internal data types and optionally compress
  • Write rows/columns out to disk

In most cases a reasonably size system becomes CPU bound, not write bound, on data loads as almost all Oracle data warehouses use compression which increases CPU consumption but reduces the IO requirement for the writes. Or putting it another way: Bulk loading into a compressed table should be a CPU bound operation, not a disk (write) bound operation.

Documentation / Reference





Discover More
Data System Architecture
Data Warehouse

A data warehouse is a large central data repository of current, history and summarised data coming from operational and external sources used primarily for analysis. s is large historical databases for...
Data System Architecture
Data Warehouse - Staging Area

The first destination of the data that has been extracted from source is the staging area. Sometimes staging areas are also called landing zones for flat files, XML files, Cobol files and the like. This...
Card Puncher Data Processing
Oracle Database - Data Load

Data loading in a Oracle Database environment. In the Oracle Database, you can load data in the following elements: Table, View , Materialized View, External Table Oracle Version Utility...
Data System Architecture
Relational Data Modeling - Normal Forms

A relational database is : in First Normal Form (1NF) if each attribute is single-valued with atomic values. in Second Normal Form (2NF) if it is in 1NF and each attribute that is not a primary key...
Event Centric Thinking
Stream - How to capture change of state - Change Data Capture (CDC)

This page is how to capture the changes of state in order to create a stream. The stream processing will then perform an incremental computation Capturing data changes is far from a trivial task, and...



Share this page:
Follow us:
Task Runner