Dimensional Data Modeling - Fact Table

Star Schema

About

A fact table is a table that stores fact where a fact can be seen as the representation of an event.

It's also known as:

  • Transactional history
  • Unalterable fact. (The content of this kind of table cannot legally change)
  • Transactional Link in the data vault

In a dimensional modelling, this table is located at the centre of a star schema or a snowflake schema, surrounded by dimension tables.

They contains only transactions and are therefore only one time column. If this is not the case, this is an order table.

Transactions are often group in what is called an order. The cardinality between the order table and the transaction table is really low. There has on average a couple of transaction for one order. Due to the number of rows in this table, she is really often seen as a fact table but it's not. You may find an order attribute into the fact table for performance reason (see Dimensional Data Modeling - Degenerate Dimension of Fact dimension (i.e. event/header entity))

A fact table is the primary table in a dimensional model where the performance measurements of the events are stored.

One fact table = One business process Step = One Star Schema in a data mart

Fact tables express the many-to-many relationships between dimension in dimensional model.

Star Schema

Fact are determined by retrieving the temporal nature of the table:

  • Has the table a transaction date column. This kind of column are updated with the current date of the transaction (and not in the future of in the past)
  • If the table doesn't have a transaction date column, does the relationship it creates between the dimensions has a temporal character. For instance, between orders and contacts. A contact over an order has obligatory a temporal nature because the contact (the relationship) takes place in the time.

In a reporting point of view, a fact table can only hold one date column. No analytic report will ask you to show more than one date). The time is universal and hold more than one event. The reports want to know what kind of event have taking place in one day.

Data Mart - Dimensional Model - Best Practice

A fact table in a Data Mart is composed :

  • of a primary key for auditing (to be able to retrieve the record)
  • of a date column (if it represent an event)
  • of several foreign dimension keys
  • occasionally degenerate dimensions and flags.
  • and of one of many numeric measure

The combination of all foreign keys (customer, store, …) and of the date column doesn't create a unique key. On the same second, you can have another transaction that concerns the same foreign keys.

Using Id columns for the foreign keys and for the date column permits:

  • to load all records
  • and to be sure that for all records in the fact table, we have a records in the dimension table.

How to identify a fact table in a star schema ?

A fact table is a table with measures (column where you will perform an aggregation sum, …). In a classic star schema any table that only has many-to-one (N:1) joins to it is a fact table.

In a schema, a table where all the joins have the many-to-one symbol (crow's feet, arrow, …) pointing into the table, and no joins pointing out of it, is, in essence, a fact table. One exception to this rule is a bridge table when you model a many-to-many relationship.

Grain

The grain is the level of detail and is defined by the combination of dimension.

Event

A row in a fact table correspond to a event. An event is a row in a fact table. All events have the same grain.

Factless Fact Table

In some cases, fact tables may contain no numerical facts at all. There are no measures involved. The important information is simply that the event took place or not.

The only relevant aggregation operation for such types of table is COUNT.

Event Fact table

For example:

  • the incidence of crimes which records when and where a crime took place, the type of offense committed, and who committed it.
  • Traffic accident,
  • Student enrolment,
  • Disease statistics,

Non Event Fact Table

To answer this kind of question:

What product were on promotion but did not sell ?

One solution that found database designer is to create a factless fact table which contains all the related informations. This table can leverage a huge amount of data.

Fortunately, other solutions exist such as:

Historical fact tables

There are basically two forms of historical fact tables:

  • transactional (net change) facts

A transactional fact table is insert only and stores changes as the difference between the current state and the new state. Depending on what changes, it would require one or two new rows.

  • and accumulating snapshots.

An accumulating snapshot stores each new version of the fact bound by effective and expiration dates. It requires updating the expiration date of the previous version of the fact.

Both will allow you to generate a set of facts at any point in time, the first by summing up to the desired date and the other by filtering facts within the effective range. A transactional fact has the advantage of being able to easily query the magnitude and direction of change.

Documentation / Reference





Discover More
Card Puncher Data Processing
BOBJ Getting Started

The most problem with Business Object is to find good tutorial to start. The key word to find tutorial on BOBJ is “Getting Started”. You can find this kind of document : on the web : Try...
Data System Architecture
Data Partition - Row or Column Data Store

Most data warehouses either use a star schema (cube), the central fact table in such a schema usually has many attributes. 50 attributes is very common and 200 is not unusual. So assume a fact table with...
Data System Architecture
Data Property - Data Consistency - (Strong|Atomic) consistency - Linearizability)

In its most basic form, consistency refers to data values in one data set being consistent with values in another data set at the same point in time. In an other form, consistency, also known as atomic...
Datavault
Data Vault - Hub

Reasons: Suppress the dependency between tables and allow then a parallel load The fact that the hub can be feed from any table (from a transaction table but also from a master data table) allows:...
Datavault
Data Vault - Link

A Link table in the data vault methodology is: transactions, associations, hierarchies, and re-definition of business terms. Every table which has more than one foreign key is a candidate...
Data System Architecture
Data Warehousing - 34 Kimball Subsytems

This page takes back the Kimball Datawarehouse 34 Subsystem as a table of content and links them to a page on this website....
Datavault
Datawarehouse Data Modeling - Data Vault

The data vault (DV) is a relational data model technique used in the persistence layer of a datawarehouse The Data Vault consists of three primary entity types: ...
Star Schema
Dimensional Data Modeling - A boolean is not a numerical measure but a descriptif attribute

A boolean in dimensional data modeling is not a numerical measure but a discrete attribute A boolean in this article can be represented by: a real boolean (true/false) 0/1 Indicator (Y/N) Really...
Star Schema
Dimensional Data Modeling - Dimension (Perspective)

A dimension is a part of a dimensional schema and provide the basis for analyzing data through slicing and dicing A dimension is just a set of descriptif attribute that are generally grouped: in a...
Star Schema
Dimensional Data Modeling - Dimensional Schemas

This section is dimensional data modeling, That is the building of a cube (hypercube) A dimensional schema is a schema based on dimension. The dimensional schema can modeled: in a relational database...



Share this page:
Follow us:
Task Runner