About
A fact table is a table that stores fact where a fact can be seen as the representation of an event.
It's also known as:
- Transactional history
- Unalterable fact. (The content of this kind of table cannot legally change)
- Transactional Link in the data vault
In a dimensional modelling, this table is located at the centre of a star schema or a snowflake schema, surrounded by dimension tables.
They contains only transactions and are therefore only one time column. If this is not the case, this is an order table.
Transactions are often group in what is called an order. The cardinality between the order table and the transaction table is really low. There has on average a couple of transaction for one order. Due to the number of rows in this table, she is really often seen as a fact table but it's not. You may find an order attribute into the fact table for performance reason (see Dimensional Data Modeling - Degenerate Dimension of Fact dimension (i.e. event/header entity))
A fact table is the primary table in a dimensional model where the performance measurements of the events are stored.
One fact table = One business process Step = One Star Schema in a data mart
Fact tables express the many-to-many relationships between dimension in dimensional model.
Fact are determined by retrieving the temporal nature of the table:
- Has the table a transaction date column. This kind of column are updated with the current date of the transaction (and not in the future of in the past)
- If the table doesn't have a transaction date column, does the relationship it creates between the dimensions has a temporal character. For instance, between orders and contacts. A contact over an order has obligatory a temporal nature because the contact (the relationship) takes place in the time.
In a reporting point of view, a fact table can only hold one date column. No analytic report will ask you to show more than one date). The time is universal and hold more than one event. The reports want to know what kind of event have taking place in one day.
Articles Related
Data Mart - Dimensional Model - Best Practice
A fact table in a Data Mart is composed :
- of a primary key for auditing (to be able to retrieve the record)
- of a date column (if it represent an event)
- of several foreign dimension keys
- occasionally degenerate dimensions and flags.
- and of one of many numeric measure
The combination of all foreign keys (customer, store, …) and of the date column doesn't create a unique key. On the same second, you can have another transaction that concerns the same foreign keys.
Using Id columns for the foreign keys and for the date column permits:
- to load all records
- and to be sure that for all records in the fact table, we have a records in the dimension table.
How to identify a fact table in a star schema ?
A fact table is a table with measures (column where you will perform an aggregation sum, …). In a classic star schema any table that only has many-to-one (N:1) joins to it is a fact table.
In a schema, a table where all the joins have the many-to-one symbol (crow's feet, arrow, …) pointing into the table, and no joins pointing out of it, is, in essence, a fact table. One exception to this rule is a bridge table when you model a many-to-many relationship.
Grain
The grain is the level of detail and is defined by the combination of dimension.
Event
A row in a fact table correspond to a event. An event is a row in a fact table. All events have the same grain.
Factless Fact Table
In some cases, fact tables may contain no numerical facts at all. There are no measures involved. The important information is simply that the event took place or not.
The only relevant aggregation operation for such types of table is COUNT.
Event Fact table
For example:
- the incidence of crimes which records when and where a crime took place, the type of offense committed, and who committed it.
- Traffic accident,
- Student enrolment,
- Disease statistics,
- …
Non Event Fact Table
To answer this kind of question:
What product were on promotion but did not sell ?
One solution that found database designer is to create a factless fact table which contains all the related informations. This table can leverage a huge amount of data.
Fortunately, other solutions exist such as:
- a data densification with the help of a SQL Statement: Dimensional Data Operation - Data Densification (sparse to dense data) - Preservation of the dimensions
- or the implementation of an Olap (Multidimensional) Database
Historical fact tables
There are basically two forms of historical fact tables:
- transactional (net change) facts
A transactional fact table is insert only and stores changes as the difference between the current state and the new state. Depending on what changes, it would require one or two new rows.
- and accumulating snapshots.
An accumulating snapshot stores each new version of the fact bound by effective and expiration dates. It requires updating the expiration date of the previous version of the fact.
Both will allow you to generate a set of facts at any point in time, the first by summing up to the desired date and the other by filtering facts within the effective range. A transactional fact has the advantage of being able to easily query the magnitude and direction of change.