Fact Dimension
In a dimensional model, you may find a dimension table:
- with a cardinality (distinct value is higher than 10,000). A high cardinality means generally that the values are different for almost every transaction. For example, order ID, order line ID, and notes).
- which has a many-to-one relationship with the fact table close to a one-to-one and then grows roughly at the same rate as the fact table
This kind of tables are often called:
- fact dimensions
- event facts
- factless fact table
- header facts
as they contain often attributes that are related to the event of the fact table without containing a measure.
In an OLTP environment, you may find them as header tables of transaction lines such as sales, and invoices.
In this dimension table, you may find only two different data type attributes:
- a character data type known/referenced as textual facts. Textual facts are useful for grouping together related fact table rows, such as all the rows that are related to a particular invoice number.
- a time data type
- but you may not find a number attribute that you can aggregate (i.e. with a SUM for instance). if it's already an aggregation of another table (such as a sold for a customer).
Degenerate Dimension
To avoid an expensive join between this fact dimension and the fact table, the attribute of this dimension are degenerate i.e. moved from the dimension to the fact table and can then be classified as:
- Fact attributes
- Degenerate attribute
For example of degenerate attribute:
- Order Numbers,
- Invoice Number,
- Credit-Debit Indicator
- etc
Do not confuse! Degenerate dimensions can in articles referred to a fact dimension.
Pros and Cons of the degeneration
Pros
- Avoid a expensive join between two tables with a one-to-many relationship but which is close of a one-to-one relationship
- By degenerating a date dimension attribute, you avoid the need for a join between the event dimension and a time dimension (which enable the drill through the time hierarchy)
Cons: Increase the size of the fact table
By moving the dimensional attribute in the fact table, you increase the need for storage and you may use a partial_degeneration
Partial degeneration
To overcome the increase of capacity needs when you degenerate a complete fact dimension, you can make a partial degeneration of the fact dimension by choosing to degenerate only a set of attribute with a high analytic value (i.e often used in reporting).
While it is generally not considered good dimensional modeling practice to create a dimension with a potential one-to-one relationship with the fact table, in this situation it's a reasonable tradeoff. This design tradeoff works because the reference dimension should very seldom actually join back to the fact table.