Relational Data Modeling - Normal Forms

Type of Normal Form

A relational database is :

  • in First Normal Form (1NF) if each attribute is single-valued with atomic values.
  • in Second Normal Form (2NF) if it is in 1NF and each attribute that is not a primary key is fully functionally dependent on the entity's primary key.
  • in Third Normal Form (3NF) if it is in 2NF and no non-primary key attribute is functionally dependent on another non-primary key.

Higher order normal forms are usually more desirable than lower order normal forms.

Third Normal Form (3NF)

3NF for large data warehouse

3NF schemas are typically chosen for large data warehouses, especially environments with significant data-loading requirements that are used to feed data marts and execute long-running queries.

The main advantages of 3NF schemas are that they:

  • Provide a neutral schema design, independent of any application or data-usage considerations
  • May require less data-transformation than denormalized schemas such as star schemas

Optimizing Third Normal Form Queries

Queries on 3NF schemas are often very complex and involve a large number of tables. The performance of joins between large tables is thus a primary consideration when using 3NF schemas.

One particularly important feature for 3NF schemas is partition-wise joins. The largest tables in a 3NF schema should be partitioned to enable partition-wise joins. The most common partitioning technique in these environments is composite range-hash partitioning for the largest tables, with the most-common join key chosen as the hash-partitioning key.

Optimizing a 3NF schema requires the three Ps:

Documentation / Reference

Powered by ComboStrap