Relational Data Modeling - Normal Forms

Data System Architecture

Type of Normal Form

A relational database is :

  • in First Normal Form (1NF) if each attribute is single-valued with atomic values.
  • in Second Normal Form (2NF) if it is in 1NF and each attribute that is not a primary key is fully functionally dependent on the entity's primary key.
  • in Third Normal Form (3NF) if it is in 2NF and no non-primary key attribute is functionally dependent on another non-primary key.

Higher order normal forms are usually more desirable than lower order normal forms.

Third Normal Form (3NF)

3NF for large data warehouse

3NF schemas are typically chosen for large data warehouses, especially environments with significant data-loading requirements that are used to feed data marts and execute long-running queries.

The main advantages of 3NF schemas are that they:

  • Provide a neutral schema design, independent of any application or data-usage considerations
  • May require less data-transformation than denormalized schemas such as star schemas

Optimizing Third Normal Form Queries

Queries on 3NF schemas are often very complex and involve a large number of tables. The performance of joins between large tables is thus a primary consideration when using 3NF schemas.

One particularly important feature for 3NF schemas is partition-wise joins. The largest tables in a 3NF schema should be partitioned to enable partition-wise joins. The most common partitioning technique in these environments is composite range-hash partitioning for the largest tables, with the most-common join key chosen as the hash-partitioning key.

Optimizing a 3NF schema requires the three Ps:

Documentation / Reference

Discover More
Dw Layers
Data Warehouse - Layer (Architecture)

Name Properties Stage Layer Real-Time, CDC, continuous refresh DataWarehouse Layer / ODS Normalized, Data History, refresh: 2-6 daily Data Mart Performance, Access layer, Star schema, refresh: 1-4...
Star Schema
Dimensional Data Modeling - Dimension (Perspective)

A dimension is a part of a dimensional schema and provide the basis for analyzing data through slicing and dicing A dimension is just a set of descriptif attribute that are generally grouped: in a...
Dimensional Data Modeling - Snowflake schema

A snowflake schema is a star schema with fully normalised (3NF) dimensions. It gets its name from that it has a similar shape than a snowflake. A snowflake is a dimensional model : in which a central...
Excel Transpose Pivot
Dimensional Data Operation - (Pivot|Transpose|Cross-tab|Matrix)

The pivot is an dimensional data operation where rows and columns are interchanged. Pivot is also known as: Transpose in the linear algebra word Matrix in the Microsoft world Cross-tab This operation...
Erd Entity Instance
Logical Data Modeling - Entity Set (Class, Table)

An entity set is a set of entity. An entity set is implemented: in a relational database by a table and an entity by a row in code by a Class and an instance by an object entity of a language...
Data System Architecture
Relational Data Modeling

This section is data Modeling applied to a relational structure (mostly table). Relational databases normalize data, removing as much duplication as possible. A relational database’s approach to...
Normalization Process
Relational Data Modeling - Database Normalization

Normalization is a logical data base design method. Normalization is a process of systematically breaking a complex table into simpler ones. It is built around the concept of normal forms. Database normalization...

Share this page:
Follow us:
Task Runner