Databases – Normal Form


title: Normal Form

Normal Form

Normalization was first introduced as part of the relational model. It is the process of organizing data tables and columns in a way that reduces redundancies and improves integrity. This can either be done through :

  • synthesis : creates a normalized database design based on a known set of dependencies.
  • decomposition : takes an existing (insufficiently normalized) database design and improves it based on the known set of dependencies

There are three common normal forms (1st, 2nd and 3rd) plus a rather advanced form called BCNF. They are progressive : in orther to qualify for the 3rd normal form, a database schema must satisfy the rules of the 2nd normal form, and so on for the 1st normal form.

  • 1st normal form : The information is stored in a table, each column contains atomic values, and there are not repeating groups of columns. This :
  1. Eliminates repeating groups in individual tables.
  2. Creates a separate table for each set of related data.
  3. Identifies each set of related data with a primary key
Example

A design that violates the 1st normal form, the “telephone” column does not contain atomic values

customer IDFirst nameLast nameTelephone
123PoojaSingh555-861-2025, 192-122-1111
789JohnDoe555-808-9633
456SanZhang(555) 403-1659 Ext. 53; 182-929-2929

One solution would be to have an extra column for each phone number. But then, this will repeat conceptually the same attribute(phone number). Moreover, adding extra telephone number will require reorganizing the table by adding more column.This is definitely not practicle.

Another solution is to have a separate table for the association customer <-> Telephone: This respects the 1st normal form and there can be as many rows per customer as needed.

customer IDFirst nameLast name
123PoojaSingh
789JohnDoe
456SanZhang
customer IDTelephone
123555-861-2025
123192-122-1111
789555-808-9633
456(555) 403-1659 Ext. 53
456182-929-2929
  • 2nd normal form : The table is in the first normal form and all the non-key columns depend on the table’s primary key. This narrows the table’s purpose.
Example

A design that violates the 2nd normal form. The model full name being the primary key, there are other candidate keys like {manufacturer, model}. The “Manufacturer Country” column is dependant on a non-key column (the Manufacturer).

ManufacturerModelModel Full NameManufacturer Country
ForteX-PrimeForte X-PrimeItaly
ForteUltracleanForte UltracleanItaly
Dent-o-FreshEZbrushDent-o-Fresh EZbrushUSA
KobayashiST-60Kobayashi ST-60Japan
HochToothmasterHoch ToothmasterGermany
HochX-PrimeHoch X-PrimeGermany

The normalized design would be to split into two tables like the following:

ManufacturerManufacturer Country
ForteItaly
Dent-o-FreshUSA
KobayashiJapan
HochGermany
ManufacturerModelModel Full Name
ForteX-PrimeForte X-Prime
ForteUltracleanForte Ultraclean
Dent-o-FreshEZbrushDent-o-Fresh EZbrush
KobayashiST-60Kobayashi ST-60
HochToothmasterHoch Toothmaster
HochX-PrimeHoch X-Prime
  • 3rd normal form : The table is in second normal form and all of its columns are not transitively dependent on the primary key.
    A column is said to be dependant on an another column if it can be derived from it, for example, the age can be derived from the birthday. Transitivity means this dependance might involve other columns. for example, if we consider three columns PersonID BodyMassIndex IsOverweight , the column ‘IsOverweight’ is transitively dependant on ‘personID’ through ‘BodyMassIndex’.
Example

A design that violates the 3rd normal form. {Tournament, Year} is the primary key for the table and the column ‘Winner Date of Birth’ transitively depends on it.

TournamentYearWinnerWinner Date of Birth
Indiana Invitational1998Al Fredrickson21 July 1975
Cleveland Open1999Bob Albertson28 September 1968
Des Moines Masters1999Al Fredrickson21 July 1975
Indiana Invitational1999Chip Masterson14 March 1977

A design compliant with the 3rd normal form would be :

TournamentYearWinner
Indiana Invitational1998Al Fredrickson
Cleveland Open1999Bob Albertson
Des Moines Masters1999Al Fredrickson
Indiana Invitational1999Chip Masterson
WinnerDate of Birth
Chip Masterson14 March 1977
Al Fredrickson21 July 1975
Bob Albertson28 September 1968

More Information:

This article needs improvement. You can help improve this article. You can also write similar articles and help the community.