Data Normalization Rules

You can apply the data normalization rules (sometimes just called normalization rules) as the next step in your design. You use these rules to see if your tables are structured correctly. The process of applying the rules to your database design is called normalizing the database, or just normalization.

Normalization is most useful after you have represented all of the information items and have arrived at a preliminary design. The idea is to help you ensure that you have divided your information items into the appropriate tables. What normalization cannot do is ensure that you have all the correct data items to begin with.

You apply the rules in succession, at each step ensuring that your design arrives at one of what is known as the “normal forms.” Five normal forms are widely accepted — the first normal form through the fifth normal form. This lesson looks at the first two, because they are all that is required for the types of work we do here with Access at the university.

First normal form

First normal form states that at every row and column intersection in the table, there exists a single value, and never a list of values. For example, you cannot have a field named Price in which you place more than one Price. If you think of each intersection of rows and columns as a cell, each cell can hold only one value.

Second normal form

Second normal form requires that each non-key column be fully dependent on the entire primary key, not on just part of the key. This rule applies when you have a primary key that consists of more than one column. For example, suppose you have a table containing the following columns, where Student ID and Class ID form the primary key:

  • Student ID (primary key)
  • Class ID (primary key)
  • Class Name

This design violates second normal form, because Class Name is dependent on Class ID, but not on Student ID, so it is not dependent on the entire primary key. You must remove Class Name from the table. It belongs in a different table (Classes).

There are also third, fourth, and more normal forms, but the databases we make with Access 2007 do not need to go to those levels of normalization.