BlackWaspTM

This web site uses cookies. By using the site you accept the cookie policy.This message is for compliance with the UK ICO law.

Programming Concepts

Database Normalisation

The seventeenth part of the SQL Server Programming Fundamentals tutorial discusses the concept of database normalisation. Normalisation is a database design technique that minimises duplication of information, reducing the risk of introducing data errors.

Second Normal Form

For a table to comply with second normal form (2NF), it must meet all of the requirements of 1NF. Additionally, all non-prime columns, ie. columns that are not part of a candidate key, must depend upon entire candidate keys, not just parts of those keys. This rule only applies to tables that contain composite keys. If all possible candidate keys are composed only of single columns, this rule has no effect.

Example

EmployeeSkillsSalary
BobC#£30,000
BobSQL£30,000
JimC#£35,000
JimVB£35,000
JimSQL£35,000
SueVB£30,000

In this example, a column has been added to the Employee Skills table to hold the employee's salary. This column breaks the 2NF rule, as the salary is dependant upon the employee but not the skill. This is only one column of the two-column composite key.

To modify the design to achieve 2NF, the Salary column should be removed from the Employee Skills table and added to the Employees table instead. This also removes the duplication of salary data that could cause update anomalies.

Employees Table
NameJob TitleSalary
BobDeveloper£30,000
JimDeveloper£35,000
SueDeveloper£30,000
Employee Skills Table
EmployeeSkills
BobC#
BobSQL
JimC#
JimVB
JimSQL
SueVB

Third Normal Form

Third normal form (3NF) includes all of the requirements of 2NF. In addition, 3NF states that all non-prime columns must be directly dependant upon the candidate keys. Non-key columns may not be transitively dependant upon prime columns. ie. A column should not be dependant upon a non-key column that is dependant upon a key column.

Example

EmployeePrimary OfficeOffice Phone
BobLondon020 123456
JimLeeds0113 234567
SueManchester0161 987654

In the above sample, the designer has created a new table that links each employee to an office location and that office's main telephone number. This table meets all of the requirements for 2NF. The only candidate key for the table is the Employee column. The Primary Office column is directly dependant upon this key. However, the Office Phone column is not directly related to the employee. It is transitively related to the employee via the primary office. This table does not therefore meet the requirements for 3NF.

To achieve 3NF, the table must be normalised into two tables. The first table will link the employee to their location. The second will hold the details of each location, including the telephone number. To perform a query that returns an employee's name and telephone number, you must join the tables. This solution removes the risk of update anomalies where the same office appears for multiple employees but the telephone number is changed for only one of them.

Employee Offices Table
EmployeePrimary Office
BobLondon
JimLeeds
SueManchester
Offices Table
OfficeOffice Phone
London020 123456
Leeds0113 234567
Manchester0161 987654

Limitations

Third normal form can be overly complex for some business scenarios. You may decide not to fully normalise your tables to achieve 3NF for performance or other reasons. An example of this can be seen in the following table. This table holds details of customer locations including the customer ID, address and postcode. Technically, the address can be though of as dependent upon the postcode, which itself is related to the customer ID. To achieve 3NF, you could create a table of possible locations with the postal code as its primary key. In many situations, this would be deemed to be unnecessary.

CustomerIDAddressPostcode
1231 Long LaneW1 1WW
4562 Short StreetLS2 2LS
7893 Wide RoadM3 3MM
15 December 2008