SQL Server Primary Keys
The tenth part of the SQL Server Programming Fundamentals tutorial explains the use and creation of primary keys. A primary key is a column, or group of columns, within a database table that provides a unique reference for each row added to that table.
What is a Primary Key?
A primary key provides a unique reference for every row in a table. A primary key is defined in terms of a single column or a group of columns in the table. A primary key with more than one column is known as a composite key. Once the key has been created, every row within the table must have a different value in the key column, or a different combination of values across the composite key columns.
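The difference between a single-column key and a composite key can be sketched in Transact-SQL. The following is a minimal illustration only; the Customers and OrderLines tables and all of their columns are hypothetical names chosen for the example and are not part of the tutorial database.

```sql
-- Hypothetical tables, for illustration only.

-- Single-column primary key: every row needs a distinct CustomerId.
CREATE TABLE Customers
(
    CustomerId INT NOT NULL,
    CustomerName VARCHAR(100) NOT NULL,
    CONSTRAINT PK_Customers PRIMARY KEY (CustomerId)
);

-- Composite primary key: the combination of OrderId and LineNumber
-- must be unique, although either column alone may repeat.
CREATE TABLE OrderLines
(
    OrderId INT NOT NULL,
    LineNumber INT NOT NULL,
    Quantity INT NOT NULL,
    CONSTRAINT PK_OrderLines PRIMARY KEY (OrderId, LineNumber)
);

-- Succeeds: first row for order 1.
INSERT INTO OrderLines (OrderId, LineNumber, Quantity) VALUES (1, 1, 5);

-- Succeeds: same OrderId but a different LineNumber.
INSERT INTO OrderLines (OrderId, LineNumber, Quantity) VALUES (1, 2, 3);

-- Fails with a primary key violation: the combination (1, 1) already exists.
INSERT INTO OrderLines (OrderId, LineNumber, Quantity) VALUES (1, 1, 9);
```

Naming the constraint explicitly, as with PK_OrderLines here, is optional but makes the key easier to identify if it needs to be altered or dropped later.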
A primary key is a special type of unique index or unique constraint. A unique index indicates that a single column, or a combination of columns, in a table must contain only unique values. You may create several unique constraints in a table to enforce uniqueness for different elements of the data. However, you may only define one primary key for a table. Another important difference is that whilst a unique index may contain nullable columns, a primary key cannot.
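As a rough sketch of these differences, the script below defines one primary key and two unique constraints on the same table. The Engineers table shown here and its PayrollNumber and EmailAddress columns are assumptions made for the example.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE Engineers
(
    EngineerId INT NOT NULL,
    PayrollNumber CHAR(6) NOT NULL,
    EmailAddress VARCHAR(100) NULL,   -- nullable: permitted in a unique constraint
    CONSTRAINT PK_Engineers PRIMARY KEY (EngineerId),
    CONSTRAINT UQ_Engineers_PayrollNumber UNIQUE (PayrollNumber),
    CONSTRAINT UQ_Engineers_EmailAddress UNIQUE (EmailAddress)
);

-- Succeeds: the uniquely constrained EmailAddress column may hold NULL.
-- SQL Server treats NULL as a value here, so only one row may have a
-- NULL EmailAddress.
INSERT INTO Engineers (EngineerId, PayrollNumber, EmailAddress)
VALUES (1, 'P00001', NULL);

-- Succeeds: all three constrained values are new.
INSERT INTO Engineers (EngineerId, PayrollNumber, EmailAddress)
VALUES (2, 'P00002', 'bob@example.com');

-- Fails: PayrollNumber 'P00002' breaks the unique constraint,
-- even though the primary key value is new.
INSERT INTO Engineers (EngineerId, PayrollNumber, EmailAddress)
VALUES (3, 'P00002', 'sue@example.com');

-- The following would be rejected, so they are left commented out:
-- a second primary key on the same table
--   ALTER TABLE Engineers ADD CONSTRAINT PK_Engineers2 PRIMARY KEY (PayrollNumber);
-- a primary key on a nullable column
--   ALTER TABLE Engineers ADD CONSTRAINT PK_Engineers3 PRIMARY KEY (EmailAddress);
```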
In addition to defining a unique reference for every row, a primary key by default adds a clustered index to the table, provided that the table does not already have one. This special type of index controls the physical ordering of information within the table, so that the data is sorted according to the values in the primary key columns. This can significantly improve the performance of queries that select data according to the primary key column values. NB: Although the primary key index is clustered by default, the clustering can be removed and applied to another index. This will be described in a later article in this tutorial.
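For reference, one way of keeping the clustering away from the primary key is to specify it when the table is created. This is only a sketch under assumed names; the JobVisits table, its columns and the index name are hypothetical.

```sql
-- Hypothetical table, for illustration only.
-- The NONCLUSTERED keyword stops the primary key from claiming
-- the table's single clustered index.
CREATE TABLE JobVisits
(
    JobVisitId INT NOT NULL,
    VisitDate DATETIME NOT NULL,
    CONSTRAINT PK_JobVisits PRIMARY KEY NONCLUSTERED (JobVisitId)
);

-- The clustered index can then be applied to another column, so that
-- the rows are physically ordered by VisitDate instead of by the key.
CREATE CLUSTERED INDEX IX_JobVisits_VisitDate ON JobVisits (VisitDate);

-- List the table's indexes to see which one is clustered
-- and which one backs the primary key.
SELECT name, type_desc, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID('JobVisits');
```

If neither CLUSTERED nor NONCLUSTERED is specified and the table has no clustered index, the primary key index is clustered.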
An important use of primary keys in relational databases is for the creation of "one-to-many" relationships between tables. In most cases, the primary key is used as the unique reference that is specified in the linked table, by means of a foreign key.
A natural key is a primary key that uses naturally unique information from the table in its constituent columns. Many database developers prefer this type of key as the use of naturally unique data makes reading the raw data in linked tables easier. In the Parts table below, the PartNumber column provides a naturally unique value that is an ideal candidate for a primary key.
| 15COPIPE | 15mm Copper Pipe | 1.52 |
| 22COPIPE | 22mm Copper Pipe | 2.97 |
| 10MICCOP | 10mm Microbore Copper Tube | 1.17 |
| 25STPTAP | 25x25mm Stop Tap | 5.99 |
This table defines a set of parts that engineers may use when performing jobs for customers. With the part number being used as the primary key, a related table can be defined that holds details of the stock that engineers have in their possession. In the EngineerStock table below, the PartNumber column provides the link to the Parts table and is easily readable by a developer who understands the data. For example, we can see immediately that the engineer with an EngineerId of "1" has 18 units of 15mm copper pipe.
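The relationship between the two tables could be expressed along the following lines. Only the table names, the PartNumber and EngineerId columns and the sample values come from the text above; the PartName, UnitCost and UnitsHeld column names, the data types and the composite key on EngineerStock are assumptions made for this sketch.

```sql
-- Parts uses the naturally unique part number as its primary key.
-- PartName and UnitCost are assumed column names.
CREATE TABLE Parts
(
    PartNumber VARCHAR(8) NOT NULL,
    PartName VARCHAR(50) NOT NULL,
    UnitCost MONEY NOT NULL,
    CONSTRAINT PK_Parts PRIMARY KEY (PartNumber)
);

-- EngineerStock links back to Parts through a foreign key.
-- UnitsHeld and the composite primary key are assumptions.
CREATE TABLE EngineerStock
(
    EngineerId INT NOT NULL,
    PartNumber VARCHAR(8) NOT NULL,
    UnitsHeld INT NOT NULL,
    CONSTRAINT PK_EngineerStock PRIMARY KEY (EngineerId, PartNumber),
    CONSTRAINT FK_EngineerStock_Parts FOREIGN KEY (PartNumber)
        REFERENCES Parts (PartNumber)
);

INSERT INTO Parts (PartNumber, PartName, UnitCost) VALUES ('15COPIPE', '15mm Copper Pipe', 1.52);
INSERT INTO Parts (PartNumber, PartName, UnitCost) VALUES ('22COPIPE', '22mm Copper Pipe', 2.97);
INSERT INTO Parts (PartNumber, PartName, UnitCost) VALUES ('10MICCOP', '10mm Microbore Copper Tube', 1.17);
INSERT INTO Parts (PartNumber, PartName, UnitCost) VALUES ('25STPTAP', '25x25mm Stop Tap', 5.99);

-- Engineer 1 holds 18 units of 15mm copper pipe.
INSERT INTO EngineerStock (EngineerId, PartNumber, UnitsHeld) VALUES (1, '15COPIPE', 18);

-- The natural key is readable in the raw data, but a join is still
-- available when the full part details are needed.
SELECT es.EngineerId, es.PartNumber, p.PartName, es.UnitsHeld
FROM EngineerStock es
INNER JOIN Parts p ON p.PartNumber = es.PartNumber;
```

With the foreign key in place, an attempt to record stock for a part number that does not exist in the Parts table is rejected.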
There can be problems when trying to select naturally unique values for a primary key. These occur when the unique information in the table does not meet the following guidelines and rules:
- Primary key data should be immutable. This means that once set, the information should never be changed. Although it is possible to modify the information in a primary key column, it is inadvisable. This is because the change would need to be propagated to every linked table to maintain referential integrity. This usually prevents the use of data such as a car's registration number as the primary key; if a car's registration number is changed, the primary key information would need to be updated.
- The natural key must be unique for every row in the table and must not include nullable information. This would prevent the use of a person's name as a primary key as names are not unique. It is also possible that a person's name changes through marriage or a legal name change, breaking the first guideline.
- The information within the primary key columns should be as compact as possible. If the column sizes in the primary key are large, the number of index entries that can be held in memory simultaneously is reduced. Large keys cause inefficiencies in index access and input/output, reducing the performance of your database. In a flight school database, the combination of pilot reference number, co-pilot reference number, aircraft registration and departure date and time may be unique but could be considered too large for use in a primary key.
- SQL Server does not permit the use of large object columns in primary keys. If you are holding binary file data in a database, it may be unique, but it cannot be used in a primary key.
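The first and last of these guidelines can be illustrated with a short script. It is a sketch only: the Vehicles, VehicleServices and Documents tables, their columns and the sample registration numbers are all hypothetical. The cascading update shows how much work a change to a key value implies; it is not a recommendation to use mutable keys.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE Vehicles
(
    RegistrationNumber VARCHAR(10) NOT NULL,
    Model VARCHAR(50) NOT NULL,
    CONSTRAINT PK_Vehicles PRIMARY KEY (RegistrationNumber)
);

CREATE TABLE VehicleServices
(
    ServiceId INT NOT NULL,
    RegistrationNumber VARCHAR(10) NOT NULL,
    ServiceDate DATETIME NOT NULL,
    CONSTRAINT PK_VehicleServices PRIMARY KEY (ServiceId),
    -- ON UPDATE CASCADE asks SQL Server to propagate key changes to
    -- this table. Without it, updating a referenced key value fails.
    CONSTRAINT FK_VehicleServices_Vehicles FOREIGN KEY (RegistrationNumber)
        REFERENCES Vehicles (RegistrationNumber)
        ON UPDATE CASCADE
);

INSERT INTO Vehicles (RegistrationNumber, Model) VALUES ('AB08 CDE', 'Van');
INSERT INTO VehicleServices (ServiceId, RegistrationNumber, ServiceDate)
VALUES (1, 'AB08 CDE', '20080817');

-- Changing the natural key forces every linked row to change too.
UPDATE Vehicles
SET RegistrationNumber = 'XY08 ZZZ'
WHERE RegistrationNumber = 'AB08 CDE';

-- The linked row now carries the new registration number.
SELECT ServiceId, RegistrationNumber FROM VehicleServices;

-- Large object columns cannot be key columns, so a definition such as
-- the following is rejected and is left commented out:
--   CREATE TABLE Documents
--   (
--       FileData VARBINARY(MAX) NOT NULL,
--       CONSTRAINT PK_Documents PRIMARY KEY (FileData)
--   );
```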
17 August 2008