The tutorial has been created using Microsoft SQL Server 2005 for all code samples and downloadable demonstrations. However, all of the general concepts and the majority of the samples are valid for earlier versions of the database management system. They should also be valid for future releases of SQL Server.
To try the tutorial examples, a SQL Server system must be available. This can be a locally installed instance of SQL Server or a version installed on an available networked system. Any service-based edition of SQL Server 2005 is ideal, including the free-of-charge Express edition, which may be downloaded from the Microsoft SQL Server web site. SQL Server Compact Edition (CE) may also be used for some samples, although this edition is designed to be usable on hand-held devices and, as such, has limited functionality. Access to SQL Server Management Studio, SQL Server Management Studio Express or a similar tool is strongly recommended.
What is a Database?
A database is simply a place to store and process structured information. Databases permit manipulation or retrieval of their contained data in usable manners. There are many forms of database that store and organise information using different structures.
Microsoft SQL Server is a database management system or DBMS. This terminology should not be confused with the term, "database". A DBMS is the software that provides the facilities to create and maintain databases and manipulate the information stored within. In the case of SQL Server, the databases that are created are relational databases, so the product is often known as a relational database management system or RDBMS.
What is a Relational Database?
Relational databases are probably the most common type of database used for general-purpose tasks. In a relational database, information is grouped according to its type, generally in tables (see below). For example, in a database designed to hold fleet information you may include a table of employees and a table of vehicles. These two types of structured data would be held separately as they hold fundamentally different information.
In addition to separating information according to its data structure, a relational database allows relationships to be created. A relationship defines a possible link between data types; the actual linkage of data is dependant upon the information held. For example, in the fleet database there may be a relationship between employees and vehicles, indicating that one or more employees drive a particular vehicle.
Normalisation and Denormalisation
Relational databases use the concept of normalisation. Normalisation is a design technique that minimises the duplication of information. It also reduces the risk of errors in data caused by copies of the same information not being kept synchronised. By using relationships, the duplication required can be lessened or eliminated completely. For example, the fleet database could be created with a single table containing all employees and the details of the vehicle that they drive. If two employees use the same vehicle, the vehicle's details will be entered twice and must be maintained in two places. By creating a relationship between the employees table and the vehicles table, the vehicle data appears only once and requires updating in only one place.
The opposite of normalisation is denormalisation. In a denormalised database information is purposefully duplicated. This provides greater performance than a normalised database at the cost of increased database size and the risk of data integrity problems. Denormalised databases are useful where speed of data retrieval is paramount, such as in a database used for reporting purposes only.
Normalisation and denormalisation will be considered in more detail in a later article in this tutorial.
Elements of a Relational Database Schema
There are several key elements to a relational database created using Microsoft SQL Server. Each of these forms a part of the database's schema. The schema is the logical data model that determines the information that may be stored in the database and how it is to be arranged. This section describes some of the key terminology that should be understood before continuing with the tutorial.
The table is the most important element of a relational database. Tables are where the information within a database is held. A table can be thought of as a grid of information where the columns in the grid define the elements of data required and each row is an individual record. In fact, the terms "columns" and "rows" are those used to describe the table's schema and its data respectively.
|Payroll ID||Name||Vehicle Registration|
|1001||John Smith||ZX08 W4S|
|1002||Bill Jones||NX08 DS8|
|1003||Lisa Brown||NJ57 D1D|
|1005||Louise Green||NX08 DS8|
The above data could be held in a table of employees. The table contains three columns for the payroll ID, name and vehicle registration number; these being defined in the table's schema. The table has four rows of data, one for each employee that has access to a vehicle. The vehicle's details will be held in a separate, related table linked by the unique vehicle registration number.
NB: The storage of data in a table is more complicated than that shown above. However, this is a useful metaphor for this article.
18 May 2008