BlackWaspTM
SQL Server
SQL 2005+

SQL Server Basic Aggregation Functions

The twenty-seventh part of the SQL Server Programming Fundamentals tutorial describes the use of the aggregation functions that can be used when executing queries. These functions allow you to perform calculations based upon multiple result rows.

Aggregations

Transact-SQL (T-SQL) includes several aggregate functions that can be used in queries. An aggregate function calculates a value based upon the contents of multiple result rows. The calculated values of aggregations are returned in columns in a result set.

In this article we will examine the five most commonly used aggregate functions. These allow you to count the rows returned by a query, calculate a sum or average value and determine the maximum and minimum values in a column. There are further aggregations for calculating checksums and statistical values but these are used less often and will not be examined here. We will consider simple examples that perform calculations across entire result sets in this article. In the next article we will introduce grouping, which allows multiple aggregate values to be generated for groups of related rows in a result set.

JoBS Database

The examples in this article use queries that are executed against the JoBS tutorial database, which has been constructed and populated through the course of these articles. If you do not have an up-to-date version of the database, you can create one using the script that can be downloaded using the link at the top of this page.

Counting Rows

The first aggregation function that we will examine is count. In its simplest form, this function returns the number of rows in a table as an integer value. For example, to count all of the rows in the CustomerAddresses table you can execute the following query. This should return a value of twenty-one:

SELECT count(*) FROM CustomerAddresses

If you have a query that may count billions of rows, an int value may not be large enough. In this case, you may use the count_big function. The functionality is almost identical to count, except that the returned result is a bigint value.

SELECT count_big(*) FROM CustomerAddresses

The above syntaxes use an asterisk to specify that all rows in the resultant data will be counted irrespective of the data that they contain. If you replace the asterisk character with the name of a column, the functionality is modified slightly. In that case, all rows that contain information in the named column are counted. Any rows with a NULL value in the stated column are excluded from the count.

The CustomerAddresses table contains several columns for each customer address. The second of these, "Address2", is a nullable column so can be used in a demonstration. Run the following query to see that the NULL values are not counted:

SELECT count(Address2) FROM CustomerAddresses

In addition to counting only non-null values, you may decide that you want to count only distinct, non-null values. This can be achieved by adding the DISTINCT clause to the column name. The following query should return a count of nineteen rows even though there are twenty-one values in the "TownOrCity" column. As the Leeds and Manchester values are duplicated, each is only counted once.

SELECT count(DISTINCT TownOrCity) FROM CustomerAddresses

Using Column Aliases

You will have noticed that the name of the calculated column is generated automatically by SQL Server when running aggregation queries. As with other types of query, you can include an alias to give a name to the calculated columns. The next query calculates a count of distinct cities and a count of all addresses. To easily identify the two values, the columns are named.

SELECT count(DISTINCT TownOrCity) AS Cities, count(*) AS Addresses
FROM CustomerAddresses

NB: This query includes two aggregate functions. It is acceptable to add many aggregated columns to a query. However, you may not mix aggregated and non-aggregated columns in a query without the use of grouping.

14 May 2009