Big O Notation Basics

by Richard Carr, published at http://www.blackwasp.co.uk/BigONotation.aspx

Big O notation is a mathematical notation that can be applied to algorithms to describe their computation complexity. It gives a guide to the scalability of an algorithm as the number of items in the input data that it processes increases.

What is Big O Notation?

Big O notation is a mathematical notation that can be applied to the algorithms that we use when developing software. In this context, its purpose is to describe the computational complexity of an algorithm. Specifically, it allows you to estimate how scalable an algorithm will be as the amount of data to process grows. The scalability indicated by big O notation can be used to describe various things, such as the time or processing cycles required as the data set expands, or the increasing memory or resource utilisation.

Big O belongs to a family of called the asymptotic notation. This includes a number of different notations, each with a different purpose. Big O notation's purpose is to indicate the upper bound or worst-case scenario for an algorithm's growth rate. It is not an exact measure of performance or resource utilisation. For example, consider an algorithm that Big O notation suggests has a linear growth in processing time. Doubling the data set size does not necessarily mean that the running time for the algorithm will double. It merely means that the upper bound processing time for the larger data set will be twice that of the worst-case scenario for the smaller input.

The Notation

Big O notation uses a capital letter "O", followed by a function in parentheses. It is the function that describes the scalability. The function almost always includes the letter "n", which represents of the number of elements in the data being processed. The following subsections describe some of the more commonly seen examples.

O(1)

O(1) algorithms are those that have no increase in complexity related to the size of the input data; the processing time, memory utilisation or other item being measured is fixed. A simple example would be Language-Integrated Query's (LINQ), "First" operator. This returns the first item from a sequence. Having a larger sequence does not increase the time required to extract the first element.

O(log n)

O(log n) describes algorithms with logarithmic scalability. These increase in complexity as the number of elements in the data set increases but by a relatively smaller amount as the size grows. A simple example is the binary search algorithm for a sorted sequence. Each iteration of this algorithm can disregard half of the remaining items. Doubling the size of the search data only adds one extra iteration to the worst-case search scenario.

O(n)

O(n) algorithms show linear growth. A simple example is a search through an unsorted set of data by reading one item after another until a specific item is found. The worst-case would be when the item being searched for is the last in the set. Adding an extra item adds one extra potential search iteration. This example highlights the idea that Big O describes the upper bound only. A set of one thousand or one million items can be searched in one iteration if the item you are looking for is the first in the list.

O(n log n)

O(n log n) algorithms are known as loglinear. Often they are algorithms that may perform an O(log n) operation for every item in the input data. Several sorting algorithms, such as quick sort and heap sort are O(n log n).

O(n^x)

O(n²), or quadratic, algorithms increase in complexity proportionally to the square of the number of items in the input data. If you have two nested loops for the input data in your algorithm, such as with bubble sort, it is likely to be O(n²). Other variations of this are O(n^x) or polynomial algorithms. For example, three nested loops would be O(n³), four nested loops are O(n⁴) and so on.

O(xⁿ)

O(2ⁿ) algorithms give exponential growth and very poor scalability indeed. Each additional item in the input data has the potential to double the amount of time or resources used by the algorithm. The constant value 2 can be changed for algorithms with yet poorer scalability. For example, O(3ⁿ) indicates that the complexity can treble with each additional item.

O(n!)

Algorithms that may consider every permutation of the items in the input data set are O(n!), giving factorial growth in complexity. An example of such scalability is attempting to solve the travelling salesman problem by brute force.

Comparison

As we move down the above list, the scalability becomes poorer with each step. To see the difference, consider the following graph. This shows the relative upper bound scenario for algorithms of each type, showing the number of cycles or resource requirements per algorithm type, according to the input set size.

Big O Notation Comparison Chart

Final Notes

When considering the scalability of an algorithm in order to determine the correct big O option, you should ignore constants. For example, imagine you have a linear algorithm where each iteration takes one second to execute. This is O(n). If you improve the code and halve the processing cycles required for each iteration, you would not deem this O(0.5n). The algorithm is still O(n), even though it runs more quickly.

When calculating a formula for the worst-case execution plan for your algorithm, it will often not match one of the aforementioned options. In these cases, only the poorest scalability item is used as the function. For example, you might have an algorithm that includes three nested loops, followed by two nested loops. The worst case scenario for the number of iterations would be (n³ + n²). However, as n³ is the poorer scalability, you would say that the algorithm was O(n³).

27 January 2013