Computational complexity

From optimization
Jump to: navigation, search

Authors: David O'Brien, David Chen, Mark Caswell.

Contents

Introduction

Computational complexity refers to the amount of resources required to solve a type of problem by systematic application of an algorithm. Resources that can be considered include the amount of communications, gates in a circuit, or the number of processors. Because the size of the particular input to a problem will affect the amount of resources necessary, measures of complexity will have to take into account this difference. As such, they can be reported as a function of the input size. As a basic example, consider the problem of sorting an array of random numbers. Take two arrays of different sizes. Intuitively, the larger one will require more steps when using the same method. Additionally, comparisons of complexity across a range of input sizes may not be consistent – that is, an algorithm may be considered to be less complex than others for small inputs, but the same comparison may not hold for larger input sizes. Depending on starting points, iterative application of algorithms may not cover the same trajectory to the same solution even if the input to the problem is identical. One way of addressing this variability is to compare the upper and lower bounds of complexity (worst and best case scenarios, respectively) and average values. Returning to our example, consider the case when the arrays are randomly populated. There is a probability, however small, that either or both of the arrays will be populated already sorted. In this case, it is possible that the larger array needs less sorting than the smaller one.

Machine Models

In our example of sorting number arrays, we did not consider the resources needed to completely traverse the arrays. A basic machine model is the Turing machine, which is essentially a length of tape that contains data to be retrieved and manipulated. Again, consider our example, using the following sorting mechanism: At each array location, compare the current value with the value in the next location. If they are out of order, swap them. Do this at every location until the penultimate location, and start again from the beginning. Repeat until a pass through the array is completed in which no swaps are made, meaning everything is finally sorted. There are more efficient algorithms such as mergesort, heapsort, and quicksort. The comparison between our simple algorithm and the improved ones is analogous to comparing a Turing machine to other computational models. The standard Turing machine has to take time to access all the locations on its way between two locations. This is in contrast to a random access Turing machine, in which the intermediate elements can be skipped.

Quantifying Complexity

When discussing complexity, “time” is often used as the basis for comparison. However, the “time” used is not absolute time, but rather an indication of the number of steps required to solve a problem. Because hardware is constantly improving, the absolute time required to solve a given problem will decrease, even if no change is made to the algorithm. The complexity of a problem will be identical, as it is the number of steps required, but the processing is faster on modern machines because of their processing powerz. Complexity should thus be quantified in a manner that is independent of hardware. It will, however, necessarily depend on the methods used, namely algorithm and machine model. A meaningful way of quantifying complexity is to state how the time scales with the size of the input.

Polynomial time

Polynomial time refers to problems whose upper bound of computation is a function of the input size. Other time classes include linear time, quadratic time, or exponential time. Examples of polynomial time algorithms include the "quicksort" algorithm and basic arithmetic operations such as addition, subtraction, multiplication, division, and comparison.

Notable Comparison

There are many different complexity classes. A notable comparison is the one between the P and NP classes of problems. P and NP stand for Polynomial Time and Nondeterministic Polynomial Time, respectively. P contains decision problems that can be solved in polynomial time on a deterministic Turing machine. NP, on the other hands, contains all decision problems that, given a potential solution, can be verified in polynomial time on a deterministic Turing machine. Solution of NP problems frequently occurs in exponential time. Polynomial time means that the steps required to solve the problem has an upper bound given by a polynomial function of the input size. The P vs. NP problem asks whether every problem with solutions that can be verified in polynomial time can also be solved in polynomial time. It is one of the Millennium Prize Problems. The significance of this is that if P and NP are indeed equal, then a polynomial time solution exists for problems whose solutions can be verified in polynomial time rather than the frequently encountered exponential time solutions.