Heuristic algorithms


Authors: Vincent Kenny, Matthew Nathal, and Spencer Saldana (ChE 345 Spring 2014)

Steward: Dajun Yue, Fengqi You

Date Presented: May 25, 2014


A heuristic algorithm is one designed to solve a problem faster and more efficiently than traditional methods by sacrificing optimality, accuracy, precision, or completeness for speed. Heuristic algorithms are often used to solve NP-complete problems, a class of decision problems for which no efficient way of finding a solution quickly and accurately is known, although candidate solutions can be verified quickly when given. Heuristics can produce a solution on their own or provide a good baseline that is then refined by optimization algorithms. Heuristic algorithms are most often employed when approximate solutions are sufficient and exact solutions are computationally expensive.1


Example Algorithms

The following are well-known examples of “intelligent” algorithms that use clever simplifications and methods to solve computationally complex problems.

Swarm Intelligence

Swarm Intelligence systems employ large numbers of agents that interact locally with one another and with their environment. Swarm intelligence refers to the collective behavior of such decentralized systems and can describe both natural and artificial systems. Specific algorithms in this class include the particle swarm optimization algorithm, the ant colony optimization algorithm, and the artificial bee colony algorithm. Each of these algorithms was inspired by the natural, self-organized behavior of animals.1
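
As an illustration of the idea, below is a minimal particle swarm optimization sketch in Python; the function name pso, the parameter values, and the sphere-function example are hypothetical choices made here, not taken from the text. Each particle's velocity is pulled toward both its own best-known position and the swarm's best-known position.

 import random

 def pso(f, dim=2, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0)):
     """Minimal particle swarm optimization sketch: minimize f over a box (illustrative only)."""
     lo, hi = bounds
     pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
     vel = [[0.0] * dim for _ in range(n_particles)]
     pbest = [p[:] for p in pos]                      # each particle's best position so far
     pbest_val = [f(p) for p in pos]
     g = min(range(n_particles), key=lambda i: pbest_val[i])
     gbest, gbest_val = pbest[g][:], pbest_val[g]     # best position found by the whole swarm

     for _ in range(iters):
         for i in range(n_particles):
             for d in range(dim):
                 r1, r2 = random.random(), random.random()
                 # velocity update: inertia plus pulls toward the personal and global bests
                 vel[i][d] = (w * vel[i][d]
                              + c1 * r1 * (pbest[i][d] - pos[i][d])
                              + c2 * r2 * (gbest[d] - pos[i][d]))
                 pos[i][d] += vel[i][d]
             val = f(pos[i])
             if val < pbest_val[i]:
                 pbest[i], pbest_val[i] = pos[i][:], val
                 if val < gbest_val:
                     gbest, gbest_val = pos[i][:], val
     return gbest, gbest_val

 # Example: minimize the sphere function; the swarm should converge near the origin.
 print(pso(lambda x: sum(xi * xi for xi in x)))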

Tabu Search

This heuristic technique uses dynamically generated tabus to guide the search toward optimal solutions. It examines potential solutions to a problem and checks their immediate local neighbors for an improved solution. The search builds a set of rules dynamically and prevents the system from redundantly searching the same area by marking rule-violating solutions as “tabu,” or forbidden. This addresses a weakness of plain local search methods, which can get stuck in suboptimal regions or in areas where many solutions are equally fit.2
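
A minimal sketch of tabu search, assuming a bit-flip neighborhood over binary strings and a fixed tabu tenure; the function name, parameters, and example cost function are illustrative, not from the text.

 import random

 def tabu_search(cost, n_bits, iters=200, tabu_tenure=5):
     """Minimal tabu search sketch over binary strings (illustrative only)."""
     current = [random.randint(0, 1) for _ in range(n_bits)]
     best, best_cost = current[:], cost(current)
     tabu = {}  # bit index -> iteration until which flipping that bit is forbidden

     for it in range(iters):
         candidates = []
         for i in range(n_bits):
             neighbor = current[:]
             neighbor[i] ^= 1                         # flip one bit to get a neighbor
             c = cost(neighbor)
             # a tabu move is only allowed if it beats the best solution so far (aspiration)
             if tabu.get(i, -1) < it or c < best_cost:
                 candidates.append((c, i, neighbor))
         if not candidates:
             continue
         c, i, neighbor = min(candidates)             # best admissible neighbor, even if worse
         current = neighbor
         tabu[i] = it + tabu_tenure                   # forbid undoing this flip for a while
         if c < best_cost:
             best, best_cost = current[:], c
     return best, best_cost

 # Example: the cost is the number of 1s, so the search should drive toward all zeros.
 print(tabu_search(lambda x: sum(x), n_bits=10))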

Simulated Annealing

Borrowing its name from metallurgy, this technique converges to a solution in much the same way that a metal is brought to a minimum-energy configuration by slow cooling and increasing grain size. Simulated annealing is used in global optimization and can give a reasonable approximation of a global optimum for a function with a large search space. At each iteration it probabilistically decides between staying at its current state and moving to a neighboring one, in a way that ultimately guides the system toward a lowest-energy state.2
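
A minimal simulated annealing sketch, assuming a one-dimensional continuous search space and a geometric cooling schedule; the function name, parameters, and test function are illustrative, not from the text.

 import math, random

 def simulated_annealing(f, x0, step=0.5, t0=1.0, cooling=0.995, iters=5000):
     """Minimal simulated annealing sketch for minimizing a 1-D function f (illustrative only)."""
     x, fx = x0, f(x0)
     best, best_f = x, fx
     t = t0
     for _ in range(iters):
         x_new = x + random.uniform(-step, step)      # propose a random neighboring state
         f_new = f(x_new)
         # always accept improvements; accept worse moves with probability exp(-delta/T)
         if f_new < fx or random.random() < math.exp(-(f_new - fx) / t):
             x, fx = x_new, f_new
             if fx < best_f:
                 best, best_f = x, fx
         t *= cooling                                 # slowly lower the "temperature"
     return best, best_f

 # Example: a multimodal function; the probabilistic moves help escape local minima.
 print(simulated_annealing(lambda x: x * x + 10 * math.sin(x), x0=5.0))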

Genetic Algorithms

Genetic algorithms are a subset of the larger class of evolutionary algorithms, a set of techniques inspired by mechanisms of natural selection such as inheritance, mutation, and crossover. A genetic algorithm requires both a genetic representation of the solution domain and a fitness function to evaluate candidate solutions. The technique generates a population of candidate solutions and uses the fitness function to select the fittest candidates, which are recombined and mutated to produce each new generation. The algorithm terminates when a satisfactory fitness level has been reached or the maximum number of generations has been produced.2
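
A minimal genetic algorithm sketch, assuming bit-string individuals, tournament selection, single-point crossover, and bit-flip mutation; all names, parameter values, and the OneMax fitness function are illustrative, not from the text.

 import random

 def genetic_algorithm(fitness, n_bits=20, pop_size=50, generations=100,
                       crossover_rate=0.9, mutation_rate=0.02):
     """Minimal genetic algorithm sketch maximizing a fitness function over bit strings."""
     pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

     def select(pop):
         # tournament selection: keep the fitter of two randomly chosen individuals
         a, b = random.sample(pop, 2)
         return a if fitness(a) >= fitness(b) else b

     for _ in range(generations):
         new_pop = []
         while len(new_pop) < pop_size:
             p1, p2 = select(pop), select(pop)
             if random.random() < crossover_rate:     # single-point crossover
                 point = random.randint(1, n_bits - 1)
                 child = p1[:point] + p2[point:]
             else:
                 child = p1[:]
             child = [bit ^ 1 if random.random() < mutation_rate else bit
                      for bit in child]               # bit-flip mutation
             new_pop.append(child)
         pop = new_pop
     return max(pop, key=fitness)

 # Example ("OneMax"): fitness is the number of 1s, so the GA should find the all-ones string.
 print(genetic_algorithm(lambda ind: sum(ind)))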

Artificial Neural Networks

Artificial Neural Networks (ANNs) are models capable of pattern recognition and machine learning, in which a system analyzes a set of training data and is then able to categorize new examples and data. ANNs are inspired by animals’ central nervous systems and brains, and are used to solve a wide variety of problems, including speech recognition and computer vision.1

Support Vector Machines

Support Vector Machines (SVMs) are supervised learning models used to recognize patterns and analyze data, typically for classification and regression purposes. Given a set of labelled training examples, the algorithm learns a decision boundary and sorts new examples into one of the groups. SVMs are part of machine learning, a subset of artificial intelligence in which systems learn from data, and they require training data before they can analyze new examples.1
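
As a sketch of how an SVM sorts new examples into groups after seeing labelled training data, the snippet below uses the scikit-learn library; the library choice and the toy data are assumptions made here, not part of the text.

 # Requires scikit-learn (pip install scikit-learn); the data below are made-up toy points.
 from sklearn.svm import SVC

 # Labelled training data: two groups of 2-D points.
 X_train = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
 y_train = [0, 0, 1, 1]

 clf = SVC(kernel="linear")       # fit a maximum-margin separating boundary
 clf.fit(X_train, y_train)

 # New examples are sorted into one of the learned groups.
 print(clf.predict([[0.1, 0.0], [1.1, 0.9]]))   # expected output: [0 1]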

Example Problems

Traveling Salesman Problem

A well-known example of a heuristic algorithm is one used to solve the common Traveling Salesman Problem. The problem is as follows: given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once? A heuristic algorithm used to quickly find a good (though not necessarily optimal) solution is the nearest neighbor (NN) algorithm, a greedy algorithm. Starting from an arbitrarily chosen city, the algorithm moves to the closest unvisited city and repeats this step until every city has been visited.3


Figure 1: Example of how the nearest neighbor algorithm functions.4

These are the steps of the NN algorithm (a minimal code sketch follows the worked example below):

  1. Start at an arbitrary vertex and make it the current vertex.
  2. Find the shortest edge connecting the current vertex to an unvisited vertex V.
  3. Make V the current vertex.
  4. Mark V as visited.
  5. Add the distance traveled to the running total.
  6. Terminate if no unvisited vertices remain.
  7. Otherwise, go back to step 2 and repeat.5

This algorithm is heuristic in that it does not account for better routes that its greedy selection process excludes. For n cities, the NN algorithm typically creates a path approximately 25% longer than the optimal solution.6

Traveling Salesman Example Problem

There are 4 points of interest located in a 10x10 plot of space: (3,4.5), (9,6.25), (1,8), and (5.5,0). The nearest neighbor algorithm is applied below from two different starting points, and the total distance required to visit all 4 points is compared:


Starting at point (1,8): The shortest distance to an unvisited point is 4.03 units to point (3,4.5). The shortest distance to an unvisited point is 5.15 units to point (5.5,0). The shortest distance to an unvisited point is 7.16 units to point (9,6.25). The total distance traveled is 16.34 units.

Starting at point (9,6.25): The shortest distance to an unvisited point is 6.25 units to point (3,4.5). The shortest distance to an unvisited point is 4.03 units to point (1,8). The shortest distance to an unvisited point is 9.18 units to point (5.5,0). The total distance traveled is 19.46 units.

Both runs followed the NN algorithm, yet the total distance traveled changed based on the starting location. This shows how a heuristic algorithm can give a good solution, but not necessarily the best solution.
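
The two runs above can be reproduced with a short nearest neighbor sketch; the function and variable names are illustrative choices made here.

 import math

 points = [(3, 4.5), (9, 6.25), (1, 8), (5.5, 0)]

 def dist(a, b):
     return math.hypot(a[0] - b[0], a[1] - b[1])

 def nearest_neighbor_route(start, pts):
     """Greedy route: repeatedly move to the closest unvisited point (steps 1-7 above)."""
     unvisited = [p for p in pts if p != start]
     route, total, current = [start], 0.0, start
     while unvisited:
         nxt = min(unvisited, key=lambda p: dist(current, p))   # closest unvisited point
         total += dist(current, nxt)
         unvisited.remove(nxt)
         route.append(nxt)
         current = nxt
     return route, total

 for start in [(1, 8), (9, 6.25)]:
     route, total = nearest_neighbor_route(start, points)
     print(start, "->", round(total, 2))   # prints 16.34 and 19.46, matching the runs above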


Knapsack Problem

Another common use of heuristics is to solve the Knapsack Problem, in which a given set of items (each with a mass and a value) must be selected so that the total value is maximized while the total mass stays under a given limit. The heuristic for this problem is the greedy approximation algorithm, which sorts the items by value per unit mass and then adds the items with the highest value-to-mass ratio as long as space remains.

To illustrate, consider a bag with a maximum weight limit W. We want to maximize the total value of the objects that go into the bag, so the objective function is:

max z = \sum_{j} a_j x_j

s.t. \sum_{j} b_j x_j \leq W

x_j is a binary variable that determines whether object j goes in the bag.

a_j is the value of object j.

b_j is the weight of object j; the sum of the selected weights must not exceed W.7
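
A sketch of the greedy approximation algorithm described above; the item values, weights, and the limit W = 10 are hypothetical numbers chosen here for illustration.

 def greedy_knapsack(items, capacity):
     """Greedy approximation sketch: take items in decreasing order of value-to-weight ratio."""
     # items: list of (value a_j, weight b_j) pairs
     chosen, total_value, total_weight = [], 0.0, 0.0
     for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
         if total_weight + weight <= capacity:        # add the item only if it still fits
             chosen.append((value, weight))
             total_value += value
             total_weight += weight
     return chosen, total_value

 # Hypothetical items (value, weight) and weight limit W = 10.
 items = [(60, 5), (50, 4), (40, 6), (10, 3)]
 print(greedy_knapsack(items, capacity=10))   # picks (50, 4) and (60, 5) for a total value of 110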


In general, greedy algorithms are used to find approximate solutions to combinatorial optimization problems in a timely manner.8


Virus Scanning

In virus scanning, an algorithm searches for key pieces of code associated with particular kinds of viruses, reducing the number of files that need to be scanned. One of the benefits of heuristic virus scanning is that different viruses of the same family can be detected without themselves being known, thanks to the code markers they have in common.9

Searching and Sorting

One of the most common uses of heuristic algorithms is in searching and sorting. As a search runs, it adjusts its working parameters to optimize speed, an important characteristic in a search function. The algorithm discards current possibilities if they are worse than solutions already found.10 Some heuristic methods can, however, hurt the search: the greedy best-first search algorithm, for example, always expands the node that appears closest to the goal and follows that path even when it does not lead to the optimal result.11
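
A minimal greedy best-first search sketch illustrating this behavior; the graph, the heuristic values (estimated distances to the goal G), and the function name are hypothetical examples, not from the text.

 import heapq

 def greedy_best_first(graph, heuristic, start, goal):
     """Greedy best-first search sketch: always expand the node that looks closest to the goal."""
     frontier = [(heuristic[start], start, [start])]  # priority queue ordered by heuristic value
     visited = set()
     while frontier:
         _, node, path = heapq.heappop(frontier)
         if node == goal:
             return path
         if node in visited:
             continue
         visited.add(node)
         for neighbor in graph.get(node, []):
             if neighbor not in visited:
                 heapq.heappush(frontier, (heuristic[neighbor], neighbor, path + [neighbor]))
     return None

 # Hypothetical graph and heuristic estimates; B looks closest (h=1), so it is expanded first
 # even though the path through C reaches the goal sooner.
 graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": ["G"]}
 heuristic = {"A": 3, "B": 1, "C": 2, "D": 2, "G": 0}
 print(greedy_best_first(graph, heuristic, "A", "G"))   # returns ['A', 'C', 'G']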

References

1. S. A. Cook. "An overview of computational complexity", in Communications of the ACM, vol. 26, no. 6, June 1983, pp. 401–408.

2. D. Karaboga, D. Pham. Intelligent Optimisation Techniques: Genetic Algorithms, Tabu Search, Simulated Annealing and Neural Networks. Springer Verlag, 2000.

3. Jon Louis Bentley (1982). Writing Efficient Programs. Prentice Hall. p. 11.

4. Travelling salesman problem. (n.d.). In Wikipedia. Retrieved June 8, 2014, from https://en.wikipedia.org/wiki/Travelling_salesman_problem

5. Nearest neighbour algorithm. (n.d.). In Wikipedia. Retrieved June 4, 2014, from https://en.wikipedia.org/wiki/Nearest_neighbour_algorithm

6. Johnson, D.S. and McGeoch, L.A. "The traveling salesman problem: A case study in local optimization", in Local Search in Combinatorial Optimization, 1997, pp. 215–310.

7. Knapsack problem. (n.d.). In Wikipedia. Retrieved June 8, 2014, from https://en.wikipedia.org/wiki/Knapsack_problem

8. George B. Dantzig, Discrete-Variable Extremum Problems, Operations Research Vol. 5, No. 2, April 1957, pp. 266–288

9. Wong, W.; Stamp, M. (2006). "Hunting for metamorphic engines". Journal in Computer Virology 2 (3): 211–229.

10. Stuart Russell and Peter Norvig (2010). Artificial Intelligence: A Modern Approach. Prentice Hall.

11. R. Battiti. "Reactive search: towards self-tuning heuristics", in Modern Heuristic Search Methods. Wiley & Sons, 1996, pp. 61–83.