Line search methods

Author names: Elizabeth Conger
Steward: Dajun Yue and Fengqi You


=Introduction=

An algorithm is a line search method if it seeks the minimum of a nonlinear objective function by choosing, at each iteration, a reasonable search direction and a reasonable step length along that direction, so that the new point has a function value closer to the minimum of the function. Varying the choice of direction and step length changes the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial point <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the next iterate <math>x_{k+1}</math> is computed by the following iteration scheme

<math>x_{k+1} = x_k + \alpha_k p_k</math>,

in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction.
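As a minimal sketch of this scheme (not part of the original article), the loop below applies the update <math>x_{k+1} = x_k + \alpha_k p_k</math> to an assumed quadratic objective, using the negative gradient as the direction and a fixed step length purely for illustration; in practice <math>\alpha_k</math> is chosen by the rules described in the next section.

<syntaxhighlight lang="python">
import numpy as np

def f(x):
    # Assumed example objective for illustration: a simple quadratic
    return 0.5 * x @ x

def grad_f(x):
    return x

x = np.array([3.0, -2.0])      # initial iterate x_k
for k in range(10):
    p = -grad_f(x)             # step direction p_k (steepest descent, for illustration)
    alpha = 0.5                # step length alpha_k (fixed here only for simplicity)
    x = x + alpha * p          # the iteration scheme x_{k+1} = x_k + alpha_k * p_k
    print(k, f(x))
</syntaxhighlight>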

=Step Length=

Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:

<math>\phi(\alpha) = f(x_k + \alpha p_k), \quad \alpha > 0</math>,

but this exact minimization is generally not used in practical settings. It would give the most accurate minimum, but it is computationally expensive, especially when the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.

Figure 1: Complexity of finding ideal step length (Nocedal & Wright)
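For the special case of a quadratic objective, the exact minimizer of <math>\phi</math> is available in closed form; the sketch below (an illustration with an assumed <math>f(x) = \tfrac{1}{2}x^T Q x - b^T x</math>, not taken from the article) computes it. For general nonlinear functions no such formula exists, and the one-dimensional minimization must itself be solved iteratively, which is what makes exact line search expensive.

<syntaxhighlight lang="python">
import numpy as np

# Assumed quadratic objective f(x) = 0.5 x^T Q x - b^T x (illustration only)
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

def grad_f(x):
    return Q @ x - b

x = np.array([2.0, 2.0])
p = -grad_f(x)                                  # a descent direction

# phi(alpha) = f(x + alpha p) is a one-dimensional quadratic in alpha,
# so its exact minimizer is alpha* = -(grad_f(x)^T p) / (p^T Q p).
alpha_exact = -(grad_f(x) @ p) / (p @ Q @ p)
print(alpha_exact)
</syntaxhighlight>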

A common and practical alternative, which does not require finding the exact minimum of the <math>\phi</math> function, is to require only that the step length <math>\alpha_k</math> reduces the value of the target function, or that

<math>f(x_k + \alpha_k p_k) < f(x_k)</math>.

This requirement alone does not ensure convergence to the function's minimum, and so two further conditions are employed to enforce a sufficient decrease at every iteration.
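The sketch below (illustrative only, with an assumed quadratic objective) shows why simple decrease is too weak a requirement on its own: arbitrarily small step lengths pass the test, so a sequence of such steps can stall far from the minimum.

<syntaxhighlight lang="python">
import numpy as np

def f(x):
    # Assumed example objective (illustration only)
    return 0.5 * x @ x

def phi(alpha, x, p):
    # One-dimensional restriction of f along the search direction p
    return f(x + alpha * p)

x = np.array([3.0, -2.0])
p = -x                         # a descent direction for this f

# Simple decrease accepts even vanishingly small steps, so by itself
# it cannot force the iterates to make real progress toward the minimum.
for alpha in (1e-8, 1e-2, 0.5):
    print(alpha, phi(alpha, x, p) < f(x))
</syntaxhighlight>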

==Wolfe Conditions==

These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation requiring <math>\alpha_k</math> to decrease the objective function by a significant amount. This amount is defined by

<math>f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f(x_k)^T p_k</math>,

where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function along the step direction. This inequality is also known as the Armijo condition. In general, <math>c_1</math> is a very small value, on the order of <math>10^{-4}</math>.

The Armijo condition must be paired with the curvature condition

<math>\nabla f(x_k + \alpha_k p_k)^T p_k \ge c_2 \nabla f(x_k)^T p_k</math>

to keep the value of <math>\alpha_k</math> from being too small. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe conditions.
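A minimal sketch of checking both conditions for a candidate step length, assuming the caller supplies the objective <code>f</code>, its gradient <code>grad_f</code>, the current point and a descent direction (none of these names come from the article):

<syntaxhighlight lang="python">
import numpy as np

def wolfe_conditions(f, grad_f, x, p, alpha, c1=1e-4, c2=0.9):
    """Return (sufficient_decrease, curvature) for a candidate step length alpha.

    Illustrative sketch only; f, grad_f, x and p are assumed inputs.
    """
    g = grad_f(x)
    x_new = x + alpha * p
    # Armijo (sufficient decrease) condition
    sufficient_decrease = f(x_new) <= f(x) + c1 * alpha * (g @ p)
    # Curvature condition
    curvature = grad_f(x_new) @ p >= c2 * (g @ p)
    return sufficient_decrease, curvature

# Example usage with an assumed quadratic objective
f = lambda x: 0.5 * x @ x
grad_f = lambda x: x
x = np.array([3.0, -2.0])
p = -grad_f(x)
print(wolfe_conditions(f, grad_f, x, p, alpha=0.5))
</syntaxhighlight>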

Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The Armijo condition remains the same, but the curvature condition is tightened by bounding the absolute value of the directional derivative at the new point:

<math>\left| \nabla f(x_k + \alpha_k p_k)^T p_k \right| \le c_2 \left| \nabla f(x_k)^T p_k \right|</math>.

The left-hand side of this curvature condition is simply the derivative of the function <math>\phi(\alpha_k)</math>, and so this constraint prevents the derivative from becoming too large in magnitude (in particular, too positive), excluding points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values.
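Under the same assumptions as the sketch above, the strong Wolfe test differs only in the curvature line, where the magnitude of <math>\phi'(\alpha_k)</math> is bounded:

<syntaxhighlight lang="python">
import numpy as np

def strong_wolfe(f, grad_f, x, p, alpha, c1=1e-4, c2=0.9):
    # Illustrative sketch only; f, grad_f, x and p are assumed inputs.
    g = grad_f(x)
    x_new = x + alpha * p
    armijo = f(x_new) <= f(x) + c1 * alpha * (g @ p)       # Armijo condition, unchanged
    # Strong curvature condition: bound the magnitude of phi'(alpha_k)
    curvature = abs(grad_f(x_new) @ p) <= c2 * abs(g @ p)
    return armijo and curvature
</syntaxhighlight>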


=Step Direction=

==Steepest Descent Method==

==Newton Method==

==Quasi-Newton Method==

=Conclusion=

=References=

1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.

2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.

3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.

4. Wolfe, P. (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.