https://optimization.mccormick.northwestern.edu/api.php?action=feedcontributions&user=Epc257&feedformat=atomoptimization - User contributions [en]2022-08-13T09:51:52ZUser contributionsMediaWiki 1.21.3https://optimization.mccormick.northwestern.edu/index.php/File:Line_search_alogorithm_chart.pngFile:Line search alogorithm chart.png2015-06-07T17:30:36Z<p>Epc257: Epc257 uploaded a new version of &quot;File:Line search alogorithm chart.png&quot;</p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T17:28:36Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. Figure 1 gives a clear flow chart to indicate the iteration scheme. <br />
<br />
[[File:Line search alogorithm chart.png|frame|Figure 1: Algorithm flow chart of line search methods (Conger, adapted from Line Search wikipedia page)]]<br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 2.<br />
<br />
[[File:Step length image.png|frame|50px|Figure 2: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k) < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
<math> \nabla f(x_k + \alpha p_k)^T p_k \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
<math> |\nabla f(x_k + \alpha p_k)^T p_k | \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
<math>f(x_k) + (1-c)\alpha_k \nabla f^T_k p_k \le f(x_k + \alpha p_k) \le f(x_k) + c \alpha_k \nabla f^T_k p_k </math><br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 3. <br />
<br />
[[File:Goldstein.png|frame|Figure 3: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
<math>cos{\theta_k} = \frac{-\nabla f^T_k p_k}{||\nabla f_k|| ||p_k||}</math>,<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
<math>cos{\theta_k}^2 ||\nabla f_k||^2 < \infty</math>, which implies that <math>cos{\theta_k}^2 ||\nabla f_k||^2 \to 0</math><br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T17:27:27Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. Figure 1 gives a clear flow chart to indicate the iteration scheme. <br />
<br />
[[File:Line search alogorithm chart.png|frame|Figure 1: Algorithm flow chart of line search methods (Conger, adapted from Line Search wikipedia page)]]<br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 2.<br />
<br />
[[File:Step length image.png|frame|200px|Figure 2: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k) < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
<math> \nabla f(x_k + \alpha p_k)^T p_k \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
<math> |\nabla f(x_k + \alpha p_k)^T p_k | \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
<math>f(x_k) + (1-c)\alpha_k \nabla f^T_k p_k \le f(x_k + \alpha p_k) \le f(x_k) + c \alpha_k \nabla f^T_k p_k </math><br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 3. <br />
<br />
[[File:Goldstein.png|frame|Figure 3: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
<math>cos{\theta_k} = \frac{-\nabla f^T_k p_k}{||\nabla f_k|| ||p_k||}</math>,<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
<math>cos{\theta_k}^2 ||\nabla f_k||^2 < \infty</math>, which implies that <math>cos{\theta_k}^2 ||\nabla f_k||^2 \to 0</math><br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T17:23:19Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
[[File:Line search alogorithm chart.png|frame|Figure 1: Algorithm flow chart of line search methods (Conger, adapted from Line Search wikipedia page)]]<br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k) < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
<math> \nabla f(x_k + \alpha p_k)^T p_k \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
<math> |\nabla f(x_k + \alpha p_k)^T p_k | \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
<math>f(x_k) + (1-c)\alpha_k \nabla f^T_k p_k \le f(x_k + \alpha p_k) \le f(x_k) + c \alpha_k \nabla f^T_k p_k </math><br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
<math>cos{\theta_k} = \frac{-\nabla f^T_k p_k}{||\nabla f_k|| ||p_k||}</math>,<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
<math>cos{\theta_k}^2 ||\nabla f_k||^2 < \infty</math>, which implies that <math>cos{\theta_k}^2 ||\nabla f_k||^2 \to 0</math><br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:Line_search_alogorithm_chart.pngFile:Line search alogorithm chart.png2015-06-07T17:20:36Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T16:45:10Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k) < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
<math> \nabla f(x_k + \alpha p_k)^T p_k \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
<math> |\nabla f(x_k + \alpha p_k)^T p_k | \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
<math>f(x_k) + (1-c)\alpha_k \nabla f^T_k p_k \le f(x_k + \alpha p_k) \le f(x_k) + c \alpha_k \nabla f^T_k p_k </math><br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
<math>cos{\theta_k} = \frac{-\nabla f^T_k p_k}{||\nabla f_k|| ||p_k||}</math>,<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
<math>cos{\theta_k}^2 ||\nabla f_k||^2 < \infty</math>, which implies that <math>cos{\theta_k}^2 ||\nabla f_k||^2 \to 0</math><br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T16:40:39Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k) < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
<math> \nabla f(x_k + \alpha p_k)^T p_k \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
<math> |\nabla f(x_k + \alpha p_k)^T p_k | \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
<math>f(x_k) + (1-c)\alpha_k \nabla f^T_k p_k \le f(x_k + \alpha p_k) \le f(x_k) + c \alpha_k \nabla f^T_k p_k </math><br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
<math>cos{\theta_k} = \frac{-\nabla f^T_k p_k}{||\nabla f_k|| ||p_k||}</math>,<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
[[File:CodeCogsEqn (9).gif]], which implies that [[File:CodeCogsEqn (10).gif]]<br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T16:29:53Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
<math> \nabla f(x_k + \alpha p_k)^T p_k \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
<math> |\nabla f(x_k + \alpha p_k)^T p_k | \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
[[File:CodeCogsEqn (9).gif]], which implies that [[File:CodeCogsEqn (10).gif]]<br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T16:27:49Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
<math>\left |\nabla f(x_k + \alpha p_k)^T p_k \right \ge f(x_k) + c_2 \nabla f^T_k p_k</math><br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
[[File:CodeCogsEqn (9).gif]], which implies that [[File:CodeCogsEqn (10).gif]]<br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T16:22:42Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
<math>f(x_k+\alpha p_k < f(x_k)</math><br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
<math>f(x_k+\alpha p_k < f(x_k) + c_1 \alpha \nabla f^T_k p_k</math><br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
[[File:CodeCogsEqn (9).gif]], which implies that [[File:CodeCogsEqn (10).gif]]<br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T16:10:04Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k + \alpha p_k), \alpha >0</math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
[[File:CodeCogsEqn (9).gif]], which implies that [[File:CodeCogsEqn (10).gif]]<br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-06-07T16:01:30Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<math>x_{k+1} = x_k + \alpha_k p_k</math>,<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
[[File:CodeCogsEqn (9).gif]], which implies that [[File:CodeCogsEqn (10).gif]]<br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T04:48:42Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies: <br />
<br />
[[File:CodeCogsEqn (9).gif]], which implies that [[File:CodeCogsEqn (10).gif]]<br />
<br />
There are various algorithms to use this angle property to converge on the function's minimum, and they each have their benefits and disadvantages depending on the application and complexity of the target function. The major algorithms available are the ''steepest descent method'', the ''Newton method'', and the ''quasi-Newton methods''. These algorithms are explained in more depth elsewhere within this Wiki. <br />
<br />
When using these algorithms for line searching, it is important to know their weaknessess. The steepest descent method is the "quintessential globally convergent algorithm", but because it is so robust, it has a large computation time. The Newton methods rely on choosing an initial input value that is sufficiently near to the minimum. This is because the Hessian matrix of the function may not be positive definite, and therefore using the Newton method may not converge in a descent direction. The Newton method can be modified to atone for this. <br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T04:35:25Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies, which implies that <br />
<br />
[[File:CodeCogsEqn (9).gif]] or [[File:CodeCogsEqn (10).gif]]<br />
<br />
<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(10).gifFile:CodeCogsEqn (10).gif2015-05-25T04:35:11Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(9).gifFile:CodeCogsEqn (9).gif2015-05-25T04:33:56Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T04:33:33Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. The amount that <math>p_k</math> can deviate from the steepest slope and still produce reasonable results depends on the step length conditions that are adhered to in the method. For example, if <math>\alpha_k</math> satisfies the Wolfe conditions, the ''Zoutendijk condition'' applies, which implies that <br />
<br />
<math>\cos ^{2}\Theta_k \left \| \nabla f_k \right \|^{2} < \infty </math><br />
<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T04:26:01Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
When using line search methods, it is important to select a search or step direction with the steepest decrease in the function. This will increase the efficiency of line search methods. To identify this steepest descent at varying points along the function, the angle <math>\theta_k</math> between the chosen step direction and the negative gradient of the function <math> -\nabla f_k</math>, which is the steepest slope at point k. The angle is defined by <br />
<br />
[[File:CodeCogsEqn (8).gif]],<br />
<br />
and, as with the step length, it is not efficient to completely minimize <math> \theta_k </math>. <br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(8).gifFile:CodeCogsEqn (8).gif2015-05-25T04:23:21Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T04:15:38Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in Newton methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
In comparison to the Wolfe conditions, the Goldstein conditions are better suited for [https://optimization.mccormick.northwestern.edu/index.php/Quasi-Newton_methods quasi-Newton methods] than for Newton methods. <br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T03:51:20Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^{-4}</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in quasi-Newton line search methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe Conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T03:50:39Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^-4</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
These conditions are valuable for use in quasi-Newton line search methods.<br />
<br />
==Goldstein Conditions==<br />
Another approach to finding an appropriate step length is to use the following inequalities known as the Goldstein conditions. This condition, instead of having two constants, only employs one: <br />
<br />
[[File:CodeCogsEqn (7).gif]]<br />
<br />
The second equality is very similar to the Wolfe Conditions in that it is simply the sufficient decrease condition. The first inequality is another way to control the step length from below. This is best seen in the Figure 2. <br />
<br />
[[File:Goldstein.png|frame|Figure 2: Application of the Goldstein Conditions (Nocedal & Wright)]]<br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:Goldstein.pngFile:Goldstein.png2015-05-25T03:48:06Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(7).gifFile:CodeCogsEqn (7).gif2015-05-25T03:44:13Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T03:29:53Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^-4</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi ({\alpha_k})</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T03:29:04Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^-4</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1. These two conditions together are the Wolfe Conditions. <br />
<br />
Another, more stringent form of these conditions is known as the ''strong Wolfe conditions''. The ''Armijo condition'' remains the same, but the curvature condition is restrained by taking the absolute value of the left side of the inequality. <br />
<br />
[[File:CodeCogsEqn (6).gif]]<br />
<br />
This left hand side of the curvature condition is simply the derivative of the <math>\phi (\alpha_k)</math> function, and so this constraint prevents this derivative from becoming too positive, removing points that are too far from stationary points of <math>\phi</math> from consideration as viable <math>\alpha_k</math> values. <br />
<br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(6).gifFile:CodeCogsEqn (6).gif2015-05-25T03:25:49Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T03:10:45Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
<br />
These conditions, developed in 1969 by Philip Wolfe, are an inexact line search stipulation that requires <math>\alpha_k</math> to decreased the objective function by significant amount. This amount is defined by <br />
<br />
[[File:CodeCogsEqn (4).gif]]<br />
<br />
where <math>c_1</math> is between 0 and 1. Another way of describing this condition is to say that the decrease in the objective function should be proportional to both the step length and the directional derivative of the function and step direction. This inequality is also known as the ''Armijo condition''. In general, <math>c_1</math> is a very small value, ~<math>10^-4</math>.<br />
<br />
The ''Armijo condition'' must be paired with the ''curvature condition'' <br />
<br />
[[File:CodeCogsEqn (5).gif]]<br />
<br />
to keep the <math>\alpha_k</math> value from being too short. In this condition, <math>c_2</math> is greater than <math>c_1</math> but less than 1<br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
=Conclusion=<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.<br />
<br />
4. Wolfe P (1969) Convergence Conditions for Ascent Methods. SIAM Review 11(2):226-235.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(5).gifFile:CodeCogsEqn (5).gif2015-05-25T02:59:14Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(4).gifFile:CodeCogsEqn (4).gif2015-05-25T02:52:17Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-25T02:28:39Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
A common and practical method for finding a suitable step length that is not too near to the global minimum of the <math>\phi</math> function is to require that the step length of <math>\alpha_k</math> reduces the value of the target function, or that<br />
<br />
[[File:CodeCogsEqn (3).gif]]<br />
<br />
This method does not ensure a convergence to the function's minimum, and so two conditions are employed to require a ''significant decrease condition'' during every iteration. <br />
<br />
==Wolfe Conditions==<br />
<br />
<br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(3).gifFile:CodeCogsEqn (3).gif2015-05-25T02:21:53Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T16:37:30Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Figure 1: Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:Step_length_image.pngFile:Step length image.png2015-05-24T16:17:15Z<p>Epc257: Epc257 uploaded a new version of &quot;File:Step length image.png&quot;</p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T16:16:35Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
[[File:CodeCogsEqn (2).gif]],<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T16:15:58Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. To select the ideal step length, the following function could be minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k+\alpha p_k), \alpha >0 </math>,<br />
<br />
but this is not used in practical settings generally. This may give the most accurate minimum, but it would be very computationally expensive if the function <math>\phi</math> has multiple local minima or stationary points, as shown in Figure 1.<br />
<br />
[[File:Step length image.png|frame|Complexity of finding ideal step length (Nocedal & Wright)]]<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:Step_length_image.pngFile:Step length image.png2015-05-24T16:14:18Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn_(2).gifFile:CodeCogsEqn (2).gif2015-05-24T16:04:15Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:Conger_eqn_2.gifFile:Conger eqn 2.gif2015-05-24T16:02:31Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T16:00:23Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
Choosing an appropriate step length has a large impact on the robustness of a line search method. If the step length is too large, the line search will not produce an accurate or substantial minimization of the target function. If the step length is too small, the method would be too computationally expensive to complete in a reasonable about of time. To select the correct step length, the following function is minimized:<br />
<br />
<math>\phi(\alpha) = f(x_k+\alpha p_k), \alpha >0 </math><br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T15:54:49Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]],<br />
<br />
in which <math>\alpha_k</math> is a positive scalar known as the step length and <math>p_k</math> defines the step direction. <br />
<br />
=Step Length=<br />
<br />
=Step Direction =<br />
<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T15:38:07Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased by the following iteration scheme<br />
<br />
[[File:CodeCogsEqn.gif]]<br />
<br />
=Step Length=<br />
<br />
=Step Direction =<br />
jadklfjlasjfkladsl'''kfdsklf'''<br />
dfadjfkhdakjfhadskj<br />
fahdfkjadshf''kahdfjsdk'' [1]<br />
[https://www.youtube.com/ Youtube Site]<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
[[File:Chemicals.jpg]]<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn.gifFile:CodeCogsEqn.gif2015-05-24T15:36:53Z<p>Epc257: Epc257 uploaded a new version of &quot;File:CodeCogsEqn.gif&quot;</p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/File:CodeCogsEqn.gifFile:CodeCogsEqn.gif2015-05-24T15:35:53Z<p>Epc257: </p>
<hr />
<div></div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T14:48:10Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen, and the value of <math>f(x_k)</math> is calculated. To find a lower value of <math>f(x)</math>, the value of <math>x_{k+1}</math> is increased <br />
<br />
<br />
=Step Length=<br />
<br />
=Step Direction =<br />
jadklfjlasjfkladsl'''kfdsklf'''<br />
dfadjfkhdakjfhadskj<br />
fahdfkjadshf''kahdfjsdk'' [1]<br />
[https://www.youtube.com/ Youtube Site]<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
[[File:Chemicals.jpg]]<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T14:45:38Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math> is chosen, and the value of <math>f( x_k )</math> is calculated. To find a lower value of <math>f(x)</math>, the value of <math>x_k+1</math> is increased <br />
<br />
<br />
=Step Length=<br />
<br />
=Step Direction =<br />
jadklfjlasjfkladsl'''kfdsklf'''<br />
dfadjfkhdakjfhadskj<br />
fahdfkjadshf''kahdfjsdk'' [1]<br />
[https://www.youtube.com/ Youtube Site]<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
[[File:Chemicals.jpg]]<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T14:31:48Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x_k</math><br />
<br />
<br />
=Step Length=<br />
<br />
=Step Direction =<br />
jadklfjlasjfkladsl'''kfdsklf'''<br />
dfadjfkhdakjfhadskj<br />
fahdfkjadshf''kahdfjsdk'' [1]<br />
[https://www.youtube.com/ Youtube Site]<br />
<br />
==Steepest Descent Method==<br />
==Newton Method==<br />
==Quasi-Newton Method==<br />
[[File:Chemicals.jpg]]<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.<br />
<br />
3. Nocedal, J. & Wright, S. (2006) Numerical Optimization (Springer-Verlag New York, New York) 2 Ed p 664.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T13:48:31Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
An algorithm is a line search method if it seeks the minimum of a defined nonlinear function by selecting a reasonable direction vector that, when computed iteratively with a reasonable step size, will provide a function value closer to the absolute minimum of the function. Varying these will change the "tightness" of the optimization. For example, given the function <math>f(x)</math>, an initial <math>x</math><br />
<br />
<br />
==Section 1.1==<br />
==Section 1.2==<br />
jadklfjlasjfkladsl'''kfdsklf'''<br />
dfadjfkhdakjfhadskj<br />
fahdfkjadshf''kahdfjsdk'' [1]<br />
[https://www.youtube.com/ Youtube Site]<br />
<br />
=Section 2=<br />
[[File:Chemicals.jpg]]<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Section 3=<br />
<math>E=mc^2</math><br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
[25 20 15]<br />
<br />
<br />
=References=<br />
1. Sun, W. & Yuan, Y-X. (2006) Optimization Theory and Methods: Nonlinear Programming (Springer US) p 688.<br />
<br />
2. Anonymous (2014) Line Search. (Wikipedia). http://en.wikipedia.org/wiki/Line_search.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Computational_complexityComputational complexity2015-05-24T05:26:18Z<p>Epc257: </p>
<hr />
<div>Authors: David O'Brien, David Chen, Mark Caswell.<br />
<br />
=Introduction=<br />
[[File:Sorting_bubblesort_anim.gif|thumb|right|350px|Bubblesort animation. One of many sorting algorithms]]<br />
Computational complexity refers to the amount of resources required to solve a type of problem by systematic application of an algorithm. Resources that can be considered include the amount of communications, gates in a circuit, or the number of processors. Because the size of the particular input to a problem will affect the amount of resources necessary, measures of complexity will have to take into account this difference. As such, they can be reported as a function of the input size. <br />
As a basic example, consider the problem of sorting an array of random numbers. Take two arrays of different sizes. Intuitively, the larger one will require more steps when using the same method.<br />
Additionally, comparisons of complexity across a range of input sizes may not be consistent – that is, an algorithm may be considered to be less complex than others for small inputs, but the same comparison may not hold for larger input sizes. Depending on starting points, iterative application of algorithms may not cover the same trajectory to the same solution even if the input to the problem is identical. One way of addressing this variability is to compare the upper and lower bounds of complexity (worst and best case scenarios, respectively) and average values. <br />
Returning to our example, consider the case when the arrays are randomly populated. There is a probability, however small, that either or both of the arrays will be populated already sorted. In this case, it is possible that the larger array needs less sorting than the smaller one.<br />
<br />
<br />
Sorting algorithms are a good introduction to the idea of computational complexity. The problem is intuitive, and there are many different algorithms of varying complexity that can elucidate the comparisons being made, and there are many useful illustrations of the different mechanisms of sorting on the world wide web.<br />
<br />
=Machine Models=<br />
The simplest model is a deterministic Turing machine, which consists of two units: control and memory. The control unit has a finite number of states whereas the memory units has an infinite number of states in both directions of infinity. Such Turing machine ''M'' can be defined by following characteristics<sup>4</sup>:<br />
<br />
# finite set ''Q'' <math>\forall</math>states<br />
# initial state ''q''<sub>''0''</sub><math>\in</math>''Q''<br />
# subset ''F''<math>\subseteq</math>''Q'' of accepting states<br />
# finite set <math>\sum</math>''inputs''<br />
# finite set ''A''<math>\in\sum</math> tape units<br />
# Partial partition function<br />
<br />
In our example of sorting number arrays, we did not consider the resources needed to completely traverse the arrays. Again, consider our example, using the following sorting mechanism: At each array location, compare the current value with the value in the next location. If they are out of order, swap them. Do this at every location until the penultimate location, and start again from the beginning. Repeat until a pass through the array is completed in which no swaps are made, meaning everything is finally sorted. There are more efficient algorithms such as mergesort, heapsort, and quicksort. The comparison between our simple algorithm and the improved ones is analogous to comparing a Turing machine to other computational models. The standard Turing machine has to take time to access all the locations on its way between two locations. This is in contrast to a random access Turing machine, in which the intermediate elements can be skipped.<br />
<br />
=Quantifying Complexity=<br />
When discussing complexity, “time” is often used as the basis for comparison. However, the “time” used is not absolute time, but rather an indication of the number of steps required to solve a problem. Because hardware is constantly improving, the absolute time required to solve a given problem will decrease, even if no change is made to the algorithm. The complexity of a problem will be identical, as it is the number of steps required, but the processing is faster on modern machines because of their processing power. Complexity should thus be quantified in a manner that is independent of hardware. It will, however, necessarily depend on the methods used, namely algorithm and machine model. A meaningful way of quantifying complexity is to state how the time scales with the size of the input.<br />
=P=<br />
The set of problems P contains problems with whose solution-time upper bound scales as a polynomial function of the input size. Other time classes include linear time, quadratic time, or exponential time. <br />
<br />
Cobham's thesis states that problems in P are "efficiently solvable" or "tractable." There are exceptions, but in general this is usually true.<br />
<br />
Examples of polynomial time algorithms: the "quicksort" algorithm and basic arithmetic operations such as addition, subtraction, multiplication, division, and comparison.<br />
<br />
=NP=<br />
[[File:800px-P_np_np-complete_np-hard.svg.png|thumb|right|350px|Classification of P and NP sets]]<br />
The set of problems NP contains problems whose solutions can be verified within a polynomially scaled upper bound. That does not mean they can be absolutely solved in polynomial time, but given a potential solution, its verity can be confirmed or denied in polynomial time. <br />
==NP-Hard==<br />
A problem is NP-hard (nondeterministic polynomial time-hard) if it it can be obtained from a NP-complete problem that is polynomial time Turing-reducible. It can be said to be "at least as hard as the hardest problems in NP."<br />
<br />
Examples of NP-hard problems: Subset sum problem, traveling salesman problem, halting problem (this last one is NP-hard but not NP-complete because all NP problems are decidable in a finite number of operations, whereas this one is not). <br />
==NP-Complete==<br />
A problem is NP-complete (nondeterministic polynomial time-complete) if it belongs to both NP as well as NP-hard. NP-complete problems can be obtained by transforming every other problem in NP in polynomial time. NP-complete problems are of note because there is an apparent correlation between the quick verifications of solutions and quick solving of the problems<br />
<br />
Common approaches to solving NP-complete problems are heuristic algorithms and approximation algorithms.<br />
<br />
Examples of NP-Complete problems: graph isomorphism problem, Boolean satisfiability problem, knapsack problem, Hamiltonian path problem, travelling salesman problem, subgraph isomorphism problem, and more.<br />
<br />
=P vs NP=<br />
A notable comparison is the one between the P and NP classes of problems. P and NP stand for Polynomial Time and Nondeterministic Polynomial Time, respectively. P contains decision problems that can be solved in polynomial time on a deterministic Turing machine. NP, on the other hands, contains all decision problems that, given a potential solution, can be verified in polynomial time on a deterministic Turing machine. Solution of NP problems frequently occurs in exponential time. Polynomial time means that the steps required to solve the problem has an upper bound given by a polynomial function of the input size. <br />
The P vs. NP problem asks whether every problem with solutions that can be verified in polynomial time can also be solved in polynomial time. It is one of the Millennium Prize Problems. The significance of this is that if P and NP are indeed equal, then a polynomial time solution exists for problems whose solutions can be verified in polynomial time rather than the frequently encountered exponential time solutions.<br />
<br />
=Example=<br />
As a concrete example, let us take the subset sum problem. The subset sum problem asks whether a subset of a given set can sum to zero. Take the subset [-3,-1,2,4,8]. Can a subset sum to zero? The answer is yes: [-3,-1,4]. The problem is NP-complete. <br />
<br />
The complexity of the subset problem can be viewed as depending on two parameters, N, the number of decision variables, and P, the precision of the problem - the number of binary place values that it takes to state the problem.<br />
<br />
If N (the number of variables) is small, then an exhaustive search for the solution is practical. If P (the number of place values) is a small fixed number, then there are dynamic programming algorithms that can solve it exactly.<br />
<br />
=Conclusions=<br />
Computational complexity is very important in analysis of algorithms. As problems become more complex and increase in size, it is important to be able to select algorithms for efficiency and solvability. The ability to classify algorithms based on their complexity is very useful. Finally, if you can solve the P vs NP problem, you will win a lot of money.<br />
<br />
<br />
<br />
<br />
=Sources=<br />
1. van Leewuen, Jan. ''Handbook of Theoretical Computer Science: Algorithms and complexity, Volume 1.'' Elsevier, 1990.<br />
<br />
2. Cormen, Thomas H. ''Introduction to Algorithms.'' MIT Press, 2001.<br />
<br />
3. "P vs NP Problem." Clay Mathematics Institute. Accessed May 5, 2014. http://www.claymath.org/millenium-problems/p-vs-np-problem. Last updated May 23, 2014.<br />
<br />
4. Ding-Zhu Du. "Theory of Computational Complexity." Wiley-Interscience, 2010.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Computational_complexityComputational complexity2015-05-24T05:26:01Z<p>Epc257: </p>
<hr />
<div>Authors: David O'Brien, David Chen, Mark Caswell.<br />
<br />
=Introduction=<br />
[[File:Sorting_bubblesort_anim.gif|thumb|right|350px|Bubblesort animation. One of many sorting algorithms]]<br />
Computational complexity refers to the amount of resources required to solve a type of problem by systematic application of an algorithm. Resources that can be considered include the amount of communications, gates in a circuit, or the number of processors. Because the size of the particular input to a problem will affect the amount of resources necessary, measures of complexity will have to take into account this difference. As such, they can be reported as a function of the input size. <br />
As a basic example, consider the problem of sorting an array of random numbers. Take two arrays of different sizes. Intuitively, the larger one will require more steps when using the same method.<br />
Additionally, comparisons of complexity across a range of input sizes may not be consistent – that is, an algorithm may be considered to be less complex than others for small inputs, but the same comparison may not hold for larger input sizes. Depending on starting points, iterative application of algorithms may not cover the same trajectory to the same solution even if the input to the problem is identical. One way of addressing this variability is to compare the upper and lower bounds of complexity (worst and best case scenarios, respectively) and average values. <br />
Returning to our example, consider the case when the arrays are randomly populated. There is a probability, however small, that either or both of the arrays will be populated already sorted. In this case, it is possible that the larger array needs less sorting than the smaller one.<br />
<br />
<br />
Sorting algorithms are a good introduction to the idea of computational complexity. The problem is intuitive, and there are many different algorithms of varying complexity that can elucidate the comparisons being made, and there are many useful illustrations of the different mechanisms of sorting on the world wide web.<br />
<br />
=Machine Models=<br />
The simplest model is a deterministic Turing machine, which consists of two units: control and memory. The control unit has a finite number of states whereas the memory units has an infinite number of states in both directions of infinity. Such Turing machine ''M'' can be defined by following characteristics<sup>4</sup>:<br />
<br />
# finite set ''Q'' <math>\forall</math>states<br />
# initial state ''q''<sub>''0''</sub><math>\in</math>''Q''<br />
# subset ''F''<math>\subseteq</math>''Q'' of accepting states<br />
# finite set <math>\sum</math>''inputs''<br />
# finite set ''A''<math>\in\sum</math> tape units<br />
# Partial partition function<br />
<br />
In our example of sorting number arrays, we did not consider the resources needed to completely traverse the arrays. Again, consider our example, using the following sorting mechanism: At each array location, compare the current value with the value in the next location. If they are out of order, swap them. Do this at every location until the penultimate location, and start again from the beginning. Repeat until a pass through the array is completed in which no swaps are made, meaning everything is finally sorted. There are more efficient algorithms such as mergesort, heapsort, and quicksort. The comparison between our simple algorithm and the improved ones is analogous to comparing a Turing machine to other computational models. The standard Turing machine has to take time to access all the locations on its way between two locations. This is in contrast to a random access Turing machine, in which the intermediate elements can be skipped.<br />
<br />
=Quantifying Complexity=<br />
When discussing complexity, “time” is often used as the basis for comparison. However, the “time” used is not absolute time, but rather an indication of the number of steps required to solve a problem. Because hardware is constantly improving, the absolute time required to solve a given problem will decrease, even if no change is made to the algorithm. The complexity of a problem will be identical, as it is the number of steps required, but the processing is faster on modern machines because of their processing power. Complexity should thus be quantified in a manner that is independent of hardware. It will, however, necessarily depend on the methods used, namely algorithm and machine model. A meaningful way of quantifying complexity is to state how the time scales with the size of the input.<br />
=P=<br />
The set of problems P contains problems with whose solution-time upper bound scales as a polynomial function of the input size. Other time classes include linear time, quadratic time, or exponential time. <br />
<br />
Cobham's thesis states that problems in P are "efficiently solvable" or "tractable." There are exceptions, but in general this is usually true.<br />
<br />
Examples of polynomial time algorithms: the "quicksort" algorithm and basic arithmetic operations such as addition, subtraction, multiplication, division, and comparison.<br />
<br />
=NP=<br />
[[File:800px-P_np_np-complete_np-hard.svg.png|thumb|right|350px|Classification of P and NP sets]]<br />
The set of problems NP contains problems whose solutions can be verified within a polynomially scaled upper bound. That does not mean they can be absolutely solved in polynomial time, but given a potential solution, its verity can be confirmed or denied in polynomial time. <br />
==NP-Hard==<br />
A problem is NP-hard (nondeterministic polynomial time-hard) if it it can be obtained from a NP-complete problem that is polynomial time Turing-reducible. It can be said to be "at least as hard as the hardest problems in NP."<br />
<br />
Examples of NP-hard problems: Subset sum problem, traveling salesman problem, halting problem (this last one is NP-hard but not NP-complete because all NP problems are decidable in a finite number of operations, whereas this one is not). <br />
==NP-Complete==<br />
A problem is NP-complete (nondeterministic polynomial time-complete) if it belongs to both NP as well as NP-hard. NP-complete problems can be obtained by transforming every other problem in NP in polynomial time. NP-complete problems are of note because there is an apparent correlation between the quick verifications of solutions and quick solving of the problems<br />
<br />
Common approaches to solving NP-complete problems are heuristic algorithms and approximation algorithms.<br />
<br />
Examples of NP-Complete problems: graph isomorphism problem, Boolean satisfiability problem, knapsack problem, Hamiltonian path problem, travelling salesman problem, subgraph isomorphism problem, and more.<br />
<br />
=P vs NP=<br />
A notable comparison is the one between the P and NP classes of problems. P and NP stand for Polynomial Time and Nondeterministic Polynomial Time, respectively. P contains decision problems that can be solved in polynomial time on a deterministic Turing machine. NP, on the other hands, contains all decision problems that, given a potential solution, can be verified in polynomial time on a deterministic Turing machine. Solution of NP problems frequently occurs in exponential time. Polynomial time means that the steps required to solve the problem has an upper bound given by a polynomial function of the input size. <br />
The P vs. NP problem asks whether every problem with solutions that can be verified in polynomial time can also be solved in polynomial time. It is one of the Millennium Prize Problems. The significance of this is that if P and NP are indeed equal, then a polynomial time solution exists for problems whose solutions can be verified in polynomial time rather than the frequently encountered exponential time solutions.<br />
<br />
=Example=<br />
As a concrete example, let us take the subset sum problem. The subset sum problem asks whether a subset of a given set can sum to zero. Take the subset [-3,-1,2,4,8]. Can a subset sum to zero? The answer is yes: [-3,-1,4]. The problem is NP-complete. <br />
<br />
The complexity of the subset problem can be viewed as depending on two paramaters, N, the number of decision variables, and P, the precision of the problem - the number of binary place values that it takes to state the problem.<br />
<br />
If N (the number of variables) is small, then an exhaustive search for the solution is practical. If P (the number of place values) is a small fixed number, then there are dynamic programming algorithms that can solve it exactly.<br />
<br />
=Conclusions=<br />
Computational complexity is very important in analysis of algorithms. As problems become more complex and increase in size, it is important to be able to select algorithms for efficiency and solvability. The ability to classify algorithms based on their complexity is very useful. Finally, if you can solve the P vs NP problem, you will win a lot of money.<br />
<br />
<br />
<br />
<br />
=Sources=<br />
1. van Leewuen, Jan. ''Handbook of Theoretical Computer Science: Algorithms and complexity, Volume 1.'' Elsevier, 1990.<br />
<br />
2. Cormen, Thomas H. ''Introduction to Algorithms.'' MIT Press, 2001.<br />
<br />
3. "P vs NP Problem." Clay Mathematics Institute. Accessed May 5, 2014. http://www.claymath.org/millenium-problems/p-vs-np-problem. Last updated May 23, 2014.<br />
<br />
4. Ding-Zhu Du. "Theory of Computational Complexity." Wiley-Interscience, 2010.</div>Epc257https://optimization.mccormick.northwestern.edu/index.php/Line_search_methodsLine search methods2015-05-24T05:24:58Z<p>Epc257: </p>
<hr />
<div>Author names: Elizabeth Conger <br/><br />
Steward: Dajun Yue and Fengqi You<br />
<br />
=Introduction=<br />
Line search methods are a group of algorithms that determine the minimum of a defined multivariable function by selecting a reasonable direction relative to the function that will provide a value closer to the absolute minimum of the function. This process is iterated using defined direction parameters and step sizes of travel. Varying these will change the "tightness" of the optimization. <br />
==Section 1.1==<br />
==Section 1.2==<br />
jadklfjlasjfkladsl'''kfdsklf'''<br />
dfadjfkhdakjfhadskj<br />
fahdfkjadshf''kahdfjsdk'' [1]<br />
[https://www.youtube.com/ Youtube Site]<br />
<br />
=Section 2=<br />
[[File:Chemicals.jpg]]<br />
[[File:48StatesTSP.png|frame|Solution to 48 States Traveling Salesman Problem]]<br />
<br />
=Section 3=<br />
<math>E=mc^2</math><br />
<br />
=Conclusion=<br />
<math>\begin{bmatrix} G(x,y) & 0 & -A(x)^T \\ 0 & Y & W \\ A(x) & -I & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta s \\ \Delta y \end{bmatrix} = \begin{bmatrix} -\nabla f(x) + A(x)^T y \\ \mu e - W Y e \\ -g(x) + s \end{bmatrix}</math><br />
<br />
[25 20 15]<br />
<br />
<br />
=References=</div>Epc257