# Interior-point method for NLP

Author names: Cindy Chen
Steward: Dajun Yue and Fengqi You

# Introduction

The interior point (IP) method for nonlinear programming was pioneered by Anthony V. Fiacco and Garth P. McCormick in the early 1960s. The IP method folds the constraints into the objective function by means of a barrier function, whose value grows without bound as an iterate approaches the boundary of the feasible region. This restricts the iterates to the feasible region and results in an efficient algorithm with respect to time complexity.

# Algorithm

To ensure the iterates remain within the feasible region, a perturbation factor, $\mu$, is added to "penalize" close approaches to the boundaries. This approach is analogous to the use of an invisible fence to keep dogs in an unfenced yard: the closer the dog moves to the boundary, the stronger the shock it feels. In the IP method, the amount of shock is determined by $\mu$. A large value of $\mu$ gives the analytic center of the feasible region. As $\mu$ decreases and approaches 0, the optimal value is found by tracing out a central path. With small incremental decreases in $\mu$ at each iteration, the central path is a smooth curve. Following it exactly is accurate, but time consuming and computationally intensive. Instead, Newton's method is often used to approximate the central path for nonlinear programming. Taking one Newton step for each decrease in $\mu$ achieves polynomial time complexity; the iterates then follow a slightly zig-zagging path around the central path and converge to the optimal solution.

The logarithmic barrier function is based on the logarithmic interior function: $B(x, \mu) = f(x) - \mu\sum_{i=1}^m \ln(x_i)$
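As a minimal sketch of how the barrier term shapes the solution, the Python snippet below traces the central path for a one-variable toy problem that is not taken from this article: minimize $(x+1)^2$ subject to $x \ge 0$, whose constrained optimum is $x = 0$. The function names, starting point, and tolerances are illustrative assumptions. For this toy problem the minimizer of $B(x, \mu) = (x+1)^2 - \mu\ln(x)$ has the closed form $x(\mu) = \frac{-1 + \sqrt{1 + 2\mu}}{2}$, which gives a check on the Newton iteration.

```python
import math

def barrier_minimizer(mu, x0=1.0, tol=1e-12, max_iter=100):
    """Minimize B(x, mu) = (x + 1)**2 - mu*ln(x) over x > 0 with damped Newton steps.

    Toy problem chosen for illustration only (an assumption, not from the article).
    """
    x = x0
    for _ in range(max_iter):
        grad = 2.0 * (x + 1.0) - mu / x   # dB/dx
        if abs(grad) < tol:
            break
        hess = 2.0 + mu / x**2            # d2B/dx2, always positive for x > 0
        step = -grad / hess
        # Halve the step until the new iterate stays strictly inside x > 0.
        t = 1.0
        while x + t * step <= 0.0:
            t *= 0.5
        x += t * step
    return x

def closed_form(mu):
    """Exact minimizer of B(x, mu) for the toy problem."""
    return (-1.0 + math.sqrt(1.0 + 2.0 * mu)) / 2.0

# Trace the central path as mu shrinks; x(mu) approaches the optimum x = 0.
mus = (10.0, 1.0, 0.1, 0.01, 0.001)
path = [(mu, barrier_minimizer(mu)) for mu in mus]
for mu, x in path:
    print(f"mu = {mu:<6} x(mu) = {x:.6f}")
```

Each iterate stays strictly positive, and the sequence of minimizers moves toward the boundary only as $\mu$ is reduced, which is the behavior the central path describes.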

# Application

The IP method for NLP has been commonly used to solve Optimal Power Flow (OPF) problems, where a set of nonlinear equations is used to find the optimal operating point of a power network in terms of speed and reliability. To solve these problems, the perturbation factor is used in addition to the usual Karush-Kuhn-Tucker (KKT) conditions.

Starting with a general optimization problem: \begin{align} \text{min} & ~~ f(x)\\ \text{s.t.} & ~~ h(x) = 0\\ & ~~ g(x) \le 0 \\ \end{align}

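Modify the KKT conditions by introducing slack variables $s$ for the inequality constraints and perturbing the complementarity condition with $\mu$: \begin{align} \nabla_x L (x, \lambda_h, \lambda_g) &= 0\\ h(x) &= 0\\ g(x) + s &= 0\\ [\lambda_g] s - \mu e &= 0\\ (s, \lambda_g, \mu) &\ge 0 \end{align}
Here $[\lambda_g]$ denotes the diagonal matrix formed from $\lambda_g$ and $e$ is a vector of ones.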

Solve the nonlinear equations iteratively by Newton's method. First determine $\Delta x$ and $\Delta \lambda_h$ from the reduced linear system.
Next, calculate the steps for the slack variables and the corresponding multipliers with: \begin{align} \Delta s &= -g(x) - s - \nabla g(x) \Delta x\\ \Delta \lambda_g &= -\lambda_g + [s^{-1}] \left( \mu e - [\lambda_g] \Delta s \right) \end{align}
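The two update formulas above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a full solver: $\Delta x$ is taken as given (in practice it comes from the reduced linear system), the function name `slack_and_multiplier_steps` is hypothetical, and the numerical values are made up. The constraint data uses $1 \le x_1 - x_2 \le 7$ rewritten as $g(x) \le 0$, evaluated at the arbitrary point $x = (2, 0)$.

```python
def slack_and_multiplier_steps(g, jac_g, s, lam, dx, mu):
    """Return (ds, dlam) from the Newton update formulas.

    g      : values g_i(x) of the inequality constraints g(x) <= 0
    jac_g  : rows of the Jacobian of g at x (one row per constraint)
    s, lam : current slacks and inequality multipliers (all positive)
    dx     : primal step, assumed already computed elsewhere
    mu     : perturbation factor
    """
    ds, dlam = [], []
    for g_i, row, s_i, lam_i in zip(g, jac_g, s, lam):
        # ds = -g(x) - s - grad g(x) * dx
        ds_i = -g_i - s_i - sum(a * d for a, d in zip(row, dx))
        # dlam = -lam + [s^-1] (mu*e - [lam] ds), applied componentwise
        dlam_i = -lam_i + (mu - lam_i * ds_i) / s_i
        ds.append(ds_i)
        dlam.append(dlam_i)
    return ds, dlam

# Hypothetical data: g1 = 1 - x1 + x2, g2 = x1 - x2 - 7 at x = (2, 0).
g = [-1.0, -5.0]
jac_g = [[-1.0, 1.0], [1.0, -1.0]]
s = [1.0, 5.0]
lam = [0.4, 0.1]
dx = [-0.5, -0.1]
mu = 0.05

ds, dlam = slack_and_multiplier_steps(g, jac_g, s, lam, dx, mu)
print("ds   =", ds)
print("dlam =", dlam)
```

By construction, the returned steps satisfy the linearized complementarity condition $[\lambda_g]\Delta s + [s]\Delta\lambda_g = \mu e - [\lambda_g]s$ componentwise, which is a convenient sanity check when implementing the method.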

To calculate the perturbation factor, $\mu$, use the primal-dual average distance: $\mu = \sigma \cdot \text{pdad} = \sigma \dfrac{\lambda_g^T s}{\text{niq}}$
where $\sigma$ defines the trajectory toward the optimal solution, pdad is the primal-dual average distance, and niq is the number of inequality constraints. $\sigma$ ranges between 0 and 1. At the extremes, $\sigma = 0$ corresponds to the affine-scaling direction, where the optimal point is obtained through the non-perturbed solution of the KKT conditions, and $\sigma = 1$ corresponds to the centralization direction, where a non-optimal solution is found with a primal-dual distance equal to the initial value of $\mu$.

In a conventional primal-dual IP method, a constant value is assigned to $\sigma$ (usually close to 0.1) for all iterations. This results in a search direction weighted roughly 90% toward the optimal point and 10% toward the centralization trajectory.
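The update of $\mu$ described above amounts to a scaled average of the complementarity products. The following sketch assumes plain Python lists for $\lambda_g$ and $s$; the function name `perturbation_factor` and the sample values are illustrative, not part of any particular solver.

```python
def perturbation_factor(lam_g, s, sigma=0.1):
    """Return mu = sigma * (lam_g^T s) / niq for equal-length sequences."""
    niq = len(s)  # number of inequality constraints
    if niq == 0 or len(lam_g) != niq:
        raise ValueError("lam_g and s must be non-empty and of equal length")
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    pdad = sum(l * v for l, v in zip(lam_g, s)) / niq  # primal-dual average distance
    return sigma * pdad

# Hypothetical multipliers and slacks for two inequality constraints.
lam_g = [1.0, 1.0]
s = [0.5, 6.0]

mu_default = perturbation_factor(lam_g, s)             # conventional sigma = 0.1
mu_affine = perturbation_factor(lam_g, s, sigma=0.0)   # affine-scaling extreme
mu_center = perturbation_factor(lam_g, s, sigma=1.0)   # centralization extreme
print(mu_default, mu_affine, mu_center)
```

With $\sigma = 0$ the perturbation vanishes and the step targets the unperturbed KKT conditions, while $\sigma = 1$ keeps $\mu$ equal to the current average distance, so the step only recenters.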

# Illustrative Example

Perform one iteration of the IP method to solve the following NLP: \begin{align} \text{min} & ~~ f = 0.25x_1^2 + x_2^2\\ \text{s.t.} & ~~ 1 \le x_1 - x_2 \le 7\\ \end{align}

To solve, first form the Lagrange function and its stationarity condition: \begin{align} L(x,\lambda) &= f(x) + \lambda F(x) = 0.25x_1^2 + x_2^2 + \lambda(x_1 - x_2)\\ U(x,z) &= f_x^T (x) + zF_x^T (x) = 0\\ F(x) &= x_1 - x_2\\ f_x &= \begin{bmatrix} 0.5x_1 & 2x_2 \\ \end{bmatrix}\\ S = U_x (x,z) &= f_{xx} + \sum_{i=1}^3 z_i f_{xx} (x) = \begin{bmatrix} \lambda^T (1-x_2)+1 & \lambda^T (x_1-1)+1 \\ \lambda^T (1-x_2)-1 & \lambda^T (x_1-1)-1 \\ \end{bmatrix} \end{align}

Using the initial solution $x=[1,0], \lambda = -2, k = 3$: $S = \begin{bmatrix} 2x_2-1 & 3-2x_1 \\ 2x_2-3 & 1-2x_1 \\ \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ -3 & -1 \\ \end{bmatrix}$

Solving with Newton's Method: $\begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix}^{new} = \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix}^{old} + \begin{bmatrix} \delta x_1 \\ \delta x_2 \\ \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix} + \begin{bmatrix} 5.25 \\ -0.25 \\ \end{bmatrix} = \begin{bmatrix} 6.25 \\ -0.25 \\ \end{bmatrix}$

So after one iteration: $f(x_1, x_2) = f(6.25,-0.25) = 0.25(6.25)^2 + (-0.25)^2 \approx 9.828$

Over further iterations, the perturbation factor $\mu$ is driven toward 0 and the iterates approach the true solution.
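For reference, the example problem can be solved by hand: the unconstrained minimizer $(0, 0)$ violates $x_1 - x_2 \ge 1$, so that bound is active, and stationarity along $x_1 - x_2 = 1$ gives $x = (0.8, -0.2)$ with $f = 0.2$. The sketch below reaches the same point with a plain log-barrier method in pure Python. It is not the primal-dual scheme described in the Application section, and the starting point $(4, 0)$, the shrink factor for $\mu$, and the backtracking rule are all assumptions chosen for illustration (a barrier method needs a strictly feasible start, which $x = [1, 0]$ is not).

```python
import math

def f(x1, x2):
    return 0.25 * x1**2 + x2**2

def barrier(x1, x2, mu):
    """B = f - mu*[ln(x1 - x2 - 1) + ln(7 - x1 + x2)]; +inf outside the interior."""
    a = x1 - x2 - 1.0
    b = 7.0 - x1 + x2
    if a <= 0.0 or b <= 0.0:
        return math.inf
    return f(x1, x2) - mu * (math.log(a) + math.log(b))

def newton_direction(x1, x2, mu):
    """Newton direction for B and the directional derivative along it."""
    a = x1 - x2 - 1.0
    b = 7.0 - x1 + x2
    w = mu * (1.0 / a - 1.0 / b)
    c = mu * (1.0 / a**2 + 1.0 / b**2)
    g1, g2 = 0.5 * x1 - w, 2.0 * x2 + w       # gradient of B
    h11, h12, h22 = 0.5 + c, -c, 2.0 + c      # Hessian of B (positive definite)
    det = h11 * h22 - h12 * h12
    d1 = (-g1 * h22 + g2 * h12) / det         # solves H d = -g
    d2 = (-g2 * h11 + g1 * h12) / det
    return d1, d2, g1 * d1 + g2 * d2

def solve(x1=4.0, x2=0.0, mu=1.0, shrink=0.2, mu_min=1e-9, inner=50):
    while mu > mu_min:
        for _ in range(inner):
            d1, d2, slope = newton_direction(x1, x2, mu)
            if -slope < 1e-14:                # Newton decrement is negligible
                break
            base = barrier(x1, x2, mu)
            t = 1.0
            # Backtrack until the step is feasible and decreases B sufficiently.
            while t > 1e-12 and barrier(x1 + t * d1, x2 + t * d2, mu) > base + 1e-4 * t * slope:
                t *= 0.5
            if t <= 1e-12:                    # no acceptable step at this mu
                break
            x1, x2 = x1 + t * d1, x2 + t * d2
        mu *= shrink                          # tighten the barrier
    return x1, x2

first_iterate_value = f(6.25, -0.25)          # value after the single step above
x1_opt, x2_opt = solve()
print(first_iterate_value, x1_opt, x2_opt, f(x1_opt, x2_opt))
```

Comparing `first_iterate_value` with the objective at the converged point shows how far a single step still is from the optimum.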

# Conclusion

The IP method was later adapted for linear programming by Karmarkar in 1984. As a polynomial-time linear programming method, it was reported to solve complex linear problems 50 times faster than the simplex method. Multiple solvers use the IP method for nonlinear programming, such as IPOPT and KNITRO, both of which were developed by IEMS professors at Northwestern University. Although successful, the IP method is no longer as popular as it once was, owing to the emergence of competitive methods such as sequential quadratic programming.