# Nonconvex Generalized disjunctive programming (GDP)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Author: Kaiwen Li (ChE345 Spring 2015)
Steward: Prof.Fengqi You, Dajun Yue

# Introduction

General disjunctive programming, GDP, is an alternative approach to represent the formulation of traditional Mixed-Integer Nonlinear Programming, solving discrete/continuous optimization problems. By using algebraic constraints, disjunctions and logic propositions, Boolean and continuous variables are involved in the GDP formulation. The formulation process of GDP problem are more intuitive, and the underlying logic structure of the problem can be kept, so that solution can be found more efficiently. GDP has been successfully applied to Process Design and Planning and Scheduling areas.

However, funtions in GDP problem sometimes could be nonconvex. Due to nonconvexites, conventional MINLP algorithms are often trapped in suboptimal solutions. Thus, solutions to nonconvex GDP has been receiving increasing attention.

# Algorithm

In the GDP formulation, some functions might be nonconvex, which will lead to a nonconvex GDP problem. Traditional algorithms for MINLP such as Generalized Benders Decomposition (GBD), or Outer Approximation (OA) will fail to provide a global optimum because the solution of the NLP subproblem can be only a local optimum, while the cuts in master problem may not be valid either. Therefore, in order to get a global optimum, we need to introduce special algorithm for nonconvex GBD problems.
Most of the methods rely on spatial branch and bound method.

## Motivation

In order to find the best design and operation of a system in many problems in engineering, a set of algebraic expressions are used to reduce the problem with continuous and discrete variables. Normally theses problems lead to Mixed-Integer Nonlinear Programming (MINLP). However, in order to represent the behavior of physical, chemical, biological, financial or social systems accurately, nonlinear expressions are often used and leads to a MINLP with a nonconvex solution space. Instead of reaching a global optimal, this may give rise to local solutions that are suboptimal. Meanwhile, most algorithms for nonconvex problems are just particular implementation of the spatial branch and bound framework. Therefore, in order to find the global optimum of a large-scaled nonconvex MINLP models in reasonable computational time, Grossmann and others proposed nonconvex generalized disjunctive programming method.

## General Formulation for GDP

Consider the following Generalized Disjunctive Programming problem, which includes Boolean variables, disjunctions and logic propositions:
$min Z$ $=$ $\sum_{k\in K} c_k$ $+ f(x)$

s.t. $r(x)\le 0$

$\bigvee_{j\in J_k}$ $\begin{bmatrix} Y_{jk} \\ g_{jk}(x) \le 0 \\ c_k = \gamma_{jk} \end{bmatrix}$, $k\in K$

$\Omega(Y) = True$

$0 \le x \le U$

$x \in R^n$, $c_k \in R^1$, $Y_{jk} \in {true, false}$

where, $f:R^n \rightarrow R^1$ is a function of the continuous variables x in the objective function, $g:R^n \rightarrow R^1$ bolongs to the set of global constraints, the disjuctions $k\in K$ are composed of a number of terms $j\in J_k$, which is connected by the OR operator. $Y_{jk}$ is a Boolean varibale, $g_{jk}(x) \le 0$ is a set of inequalities, and $c_k$ is a cost variable. When $Y_{jk}$ is true, $g_{jk}(x) \le 0$ and $c_k$ are enforced.Also, $x$represents continuous variables, with lower and upper bounds.Each term in the disjunctions gives rise to a nonempty feasible region which is generally nonconvex. Also, $\Omega(Y) = True$
are logic propositions for the Boolean variables.

## Overall Procedure

The following flowchart(Fig.1) shows the overall procedure of the proposed two-level branch and bound algorithm.
First, introduce convex underestimators in the non-convex GDP problem (P), and construct the underestimating problem (R). This convex GDP problem is then reformulated as the convex NLP problem (CRP) by using the convex hull relaxation of each disjunction, which generates valid lower bound. Initial upper bound is obtained in step 0, by sloving P-MIP problem. It is a MINLP reformulation of the nonconvex GDP by a standard MINLP method. Then, use upper bound for bound contraction to reduce the feasible region, which is solved as a Bound Contraction Problem (BCP) in step 1. Next in step 2, discrete branch and bound method is applied at the first level to slove problem (CRP). After fixing all Boolean variables, solve the corresponding nonconvex NLP problems for a upper bound by using spatial branch and bound at the second level. Then, problem (CRP) is solved with fixed discrete choice in step 3.

## Detailed Formulation and Models

### Convex Relaxation of GDP

The following reformulation shows the introduction of valid convex underestimating functions to change Problem (P) into a convex GDP problem.

$min Z^R$ $=$ $\sum_{k\in K} c_k$ $+ \bar{f}(x)$

s.t. $\bar{r}(x)\le 0$

$\bigvee_{j\in J_k}$ $\begin{bmatrix} Y_{jk} \\ \bar{g}_{jk}(x) \le 0 \\ c_k = \gamma_{jk} \end{bmatrix}$, $k\in K$

$\Omega(Y) = True$

$0 \le x \le U$
,$c_k \ge 0$
$x \in R^n$, $c_k \in R^1$, $Y_{jk} \in \{ true, false\}$

where $\bar{f}$,$\bar{r}$,$\bar{g}$ are valid convex underestimators so that $\bar{f}(x) \le f(x)$, $\bar{r}(x) \le 0$,$\bar{g}(x) \le 0$ are satisfied if $r(x) \le 0$,$g(x) \le 0$ (see fig.2)

Considering problem (R) is a convex GDP, by replacing each disjunction by its convex hull we can relax problem (R), which generates the folloing convex NLP model:

$min Z^L$ $=$ $\sum_{k\in K}\sum_{j\in J_k} \gamma_{jk}\lambda_{jk} + \bar{f}(x)$

s.t. $\bar{r}(x)\le 0$

$x = \sum_{j \in J_k} \nu_{jk}, \sum_{j \in J_k} \lambda_{jk} = 1, k\in K$

$0 \le \nu_{jk} \le \lambda_{jk}U_{jk}, j \in J_k, k\in K$

$\lambda_{jk}\bar{g}_{jk}(\nu_{jk}/\lambda_{jk}) \le 0, j \in J_k, k\in K$

$A\lambda \le a$
$0 \le x, \nu_{jk} \le U, 0 \le \lambda_{jk} \le 1, j \in J_k, k\in K, (CRP)$

where $\nu_{jk}$ is the disaggregated continuous variable for the $j$th term in the $k$th disjunction and $\lambda_{jk}$ is the corresponding multiplier for each term $j \in J_k$ in a given disjunction $k\in K$
Given the problem (R) yields a lower bound, the problem (CRP) is a relaxation of problem (R).

### Global Upper Bound Subproblem

This is step is to obtain a valid upper bound for problem (P) based on MINLP reformulation of (P) using big-M formulation.
$min Z = \sum_{k\in K}\sum_{j\in J_k} \gamma_{jk}y_{jk} + f(x)$

s.t. $r(x)\le 0$

$g_{jk}(x) \le M_jk(1-y_jk), j \in J_k, k\in K$

$x = \sum_{j \in J_k} y_{jk}= 1, k\in K$

$Ay \le a$
$0 \le x \le U, y_{jk} \in \{0,1\}, j \in J_k, k\in K, (P-MIP)$

where $M_jk$ is the big-M parameter, which provides a valid upper bound to the violation of $g_jk \le 0$ and $U$ is an upper bound to $x$.

### Bound Contraction Procedure

The bound contraction scheme is designed to tighten the lower and upper bound of a given continuous variable $x_i$, which will eliminate the nonoptimal subregions and accelerate the search. This will greatly reduce the difference between lower and upper bounds of the objective function and save computational work.
The contraction scheme is shown by the following NLP problem:

$min/max$ $x_i$

s.t. $\sum_{k\in K}\sum_{j\in J_k} \gamma_{jk}\lambda_{jk} + \bar{f}(x)r(x)\le GUB$

$\bar{r}(x) \le 0$

$x = \sum_{j \in J_k}\nu_{jk}, \sum_{j \in J_k}\lambda_{jk}= 1, k\in K$

$0 \le \nu_{jk} \le \lambda_{jk}U_{jk},j \in J_K, k \in K$

$\lambda_{jk}\bar{g}_{jk}(\nu_{jk}/\lambda_{jk}) \le 0, j \in J_k, k\in K$

$Ay \le a$
$0 \le \nu^{jk} \le U, 0 \le \lambda_{jk} \le 1, j \in J_k, k\in K, (BCP)$

As is shown in the following Fig.3, if the solution point $x_i$ of problem (CRP) does not lie at its bound, then we solve the NLP problem (CRP) and update the upper and lower bounds based on the relative distance from the solution to each bound of to decide the direction of bound contraction. Bound contraction is applied to only the continuous variables. Not much reduction will be made if the contraction is not successful, and then move to next variable.

### Branch and Bound on Boolean Variables

This step is where branch and bound method is applied in the space of the terms of the disjunctions by solving the relaxed convex NLP problem (CRP) at each node. The rule is to select the variable $\lambda_{jk}$ with the largest fractional value in the solution. By creating two child nodes with $\lambda_{jk}=1$ and $\lambda_{jk}=0$, $Y_{jk}$ is fixed in problem (R) respectively. The nodes number should be finite due to finite Boolean variables and the global lower bound increases monotonically.
Upper bound needs to be updated when there is a gap between the solution of this problem and the original nonconvex GDP problem (P). When the feasible solution to problem (R) is obtained and the gap between nonconvex term and convex underestimator in problem (P) and problem (R) is nonzero, all Boolean variables should be fixed and Spatial Branch and Bound Method should be introduced.

### Spatial Branch and Bound Method

Since there is a specific wiki page introducing this method, there is no need to discuss the detailed algorithm here. In this case, by solving problem (CRP), the nonconvex term with the max gap is chosen. Then, choose the middle point of the variable bound as the branching point. Lastly, select the node with lowest objective value. The current node will be fathomed if the lowest objective value of the node is greater than GUB.

Many application shows that tighter lower bounds can be obtained in bilinear and concave problems by applying basic Nonconvex GDP steps, which often leads to a significant reduction of the numbers of nodes in Spatial Branch and Bound method. This is a direct indication of achieving tightening bounds.
Meanwhile, GDPs are often reformulated as an MINLP, which enables us to take advantages of the existing MINLP solvers.

The main problem about this method is that the question how to efficiently implement the strong relaxations within a spatial branch and bound framework in a large scale system still cannot be answered.

# Applications and Examples

## Application: Solution Algorithm for nonconvex GDP problems

### Step 0 Heuristic Search (Nonconvex MINLP)

Use MINLP solvers (such as DICOPT++) to solve the nonconvex problem (P-MIP), and set GUB as the best upper bound and let $(Y^*,x*)$ be the solution.

### Step 1 Bound Contraction (Convex NLP)

1. Initialize the Relative Distance from $x^*_i$ to each bound

$RDL_i = \frac{x^*_i - x^L_i}{x^U_i - x^L_i}, RDU_i = \frac{x^U_i - x^*_i}{x^U_i - x^L_i}$

1. Increase iteration, solve problem (BCP)

Update bound and continue with $x^*_i$ when contraction is greater than $SP_M$. Otherwise select next x.

### Step 2 Branch and Bound on Discrete Variables (Convex NLP)

1. set tolerance $\alpha$ for difference in $Z^L$ and GUB. Set $\varepsilon$ for max gap.
2. Solve problem (CRP) to obtain lower bound $Z^L$. Update lowest lower bound as GLB.
3. Fathom, go to step 3 or Branch on the node.

### Step 3 Spatial Branch and Bound (Convex NLP)

1. Fix all $\lambda_{jk}$ according to solution from step 2
2. Solve problem (CRP) until no open node with $Z^L \le GUB - \alpha$
3. Go to step 2

## Example

### Optimal structure for process system

The system net work is shown in the following Fig4.

Fig4. Superstructure of the process

The formulation for this problem is shown as follows:
$min Z =-35P1A - 30P2B +10F1 + 8F2 + F4A + F4B + 4F5A + 4F5B + 2CF + 50CD$

$s.t. F3A = 0.55F1 + 0.5F2$

$F3B = 0.45F1 + 0.5F2$

$P1A = F8A + F10A + F6A$

$P1B = F8B + F10B + F6B$

$P2A = F9A + F11A + F7A$

$P2B = F9B +F11B+F7B$

$F6A=E6F3A,F6B=F6F3B$

$F7A=E7F3A,F7B=E7F3B$

$E4+D5+D6+D7 = 1$

$P1A \ge 4.0P1B, P2B \ge 3.0P2A$

$P1A+P1B \le 15, P2A +P2B \le 18$

$\begin{bmatrix} YF\\ F4A=E4F3A,F4B=E4F3B\\ 2.5 \le F4A+F4B \le 25\\ F8A=0.85F4A, F8B=0.20F4B\\ F9A=0.15F4A,F9B=0.8F4B\\ CF=2 \end{bmatrix}$ $\or$ $\begin{bmatrix} \lnot YF\\ F4A=F4B=0\\ F8A=F8B=0\\ F9A=F9B=0\\ E4=0\\ CF=0 \end{bmatrix}$

$\begin{bmatrix} YD\\ F5A=E5F3A,F5B=E5F3B\\ 2.5 \le F5A+F5B \le 25\\ F10A=0.975F5A, F10B=0.050F5B\\ F11A=0.025F5A,F11B=0.950F5B\\ CD=25 \end{bmatrix}$ $\or$ $\begin{bmatrix} \lnot YD\\ F5A=F5B=0\\ F10A=F10B=0\\ F11A=F11B=0\\ E5=0\\ CD=0 \end{bmatrix}$
$0 \le CF,CD; F1,F2 \le 25; 0 \le E4,E5,E6,E7 \le 1;$

$Y\in \{true,false\}$

The optimal solution is -510.08, and $F1^* = 8, F2^* = 25, P1^* = 15, P2^* = 18, E^* = (0.108,0.758,0,0.134)$, and $Y^*=(true,true)$, shown as the following Fig5.

Fig5. Optimal Solution

Here is the detailed explanation of the method applied in this example. Bilinear terms are replaced by continuous variables and linear underestimators and overestimators are introduced to construct the relaxed convex GDP problem (R). In step 0, which is mentioned above, DICOPT++ finds the trivial solution of $0.0$ ath first NLP. By solving 14 LP subproblems in step 1, the bound is reduced $54.17%$ for continuous variables. In step 2, discrete branch and bound on problem (CRP) reaches a feasible solution to the relaxed GDP at the third node, where $Y^L=(1,1)$ and objective function of $-661.56$. When switches to Spatial Branch and Bound in step 3, $Y^F=(1,1)$ is fixed. Then after 27 nodes and gap being reduced, upper bound is found to be $-510.08$. Using the upper bound to update GUB value in step 2, a cut to exclude $Y = (true,true)$ is added to the previous node and the problem (CRP) is resolved, which leads to a infeasible solution. Then by closing current node and backtrack, two more nodes are searched and fathomed by the GUB.

# Conclusion

For nonconvex generalized disjunctive programming, specified algorithm can provide a global optimum more efficiently by tightening the lower bound and reduce nodes in Spatial Branch and Bound step, compared with conventional MINLP and GBD algorithms. This method address the increasing need in applications in engineering and other area, where nonlinearity and nonconvexity are essential due to the problem nature.

# References

1. Lee, Sangbum, and Ignacio E. Grossmann. "New algorithms for nonlinear generalized disjunctive programming." Computers & Chemical Engineering 24.9 (2000): 2125-2141.
2. Lee, Sangbum, and Ignacio E. Grossmann. "A global optimization algorithm for nonconvex generalized disjunctive programming and applications to process systems." Computers & Chemical Engineering 25.11 (2001): 1675-1697.
3. Lee, Sangbum, and Ignacio E. Grossmann. "Global optimization of nonlinear generalized disjunctive programming with bilinear equality constraints: applications to process networks." Computers & chemical engineering 27.11 (2003): 1557-1575.
4. Grossmann, Ignacio E., and Juan P. Ruiz. "Generalized Disjunctive Programming: A framework for formulation and alternative algorithms for MINLP optimization." Mixed Integer Nonlinear Programming. Springer New York, 2012. 93-115.