Adaptive robust optimization

From optimization
Revision as of 20:29, 7 June 2015 by Wchoe418 (Talk | contribs)

Jump to: navigation, search

Author: Woo Soo Choe (ChE 345 Spring 2015)
Steward: Dajun Yue, Fengqi You



In order to illustrate how Adaptive Robust Optimization works, a numerical example is given in this section. This example involves 3 factories and 5 customers and a detailed information if provided through the table below.


Before solving the problem, the basic set up is as follows.

 & v_j = \min\limits_{i \in O(y)} \big\{c_{ij}\big\} \\
 & w_{ij} 
0 &i \in O(y) \\
max_{i \in C(y)} \big\{(v_j - c_{ij}),0 \big\} &i \in C(y) \\
In this case, v_{ij} are the dual variables associated with the demand constraints and w_{ij} represent the dual variables associated with the setup constraints. Furthermore the dual variable can be represented as u and it means the combination of (v,w). From the proposition, the following Benders cut is derived.

For this specific problem, the Benders cut can be rewritten as follows.
 \beta_y(y)=\sum_{i=1}^m v_j + \sum_{i=1}^n (f_i - \sum_{j=1}^m w_{ij}) y_i

Returning to the problem, we denote factory 1, factory 2, and factory 3 as  y_1, y_2, y_3 and to start the problem, we only assume factory 1 is open and in this case,  v_j = min_{i \in O(y)} \big\{ c_{ij} \big\} , j=1,...,m would become  v_j=(2,3,4,5,7). Based on the proposition, w_{ij} may be found as follows. 
w_{1j}=0 \\
w_{2j}=(0,0,3,3,1) \\

Then, solving Benders Cut, we get the following result.

\beta_y(y)=\sum_{i=1}^m v_j + \sum_{i=1}^n (f_i - \sum_{j=1}^m w_{ij}) y_i \\
\beta_y(y)=2+3+4+5+7+(2-0)y_1+(3-(3+3+1))y_2+(3-(2+4+4+4))y_3 \\


In order to investigate how Adaptive Robust Optimization problem, numerous techniques may be used. However, given the scope of this page, only three of the techniques will be introduced. The three algorithms are Bender's Decomposition, Trilevel Optimization, and column-and-Constraint Generation Algorithm and for the Benders Decomposition and Trilevel . When using Benders Decomposition approach, the algorithm essentially breaks down the original problem into the outer and inner problems. Once the problem is divided into two parts, the outer problem is solved using the Benders Decomposition and the inner problem is solved using the Outer Approximation. The detailed steps are as follows.

Benders Decomposition

The Outer Problem: Benders Decomposition
Step 1: Initialize, by denoting the lower bound as LB = - \infty and the upper bound as UB=\infty and set the iteration count as C=0. Then choose the termination tolerance \epsilon.

Step 2: Solve the master problem

\max_{x,\zeta} &c^T x + \zeta &\\
\text{s.t.} &Fx \le f &\\
&{\zeta} \ge -h^T \alpha_l + (Ax-g)^T \beta_l + d_l^T \lambda_l , \forall \le C
In this case,  (x_c, \zeta_c)  denote the optimum solution.

Step 3: Update the lower bound LB=c^T x_c + \zeta_c
Step 4: Increase  C , the iteration count by 1
Step 5: Solve  I(x_c), the inner problem and denote the optimal solution as (d_c, \alpha_c, \beta_c, \lambda_c) . Update UB=min(UB, c^T x_c + I(x_c)), where UB stands for the upper bound.
Detailed procedure of Step 5 is as follows.

if  UB-LB \ge \epsilon then
Go to step 2
Calculate  y_c, the dispatch variable given x_c and  d_c

The Inner Problem : Outer Approximation

Step 1: Initialize by using the commitment decision from the outer problem x_c from the outer problem. Then, find an initial uncertainty realization  d_1 \in \mathbb{D}, set the lower bound LOA = - \infty and the upper bound  UOA = \infty , set iteration count j=1 and then termination tolerance which is denoted as  \theta
Step 2: Solve the sub-problem below.

\ S(x_c,d_j) = max -h^T \alpha + (Ax_c - g ) ^T \Beta + d_j^T \lambda &\\
\text{s.t.}& -H^T \alpha - B^T \beta + J^T \lambda = b &\\
&alpha  \ge 0, \beta \ge 0 &\\
In the inner problem, the optimal solution is denoted as  (\alpha_j, \beta_j, \lambda_j) . Furthermore, define  d_j^T \lambda_j as L_j(d_j,\lambda_j) + (d-d_j)^T \lambda_j + (\lambda - \lambda_j)^T d_j . Then, update  LOA = max (S(x_c,d_j)), LOA)
Step 3: Solve the master problem

\ U(d_j, \lambda_j) = max -h^T \alpha + (Ax_c-g)^T \beta + \zeta &\\
\text{s.t} &\zeta \le L_i (d_i, \lambda_i), \forall \le j &\\
&-H^T \alpha -B^T \beta +J^T \lambda =b &\\
&Dd \le k &\\
&\alpha \ge 0 , \beta \ge 0 &\\
Increase the iteration of j by 1. While the optimal solution is denoted as d_j, \alpha_j, \beta_j, \lambda_j, \zeta)</maht>, update the upper bound as <math>UOA = min(UOA, U(d_j, \lambda_j))
if  UB - LB \ge 0 then <br>
Go to Step 2 <br>
Return optimal solution as the output<br>
As seen from the algorithms, Benders Decomposition divides an Adaptive Robust Optimization problem into outer and inner problems and solves them using two algorithms. While this may cause some confusion for ones who have no previous exposures to Benders Decomposition, the approach to solving the outer problem is also called the Benders Decomposition and the approach to solving the inner problem is called the Outer Approximation method. Fundamentally, this algorithm works by first solving the outer problem until <math> UB -LB \ge \epsilon condition is met and then use the x. \zeta from the outer problem to plug into the inner problem and solve for the optimum solution until  UB -LB \ge \ 0 condition is met. This method has an advantage over traditional Robust Optimization in a sense that it does not sacrifice as much optimality in the solution at the cost of obtaining a conservative answer. Unfortunately, Benders Decomposition method has three problems. First problem is the fact that the master problem relies on the dual variables of the inner and outer problems, which means that the sub-problems cannot have integer variables. Second problem is that the solution does not guarantee a global optimal solution, and it means that the algorithm may not return the absolute worst case scenario before returning the solution. Third problem is that it takes a long time to compute the answer and this might pose a problem when solving a large scale problem.
In order to resolve this issue, another algorithm called Trilevel Optimization was proposed by Bokan Chen of Iowa University. Before iterative Trilevel Optimization algorithm applied, the problem needs to be reformulated in an appropriate form as shown below.

Failed to parse(PNG conversion failed; check for correct installation of latex and dvipng (or dvips + gs + convert)): \begin{array}{llr} \min\limits_x c^T x + b^T y &\\ \text{s.t} &Fx \ le f &\\ &max_d & b^T y &\\ &s.t &Dd \le k &\\ &min\limits_y b^T y &\\ &s.t &Ax + By \le g &\\ &Hy \le h &\\ &Jy = d &\\ \end{array}

Equation  Fx \ le f represents the first constraints on the first stage commitment variables and equation  Dd \le k represents the uncertainty set of demand. Equation  Ax + By \le g represents the constraints that couples the commitment and dispatch variables. Hy \le h constrains the dispatch variable and Jy =d is the constraint that couples the demand variables and the dispatch variables. Then, the reformulated model can be further refined into the following model.

\min_{x,\phi} c^T x + \phi &\\
\text{s.t.} Fx \le f &\\
& \phi \ge b^T y, \forall y \in \mathbb{Y_D}

When \mathbb{Y_D} = \big\{ y| Ax + By \le g Hy \le h \big\} \cap \big\{y | Jy = d, d \in \mathbb{D} \big\}. Assuming \Omega \subset \mathbb{D}, we have \mathbb{Y_\Omega} = \big\{ y| Ax + By \le g Hy \le h \big\} \cap \big\{y | Jy = d, d \in \Omega \big\}. This implies \mathbb{Y}_\Omega \subset \mathbb{Y_D}. This relaxes the problem into the following form.

\min_{x,\phi} c^T x + \phi &\\
\text{s.t.} &Fx \le f &\\
&\phi \ge b^T y, \forall y \in \mathbb{Y_\Omega}

This allows the trilevel problem to be split into the master problem and a sub-problem. Following is the relaxation of the master problem M of the trilevel problem as follows.

\min_{x,\phi} c^T x + \phi &\\
\text{s.t.}&Fx \le f &\\
&\phi \ge b^T y, \forall y \in \mathbb{Y_D}

The master problem M is a relaxation of the trilevel problem as follows: 
\min_{x,\phi} c^T x + \phi &\\
\text{s.t.}&Fx \le f &\\
&\phi \ge b^T y^i, \forall i = 1,...,| \Omega | &\\
&H y^i \le h, \forall i = 1, ..., \ \Omega | &\\
&Ax + By^i \le g, \forall i = 1,..., | \Omega | &\\
&J y^i = d^i, \forall i = 1,..., | \Omega | 

Following is the bilevel sub-problem which yields the dispatch cost under the worst-case scenario 
\max\limits_d b^T y &\\
\text{s.t.} &Dd \le k &\\
&min_y b^T y &\\
&s.t. Hy \le h &\\
&By \le g - Ax &\\
&Jy = d &\\

An Iterative Algorithm For The Trilevel Optimization Problem Optimization Problem Step 1: Initialize by denoting lower bound as  LB= -\infty and the upper bound as UB= \infty. Then create and empty set  \Omega .
Step 2: Solve the master problem M. Where the solution of the problem is (x^M, \zeta^M, y^M). Then update the lower bound of the algorithm LB=max(LB, c^Tx^M+\zeta^M). Step 3: Solve the sub-problem S(x^M). The solution to the problem is (y^S, d^S). Update the upper bound which is UB=min(UB, c^Tx^M+b^Ty^S) and the set \Omega = d^8 \cap \Omega.
if Failed to parse(unknown function '\g'): UB-LB \g 0


Go to Step 2
else Find argmax_i b^T y^i. Calculate the total cost \zeta = c^T x^M + b^T y^i, return the optimal solution as x^M, d^i, y^i, \zeta

As seen from the iterative procedure, Trilevel Optimization also breaks an optimization problem into smaller parts and use iterative algorithm to close in the difference between the upper and the lower bound. However, the Trilevel Optimization addresses the issues with the Benders Decomposition approach.

Model Formulation

Adaptive Robust Optimization implements different techniques to improve on the original static robust optimization by incorporating multiple stages of decision into the algorithm. Currently, in order to minimize the complexity of algorithm, most of the studies on adaptive robust optimization have focused on two-stage problems. Generally, Adaptive Robust Optimization may be formulated in various different forms but for simplicity, Adaptive Robust Optimization in convex case was provided.

\max\limits_{x\in \mathit{S}}   &f(x) + \max\limits_{b\in \mathit{B}} Q(x,b) &\\

In the equation x is the first stage variable and y is the second stage variable, where S and Y are all the possible decisions, respectively.b represents a vector of data and when \mathit{B} represents uncertainty set.

In order for the provided convex case formulation to work, the case must satisfy five conditions:
1.  \mathit{S} is a nonempty convex set
2. f(x) is convex in x
3. \mathit{Y} is a nonempty convex set
4. h(y) is convex in y
5. For all i=1,...,n, H_i (x,y,b) is convex in (x,y), \forall b \in \mathit{B}

Clearly, not every Adaptive Robust Optimization problem may be solved using exactly one model. However, key features that need to be present in a model of Adaptive Robust Optimization are the variables which respectively represent the multiple stages, uncertainty sets whether in ellipsoidal form, polyhedral form, or other novel way, and general layout of the problem which solves for the minimum loss at the worst case scenario. Furthermore, another key feature is that second stage variables are not known. Another form of Adaptive Robust Optimization formulation is provided below.

\ min_x   c^T x + \max\limits_{d\in \mathbb{D}} \min\limits_{y\in {\Omega}}  b^T y &\\
\text{s.t.} Fx \le f &\\
&{\Omega} (x,d)= \big\{y: Hy \le h, Ax+By \le g, Jy=d \big\} &\\
&\mathbb{D} = \big\{ d: Dd \le k \big\}

Similarly as in the first formulation provided, x and y represent the first stage variable and the second stage variable respectively. In this case the,  \mathbb{D} is the polyhedron uncertainty set of demand  d and  \Omega represents the uncertainty set for the second stage variable y. In this case, H, A, B, g, J, D, and k are numerical parameters which could represent different parameters under different circumstances.


Traditionally, robust optimization has solved problems based on static decisions which are predetermined by the decision makers. Once the decisions were made, the problem was solved and whenever a new uncertainty was realized, the uncertainty was incorporated to the original problem and the entire problem was solved again to account for the uncertainty.[1] Generally, robust optimization problem is formulated as follows.


In the equation x\epsilon\mathbb{R}^n is a vector of decision variables and f_o,f_iare functions and are the uncertainty parameters which take random value in the uncertainty sets Failed to parse(unknown function '\subseteqmathbb'): \mathcal{U}_i\subseteqmathbb{R}^k . When robust optimization is utilized to solve a problem, three implicit assumptions are made.
1. All entries need in the decision vectorx get specific numerical values prior to the realization of the actual data.
2. When the real data is within the range of the uncertainty set \mathcal{U}, the decision maker is responsible for the result obtained through the robust optimization algorithm
3. The constraints are hard and the violation of the constraints may not be tolerated when the real data is within the uncertainty set mathcal{U}
The three assumptions grant robust optimization technique immunity from uncertainties. There are other types of optimization techniques such as Stochastic Optimization which may be used to handle problems with uncertainties. However, because Stochastic Optimization has its own drawback because it requires the probability distribution of the events. By having the decision makers make guesses about the probability distribution, Stochastic Optimization method often yield results that are less conservative than the ones by Robust Optimization method.
Robust Optimization certainly may have advantages over other optimization methods, but unfortunately, most robust optimization problems for real life applications require multiple stages to account for uncertainties and traditional static robust has shown limitations. In order to improve the pre-existing technique, Adaptive Robust Optimization was studied and advances in the field was made to address the problems which could not be easily handled with previous methods.[2]