# Adaptive robust optimization

Author: Woo Soo Choe (ChE 345 Spring 2015)
Steward: Professor You, Dajun Yue

## Introduction

Traditionally, robust optimization has solved problems with static decisions that are fixed in advance by the decision makers. Once the decisions were made, the problem was considered solved; whenever a new uncertainty was realized, it was incorporated into the original problem and the entire problem was solved again.[1] Several approaches have been developed in the field of optimization to deal with uncertainty. While numerous techniques exist, one of the most widely studied is Stochastic Optimization, which handles uncertainty by assigning a probability distribution to the uncertain data. Stochastic Optimization has proven useful in certain areas, but it has a couple of drawbacks. First, while one can always assume some probability distribution to make the model work, in real-life applications it is difficult to obtain an accurate one. Second, the stochastic approach does not place much emphasis on minimizing the cost of the worst-case scenario, whereas people making investment or company decisions often need an optimization technique that yields conservative results under uncertainty.

Adaptive Robust Optimization, a field currently led by Aharon Ben-Tal and Dimitris Bertsimas, is an improved version of traditional static robust optimization. Instead of assigning a probability distribution, Adaptive Robust Optimization describes the uncertainty through an uncertainty set, which may be an ellipsoid, a polyhedron, or any other form that best serves the case of interest. Furthermore, it uses the decisions made in the first stage as the basis for later decisions, so that a final answer can be reached even after the uncertainty is realized. Even though Adaptive Robust Optimization is a relatively new field, it has proven useful for questions that arise frequently in business and other real-life applications. This wiki page was created to introduce the topic of Adaptive Robust Optimization to fellow students, with the hope of enriching the ChE 345 experience beyond the scope of what was covered in class.

## Model Formulation

Adaptive Robust Optimization improves on static robust optimization by incorporating multiple stages of decisions into the model. To limit the complexity of the algorithms, most studies on Adaptive Robust Optimization have so far focused on two-stage problems. Adaptive Robust Optimization may be formulated in various forms; for simplicity, the formulation for the convex case is presented here.
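$\begin{array}{ll} \min\limits_{x\in \mathit{S}} & f(x) + \max\limits_{b\in \mathit{B}} Q(x,b) \end{array}$

In this formulation, $x$ is the first-stage variable and $y$ is the second-stage variable, with $\mathit{S}$ and $\mathit{Y}$ the sets of all possible first-stage and second-stage decisions, respectively. $b$ is a vector of uncertain data and $\mathit{B}$ is the uncertainty set. $Q(x,b)$ denotes the optimal value of the second-stage problem, in which $h(y)$ is minimized over $y \in \mathit{Y}$ subject to constraints defined by the functions $H_i(x,y,b)$, $i=1,\ldots,n$.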

In order for the provided convex case formulation to work, the case must satisfy five conditions:
1. $\mathit{S}$ is a nonempty convex set
2. $f(x)$ is convex in $x$
3. $\mathit{Y}$ is a nonempty convex set
4. $h(y)$ is convex in $y$
5. For all $i=1,\ldots,n$, $H_i(x,y,b)$ is convex in $(x,y)$ for every $b \in \mathit{B}$

Clearly, not every Adaptive Robust Optimization problem can be solved with one and the same model. However, the key features that must be present in any such model are variables representing the individual decision stages, an uncertainty set (ellipsoidal, polyhedral, or of some other form), and an overall structure that minimizes the loss under the worst-case scenario. Another key feature is that the second-stage variables are not known in advance; they are determined only after the uncertainty has been realized. Another form of the Adaptive Robust Optimization formulation is provided below.

$\begin{array}{ll} \min\limits_{x} & c^T x + \max\limits_{d\in \mathbb{D}} \min\limits_{y\in \Omega(x,d)} b^T y \\ \text{s.t.} & Fx \le f \\ & \Omega(x,d)= \big\{y: Hy \le h,\ Ax+By \le g,\ Jy=d \big\} \\ & \mathbb{D} = \big\{ d: Dd \le k \big\} \end{array}$

As in the first formulation, $x$ and $y$ represent the first-stage and second-stage variables, respectively. Here $\mathbb{D}$ is the polyhedral uncertainty set of the demand $d$, and $\Omega(x,d)$ is the set of feasible second-stage decisions $y$ for a given first-stage decision $x$ and demand $d$. $F$, $f$, $H$, $h$, $A$, $B$, $g$, $J$, $D$, and $k$ are numerical parameters (matrices and vectors) whose interpretation depends on the application.
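
To make the min-max-min structure concrete, the following is a minimal sketch on a hypothetical one-dimensional instance that is not part of the original page. The first stage buys capacity $x$ at unit cost `c`, the demand $d$ lies in an interval uncertainty set, and the second stage covers any shortfall $y \ge d - x$ at unit cost `b`. All names and numbers are illustrative assumptions. Because the second-stage cost is convex in $d$, the worst case is attained at a vertex of the uncertainty set, so the sketch simply enumerates the two endpoints.

```python
def second_stage_cost(x, d, b):
    """Optimal recourse cost min_y { b*y : y >= d - x, y >= 0 }."""
    return b * max(d - x, 0.0)


def worst_case_cost(x, c, b, vertices):
    """First-stage cost plus the worst second-stage cost over the vertices of D."""
    return c * x + max(second_stage_cost(x, d, b) for d in vertices)


def solve_by_enumeration(c, b, vertices, candidates):
    """Pick the candidate x with the smallest worst-case total cost."""
    return min(candidates, key=lambda x: worst_case_cost(x, c, b, vertices))


# Hypothetical data: capacity cost 1, shortfall cost 3, demand in [2, 5].
c, b = 1.0, 3.0
vertices = [2.0, 5.0]
candidates = [float(x) for x in range(0, 9)]

x_best = solve_by_enumeration(c, b, vertices, candidates)
best_cost = worst_case_cost(x_best, c, b, vertices)
print(x_best, best_cost)  # 5.0 5.0
```

Enumeration is only practical for very small instances; the algorithms in the next section are meant for cases where neither the first-stage decisions nor the vertices of $\mathbb{D}$ can be listed exhaustively.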

## Methodology

Numerous techniques may be used to solve Adaptive Robust Optimization problems. Given the scope of this page, only three are named here: Benders Decomposition, Trilevel Optimization, and the Column-and-Constraint Generation algorithm; detailed steps are given for Benders Decomposition and Trilevel Optimization. The Benders Decomposition approach breaks the original problem down into an outer problem and an inner problem. The outer problem is then solved using Benders Decomposition and the inner problem is solved using Outer Approximation. The detailed steps are as follows.

### Benders Decomposition

#### The Outer Problem: Benders Decomposition
Step 1: Initialize by setting the lower bound $LB = -\infty$, the upper bound $UB = \infty$, and the iteration count $C=0$. Then choose the termination tolerance $\epsilon$.

Step 2: Solve the master problem
$\begin{array}{ll} \min\limits_{x,\zeta} & c^T x + \zeta \\ \text{s.t.} & Fx \le f \\ & \zeta \ge -h^T \alpha_l + (Ax-g)^T \beta_l + d_l^T \lambda_l , \quad \forall l \le C \end{array}$
and denote its optimal solution by $(x_c, \zeta_c)$.

Step 3: Update the lower bound $LB=c^T x_c + \zeta_c$.
Step 4: Increase the iteration count $C$ by 1.
Step 5: Solve the inner problem $I(x_c)$ and denote its optimal solution by $(d_c, \alpha_c, \beta_c, \lambda_c)$. Update the upper bound $UB=\min(UB, c^T x_c + I(x_c))$.
The detailed procedure for Step 5 is given in the description of the inner problem below.

if  $UB-LB \ge \epsilon$ then
Go to step 2
else
Calculate $y_c$, the dispatch variable given $x_c$ and $d_c$
end
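
The outer loop above can be sketched in a few lines. The instance below is a hypothetical toy problem, not taken from the original page: minimize $c x + \max_d \min_y b y$ with $y \ge d - x$, $y \ge 0$ and $d \in [2, 5]$, where $x$ is restricted to a small grid so that the master problem can be solved by enumeration. As a simplifying assumption, the inner problem is solved by enumerating the vertices of the uncertainty set rather than by Outer Approximation, and the bound $\zeta \ge 0$ is added so that the first master problem is bounded. The function names are illustrative.

```python
import math


def solve_inner(x, b, vertices):
    """Worst-case recourse cost I(x), with the maximising demand and dual multiplier."""
    best = (-math.inf, None, None)
    for d in vertices:
        lam = b if d > x else 0.0          # optimal dual of y >= d - x (0 <= lam <= b)
        value = b * max(d - x, 0.0)        # equals lam * (d - x)
        if value > best[0]:
            best = (value, d, lam)
    return best                            # (I(x), d_c, lam_c)


def solve_master(c, cuts, grid):
    """min c*x + zeta  s.t.  zeta >= lam*(d - x) for every cut, zeta >= 0 (assumed)."""
    best = (math.inf, None, None)
    for x in grid:
        zeta = max([0.0] + [lam * (d - x) for d, lam in cuts])
        if c * x + zeta < best[0]:
            best = (c * x + zeta, x, zeta)
    return best                            # (objective, x_c, zeta_c)


def benders(c, b, vertices, grid, eps=1e-9, max_iter=50):
    lb, ub, cuts, history, x_best = -math.inf, math.inf, [], [], None
    for _ in range(max_iter):
        lb, x_c, _zeta_c = solve_master(c, cuts, grid)            # Steps 2-3
        inner_value, d_c, lam_c = solve_inner(x_c, b, vertices)   # Step 5
        if c * x_c + inner_value < ub:
            ub, x_best = c * x_c + inner_value, x_c
        cuts.append((d_c, lam_c))          # new cut for the next master problem
        history.append((lb, ub))
        if ub - lb < eps:                  # termination test
            break
    return x_best, ub, history


x_best, cost, history = benders(c=1.0, b=3.0, vertices=[2.0, 5.0],
                                grid=[float(i) for i in range(9)])
print(x_best, cost, history)
```

On this toy instance the bounds meet after two iterations. In a realistic problem both `solve_master` and `solve_inner` would call an optimization solver.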


#### The Inner Problem: Outer Approximation

Step 1: Initialize with the commitment decision $x_c$ from the outer problem. Then find an initial uncertainty realization $d_1 \in \mathbb{D}$, set the lower bound $LOA = -\infty$ and the upper bound $UOA = \infty$, set the iteration count $j=1$, and choose the termination tolerance $\theta$.
Step 2: Solve the sub-problem below.
$\begin{array}{ll} S(x_c,d_j) = \max\limits_{\alpha,\beta,\lambda} & -h^T \alpha + (Ax_c - g)^T \beta + d_j^T \lambda \\ \text{s.t.} & -H^T \alpha - B^T \beta + J^T \lambda = b \\ & \alpha \ge 0, \ \beta \ge 0 \end{array}$
Denote the optimal solution by $(\alpha_j, \beta_j, \lambda_j)$. Furthermore, define $L_j(d,\lambda) = d_j^T \lambda_j + (d-d_j)^T \lambda_j + (\lambda - \lambda_j)^T d_j$, the linearization of the bilinear term $d^T \lambda$ at $(d_j, \lambda_j)$. Then update $LOA = \max(LOA, S(x_c,d_j))$.
Step 3: Solve the master problem
$\begin{array}{ll} U(d_j, \lambda_j) = \max\limits_{d,\alpha,\beta,\lambda,\zeta} & -h^T \alpha + (Ax_c-g)^T \beta + \zeta \\ \text{s.t.} & \zeta \le L_i (d, \lambda), \quad \forall i \le j \\ & -H^T \alpha -B^T \beta +J^T \lambda = b \\ & Dd \le k \\ & \alpha \ge 0 , \ \beta \ge 0 \end{array}$
Increase $j$ by 1, denote the optimal solution by $(d_j, \alpha_j, \beta_j, \lambda_j, \zeta)$, and update the upper bound as $UOA = \min(UOA, U(d_j, \lambda_j))$.

if $UOA - LOA \ge \theta$ then
Go to Step 2
else
Return the optimal solution as the output
end
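
The linearization used in Steps 2 and 3 of the inner problem can be checked directly. The short derivation below is added as an aid and uses only the symbols defined above: it expands the first-order approximation $L_j$ of the bilinear term $d^T\lambda$ at the point $(d_j, \lambda_j)$ and compares it with $d^T\lambda$ itself.

```latex
\begin{aligned}
L_j(d,\lambda) &= d_j^T\lambda_j + (d-d_j)^T\lambda_j + (\lambda-\lambda_j)^T d_j
                = d^T\lambda_j + d_j^T\lambda - d_j^T\lambda_j, \\
d^T\lambda - L_j(d,\lambda) &= (d-d_j)^T(\lambda-\lambda_j).
\end{aligned}
```

The approximation is exact at $(d_j, \lambda_j)$ and whenever $d = d_j$ or $\lambda = \lambda_j$, but the error term can take either sign elsewhere. This is consistent with the remark below that the method does not guarantee a globally optimal solution.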


As seen from the algorithms, Benders Decomposition divides an Adaptive Robust Optimization problem into an outer problem and an inner problem and solves them with two different algorithms. While this may cause some confusion for those with no previous exposure to Benders Decomposition, the approach to solving the outer problem is itself called Benders Decomposition, and the approach to solving the inner problem is called the Outer Approximation method. Fundamentally, the algorithm iterates on the outer problem until $UB - LB < \epsilon$; in each outer iteration, the current $x_c$ is passed to the inner problem, which is iterated until $UOA - LOA < \theta$. This method has an advantage over traditional Robust Optimization in the sense that it does not sacrifice as much optimality in exchange for a conservative answer. Unfortunately, the Benders Decomposition method has three problems. First, the master problem relies on the dual variables of the inner and outer problems, which means that the sub-problems cannot have integer variables. Second, the method does not guarantee a globally optimal solution, which means that the algorithm may return a solution before finding the absolute worst-case scenario. Third, it takes a long time to compute the answer, which may pose a problem when solving large-scale problems.

In order to resolve these issues, another algorithm called Trilevel Optimization was proposed by Bokan Chen of Iowa State University. Before the iterative Trilevel Optimization algorithm is applied, the problem needs to be reformulated in an appropriate form as shown below.

$\begin{array}{llll} \min\limits_x & c^T x + b^T y & & \\ \text{s.t.} & Fx \le f & & \\ & \max\limits_d & b^T y & \\ & \text{s.t.} & Dd \le k & \\ & & \min\limits_y & b^T y \\ & & \text{s.t.} & Ax + By \le g \\ & & & Hy \le h \\ & & & Jy = d \end{array}$

The constraint $Fx \le f$ restricts the first-stage commitment variables, and $Dd \le k$ defines the uncertainty set of the demand. The constraint $Ax + By \le g$ couples the commitment and dispatch variables, $Hy \le h$ constrains the dispatch variables, and $Jy = d$ couples the demand and dispatch variables. The reformulated model can then be further refined into the following model.

$\begin{array}{llr} \min_{x,\phi} &c^T x + \phi \\ s.t. &Fx \le f \\ & \phi \ge b^T y, \forall y \in \mathbb{Y_D} \end{array}$

Here $\mathbb{Y_D} = \big\{ y \mid Ax + By \le g,\ Hy \le h \big\} \cap \big\{y \mid Jy = d,\ d \in \mathbb{D} \big\}$. For a subset $\Omega \subset \mathbb{D}$, define $\mathbb{Y_\Omega} = \big\{ y \mid Ax + By \le g,\ Hy \le h \big\} \cap \big\{y \mid Jy = d,\ d \in \Omega \big\}$. This implies $\mathbb{Y_\Omega} \subset \mathbb{Y_D}$, so the problem can be relaxed into the following form.

$\begin{array}{llr} \min_{x,\phi} &c^T x + \phi \\ s.t. &Fx \le f \\ &\phi \ge b^T y, \forall y \in \mathbb{Y_\Omega} \end{array}$
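
The reason the restricted problem yields a lower bound on the original one can be written out in one step. This is a short supporting argument added for clarity, using only the sets defined above; the labels "relaxed" and "original" refer to the problems over $\mathbb{Y}_\Omega$ and $\mathbb{Y}_D$, respectively.

```latex
\mathbb{Y}_\Omega \subset \mathbb{Y}_D
\;\Longrightarrow\;
\big\{(x,\phi) : \phi \ge b^T y \ \ \forall y \in \mathbb{Y}_D\big\}
\subseteq
\big\{(x,\phi) : \phi \ge b^T y \ \ \forall y \in \mathbb{Y}_\Omega\big\}
\;\Longrightarrow\;
\min_{\text{relaxed}} \big(c^T x + \phi\big) \;\le\; \min_{\text{original}} \big(c^T x + \phi\big).
```

Enlarging $\Omega$ adds constraints to the relaxed problem, so its optimal value can only increase toward that of the original problem.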

The relaxation allows the trilevel problem to be split into a master problem and a sub-problem. For reference, the trilevel problem in its refined form is restated below.

$\begin{array}{llr} \min_{x,\phi} &c^T x + \phi &\\ s.t. &Fx \le f &\\ &\phi \ge b^T y, \forall y \in \mathbb{Y_D} \end{array}$

The master problem M, which is a relaxation of the trilevel problem, is as follows:
$\begin{array}{lll} \min\limits_{x,\phi} & c^T x + \phi & \\ \text{s.t.} & Fx \le f & \\ & \phi \ge b^T y^i, & \forall i = 1,\ldots,| \Omega | \\ & H y^i \le h, & \forall i = 1,\ldots,| \Omega | \\ & Ax + By^i \le g, & \forall i = 1,\ldots,| \Omega | \\ & J y^i = d^i, & \forall i = 1,\ldots,| \Omega | \end{array}$

The following bilevel sub-problem yields the dispatch cost under the worst-case scenario:
$\begin{array}{lll} \max\limits_d & b^T y & \\ \text{s.t.} & Dd \le k & \\ & \min\limits_y & b^T y \\ & \text{s.t.} & Hy \le h \\ & & By \le g - Ax \\ & & Jy = d \end{array}$

### An Iterative Algorithm for the Trilevel Optimization Problem
Step 1: Initialize by setting the lower bound $LB= -\infty$ and the upper bound $UB= \infty$. Then create an empty set $\Omega$.
Step 2: Solve the master problem M and denote its solution by $(x^M, \phi^M, y^M)$. Then update the lower bound $LB=\max(LB, c^Tx^M+\phi^M)$.
Step 3: Solve the sub-problem $S(x^M)$ and denote its solution by $(y^S, d^S)$. Update the upper bound $UB=\min(UB, c^Tx^M+b^Ty^S)$ and the scenario set $\Omega = \Omega \cup \{d^S\}$.

if $UB-LB > 0$ then
Go to Step 2
else
Find $i = \text{argmax}_i \, b^T y^i$. Calculate the total cost $\zeta = c^T x^M + b^T y^i$ and return the optimal solution $(x^M, d^i, y^i, \zeta)$
end
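
The iteration above can be sketched on a hypothetical toy instance that is not part of the original page. The first-stage capacity $x$ is chosen from a grid, the scalar demand $d$ is taken from a finite list of vertices of $\mathbb{D}$, and the recourse cost is $b \max(d - x, 0)$. The master problem is solved by enumeration with one recourse copy per scenario in $\Omega$, and the bound $\phi \ge 0$ is assumed so that the first master problem (with $\Omega$ empty) is bounded. All names and numbers are illustrative.

```python
import math


def recourse(x, d, b):
    """Second-stage optimum min_y { b*y : y >= d - x, y >= 0 }."""
    return b * max(d - x, 0.0)


def master(c, b, scenarios, grid):
    """min c*x + phi, with phi >= recourse cost for every scenario in Omega (phi >= 0 assumed)."""
    def objective(x):
        return c * x + max([0.0] + [recourse(x, d, b) for d in scenarios])
    x_m = min(grid, key=objective)
    return x_m, objective(x_m)


def subproblem(x, b, vertices):
    """Bilevel sub-problem: the demand vertex with the largest recourse cost."""
    d_s = max(vertices, key=lambda d: recourse(x, d, b))
    return d_s, recourse(x, d_s, b)


def trilevel(c, b, vertices, grid, tol=1e-9, max_iter=50):
    lb, ub = -math.inf, math.inf
    omega, x_best = [], None
    for _ in range(max_iter):
        x_m, master_value = master(c, b, omega, grid)   # Step 2
        lb = max(lb, master_value)
        d_s, worst = subproblem(x_m, b, vertices)       # Step 3
        if c * x_m + worst < ub:
            ub, x_best = c * x_m + worst, x_m
        if d_s not in omega:
            omega.append(d_s)                           # Omega = Omega U {d_s}
        if ub - lb <= tol:
            break
    return x_best, ub, omega


x_best, total, omega = trilevel(c=2.0, b=5.0, vertices=[1.0, 4.0, 6.0],
                                grid=[float(i) for i in range(11)])
print(x_best, total, omega)
```

Unlike the Benders sketch, no dual information is used here: each scenario added to $\Omega$ brings its own copy of the second-stage decision into the master problem, which is why integer recourse variables do not break the scheme.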


## Example

In order to illustrate how Adaptive Robust Optimization works, a numerical example is given in this section. This example involves 3 factories and 5 customers, and detailed information is provided through the table below.

Before solving the problem, the basic setup is as follows.
$\begin{array}{l} v_j = \min\limits_{i \in O(y)} \big\{c_{ij}\big\} \\ w_{ij} = \begin{cases} 0 & i \in O(y) \\ \max \big\{v_j - c_{ij},\, 0 \big\} & i \in C(y) \end{cases} \end{array}$
Here $O(y)$ and $C(y)$ denote the sets of open and closed factories under the decision $y$, $v_j$ are the dual variables associated with the demand constraints, and $w_{ij}$ are the dual variables associated with the setup constraints. The dual variables are written collectively as $u = (v,w)$. From this proposition, the following Benders cut is derived.

$\beta_y(y)=u(b-By)+f^T y$
For this specific problem, the Benders cut can be rewritten as follows.
$\beta_y(y)=\sum_{j=1}^m v_j + \sum_{i=1}^n \big(f_i - \sum_{j=1}^m w_{ij}\big) y_i$

Returning to the problem, let $y_1, y_2, y_3$ indicate whether factory 1, factory 2, and factory 3 are open. To start, we assume that only factory 1 is open, i.e. $y=(1,0,0)$. In this case, $v_j = \min_{i \in O(y)} \big\{ c_{ij} \big\}, \ j=1,\ldots,m$ becomes $v=(2,3,4,5,7)$. Based on the proposition, $w_{ij}$ is found as follows. $\begin{array}{l} w_{1j}=0 \\ w_{2j}=(0,0,3,3,1) \\ w_{3j}=(0,0,2,4,4) \end{array}$
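
The values of $v_j$ and $w_{ij}$ above can be reproduced with a few lines of code. The service-cost matrix `COST` below is an assumption: the data table is not available in this text, so its entries were back-calculated from the $v_j$ and $w_{ij}$ values quoted in this example. The function name `dual_values` is illustrative.

```python
# Assumed service costs c_ij (rows: factories 1-3, columns: customers 1-5),
# back-calculated from the v_j and w_ij values quoted in the text.
COST = [
    [2, 3, 4, 5, 7],
    [4, 3, 1, 2, 6],
    [5, 4, 2, 1, 3],
]


def dual_values(y, cost):
    """Return (v, w) for an open/closed pattern y with at least one open factory."""
    open_rows = [i for i, yi in enumerate(y) if yi == 1]
    n_customers = len(cost[0])
    # v_j: cost of the cheapest open factory for customer j
    v = [min(cost[i][j] for i in open_rows) for j in range(n_customers)]
    # w_ij: zero for open factories, otherwise the saving that closed
    # factory i would offer customer j if it were opened
    w = [
        [0 if y[i] == 1 else max(v[j] - cost[i][j], 0) for j in range(n_customers)]
        for i in range(len(cost))
    ]
    return v, w


v, w = dual_values((1, 0, 0), COST)
print(v)  # [2, 3, 4, 5, 7]
print(w)  # [[0, 0, 0, 0, 0], [0, 0, 3, 3, 1], [0, 0, 2, 4, 4]]
```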

Then, evaluating the Benders cut, we get the following result.
$\begin{array}{l} \beta_y(y)=\sum_{j=1}^m v_j + \sum_{i=1}^n (f_i - \sum_{j=1}^m w_{ij}) y_i \\ \beta_y(y)=(2+3+4+5+7)+(2-0)y_1+(3-(3+3+1))y_2+(3-(2+4+4))y_3 \\ \beta_y(y)=21+2y_1-4y_2-7y_3 \end{array}$
Evaluating the cut at $y=(1,0,0)$ gives an upper bound of 23 on the solution. The Benders cut is then inserted into the master problem, which takes the following form.
$\begin{array}{ll} \min & z \\ \text{s.t.} & z \ge 21+2y_1-4y_2-7y_3 \\ & y \in \mathbb{B}^3 \end{array}$
In the above problem, the optimal solution is $y=(0,1,1)$, meaning it is best to keep factory 1 closed and open factories 2 and 3. This yields an objective value of 10, which is the new lower bound, and $y=(0,1,1)$ becomes the new $y$ for one more iteration of the algorithm. When $v_j$ and $w_{ij}$ are computed again for this solution, the following values are obtained.
$\begin{array}{l} v_j = (4,3,1,1,3) \\ w_{1j} = (2,0,0,0,0)\\ w_{2j}=w_{3j}=0 \end{array}$
Then the Benders cut is calculated again as follows.
$\begin{array}{l} \beta_y(y)=\sum_{j=1}^m v_j + \sum_{i=1}^n (f_i - \sum_{j=1}^m w_{ij}) y_i \\ \beta_y(y)= (4+3+1+1+3)+(2-2)y_1+(3-0)y_2+(3-0)y_3 \\ \beta_y(y)=12+3y_2+3y_3 \end{array}$
From this, we get a new upper bound of 18, and the master problem becomes:
$\begin{array}{ll} \min & z \\ \text{s.t.} & z \ge 21+2y_1-4y_2-7y_3 \\ & z \ge 12+ 3y_2+3y_3 \\ & y \in \mathbb{B}^3 \end{array}$

The solution to this problem is $y=(0,0,1)$. Repeating the iteration process then yields:
$\begin{array}{l} v_j = (5,4,2,1,3) \\ w_{1j} = (3,1,0,0,0) \\ w_{2j} = (1,1,1,0,0) \\ w_{3j} = 0 \end{array}$
Then the Benders cut becomes:
$\begin{array}{l} \beta_y(y)=\sum_{j=1}^m v_j + \sum_{i=1}^n (f_i - \sum_{j=1}^m w_{ij}) y_i \\ \beta_y(y)=(5+4+2+1+3)+(2-4)y_1+(3-3)y_2+(3-0)y_3 \\ \beta_y(y)=15-2y_1+3y_3 \end{array}$
This cut does not give a better upper bound, and the master problem becomes:
$\begin{array}{ll} \min & z \\ \text{s.t.} & z \ge 21+2y_1-4y_2-7y_3 \\ & z \ge 12+ 3y_2+3y_3 \\ & z \ge 15-2y_1+3y_3 \\ & y \in \mathbb{B}^3 \end{array}$
This yields the optimal solution $y=(1,0,1)$, and the new lower bound is 16. When the iteration process is repeated until the upper and lower bounds are the same, we obtain the optimal value of 16 and come to the decision of opening factories 1 and 3 only.
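
The whole iteration can be replayed in code as a check on the arithmetic. This is a sketch under stated assumptions: the service costs `COST` are back-calculated from the $v_j$ and $w_{ij}$ values quoted above (the data table is not available in this text), the fixed costs `FIXED` are read off the cuts, and the master problem is solved by enumerating all open/closed patterns with at least one open factory instead of calling an integer programming solver. The function names are illustrative.

```python
from itertools import product

COST = [[2, 3, 4, 5, 7], [4, 3, 1, 2, 6], [5, 4, 2, 1, 3]]  # assumed c_ij
FIXED = [2, 3, 3]                                            # f_i read off the cuts


def benders_cut(y):
    """Return (constant, coefficients) of the Benders cut generated at pattern y."""
    open_rows = [i for i, yi in enumerate(y) if yi]
    v = [min(COST[i][j] for i in open_rows) for j in range(5)]
    coeffs = []
    for i in range(3):
        # sum_j w_ij is zero for open factories
        w_sum = 0 if y[i] else sum(max(v[j] - COST[i][j], 0) for j in range(5))
        coeffs.append(FIXED[i] - w_sum)
    return sum(v), coeffs


def cut_value(cut, y):
    const, coeffs = cut
    return const + sum(a * yi for a, yi in zip(coeffs, y))


def solve_master(cuts):
    """Enumerate patterns with at least one open factory; minimise the largest cut."""
    patterns = [y for y in product((0, 1), repeat=3) if any(y)]
    return min((max(cut_value(c, y) for c in cuts), y) for y in patterns)


def run(y_start=(1, 0, 0)):
    y, cuts, best_y = y_start, [], None
    ub, lb = float("inf"), float("-inf")
    log = []
    while ub > lb:
        cut = benders_cut(y)
        cuts.append(cut)
        cost_at_y = cut_value(cut, y)   # the cut is tight at the y that generated it
        if cost_at_y < ub:
            ub, best_y = cost_at_y, y
        lb, y = solve_master(cuts)      # new lower bound and next trial pattern
        log.append((cut, ub, lb, y))
    return best_y, ub, log


best_y, total, log = run()
print(best_y, total)  # (1, 0, 1) 16
for cut, ub, lb, y_next in log:
    print(cut, ub, lb, y_next)
```

Under these assumptions the first three cuts and the bounds 23/10, 18/15 and 18/16 match the values in the text, and a fourth cut generated at $y=(1,0,1)$ closes the gap at 16.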

As seen from the iterative procedure, Trilevel Optimization also breaks an optimization problem into smaller parts and uses an iterative algorithm to close the gap between the upper and lower bounds. However, Trilevel Optimization addresses the issues of the Benders Decomposition approach.

## Application

Adaptive Robust Optimization has applications in many fields because it can deal with circumstances in which uncertainties are present. Since uncertainties exist in practically every field, Adaptive Robust Optimization may be used wherever the model can be formulated accurately. In business, problems such as the one presented in the example section may be solved using Adaptive Robust Optimization when decision makers need to know the worst possible outcome in order to make the best decision for the company.

Adaptive Robust Optimization may also be used in electrical engineering. As discussed in Alvaro Lorca's paper, it may be used to handle the daily operational problems of power systems, in which the nodal net electricity loads are uncertain.