# Stochastic programming

Author: Jake Heggestad (ChE 345 Spring 2015)

Stewards: Dajun Yue, Fengqi You

# Introduction

Stochastic programming is an optimization model that deals with optimizing with uncertainty. For example, imagine a company that provides energy to households. This company is responsible for delivering energy to households based on how much they demand. Typically, this problem could be solved as a simpler Linear Program (LP) with constraints based on demand from households. Though this is convenient, future demand of households is not always known and is likely dependent on factors such as the weather and time of year. Therefore, there is uncertainty and our basic LP model will not suffice.

Stochastic programming offers a solution to this issue by eliminating uncertainty and characterizing it using probability distributions. Many different types of stochastic problems exist. The most famous type of stochastic programming model is for recourse problems. This type of problem will be described in detail in the following sections below. However, other forms types of stochastic problems exist, such as the chance-constraint method. In this type of stochastic programming, the constraints to be optimized depend on probabilities. More directly, this means that certain constrains need not be satisfied all the time, but instead only must be true a certain percentage of the time (i.e. 95 percent of the time).

# Mathematical Formulation

Two-Staged Recourse Programs

In recourse problems, you are required to make a decision now, as well as minimize the expected costs of your decision. In this second step, we are able to avoid making the constraints of the problem infeasible. We will examine the two-staged problem below, however it is important to note that these problems can become multidimensional with lots of stages.

The general formulation for two-staged problems is seen below. In this model, as described above, we first make a decision (knowing only the probability distribution of the random element) and then follow up that decision with a correction that will be dependent on the stochastic element of the problem. The objective is then to minimize the 1st stage decision costs, plus the expected cost from the second stage.

$\begin{array}{llr} \min\limits_{x\in \mathbb{R}^n} &g(x)= c^T x + E[Q(x,\xi)] & \\ \text{subject to} & Ax = b &\\ & x \geq 0 & \end{array}$

where $Q(x,\xi)$ is the optimal value of the second-stage problem

$\begin{array}{llr} \min\limits_{y\in \mathbb{R}^m} & q(\xi)^T y & \\ \text{subject to} & T(\xi)x+W(\xi)y = h(\xi) &\\ & y \geq 0 & \end{array}$

In the equations above the $Wy$ term ensures that $Tx \le h$ remains feasible (seen by the fact that it depends on y, the decision variable of the second stage).

Once these expected values have been calculated, the two stage problem can be re-written as one linear program with the form shown below. This is the deterministic equivalent and involves solving for all of the possible scenarios. Ultimately, only one scenario will be chosen and it is based entirely on the costs from stage 1 and the expected value in stage 2.

$\begin{array}{lccccccccccccc} \text{Minimize} & f^T x & + & g^T y & + & p_1h_1^Tz_1 & + & p_2h_2^Tz_2 & + & \cdots & + & p_Kh_K^Tz_K & & \\ \text{subject to} & Tx & + & Uy & & & & & & & & & = & r \\ & & & V_1 y & + & W_1z_1 & & & & & & & = & s_1 \\ & & & V_2 y & & & + & W_2z_2 & & & & & = & s_2 \\ & & & \vdots & & & & & & \ddots & & & & \vdots \\ & & & V_Ky & & & & & & & + & W_Kz_K & = & s_K \\ & x & , & y & , & z_1 & , & z_2 & , & \ldots & , & z_K & \geq & 0 \\ \end{array}$

The deterministic equivalent problem can be solved using solvers such as CPLEX or GLPK, however it is important to note that if the number of scenarios is large, it may take a long time.

# Features of the Problem

## Statistical Methods

In order to deal with the uncertainty aspect of stochastic programming, the future expectations term $E[Q(x,\xi)]$ must be modeled using statistics. One such formulation is shown below were there are K scenarios, each with a specific probability assigned to them that is known

$E[Q(x,\xi)]=\sum\limits_{k=1}^{K} p_kQ(x,\xi_k)$

To make this formulation more concrete, lets consider a simple example. Say there is a newspaper delivery boy who must decide each day how many newspaper he should purchase from the newspaper company so that he can sell them to other consumers. From his past experiences, he has determined that there are 3 scenarios for the demand of newspapers. It will either be, 100 with a probability of 0.5, 150 with a probability of 0.2, or 200 with a probability of 0.3. From this, he must make a decision of how many newspapers to purchase in stage 1. This example is displayed graphically below.

Stochastic Decision Tree. In stage 1, a decision is made based on the probability functions present in stage 2. These trees can have many branches depending on the possible outcomes.

When the number of scenarios for a problem is very large, or even infinite, it becomes convenient to use a technique known is Monte Carlo simulation to calculate the expected value of the second stage. This technique is known as the sample average approximation (SAA). Its formulation can be seen below.

$\hat{q}_N(x) = \frac{1}{N} \sum_{j=1}^N Q(x,\xi^j)$

This technique assumes that each scenario has an equivalent probability of $\frac{1}{N}$. This method cuts down on the number of scenarios because only a sample of the scenarios are taken and used to approximate the entire set. Therefore, this provides an approximate expected value.

# Applications

This type of problem has many meaningful applications. For example, consider the logistics of transporting goods from manufactures to consumers. It is often the case that demand is not fixed and thus the transportation of goods contains uncertainty. The problem can be formulated using probabilistic constraints to account for this uncertainty. Another, more widely used application is portfolio optimization while minimizing risk. Additionally, these concepts can be applied to a wide variety of ecological problems where weather conditions are uncertain.

## Numerical Example

Suppose we have the following optimization problem:

$\begin{array}{rcll} \max &~& 5x_1 + 10x_2 - 4y_1 - 6y_2 & \\ \mathrm{subject~to} &~& y_1 + y_2 = 10 \\ &~& x_1 + 3x_2 \le 14 \\ &~& x_1, x_2 \ge 0 \end{array}$

This is a simple linear optimization problem with optimal solution set $x_1 = 14, y_1 = 10, z = 30$.

Now assume that variables $x_1$ and $x_2$ are uncertain and that there are three different scenarios, $(1.5X, X, 0.7X)$ for the values of $x_1$ and $x_2$, each occurring with a probability of 1/3. This new problem involves uncertainty and is thus considered a stochastic problem. We must now partition $x_1$ and $x_2$ into $x_{1,1}, x_{1,2}, x_{1,3}$ and $x_{2,1}, x_{2,2}, x_{2,3}$ respectively. Once turned into the discrete version, the problem is reformulated as shown below and can be solved once again using linear programming.

$\begin{array}{rcll} \max &~& \frac{1}{3}(7.5x_{1,1} + 15x_{2,1}) + \frac{1}{3}(5x_{1,2} + 10x_{2,2}) + \frac{1}{3}(3.5x_{1,3} + 7x_{2,3}) - 4y_1 - 6y_2 & \\ \mathrm{subject~to} &~& y_1 + y_2 = 10 \\ &~& 1.5x_{1,1} + 4.5x_{2,1} \le 14 \\ &~& x_{1,2} + 3x_{2,2} \le 14 \\ &~& 0.7x_{1,3} + 2.1x_{2,3} \le 14 \\ &~& x_{1,1} x_{1,2}, x_{1,3}, x_{2,1}, x_{2,2}, x_{2,3} \ge 0 \end{array}$

The new optimal solution is $30.005$.

# Conclusion

Stochastic programming has a wide range of topics. Many complexities exist in optimizing with uncertainty (a large amount of which were not discussed here). For more in depth information, see the References section. Overall, probabilistic constraints and recourse problems provide a framework for solving more real world issues that involve uncertainty. Though it has been said before, it is important to reiterate that stochastic programming only works if a probability distribution is known for the given problem (i.e. probability distribution for the demand of newspapers). Many issues, such as: optimizing financial portfolios, capacity planning, distribution of energy, scheduling, and many more can be solved using stochastic programming.