1. Introduction
In the modeling and optimization of real-world problems, there is usually a level of uncertainty associated with the input parameters and their future outcomes. Stochastic programming (SP) models have been widely studied over the past decades to solve optimization problems under uncertainty [1,2]. SP is acknowledged for providing superior results compared to the corresponding deterministic model with nominal values for the uncertain parameters, which can lead to suboptimal or infeasible solutions. SP applications in process systems engineering include manufacturing networks and supply chain optimization [3,4], production scheduling [5], and the synthesis of process networks [6].
Two-stage stochastic programming is a commonly applied framework for cases where parameter uncertainties are decision independent (exogenous uncertainties). In stochastic programming, uncertain parameters are explicitly represented by a set of scenarios. Each scenario corresponds to one possible realization of the uncertain parameters, according to a discretized probability distribution. The goal is to optimize the expected value of the objective function over the full set of scenarios, subject to the implementation of common decisions at the beginning of the planning horizon.
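To make this concrete, the following minimal sketch (with hypothetical data) contrasts a two-stage stochastic solution with the deterministic model built from nominal (mean) values, on a toy one-product problem: the first stage orders a quantity x, and the second stage buys any shortfall at a higher recourse cost.

```python
# Toy two-stage problem (hypothetical data): order x at unit cost 1 in the
# first stage; in the second stage, cover unmet demand at unit cost 3.
cost, penalty = 1.0, 3.0
scenarios = [(0.5, 2.0), (0.5, 10.0)]  # (probability, demand)

def expected_total_cost(x):
    # First-stage cost plus the probability-weighted recourse cost.
    return cost * x + sum(p * penalty * max(0.0, d - x) for p, d in scenarios)

xs = [i / 10.0 for i in range(0, 121)]  # crude search grid for this 1-D toy

# Stochastic solution: optimize the expected cost over the full scenario set.
x_sp = min(xs, key=expected_total_cost)

# Deterministic "nominal" solution: plug in the mean demand, then evaluate
# the resulting decision against the true scenario set.
mean_d = sum(p * d for p, d in scenarios)
x_det = min(xs, key=lambda x: cost * x + penalty * max(0.0, mean_d - x))
```

On this toy instance, the stochastic solution (x = 10, expected cost 10) outperforms the nominal-value solution (x = 6, expected cost 12), illustrating why the deterministic model can be suboptimal under uncertainty.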
Stochastic programs are often difficult to solve due to their large size and complexity, which grow with the number of scenarios. To overcome these difficulties, decomposition algorithms, such as Benders decomposition [7], Lagrangian decomposition [8], and progressive hedging [9], have been developed to solve linear programming (LP) and mixed-integer linear programming (MILP) stochastic problems. For a comprehensive review of recent algorithmic advances in two-stage stochastic MIP, we refer the readers to the tutorial [10].
On the software development front, several modeling systems and optimization platforms have included extensions for a convenient algebraic representation of stochastic problems, as offered by major software vendors, such as GAMS, LINDO, XpressMP, AIMMS, and Maximal.
In recent years, various commercial and open-source applications have been developed specifically to represent and solve two-stage and multistage SP problems. Some of them include capabilities to read and build stochastic MPS (SMPS) files, the standard exchange format for SP applications. However, despite advances in the field and proven benefits, SP has not been widely used in industrial applications. Existing applications include power systems planning and scheduling [11,12,13], and process systems engineering [14,15].
The scope of this paper is to provide a tutorial for beginners on linear and mixed-integer linear stochastic programming, introducing the fundamentals on solution methodologies with a focus on available software. To accomplish this objective, we present a general description of techniques based on Benders decomposition and dual decomposition, which are the fundamentals of most known solvers designed specifically to tackle special problem structures. We then review the currently available state-of-the-art software for the solution of two-stage stochastic programming problems and evaluate their performance, using large-scale test libraries in SMPS format.
The remainder of this paper is organized as follows. In Section 2, we explain the mathematical formulation of (mixed-integer) linear stochastic problems. Section 3 describes the classical L-shaped algorithm. Section 4 summarizes the enhancement strategies to improve the performance of Benders decomposition. Section 5 describes scenario decomposition methods and algorithmic enhancements. In Section 6, we describe algorithmic innovations in software packages for dual decomposition and show some computational results. Finally, in Section 7, we summarize the conclusions.
2. Problem Statement
We consider a two-stage stochastic mixed-integer linear problem (P) in the following form: where x denotes the ‘here and now’ decisions, taken at the beginning of the planning horizon before the uncertainties unfold, over a given set of scenarios. The recourse vector represents the corrective actions (wait and see), applied after the realization of the uncertainty. A matrix and a right-hand-side vector define the first-stage constraints; the second-stage problem is defined by two matrices, called the technological and recourse matrices, respectively, together with its own right-hand-side vector. Let an index set collect all first-stage variables, with a subset of indices for the binary first-stage variables, and let another index set collect all second-stage variables. If the second-stage variables are mixed integer, a further subset identifies the binary second-stage variables; if all the second-stage variables are continuous, this subset is empty. The objective function minimizes the total expected cost, weighting each scenario by its probability, with the first-stage cost vector c and the corresponding second-stage cost vectors. It should be noted that the approaches discussed in this paper can be applied to general mixed-integer variables; here, we restrict the variables to be mixed binary for simplicity. Equation (1) is often referred to as the deterministic equivalent, or extensive form, of the SP.
Formulation (P) can be rewritten in an equivalent form (PNAC) with nonanticipativity constraints (NACs), where the first-stage variables are no longer shared, and each scenario represents an instance of a deterministic problem with a specific realization outcome [16,17].
In Equation (2c), each scenario has its own feasible region, defined by constraints (1b)–(1e). Nonanticipativity constraints (2b) are added to ensure that the first-stage decisions are the same across all scenarios; they are represented by a suitable sequence of matrices. One example of such constraints is the following:
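As a small illustration of such pairwise constraints, the sketch below takes the common convention in which the NAC matrices encode the differences x_s − x_{s+1} = 0 between consecutive scenarios, and checks which of those constraints a candidate scenario-wise first-stage solution violates:

```python
# Pairwise nonanticipativity check (one common choice of NAC structure):
# x_1 = x_2 = ... = x_S written as x_s - x_{s+1} = 0 for s = 1..S-1.
def nac_violations(first_stage_by_scenario, tol=1e-9):
    # first_stage_by_scenario[s] is the first-stage vector chosen in scenario s.
    xs = first_stage_by_scenario
    return [(s, s + 1) for s in range(len(xs) - 1)
            if any(abs(a - b) > tol for a, b in zip(xs[s], xs[s + 1]))]
```

For example, `nac_violations([[1, 0], [1, 0], [1, 1]])` flags the pair of scenarios (1, 2), whose first-stage decisions disagree.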
Given the mathematical structure of the deterministic equivalent formulations (P) and (PNAC), (mixed-integer) linear stochastic problems can be solved using decomposition methods derived from duality theory [18]. Such methods split the deterministic equivalent into a master problem and a series of smaller subproblems to decentralize the overall computational burden. Decomposition methodologies are classified into two groups: (i) node-based or vertical decomposition, which includes Benders decomposition and variants where the problem is decomposed by the nodes in the scenario tree, and (ii) scenario-based or horizontal decomposition, where the problem is decomposed by scenarios. In Sections 3 and 4, we provide a tutorial overview of Benders decomposition; in Section 5, we review scenario decomposition methods, including the dual decomposition algorithm and the progressive hedging algorithm.
3. L-Shaped Algorithm/Benders Decomposition
If the second-stage variables are all continuous (i.e., the set of binary second-stage indices is empty), problem (P) can be solved with Benders decomposition. Benders decomposition (BD) was originally developed in 1962 by Benders [19] to solve large-scale mixed-integer linear problems (MILPs) with complicating variables. This concept has been extended to solve a broader range of optimization problems [20], including multistage, bilevel, and nonlinear programming. When applied to stochastic problems, it is commonly referred to as the L-shaped algorithm [7].
The L-shaped algorithm partitions the deterministic formulation (P) into multiple problems: (i) a master problem (MP) that contains all the first-stage constraints and variables, which can be mixed integer; and (ii) a collection of subproblems that include corrective future actions for the given first-stage solution. The master problem (MP) is derived from the projection of (P) onto the variables x:
where the recourse function (or expected second-stage value function) is the expectation of the scenario value functions, each defined by the primal second-stage program for its scenario. The recourse functions are convex and piece-wise linear, characteristics that are exploited in the BD method [1]. These conditions do not hold when integer variables are included in the second-stage program. For the case of integer recourse, a logic-based Benders framework [21], second-stage convexification techniques [22,23,24], specialized branch-and-bound schemes [25,26], or dual decomposition methods [16] may be applied to solve large stochastic problems. In this section, we focus only on Benders decomposition for SP with continuous second-stage variables.
The second-stage program is a linear program for any given feasible value of x. By the strong duality theorem, it is equivalent to its dual if the primal is feasible and bounded. A vector of Lagrangian multipliers is associated with the second-stage constraints given by Equation (5b):
BD introduces a set of piece-wise linear approximations of the recourse function in the problem MP, known as optimality cuts, which are built from dual solutions of the second-stage program. It is important to highlight that the dual feasible region does not depend on the value of x; thus, an exact representation of the expected cost would require the computation of all the extreme points of the dual subproblems.
However, the second-stage program may not be feasible for some values of x. BD enforces the second-stage constraints (5b) by adding feasibility cuts, which are valid inequalities that exclude infeasible first-stage solutions from the MP. Subproblem feasibility is evaluated by solving the following recourse reformulation for each scenario:
where a vector with all-one entries weights the slacks of constraints (5b). The objective function measures the amount by which these constraints are violated; thus, if it equals zero, the original subproblem (5) is feasible. To derive feasibility cuts in terms of x, BD considers the dual of problem (7) to generate an expression equivalent to Equation (7a). The optimal solution of the dual feasibility problem (8) corresponds to one of the extreme rays (or directions) of the recourse subproblem (6):
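The mechanics of a feasibility cut can be seen on a toy example with hypothetical data: the second stage must cover a shortfall with capped recourse, so it becomes infeasible when the shortfall exceeds the cap, and the phase-one (slack-minimization) problem and its dual ray yield the cut in closed form.

```python
# Toy feasibility cut (hypothetical data): second stage needs y >= d - x with
# y <= cap, so it is infeasible whenever d - x > cap. The phase-one problem
# min 1'v, with y + v >= d - x and y <= cap, measures the violation; its dual
# ray (sigma = mu = 1) yields the feasibility cut x >= d - cap.
cap = 5.0

def violation(x, d):
    # Optimal value of the phase-one (feasibility) subproblem.
    return max(0.0, (d - x) - cap)

def feasibility_cut(d):
    # Right-hand side of the cut x >= d - cap implied by the dual ray.
    return d - cap
```

For instance, at x = 1 with d = 9 the subproblem is infeasible (violation 3), and the cut x >= 4 removes that first-stage point; at x = 4 the violation is zero.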
The master problem (4) is linearized by (i) substituting the recourse function with the weighted sum of the future cost estimation (6a), and (ii) applying feasibility cuts as needed. This reformulation is referred to as the multi-cut Benders master problem (BMP), where auxiliary variables represent the outer linearization of the second-stage cost. The extreme points and rays from the dual form of the recourse program are stored in sets E and K, respectively. Constraints (9c) and (9d) denote the feasibility and optimality cuts, respectively, with cut coefficients given by the technological and recourse matrices for the scenario where a feasibility cut can be found.
The complete enumeration of the extreme points and rays of the dual second-stage program is impractical, if not impossible. Instead, the L-shaped algorithm relaxes the MP by initially considering only a subset of the optimality and feasibility cuts. BD iteratively solves the relaxed problem to generate a candidate solution for the first-stage variables, and then solves the collection of scenario subproblems at this fixed candidate to generate a new group of optimality or feasibility cuts. This process is repeated until the optimal solution is found [26].
The optimal solution of the relaxed Benders master problem provides a valid lower estimate of the optimal total cost, TC. On the other hand, the solution of the second-stage programs at a feasible first-stage candidate yields an upper bound on the original objective function, given by Equation (10). The solution procedure terminates when the difference between the bounds is closed, as implied by Equation (11). Algorithm 1 summarizes the procedure.
Algorithm 1: Multi-cut Benders decomposition
The L-shaped method is summarized in Algorithm 1. It is initialized with a guess of the first-stage solution and considers two stopping criteria: (i) an optimality tolerance that limits the relative gap between the dual and primal bounds of the objective function, and (ii) a maximum number of allowed iterations.
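The loop of Algorithm 1 can be sketched on a toy single-variable problem in which the second-stage LP and its dual have closed-form solutions. All data below are hypothetical, and the master problem is "solved" by a crude grid search rather than an LP solver, so this is an illustration of the cut-generation logic only.

```python
# Toy multi-cut Benders/L-shaped loop (illustrative only; hypothetical data).
# First stage: capacity x in [0, 10] at unit cost c.
# Second stage, scenario s: cover shortfall y_s >= d_s - x at unit cost q > c,
# so Q_s(x) = q * max(0, d_s - x) and the optimal dual is pi = q if d_s > x else 0.
c, q = 1.0, 3.0
scenarios = [(0.3, 4.0), (0.5, 6.0), (0.2, 9.0)]   # (probability p_s, demand d_s)
cuts = {s: [] for s in range(len(scenarios))}      # dual multipliers per scenario
grid = [i / 100.0 for i in range(1001)]            # crude stand-in for the master LP

def master_obj(x):
    # Relaxed master: c*x + sum_s p_s * theta_s, with each theta_s bounded below
    # by the optimality cuts theta_s >= pi*(d_s - x) collected so far (and by 0,
    # a valid bound since Q_s >= 0).
    total = c * x
    for s, (p, d) in enumerate(scenarios):
        total += p * max([pi * (d - x) for pi in cuts[s]] + [0.0])
    return total

lb, ub = -float("inf"), float("inf")
for _ in range(20):
    x_k = min(grid, key=master_obj)                # candidate first-stage solution
    lb = master_obj(x_k)                           # valid lower bound
    ub = min(ub, c * x_k + sum(p * q * max(0.0, d - x_k) for p, d in scenarios))
    for s, (_, d) in enumerate(scenarios):         # one optimality cut per scenario
        cuts[s].append(q if d > x_k else 0.0)
    if ub - lb <= 1e-6:                            # bounds closed: optimal
        break
```

On this toy instance, the loop converges in two iterations to x = 6 with total expected cost 7.8.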
6. Software Packages for Scenario Decomposition
In this section, we review two software packages for scenario decomposition, PySP [52,60] and DSP [49]. The two software packages are benchmarked on the problems in SIPLIB [61], including the SSLP [62], SSLPR [63], and DCAP [64] test problems.
The SSLP test set consists of 12 two-stage stochastic mixed-integer programs arising in stochastic server location problems (SSLPs). The base deterministic server location problem considers building servers in some potential locations to serve clients in given locations. The stochastic version of the server location problem considers different realizations of client locations. Each scenario represents a set of potential clients that do materialize. The decisions in SSLP are all binary variables. In the first stage, we decide whether a server is located at each given location. The second-stage (recourse) actions decide whether any given client is served by any given server. SSLPR (stochastic server location problem with random recourse) is an extension of SSLP. While SSLP assumes fixed demands for the clients, SSLPR considers the demands of the clients as uncertain parameters.
DCAP consists of 12 two-stage stochastic integer programs arising in dynamic capacity acquisition and allocation applications. The deterministic model considers a capacity expansion schedule over T time periods. In each time period, the amount of capacity expansion for each resource needs to be decided, with a fixed and a variable cost for each capacity expansion. In each time period, each task must be assigned to one of the existing resources, which is represented by binary variables that decide whether a given task is assigned to a given resource. Since there are multiple periods, the stochastic version of this problem should, in principle, be formulated as a multi-stage stochastic program, which is difficult to solve. Ahmed and Garcia [64] proposed to approximate the multi-stage problem with a two-stage stochastic program in which the first-stage decisions are the capacity expansions and the second-stage decisions are the assignments. The uncertainties include the processing requirement for each task and the cost of processing each task.
The sizes of all the test problems are shown in Table 1. The names of the SSLP and SSLPR instances are expressed in the form sslp(rf)_m_n, where m is the number of potential server locations and n is the number of potential clients. Each instance is tested with a different number of scenarios. The size is expressed as the number of constraints (Rows), variables (Cols), and integer variables (Ints) in the first stage and in the second stage per scenario. Note that the SSLP problems have pure binary first-stage variables, whereas the DCAP problems have mixed-binary first-stage variables; this difference affects the PH algorithm, as discussed in detail later.
All of the test problems are available in the SMPS format; however, we implement an interpreter to make the format compatible with PySP. All of the tests are run on a server with an Intel Xeon CPU (24 cores) at 2.67 GHz and 128 GB of RAM. The whole set of instances is solved in a synchronous parallel manner to reduce the time of each iteration.
6.1. PySP: Pyomo Stochastic Programming
PySP is a software package implemented in the Python programming language using Pyomo [65] as the optimization modeling framework. PySP enables the user to solve stochastic programs with a specialized progressive hedging algorithm for stochastic mixed-integer programs. In order to use PySP, the user only needs to write a deterministic base model and define the scenario tree structure in Pyomo. With these inputs, PySP is able to apply the progressive hedging algorithm as an effective heuristic for obtaining feasible solutions to multi-stage stochastic programs. We highlight that, starting from release 6.0, PySP will not be part of Pyomo's source code; however, it will still be available and maintained. Furthermore, the PySP project will be continued as mpi-sppy, available at https://github.com/Pyomo/mpi-sppy (accessed on 20 January 2022).
6.1.1. Algorithmic Innovations in PySP
The innovations in PySP for multi-stage mixed-integer stochastic programs were described by Watson and Woodruff [52]; here, we briefly paraphrase them. First, instead of keeping a fixed penalty value for all first-stage decisions in Algorithm 4, the authors proposed several variable-dependent penalty strategies. The cost-proportional (CP) strategy sets the penalty of each first-stage variable i to be proportional to its cost parameter, using a constant multiplier shared by all first-stage variables. The other variable-dependent strategy, denoted SEP in [52], normalizes the cost parameter by an estimate of each variable's range: after PH iteration 0, the penalty for an integer variable is computed from the observed spread of its values across scenarios, while for a continuous variable the spread is measured around the probability-weighted average of its scenario values.
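A minimal sketch of the CP strategy, assuming the form rho_i = alpha * |c_i| with a single global multiplier alpha (the multiplier values 0.1–1.0 used in the experiments below):

```python
# Cost-proportional (CP) penalty strategy: rho_i = alpha * |c_i| for each
# first-stage variable i, with one global multiplier alpha.
def cost_proportional_rho(c, alpha):
    return [alpha * abs(ci) for ci in c]
```

For example, `cost_proportional_rho([2.0, -4.0], 0.5)` returns `[1.0, 2.0]`, so variables with larger cost coefficients receive proportionally larger penalties.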
The authors also proposed some heuristics for accelerating convergence. One heuristic is called "variable fixing": the values of some of the first-stage decisions are fixed after they stay at a given value, in all scenarios, for a number of consecutive iterations given by a lag parameter. At a given PH iteration k, a variable is fixed to its common value for all subsequent iterations if it has held that value in every scenario throughout the preceding lag window. Additionally, the authors proposed a more aggressive variant called "variable slamming", where a variable is fixed as soon as it is "sufficiently converged", i.e., some small discrepancies across scenarios are tolerated. In order to decide when variable slamming should be applied, the authors proposed several termination criteria based on the deviations of the first-stage variables.
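The lag-based fixing rule can be sketched as follows; this is an illustrative reading of the heuristic, not PySP's actual implementation:

```python
# "Variable fixing": fix one first-stage variable once its value has agreed
# across ALL scenarios for the last `lag` consecutive PH iterations.
def should_fix(history_by_scenario, lag):
    # history_by_scenario[s] = values of this variable in scenario s, by iteration.
    recent = [h[-lag:] for h in history_by_scenario]
    target = recent[0][0]
    return all(len(r) == lag and all(v == target for v in r) for r in recent)
```

For example, with a lag of 2, a binary variable whose last two iterates equal 1 in every scenario would be fixed, while one that still disagrees across scenarios would not.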
In solving stochastic mixed-integer programs with PH, cyclic behavior can be observed in some instances. In order to detect it, the authors proposed a strategy based on the values of the weight vectors associated with the first-stage decision variables, using a simple hashing scheme: an integer hash weight is drawn for each scenario when PH is initialized, and the hash value of a variable is the weighted sum of its PH weights across scenarios. If equal hash values are detected, they are interpreted as evidence of a potential cycle, and the corresponding variable can be fixed.
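An assumed form of this hashing scheme is sketched below (the exact hash used by PySP may differ):

```python
# Cycle detection via hashing (assumed form): draw an integer weight z_s per
# scenario at initialization; the hash of a variable's PH weight vector w is
# sum_s z_s * w_s. Re-seeing a hash value suggests the weights have cycled.
import random

def make_hash_weights(num_scenarios, seed=0):
    rng = random.Random(seed)
    return [rng.randrange(1, 1 << 16) for _ in range(num_scenarios)]

def weight_hash(z, w):
    return sum(zs * ws for zs, ws in zip(z, w))

def detect_cycle(z, weight_history):
    seen = set()
    for w in weight_history:
        h = weight_hash(z, w)
        if h in seen:
            return True     # same hash seen before: potential cycle
        seen.add(h)
    return False
```

Hash collisions can produce false positives, which is why repeated hash values are treated as evidence of a cycle rather than proof.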
The heuristics, including variable fixing and slamming and cyclic behavior detection, are denoted as WW (Watson–Woodruff) heuristics in the software distribution of PySP.
6.1.2. Computational Results for PySP
We use PySP (Pyomo 5.0.0) to solve the SSLP, SSLPR, and DCAP problems. Each subproblem is solved with the CPLEX (12.7.0) quadratic solver. We use the cost-proportional (CP) heuristic to set the penalty values, with multipliers of 0.1, 0.3, 0.5, and 1.0. Note that the main results shown in this section do not use the WW heuristics, i.e., we do not use the variable fixing and slamming or cycle-detection heuristics; we compare PySP with and without the WW heuristics at the end of this section.
The number of iterations and the walltime for different multipliers are shown in Figure 1 and Figure 2, respectively. If the PH algorithm reaches the iteration limit, there is an "(i)" label at the top of the column; if it reaches the time limit, there is a "(t)" label. From Figure 1 and Figure 2, one can observe that setting the multiplier to 0.1 makes PH take the largest number of iterations and the largest amount of walltime to converge in most of the instances, which may be due to the small step size. On the other hand, setting the multiplier to the largest value, 1.0, takes fewer iterations and less walltime than the other values in most instances, but hits the iteration limit in two of the instances. Overall, a multiplier of 0.3 seems to be a robust choice, because cp-0.3 always converges within a reasonable walltime and number of iterations. The details of the SSLP and SSLPR results are shown in Table A1 and Table A2 in Appendix A.
We also apply PySP to solve the DCAP instances. We observe that for all the DCAP instances, PySP is unable to converge within 300 iterations. The details of the results are shown in Table A3 in Appendix A, where the walltime to the reported upper bound for those instances is given. We compare the upper bounds obtained by PySP with those obtained by DSP in the next subsection. From this experiment, we can see that it is more difficult for PySP to solve problems with mixed-binary first-stage variables than problems with pure binary first-stage variables, because it is more difficult for the continuous variables to satisfy the NACs.
Scenario bundling [66,67,68] is a technique that has been used in dual decomposition algorithms. The main idea is to dualize only "some" of the nonanticipativity constraints, rather than dualizing all of them; in other words, the individual scenario subproblems are bundled into larger subproblems in which the NACs are preserved. Ryan et al. [67] used PH with scenario bundling to solve stochastic unit commitment problems and showed that, with scenario bundling, PH can obtain solutions with a smaller optimality gap. In order to test the effectiveness of scenario bundling, we test several instances from the SSLP and DCAP libraries. The computational results are shown in Table 2. For each instance, we try a different number of bundles. For the SSLP instances, PH obtains the same upper bound regardless of the number of bundles; however, the number of bundles has a significant impact on the computational time. For example, for SSLP_10_50 with 1000 scenarios, PH with 50 bundles reduces the walltime to 3% of that of the original PH with 1000 single-scenario bundles, and converges in a single iteration. For the DCAP problems, PH does not converge within 300 iterations in most cases, even with scenario bundling; however, it is able to obtain better feasible solutions with scenario bundling (see UB in Table 2).
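One simple way to form bundles is a round-robin partition of the scenario set; this is only an illustrative choice, since any partition works and problem-aware groupings may perform better:

```python
# Scenario bundling: partition the scenario indices into `num_bundles` groups.
# NACs inside a bundle stay explicit; only the cross-bundle NACs are dualized,
# so PH runs on fewer (but larger) subproblems.
def make_bundles(scenario_ids, num_bundles):
    return [scenario_ids[b::num_bundles] for b in range(num_bundles)]
```

For example, `make_bundles(list(range(10)), 3)` yields three bundles that together cover all ten scenarios exactly once.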
Finally, we evaluate how the use of the WW heuristics affects the performance of PySP on the SSLP and SSLPR libraries. The results on the DCAP library are omitted here, since PySP does not converge for the DCAP instances. The solution time improvements obtained with the WW heuristics for each SSLP and SSLPR instance are shown in Figure 3. Note that there are three cases where the improvements are omitted in the figure: case (1), neither PH nor PH with WW heuristics converges in 300 iterations; case (2), only PH-WW fails to converge in 300 iterations; and case (3), both PH and PH-WW exceed the time limit of 3 h (10,800 CPU seconds). Using the WW heuristics gives significant improvements for small cost-proportional multipliers, i.e., 0.1 and 0.3. As shown in Figure 1, PH with small multipliers usually takes more iterations to converge; therefore, the WW heuristics can accelerate convergence for those instances more effectively. However, there are also a few instances where PH converges but PH with WW heuristics does not, denoted by case (2) in Figure 3.
6.2. DSP: Decompositions for Structured Programming
DSP [49] is an open-source software package implemented in C++ that provides different types of dual decomposition algorithms to solve stochastic mixed-integer programs (SMIPs). DSP can take SMPS files and JuMP models as input, the latter via the Julia package Dsp.jl.
6.2.1. Algorithmic Innovations in DSP
From the description of the dual decomposition algorithm in Section 5, one can observe that the lower bounds of the dual decomposition algorithm are affected by the way the Lagrangian multipliers are updated. One advantage of DSP is that the authors have implemented different dual-search methods, including the subgradient method, the cutting-plane method, and a novel interior-point cutting-plane method for the Lagrangian master problem. The authors observed that if the simplex algorithm is used, the solutions to the Lagrangian master problem can oscillate significantly, especially when the Lagrangian dual function is not well approximated. Therefore, they proposed to solve the Lagrangian master problem suboptimally using an interior-point method, following the work of Mitchell [69].
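For reference, the simplest of these dual-search updates, a subgradient step on the multipliers, can be sketched as follows (a generic textbook form with a diminishing step size, not DSP's exact update):

```python
# One subgradient step for the dual decomposition multipliers:
# lambda^{k+1} = lambda^k + t_k * g^k, where g^k is the NAC residual of the
# scenario subproblem solutions (a subgradient of the concave Lagrangian dual)
# and t_k = t0 / (k + 1) is a diminishing step size.
def subgradient_step(lam, residual, t0, k):
    step = t0 / (k + 1)
    return [l + step * g for l, g in zip(lam, residual)]
```

Cutting-plane and interior-point cutting-plane methods replace this fixed-direction step with a master problem built from the dual function evaluations, which is what DSP stabilizes with the interior-point approach described above.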
The authors also proposed some tightening inequalities that are valid for the Lagrangian subproblems. These valid inequalities, including feasibility and optimality cuts, are obtained from Benders subproblems in which the integrality constraints are relaxed. Computational results show that these Benders-like cuts can be effective in practice. The latest DSP distribution is able to complete Algorithm 3 using the Dantzig–Wolfe decomposition-based branching strategy described in [51].
6.2.2. Computational Results for DSP in Comparison with PySP
We test the dual decomposition algorithm on the SSLP, SSLPR, and DCAP libraries. Each subproblem is solved with the CPLEX (12.7.0) mixed-integer linear solver. The interior-point method proposed by the authors [49] is used to solve the Lagrangian master problem, which is solved with CPLEX as well. Benders-like cuts are not used, because the implementation of Benders cuts in DSP only works with SCIP.
In Figure 4 and Figure 5, we evaluate the best feasible solution (the upper bound) obtained by PySP, and the upper and lower bounds obtained by DSP. For each instance, we report three different gaps, using the upper and lower bounds from Ref. [61] to evaluate the bounds from PySP and DSP; note that the bounds from the literature are close to the global optimum of each instance. The first column for each instance in Figure 4 and Figure 5 is the gap between the upper bound from PySP and the lower bound from the literature. The second column is the gap between the upper bound from DSP and the lower bound from the literature. The third column is the gap between the upper bound from the literature and the lower bound from DSP.
For the SSLP and SSLPR instances shown in Figure 4, although PySP converges within the time and iteration limits, the best feasible solution it obtains may not be optimal; there are gaps of about 1% for some of the SSLP instances (see the first column of each instance in Figure 4). DSP solves more instances to optimality than PySP (see the second column), and the lower bounds obtained by DSP are also quite tight, with gaps usually below 0.01% (see the third column). Note that the literature values for SSLPRF5_50_100(1), SSLPRF5_50_100(2), and SSLPRF5_50_100(3) do not match the values from our experiments; therefore, we tried to solve the deterministic equivalents of these instances to obtain bounds. The literature bounds of SSLPRF5_50_100(3) come from solving the deterministic equivalent, while the gaps of SSLPRF5_50_100(1) and SSLPRF5_50_100(2) are omitted, since the corresponding deterministic equivalents could not be solved within 3 h.
For the DCAP instances, where we have mixed-binary first-stage variables, the best feasible solutions from PySP and DSP are quite far from optimality: the gaps in the first two columns are around 10%. On the other hand, the lower bounds obtained from DSP are tight, with gaps against the literature upper bounds around 0.1%. Therefore, in order to improve the relative optimality gap of DSP, the focus should be on designing more advanced heuristics to obtain better feasible solutions.
6.3. Results from Branch-and-Bound Algorithm in DSP
As we were completing this paper, a new version of DSP [51] was developed that uses a branch-and-bound algorithm similar to Algorithm 3. We test this branch-and-bound decomposition on the SSLP, SSLPR, and DCAP test instances, with the same computational setting as in the previous section. Table 3, Table 4 and Table 5 show the results obtained by DSP. We highlight that almost all the instances were solved to optimality within the walltime limit.
In all the SSLP instances (except SSLP15_45), DSP using the Dantzig–Wolfe-based branch-and-bound (DSP-DW) is able to solve the instances to optimality in less time than the DSP implementation that performs only the first iteration of dual decomposition. The same trend is observed on the SSLPRF instances.
On the other hand, DSP-DW is remarkably slower in attaining convergence on the DCAP test set. This behavior can be explained by looking at the upper- versus lower-bound gaps in Figure 4 and Figure 5. The standard implementation of DSP is able to close the optimality gap in the SSLP and SSLPR instances by solving the Lagrangian relaxation problem (15a) and using heuristics to compute a primal bound. Nonetheless, this strategy does not work in the DCAP set, for which additional computation is devoted by DSP-DW to closing the gap via branch-and-bound.
7. Conclusions
We presented a summary of state-of-the-art methodologies for the solution of two-stage linear stochastic problems. First, we introduced the mathematical formulation of such programs and highlighted the features of their structure that enable the development of decomposition algorithms. These methodologies are classified into two groups: node-based decomposition and scenario-based decomposition.
For two-stage stochastic programs with continuous recourse, we summarized Benders decomposition, which partitions the problem according to its time structure. BD may present computational difficulties, which can be alleviated by reducing the cost of each iteration and/or decreasing the number of iterations.
Scenario decomposition methodologies are popular alternatives in the case of (mixed-)integer recourse. The progressive hedging algorithm and dual decomposition relax the nonanticipativity restrictions and provide the user with valid bounds. Our numerical results show that the performance of PH is strongly affected by the constant penalty multiplier. Furthermore, its performance and the quality of the approximate solution may be enhanced by grouping the scenarios into larger bundles (or scenario sets). We also tested the dual decomposition algorithm in the DSP package. The computational results show that DSP is able to provide a tight lower bound on the instances that we tested; however, the optimality gaps can be as large as 10%, relative to the upper bounds from the literature. Therefore, for those tested instances, future effort should be focused on developing more advanced heuristics to improve the best feasible solution. Moreover, we tested the newest DSP implementation, which incorporates a Dantzig–Wolfe decomposition-based branch-and-bound algorithm to close the optimality gap. In most of the instances, it is able to retrieve a global solution within the given walltime limit. We highlight that its computational performance depends on the strength of the dual decomposition.
At the time we completed this paper, several new developments had been made in stochastic programming software. These software packages have the capability of solving multi-stage stochastic programs, e.g., SDDP.jl [70], StochasticPrograms.jl [71], and MSPPy [72]. In terms of algorithmic development, recent works have extended the capability to solve nonlinear problems [4,73,74,75,76,77,78]. Furthermore, it should be noted that this paper focuses on the risk-neutral setting of two-stage stochastic programs; however, the studied approaches can be easily extended to the risk-averse setting [79], e.g., by including risk measures such as the conditional value at risk (CVaR).