1. Introduction
The process of making decisions based on a solution to a problem, within the context of mathematical modeling, can take several stages. In each of these stages, events are defined in a such way that may be dependent on each other and on the variables that they define.
The use of random variables in the above-mentioned form of decision making may be described within a stochastic programming context. Then, the model can be represented as a sequential decision-making process, where the variables of one stage that are unknown at first are part of the input parameters of the subsequent stage.
A stochastic programming problem is a mathematical formulation with constraints where some of its components can be represented by random variables. If the problem has decisions that have to be made after the uncertainty is resolved (realization), it takes the name of the resource program [
1]. The values of the random variables are known after carrying out an experiment, and decisions can be made at a first stage and at a second stage, before and after the experiment, respectively. In addition, each stage can contain a succession of decisions and they can be made over different periods of time. In this case, a multi-state stochastic programming problem should be formulated. Thus, we can say that a stochastic programming problem is multi-state if the decisions depend on realizations that occur over time. Since its introduction in the seminal papers by Dantzig [
2] and Beale [
3], the stochastic framework has been widely studied and applied in diverse areas, such as transportation [
4], inventory management [
5], drug supply [
6], and industrial processes [
7], among others.
Another example of the above framework is the so-called simple plant location problem, SPLP in short [
8]. In the SPLP, given a set of possible locations, we must decide where to open facilities and then distribute resources demanded by certain customers/patients, minimizing the costs involved. Note that the SPLP is a combinatorial approach, where the mathematical model can be viewed as one of two stages that tries to answer two questions: (i) what facilities should be opened, and (ii) how should these facilities be located in such a way that the customer demand is met? The second question must be answered after the facilities (unknown at the beginning) are stated, that is, after the variables defined at this stage for such a decision are revealed by conducting an experiment that considers a possible future state of a second stage. Therefore, it is logical to think that it is necessary to define different forms of this state of nature that represent possible future scenarios such that the effect of uncertainty is minimized. Described in this way, the SPLP is known as a two-stage stochastic simple plan location problem, 2S-SPLP in short.
The 2S-SPLP has been widely studied, and its solution methods have been shown to be efficient. As mentioned in [
1], it is important to consider that, taking advantage of the structure of the problem, the 2S-SPLP is especially beneficial in stochastic mathematical programming, and it is the main focus of the algorithmic work in this area. Here, the concept of structure refers to any characteristic in the mathematical formulation that can be used to provide a solution. One of the most-used procedures for stochastic mathematical programming that involves integer variables is the so-called integer L-shape method proposed in [
9]. This method is a branch-cut algorithm that uses the L-shape that the coefficient matrix has in the stochastic formulation of the 2S-SPLP and solves subproblems obtained by the Benders decomposition. For details of this and other similar procedures, see [
1,
10]. Lagrangian relaxation (LR) is a less-used approach, but a full description of different applications can be found in the references previously cited. Some gradient (subgradient)-based methods, such as those used in LR, were applied over other algorithms to speed up the search for the optimum in discrete problems; see [
11] as an example.
A less-known extension of the SPLP is when an order of preference, that customers have over a set of facilities to meet their demand, is considered. The SPLP with order is denoted by SPLPO in short [
12,
13], with the size of the instances in the SPLPO that can be solved being limited. Such an order incorporates a new set of constraints in its formulation that increases the difficulty to obtain an optimal solution. To the best of our knowledge, the use of order with two stages in the SPLP based on stochastic programming has not been considered until now. We use an algorithm that incorporates a heuristic component to handle the complexity that is added when orders of preference are included.
The motivation for our study comes from a problem related to healthcare services, where service centers must be sited in strategic locations to meet the immediate needs of patients. The current situation due to the COVID-19 pandemic makes this problem particularly relevant [
14]. Let us assume that each center corresponds to an assigned place for vaccination and that each center uses one or more vaccine brands from different laboratories. Then, each patient could choose one center instead of another, because she/he may prefer the vaccine from a laboratory to be considered more reliable. This defines a ranking (order of preference) which will not necessarily be given for all of them, since a patient may not want to be vaccinated in a center where there are only non-preferred vaccine brands. In countries where the vaccination process is considered successful, the order of preference assigned by each patient to the vaccination centers is defined by incentives that local governments give to their population. These same incentives for foreign citizens are seen as a strategic decision to generate income from tourism.
The main objective of this work is to solve the problem of large instances based on a more general version of the 2S-SPLP, which we call the two-stage stochastic simple plant location problem with order (2S-SPLPO). Our secondary objective is to show a potential use of the 2S-SPLPO in COVID-19 vaccination based on sensor-related data. Indeed, we provide an algorithm that stores patient’s service records in a central data warehouse using sensors. We conduct computational experiments on simulated instances to assess the performance of the 2S-SPLPO.
In the 2S-SPLPO, the customers have preferences over the set of facilities that can serve them. We consider the possibility that the order of preference given by patients is partial, that is, a patient may want to be served by only a subset of all possible facilities that may be open in some locations. In the stochastic formulation that we propose for the 2S-SPLPO, together with the variables defined to decide in which locations to open facilities, the preferences (partial or complete) are considered random variables. To solve the 2S-SPLPO, we use the accelerated dual ascent (ADA) algorithm proposed in [
12,
13], which solves large instances of the non-stochastic version of the SPLPO. Our algorithm is based on that given in [
15] for the SPLP. The ADA algorithm uses LR and semi-Lagrangian relaxation (SLR) in an iterative procedure to approximate the solution. The SLR is a relatively new idea that exploits the structure of problems by relaxing a direction of a split equality constraint. The ADA algorithm tries to solve a Lagrangian dual (LD) problem with a subgradient method (SGM) as a first step. Then, the results of the first step are used in a second step to solve a semi-Lagrangian dual (SLD) problem with a dual ascent method (DAM). At last, a heuristic procedure is applied to accelerate the search of the optimum in a third step. For a complete description of the SGM, the reader is referred to [
16,
17,
18,
19,
20,
21], whereas for a description of the SLR, the reader is referred to [
15,
22] Applying heuristic procedures is a common practice in optimization, but without guarantee of reaching the optimum. Some creative algorithms help to finding sufficiently good solutions [
23]. In the next sections, we shows how the ADA algorithm produces good solutions, comparing them to the exact solutions given by the XPRESS software.
Other two potential applications of the our approach can be seen in [
24,
25], where one of them is found in business-to-consumer (B2C) e-commerce. As pointed out in [
25], e-commerce has logistical advantages over traditional B2C. However, although costs can be reduced considerably, due to the use of technology, the problem of distributing the products demanded to patients from centers exists still. Sometimes, patients prefer to pick up the products from these centers and, therefore, may have preferences about where to do it, as the distance to travel or travel times can influence the problem’s variables. Thus, the aim of the B2C e-commerce organizations lies in finding the best way to locate the distribution centers, reducing the logistics costs related to this process and also considering the order of preference of the centers. These applications proposed in [
24,
25] are complemented with the potential application in COVID-19 vaccination based on the sensor-related data proposed in the present investigation.
The rest of the article is distributed as follows.
Section 2 provides background on stochastic programming.
Section 3 and
Section 4 define the SPLPO and 2S-SPLPO, respectively. In
Section 5, a description of the three steps that form the ADA algorithm is provided. In
Section 6, we introduce two sensor-based algorithms that store patients’ service records in a central data warehouse and that generate 2S-SPLPO instances, respectively. In
Section 7, we report the results of computational experiments performed on large instances. Finally, in
Section 8, some conclusions and recommendations for future studies on this topic are outlined.
2. Two-Stage Stochastic Linear Programming
In this section, we provide the background of stochastic programming, with MP standing for mathematical program. In MP 1, denotes the mathematical expectation on the random vector values , whereas , , , and are known vectors and matrices.
Let
be a finite sample space of some experiment. Let
be a random variable over the elements of
with values
. For simplicity, we call any
the scenario
. In a stochastic program, the scenarios can be seen as states of the nature since all the unknown information could depend on them. In addition, we assume a discrete probability function expressed as
The vectors
and
are the decision variables of the first and second stages, respectively. Therefore,
can be considered the decisions to be made under uncertainty, and
the corrective decisions made when uncertainty is revealed. Furthermore,
,
and
are vectors with known components after the realization of
, with
being usually called the technological matrix and
the fixed resource matrix. Let
be the
i-th row of
. A random vector
is implicitly defined in MP 1, whose formulation was originally stated in [
2,
3].
MP 1 Two-stage stochastic mathematical programming formulation with fixed resources. |
|
3. Simple Plant Location Problem with Order
In this section, we define the SPLPO. Let be a set of customers, and let be a set of possible sites where facilities can be opened. Let be a subset of J defined for each customer i, with and . Unit costs are considered to supply the demand of customer i from facility j, and fixed costs to open a facility at location j. We say that k is i-worse than j if the customer i prefers facility j rather than k, which is denoted by . We define as the set of facilities k strictly i-worse than j, with its complement being denoted as and as . Let be a decision variable that represents the fraction of the demand required by the customer i and covered by facility j. Let be a binary variable such that if a facility is open at the location j, and otherwise. We assume that each customer i classifies her/his favorite facility with a number , where 1 and are the most and least preferred, respectively. Additionally, for all , we have that . Under these conditions, the linear program for the SPLPO is given by MP 2.
MP 2 SPLPO mathematical programming formulation. |
|
The statement expressed in (
2) says that the program minimizes a cost function related to opening a facility covering the demand. Equalities presented in (
3) ensure that customer
i is supplied by exactly facility
j, called assignment constraints. Constraints given in (
4) ensure that if customer
i is supplied by facility
j, then
j must be opened, which are often called varying upper bounds. Inequalities stated in (
5) model the customers’ orders of preference [
26]. Since no capacities are considered and the model minimizes the number of open facilities
y, the demand of customer
i can always be covered completely by one single facility. Therefore, we can guarantee that there is an optimal solution with the values of the variables
belonging to
, even though this is not specified in the constraints; see inequalities given in (
6). The family of inequalities presented in (
7) makes the binary nature of the variable
explicit. Note that if it is more favorable to open an installation that does not belong to a particular subset
, customer
i can be served by any open installation, since
is the worst classification given for any
j.
4. Stochastic Formulation for the SPLPO
In this section, we define the 2S-SPLPO. In the context of the SPLPO, consider the experiment of asking all customers
to rate their preferred facilities
. Such as that defined in MP 1, we call each result of this experiment
and the set of all of them the sample space
. Recall the random variable
with values
over the events
called the scenario
. Once again, we assume that this random variable has the discrete probability function stated in (
1). Under this setting and the mentioned probability function, a mathematical program for the 2S-SPLPO (complete or partial) can be written as in MP 3.
MP 3 2S-SPLPO mathematical programming formulation. |
|
Note that the mathematical expressions defined in (
8)–(
13) of MP 3 have a similar interpretation as those stated in MP 2 for each scenario
. The second component established in (
8) represents the expected value of the distribution cost over the scenarios. Consider that the first decision stage to open a facility is given by binary variable
y and the distribution of the demand is given by
x as the second stage. Regarding the family of constraints formulated as in (
11) of MP 3, each scenario
is represented by a
-matrix
(
) of dimension
, where each row
has a one for each element of
(
) and a zero for elements of its complement; see
Figure 1. Furthermore,
is a finite set but it can have a big cardinality. For example, if we consider no partial preferences and since
, each row of
(
) can be given in
different ways. Thus, there are
possible different matrices
(
).
Note that the stochastic program formulated in MP 3 has no fixed resources since the coefficient matrix given by the sets with the second-stage variables is random, that is, it depends on .
5. A Lagrangian Algorithm for the 2S-SPLPO
In this section, we state a Lagrangian algorithm for the 2S-SPLPO. As noted in MP 3, since the mathematical formulation of the 2S-SPLPO has similar features to its non-stochastic version (SPLPO), with obvious differences due to the presence of multiple random scenarios, we suggest applying the ADA algorithm.
Figure 2 shows the scheme of the ADA algorithm, which includes three modules that we have broken down into internal steps in a very general way. The SLR theory and some important results, which are exposed in the description of the ADA algorithm, can be found in [
15,
22], where this algorithm was introduced. The main idea of the SLR approach is to relax a direction of an equality constraint in the Lagrangian sense, keeping the other one without relaxing, and then to take advantage of the mathematical properties of the resulting programming. Next, we detail each of the steps of the ADA algorithm summarized in
Figure 2.
- Step 1:
Lagrangian relaxation
In MP 3, we relax two families of constraints: those of assignment stated in (
9) and those of customer preference defined in (
11), obtaining the Lagrangian program given by
In the case where a partial order is considered, for all
, the corresponding terms
and
vanish in the objective function defined in (
14).
As before, we are assuming that customer
i ranks for each scenario her/his favorite facility
with a number
, where 1 and
are the most and least preferred, respectively. Since in
each
is multiplied by a sum of
variables with
terms corresponding to
, then each
is multiplied by a sum of
with
terms corresponding to
. Thus, we can check the following:
where
By using the expression given in (
15), the objective function defined in (
14) can be rewritten as
where
with
Similar to its non-stochastic version, with and without order, SPLP and SPLPO, respectively, the problem stated in (
16), with multipliers
and
fixed, can be easily solved by considering that
and
The problem stated in (
16) has the integrality property, that is, its optimal value is equal to the standard linear relaxation LP (2S-SPLPO), which corresponds to the original problem without the integrality constraints. The ADA algorithm tries to solve the LD problem related to (
16) established as
with a SGM, due to the fact that its objective function is a piecewise linear concave function of vector multipliers
and
over its domain. Hence, after some iterations, the multipliers obtained are used as a starting point for the second step. The SGM requires multiple times to solve the problem given in (
16), but as mentioned, it is simple to solve; see Algorithm 1, and for its step in line see (
12) and also [
19,
20,
21].
Algorithm 1: Subgradient method. |
|
It is possible to prove that the vector computed in step 11 of Algorithm 1 is a subgradient of the Lagrangian problem. In our experiments, we obtain better results when
, for all
and
instead
. In step 4 of Algorithm 1, we set
by applying an upper bound heuristic for the 2S-SPLP (H2S); see Algorithm 2. First, it opens a facility with the lowest operating cost for all customers. Then, for those facilities that are not open yet, the procedure compares and chooses for each customer the most preferred facility between it and its previously assigned supplier. Consequently, the new cost is saved and the new open facility has the lowest operating cost. This is repeated until no more facilities are available.
Algorithm 2: Upper bound heuristic for the 2S-SPLP. |
|
- Step 2:
Semi-Lagrangian relaxation
We start with a brief description of the SLR. Let the problem (P) of mathematical program be stated as
with (i) the set
containing integrality constraints; and (ii)
and rational components
being non-negative.
After splitting
into
and
, inequality
is relaxed with a vector of multipliers
. Then, the SLR problem related to (P) is expressed as
with its SLD being given by
.
The following theorems summarizes the basic properties derived from the definition of the SLR.
Theorem 1 ([
15]).
Let be a problem as defined in (
17)
under the same conditions. Then, we have the following:- (i)
is concave, non-decreasing on its domain, andis a subgradient at the point .
- (ii)
There is an interval where for each multiplier, we obtain the optimal solution of .
- (iii)
, that is, closes the duality gap.
In MP 3, we semi-relax the assignment constraints defined in (
9). Thus, we obtain the program formulated as
, and , .
The corresponding dual problem is stated as
We follow the same idea given in [
15], in the context of the SLR applied to the uncapacitated facility location problem, to restrict our search for the vector of multipliers
in
.
Theorem 2. Let and , be the maximal cost for costumer i associated with facility j and the vector of these costs, respectively. Then, we obtain if .
Proof. If for all i and , the SLR closes the duality gap. By hypothesis, . If we choose such that , then . This inequality is true for any j since gives the maximum among all . Therefore, the event cannot happen at an optimal solution since it is always possible to set and for all i and j, meeting all the constraints. □
Theorem 3. Let . For each and , let be the sorted cost . Then, if , .
Proof. By hypothesis, . Then, we have that for all j since we are minimizing at the optimal solution for all j. Therefore, the event cannot happen at an optimal solution and . □
Theorems 2 and 3 limit the search of to . We use the multipliers obtained in the previous step as starting point in this step by setting . Then, based on Theorems 2 and 3, we solve the LD problem by increasing the components of in each iteration of a DAM that requires solving many times; see Algorithm 3. However, this problem is harder than . The DAM algorithm employs the following:
By using the sorted costs , each component of can be either in an interval of the form or out of it. For the first case, there are infinite values of that can belong to a single interval . Each one of them has the same effect in the solution of since only a change from to and to modifies the solution.
We just need a single representative of the intervals. As we get closer to solving the SLR, the values of the components of increase. Hence, solving the SLR becomes more and more difficult. Then, it is always convenient to choose a being as small as possible, that is, at an distance from the lower bound of an interval.
Algorithm 3: Dual ascent method. |
|
- Step 3:
Variable fixing heuristic
A heuristic method is used to speed up the search process to the optimum. In a run of the DAM, the procedure chooses by a cost criteria, from the solution given, a subset corresponding to a percentage of the best variables , such that . These y are fixed to solve the 2S-SPLPO in the next iteration by an exact method. The procedure is described in Algorithm 4. In addition, the procedures described above are assembled in the ADA method stated in Algorithm 5.
Algorithm 4: Variable fixing heuristic. |
|
Algorithm 5: Accelerated dual ascent algorithm. |
|
7. Computational Experiments
In this section, we show the results of computational experiments obtained after applying the ADA algorithm to a set of 2S-SPLPO (complete preferences) with two scenarios. Problems with one scenario were taken from [
27], which are based on the Beasley OR-Library [
28]. The second scenario was randomly generated. In order to simulate data and solve instances of the 2S-SPLPO, we use Algorithm 7 which sensorizes the database.
The experiments were carried out on a PC with Intel® Xeon® 3.40 GHz processor and 16 GB of RAM under a Windows® 10 operating system. All problems were solved with FICO XPRESS® version 8.0.
Table 1 shows the parameter settings for all steps of the ADA algorithm, which were set after an extensive experimental process. We observe that there is a positive relation between the number of iterations of the SGM and the size of the problem, with
sgm_iter being set at a value greater than or equal to
n. Due to the difficult to solve
) in the DAM,
sgm_iter was set with a value less than a
of
n.
The names of the problems in
Table 2 and
Table 3 use the nomenclature as given in the following example:
where
[10a]: The second scenario is
different from the first scenario
;
[7550]:
and
; and
[2]: Problem
2 with first scenario
. In the literature on the topic (for example, in p. 145,
Section 4 of [
27]), the instances shown in
Table 3 are considered large. Note that, although the times in many cases increase as the size of the problem increases, in some cases, this behavior is not detected. For example, in
Table 3, the ADA algorithm solves the problem
100b10075_1 in 3531 seconds. However, in a larger instance, such as
100a150100_1, the time is less (1253 seconds). Since the ADA algorithm is a heuristic, the running times can vary, even on the same problem, due to the random component that the VFH has. Therefore, we cannot affirm the exponential growth of the running times.
The results show that the ADA algorithm performs very well in all cases. Indeed, the optimum was found in most of them. Note that when the percentage of changed preferences was
(first six instances in
Table 2), the ADA algorithm was not able to improve the XPRESS times. Nevertheless, in the remaining cases, the times were improved considerably with a couple of exceptions. From
Table 3, observe that the ADA algorithm attains the optimal solution in most of the cases in much less time, with the exception of the problem
100c150100_1, where the time was greater than that obtained by the XPRESS software. The last of the four groups corresponds to the same problems in the third group of
Table 3 (
,
). Nonetheless, they were tested with a different parameter
. For all these six cases, the ADA algorithm improves the running times and bounds. The column headers in
Table 2 and
Table 3 mean the following:
Prob: Name of the problem.
Opt: Optimal value of the problem.
LP(P): Linear relaxation value for a problem (P).
GAP: Relative gap between Opt and LP of a problem by using the XPRESS software.
t: Time in seconds
H2Sub: Best upper bound with the H2S.
y: Number of opened facilities.
SGMlb: Lower bound with the SGM.
ADAub: Best upper bound with the ADA algorithm.
DAMlb: Lower bound with the DAM (without the VFH).
GAP: Relative gap between the best upper bound and lower bound of a problem by using the DAM with VHF.
Tt: Total time in seconds.
imp t:
GAP: .