1. Introduction
The analysis and optimization of systems of coupled Markov Chains (MC) appear in various applied areas such as environmental management [
1,
2], control of dams [
3,
4,
5], control of data transmission systems for the avoidance of congestion [
6,
7,
8], and many others. In general, the optimal control of MC with constraints and various criteria leads to the solution of a system of ordinary differential equations. However, the coupled MC produces serious difficulties due to a dramatic increase in the number of states, which renders impossible the numerical solution of the optimization problem with the aid of standard computers in a reasonable time [
9]. Moreover, the presence of constraints very often leads to rather complicated and cumbersome mathematical problems.
The evolution of single controlled MC may be effectively represented via a vector-valued function satisfying stochastic differential equations with a control dependent generator [
10]. If a set of MCs shares common control parameters, they no longer can be analyzed separately, since the system becomes coupled. Approaching the coupled system as a whole leads to the necessity of the consideration of a global system state, which incorporates the states of all individual MCs involved. Joining together the states of all MCs as a set of vectors (perhaps of different dimensions) leads to the description of their evolution in tensor form. The next step of the control synthesis for this coupled system is the derivation of the optimality conditions in the form of a special dynamic programming equation (DPE). The demand for accuracy leads, however, to the extension of MCs’ state space, and by that, to the increase of the dimensionality of the DPE. At the same time, the numerical solution of DPE allows the parallelization of the most time-consuming operations, and this is one of the advantages of the tensor representation. Here we give an algorithm for DPE derivation, which is the principal step for its numerical solution. It should be noted that this paper follows the previously published works [
6,
9,
11] and provides a proper mathematical justification for the results presented there.
As an example, we consider the management of coupled dams under non-stationary seasonally changing random inflows/outflows. The presentation of a dam’s state as a continuous-time MC permits one to take into account the random character of the incoming and outgoing water flow due to rain, evaporation, and, of course, customer demands. However, sometimes it is necessary to organize the interflow between the different parts of a system, such as controlled flow from one dam to another or controlled release, to avoid overflow. In all these cases, it is necessary to consider the system as a whole, that is, the global state of the dam system. The current water level of each dam is described by the state of a continuous-time MC, hence the state of the whole system of the coupled dams is represented in tensor form. The connection of MCs is a result of the controlled flow between the dams. The control aims to maintain the required water levels in given weather conditions, and at the same time, to satisfy the customer demands. The general approach is based on the solution of DPE in tensor form. This equation may be reduced to a system of ordinary differential equations. We suggest here the procedure for the generation of this system and also the approach to the minimization in its right-hand side (RHS), which may be realized for each state of the coupled MC independently.
The structure of the paper is as follows: in
Section 2 we describe the model of the controlled MC system. The martingale representation and stochastic evolution of the global state of coupled MC is given in
Section 3. The optimal control and Kolmogorov equation for the distribution of states of the controlled coupled system of MC is in
Section 4. A numerical example motivated by the Goulburn River water cascade is presented in
Section 5.
2. Model of Controlled Markov Chains System
In this section we give the definition of the model of the controlled coupled MCs. Let us consider a set of M MCs. The m-th MC has possible states and a time- and control-dependent generator , :
, for ;
.
The controlparameter , where U is a compact set in some complete metric space, and the matrix valued functions , are assumed to be continuous in both parameters , where . The control parameter u is shared by all the MCs, and hence defines their interconnection.
2.1. Global State
Let the local state space of the
m-th MC be given by a set of standard basis vectors
, then its dynamics can be described by the following stochastic differential equation [
10]:
where
is the initial state of the
m-th MC and the process
is a square integrable martingale with bounded quadratic variation. We assume that the martingales
and
are independent for
, and hence the MCs do not experience simultaneous transitions.
The first approach to the controlled coupled MCs was presented in References [
6,
9,
11], where the global system state was described as a tensor product of the individual MCs’ states:
,
, where × denotes the Cartesian product (the definition of tensor product can be found in
Appendix A). For the combination of local states
, the global state
can be represented as a multidimensional array of shape
with single non-zero element
. This representation allows us to significantly simplify the expressions involved in the DPE and makes the implementation of the control optimization algorithms rather straightforward, especially with programming tools that allow multidimensional array objects (e.g., Matlab, Python NumPy or Julia).
Let the set of admissible controls
be defined as a set of
-predictable controls taking values in
U (here
stands for a natural filtration associated with the stochastic process
). This assumption ensures the measurability of the admissible controls. In other words, if the number of global state transitions of the coupled MC system up to the current time
is
and
is the time of the
k-th jump and
is the history of the
m-th MC on the time interval
(set of state and jump time pairs), then for
the controls
are measurable with respect to
t and the global history
[
9].
2.2. Performance Criterion
Let
be a running cost function defined for the global state
and time
, and
be a terminal condition. Then a general performance criterion to be minimized is given by
Here
is, in fact, a multidimensional array-valued function defined for any possible global state of the coupled MC system
. For example, the running cost function can be represented as
where
denotes the inner product and
is the indicator function. Thus, for given
t and
, the state
selects a single value
from the multidimensional array
. The same is valid for the terminal condition
, which is defined by a multidimensional array
.
Assumption A1. The elements of are bounded from below and are continuous functions on .
5. Numerical Study: Goulburn River Cascade
To illustrate our approach to the modeling of real-world systems as coupled MCs and optimal control synthesis, in this section we present a numerical study based on real-world data. The usual area of applications for MCs and numerous generalizations is service systems, which can be described in terms of Poisson arrival flows, queues, and servers. For example, in References [
6,
17], one can find an application of coupled MCs for congestion control in networks. Specifically, an optimal control strategy was proposed for a multi-homing network connection, where the packets (portions of incoming traffic) could be sent through one of two lines of different speed and price or discarded. However, in the present paper, we stick to another application area, namely, the control of water reservoirs. The reason is that the proposed approach allows successful modeling of these systems, accounting for the stochastic nature of the incoming and outgoing flows, being dependent on the weather condition and users’ behavior. Moreover, the proposed form of the performance criterion (
3) permits a variety of optimization goals, for example, minimization of the difference between the actual and desired demand intensities, maintenance of the balance between the incoming and outgoing flows, minimization of the probability that either dam falls below some critical level on average or at the terminal time [
9]. Besides, this criterion allows penalizing control in certain states, so that natural restrictions (such as the impossibility to transfer water from an empty or to an overflowing reservoir) can also be taken into account. The terminal condition can reflect the desire to bring the system to a certain state by the end of the control interval.
The object of our study is a system of interconnected dams of the Goulburn River cascade. The model is deliberately simplified to avoid cumbersome details, which are unnecessary for a journal on mathematical subjects, nevertheless, it demonstrates all the necessary techniques, such as MCs states’ choice, relations between the discharges and state transition intensities, accounting for the natural inflows and outflows, consumer demands, and environmental constraints.
The Goulburn River Basin is located in the south-western part of State Victoria, Australia. The description of the region and a review of the literature devoted to water resource modeling and planning can be found, for instance, in References [
18,
19,
20]. The review of papers on allocation modeling in this region and water trading can be found in References [
21,
22,
23]. The Goulburn Basin, together with the Upper Murray, Ovens, Kiewa, Broken, Loddon, and Campaspe Basins constitute an area called the Goulburn-Murray Irrigation District (common abbreviation is GMW from Goulburn-Murray Water—an organization governing water resources in this area). Since the Goulburn River is a major waterway and a major contributor of water supply in the district, it is an object of very intensive research and modeling. The GMW uses the Goulburn simulation model (GSM) for the optimization and planning of water supply in the region. The GSM represents the Goulburn system as a set of nodes of different types: storages, demands, and streamflow inputs. These nodes are connected by a network of carriers characterized by their capacities and delivery penalties (penalty functions). The GSM was calibrated using a trial-and-error procedure, which optimizes the model parameters related to the water supply infrastructure [
24]. The model presented here reflects a part of the GSM and involves Lake Eildon—a major storage of water, which can be required by farmers during the irrigation season, and Lake Nagambie with the Goulbourn Weir, which is connected by three major channels to the end-users. The Goulburn River connects the two water storages and continues to run after Nagambie, serving as its fourth outflow. Based on the request of irrigators, the water is released from Lake Eildon to Lake Nagambie, and then by one of three channels or the river delivered to the farmers (water right holders). The important point is that apart from irrigation requests, the system must satisfy environmental demand, which means the levels in rivers and channels should exceed some minimal threshold that will allow the support of healthy ecosystems and provide some required minimal inflows to satisfy downstream demands.
The scheme of the Goulburn River cascade is presented in
Figure 1. We assume that the water level of Lake Eildon is affected by two major flows. One is the incoming flow, which reflects the sum of all the natural water arrivals and losses, including precipitation, upstream tributaries, evaporation, and other causes. The single outcoming flow is the controlled discharge of the Goulburn River, which also serves as the single incoming flow of Lake Nagambie (other natural inflows and outflows into this lake are ignored here due to a significant difference in the size of the two lakes). From Lake Nagambie, there are four controlled discharges, namely the Stuart Murray Canal, Cattanach Canal, East Goulburn Main Channel, and the Goulburn River.
5.1. Mcs and Generators
The states of the water reservoirs indicate their water levels, expressed in equal portions of their volume. So for Lake Eildon with a total volume equal to = 3,390,000 ML, the bounds of the states are , where and is the desired number of states, which is chosen according to accuracy demands. The same state division is considered for Lake Nagambie with volume = 25,500 ML: the bounds are , , — desired number of states for this reservoir. Association of the states with water volumes instead of the water surface levels leads to the same dependency of the transitions on the incoming and outcoming flows and hence simplifies the resulting model.
We model the reservoirs as MCs, so the evolution of the reservoirs’ states in time is described by the stochastic differential Equation (
1), where
and
and
stand for Lakes Eildon and Nagambie respectively. Due to natural reasons, the transitions are only possible between adjacent states, so the generators
reflect a birth-and-death processes:
where
is the intensity of the incoming flow of natural water arrivals and losses, and
is the controlled intensity, which reflects the flow from Lake Eildon to Lake Nagambie. The variables
and
have a purely probabilistic sense, which has to be translated into the common units of water discharge measurement, say
. Here the basic assumption is the equality of the average transition times, which for the constant rate
and constant discharge
u are equal to
from the one side, and to
from the other. Here
v is the volume of water, which corresponds to the difference between the levels, for example, for Lake Eildon,
. Thus, in terms of the discharges, the generator
can be represented as follows:
where
is the discharge from Lake Eildon, and
is the sum of all the natural inflows expressed in common units
.
Denote the controlled discharges from Lake Nagambie by
,
,
,
, which stand respectively for the discharges of the Stuart Murray, Cattanach, East Goulburn Main Channels, and the Goulburn River, and the sum of these discharges as
. Since the outflow of Lake Eildon is at the same time the incoming flow of Lake Nagambie, we have the following representation of the generator
:
It should be noted that the capacity of Lake Eildon is more than 100 times larger than the capacity of Lake Nagambie. This causes certain problems for the numerical simulation since the same discharge values affect the transition rates in different scales. To overcome this issue, one can normalize the transition rates by adjusting the number of states, providing . Another way is to scale down the time discretization step, ensuring adequate probabilities of state transitions for the MC, which corresponds to Lake Nagambie.
5.2. Performance Criterion and Constraints
The performance criterion should take into account the customer demands and ensure the balance of the water reservoirs inflows and outflows. We assume that there are some reference values for the demands in the channels and the Goulburn River:
,
,
,
, which an optimal control aims to satisfy. Let
where
i and
j stand for the state number of Lakes Eildon and Nagambie respectively,
C is some large number, and
is an indicator function.
The running cost function (
4) with the state-related cost given by (
18) reflects the mentioned aim along with the intention to balance the arriving and outgoing flows of Lake Nagambie. Plus, this running cost penalizes the nonzero discharges from both lakes, when they are in the states
,
, which corresponds to the lowest possible level, or drought.
The terminal condition
in (
3) is defined by matrix
with elements
, which correspond to the combination of states
and
. Let
be the desired terminal state of the system, then define the terminal condition as follows:
where
is the
norm (sum of absolute values). This condition penalizes any deviation of the system’s state at the end of the control interval
T from the desired and, moreover, it assigns a larger penalty for greater deviations.
For all the control variables we also define phase constraints in the form of upper and lower bounds:
These constraints bound the maximal possible discharges for the channels and the Goulburn River and guarantee the environmental demand, mentioned earlier.
5.3. Parameters Definition
In previous subsections, the Goulburn River cascade model was defined in general. Nevertheless, there still are some undefined parameters, which will be specified using real-world data. To define the reference demand values
,
,
,
, we use the data on the discharges of the corresponding channels and river available from the Water Measurement Information System of the Department of Environment, Land, Water & Planning of Victoria State Government [
25]. The data on the daily discharges was grouped by the day of year, and for each group, the mean value was calculated. The resulting time series were approximated by functions
where
days and coefficients
,
are chosen using the Least-Squares method. The resulting functions are smooth and have a period equal to one year, which makes them suitable for the reference values of the irrigation demands in the proposed model. The time series of the average daily discharges and the approximating functions are presented in
Figure 2.
The data for the Lake Eildon incoming flow is obtained from the Climate Data Online service of the Bureau of Meteorology of the Australian Government [
26]. The rainfall data is collected at the weather station situated there. This data is averaged on a daily basis and approximated by a smooth periodic function in the same way as the daily discharges. The resulting function serves as an incoming flow pattern
. For this model, we assume that the irrigation demands agree with the total available natural resources, so the incoming flow pattern is normalized to make the incoming flow
satisfy the following condition:
The resulting flow approximation along with normalized rainfall averages is presented in
Figure 3.
Specifying of the upper and lower bounds for the control parameters
and
by maximum and minimum registered values of the corresponding discharges time series obtained from Reference [
25] finalizes the definition of our Goulburn River cascade model.
5.4. Simulation Results
The complete definition of the control optimization problem is given by
state Equation (
1) with generators (
16) and (
17);
optimization criterion (
3) with running cost (
18) and terminal condition (
19);
incoming flow
, reference values of the irrigation demands
,
,
,
and upper and lower bounds for the control parameters defined in
Section 5.3.
Now the optimal Markov control (
13) can be calculated as a solution to the system of ODEs (
12). Remember that the minimization in the RHS of the ODEs can be done independently and hence can be effectively parallelized. The system (
12) was solved numerically using the Euler method with the discretization step
, while the time scale and all the necessary functions were normalized so that
. The number of MCs’ states for both lakes was chosen as
, and the desired terminal state
.
In
Figure 4 and
Figure 5, the optimal control values are presented for the states
and
. One can see the nontrivial behavior of the optimal discharges, which differs from the reference values even for the desired terminal state. The evolution of the value function components
exhibits the influence both from the integral part of the criterion and the terminal condition. Indeed, for
it monotonically decreases since the demands here are mostly satisfied and the system is already in the desired state. For the state
, on the contrary, there are intervals of monotonic growth: the first is because of insufficient supplies in late summer and autumn months, and the second starts in spring and shows the necessity to transit to the desired terminal state
.
The Kolmogorov Equation (
15) allows us to calculate the probabilities of the global system states given the initial state or distribution. The probabilities for states
and
are presented in
Figure 4 and
Figure 5 on the upper subplots. The initial state here was chosen to be equal to the terminal
. The solid red lines present the solution of (
15), and the dotted lines are the result of Monte-Carlo sampling: the state rate was estimated using a set of 100 sample paths governed by the optimal control strategy. The state probabilities allow us to also calculate the average states of MCs, which can be converted into the average levels of Lakes Eildon and Nagambie. In
Figure 6, the average levels calculated by means of the Kolmogorov Equation (
15) are presented by solid lines. The corresponding averages calculated using the Monte-Carlo sampling are depicted with dotted lines.
As an alternative to the optimal control
we present here a program control
which aims to fully satisfy the customer demands:
The indicator functions make sure that the natural constraints are satisfied, that is, the water cannot be taken from an empty dam. In
Figure 7, the average levels calculated by means of the Kolmogorov equation and Monte-Carlo sampling are presented. One can observe that this program control leads to overuse of the resources in contrast with the optimal one, which is able to refill the dams after the winter demand growth and rainfall shortage.