1. Introduction
The theoretical solution to the optimal filtering problem of stochastic differential system states with the development of its effective numerical solution is a traditional field of extensive research [
1,
2,
3,
4]. The online calculation of the high precise estimates of the system state given the noisy indirect observations is in demand for navigation [
5,
6,
7], telecommunications [
8,
9,
10], finance [
11,
12,
13,
14,
15], processing in micro- [
16] and nano-structures [
17], and many other purposes [
18,
19,
20,
21,
22,
23,
24]. However, advanced filtering algorithms play a crucial role in the synthesis of the control in the case of incomplete information [
25,
26,
27,
28,
29]. The spectrum of such applications is rather broad [
30,
31,
32,
33].
When the optimal control depends on the available observation through the optimal estimate, fulfillment of
the separation theorem is presumed [
34,
35]. The corresponding “good lucks” are not numerous and primarily include the case of
the Linear Quadratic Gaussian (LQG) systems [
36]. When the separation theorem fails for some reason, one can use the so-called “separation principle”, i.e., find the conditional optimum within the class of strategies, which are the functions of the state estimate, not of the original observations. So, the availability of a high precise filtering estimate is a significant factor for the successful solution to numerous control problems under incomplete information.
As is known, the optimal in the mean-square sense state estimate coincides with a conditional mathematical expectation of the state given the available observations. The estimates, which are optimal in the sense of criteria other than mean-square one, are the functionals of the corresponding conditional distribution. This, in turn, is a solution to the Kushner–Stratonovich or Zakai partial
stochastic differential equations (SDEs) [
29,
37,
38,
39,
40]. There are only a few cases when the conditional distribution is a solution to some finite-dimensional closed system of SDSs. The Kalman–Bucy filter [
41,
42], the Wonham one [
2,
25], and some particular cases [
43] belong to this class. Development of the stable numerical realizations of the optimal filters is nontrivial even for the linear Gaussian observation systems [
44].
The aim of this paper is an investigation of how the filtering performance impacts the quality of the subsequent optimal control with incomplete information. As a testbed, we choose the stabilization of a linear stochastic differential system affected by some statistically uncertain external disturbance. This represents a
continuous-time Markov chain (CTMC). The theoretical solution to the stabilization problem is presented in [
33], and the separation theorem is proved. The optimal control strategy needs the optimal estimate, and the last, in turn, is determined by the Wonham filter. The numerical realization of the filter is nontrivial: the classical numerical schemes such as the Euler–Maruyama one [
45,
46] are unstable in this case. The point is that they do not provide non-negativity and normalization properties for the estimate approximations. The authors of [
47] suggest a new concept of the stable numerical approximations of the filtering estimates. It is left to solve the Wonham filter SDE numerically for direct development of the optimal filtering algorithm by the observations discretized by time. The corresponding estimate is calculated recursively by a variant of the Bayes formula. The formula contains the integrals, which are the shift-scale mixtures of Gaussians, and one cannot calculate them analytically. The author of [
48] suggests various numerical schemes of their calculation and the accuracy characteristics. This paper presents a comparison of the numerical filtering schemas in light of their influence on the performance of the subsequent stabilization.
The paper is organized as follows.
Section 2 presents the investigated control system affected by the external statistically uncertain piece-wise constant Markov disturbances. We state the problem of optimal stabilization and present the equations describing the optimal stabilization strategy. The separation theorem is valid for the system, so the optimal strategy depends on the observation via the optimal filtering state estimate.
Section 3 introduces the mechanical system, an overhead crane, which serves as a prototype of the controlled stochastic system used for the comparative numerical analysis. We present the general properties of the system, its evolution with and without external disturbances, and the influence of the criterion coefficients on the control character.
Section 4 is devoted to the numerical realizations of the Wonham filter. Based on the optimal filtering estimates of the CTMC given the continuous-time observations, we introduce the numerical schemas of its approximation of the order
, 1, and 2 given the time-discretized observations.
Section 5 plays a vital role in the presentation. It contains the results of the numerical experiments, illustrating the influence of the estimation stability on the final performance of the proposed control. We compare the suggested schemas with the classical Euler–Maruyama one.
Section 5.1,
Section 5.2 and
Section 5.3 present the results of the stabilization of the state in the case of the stable, semistable, and unstable control system. All the examples confirm the superiority of the proposed schemas of numerical filtering for subsequent stabilization purposes.
Section 6 contains concluding remarks.
2. Problem Formulation and Optimal Control Equations
Below, we briefly present the control problem and its theoretical solution (see [
33] for details).
On the canonical space with filtration
,
let us consider the linear SDE
Here
is a controllable output stochastic process of the second order;
is an uncertain stochastic input which represents a CTMC with the finite-state space of the unit vectors in the Euclidean space , with the transition intensity matrix and initial distribution ;
is a standard Wiener process: are mutually independent;
is a control, which is a process with a finite second moment;
are known nonrandom matrix-valued functions.
The admissible control
represents any function of the output
. The optimal control
minimizes the objective function
where
are known bounded matrix-valued functions,
is the
x transpose,
is the Euclidean norm.
Under the standard conditions of (
1) and (
2) (piecewise continuity and uniform degeneracy of
and
), the solution to the optimization problem
takes the form
where
where
denotes the optimal trajectory of the state, i.e., the solution to (
1) calculated for the control
.
Above, the functions on the right hand side (RHS) of (
4) and (
5) have the form
where
is an identity matrix of suitable dimensionality. So, the optimal control is a linear function of the system state
and the optimal filtering estimate
of the external input
. Note that
is defined by the Wonham filter Equation (
6) [
2,
25]. Moreover, Equation (
6) determines the conditional mathematical expectation (CME)
for any control
other than the optimal one
. Here, we denote the natural filtration generated by the state process
, governed by the control
u as
,
.
The above relations allow construction of the following stabilization algorithm
to solve numerically ordinary differential Equations (
4) and (
5) (any stable numerical scheme is valid for this);
to solve numerically SDE (
6), defining the Wonham filter.
The separation theorem is valid in control of a Linear Quadratic Gaussian (LQG) system under incomplete information. The optimal control coincides with its analog under complete information, when the unobservable system state is replaced by its Kalman–Bucy filtering estimate [
49]. The realization of the optimal control in the LQG case represents the numerical solution to a couple of the Riccati ODEs and a system of linear SDEs. The numerical solution of the Riccati equation can be effectively performed, for example, by the Runge–Cutta scheme, and the realization of the Kalman–Bucy filter is also well developed.
In contrast with the LQG case, the considered optimal stochastic control problem is challenging exactly in the numerical realization stage. The separation theorem is valid for the class of the investigated systems, and the optimal estimate represents the solution to the system of
nonlinear SDEs. It is well known that the CME of the MJP is its conditional distribution, so the trajectories satisfy the non-negativity and normalization conditions. Visually, the instability of a numerical scheme of the Wonham filter realization looks as follows: once an estimate approximation violates the non-negativity, the subsequent trajectory of the filter “blows up” in a few steps, which makes it impossible to synthesize the control strategy. We knew this feature, so in the numerical example of [
33] we chose one of the stable numerical schemes [
43,
47] of the Wonham filter. The point was that the example played an illustrative role only. The article can be treated as the second part of [
33]: it presents the way for the practical realization of the stated optimal stochastic control problem.
3. Performance Analysis of Mechanical Actuator
Initially, for the comparative numerical study, we choose a controlled overhead crane model. The crane trolley with a hoist travels along the H-beam. The loaded trolley has significant inertia. The goal is to place it above one of the locations (for example, the railway track). The set of possible locations is known and fixed. The trolley state includes the current coordinate
and velocity
on the beam. The stabilization system includes a feedback velocity control block, actuated by the couple “coordinate–velocity”. In addition, the system also includes some external unobservable input
(the signal, specifying the number of the requested railway track) and the control
itself. So, the overhead crane model has the form
The external governing signal represents a homogeneous CTMC with the rransition intensity matrix (TIM) and initial condition . The constant scalars and row vector are known, and is a standard Wiener process. The term simulates both an inner inaccuracy of the actuator and the random disturbances contaminating the governing signal. The initial condition is a two-dimensional standard Gaussian vector.
One can see that the system (
7) is stable, when
and
, because
b and
are eigenvalues of the system matrix
.
Initially, we choose the following “basic” values for the system and control problem:
The vector forms the coordinates of the railway tracks, served by the crane, while the process represents the desired “ideal” piece-wise continuous movement of the trolley between the tracks.
We illustrate the crane’s functioning in three stages. To do this, we have to vary the parameters above to highlight various features of the estimation and/or control algorithms.
First, we consider the system without both the external governing signal and control . Second, we model the system with CTMC input without the control . Third, we consider the system under the action of both the input and control .
We solve (
7) numerically via the Euler–Maruyama scheme with the time increments
; meanwhile, we model the CTMC
with the smaller time step, namely
.
At the first stage,
Figure 1 contains typical paths of the coordinate
and velocity
of the crane trolley in the case where the external input
is absent.
The figure confirms the stable character of the system.
Second, we consider the system with the parameters (
8), keeping
. The simulation is performed for the CTMCs with the two variants of the TIM:
and
The first TIM provides a long staying in the initial state for the CTMC, while the second TIM generates the rapid transitions , and the long staying in the last state.
The input CTMC influences the output via the process , where represent possible drift accelerations.
Figure 2 contains typical paths of the coordinate and velocity of the crane trolley affected by the CTMC with the corresponding TIMs compared with the desired switching
.
One can see some stabilization of the real trolley coordinate near the “ideal” position without additional control , but the stabilization performance looks poor.
Third, the stabilization could be made more effective by the control
, which minimizes the functional
where
characterizes the control unit cost.
Figure 3 demonstrates typical paths of the coordinate and velocity of the crane trolley with and without control
. The calculations are performed for the parameter set (
8): left subplots correspond to the the controlled case with
, and right subplots stand for the uncontrolled case
.
Comparing the results demonstrated in
Figure 1,
Figure 2 and
Figure 3, it can be seen that the control remarkably improves the stabilization performance. Without
the system only “declares an intension” to pursue the current stabilization position
. By contrast, the assisting control
admits a more sharp reaction to the external drift change.
Fourth, we consider the system with the parameter set (
8), reducing the value
R, namely
. The example demonstrates a potential stabilization performance in the case of low control cost.
Figure 4 illustrates the high performance of the coordinate stabilization and high amplitude oscillating character of the corresponding velocity under the optimal control
.
Note that the value
R of the unit control cost does not affect the filtering performance of the external input
y and is valuable only in the control procedure. Let us explain this fact from the physical point of view. The trolley has a positive mass. According to Newton’s second law, the faster the actuator must react to the external governing signal, the higher the force applied. However, each resource has its own cost. The coefficient
R sets a regulated trade-off between the inaccuracy of the actuator governance to the external input
and the total expenses for the control assistance
. We can track variation of
R even visually. When
R is small enough, the actuator follows the external signal relatively accurately, whereas the applied assisting control demonstrates high values and oscillating nature (see, e.g.,
Figure 3 (left)). The high unit cost
R leads to small assisting control
values and low quality concerning how the trolley coordinate follows the input
.
Figure 3 (right) can serve as an illustration of this fact.
The provided calculations disclose the main issue of the control realization. The point is that the Euler–Maruyama scheme for the numerical solution to the SDE demonstrates an instability, which leads to divergence of some estimate trajectories. Regardless of the step , almost each trajectory contains the points with violated conditions concerning the non-negativity or normalization . The number of such points varies from single to several thousands, which leads to the filtering “blowing up” and inability of the control synthesis. The more such defective estimates there are in the trajectory, the more likely the trajectory diverges rapidly.
If the defects appear rarely, they can be neutralized by a heuristic modification of the Euler–Maruyama scheme. Once the filtering estimate violates either non-negativity or normalization conditions
, it is replaced by the CTMC stationary distribution
, or by the estimation at the previous time instant
. These modifications help in some cases but are not a panacea, and this fact is shown in
Section 5. Further in the presentation we refer to the introduced scheme modifications as
and
, respectively.
The next section presents a stable realization of the Wonham filter [
43,
47] required for the synthesis of the optimal stabilization strategy
.
4. Stable Filtering Algorithms by Discretized Observations
We realize both the filtering and stabilization algorithms with the same constant time increment , such that , and is an integer.
Furthermore, without loss of generality, we suppose for model (
1)
and
over the discretization intervals
; otherwise, the functions
and
should be approximated by suitable piece-wise constant functions.
Let us consider
, which represents a non-anticipated transformation of the observable input
:
The process
does not depend on the control
, and due to the identity
the equality
holds.
Since the numerical realization of the control strategy
presumes its action at the time instants
, we have to estimate the Markovial drift
at the same moments. We use the transformed observations
discretized by time with the step
:
The discretized observations generate the new family of
algebras
If
is a random vector composed of the occupation times of
in each state during the interval
, and
is a Gaussian probability density function with the mean
m and variance
, then the estimate
can be calculated by the following recursive procedure [
43,
47]
where
is a vector formed by units, and the matrix
consists of the random values
The CMEs
represent the integrals—the shift/scale mixtures of some Gaussians with the distribution of
as the mixing one. The issue is that this distribution is not continuous with respect to the Lebesgue measure, so the integral (
11) cannot be calculated analytically: we have to do it numerically. In this paper, we compare the following numerical schemes for the calculation of
:
which is a scheme of the “left” rectangles with the accuracy order
,
which is a scheme of the “midpoint” rectangles with the accuracy order 1,
which is a Gaussian quadrature scheme with the accuracy order
2 (here and below
stands for the Kronecker symbol).
In the system under investigation since , and this fact simplifies calculations significantly.
For the comparative study we choose the approximations
,
, and
, calculated by (
10) and (
11) and the schemes
,
, and
, respectively.
5. Comparative Numerical Study
Through a series of numerical experiments, we try to answer the question of how the accuracy of the filtering approximation affects the total performance of the control of a linear system (
1) or, more precisely (
7). The first set of tests is devoted to the control of the physical system introduced in
Section 3. The investigation object of the second set represents a semi-stable linear system. The third set returns to a consideration of the trolley functioning under various complications. Finally, the fourth set is directed to the unstable linear system control.
We perform all experiments with the time increments . On the one hand, the step is inversely proportional to the computational costs of the input filtering. On the other hand, the decrease of improves the filtering accuracy and raises the control performance. Hence, the value of is a trade-off between the computational costs and reasonable losses of the control quality. However, in some applied problems, the step is constrained from below by the frequency of the actual measuring sensors.
We simulate 1000 random trajectories of the Wiener process and CTMC , which are the same for all experiments. Then we synthesize the optimal stabilization strategy by use of various approximations of the Wonham filter estimates and various time steps . Finally, the control performance index is a result of the Monte-Carlo sampling over the simulated trajectories.
We present the obtained results both in the tables and figures. For convenient analysis, the headers of all tables contain the system parameters, and we mark the important results in bold font. To characterize the quality of the estimates
, , , , we calculate the values
where
denotes averaging over the simulated trajectories.
In the case of the Euler–Maruyama scheme divergence for
, we use its stable versions
and
, suggested in
Section 3.
To analyse the stabilization quality, we calculate the performance functionals
where
, , , are optimal control strategies calculated with the use of the estimates
, , , respectively, and
5.1. Stable System
We perform the first numerical experiment with the trolley model, which has the parameters (
8), to compare the modified Euler–Maruyama scheme and the stable ones (
10) and (
11). The calculation results are accumulated in
Table 1.
We do not supply the example with an extra illustration, because
Figure 3 can serve this purpose. The results of the first experiment can lead to the following conclusions.
The utilization of the suggested numerical filtering schemes, based on the time discretization of the observations, provides stability to the filtering procedure for all experimental conditions. The superiority is obvious for large steps . The offered stable modifications of the Euler–Maruyama scheme demonstrate the expectable behavior. There are many “divergent” filtering trajectories; they need correction for large , so both modifications provide a low filtering quality. When decreases, the number of the “divergent” trajectories also decreases, and both modifications provide the acceptable filtering quality. They lose to the suggested stable schemes. When decreases, the probability that the “divergent” filtering trajectory vanishes too and the quality of all considered numerical filtering schemes are identical.
In general, one “divergent” trajectory is enough to make the value arbitrarily huge. The absence of such trajectories in the sample of 1000 trajectories for does not guarantee their absence in the large samples. We detect the “divergent” trajectories in the samples of the size . Note that the modified Euler–Maruyama schemes provide the acceptable filtering quality in the case of the small values.
On the set of 1000 simulated trajectories, all suggested stable numerical schemes demonstrate very close performance.
The absolute values of all and are small enough. The reason might be the small noise intensity . The calculation error and moderate sample size seem to make the main contribution to the performance index comparing with the theoretical properties of all considered estimates.
The second numerical experiment differs from the first one only in that the parameter
g value is increased by 10 times:
. The calculation results are collected in
Table 2.
The results of the second experiment can lead to the following conclusions.
There are no divergent filtering trajectories among the simulated sample for any considered step . This does not imply their absence in the samples of the greater size; however, the modifications and give an opportunity to neutralize divergence without significant loss of the estimate accuracy.
All estimates have moderate quality because of relatively high noise intensity g. We can conclude that under a low signal–noise ratio the performance of all considered estimates becomes similar for any considered step .
The numerical experiments above indicate that the reasonability of the stable numerical filtering scheme usage is questionable. To highlight their utility, we have to involve the models with more delicate examples, which help to expose the value of the stable numerical schemes.
5.2. Semi-Stable System
We impair the model quality and consider a semi-stable dynamic system. Let us remind the reader that trolley model (
7) keeps its stability for any
. The equalities
determine the stability bound of the system, and we investigate this case. Obviously, the formula
for the stable states is not valid in this situation. We also make the following changes to the parameters:
,
,
. The calculation results are collected in
Table 3. The results of the third experiment can lead to the following conclusions.
The proposed stabilization strategy demonstrates good performance even in the semi-stable case.
The proposed stable numerical schemes of the state filtering demonstrate remarkable superiority compared with the Euler–Maruyama scheme and its modifications. This situation can be seen for all considered time steps .
There were five divergent filtering trajectories in the simulated sample, which did not affect the filtering performance dramatically in the case . Nevertheless, even the fact of their existence justifies the usage of the stable filtering scheme.
The experiment demonstrates no significant difference in the performance of the stable numerical schemes , and .
Figure 5 contains typical state trajectories for the semi-stable case, governed by the same control law
, but calculated by the various drift estimates
,
and control law
.
Figure 6 contains the uncontrolled variant of the same trajectory.
The numerical experiments with the chosen parameters make it possibile to expose the behaviour of the offered heuristic modernizations of the Euler–Maruyama scheme
and
in comparison with the stable scheme
, when the original Euler–Maruyama scheme diverges. For the case
,
Figure 7 makes it possibile to compare the estimates
,
,
with the true values
visually.
Figure 8 contains corresponding state trajectories governed by the controls
,
and
.
The numerical experiments with the semi-stable system allow us to conclude that the proposed stable filtering estimates are very effective for stabilization control synthesis. The differences between the proposed schemes are still minor in the semi-stable case. The subsection below discusses the situation when the choice of the stable numerical scheme makes some sense.
5.3. Stable System with High-Frequency Changing Drift
The numerical experiments of the last subsection demonstrate that the modified estimates
and
do not “blow up” at any trajectory. Nevertheless, in some cases, they “lose” the CTMC
, as we can see in
Figure 7. Hence, we hold for the comparative study the original Euler–Maruyama scheme, keeping in mind the possibility of its “blowing up” divergence.
The experiments above expose several realistic cases, for which the proposed stable numerical estimates demonstrate remarkable superiority for the stabilization strategy synthesis compared with the Euler–Maruyama scheme and its modifications. Nevertheless, we still do not present a situation when the suggested stable schemes deliver different estimation quality. To do this, we complicate the conditions of the trolley functioning. Namely, we increase the intensity of the CTMC transitions. We do this to confirm the hypothesis that the proposed schemes show different precision when the probability that the CTMC
has more than one transition during
is significant. To meet this condition, we have to increase the elements of TIM
. The frequent input signal transitions lead to more “active” control. If the control cost
R is rather high, then the optimal control is minor and does not affect the system state. Hence, in this experiment, we set
, as in
Section 3 (see
Figure 4).
Table 4 and
Table 5 contain the results of the numerical experiments, performed with the transition intensities 10 and 50, respectively.
The results of
Table 4 correspond to expectations. For
and
, the quality of estimates of discretized filters is built in accordance with their theoretical convergence rate. It should be noted that the distinction is quite small and can be seen only in the third or fourth significant digit. The case with
can already be considered marginal. Assuming that the difference may show an upward trend with an increase in the intensity of the jumps, we will continue to complicate the operating conditions of the actuator.
Figure 9 illustrates the results corresponding to the transition intensities 10. It contains the typical state trajectory governed by the strategy
, indistinguishable from
and
. We compare it with the trajectory governed by the “ideal” control
synthesized with the complete information about the input
[
50]. The figure shows that the potential enhancement of the stabilization performance is possible only through the increasing of the filtering accuracy.
The plots corresponding to the system with the transition intensities 50 are given in
Figure 10.
The calculation presented in
Table 5 meets the expected results to the same extent as the previous one. For
, the distinction in the quality of the estimates of the discretized filters and the corresponding stabilization strategies can be seen in the second or third significant digit. For
, there is the same difference and the same hierarchy of algorithm quality, but already by the third or fourth significant digit. This means that the order of the transition intensities of the input
for such sampling steps provides quite a lot of implementations of more than one jump in the interval
, which gives an advantage to higher-order filters. Having achieved this result, it should be noted that according to
Figure 10, the effectiveness of the stabilization strategy itself, even in the case of complete information about the state of the input
, is extremely low. This is the result of the deterioration of the operating conditions of the actuator to completely unrealistic parameters. The slight superiority of the Euler–Maruyama scheme
in the case
is illusory: in fact, the sample of the trajectories does not contain the divergent ones.
5.4. Influence of MJP Dimensionality on Control
Performance
The dimensionality of the CTMC state space can also indirectly affect the filtering and stabilization performance. Let us repeat the experiments of the last subsection but increase the CTMC dimensionality:
. As in the subsection above, we consider values of the transition rate: 10 and 50. The numerical results of the experiments are given in
Table 6 and
Table 7, respectively.
The obtained results emphasize the necessity of using the stable numerical filtering schemes: even for the minimal time step , the Euler–Maruyama approximation of the Wonham filter “blows up”. The differences between the various stable numerical filtering schemes are still minor.
5.5. Unstable System
To analyze the bundle “stable numerical filtering scheme–optimal stabilization strategy” exhaustively, we finally omit the requirement that system (
7) be stable. This means that inequalities
and
are not valid. The numerical experiments in this subsection have no physical sense; they just illustrate the applicability of the stabilization strategy calculated with the aid of the stable numerical filtering scheme.
Note that the linearity of (
7) admits the control synthesis without requiring system stability. If
is an optimal strategy for the stable system with
and
, then
is optimal for the unstable system with “the mirror parameters”
and
, and the minimal value of the criterion remains the same as for the stable system (see
Table 1,
Table 2 and
Table 3). Hence, we have to consider the case when the parameters
a and
b have different signs. We perform a numerical experiment with the parameters
,
,
; the rest of parameters coincide with ones in (
8).
Table 8 contains the filtering and stabilization results for various time-discretization steps.
Figure 13 contains the typical system trajectories both in the controlled and uncontrolled cases.
The calculations confirm the effectiveness of the proposed stabilization strategy in the case of an unstable system. When the system is uncontrolled, its state diverges rapidly: the absolute values of the state attain – at the termination moment . The Euler–Maruyama scheme is also unstable in this system. All the proposed stable schemes of numerical filtering demonstrate the same performance.