This section first introduces the proposed multi-distribution filter, based on Gaussian and student-t distributions, for a single local sensor operating without information interaction, and discusses its inherent limitations; an approximate method is provided to mitigate these shortcomings. Subsequently, a CI-based consensus strategy is proposed for scenarios involving mixed Gaussian and student-t distributions. Finally, building on this consensus strategy and the single-sensor multi-distribution filter, a distributed multi-distribution filtering algorithm is proposed.
3.1. The Multi-Distribution Filter Based on Gaussian Distribution and Student-t Distribution
In this subsection, we present a multi-distribution filter for a single sensor exposed to heavy-tailed process and measurement noises. Exclusive reliance on a student-t filter often results in prolonged performance degradation or necessitates frequent parameter readjustment during normal system operation. Conversely, relying solely on a Gaussian filter tends to cause divergence when system outliers occur. To harness the strengths of both filters, we formulate the following two hypotheses for a single sensor node (the superscript denoting the sensor node is omitted in this subsection):
$\mathcal{H}_1$: Suppose that the process and measurement noises obey the Gaussian distribution as follows:
$$w_k \sim \mathcal{N}(0, Q_k), \qquad v_k \sim \mathcal{N}(0, R_k),$$
where $x \sim \mathcal{N}(m, P)$ denotes that $x$ obeys the Gaussian distribution with mean $m$ and covariance $P$.
$\mathcal{H}_2$: Suppose that the process and measurement noises obey the student-t distribution as follows:
$$w_k \sim \mathrm{St}(0, Q_k, \nu), \qquad v_k \sim \mathrm{St}(0, R_k, \nu),$$
where the probability of $\mathcal{H}_1$ is $\mu_k^1$, the probability of $\mathcal{H}_2$ is $\mu_k^2$, $\mu_k^1 + \mu_k^2 = 1$, and $x \sim \mathrm{St}(m, P, \nu)$ denotes that $x$ obeys the student-t distribution with mean $m$, scale matrix $P$, and degrees of freedom (dof) $\nu$.
Now, we need to assign a filter to each hypothesis. For the Gaussian hypothesis, since the system model is linear, the standard Kalman filter can be used. The steps are as follows: given the initial values $\hat{x}_{0|0}$ and $P_{0|0}$, for each time step $k \geq 1$, the recursive prediction and update process of Equations (9)–(15) is performed.
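For reference, the standard Kalman recursion for a linear model $x_k = F_k x_{k-1} + w_k$, $z_k = H_k x_k + v_k$ takes the following form (the matrix symbols are generic assumptions; the grouping into Equations (9)–(15) may differ):
$$\begin{aligned}
\hat{x}_{k|k-1} &= F_k\, \hat{x}_{k-1|k-1}, &
P_{k|k-1} &= F_k P_{k-1|k-1} F_k^{\mathrm{T}} + Q_k, \\
S_k &= H_k P_{k|k-1} H_k^{\mathrm{T}} + R_k, &
K_k &= P_{k|k-1} H_k^{\mathrm{T}} S_k^{-1}, \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \left( z_k - H_k \hat{x}_{k|k-1} \right), &
P_{k|k} &= P_{k|k-1} - K_k S_k K_k^{\mathrm{T}}.
\end{aligned}$$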
For the student-t hypothesis, we use the student-t filter described as follows: given the initial values $\hat{x}_{0|0}$, $P_{0|0}$, and the dof $\nu_0$, for each time step $k \geq 1$, the recursive process of Equations (16)–(24) is performed.
The detailed derivation of the student-t filter can be found in [25].
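For concreteness, the following minimal Python sketch shows one predict/update step of a student-t filter in the general style of [25]; the function name, model matrices, and exact update form are our illustrative assumptions, not a verbatim transcription of Equations (16)–(24).

```python
import numpy as np

def student_t_update(x, P, nu, z, F, H, Q, R):
    """One predict/update step of a student-t filter (sketch after [25]).

    x, P, nu -- prior mean, scale matrix, and degrees of freedom
    Q, R     -- process and measurement noise scale matrices
    """
    xp = F @ x                          # predicted mean
    Pp = F @ P @ F.T + Q                # predicted scale matrix
    r = z - H @ xp                      # innovation
    S = H @ Pp @ H.T + R                # innovation scale matrix
    K = Pp @ H.T @ np.linalg.inv(S)     # gain
    d2 = r @ np.linalg.solve(S, r)      # squared Mahalanobis distance
    m = len(z)
    x_new = xp + K @ r
    # A large d2 (an outlier) inflates the posterior scale matrix,
    # which automatically discounts the influence of the update.
    P_new = (nu + d2) / (nu + m) * (Pp - K @ S @ K.T)
    nu_new = nu + m                     # the dof grows with every update
    return x_new, P_new, nu_new
```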
Assume that the state posterior obeys a mixture of the Gaussian and student-t distributions with probabilities $\mu_k^1$ and $\mu_k^2$, respectively. According to the total probability theorem,
$$p(x_k \mid Z_k) = \mu_k^1\, \mathcal{N}\!\left(x_k; \hat{x}_{k|k}^1, P_{k|k}^1\right) + \mu_k^2\, \mathrm{St}\!\left(x_k; \hat{x}_{k|k}^2, P_{k|k}^2, \nu_k\right),$$
where $Z_k$ represents the measurement set up to time $k$. In order to obtain the posterior probability density function (PDF), the required parameters are the probabilities $\mu_k^h$ corresponding to the two distributions, the state estimates $\hat{x}_{k|k}^h$, and the matrices $P_{k|k}^h$; for the student-t distribution, the dof $\nu_k$ is also required. Apart from the distribution probabilities $\mu_k^h$, these parameters are obtained from the two parallel filters. Given the distribution probabilities $\mu_{k-1}^h$ of the previous time step, the probability that hypothesis $\mathcal{H}_h$ is correct can be given by
$$\mu_k^h = \frac{\Lambda_k^h\, \mu_{k-1}^h}{\sum_{h'=1}^{2} \Lambda_k^{h'}\, \mu_{k-1}^{h'}}, \qquad h = 1, 2,$$
where $\Lambda_k^h$ is the measurement likelihood of $\mathcal{H}_h$ at time $k$, that is, $\Lambda_k^h = p(z_k \mid \mathcal{H}_h, Z_{k-1})$. For the Gaussian filter,
$$\Lambda_k^1 = \mathcal{N}\!\left(z_k; H_k \hat{x}_{k|k-1}^1, S_k^1\right).$$
For the student-t distribution filter,
$$\Lambda_k^2 = \mathrm{St}\!\left(z_k; H_k \hat{x}_{k|k-1}^2, S_k^2, \nu_{k|k-1}\right).$$
The mean of the mixed posterior distribution is
$$\hat{x}_{k|k} = \mu_k^1\, \hat{x}_{k|k}^1 + \mu_k^2\, \hat{x}_{k|k}^2.$$
The covariance corresponding to the mixed posterior distribution is
$$P_{k|k} = \sum_{h=1}^{2} \mu_k^h \left[ \Sigma_k^h + \left(\hat{x}_{k|k}^h - \hat{x}_{k|k}\right)\left(\hat{x}_{k|k}^h - \hat{x}_{k|k}\right)^{\mathrm{T}} \right],$$
where $\Sigma_k^h$ is the variance of the $h$-th component. For the Gaussian distribution, the variance is
$$\Sigma_k^1 = P_{k|k}^1,$$
and for the student-t distribution, the variance is
$$\Sigma_k^2 = \frac{\nu_k}{\nu_k - 2}\, P_{k|k}^2.$$
Therefore, the expression for the covariance of the mixed posterior distribution can be obtained by
$$P_{k|k} = \mu_k^1 \left[ P_{k|k}^1 + \left(\hat{x}_{k|k}^1 - \hat{x}_{k|k}\right)\left(\hat{x}_{k|k}^1 - \hat{x}_{k|k}\right)^{\mathrm{T}} \right] + \mu_k^2 \left[ \frac{\nu_k}{\nu_k - 2} P_{k|k}^2 + \left(\hat{x}_{k|k}^2 - \hat{x}_{k|k}\right)\left(\hat{x}_{k|k}^2 - \hat{x}_{k|k}\right)^{\mathrm{T}} \right].$$
Thus, we obtain a complete recursive step for the multi-distribution filter.
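To make the moment-matching of the mixture concrete, here is a minimal Python sketch (the function name and array layout are ours) that computes the mixed mean and covariance from the outputs of the two parallel filters:

```python
import numpy as np

def mix_posterior_moments(mu, x_hat, P, nu):
    """Moment-match the Gaussian/student-t posterior mixture (sketch).

    mu    -- (2,) hypothesis probabilities [Gaussian, student-t]
    x_hat -- (2, n) component means; P -- (2, n, n) covariance / scale matrix
    nu    -- student-t dof (must exceed 2 for the variance to exist)
    """
    # Component variances: the student-t scale matrix is inflated by nu/(nu-2).
    sigma = [P[0], nu / (nu - 2.0) * P[1]]
    # Mixture mean: probability-weighted average of the component means.
    x_mix = mu[0] * x_hat[0] + mu[1] * x_hat[1]
    # Mixture covariance: weighted variances plus spread-of-means terms.
    P_mix = np.zeros_like(P[0])
    for h in range(2):
        d = x_hat[h] - x_mix
        P_mix += mu[h] * (sigma[h] + np.outer(d, d))
    return x_mix, P_mix
```

The $\nu/(\nu - 2)$ inflation of the student-t component is exactly the variance expression used in the covariance formula above.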
With each measurement update, the dof increases according to Equation (24). In turn, this also drives up the dof of the noise, making the problem increasingly Gaussian; in fact, the algorithm converges to the Kalman filter after several time steps. Therefore, it is necessary to find an approximate method.
One of the simplest remedies is to enforce a fixed dof $\nu'$ in Equation (24), where $\nu'$ is a constant, so that the dof does not grow over time. However, the actual posterior density is $\mathrm{St}(x_k; \hat{x}_{k|k}, P_{k|k}, \nu_k)$ rather than $\mathrm{St}(x_k; \hat{x}_{k|k}, P_{k|k}, \nu')$ (the conditioning on $Z_k$ is omitted here). At this point, we need to find a posterior density with dof $\nu'$ that approximates the actual posterior while retaining its qualitative characteristics; therefore, the adjusted matrix parameter should be a scaled version of the original matrix. The general form of this problem is to find the probability density $\mathrm{St}(x; m, cP, \nu')$ that approximates the probability density $\mathrm{St}(x; m, P, \nu)$ as closely as possible. This density is controlled by the scaling factor $c > 0$, so the problem becomes how to choose $c$ so that the two densities are close in a certain sense. A suggested method is moment matching, that is, making the variances of $\mathrm{St}(x; m, cP, \nu')$ and $\mathrm{St}(x; m, P, \nu)$ equal. The advantages of this method are that it is simple to apply and requires no parameter tuning. We obtain the following condition:
$$\frac{\nu'}{\nu' - 2}\, cP = \frac{\nu}{\nu - 2}\, P,$$
where $\nu > 2$ and $\nu' > 2$. Then, the scale factor $c$ can be obtained:
$$c = \frac{\nu\,(\nu' - 2)}{\nu'\,(\nu - 2)}.$$
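As an illustrative numeric check (the values are ours, not from the source): if the posterior dof has grown to $\nu = 10$ and we enforce $\nu' = 4$, the matched scale factor is $c = \frac{10 \times (4 - 2)}{4 \times (10 - 2)} = \frac{20}{32} = 0.625$, i.e., the scale matrix shrinks to offset the heavier tails of the lower-dof density.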
The PDFs of the process and measurement noises can be approximated in the same way.
3.2. Consensus on Mixed Density
The basic idea of consensus is to compute an aggregate over the whole network by iterating, at each node, the same type of local computation, which involves only the subset of adjacent nodes. Here, consensus is used to average each node's state PDF with the state PDFs received from its neighbors. Given PDFs $p_1$ and $p_2$ and a weight $\pi$, define the following information fusion and weighting operations:
$$(p_1 \oplus p_2)(x) = \frac{p_1(x)\, p_2(x)}{\int p_1(x)\, p_2(x)\, \mathrm{d}x}, \qquad (\pi \odot p)(x) = \frac{[p(x)]^{\pi}}{\int [p(x)]^{\pi}\, \mathrm{d}x}.$$
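For intuition, note the standard closure property (our restatement, not an equation from the source): both operations map Gaussians to Gaussians,
$$\pi \odot \mathcal{N}(x; m, P) = \mathcal{N}\!\left(x; m, \tfrac{1}{\pi} P\right), \qquad \mathcal{N}(x; m_1, P_1) \oplus \mathcal{N}(x; m_2, P_2) = \mathcal{N}(x; m, P),$$
with $P = \left(P_1^{-1} + P_2^{-1}\right)^{-1}$ and $m = P \left(P_1^{-1} m_1 + P_2^{-1} m_2\right)$. This is what later allows the consensus recursion to be carried out on information pairs.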
The Kullback–Leibler average (KLA), which is based on relative entropy, provides an information-theoretic definition of the average of PDFs. The weighted KLA of PDFs $p_i$ is defined as
$$\bar{p} = \arg\min_{p} \sum_{i=1}^{N} \pi_i\, D_{\mathrm{KL}}(p \,\|\, p_i),$$
where $\pi_i \geq 0$, $\sum_{i=1}^{N} \pi_i = 1$, and $\pi_i$ represents the relative weight of node $i$; $D_{\mathrm{KL}}(p \,\|\, p_i) = \int p(x) \log \frac{p(x)}{p_i(x)}\, \mathrm{d}x$ is the Kullback–Leibler divergence (KLD) between $p$ and $p_i$.
The problem of the consensus algorithm can be described as finding an iteration such that
$$\lim_{\ell \to \infty} p_i^{(\ell)} = \bar{p}, \qquad \forall i,$$
where the asymptotic PDF $\bar{p}$ represents the KLA with equal weights. The solution to this problem is the collective KLA: the weighted KLA is given by the normalized weighted geometric mean of the PDFs,
$$\bar{p}(x) = \frac{\prod_{i=1}^{N} [p_i(x)]^{\pi_i}}{\int \prod_{i=1}^{N} [p_i(x)]^{\pi_i}\, \mathrm{d}x}.$$
It can be computed by updating the local data in a distributed manner, using a convex combination with the data of the neighbors:
$$p_i^{(\ell+1)} = \bigoplus_{j \in \mathcal{N}_i \cup \{i\}} \left( \pi_{ij} \odot p_j^{(\ell)} \right),$$
where $\ell$ is the consensus iteration index, and $\pi_{ij} \geq 0$ is the consensus weight satisfying $\sum_{j} \pi_{ij} = 1$; $\pi_{ij}$ is also the $(i, j)$ component of the consensus matrix $\Pi$ (if $j \notin \mathcal{N}_i \cup \{i\}$, then $\pi_{ij} = 0$). In addition, the initial value of the iteration is $p_i^{(0)} = p_i$. Let $\pi_{ij}^{\ell}$ be the $(i, j)$ component of the matrix $\Pi^{\ell}$. Then,
$$p_i^{(\ell)} = \bigoplus_{j=1}^{N} \left( \pi_{ij}^{\ell} \odot p_j \right).$$
When the consensus weights are chosen so that the matrix $\Pi$ is primitive and doubly stochastic,
$$\lim_{\ell \to \infty} \pi_{ij}^{\ell} = \frac{1}{N}, \qquad \forall\, i, j.$$
Therefore, as the number of consensus steps increases, each local PDF tends to focus on the unweighted KLA.
For PDFs with Gaussian distributions $p_j = \mathcal{N}(x; \hat{x}_j, P_j)$, it can be proved that the probability density consensus algorithm simplifies to an algebraic recursion involving only the information vector $q_j = P_j^{-1} \hat{x}_j$ and information matrix $\Omega_j = P_j^{-1}$:
$$\Omega_i^{(\ell+1)} = \sum_{j \in \mathcal{N}_i \cup \{i\}} \pi_{ij}\, \Omega_j^{(\ell)}, \qquad q_i^{(\ell+1)} = \sum_{j \in \mathcal{N}_i \cup \{i\}} \pi_{ij}\, q_j^{(\ell)}.$$
This is the so-called CI consensus method.
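The following minimal Python sketch illustrates a synchronous run of this CI consensus recursion on a small network; the weight matrix, variable names, and example data are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def ci_consensus(x_hat, P, W, steps):
    """Run the CI consensus recursion on local Gaussian estimates.

    x_hat -- (N, n) local means; P -- (N, n, n) local covariances
    W     -- (N, N) primitive, doubly stochastic consensus weight matrix
    """
    # Information form: Omega = P^{-1}, q = P^{-1} x_hat.
    Omega = np.array([np.linalg.inv(Pk) for Pk in P])
    q = np.array([Om @ xh for Om, xh in zip(Omega, x_hat)])
    for _ in range(steps):
        # Each node takes a convex combination of its neighbors' pairs;
        # zero entries of W encode missing edges.
        Omega = np.einsum('ij,jkl->ikl', W, Omega)
        q = W @ q
    P_out = np.array([np.linalg.inv(Om) for Om in Omega])
    x_out = np.array([Pk @ qk for Pk, qk in zip(P_out, q)])
    return x_out, P_out

# Example: three fully connected nodes with scalar states (n = 1).
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
x_hat = np.array([[0.0], [1.0], [2.0]])
P = np.array([[[1.0]], [[2.0]], [[4.0]]])
x_c, P_c = ci_consensus(x_hat, P, W, steps=10)
```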
In the above subsection, the multi-distribution filter for a single sensor was given. To extend it to the distributed case, two important problems need to be solved: (1) for the mixed posterior distribution, which information should be exchanged between adjacent nodes, and what strategy should be adopted to realize the collective KLA; and (2) since the mixed distribution is non-Gaussian, whether the CI consensus strategy for the Gaussian distribution can be directly applied to the mixture of Gaussian and student-t distributions.
For the first problem, the following consensus strategy is given: first, run consensus on the distribution probabilities to obtain the consensus distribution probabilities; then, form the fused PDF as the initial value and run consensus on the fused PDF. This strategy only requires the transmission of the distribution probabilities and the fused PDF.
It should be noted that the above consensus method applies to a continuous PDF. When the distribution probability is discrete, its distribution is a probability mass function (PMF). Given PMFs $\mu_j$ ($j = 1, \dots, N$), the weighted KLA is defined as
$$\bar{\mu} = \arg\min_{\mu} \sum_{j=1}^{N} \pi_j\, D_{\mathrm{KL}}(\mu \,\|\, \mu_j).$$
For PMFs $\mu_1$, $\mu_2$ and a weight $\pi$, the following information fusion and weighting operations are defined:
$$(\mu_1 \oplus \mu_2)(h) = \frac{\mu_1(h)\, \mu_2(h)}{\sum_{h'} \mu_1(h')\, \mu_2(h')}, \qquad (\pi \odot \mu)(h) = \frac{[\mu(h)]^{\pi}}{\sum_{h'} [\mu(h')]^{\pi}}.$$
Then, the KLA probability mass function can be expressed as
$$\bar{\mu}(h) = \frac{\prod_{j=1}^{N} [\mu_j(h)]^{\pi_j}}{\sum_{h'} \prod_{j=1}^{N} [\mu_j(h')]^{\pi_j}}.$$
The collective fusion of the PMFs can be obtained in a distributed manner:
$$\mu_i^{(\ell+1)} = \bigoplus_{j \in \mathcal{N}_i \cup \{i\}} \left( \pi_{ij} \odot \mu_j^{(\ell)} \right),$$
where $\pi_{ij}$ is the consensus weight and $\mu_i^{(0)} = \mu_i$.
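A minimal Python sketch of this PMF consensus step follows, working in the log domain for numerical safety; the array layout and example weights are illustrative assumptions.

```python
import numpy as np

def pmf_consensus(mu, W, steps):
    """Distributed KLA consensus on PMFs.

    mu -- (N, H) rows are local PMFs over H hypotheses
    W  -- (N, N) doubly stochastic consensus weight matrix
    """
    log_mu = np.log(mu)
    for _ in range(steps):
        # Weighted geometric mean across neighbors = weighted sum of logs.
        log_mu = W @ log_mu
        # Renormalize each row so it remains a valid PMF.
        log_mu -= np.log(np.sum(np.exp(log_mu), axis=1, keepdims=True))
    return np.exp(log_mu)

# Example: three nodes, two hypotheses.
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
mu = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
mu_c = pmf_consensus(mu, W, steps=10)  # rows converge to the geometric-mean PMF
```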
Before answering the second question, let us first examine how to use a Gaussian distribution to approximate a student-t distribution; that is, how to find a scalar $c$ that minimizes the difference between $\mathcal{N}(x; m, cP)$ and $\mathrm{St}(x; m, P, \nu)$ under a certain criterion. We know that, as the dof of a student-t distribution tends to infinity, the distribution tends to a Gaussian distribution. Thus, we can write $\mathcal{N}(x; m, cP) = \lim_{\nu' \to \infty} \mathrm{St}(x; m, cP, \nu')$. In this way, the problem becomes an approximation between two student-t distributions, so we can use the moment matching method to obtain the value of $c$, that is,
$$c = \lim_{\nu' \to \infty} \frac{\nu\,(\nu' - 2)}{\nu'\,(\nu - 2)} = \frac{\nu}{\nu - 2}.$$
Therefore, $\mathcal{N}\!\left(x; m, \frac{\nu}{\nu - 2} P\right)$ can be used to approximate $\mathrm{St}(x; m, P, \nu)$. With this approximation, the fusion process reduces to the fusion of two Gaussian distributions, so the CI consensus method based on the Gaussian distribution can be applied directly.
3.3. The Distributed Multi-Distribution Filter
We have previously obtained the algorithm for a single sensor and the consensus strategy for multi-sensor mixed density. Now, we need to extend the results to the distributed case.
For local node $i$, the initial values $\hat{x}_{0|0}^{i,h}$, $P_{0|0}^{i,h}$, and $\mu_0^{i,h}$ are given for $h = 1, 2$. For the student-t filter, the common dof $\nu$ is also given. When $k \geq 1$, start the following recursive process.
- Step 1
Parallel filtering
For the Gaussian filter, we use Equations (9)–(15). For the student-t distribution filter, we make an approximation so that the dof remains fixed at the common value $\nu$ rather than growing with each update. Then, Equations (17), (19) and (24) become Equations (54)–(56). Therefore, we can use Equations (16), (54), (18), (55), (20)–(23) and (56) for the student-t distribution filter. After that, we can obtain $\hat{x}_{k|k-1}^{i,h}$, $P_{k|k-1}^{i,h}$, $\hat{x}_{k|k}^{i,h}$, and $P_{k|k}^{i,h}$ for $h = 1, 2$ and $i = 1, \dots, N$.
- Step 2
Calculate distribution probability
Update the probability
$$\mu_k^{i,h} = \frac{\Lambda_k^{i,h}\, \mu_{k-1}^{i,h}}{\sum_{h'=1}^{2} \Lambda_k^{i,h'}\, \mu_{k-1}^{i,h'}}, \qquad h = 1, 2,$$
where the likelihoods of the Gaussian and student-t filters, $\Lambda_k^{i,1}$ and $\Lambda_k^{i,2}$, can be obtained by Equations (27) and (28).
- Step 3
Consensus on distribution probability
For an $L$-step consensus,
$$\mu_k^{i,(\ell+1)} = \bigoplus_{j \in \mathcal{N}_i \cup \{i\}} \left( \pi_{ij} \odot \mu_k^{j,(\ell)} \right), \qquad \ell = 0, 1, \dots, L-1,$$
where $L$ is the number of consensus steps, $\pi_{ij}$ is the consensus weight, $\sum_{j} \pi_{ij} = 1$, and $\mu_k^{i,(0)} = \mu_k^i$.
- Step 4
Fuse the mixed PDF
Using the Gaussian approximation of the student-t component, compute the fused mean and covariance
$$\hat{x}_{k|k}^{i} = \sum_{h=1}^{2} \mu_k^{i,h}\, \hat{x}_{k|k}^{i,h},$$
$$P_{k|k}^{i} = \sum_{h=1}^{2} \mu_k^{i,h} \left[ \tilde{P}_{k|k}^{i,h} + \left(\hat{x}_{k|k}^{i,h} - \hat{x}_{k|k}^{i}\right)\left(\hat{x}_{k|k}^{i,h} - \hat{x}_{k|k}^{i}\right)^{\mathrm{T}} \right],$$
where $\tilde{P}_{k|k}^{i,1} = P_{k|k}^{i,1}$ and $\tilde{P}_{k|k}^{i,2} = \frac{\nu}{\nu - 2}\, P_{k|k}^{i,2}$.
- Step 5
Consensus on fused PDF
Initialize the information pair $\Omega_k^{i,(0)} = \left(P_{k|k}^{i}\right)^{-1}$ and $q_k^{i,(0)} = \left(P_{k|k}^{i}\right)^{-1} \hat{x}_{k|k}^{i}$, and for $\ell = 0, 1, \dots, L-1$ run the CI consensus iteration
$$\Omega_k^{i,(\ell+1)} = \sum_{j \in \mathcal{N}_i \cup \{i\}} \pi_{ij}\, \Omega_k^{j,(\ell)}, \qquad q_k^{i,(\ell+1)} = \sum_{j \in \mathcal{N}_i \cup \{i\}} \pi_{ij}\, q_k^{j,(\ell)}.$$
- Step 6
Reinitialization
The consensus result $\left(\Omega_k^{i,(L)}, q_k^{i,(L)}\right)$ is converted back into a state estimate and covariance, which are used to reinitialize both parallel filters for the next time step.
The workflow of the proposed algorithm is shown in Figure 4, and the pseudocode is summarized in Algorithm 1.
Algorithm 1: Distributed consensus multi-distribution filter (DCMDF)
Given the initial values $\hat{x}_{0|0}^{i,h}$, $P_{0|0}^{i,h}$, and $\mu_0^{i,h}$ for $h = 1, 2$ and the common dof $\nu$, for each time step $k$ at every node $i$, start the following recursive process:
Step 1 Parallel filtering:
  Gaussian filtering: calculate $\hat{x}_{k|k}^{i,1}$, $P_{k|k}^{i,1}$, $\Lambda_k^{i,1}$ by Equations (9)–(15)
  Student-t filtering: calculate $\hat{x}_{k|k}^{i,2}$, $P_{k|k}^{i,2}$, $\Lambda_k^{i,2}$ by Equations (16)–(22), (54), and (55)
Step 2 Calculate distribution probability:
  Calculate $\mu_k^{i,h}$ according to Equation (57)
Step 3 Consensus on distribution probability:
  for $\ell = 1$ to $L$
    Calculate $\mu_k^{i,(\ell)}$ according to Equation (58)
  end for
Step 4 Fuse the mixed PDF:
  Calculate $\hat{x}_{k|k}^{i}$ and $P_{k|k}^{i}$ according to Equations (60)–(62)
Step 5 Consensus on fused PDF:
  Initialize $\Omega_k^{i,(0)}$ and $q_k^{i,(0)}$
  for $\ell = 1$ to $L$
    Calculate $\Omega_k^{i,(\ell)}$ and $q_k^{i,(\ell)}$ according to Equations (63) and (64)
  end for
Step 6 Reinitialization:
  Reinitialize both parallel filters with the consensus result
Return: $\hat{x}_{k|k}^{i}$, $P_{k|k}^{i}$, and $\mu_k^{i,h}$
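As a self-contained illustration of how the six steps compose, the following Python sketch runs one DCMDF time step for all nodes under a shared linear model. Everything here is an assumption for illustration (the model matrices, the weight matrix $W$, the array layouts, and the simplified student-t update with the dof frozen at $\nu$); it is a sketch of the described scheme, not the authors' implementation.

```python
import numpy as np
from math import lgamma, log, pi

def gauss_loglik(r, S):
    # log N(r; 0, S)
    return -0.5 * (r @ np.linalg.solve(S, r) + np.log(np.linalg.det(2 * pi * S)))

def st_loglik(r, S, nu):
    # log St(r; 0, S, nu)
    m = len(r)
    d2 = r @ np.linalg.solve(S, r)
    return (lgamma((nu + m) / 2) - lgamma(nu / 2) - 0.5 * m * log(nu * pi)
            - 0.5 * np.log(np.linalg.det(S)) - 0.5 * (nu + m) * log(1 + d2 / nu))

def dcmdf_step(x, P, mu, z, F, H, Q, R, nu, W, L):
    """One DCMDF time step (sketch).

    x: (N, 2, n) per-node estimates for the two hypotheses; P: (N, 2, n, n);
    mu: (N, 2) hypothesis probabilities; z: (N, m) local measurements;
    W: (N, N) doubly stochastic consensus weights; L: consensus steps.
    """
    N, _, n = x.shape
    loglik = np.zeros((N, 2))
    for i in range(N):                            # Step 1: parallel filtering
        for h in range(2):
            xp, Pp = F @ x[i, h], F @ P[i, h] @ F.T + Q
            r = z[i] - H @ xp
            S = H @ Pp @ H.T + R
            K = Pp @ H.T @ np.linalg.inv(S)
            x[i, h] = xp + K @ r
            if h == 0:                            # Gaussian hypothesis
                P[i, h] = Pp - K @ S @ K.T
                loglik[i, h] = gauss_loglik(r, S)
            else:                                 # student-t, dof frozen at nu
                m = len(r)
                d2 = r @ np.linalg.solve(S, r)
                Pt = (nu + d2) / (nu + m) * (Pp - K @ S @ K.T)
                # Moment-match the grown dof (nu + m) back to nu.
                c = (nu + m) * (nu - 2.0) / (nu * (nu + m - 2.0))
                P[i, h] = c * Pt
                loglik[i, h] = st_loglik(r, S, nu)
    # Step 2: update hypothesis probabilities from the likelihoods.
    lw = np.log(mu) + loglik
    lw -= lw.max(axis=1, keepdims=True)           # guard against underflow
    mu = np.exp(lw)
    mu /= mu.sum(axis=1, keepdims=True)
    # Step 3: L-step consensus on the probabilities (log-domain geometric mean).
    lmu = np.log(mu)
    for _ in range(L):
        lmu = W @ lmu
        lmu -= np.log(np.exp(lmu).sum(axis=1, keepdims=True))
    mu = np.exp(lmu)
    # Step 4: fuse the mixture, Gaussianizing the student-t component.
    x_f, P_f = np.zeros((N, n)), np.zeros((N, n, n))
    for i in range(N):
        sigma = [P[i, 0], nu / (nu - 2.0) * P[i, 1]]
        x_f[i] = mu[i, 0] * x[i, 0] + mu[i, 1] * x[i, 1]
        for h in range(2):
            d = x[i, h] - x_f[i]
            P_f[i] += mu[i, h] * (sigma[h] + np.outer(d, d))
    # Step 5: CI consensus on the fused Gaussians (information form).
    Om = np.array([np.linalg.inv(Pf) for Pf in P_f])
    q = np.array([O @ xf for O, xf in zip(Om, x_f)])
    for _ in range(L):
        Om = np.einsum('ij,jkl->ikl', W, Om)
        q = W @ q
    # Step 6: reinitialize both parallel filters with the consensus result.
    for i in range(N):
        Pc = np.linalg.inv(Om[i])
        xc = Pc @ q[i]
        x[i, 0], x[i, 1] = xc, xc.copy()
        P[i, 0], P[i, 1] = Pc, Pc.copy()
    return x, P, mu
```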