1. Introduction
Multisensor data fusion is extensively applied across various robotic mobile applications, including environment mapping and sensor networks [1]. It involves the integration of data from various sensors and sources of information, resulting in a more accurate description of the process of concern and more accurate analyses than would be achieved by relying on a single sensor [2]. Data fusion is a field that encompasses numerous disciplines, such as information theory, artificial intelligence, and signal processing [3]. Its implementation can lead to increased data reliability, accuracy, and consistency.
Despite the clear advantages of data fusion, several challenges and limitations make its implementation difficult. One major challenge is the need for redundant sensors, which increases the risk of sensor failures. Additionally, there are issues associated with the data used, including sensor imperfections and specific application requirements. These factors can make it difficult to achieve accurate and reliable results [2,4,5].
The validation of data fusion techniques also presents two significant difficulties [6]. These are the following:
- The absence of a definitive reference or ground truth data against which to measure the data fusion results.
- The difficulty of isolating the effect of the data fusion algorithm from other factors that can impact performance, such as sensor errors and noise.
An alternative solution to these validation challenges is to introduce fault-tolerance techniques into data fusion, which reduce or remove the impact of defects on the process's performance [7]. Fault tolerance enables the system to continue operating accurately even in the presence of faults [8,9]. It is typically achieved through the implementation of fault detection and system recovery mechanisms: fault detection identifies faults in the system, while system recovery corrects or mitigates their effects. Together, these strategies allow the system to continue functioning correctly even when errors occur.
The use of fault-tolerant data fusion, specifically in aircraft navigation systems, has a history of over 50 years [10]. The current literature mainly focuses on duplication–comparison techniques for fault tolerance, which evaluate the outputs of at least two independent, duplicated modules that provide the same service. The duplication approach encompasses two or more redundancy techniques:
- One variation of the duplication method is the use of analytical models, which act as an alternative to physical sensors. A common data fusion technique in this approach is the Kalman Filter, which employs a system model to estimate an observation that is redundant with the one provided by the actual sensor. The difference between the measurement estimated by the model and the one provided by the sensor is then used as an indicator of faults. Examples of this type of duplication method can be found in the literature, such as [11,12,13].
- Another variation of the duplication method is the use of hardware redundancy, which involves combining multiple data sources. Unlike the analytical model-based approach, this technique is based on the evaluation of internal parameters. The work in [14] uses the temporal analysis of conflicts arising from data source fusion to detect malfunctions. Other works, such as [15,16], suggest the use of dynamic and static reliability analyses of data sources. Further works discussing these techniques can be found in [17,18].
As autonomous unmanned aerial vehicles (UAVs) become more prevalent, fault-tolerant data fusion is emerging as an increasingly important requisite for secure and trustworthy operation. UAVs can complete a variety of tasks in unknown conditions where human intervention is either impossible or unsafe [19]. However, these operations rely on sensors, which are susceptible to various faults [20,21]. As a result, it is crucial for UAVs to detect and diagnose sensor faults in order to ensure accurate state estimates. Research on multisensor fusion strategies for UAVs has been conducted for various applications, such as position, velocity, and attitude estimation [22]. The real-world evaluation of these strategies on actual UAV systems was also considered in [23].
Various studies have investigated the implementation of fault tolerance in sensor fusion for UAV systems. In [24], a UAV navigation system was designed using a combination of height sensor measurements and a main Kalman Filter with sub-filters, with a Chi-Square test employed for fault isolation. Another study [25] developed a scheme for reliable UAV attitude estimation through the use of an Unscented Information Filter. Similar approaches were also explored in [26], which compared various Kalman Filters implemented for sensor fusion. A data fusion technique to tolerate both software and sensor failures in a quadrotor UAV was proposed in [27] using the duplication–comparison technique. A similar architecture was applied in [28] within a framework that uses extended Informational Kalman Filters to perform the state estimation of a quadrotor UAV and the Bhattacharyya Distance for residual evaluation.
In the majority of studies examining these methods, state estimation is achieved through traditional Kalman and complementary filters. Despite their effectiveness, these traditional techniques present certain limitations [29]. Typically, attitude estimation for UAV systems is obtained by integrating the sensor measurements of an Inertial Measurement Unit (IMU). However, the correlation between attitude error and IMU error can be complex, making it challenging to establish a precise mathematical model. An alternative solution is to treat this relationship as a time series, which can be effectively modeled using artificial neural networks [30].
Traditional feedforward neural networks, while powerful for many applications, are inherently limited when it comes to handling time series data due to their lack of temporal memory. These networks process input data in isolation, treating each time point as independent and not leveraging any historical information from previous inputs. As a result, they are less effective in tasks that require understanding and predicting based on sequential data or patterns over time. To address this limitation, Recurrent Neural Networks (RNNs) were introduced, which are specifically designed to capture temporal dependencies through their recurrent structure. RNNs maintain a form of memory by looping connections that allow information to persist across multiple time steps [31,32]. This architecture enables RNNs to use information from previous time points to influence the prediction of future values, making them more suitable for sequential data tasks. However, despite their advantages, RNNs face challenges related to the vanishing gradient problem during training: the gradients used to update the network's weights can become exceedingly small, effectively halting the learning process and preventing the network from capturing long-term dependencies in the data. To overcome this challenge, Long Short-Term Memory (LSTM) networks were developed as a specialized type of RNN. LSTMs introduce a unique architecture designed to address the vanishing gradient problem through the incorporation of special gate units. These gate units, namely the input gate, forget gate, and output gate, regulate the flow of information through the network. The input gate controls the extent to which new information is added to the memory cell, the forget gate determines which information should be discarded, and the output gate manages how the information in the memory cell is used to influence the network's predictions. The memory cell in an LSTM network functions similarly to a delay operator, providing a mechanism to retain information over extended periods. This allows the network to maintain and utilize relevant information from previous time steps effectively. By leveraging these gates, LSTMs can maintain a long-term context, which is particularly beneficial for tasks that involve complex time series data, such as predicting future values based on historical sensor readings.
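In standard notation, the gate mechanism just described can be summarized by the usual LSTM update equations, where $\sigma$ is the logistic sigmoid, $\odot$ the element-wise product, $x_t$ the input, $h_t$ the hidden state, $c_t$ the memory cell, and $W$, $U$, $b$ the learned weights and biases:

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &\text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) &\text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) &\text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$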
This work develops a new strategy for fault-tolerant data fusion in UAV position and attitude estimation. The method is an adaptation of the duplication–comparison approach. A deep learning framework was designed using an LSTM neural network to estimate the state from training data obtained from available sensor measurements. For the diagnostic layer, faults are identified and assessed through the generation and evaluation of fault indicators using the moving average (MA) metric. The moving average is a technique used to identify abnormal behavior in a system by analyzing the historical data of a given signal or measurement. It involves computing the mean of a fixed number of data points over a predetermined time window and comparing it to the current value of the signal; any significant deviation from the average can signify a malfunction in the system. This approach is widely used in industrial settings, such as machinery or process control systems, to detect and diagnose faults in a timely manner [33]. Compared to similar works in the literature, the proposed architecture does not rely on estimating the dynamic model of the UAV, thus eliminating the potential for errors introduced by imperfect model estimates, system uncertainties, or challenging environmental conditions.
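As a minimal illustration of such a moving-average indicator (a sketch only; the window size and tolerance below are placeholder values, not those used in our experiments):

```python
import numpy as np

def moving_average_alarm(signal, window=50, tolerance=0.5):
    """Flag samples whose value deviates from the trailing moving
    average of the previous `window` samples by more than `tolerance`."""
    signal = np.asarray(signal, dtype=float)
    alarms = np.zeros(len(signal), dtype=bool)
    for n in range(window, len(signal)):
        ma = signal[n - window:n].mean()      # mean of the last `window` samples
        alarms[n] = abs(signal[n] - ma) > tolerance
    return alarms
```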
The remaining sections of this paper are organized as follows: Section 2 provides an overview of UAV state estimation and fusion, Section 3 outlines the design of the fault-tolerant data fusion framework, Section 4 presents the results of an offline experiment using real data, and Section 5 concludes with a summary.
2. Preliminaries
We start by identifying two frames: the body frame ($B$) and the North-East-Down (NED) navigation Cartesian stationary frame ($N$), with the orientation between them described by the roll ($\phi$), pitch ($\theta$), and yaw ($\psi$) Euler angles. The rotation Direction Cosine Matrix is represented as:

$$
R = \begin{bmatrix}
\cos\theta\cos\psi & \sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi & \cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi \\
\cos\theta\sin\psi & \sin\phi\sin\theta\sin\psi + \cos\phi\cos\psi & \cos\phi\sin\theta\sin\psi - \sin\phi\cos\psi \\
-\sin\theta & \sin\phi\cos\theta & \cos\phi\cos\theta
\end{bmatrix} \quad (1)
$$
2.1. UAV Attitude Equations
The UAV orientation in a three-dimensional space is predicted through the combination of gyroscope, accelerometer, and magnetometer sensors. It is represented by the three Euler angles.
The relation between the angular velocities $p$, $q$, and $r$, as measured by the gyroscope, and the time derivatives of the Euler angles can be found using the rotation matrix derived previously, as shown below:

$$
\begin{bmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix} =
\begin{bmatrix}
1 & \sin\phi\tan\theta & \cos\phi\tan\theta \\
0 & \cos\phi & -\sin\phi \\
0 & \sin\phi/\cos\theta & \cos\phi/\cos\theta
\end{bmatrix}
\begin{bmatrix} p \\ q \\ r \end{bmatrix} \quad (2)
$$
To achieve an accurate attitude estimate, it is important to fuse the measurements of another sensor with those of the gyroscope. The gyroscope, when integrated, accumulates errors and drifts over time, which hinders its accuracy as a standalone attitude estimator. Integrating the Euler rates yields attitude estimates from the gyroscope readings; however, the integration process, which involves cumulative addition, results in the buildup of undesirable error components in the readings.
On the other hand, the linear accelerations measured along the three orthogonal axes, denoted as $a_x$, $a_y$, and $a_z$, can be employed to calculate the pitch and roll angles with the help of the accelerometer. This calculation can be performed using Equations (3) and (4), provided that external disturbances are ignored:

$$ \theta = \arctan\left(\frac{-a_x}{\sqrt{a_y^2 + a_z^2}}\right) \quad (3) $$

$$ \phi = \arctan\left(\frac{a_y}{a_z}\right) \quad (4) $$
The magnetometer provides measurements of the magnetic field strengths $m_x$, $m_y$, and $m_z$, which can be used to calculate the heading or yaw angle using the following tilt-compensated equation:

$$ \psi = \arctan\left(\frac{m_z \sin\phi - m_y \cos\phi}{m_x \cos\theta + m_y \sin\theta \sin\phi + m_z \sin\theta \cos\phi}\right) \quad (5) $$
The frequency characteristics of the accelerometer, gyroscope, and magnetometer are complementary, and relying solely on the gyroscope for orientation estimation is not effective. To achieve precise orientation estimates, it is necessary to combine these sensors.
2.2. UAV Altitude and Position Equations
The accelerometer measures the accelerations relative to the stationary gravity vector, which are given by

$$ \mathbf{a} = \left.\frac{d\mathbf{V}}{dt}\right|_{N} - R^{\top}\mathbf{g}, \quad (6) $$

where $\left.\frac{d\mathbf{V}}{dt}\right|_{N}$ is the velocity vector's rate of variation as seen from the stationary reference frame, and is defined as follows:

$$ \left.\frac{d\mathbf{V}}{dt}\right|_{N} = \dot{\mathbf{V}} + \boldsymbol{\omega}\times\mathbf{V}, \quad (7) $$

where $\mathbf{V} = [u, v, w]^{\top}$ collects the velocity components $u$, $v$, and $w$ along the body axes and $\boldsymbol{\omega} = [p, q, r]^{\top}$ is the angular rate vector. Combining the two equations above results in the following velocity state dynamics:

$$
\begin{aligned}
\dot{u} &= a_x + rv - qw - g\sin\theta \\
\dot{v} &= a_y + pw - ru + g\sin\phi\cos\theta \\
\dot{w} &= a_z + qu - pv + g\cos\phi\cos\theta
\end{aligned} \quad (8)
$$
The position of a UAV can be estimated by combining data from a Global Positioning System (GPS) receiver and an accelerometer. The GPS offers global location information, while the accelerometer measures linear acceleration in the UAV's body frame. By combining these two measurements, the accuracy of the position estimate is increased. The accelerometer provides additional velocity and orientation information, and this combined approach is also useful in situations where the GPS signal is weak or unavailable.
3. Fault-Tolerance Architecture
A fault-tolerant data fusion scheme that incorporates a duplication–comparison method for detection and recovery purposes is proposed in this section. The design includes two parallel and separate branches, each implementing a data fusion block (DF1 and DF2) utilizing redundant or diversified sensor blocks ((S1, S2) and (S3, S4)) for state estimation. S1 and S3 should perform similarly, whether they are redundant or diversified sensors; the same applies to S2 and S4. An implementation of this architecture is illustrated in Figure 1. S1 (and S3) consists of an Inertial Measurement Unit (IMU) that combines a three-axis accelerometer and a three-axis gyroscope. S2 (and S4) consists of a GPS module for determining the UAV's absolute position and a barometer for measuring its altitude.
This architecture offers the capability to tolerate or detect hardware faults under the assumption that only one fault is present in the system at a time. To accommodate multiple faults, the level of hardware redundancy should be increased. The principle of this architecture involves evaluating the outputs from two separate data fusion blocks and determining whether there is any significant discrepancy between them; such a difference serves as an indication of an error in the system. To diagnose the source of the error, the sensor outputs and the fusion residuals are analyzed using the moving average technique. The detailed architecture applied to the UAV is shown in Figure 2.
3.1. Data Fusion Component
An LSTM-based fusion technique is applied to data from the magnetometer, accelerometer, and gyroscope sensors for the attitude prediction of a hexarotor UAV, and to data from the accelerometer, barometer, and GPS sensors for its position and altitude prediction.
Starting with the attitude prediction, the outputs from three tri-axial inertial sensors are fed into a Long Short-Term Memory (LSTM) network. The nine outputs from these sensors are integrated into an input array, as depicted in Figure 3. The input to the LSTM layer includes both current and previous step measurements, resulting in a time step of 2. The model comprises two LSTM hidden layers, and a linear activation function is used for the prediction. Using a supervised learning approach, the LSTM extracts features from the inputs, consisting of Euler angles, angular speeds, and accelerations, and generates the required output. The network outputs are the estimated angles based on the trained model and the sensor measurements during the prediction phase.
The position network, shown in Figure 4, differs from the previous network only in the input and output layers: seven measurements (tri-axial accelerometer, barometer altitude, and GPS longitude, latitude, and altitude) constitute the input, and the UAV's estimated position (longitude, latitude, and altitude) constitutes the output (prediction) layer.
A linear activation function is employed since this is a regression problem. In regression, the goal is to obtain the real output of the network, whereas in classification, the output is categorized into a specified range using an activation function.
Between the two hidden layers of the LSTM architecture, a dropout layer is added. The purpose of dropout is to prevent overfitting by randomly disconnecting some of the inputs to the next layer during training, with a probability p. This makes the network architecture more diverse and eliminates the dependence of any single node on a given pattern, which leads to more robust and generalized models. The dropout technique aims to improve the testing accuracy while potentially compromising the training accuracy, and serves as a form of regularization.
3.2. Fault-Tolerant Component
In this section, we discuss the process of detecting faults in the sensors and how the system recovers from the faults.
3.2.1. Error Detection Module
As shown in Figure 1, the fault-tolerance architecture starts with the error detection module. In this module, a comparison between the states (attitude and position) estimated by the two data fusion blocks is performed using residual generation. A residual $r$ is the difference between the estimated modeled output $\hat{y}$ and the actual sensor output $y$:

$$ r = \hat{y} - y \quad (9) $$

Two sets of residuals are computed, one for the attitude and one for the position. When at least one set of residuals exceeds its specified threshold, an error is detected in the system, and the error identification and recovery module is invoked to diagnose the fault and the faulty sensor. Thresholds are fixed empirically using the trial-and-error technique.
3.2.2. Error Identification and Recovery Module
When a fault is detected through residual generation at time step $n$, the moving average algorithm is applied to all fusion block outputs to identify the faulty block. The moving average algorithm has been widely used in the literature for fault detection; for example, in [34], it was used for DC series arc detection in photovoltaic (PV) systems. It consists of the continuous calculation of the averages of a specified number $m$ of data samples:

$$ MA_n = \frac{1}{m}\sum_{i=n-m+1}^{n} x_i, \quad (10) $$

where $MA_n$ represents the moving average to be calculated, $m$ represents the window size over which the average is computed, and $x_i$ represents the data at the $i$-th time step. Two sets of moving averages are computed, $MA^{(1)}$ and $MA^{(2)}$, with the superscript corresponding to the first and second fusion blocks; they correspond to the attitude and position outputs of the fusion blocks.
At time step $n$, where an error is detected, the thresholds of the moving averages corresponding to the non-zero residual components are determined by the moving averages of the two blocks at that time step, i.e.,

$$ T_n^{(j)} = MA_n^{(j)}, \quad j \in \{1, 2\}, \quad (11) $$

with $j$ corresponding to the first and second fusion blocks. From this threshold, we can fix a region $\left[T_n^{(j)} - \epsilon,\; T_n^{(j)} + \epsilon\right]$, where $\epsilon$ is a tolerance on this threshold and is set empirically by trial and error. After a fault is detected, if the moving average of the output of a sensor block lies outside this region, we can isolate the fault in this block. However, if neither of the two blocks shows out-of-region values, we consider it a false alarm.
After identifying the faulty sensor block, residuals are generated between each pair of similar sensors. When a residual is non-zero, the sensor belonging to the faulty block is identified as the faulty one. We first examine the measurements from block S1 against those from block S3 by evaluating the distances $d_{acc}$, $d_{gyr}$, and $d_{mag}$ with respect to thresholds $T_{acc}$, $T_{gyr}$, and $T_{mag}$. These distances are defined as

$$ d_{acc} = \left\|\mathbf{a}^{(1)} - \mathbf{a}^{(3)}\right\|, \quad d_{gyr} = \left\|\boldsymbol{\omega}^{(1)} - \boldsymbol{\omega}^{(3)}\right\|, \quad d_{mag} = \left\|\mathbf{m}^{(1)} - \mathbf{m}^{(3)}\right\|, \quad (12) $$

with $\mathbf{a}^{(i)}$, $\boldsymbol{\omega}^{(i)}$, and $\mathbf{m}^{(i)}$ being the accelerometer, gyroscope, and magnetometer outputs of Inertial Measurement Unit $i$. Next, we evaluate the output from sensor block S2 against that from sensor block S4 by comparing the distances $d_{pos}$ and $d_{alt}$ with the thresholds $T_{pos}$ and $T_{alt}$; $d_{pos}$ and $d_{alt}$ are as follows:

$$ d_{pos} = \left\|(x, y)^{(2)} - (x, y)^{(4)}\right\|, \quad d_{alt} = \left|z^{(2)} - z^{(4)}\right|. \quad (13) $$

The $x$ and $y$ positions are obtained from the GPS, while the altitude $z$ is measured by the barometer. The steps of the diagnosis and fault-tolerant algorithm are outlined in Algorithm 1.
Algorithm 1: Fault-Tolerant Algorithm.
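The original listing of Algorithm 1 did not survive extraction; the following Python sketch restates one plausible reading of the detection and identification flow described above. All names, the consensus pairing, and the lag are illustrative placeholders, not the exact quantities of Algorithm 1.

```python
import numpy as np

def moving_average(x, n, m):
    """Mean of the last m samples of x, up to and including step n (n >= m - 1)."""
    return x[n - m + 1 : n + 1].mean(axis=0)

def diagnose(y1, y2, n, res_thr, m, eps, lag=50):
    """y1, y2: time series of outputs from fusion blocks DF1 and DF2."""
    # Step 1: error detection via the residual between the two fusion outputs
    if np.all(np.abs(y1[n] - y2[n]) <= res_thr):
        return "no fault"
    # Step 2: freeze each block's moving average at detection time as its threshold
    t1, t2 = moving_average(y1, n, m), moving_average(y2, n, m)
    # Step 3: check which block's later moving average leaves its tolerance region
    ma1, ma2 = moving_average(y1, n + lag, m), moving_average(y2, n + lag, m)
    if np.any(np.abs(ma1 - t1) > eps):
        return "fault in block 1"   # next: compare the S1 vs. S3 sensor pair
    if np.any(np.abs(ma2 - t2) > eps):
        return "fault in block 2"   # next: compare the S2 vs. S4 sensor pair
    return "false alarm"
```

Once the faulty block is identified, the pairwise sensor distances of Equations (12) and (13) single out the faulty sensor within that block.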
4. Experimental Results and Analysis
To validate the performance of the outlined approach, experiments with fault injection were conducted on a hexarotor UAV, and the results are analyzed in depth.
4.1. System Architecture and Setup
The effectiveness of the fault-tolerant architecture is demonstrated through offline experimental validation using real data and fault injection. An outdoor test environment was created for data acquisition (Figure 5). The experimental UAV is a Tarot hexarotor equipped with a Pixhawk flight controller and various sensors: an InvenSense ICM-20689 tri-axial Inertial Measurement Unit (IMU1) (InvenSense TDK Group, Shanghai, China), a Bosch BMI055 Inertial Measurement Unit (IMU2) (Bosch, Shanghai, China), two u-blox NEO-M8N GPS modules (u-blox, Shanghai, China), and two MS5611-01BA03 barometers (TE Connectivity, Shanghai, China).
The hexarotor maintains communication with both the ground station and the RC (Remote Control) transmitter. This dual communication setup serves different purposes to enhance the control and monitoring of the hexarotor during its flight operations. The ground station allows for more extensive and sophisticated control, enabling the execution of complex flight paths, autonomous missions, and data transmission for analysis. On the other hand, the RC transmitter provides a direct and immediate control link, allowing for quick responses and manual intervention in case of emergencies or unforeseen situations. This redundant communication ensures a reliable and flexible connection with the hexarotor, enhancing its safety and versatility in various flight scenarios.
4.2. Dataset Description
To evaluate the practicality and effectiveness of the proposed framework, two distinct fault conditions were developed. As a first fault scenario, an additive fault was simulated on the output of the magnetometer of the first fusion block (Mag1). This type of fault occurs when the magnetometer is not calibrated for hard iron and soft iron biases. Hard iron biases are a permanent offset in the magnetic field at the current location, while soft iron biases are non-permanent biases caused by nearby electronic and magnetic fields. To minimize this issue, it is important to calibrate the magnetometer beforehand.
The second fault was a freeze fault simulated on the Gyroscope of the second fusion block. This fault refers to a malfunction or failure in the gyroscope sensor that causes it to become frozen or unresponsive.
The simulated faults were designed to be representative of those that could be encountered in a real UAV setting.
Different datasets were collected using the experimental configurations described earlier. They were composed of accelerometer, gyroscope, magnetometer, GPS, and barometer sensor measurements. The data were split into training and testing subsets following the usual rule: $(1-\alpha)\times$ the data for training and $\alpha\times$ the data for testing. Table 1 below describes the characteristics of the attitude and position real data.
4.3. Data Fusion Performance
Before feeding the data into the LSTM model, several preprocessing steps were undertaken to ensure their suitability for training. The raw sensor data collected from the UAV were first cleaned to address any missing values and reduce noise through filtering techniques. The data were then normalized using Min-Max scaling to bring all features into the [0, 1] range, which is essential for the effective training of the LSTM model. Subsequently, the data were reshaped into sequences of 2 time steps. The LSTM network used for attitude data fusion was configured with the following parameters:
- Input layer of shape (2, 9), with 2 being the number of time steps and 9 the number of input features.
- LSTM layer of shape (2, 128), with 128 being the number of units.
- Batch normalization layer of shape (2, 128). Batch normalization is applied to each feature, which helps stabilize and accelerate training.
- Dropout layer of shape (2, 128). It applies dropout with a rate of 0.25 to the LSTM outputs, providing regularization by randomly dropping units during training.
- LSTM layer of 128 units. This layer processes the output from the previous layer and reduces the sequence dimension, outputting a single vector of 128 features for each sample.
- Dense layer that produces the three outputs, with a linear activation function suitable for this regression task.
- Batch normalization layer that normalizes the outputs of the dense layer.
The LSTM network used for position and altitude estimation exhibits the same characteristics but with seven input features. A minimal code sketch of this architecture is given below.
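The following Keras sketch reproduces the layer stack listed above, assuming the TensorFlow/Keras API; the layer sizes follow the list, while the optimizer and loss are assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_fusion_net(time_steps=2, n_features=9, n_outputs=3):
    """LSTM fusion network: two 128-unit LSTM layers with batch
    normalization and dropout, ending in a linear dense output layer."""
    model = keras.Sequential([
        layers.Input(shape=(time_steps, n_features)),
        layers.LSTM(128, return_sequences=True),   # output shape (2, 128)
        layers.BatchNormalization(),               # normalizes each feature
        layers.Dropout(0.25),                      # drops units at rate 0.25
        layers.LSTM(128),                          # collapses the time dimension
        layers.Dense(n_outputs, activation="linear"),
        layers.BatchNormalization(),               # normalizes the final outputs
    ])
    model.compile(optimizer="adam", loss="mse")    # assumed training setup
    return model

attitude_net = build_fusion_net(n_features=9)      # attitude: 9 inputs, 3 angles
position_net = build_fusion_net(n_features=7)      # position: 7 inputs, 3 outputs
```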
The root mean squared error (RMSE) was considered as the metric for evaluating the deep learning-based data fusion. For the attitude estimation, the errors on the three Euler angles were computed as

$$ RMSE_{\alpha} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\hat{\alpha}_k - \alpha_k\right)^2}, \quad \alpha \in \{\phi, \theta, \psi\}, $$

where $\hat{\alpha}_k$ is the estimated angle and $\alpha_k$ the reference at sample $k$. For the position estimation, the same errors were computed on the latitude, longitude, and altitude.
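For instance, these metrics can be computed directly from the prediction and reference arrays (a trivial sketch):

```python
import numpy as np

def rmse(pred, ref):
    """Root mean squared error along the time axis, one value per state
    (e.g., roll/pitch/yaw or latitude/longitude/altitude)."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return np.sqrt(np.mean((pred - ref) ** 2, axis=0))
```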
The results of attitude data fusion in the fault-free case are shown in Figure 6.
To evaluate the performance of the Long Short-Term Memory (LSTM) network for data fusion in comparison to the Extended Kalman Filter (EKF), we analyzed two distinct attitude datasets characterized by low and high dynamic motion, considered in [35]. The results are illustrated in Figure 7 and Figure 8 and Table 2. The analysis reveals that the LSTM outperforms the EKF, particularly in scenarios involving high dynamic motion, as evident from its superior accuracy in tracking and predicting attitude changes under such conditions. During high dynamic motion, LSTMs can leverage their ability to adapt and learn from varying conditions to provide more robust performance; they handle abrupt changes and complex patterns in the data more effectively than traditional models. The EKF's performance, by contrast, can degrade when the system dynamics are highly nonlinear or when the process and measurement noise assumptions are inaccurate, since its linearization steps can introduce significant errors under such conditions. In addition, LSTMs can remember long-term dependencies thanks to their gating mechanisms, which is crucial in data fusion, where past states influence future states over long periods, allowing LSTMs to better capture and utilize historical information. The performance of the LSTM network was also compared to that of a Gated Recurrent Unit (GRU) network applied to the same data fusion task. The results show that the LSTM also outperforms the GRU; the LSTM's architecture, with separate memory cells and additional gates, makes it more effective for this data fusion task, which involves complex long-term dependencies.
4.4. Fault-Tolerance Performance
We then examined the effects of two fault injections: an additive fault on the first magnetometer sensor and a freeze fault on the second gyroscope.
At sample 5000, an additive fault was injected on the magnetometer's three-axis measurements. The attitude and position residuals between the two fusion blocks are visualized in Figure 9, where the fault injection time is marked by the red arrow and the thresholds by the horizontal red lines. This figure shows that only the attitude residuals exceed the predefined thresholds, which indicates that a fault is detected in the attitude network.
Figure 10 shows the moving averages of the two data fusion blocks, implemented with a window size $m$. The thresholds are calculated using Equation (11) to check which of the two blocks shows out-of-region samples. It is clear that the first branch is the faulty one.
A similar reasoning applied to the residuals shown in Figure 11 enabled us to determine that the magnetometer in the first branch is the faulty sensor.
A second fault was simulated on the gyroscope measurements of the S4 sensor block, with the freeze starting at sample 5000. The difference between the outputs of the data fusion blocks after the occurrence of this fault is shown in Figure 12. A procedure similar to that above is then followed to identify the faulty sensor.
Table 3 shows the RMSEs of the different states obtained for the LSTM fusion.
Sensor faults can be categorized into various types, including offset, drift, and freeze. Each type represents a different way in which a sensor's performance can be impaired, leading to different effects on the residuals, as shown in Figure 13. Offset, drift, and freeze faults were simulated on the accelerometer measurement. Offset faults are easy to detect within a short detection time because they produce a consistent error pattern, whereas drift and freeze faults require longer detection times due to their gradual nature. The LSTM network proves highly effective in detecting these various types of sensor faults thanks to its ability to model complex temporal dependencies and capture intricate patterns within time-series data. Its architecture is particularly well suited to identifying offset faults, as it can learn to recognize consistent deviations from expected patterns and flag persistent shifts in sensor readings. For drift faults, the LSTM is effective at detecting gradual, progressive changes over time, leveraging its memory capabilities to identify trends and subtle variations that deviate from historical norms; by maintaining long-term dependencies, it can discern slowly evolving patterns indicative of drift. In the case of freeze faults, the LSTM's capacity to track temporal changes allows it to spot discrepancies when the sensor output fails to vary despite changes in the true values, enabling the network to identify periods where the residuals diverge significantly from expected behavior due to the sensor's inability to adapt to new data.
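As an illustration of how such faults can be injected into recorded sensor data (a sketch only; the magnitudes and rates are placeholders, not the values used in our experiments):

```python
import numpy as np

def inject_fault(signal, start, kind="offset", magnitude=0.5, drift_rate=1e-3):
    """Return a copy of `signal` with an offset, drift, or freeze
    fault injected from index `start` onward."""
    faulty = np.array(signal, dtype=float, copy=True)
    if kind == "offset":                      # constant additive bias
        faulty[start:] += magnitude
    elif kind == "drift":                     # slowly growing additive error
        faulty[start:] += drift_rate * np.arange(len(faulty) - start)
    elif kind == "freeze":                    # output stuck at the last valid value
        faulty[start:] = faulty[start]
    return faulty
```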
Thresholds in this study were fixed empirically; they were determined based on observed data to balance the trade-offs between detecting faults and avoiding false alarms.
4.5. Discussion
This architecture can easily be modified to isolate software faults, as previously discussed in [6]. These faults are typically caused by human error during the development stage. However, the detection of software faults is beyond the scope of this study.
The architecture proposed in this paper offers several benefits compared to other similar approaches for unmanned aerial vehicles. One key advantage is that it does not rely on a dynamic model for sensor fault isolation, unlike [27], which requires tuning the model parameters after each failure to prevent drift in model accuracy. In addition, compared to the architecture in [24], the proposed architecture requires fewer redundant sensors and data fusion blocks. Traditional fault-tolerant methods, such as weighted-mean or majority-voting, which use a minimum of three redundant or diversified sensors as in [24], remain widely implemented in aviation systems; yet, these methods are costly and can lead to a substantial weight increase in the system. Finally, unlike other architectures based on Kalman Filters [28], this architecture uses a deep learning framework for data fusion. It was shown in [36] that the LSTM fusion method's estimated states do not exhibit cumulative divergence error, unlike those obtained from a standard Kalman Filter. This comparison is summarized in Table 4.
This fault-tolerance technique for data fusion in UAVs not only contributes to enhancing the reliability and efficiency of UAV operations but also holds considerable promise for broader impacts in the industry. The reduced dependency on redundant hardware would make UAVs more cost-effective and efficient, which leads to substantial cost savings in both the manufacturing and maintenance processes.