1. Introduction
Challenges such as the aging population [
1] and the increasing prevalence of chronic diseases [
2] are putting pressure on healthcare systems worldwide. A strong correlation between an aging population and an increase in the incidence and mortality of chronic diseases like cardiovascular diseases, tumors, and diabetes [
3] exists. Thus, there is a need to increase efficiency in all areas of healthcare, including Operating Room (OR). Medical devices that control physical objects while constantly interacting with other connected devices are considered to facilitate effective and affordable healthcare [
4]. Consequently, connectivity and automation in the OR promise to contribute to solving these challenges, for instance, by the increased application of robotic medical devices. In addition, Service-oriented Device Connectivity (SDC) [
5], which is a family of medical device communication standards, intends to enable data exchange and manufacturer-independent interoperability of medical devices in the OR to improve patient care.
Nonetheless, this increased connectivity also increases the risk of potential cyberattacks on these devices [
6]. Historically, medical devices have operated in isolation without permanent connections to external networks, like the Internet [
7]. These devices have been typically self-contained, embedded systems. Therefore, security requirements were not in scope during development, and manufacturers focused on safety. By opening these devices up to connectivity with other devices, new safety risks to patient health arise from security threats.
Operating Room Tables (OR tables) are robotic medical devices that will leverage the potential of connectivity in the OR as they are central components. Nevertheless, positional plausibility for OR tables must be ensured regarding safety and security [
6], especially if other devices rely on the provided information. Examples are angiography systems, which enable image-guided surgery, or surgical robots. Both rely on the correctness of the determined OR table positions to avoid, e.g., collisions and thus severe patient harm. This, however, bears some unique challenges, as OR tables deform under a load of heavy patients, whereby the maximum patient load for a middle-class OR table is ∼250 kg (e.g., Getinge Maquet Meera [
8]). The plausibility check in the sense of detecting anomalies in the measurement of sensor data, e.g., for position or load, using mathematical models for non-linear deformation is a challenge. In addition, the body, especially of heavy patients, deforms due to gravitational forces during movements of the OR table. This makes it also challenging to determine the actual physical values based on sensor measurements and detect possible errors in the system or its sensors without a high false alarm rate.
Problem: Anomaly detection in security systems primarily emphasizes routing-based attacks, with only a limited number of systems tackling physical signals like sensor data [
9]. In addition, the mathematical model of a rigid body’s dynamic behavior is straightforward, but including the previously described deformation in such a model is not. One approach to overcome this obstacle is to use machine learning algorithms that can learn the nonlinear behavior of the deformation from training data. Let us assume that the system behavior and all occurring states can be written as a set
. The conventional states to be modeled via a physical system description can be written as
. The following, therefore, applies
. If we now transfer this relationship to a singular state
in the specific moment
t, we can rewrite the relation through an addition of a disturbance
of our modeled system:
However, machine learning-based approaches bear the challenge of data acquisition in OR and healthcare settings, since conditions for proper data collection are not given [
10]. Data-based models for the state estimation of sensor data may result in inadequate quality, particularly as sensor noise increases [
11]. Integrating these approaches with conventional model-based estimation is a more promising area for research.
Contribution: The manipulation of deformable objects is a growing research problem in robotics, especially in healthcare applications such as surgical procedures [
12]. Meanwhile, models representing the characteristics of the surgical robot and the human body can improve the safety of operations [
13], for example, when checking the plausibility of measurements such as joint positions. To improve the sensing of deformable objects, Zhu et al. propose a hierarchical sensing model. It combines a simple model with a more sophisticated nonlinear model, such as a linear model as a base layer combined with a deep neural network that learns the rest of the behavior [
12]. Such models can be built for OR tables to check the plausibility of position data. Based on intelligent sensor systems such as a load detection system [
14] and a corresponding fault detection system [
15], we suggest an anomaly detection approach for similar medical devices to contribute to a safer OR.
First, we propose a generic approach that uses and combines different anomaly detection techniques. Furthermore, we provide an example of a distributed software architecture to detect anomalies in an embedded environment. Our proposed system for anomaly detection involves a combined modeling approach. We use a Kalman Filter (KF) (
Figure 1) to model the deductive system and patient behavior, while a data-based algorithm learns any indeterminable or non-linear behavior.
The advantage of this approach is that the algorithm does not have to learn and know the behavior. Instead, it only learns the misbehavior of the manual model, which is then used for anomaly detection. This ensures that our system can detect anomalies that may not be accounted for by manual modeling.
Outline:
Section 2 provides an overview of anomaly detection approaches using KFs, classical statistical methods
Section 1, and machine learning methods.
Section 3 will review related work in anomaly detection. The concept of applying hybrid filters using KFs as dynamic models in combination with data-based models for anomaly detection is presented in
Section 4, followed by an evaluation of a prototypical implementation in
Section 5. A discussion of the results and the effectiveness of the approach is conducted in
Section 6. Finally,
Section 7 provides a summary and suggestions for future work in this area.
3. Related Work
Historically, most anomaly detection approaches have been model-based [
37]. An anomaly detection process decides whether a value is considered an anomaly based on the deviation between the estimated and the measured result. Dynamic Bayesian Network (DBNs) are well-established for real-time anomaly detection in streaming data. These networks are dynamically expanded by incrementally adding new variables at each time step. New measurements are added according to a predefined template that outlines the conditional dependencies between features and their relationships with the existing network. The KF is an exemplary realization of a DBN in this context (
Section 2.1). Hill et al. have effectively applied the KF to identify anomalies in low-resolution time series data [
38,
39]. However, such anomaly detection is highly dependent on the model’s accuracy. Model tuning is time-consuming and depends on prior knowledge of the system. Due to the limitations of model-based approaches, data-based approaches have received increasing interest.
A computationally inexpensive method is to check statistically whether a data point deviates from historical data. Statistical control charts [
40], principal component analysis [
41], and other statistical methods have been used.
With additional sensors and appropriate methods such as Fourier or Wavelet transforms, misbehavior can be detected, as shown in [
42]. Due to the nature of classical signal analysis, only one-dimensional signals can be considered. Besides this limitation, additional sensors are only an option for some scenarios where space and cost must not be considered.
Machine learning-based methods are gaining attention due to promising results. In anomaly detection, a distinction can be made between supervised and unsupervised methods. Supervised methods, as shown in [
43], are only suitable if a sufficient amount of labeled data is available.
In other domains, such as automotive in-vehicle communications, hybrid combinations of statistical, model-based, and machine learning-based anomaly detection have been explored to overcome the individual drawbacks of each method, as shown in [
44]. Due to the underlying model, time series data can be processed to consider the temporal dependencies of anomalies. Another anomaly detection in the automotive domain is presented by Weber with the automotive observer [
45]. This approach, which was initially developed for Controller Area Network (CAN) and later extended for Ethernet messages in automotive contexts in [
46], have been adapted for general data analysis based on [
44]. They use AEs and Lightweight On-line Detector of Anomalies (LODA) algorithms to detect signal anomalies, which have been modified to work with a sliding window for improved accuracy. The analysis is performed according to the ISO26262 standard, which ensures the reliability and consistency of the results. However, the authors do not incorporate dynamic models, they focus on single signals, and neglect the correlation of different signals. The authors use the
hybrid term for the combination of specification-based anomaly detection and machine learning algorithms.
The error determination system presented in the patent [
15] describes how a KF can determine sensor errors for a load recognition system for OR tables. Based on the prediction of the expected movement trajectory of the patient’s measured Center of Gravity (CoG) (
Figure 8), the measurements are checked for plausibility. Nevertheless, the inventors do not recommend improvements in combining the KF with data-based models.
Figure 8.
Overview of whole body movements Trendelenburg (
left) and Tilt
(right) used for the dynamic model (
Table 3).
Figure 8.
Overview of whole body movements Trendelenburg (
left) and Tilt
(right) used for the dynamic model (
Table 3).
Huang et al. present a novel anomaly detection approach [
47] specifically tailored for anomaly detection in time series data. The basic concept of their approach is to use the state estimation capabilities of the KF to improve the anomaly detection performance of an AE. This distinguishes their KalmanAE from traditional Kalman-based techniques, as it optimizes the KF using the embeddings produced by the AE. However, this method does not directly model the relationships between channels in multivariate time series data, an area that our approach aims to address.
In their study, Sanchez et al. [
48] examine the robotic handling and detection of flexible objects and propose a novel categorization of these objects that takes into account not only their geometric form, but also their physical properties. In their analysis, they note that there is a significant gap in the field due to the lack of methods that provide universal solutions for different types of objects and tasks. In particular, they point out that the dynamic handling of flexible objects is still at an early stage of development.
In their paper, Cheng et al. present an unsupervised fault detection system for industrial robots based on a Gaussian mixture model using current signals [
42]. They argue that the effectiveness of unsupervised machine learning algorithms in fault detection depends on the quality of the input fault features. They further propose that a better understanding of robotic systems through physics-based system analysis can help extract more effective fault features. The framework they propose involves pre-processing the signal and accurately distinguishing between different robot motion states in a measured current signal. Unique feature extraction is then used to derive robust fault features. These features are designed to be sensitive to faults but not to the robot’s movements. However, they do not address problems such as deformation.
The field of research into combining model-based and data-based methods, commonly called hybrid filters, has gained momentum in recent years. Jin et al. [
11] have noted that this approach is beneficial when a system becomes challenging to model accurately and a promising way to enhance a model’s accuracy. One study by Liu et al. [
49] demonstrated the effectiveness of this approach. They improved the predicted outcome of a model-based filter by training an LSTM model to estimate the discrepancy between the predicted and reference trajectories. By leveraging model-based and data-based approaches, hybrid filters can provide better results than using either method alone. However, they do not use their approaches to detect anomalies, a gap we target in this paper (
Section 1).
4. Concept
Applying statistical methods (
Section 2.2) that assume normal distribution to non-normal patient data will result in unwanted outliers since humans vary heavily in body proportions. The shoulder width, for example, varied from 1.5 to 3.25 head lengths for male subjects as examined by Kilgore [
50]. This means that the standard deviation for the Z-score must be set so high to avoid false positives that the detection of true positives is unlikely. Even non-normally based methods such as IQR assume a probability distribution that does not adapt to individual patients. In addition, the static interpretability and generalization ability of, e.g., multivariate Z-scores become challenging, especially for high-dimensional data. In addition, statistical and information-theoretical methods are not suitable for modeling sequential data [
51].
Based on the automotive observer by Weber et al., which uses different types of plausibility checks to monitor anomalies (
Section 3), we present a generic approach for anomaly detection systems using static, dynamic, and learning checks (
Figure 9). Dynamic checks use dynamic models (
Section 2.1) and learning checks use data-based models (
Section 2.3 and
Section 2.4). This paper focuses on combining the dynamic and learning checks since the dynamic checks are our proposed extensions, which are expected to reduce the number of features required (curse of dimensionality [
30]) and still reduce the False Positive Rate (FPR). By using expert knowledge for the dynamic tests, the explainability given for a KF is also given for the hybrid variant. This means that only the part of the data-based learning check cannot be easily explained.
Accordingly, we have chosen a hybrid filter approach: The first part of the equation is represented by a Kalman filter
to predict physically explainable system behavior. Here, the filter is parameterized through
, (see
Table 1) and receives an input tuple
(see
Table 1) to predict the state.
In the case of UKF/EKF the functions
h and
f replace
, and for UKF the sigma function
s is an additional parameter. The machine learning model
only has to model
from
in order to describe the system behavior that is physically difficult to model:
Static checks are derivable from the system specification, leading to rules, e.g., that a velocity signal needs to be in a specified value range. Dynamic Checks are manually created physical models that are mathematically described as proposed here for the partial body movements of a patient (
Section 4.1) or for whole body motions, as proposed in [
15] (
Section 3). Learning checks are data-based models such as AE, LSTM model (
Section 2.4) or IF (
Section 2.3) trained on data collected for the system. Each check can be combined and can provide input to the following checks at the same time. If one of the checks detects an anomaly, it is logged for further processing, such as user notification.
4.1. Dynamic Check for Partial Movements of the Patient’s Body
Given an OR table with a load determination system (
Section 1), the overall CoG of the load can be determined. Therefore, the idea is to check the trajectories of the CoG during a movement for plausibility as described in [
15] including movements of the patient’s body parts (
Figure 10). Nevertheless, as a simplification, the dynamic check assumes that the patient’s body mass distribution and flexible elements are neglected, so rigid body dynamics can be assumed (
Section 1).
Figure 10.
Patient position
in dependency to upper body distance
, lower body distance
, back joint distance
and leg joints distance
as well as the resulting distances
and
describing the rotational distances (
Table 4).
Figure 10.
Patient position
in dependency to upper body distance
, lower body distance
, back joint distance
and leg joints distance
as well as the resulting distances
and
describing the rotational distances (
Table 4).
In addition, for a proof of concept, it is sufficient to sum individual body parts into two big parts: the upper and lower body of the patient. This simplification allows a more straightforward calculation of the patient’s CoG. Specifically, for rigid body dynamics with homogeneous mass distribution, the patient’s upper body CoG position as well as the patient’s lower body CoG can be used to calculate the patient’s CoG .
To calculate the patient’s CoG, it is necessary to use a proportion for the moved upper body part with the back joint and a proportion for the impacted upper body part with the leg joint.
Neglecting the inhomogeneous mass distribution and deformation of the human body, the dynamics for rigid bodies can be applied to describe the position and velocity of the patient’s CoG:
To simplify matters, no tilting of the patient is considered, so only the rotation around the y-axis is considered for the leg and back joint, resulting in the y-axis value being irrelevant.
Other joints of the OR table are neglected in this scenario so that the positions of the joints
and
can be assumed as constants. Thus, they do not need to be measured and do not need to be contained in the state vector
. Another simplification to reduce the dimension of the state is achieved by using a constant velocity model rather than a constant acceleration model (
). The non-linear measurement function
can be calculated as follows:
4.2. Learning and Hybrid Filter Checks
Since the dynamic model is based on the assumptions of the standardized patient and a rigid OR table model with Gaussian process and measurement noise, the system will detect measurements with patients outside these norms as anomalies. Therefore, classical methods are expected to fail in these situations.
Thus, AE, LSTM model (
Section 2.4), and IF (
Section 2.3) are considered as standalone variants as well as hybrid variants in combination with the dynamic check for partial patient body movements (
Section 4.1).
It is common practice to use 1D Convolutional Neural Network (1DCNN) AEs (
Figure 11) for analyzing time series data [
52]. This design is also considered suitable for the scenario examined here. The encoder consists of two convolutional layers: a max-pooling layer and a dense layer. On the other hand, the decoder has a structure that is the mirror image of the encoder, which includes a dense layer followed by an unpooling layer and two deconvolutional layers. Finally, the decoder ends with another convolutional layer. The activation functions used are Rectified Linear Unit (ReLU), Leaky Rectified Linear Unit (LReLU), and Sigmoid.
The LSTM network model used in this study consists of three layers of LSTM units, with a dropout layer following each LSTM layer to prevent overfitting (
Figure 12). The activation functions can be adjusted accordingly: Each LSTM layer contains a ReLU activation function. In contrast, a linear activation function should be used in the dense layer at the output. Also, other activation functions are usable, such as Swish activation instead of ReLU in the LSTM layers. However, the first approach was selected to predict the absolute expected error in this case.
5. Prototypical Implementation and Evaluation
In the following, we apply the generic approach (
Section 4) and the safety system for detecting errors in load recognition systems of OR tables (
Section 3) to monitor the influence of the back movement of an OR table on the patient’s CoG.
5.1. Data synthesis Using Digital Twins
In the context of medical robotics, simulation models are becoming increasingly prevalent for developing and testing surgical procedures and devices. A digital twin of the Gazebo OR table (
Figure 13) is employed with Robot Operating System (ROS) 2 nodes for generating physical data, allowing the evaluation of algorithms under various scenarios. Simulation models offer a cost-effective and efficient approach to system development, minimizing the risk associated with physical prototypes and providing accurate representations of the system’s behavior and dynamics.
Flexible elements such as muscles and tissues need to be accurately represented to enhance the accuracy and reliability of patient models. Spring joints are applied here to model these elements, allowing a more realistic simulation of the patient’s body movement and deformation (
Figure 13). The use of spring joints also enables the representation of the flexibility of the OR table’s components, which is crucial for accurate simulations of the system’s dynamics. These deformations are not designed into the dynamic model (
Section 4.1) and, therefore, are expected to impact its estimation negatively.
The patient model is created in Unified Robot Description Format (URDF) based on relevant standards and regulations in medical technology. Here, the IEC 60601-1 [
53] is used as a foundation (
Figure 14). Flexible elements are introduced at the abdomen and upper legs for heavy patients to represent the patient’s movement and deformation.
5.2. Distributed Anomaly Detection
Our architecture for hybrid anomaly detection consists of the OR table and a backend system (
Figure 15). This server-client separation allows the computationally intensive data-based models to be processed in a cloud. In addition, flexibility is created as models can be exchanged or retrained without interfering with the embedded system of the OR table. The OR table uses a service-oriented architecture with Data Distribution Service (DDS) as middleware through the use of ROS2. The test setup consists of a controller service that represents the user interaction of the OR table. During simulation time, the OR table is moved to different target positions at random times.
Based on the Gazebo proxy data, a dynamic check of the patient CoG estimate is performed every 100 ṁs. By transferring the estimate and the measurement data to the backend system, the estimate is checked there for anomalies using data-based models. The results are logged in a Comma-Separated Values (CSV) file. The dynamic check of the CoG estimate is realized as an ROS2 node (see
Section 5.4). Together with the learning checks (see
Section 5.5–
Section 5.7), the system forms the hybrid filter. In the case of a connection failure, the system performs a simplified independent check based on the ROS2 node alone.
The connection between the backend system and the OR table is established over a REpresentational State Transfer (REST) interface. Several REST services provide the interaction needed for the dynamic checks and the simulation environment. At startup, the trained models are selected over a REST interface based on the currently determined patient weight and preferred data-based model (LSTM, IF, or AE) and loaded into the learning check. With another ROS2 node, random positions are commanded to the Gazebo OR table simulation that executes the desired joint movement. This is then sent to the Patient CoG Estimation dynamic check using the ROS1/ROS2 Bridge.
The dynamic check estimates each patient’s CoG, which is then sent to the backend system for evaluation. The inference is sent back to the node for the patient estimation. If the estimated value is an anomaly, the UI raises an alarm triggered by the dynamic check.
In a real-world OR scenario, we consider the backend system either as part of the hospital IT, as a dedicated cloud server provided by the device manufacturer, or as a supervision system directly in the OR, as proposed in [
54]. In the latter scenario, the OR table would send the estimation data to the OR monitoring system via a SDC network, which has more computational resources than the embedded system of the OR table. The dynamic check is considered to be executable on a performant embedded system and thus, for example, as part of a dedicated communication gateway of the OR table, comparable to the architecture proposed in [
7].
5.3. Classification of Deviations as Anomaly
As mentioned in
Section 2.4, a threshold must be found for the AE and the LSTM network used to ensure a clean categorization of the anomaly definition. These include:
The reconstruction error of the AE variants
The LSTM prediction of the difference value of the KF output to the measurement
The absolute difference of the pure LSTM position prediction to the measurements
When using the IF, the grouping is based on the path length and thus implicitly finds its threshold (
Section 2.3).
Similar to [
33], we have decided to use the 99th percentile. There are several reasons for this method over a manual threshold, such as the ability to adapt to new data and the creation of objective comparability between the approaches in the evaluation. The choice of percentile can also be changed. However, a low FPR is vital for our medical application as false alarms in hospitals [
55] lead to alarm fatigue, causing critical alarms to be missed [
56], and reducing the quality of care [
57]. Accordingly, we have selected a correspondingly high percentile to meet this target. In addition, each signal, e.g., CoG in x and z direction, is evaluated separately using a dedicated threshold. Therefore, an input tensor is classified as an anomaly when it exceeds the threshold value in one of these signals.
Four types of anomalies, each in the x and z position (
Figure 16), are used for the evaluation of the different algorithms based on the overview of anomalies by Weber [
45] (
Table 5).
5.4. Partial Body Movement UKF
As a dynamic check for partial body movements, the UKF and EKF algorithms have been implemented. These implementations are structured as ROS2 nodes, which subscribe to the positions and velocities of the CoG derived from the Gazebo simulation. The estimated CoG positions and OR table joint velocities are logged into a CSV file for comprehensive analysis and training of the data-based models. A dedicated internal class, Gazebo Proxy, facilitates communication between the Gazebo environment and the dynamic checks. The estimations are updated every 100 milliseconds to balance real-time responsiveness and computational efficiency. However, as the results are similar for the simulated scenario, only the UKF is discussed here for further examination because the EKF is expected to create similar results as its hybrid variant.
With the previously mentioned simplifications (
Section 4.1), the patient’s body is only divided in the lower and upper body. Thus, according to the standard patient of IEC60601-1 [
53], the upper body is considered to be
, and the lower body is considered to be
. When a patient is seated in a beach chair position (
Figure 10), which matches the scenario under examination, their upper body makes up around 63% of their overall mass.
5.5. Isolation Forest (IF)
The Python implementation of our IF utilizes the Scikit-learn [
58] library and its dependencies. Our implementation can take dynamic model estimations, sensor measurements, or their difference as inputs. The standalone variant is trained on the position measurements (
Figure 17). On the other hand, the hybrid variant of the implementation uses a two-dimensional input of the difference between the measurement and the UKF estimation in the x and z CoG direction (
Figure 18).
We do not select any time window to ensure the IF implementation is minimal, and evaluate the low-resource variant of the hybrid filter. As the IF does not require a manually predefined threshold for classification into anomalies (
Section 2.3), the training data primarily affect the evaluation metrics such as FPR and True Positive Rate (TPR), and there is limited manual influence for subsequent fine-tuning in inference time.
In our result, the pure IF (
Figure 17) is not capable of determining meaningful thresholds, leading to high FPR (∼40%) and False Negative Rate (FNR) (∼55.5%). At the same time, the hybrid variant centers the normal data in between the thresholds so that the FPR (∼13%) as well as the FNR (∼18.8%) can be improved significantly.
Since the trajectories of anomalies can be guessed from the hybrid plot, a disadvantage of the hybrid variant or at least of the classification of anomalies is revealed: The anomalies are classified based on the manipulation of the measurement data. It is not considered that the dynamic model, which is the UKF here, may need time to recover from a sudden change of the measurement after an anomaly back to normal data. Especially in jump scenarios, the difference between measurement and estimation will remain high for a short period until the dynamic model has recovered. These data points are not classified as anomalies here, which will also increase the FNR of all examined learning checks. However, it is noticeable in the time series of the pure IF (
Figure 19) that most of the detected anomalies do not correlate with the anomalies in contrast to the hybrid variant (
Figure 20).
5.6. LSTM
The LSTM network is realized in Python using the TensorFlow library [
59]. It is suitable for various feature combinations for predicting the output based on the position and joint velocity measurements and the estimated state of the KFs (
Section 4.1), which can be arbitrarily combined, including a variable time series.
As the LSTM predicts the next value (
Figure 21), we consider the pure variant as a deep learning variant to the dynamic model. Therefore, we chose all position and velocity measurements as input for a time window of 30 values, which corresponds to 3 s. However, the LSTM does have a disadvantage compared to the UKF, which is not iteratively updated based on all previous measurements. Instead, it uses an adequate time window, which needs to be exploratively determined.
The hybrid LSTM variant is designed to predict the estimated deviation of the following UKF iteration based on a time window of 20 values corresponding to 2 s. The error of the UKF estimation compared to the measurement (blue) (
Figure 22) is, therefore, directly predicted by an LSTM (orange). The absolute difference is then used as the basis for the anomaly determination (black).
The best results are achieved with the hybrid LSTM in terms of FPR as it decreases the FPR of the standalone LSTM by a factor of 30 (∼0.6% to ∼0.02%), while also decreasing the FNR by nearly 14%. In both cases, this leads to high precision. Furthermore, it is conspicuous that both LSTM variants have a high threshold based on the 99th percentile compared to the average non-anomaly data or even compared to peaks in the non-anomaly data. Other architectures of recurrent neural networks, such as GRU, will behave in a similar way to LSTM, which is why a closer look can be omitted at this point. They can also lead to serious problems such as vanishing gradients (vanilla LSTM) and are, therefore, considered obsolete.
5.7. Autoencoder (AE)
The PyTorch library in Python is used to implement the AE, which is used to reconstruct a desired time series of estimations and measurements [
60]. The reconstructed CoG x and z values in a time window of 30 values (3 s) are generated by the basic AE, (
Figure 23). The first hybrid variant, on the other hand, employs the difference between UKF estimation and measurement using the same window size (
Figure 24).
We calculate the reconstruction error for all variants as MAE, meaning that the average of all reconstructed values compared to the measurement is calculated as follows:
where
n: Number of data points in the window.
: Reconstructed value for the i-th data point.
: Measured value for the i-th data point.
While the AE has a higher FPR (2.3%) and FNR (65.5%) with smaller window sizes, another approach to the hybrid AE reveals improved performance compared to the difference between dynamic model estimation and measurement (FPR: 2.7%; FNR: 87.9%): When using the measurements and estimations as separate values, a time window of four results in a lower FPR and FNR (
Figure 25), while increasing the window size worsens the results. In addition, the latent space can be reduced from size 8 to 2 and improve the performance at the same time.
The hybrid AE shows that although the FPR and FNR cannot be lowered for the examined scenario, the second approach shows that training effort and window size can be decreased, which makes this approach more interesting for higher dimensional tasks. It slightly increases the FPR by 1%, but decreases the FNR by 16.3% compared to the UKF.
7. Conclusions and Future Work
Regarding feature reduction and possible data combination, whether the pure database models can keep their performance is questionable. More features may result in less distinguishable anomalies (distance in multidimensional space). Therefore, reducing necessary features and a smaller window size of the hybrid filters may outperform the entire setup for position monitoring of the patient. Furthermore, in all cases, the hybrid variant improves performance. In addition, for the AE and LSTM cases, the feature set, including the necessary window size, can be reduced compared to pure data-based models.
An application-specific advantage of the proposed check is that the position of the patient represented as CoG results from the entire kinematic chain and thus the entire OR table joint positions. Checking the patient’s CoG is sufficient to check all other joint positions for plausibility. If one of these positions is determined incorrectly, this leads to deviations in the movements. For example, suppose the patient is tilted around their longitudinal axis. In that case, there will also be a change in the y-axis during a movement of the upper body, which is not predicted by the dynamic model (
Section 4.1) and would, therefore, lead to deviations.
The scenario chosen for evaluation is only a small subset of the capabilities of the OR table and the possibilities of patient positioning and anthropometry. The data collected via a simulation of 5 h of driving the back joint in a static scenario is enough to yield results on the performance of several algorithms. It is questionable if all the scenarios not examined here can be covered with the data collected in the field. While in the automotive industry manufacturers can reach out to the data of millions of cars increasing year by year, in the medical device industry, such as for the OR table, manufacturers can only use the data of few hundred to a thousand each year. In addition, not all of these OR table are used for all patient types, positions, and surgical procedures [
7].
The primary focus to improve the presented hybrid anomaly detection lies in selecting an appropriate data-based model and design, as this choice substantially impacts the results. Following this, optimizing the dynamic model becomes the next priority. The lower priority is the enhancement or expansion of the training dataset, serving as the final adjustment to further refine overall performance.
As the hybrid AEs, in general, needs smaller windows and especially the IF could be improved significantly using the UKF estimation, we assume that most of the historical information is stored in the state of the KF. Therefore, also other internal states of the KF that result from sensor fusion and are not measured, e.g., or , can additionally improve the results. Thus, we expect the performance of the hybrid filters to be heavily dependent on the design of the dynamic model.
As this paper concentrates on a single device, the approach can be applied to interoperable medical devices in an SDC network (compare with the approach presented in [
54]). The different devices in the network may observe each other, creating a more robust system of systems. Each medical device in the OR is supervised by backend features, and compromising a single heavily connected device might be more challenging than a standalone system with minimal connectivity interfaces. However, this also requires analyses of response times, especially when hard real-time is needed.
To enhance the outcome, it is possible to assess other KFs that utilize Interacting Multiple Model (IMM) algorithms to adjust to varying patient positions and types. Moreover, since the second variation of the hybrid AE employs a small window, a DenseAutoencoder may be more appropriate than the currently selected 1DCNN AE.
The current estimation could be corrected to the likely value as another possible next step to recover from anomalies. This can be conducted with the values predicted from the LSTM or the AE’s reconstructed window. This could also increase the availability of the medical device in case of anomalies, vastly improving safety.
In the case of a safety or security incident, it is mostly not important which data point was an anomaly, but only the fact that an anomaly has occurred for several seconds or even minutes and with high confidence (small FPR). Therefore, the uncertainty should be calculated and logged, e.g., using Monte Carlo Sampling.
Finally, a comprehensive runtime and resource consumption evaluation must be carried out in the form of a benchmark for the various combinations of algorithms, also in comparison to their pure variants. This also includes the runtime evaluation for distributed approaches using a backend system. However, this requires a more extensive prototype implementation in a distributed architecture under hospital network conditions.