1. Introduction
Recent innovative work and research progress in the areas of integrated circuits, wireless communications and sensors have enabled the creation of smart devices. Many fields now use such devices, for example ubiquitous healthcare systems. The omnipresence of heterogeneous technologies has opened the way to better approaches for solving many current issues related to connected devices, and has highlighted the importance of the Internet of Things. Smart connected sensors are principal components of wireless body area networks (WBANs) and biosensors, and are fast becoming a key aspect of the Internet of Things. A WBAN is an emerging technology consisting of tiny sensors and medical devices placed on the bodies of humans or animals. It connects wirelessly to sense vital signs, and allows the remote, real-time diagnosis of health issues. Technologies used in association with wireless sensor networks and the Internet of Things are closely linked to many fields, such as e-health [1], gaming, sports [2], the military [3] and protected agriculture, where many applications have been built [4]. A recent medical study reports that the average age of the populations of developing countries is increasing [5]. Healthcare organizations are faced with a significantly larger elderly populace [6,7], which is becoming a major worldwide health problem. Ubiquitous systems offer the possibility of simplifying and easing access to health services [8], especially at the end of intensive care. Through collaboration between these smart systems, WBANs offer a number of medical sensors that are capable of gathering vital physiological information, such as blood pressure, heart rate and oxygen saturation, and then transmitting these data wirelessly to be analyzed for a given process and task. In order to ensure the wellbeing of patients with chronic diseases, hospitals, smart homes and medical centers need to be equipped with a wide assortment of e-health solutions and connected medical devices. Hypertension, for example, is a frequent health issue that has no evident symptoms; it is therefore important to monitor blood pressure regularly, as the only way to control its changes is through medication. The classification of medical abnormalities such as hypertension involves detecting and classifying data samples that present a problem or a danger to a patient’s health. The use of multi-agent systems is a research trend that can help to solve problems that are difficult for a monolithic system to deal with. The aim of our approach is to help medical staff to diagnose health disorders, and to detect or predict anomalies and seizures. The system proposed in this paper uses a specific classification model for each patient, meaning that decision making is specific and carried out on a case-by-case basis. The Gaussian distribution applied in this work takes into account the various detected anomalies in order to analyze them. This facilitates learning of the behavior and habits of a patient while guiding the medical treatments that can alleviate any medical conditions. Statistics is used in many everyday applications, and many machine learning and data mining systems assume that the data fed to these models follow a normal (Gaussian) distribution, allowing inferences to be made from a sample to a parent population. To support the results obtained, a comparative study of different classification techniques is presented in the results section.
Despite continuous research and improvement efforts from industry, wireless sensor networks face many issues associated with network topologies, energy usage and restricted resources. These sensors are placed in, on or around a patient’s body in order to gather and transmit medical information such as electrocardiograms (ECGs), blood pressure (BP), movement and so on [9].
Figure 1 illustrates a WBAN architecture applied to healthcare, detailing the different communication types. The applications of WBANs are diverse, and range from e-health and ambient assisted living to mobile health and sports training.
Wireless sensor networks provide a communication framework that enables the sensing, gathering and forwarding of data to a central destination for further data analysis and decision making [10]. One of the main concerns in these networks is to guarantee their energy efficiency for as long as possible. This collection of sensors can be linked with the Internet of Things to provide innovative applications for scientific research purposes. In order to develop our smart system, we combine many fields of computer science, such as wireless sensor networks, classification techniques and probabilistic approaches. The proposed architecture consists of a smart connected system for ubiquitous healthcare applications. The aim is to use WBAN devices as a real-time medical data source and to provide an autonomous intelligent system that is capable of monitoring patients in a medical setting.
Table 1 summarizes the nomenclature and the technical specificities of the main sensors used in a single- or multiple-WBAN environment. It is important to note that this research work focuses on the BP-related sensor; the other sensors, such as ECG, EMG and EEG, are shown for information purposes only. In this paper, we examine only on-body area networks; blood pressure sensors belong to this group, since these devices are attached to the patient.
Currently, healthcare is one of the top challenges that every country is facing [11]. Numerous ubiquitous healthcare structures are outlined here, and various perspectives are taken into consideration, such as the reliability of the vital patient information to be transmitted, and the lifetime and monitoring of battery utilization based on optimized routing protocols [12]. There is a rapidly growing body of literature on ubiquitous healthcare systems, which highlights their value and contains many examples of research in this field. The objective is to propose a healthcare solution that is capable of monitoring a patient’s vital signs and notifying physicians and/or technicians wirelessly when these metrics exceed certain limits. In this section, we give a brief overview of the state of the art of ubiquitous systems and several examples that are related to our work. In [13,14,15], a ubiquitous healthcare system consisting of physiological data-gathering devices using medical sensors offers monitoring and management solutions for a patient’s health condition. Several works propose interesting solutions applied to e-health and ubiquitous systems; for example, Kang et al. [16] used EEG sensors in order to classify stress status based on brain signals, while Kim et al. [17] designed a mobile healthcare application based on an image representation of the tongue, a vital muscular organ. A summary of the energy requirements of a WBAN-based real-time healthcare monitoring architecture can be found in Kumar et al. [18]. Hanen et al. [19] implemented a multi-agent-system-based mobile medical service using the framework for modeling and simulation of cloud computing infrastructures and services (CloudSim). Hamdi et al. [20] created a software system that improved the maintenance management of medical technology by sorting medical maintenance requests. O’Donoghue and Herbert [21] presented a data management system (DMS) architecture, an agent-based middleware that utilizes hardware/software resources within a pervasive environment and provides data quality management. Lee et al. [22] proposed a management system for diabetes patients based on generated rules and a k-nearest neighbors (KNN) classification technique. Ketan et al. [23] developed a healthcare system for diabetes patients that makes it easier to obtain a rapid diagnosis.
A ubiquitous healthcare system generally comprises three components:
A portable bio-signal data-gathering device, represented by wired/wireless connected sensors;
A device for transmitting previously collected data by communicating with a remote server;
A server used to investigate the patient’s medical information.
Despite several contributions to significant advances in the world of ubiquitous services, these studies did not provide data analysis support that can help medical staff to correctly interpret the various changes in a patient’s medical metrics. In addition to medical data classification, our approach fills this need by offering anomaly detection based on the learning model, as well as a probabilistic study that can illustrate in depth the physiological behavior of the patient by comparing the distributions at different critical points throughout the data-recording period.
Our approach therefore meets two key needs:
Provide a solution adapted to each patient on a case-by-case basis. This implies a dedicated learning model for each subject in this analysis, and responds to the challenge of generalizing the interpretation of the various medical data (age, environment, habits, etc.) in order to provide as accurate a diagnosis as possible;
Assess the severity of the situation with a second classification method once an anomaly is detected. This meets an immediate need, helping to regulate medical intervention when a patient is in a critical condition.
The remainder of this paper is organized as follows: following this introduction, Section 2 moves on to discuss the methodologies used to implement the proposed architecture; Section 3 carries out an analysis in which the results are interpreted and explained; Section 4 describes the significance of the findings based on the achieved results; finally, Section 5 presents the conclusion and open areas for research, and proposes potential solutions to issues that are currently faced in this study.
4. Discussion
The initial number of collected values was 500 for each dataset. During the first phase of classification using SVM, anomaly detection ratios of 68.2%, 75.4% and 55.2% were observed for Dts_1, Dts_2 and Dts_3, respectively. Comparing these results with those of the other two classifiers, namely linear discriminant analysis (LDA) and k-nearest neighbors (KNN), it could be seen that SVM achieved the best scores. We considered as an anomaly any tuple {Sys, Dias} in which one or both metrics presented a value outside the IBP standard given in Table 2. As a result, critical values were further penalized, providing a very realistic reading that did not neglect any data that could represent a danger.
Table 15 summarizes the results obtained and the calculated ratios.
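As an illustration of this first classification layer, the minimal Python sketch below labels {Sys, Dias} tuples as anomalies whenever either metric leaves the normal ranges reported later in this section ([90, 120] mmHg for Sys and [60, 80] mmHg for Dias) and fits an RBF-kernel SVM on the labelled data; the sample readings and parameter values are illustrative assumptions rather than the study’s actual data.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical blood-pressure readings as {Sys, Dias} tuples (mmHg).
readings = np.array([[110.7, 67.7], [123.2, 78.6], [93.0, 67.6], [75.5, 55.5]])

def is_anomaly(sys, dias, sys_range=(90, 120), dias_range=(60, 80)):
    """A tuple is an anomaly if either metric falls outside the IBP standard."""
    return not (sys_range[0] <= sys <= sys_range[1]
                and dias_range[0] <= dias <= dias_range[1])

# 1 = anomaly, 0 = normal; the whole tuple is penalized if either metric is abnormal.
labels = np.array([int(is_anomaly(s, d)) for s, d in readings])

# First classification layer: an RBF-kernel SVM trained on the labelled tuples.
clf = SVC(kernel="rbf", C=10.1, gamma=0.01).fit(readings, labels)
```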
During the optimization phase of the SVM classifier parameters, the representative curves of the different datasets appeared to show the same behavior. The higher the value of C, the better the cross-validation score, up to a certain value at which it reached a steady state. It is worth noting that a large C resulted in low bias and high variance: the system heavily penalized the cost of misclassification, forcing the model to fit the training samples more closely. Conversely, a small C led to higher bias and lower variance, which made the decision surface smoother. For the three datasets, the C values ranged within [10.1, 40.4]. The gamma parameter specifies the scale of influence of a single training example; the higher the gamma value, the more closely the model tried to fit the training dataset. For the three datasets, γ = 0.01. Increasing the values of C and γ may lead to overfitting of the training data. During this critical learning process, both parameters were used to evaluate the performance of our system by comparing the training results with the cross-validation scores. As mentioned above, the RBF-kernel-based support vector machine returned the best results. This was illustrated using the confusion matrix as well as metrics such as precision, recall, F1-score and accuracy. The plots in Figure 15 show that the training and validation scores increased to a certain point of stability, with only slight differences recorded between them; the low scores at small gamma values were a sign of under-fitting, and the classifier then operated properly for medium and high gamma levels.
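A validation curve of the kind plotted in Figure 15 can be reproduced with a short sketch such as the one below; the synthetic {Sys, Dias} data and the gamma grid are assumptions used purely for illustration.

```python
import numpy as np
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC

# Synthetic stand-in for one patient's dataset: 500 {Sys, Dias} tuples,
# labelled 1 when either metric leaves the assumed normal ranges.
rng = np.random.default_rng(0)
X = rng.normal(loc=[110, 70], scale=[15, 10], size=(500, 2))
y = ((X[:, 0] < 90) | (X[:, 0] > 120) | (X[:, 1] < 60) | (X[:, 1] > 80)).astype(int)

# Training vs. cross-validation score as gamma varies, with C held fixed.
gamma_range = np.logspace(-4, 1, 6)
train_scores, valid_scores = validation_curve(
    SVC(kernel="rbf", C=10.1), X, y,
    param_name="gamma", param_range=gamma_range, cv=5)
print(train_scores.mean(axis=1), valid_scores.mean(axis=1))
```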
Note that before the optimization phase, the accuracies ranged within [0.95, 0.99] for SVM. Dividing our learning data into k = 5 folds was thus effective in protecting our analysis from overfitting and/or underfitting. Using this process, the three datasets responded positively by returning results that far exceeded the decision thresholds and that we believed were ideal. In this study, we set a minimum accuracy threshold of 80%; metric scores above this value were therefore retained. This threshold was largely exceeded, with scores sometimes reaching a perfect 1 (100%), essentially when k-fold-based cross-validation was used to validate our study.
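The joint tuning of C and γ with k = 5 fold cross-validation and the 80% acceptance threshold can be sketched as follows; the data are synthetic and the parameter grid (spanning the C range [10.1, 40.4] and values of γ around 0.01 mentioned above) is an assumption for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic {Sys, Dias} dataset standing in for one patient's 500 readings.
rng = np.random.default_rng(1)
X = rng.normal(loc=[110, 70], scale=[15, 10], size=(500, 2))
y = ((X[:, 0] < 90) | (X[:, 0] > 120) | (X[:, 1] < 60) | (X[:, 1] > 80)).astype(int)

# Joint search over C and gamma, scored with k = 5 fold cross-validation.
grid = GridSearchCV(SVC(kernel="rbf"),
                    {"C": np.linspace(10.1, 40.4, 4), "gamma": [0.001, 0.01, 0.1]},
                    cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_, grid.best_score_ >= 0.80)
```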
Before applying our normal distribution approach, it was essential to justify this step by testing whether our data allowed the use of this probabilistic solution. This was done using two different and complementary techniques: the Q–Q plot and statistical tests. In the Q–Q plot, with the sorted sample values on the y-axis and the expected theoretical quantiles on the x-axis, the sections of the graph where the values locally departed from the straight line indicated whether they were more or less correlated than the theoretical distribution. The technique yielded quite similar representations since the distribution was the same (normal), and the observations took a symmetrical form, so that no bias was observed (the mean was equal or close to the median). Almost all the points fell on the straight line, with some observations curving slightly at the extremities. This could be a sign of light-tailed behavior, since the sample grew more slowly than the normal distribution approximately over (−3, −2) and reached its highest quantile before the standard normal distribution over (2, 3). Thus, all the results achieved in this regard supported our approach. For the second test, different statistical algorithms were required to support our assessment. This was done using four methods, namely Anderson–Darling (AD), Shapiro–Wilk (SW), Kolmogorov–Smirnov (KS) and D’Agostino and Pearson (DP). The results obtained fit perfectly the standards depicted in
Table 11, as the p-values were higher than 0.05 for SW, KS and DP, while the AD statistic was lower than its critical values. By demonstrating in two different ways that our data provided an ideal basis for applying the normal distribution, this step was successfully completed.
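For reference, the four checks can be run with scipy.stats as in the sketch below; the sample is a synthetic stand-in, and Anderson–Darling is reported as a statistic against critical values rather than a p-value.

```python
import numpy as np
from scipy import stats

# Synthetic systolic sample standing in for one patient's recordings.
rng = np.random.default_rng(2)
sample = rng.normal(loc=110, scale=12, size=500)

print("Shapiro-Wilk        p =", stats.shapiro(sample).pvalue)
print("Kolmogorov-Smirnov  p =", stats.kstest(stats.zscore(sample), "norm").pvalue)
print("D'Agostino-Pearson  p =", stats.normaltest(sample).pvalue)

ad = stats.anderson(sample, dist="norm")   # statistic compared to critical values
print("Anderson-Darling statistic =", ad.statistic, "critical values:", ad.critical_values)

# The Q-Q plot can be drawn with stats.probplot(sample, dist="norm", plot=plt).
```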
In order to clearly present our approach, an explanatory graphical representation was developed. This concerns Figure 13, where the most important views of the data are illustrated, namely the boxplot, the histogram and the plotted normal distribution. In the latter, we marked the exact area likely to be dangerous for our patient, i.e., the metrics (highlighted in orange) that exceeded the set threshold. During the binary classification of the dangerousness of these anomalies using the normal distribution, the following ratios were noted: 43.7%, 31.8% and 81.1%, respectively, for Dts_1, Dts_2 and Dts_3.
Table 16 summarizes the obtained calculations and ratios.
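A plot in the spirit of Figure 13 can be sketched as follows, with the shaded area marking the readings beyond the upper systolic bound; the sample and threshold are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Synthetic systolic sample; the fitted normal curve is shaded beyond 120 mmHg.
rng = np.random.default_rng(3)
sample = rng.normal(loc=110, scale=12, size=500)
mu, sigma = sample.mean(), sample.std()
threshold = 120  # assumed upper bound of the normal Sys range

x = np.linspace(sample.min(), sample.max(), 300)
pdf = stats.norm.pdf(x, mu, sigma)
plt.hist(sample, bins=30, density=True, alpha=0.4)
plt.plot(x, pdf)
plt.fill_between(x, pdf, where=x > threshold, color="orange")
plt.xlabel("Systolic blood pressure (mmHg)")
plt.show()
```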
As indicated in this process, two classes were taken into account. This number might vary according to the distance, in units of σ, separating a value from the mean μ of the sample of anomalies. This considerably reduced the number of anomalies, keeping only those most likely to represent a more or less life-threatening factor for the patient. Looking closely at the results obtained in Table 13, it can be seen that the number of Class 1 anomalies was greater than that of Class 2. This was due to the intrinsic nature of this distribution: the further a value deviated from the mean μ, the fewer such values there were. This justified the different results obtained. It should be mentioned that it was normal to produce results in which no Class 1 anomalies appeared to the left and/or to the right of μ, on the grounds that the values within this area did not represent a danger, since they had previously been classified as normal and ranged within [90, 120] and [60, 80] for the Sys and Dias measures, respectively. With this in view, the amount of data undergoing this classification was quite small compared to the original dataset, since this step classified the anomalies and then retained only those representing a real health risk for our patients. This considerably reduced the final number of readings we investigated. We also considered it important and complementary to calculate the time required for this classification and for the previous one.
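A minimal sketch of this second layer is given below; the rule that Class 1 covers anomalies within k = 2 standard deviations of the anomaly-sample mean and Class 2 those beyond it is an assumption chosen for illustration, not the exact boundary used in the study.

```python
import numpy as np

def grade_anomalies(values, lo=90, hi=120, k=2.0):
    """Grade anomalous readings by their distance from the mean of the anomaly
    sample, in units of sigma. Values inside [lo, hi] are already 'normal' and
    are excluded; k is an assumed band width, not a value from the paper."""
    values = np.asarray(values, dtype=float)
    anomalies = values[(values < lo) | (values > hi)]
    mu, sigma = anomalies.mean(), anomalies.std()
    z = np.abs(anomalies - mu) / sigma
    return anomalies, np.where(z <= k, 1, 2)   # Class 1 (risky) vs Class 2 (critical)

sys_values = [123.2, 135.4, 80.1, 150.9, 118.0, 88.5, 121.3]
anoms, classes = grade_anomalies(sys_values)
print(list(zip(anoms.tolist(), classes.tolist())))
```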
Table 17 shows the fit times obtained with normalized parameters during the classification process. A time optimization was achieved for all the datasets by scaling the data between 0 and 1.
In terms of time complexity, it was clear that the performance was rather favorable for the SVM, LDA and KNN algorithms, since their processing times did not exceed 0.04 s, in contrast to XGBoost and random forest, which returned perfect results but in a significant time, sometimes of around 1.97 and 0.01 s, respectively. With XGBoost, training generally took longer because the trees were built sequentially. We can therefore conclude that our SVM-based approach returned very satisfactory classification results in a very short processing time compared to the other algorithms: the accuracy was equal to or slightly less than 1 (100%), while the time did not exceed 0.003 s. These scores therefore validated the performance of this approach. It was also important to introduce a comparison with a multi-class classifier (number of classes > 2). This was done using SVM, but this time defining three classes instead of two, called safe (S), risky (R) and critical (C). The Class 0 data returned by the first classification layer using SVM were labeled S, while data that belonged to Class 1 and Class 2 when the normal distribution was applied were labeled R and C, respectively. The results obtained during this multi-class classification process are illustrated in Table 18, while Table 19 shows its time duration.
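The fit-time comparison can be reproduced along the lines of the sketch below; the data are synthetic, XGBoost is omitted to avoid an extra dependency, and the measured times will of course depend on the machine.

```python
import time
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

# Synthetic {Sys, Dias} dataset, scaled to [0, 1] before timing each fit.
rng = np.random.default_rng(4)
X = rng.normal(loc=[110, 70], scale=[15, 10], size=(500, 2))
y = ((X[:, 0] < 90) | (X[:, 0] > 120) | (X[:, 1] < 60) | (X[:, 1] > 80)).astype(int)
X = MinMaxScaler().fit_transform(X)

for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("LDA", LinearDiscriminantAnalysis()),
                  ("KNN", KNeighborsClassifier()),
                  ("RF", RandomForestClassifier())]:
    start = time.perf_counter()
    clf.fit(X, y)
    print(f"{name}: fit time = {time.perf_counter() - start:.4f} s")
```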
Based on the results of Table 18, we note that this method returned average training and testing accuracies of less than 0.76 for the first two datasets, while it returned scores higher than 0.87 for Dts_3, which was acceptable given our pre-set threshold of 80%. Regarding the processing time, this approach did not exceed 0.003 s for any of the three datasets, which is still a very good time compared to those obtained using XGBoost and RF. These comparisons between our two-layer classification approach, using SVM followed by the normal distribution, and the approaches using, first, the powerful tree-based classifiers XGBoost and RF, and then the multi-class SVM classifier, showed that our approach achieved near-perfect scores while maintaining a very low processing time.
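The multi-class comparison can be sketched as follows; the three labels 0/1/2 stand for S, R and C, and the labeling rule used here is a hypothetical stand-in for the two-layer process described above.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = np.column_stack([rng.normal(110, 15, 500), rng.normal(70, 10, 500)])

def label(sys, dias):
    # 0 = safe (inside the IBP standard); otherwise an assumed cut separates
    # risky (1) from critical (2) readings for illustration purposes.
    if 90 <= sys <= 120 and 60 <= dias <= 80:
        return 0
    return 2 if abs(sys - 110) > 30 or abs(dias - 70) > 20 else 1

y = np.array([label(s, d) for s, d in X])

# Single multi-class SVM (classes S/R/C) validated with k = 5 folds.
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=5)
print(scores.mean())
```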
Table 20 illustrates an example of the use of this intelligent system. Several measurements with different values were taken as inputs in the tuple form {Sys, Dias}, and the processing based on our approach was then carried out.
As indicated in the description of our approach, if one of the measurements of the tuple {Sys, Dias} represented an anomaly, then the entire tuple was considered to be a danger to the patient. Mean and Std represent the mean and standard deviation of the sample from which the measurements were taken. The tuples {93.02, 67.65} and {110.73, 67.70} returned a negative risk alert, since they both belonged to the IBP; there was therefore no need to carry the process further. The tuple {123.19, 78.63} returned a high-risk alert because its Sys value belonged to Class 1. Finally, the tuple {75.53, 55.49} returned a critical-risk alert, given that its Sys and Dias values belonged to Class 2 and Class 1, respectively; the tuple was therefore considered a Class 2 alert according to the rules underlying our approach, which automatically assign the tuple to the higher of its two child classes.
Table 21 depicts those rules, based on our introduced logical operator Җ, where S, R and C represent normal, Class 1 and Class 2, respectively.
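The rule behind the Җ operator amounts to taking the more severe of the two per-metric classes; a minimal sketch:

```python
# Severity ordering assumed from the text: S (normal) < R (Class 1) < C (Class 2).
SEVERITY = {"S": 0, "R": 1, "C": 2}

def combine(sys_class, dias_class):
    """Tuple-level class = the higher of the two per-metric classes."""
    return max(sys_class, dias_class, key=SEVERITY.get)

assert combine("S", "S") == "S"
assert combine("R", "S") == "R"
assert combine("C", "R") == "C"   # matches the {75.53, 55.49} example above
```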
Figure 16 summarizes our approach from data collection, through the discovery of the learning model based on the two classification processes, to the final model used to precisely classify any abnormality that might present a high or a critical risk for the patient.
So, this research work responded to our goal of proposing an intelligent system that is specific to each patient and of avoiding knowledge based on generalized thresholds; in other words, our study offered a detection learning model adapted to a single patient on a case-by-case basis. Future work will focus on modeling processes using techniques that optimize the active search for intervals and boundary zones related to the results achieved with the normal distribution. This would allow deeper learning of the patient’s habits, and thus the adaptation of a given medical treatment.