1. Introduction
The disease characterized by making the patient constantly exhibit elevated levels of blood glucose is referred to as diabetes mellitus. There are three different diabetes mellitus types [
1]. Type I diabetes mellitus refers to the stage of juvenile deviance when the pancreas does not produce enough insulin, or even produces no insulin at all. Type II diabetes mellitus is associated with the inability of the patient’s body to use the secreted insulin. There is also gestational diabetes that indicates the condition in which a woman, without diabetes, has high blood glucose levels during pregnancy.
Millions of people worldwide are afflicted by Diabetes. It is considered an epidemic disease in the United States and a growing disease in the world. It is predicted to continue to be a major health crisis even with the recent advances in treatment [
2].
Diabetes treatment is done by controlling the values of blood glucose using artificial insulin and controlling quantity of carbohydrate intake. Controlling the values of glycated hemoglobin can lead to a healthy life for a diabetic [
3]. Not following a diet, limiting sugar and carbohydrates, and not taking the necessary dosage of insulin can lead to serious health problems such as retinopathy, kidney damage, heart diseases and stroke [
4].
Projections of the annual cost for diabetes treatment are updated every few years and each new prediction is higher than the previous one. A study from 2010 estimated that by 2030 the annual cost in the U.S. would be of
$490 billion dollars [
5], while a newer study from 2017 shows a prediction of
$622 billion dollars [
2]. In relation to the number of diabetics, it is estimated that there were 372 million diabetics in 2012 and this should increase to 552 million diabetics by 2030 [
6,
7]. New research already indicates the estimates up until 2040 [
8].
Glucose measurement and control is aided by glucose monitoring devices, a more in-depth study about the history and future of glucose measurement devices can be seen on [
9]. The most common type of glucose measurement device are the Blood Glucose Meters (BGMs) [
10] which requires the user to perform a fingerstick test to acquire blood for the test. There is also the use of non-invasive glucose monitoring (NGM) [
11], which have the benefit of making a glucose test more convenient by not needing a fingerstick test, but, on the other side, it doesn’t give a precision as good as a traditional BGM and are not recommended to be used as the only device to take medications and make treatment decisions [
12].
Continuous Glucose Monitoring Systems (CGMS) provides a large number of glucose measurements by measuring blood glucose at every few minutes. This makes them useful for detecting hypo-glycemia and hyper-glycemia and for choosing the insulin dosage [
13,
14]. This higher quantity of information available enables them to make a more complete glucose profile of the patient than blood glucose meters which only makes few measurements each day. However, their use in clinical practice is hampered by their accuracy, and this has led to research for better signal processing methods [
15]. In addition, there are some details that should be noted [
16,
17]:
Cost: this is the main limitation, as it is beyond the reach of most diabetics.
May be invasive: most of the commercially-available CGMS are based on invasive techniques.
Learning curve: It employs a complex procedure where the diabetic needs to be trained and educated to use it.
Calibration: In some cases, the application of some calibration may be necessary to adjust the performance according to the established standards of glucose monitoring. In these cases, then, continuous devices do not completely eliminate the need for blood tests by a BGM.
A significant amount of data must be collected, processed and stored over time, for proper monitoring. The process of discovering knowledge through these data may consider visualization mechanisms to support the recognition of relevant information and its presentation to patients and physicians. Therefore, the information visualization can be referenced as part of a understanding process that aims to achieve in-depth knowledge about a topic and it is critical to create a consistent mental model associated with a particular situation [
18], for example, the patient’s current condition and its evolution over time.
The composition of different points of view may allow the data analysis from different perspectives, complementing the information. Thus, a system that uses two or more distinct views to support the investigation of a subject is called Multiple Views. Such systems minimize the occurrence of misinterpretations when compared to analysis performed through a single view [
19].
In view of this, there is a need for Systems capable of providing health care professionals and patients with useful information about in glucose concentration and its fluctuations throughout the day [
20,
21,
22]. Moreover, tracking and maintaining traceability between glucose measurements, insulin doses and carbohydrate intake can provide useful information to physicians, health professionals, and patients. This may enable the prevention of hypoglycemia and allow the diabetic to maintain healthy levels of blood glucose concentration with a predicted glycemic variability.
Therefore, this paper presents an information system, called GLUMIS, aimed to support diabetes management activities. It encompasses a prediction rule-based method for glucose measurement, a reasoner and visualization elements. Through integration with BGM and NGM devices and CGMS, it is possible to collect historical treatment data and, with a REALI system, insulin doses and dietary habits can be processed. This paper is an extended version of a previously published paper on ITNG 2018 [
23], and the differences between the original version and this extended one are explained on
Section 2.
The data used in the experiment were obtained as a result of a partnership established with a medical clinic and has been protected for confidentiality reasons. Considering this real context, in order to obtain evidence on the system viability, an experimental study was carried out.
This article is structured by this introduction and
Section 2 shows the background in which the proposal is inserted and some related work.
Section 3 details the new classification algorithm proposed on this paper, which was named the Multidimensional Multiscale Forest (MMF).
Section 4 presents the components of the GLUMIS system. In
Section 5, the experiment was carried out and
Section 6 provides the final considerations.
3. Multidimensional Multiscale Forest
One important module of the GLUMIS system is the glucose prediction. It enables the user to make changes on his treatment based on his own past history of glucose measurements, which can help to avoid hypoglycemia and hyperglycemia. The data input for the prediction module is data received from a glucose measurement device. In addition, the output is a predicted glucose interval for a future time.
Section 3.1 explains the data used for the prediction module and why other techniques such as regression are ineffective while
Section 3.2 explains more in detail the proposed method.
4. GLUMIS System
The proposed GLUMIS system arose from the need to maintain traceability between diabetes management data to support physicians and health professionals on decision-making regarding patient care as well as characteristics that influence treatment such as eating habits, for example.
Figure 5 shows the GLUMIS system architecture with its main components and integration with the REALI system.
In addition to integrating with the REALI tool, the GLUMIS system is capable of integrating data from three different sources: CGMS systems, NGM devices or importing data from glycemic meters (BGM). There is also the possibility of manually entering the glucose level data, the date and time of the measurement. The data is uploaded regularly to the GLUMIS system forming a repository of historical data. A knowledge database is also maintained as decisions are taken, treatment is changed and this information is recorded to support future decisions. The REALI system, the integration feature and the other components of GLUMIS are detailed below.
4.3. Reasoner—Analysis of the Decision Tree
The tree is composed of nodes, each representing a rule. While nodes from the top of the tree are more generic rules, with a higher probability of being correct but lacking precision, nodes from the bottom of the tree are the opposite, having more precision but being less likely to be correct.
The search for a rule can be done either manually or automatically. The manual search for a usable rule should start from the top of the tree and descending until a good trade-off between precision and reliability is found. Such decision of what rule has a good trade-off should be made by an expert, either the patient or his physician. Due to the nature of the way the tree is built, every non-terminal node is connected to its two child nodes, the left one always having its center with a smaller value than the center of the right child, making the tree similar to a binary search tree. Thus, the search for a good rule can be done in an efficient way.
An automatic search for a rule can be done by the GLUMIS system. The user inputs his/her latest measured glucose, the time period of the measurement, the time period for the future value to be predicted and how reliable the rule should be. The reliability of a rule is given by the number of terminal nodes used to create it. Next, the GLUMIS system searches for nodes whose first range contains the inputted glycemia with reliability higher than, or equal, to the desired one. Finally, nodes meeting these criteria are displayed, sorted by reliability and by the distance from the center of its first interval to the indicated blood glucose. The Prediction Tree view, presented in the next section, supports this search process.
Advantages of Predicting on a Continuous Domain
The problem of predicting future glucose values differs from traditional classification problems because, while the latter has a discrete number of target classes, glucose values are a continuous domain. Clustering the glucose values into an arbitrary number of classes presents a serious problem in case the patient’s glucose values are clustered around the boundary between two classes. For example, if the class good is defined with values ranging from 80 to 120 and the class medium has values from 121 to 160, then if the measured values vary from 110 to 130, part of them will be classified as good and the other part as medium. This making the prediction difficult since part of his glucose is classified as medium, which would call for a raise on insulin dosage, while the other part part, classified as good, poses a risk of hypoglycemia in case the raise is applied. Because of this problem, most well known classifiers are unable to present rules with enough precision for a change on the treatment since they require the use of discrete target classes.
The main advantage of the proposed method over other classification algorithms lies in its ability to produce rules as intervals. Thus, there is no need for the data to be discretized. This allows for a more realistic overview of the behavior of glucose changes over time. In the example presented above, the algorithm would indicate a range of 110 to 130, which correctly models the patient, thus facilitating a correct treatment change. If a prediction turns out to be wrong, outside the range that it should be, the error is calculated as the distance between the measurement to the interval. If the distance is small, this error may not prove to be a problem for the patient’s health.
A secondary advantage of this method is its ability to only predict glucose from measurements that do not differ by more than the interval chosen during the training. This feature serves as a safety device, especially in applications such as this one which deals with a health application. For example, if a very different measure from those used in training appears, the algorithm will not give a prediction; instead, it will indicate the possible existence of a new glycemic profile of the patient that is safer than giving a prediction with high probability of error, which may result in hypoglycemia.
5. Evaluation
This section presents the experimental study conducted. According to the Goal/Question/Metric approach (GQM) [
29], the goal can be stated as: “
Analyze the GLUMIS system
in order to verify the feasibility of use
with respect to the task of interpreting the views and rules
from the point of view of patients and physicians
in the context of diabetes management through a decision support system”.
In this sense, two questions were defined to be answered by the participants, for each visualization of the system. In Question 1 (Q1), the participant should assess the degree of comprehension associated with the visualization and information presented. For this, they should use a scale where 1 indicates very difficult and 5 indicates very easy. In Question 2 (Q2), if the participant agreed that the view supported the understanding of the rules and the analysis made by Reasoner. Using for this a scale where 1 indicates totally disagree and 5 indicates totally agree.
The experiment was proposed based on a set of real data collected from two consecutive months taken from measurements done by BGM and CGMS. A database was considered with the glucose measurements of a patient that agreed to partake on this study. The time periods used were before breakfast, before lunch and before dinner and for each one was collected at least 25 measurements.
This experiment included 10 participants: nine patients and one specialist. After filling out a form characterization, it was observed that the patients were between 21 and 62 years old, most of them with diabetes mellitus type 1 and 2 with diabetes mellitus type 2. Among patients with type 2 diabetes, one use BGM and the other a CGMS. Among patients with type 1 diabetes, three stated that they use a CGMS and the other four make use of a BGM as a glucose meter. For characterization reasons, two questions were answered by the 10 participants. It was possible to identify that, with respect to the participant’s ease of dealing with new technologies, of interviewees said they had a very high level of ease, another said they had high ease, regular and indicate that they have low ease in dealing with new technologies. In addition, about the ease of interpreting charts, of respondents said that they had a very high level of ease, said they had high ease, say they have a regular facility, and say they have low ease in interpreting charts. This diversity shows the GLUMIS system’s ability to assist people with different skill levels in dealing with new technologies or in interpreting charts.
The participants were submitted to online training where the preliminary doubts were answered. Then, during the conduction of the experiment, the evaluation environment and participants were observed by the authors of this study. In addition, the participants were encouraged to comment and describe their impressions.
During the analysis and decision-making process, the Prediction Tree View was displayed with the rules identified. In addition, the Profile View and the Historical Traceability View were presented, showing the new glycemic profiles identified and the historical traceability data between diet information and carbohydrates ingested, the amount of insulin applied and glucose measurements through time. Through interaction elements, the visualizations were analyzed by a specialist to find relevant rules for a change on the patient’s treatment.
The GLUMIS system was able to provide many interesting rules which helped to provide valuable information about each patient. One patient’s data for the glucose prediction test was used, data was collected from 28 days with measurements from before breakfast, before lunch, afternoon break and before dinner. The MMF algorithm was used to process this data and create a forest, which showed the relations of different time periods; the outputted forest was analyzed by a specialist to find interesting rules capable of aiding the patient’s treatment. Some interesting rules found with the data of one patient are shown bellow:
Rule 1: if glucose before breakfast , then glucose at afternoon break in
Rule 2: if glucose at afternoon break , then glucose before dinner
Rule 3: if glucose before dinner , then glucose before breakfast of the next day
The values of were chosen by the specialist taking into account the variation that can happen by two consecutive measurements done by the same BGM, the maximum value of 15 was set by the specialist as an acceptable variation. By observing the results showed by the visualization module, the rules showed above were found by using the following values of : For the first rule, it was found was with set as 5 for the time period before breakfast, 5 for afternoon break. For the second rule, it was found with 12 for afternoon break and 9 for before dinner. For the third rule, it was found with 9 for dinner and 5 for before breakfast.
These three rules were able to aid the specialist on changing this specific patient’s treatment by providing a more precise view of his condition. Below is the reasoner’s interpretation of the rules, why they are relevant in this context, and the association with diet data and insulin dosages.
Rule 1 indicates that, if there is a hypoglycemia before breakfast, then the glucose will be high in the afternoon break. By identifying the patient’s glycemic profile and eating habits, it was possible to identify that, due to the low glucose in the morning, the patient increased carbohydrate intake during lunch and decreased the dose of insulin. This resulted in an increase of glucose at after lunch.
Rule 2 indicates that if glucose is low after lunch, then glucose may show a normal value before dinner. This fact shows signs that the current dose of insulin and food intake, in this case, are already adjusted.
Rule 3 indicates that, if glucose is high before dinner, then glucose may be low by the morning of the next day. This fact showed that the proportion of insulin ingested at night was high resulting in hypoglycemia in the next morning.
These rules were able to provide valuable information to improve patient glucose control. In face of visualizations and Rule 1 analysis, the physician (specialist) asked the patient to correct hypoglycemia during breakfast and to maintain the proportion of insulin per carbohydrate eaten at lunch. By Rule 2, insulin doses should be maintained. Rule 3 prompted the physician to ask the patient to correct hyperglycemia at dinner and reduced the proportion of insulin applied per gram of ingested carbohydrate.
Results and Lessons Learned
Responses and considerations were collected and organized.
Table 1 shows the averages received by each visualization in the Q1 and Q2 survey questions.
The results obtained by the questionnaire responses indicate that, in relation to Q1, the participants considered the visualizations to be easy or very easy to understand. Regarding Q2, the results indicate that the participants agreed or fully agreed that the “Profile View” and “Prediction Tree View” supported the understanding of the rules and the analysis made by the Reasoner. Considering that the average of the ratings received is higher than 3 (three), thus above the area of indifference proposed by the scale, then, according to the participants, the “Historical Traceability View” also supports the understanding of the rules and the analysis made by Reasoner.
Through a textual report that each participant could write at the end of the experiment, it was possible to extract some extra information and learn some lessons.
Some participants considered that “Prediction Tree View” was very simple, even though it received a high rating on Q1. According to the reports, it could be improved to show, for example, the path that originated the rule in automated research.
The textual report of one of the participants suggested that the system offer to the user the possibility of defining an ideal or desirable profile. Thus, if the current profile approaches that established as desirable, then the physician and the patient are reported and the treatment may be altered or the frequency of medical follow-up.
According to reports from four participants, there may have been redundancy between the information shown by the “Prediction Tree View” and the “Historical Traceability View”. For this reason, these participants classified the “Historical Traceability View” that would be indifferent in understanding the rules.
In fact, both views present insulin-related data and ingested carbohydrates. Thus, the participant did not realize the need to also use the contextual data, of the other days, shown by the “Historical Traceability View”. However, the “Prediction Tree View” and the “Historical Traceability View” have very different objectives: the first provides a punctual notion of the values and the second traces a historical parallel from three types of data: insulin, carbohydrate and value glycemic index. A possible explanation for this phenomenon is related to the reduced universe of data used that did not allow the participant to visually differentiate the purpose of each visualization. Familiarity with new technologies and visual metaphors may be another explanation.
As threats to validity, we can mention the reduced number of participants containing 1 specialist. To minimize the effect of this threat, it was possible to observe that the participants can represent different profiles.
Throughout the experiment, the generated database had balanced amounts of measurements of glucose, insulin and carbohydrates ingested for the periods of time considered. The application of this study on databases concentrated in specific periods of the day can reach different results.