5.1. Experimental Setup and Data
To ensure diverse and comprehensive datasets for our experiments, we employed two distinct methods for creating patient data. The first method involves utilizing a Python-based data generator service specifically designed for patient information [
57]. This service ensures the generation of realistic and representative observations, simulating various scenarios that might occur in a clinical setting. In the second method, we leveraged the Mockaroo tool as a supplementary approach. Mockaroo provides a versatile platform for generating synthetic data [
58], allowing for the customization of attributes and characteristics to mimic real-world scenarios. Simulated observations were collected from three virtual patients for a specific time of the day, with each patient assigned 60 observations per hour (i.e., one for every minute). These two methods enrich and diversify the dataset, enhancing the robustness of our experimental evaluations.
In
Table 2, a snapshot of the above-mentioned data recorded for a patient at a specific moment is presented, representing one of the 60 observations at a given time. The table presents various characteristics, each providing valuable insights into the patient’s condition. The attributes, their descriptions, and corresponding values have been randomly selected from a comprehensive record, offering a glimpse into the diverse aspects monitored during the observation period.
The data in
Table 2 represents a comprehensive set of features that have been meticulously extracted and compiled through a prior preprocessing phase [
6]. It is important to note that several of these properties, such as hasTremor, have been computed by the smartwatch application at lower levels of abstraction. This means they are derived from more granular data or calculations, signifying a deeper level of analysis and processing involved in their determination. Additionally, it is important to emphasize that some of these properties are intricately linked to movement data, representing dynamic streams of information, while others are associated with PHR data, which typically encompasses static or longitudinal health-related information. This distinction underscores the multifaceted nature of the dataset, encompassing both dynamic and static facets of patient health, making it a valuable resource for a comprehensive understanding of the studied domain.
5.1.1. Wear4PDmove in Protégé OE Environment
To exploit and evaluate the engineered ontology, the competency questions were transformed into SPARQL queries and executed on the knowledge inferred from two inputs: (a) a PHR database (patient information) and (b) a CSV file of sensor data gathered from a smartwatch during experimentation with PD patients. Using the Pellet reasoner [
59] and the Snap SPARQL plugin [
60], the inferred knowledge was queried to obtain observations related to the ‘missing dose’ high-level event. The source files of the ontology, the example SPARQL (numbered as Qx in
Appendix C) queries, and the SWRL rules can be accessed at
https://github.com/KotisK/Wear4PDmove/v1.0.0/ (accessed on 1 January 2024).
In the context of the Wear4PDmove ontology, the following SWRL rules are defined to detect missing dose high-level events based on the low-level events of tremor or bradykinesia identified in patients and generate notifications to doctors about possible missing doses of medication. These events are represented as Boolean data types, indicating their presence or absence. The values of these events are generated through the analysis of raw sensor data, which is performed within an API. It is important to note that the development of this API is part of another ongoing research project.
Below, we describe the rules for recognizing and notifying doctors of missing dose events in PD patients.
Rule 1: This rule states that if a tremor (low-level event) has been observed for a PD patient after a dosing event, then a missing dose event (high-level event) is recognized, and a notification is sent to the doctor (through the sendNotification function). The rule automatically classifies instances of PD patient observations as “PD patient Missing Dose Event Observations”.
SWRL Rule (Tremor): Observation(?obs), ‘observed property’(?obs, ‘Tremor for PDpatient’), ‘observation after dosing’(?obs, true), ‘has tremor’(?obs, true) -> ‘send notification’(‘Missing Dose Notification’, true), ‘PD patient Missing Dose Event Observation’(?obs)
Rule 2: Similar to Rule 1, this rule is used for the recognition of missing dose (high-level event) based on the identified low-level event of bradykinesia in the upper limb of PD patients. If an observation is made for bradykinesia of the upper limb after a dosing event, and bradykinesia is detected (hasBradykinisiaOfUpperLimp), then a missing dose event is recognized, and a notification is sent to the doctor through the sendNotification function.
SWRL Rule (Bradykinisia): Observation(?obs), ‘has bradykinisia of upper limp’(?obs, true), ‘observed property’(?obs, ‘Bradykinisia Upper Limp for PDpatient’), ‘observation after dosing’(?obs, true) -> ‘send notification’(‘Missing Dose Notification’, true), ‘PD patient Missing Dose Event Observation’(?obs)
These rules are defined for the purpose of identifying possible missed medication doses in PD patients and notifying their doctors in a timely manner. However, the effectiveness of these rules in a clinical setting would need to be evaluated through testing and validation. The accuracy of the observations, the sensitivity and specificity of the rules, and the usability of the notifications would need to be assessed. This evaluation process would require collaboration between domain experts and data scientists and may involve collecting and analyzing real-world patient data.
5.1.2. Wear4PDmove and RDFlib/Python Implementation
In the ontology construction phase, we detailed the connections among the recommended classes and features that form the Wear4PDmove ontology. To achieve this, we utilized Protégé 5.5, a widely recognized ontology engineering tool, as already mentioned in the previous section. Protégé 5.5, with its intuitive and user-friendly interface, facilitated the modeling and editing of ontologies using OWL. The entities, classes, and relationships within our ontology were meticulously defined and structured during this phase, ensuring a comprehensive representation of PD domain knowledge. It is noteworthy that our project can be seamlessly run in a Google Colab environment using Python. This provides an additional layer of accessibility and convenience for users, allowing them to engage with the ontology and its interconnected components.
RDFlib is a Python library designed for the manipulation of RDF data. Offering functionalities for parsing, querying, and manipulating RDF graphs, RDFlib plays a crucial role in loading and processing RDF data, including ontologies, within Python applications. The visualization component offers various KG versions tailored to specific needs. It includes representations of classes (concepts) as nodes and attributes (properties) as edges. These versions are:
Classes: Concentrates only on the categories in the ontology, giving a basic view.
Classes with Filters: Enhanced with filters to highlight specific aspects within classes.
Dark Background Classes (with connections): Emphasizes classes against a dark background for clarity and visual appeal.
Classes and Data Properties: Combines classes and data properties for a holistic view of the ontology’s structure and properties.
This structured flow ensures a step-by-step approach, from importing the ontology to generating tailored visualizations catering to different requirements and preferences. The goal is to provide a flexible and insightful exploration of the Wear4PDmove ontology through diverse knowledge graph representations.
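To illustrate this flow, the following is a minimal sketch (not the project’s exact script) of how RDFlib, together with NetworkX and Matplotlib, can load the ontology and render the “Classes” view as a graph; the file name Wear4PDmove.owl is an assumption.

```python
# Minimal sketch: load the ontology with RDFlib and draw a "classes only" view.
import rdflib
from rdflib.namespace import OWL, RDF, RDFS
import networkx as nx
import matplotlib.pyplot as plt

def short(uri):
    # Keep only the local name of an IRI for readable node labels
    return str(uri).split("#")[-1].split("/")[-1]

g = rdflib.Graph()
g.parse("Wear4PDmove.owl")          # assumed ontology file name

kg = nx.DiGraph()
for cls in g.subjects(RDF.type, OWL.Class):
    if isinstance(cls, rdflib.URIRef):
        kg.add_node(short(cls))
for sub, sup in g.subject_objects(RDFS.subClassOf):
    if isinstance(sub, rdflib.URIRef) and isinstance(sup, rdflib.URIRef):
        kg.add_edge(short(sub), short(sup))   # subclass edges expose the hierarchy

nx.draw(kg, with_labels=True, node_color="lightblue", font_size=7)
plt.show()
```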
Owlready2 [
61], a Python library designed for ontology-oriented programming, empowers developers with a seamless integration of OWL ontologies into Python environments. This library provides a Pythonic approach to efficiently handle and manipulate OWL ontologies, bridging the gap between ontology development and Python programming.
In the context of ontology development, the specification of the “import” relationship holds significant importance. This relationship serves as a fundamental link connecting one ontology to another (
Figure 3). Through the “import” relationship, various ontologies can share and reuse classes, properties, and other essential elements. This practice is pivotal for achieving modular ontology design and facilitating the integration of ontologies. Establishing explicit import relationships enhances the modularity and interoperability of ontologies, fostering collaboration and knowledge sharing across diverse domains and applications.
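As an illustration of how these import relationships can be inspected programmatically, the short sketch below loads the ontology with Owlready2 and lists the ontologies pulled in via owl:imports; the file path is an assumption.

```python
# Hedged sketch: inspect the ontology and its owl:imports with Owlready2.
from owlready2 import get_ontology

onto = get_ontology("file:///path/to/Wear4PDmove.owl").load()

print([c.name for c in onto.classes()][:10])     # a few of the ontology's classes
print(onto.imported_ontologies)                  # ontologies linked via owl:imports
```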
The visualization of this interconnectedness, as depicted by the star schema in the lower right corner of
Figure 3, signifies the imported ontologies in our ontology. This portrayal reinforces the autonomy of the ontology by illustrating its independence from the broader elements of the ontological structure. Furthermore, the star schema serves as a representation of metadata, emphasizing the role of this ontology as a self-contained entity with clear import relationships, contributing to a more comprehensible and organized ontology landscape.
This phase involves comprehensively reading classes, properties, and individuals from the ontology. Subsequently, the focus shifts to extracting relational information related to instances, specifically targeting the values associated with properties such as hasBradykinisiaOfUpperLimp and hasTremor (
Table 3). This dual-step process ensures a thorough exploration of the KG, encompassing the identification and retrieval of essential entities and their interrelations. By systematically reading and analyzing classes, properties, and instances, the task aims to gain a holistic understanding of the KG’s structure, while the subsequent extraction of instance relations provides valuable insights into specific attributes crucial for further analysis and interpretation. The code snippet for extracting a list of values related to the studied observations, including hasBradykinisiaOfUpperLimp and hasTremor, can be found in
Appendix D.
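The project’s snippet is provided in Appendix D; purely for illustration, the sketch below shows one way such values could be collected with RDFlib, assuming a namespace IRI and a populated ontology file name that are not taken from the source.

```python
# Illustrative sketch (see Appendix D for the project's snippet): collect the
# hasTremor and hasBradykinisiaOfUpperLimp values of every Observation instance.
import rdflib
from rdflib import RDF, Namespace

W4PD = Namespace("http://example.org/wear4pdmove#")   # assumed namespace IRI

g = rdflib.Graph()
g.parse("Wear4PDmove_populated.owl")                  # assumed file name

values = []
for obs in g.subjects(RDF.type, W4PD.Observation):
    tremor = g.value(obs, W4PD.hasTremor)
    brady = g.value(obs, W4PD.hasBradykinisiaOfUpperLimp)
    values.append((
        str(obs),
        tremor.toPython() if tremor is not None else None,
        brady.toPython() if brady is not None else None,
    ))
print(values[:5])
```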
A mechanism implemented with the RDFlib library orchestrates the extraction of the knowledge graph from the ontology populated with instances. This process involves tracing connections with all the imported ontologies, ultimately culminating in the extraction of classes and attributes that collectively form the ontology. Subsequently, two distinct types of notifications are generated: a medium alert and a high alert. The notification criteria are established through a rule extraction mechanism. In cases where both observations of hasBradykinisia and hasTremor are concurrently true, the high alert is triggered. Conversely, if only one of the two observations is true, the medium alert notification is activated. When both observations are false, both notifications remain inactive. The output of this process yields a dataset comprising timestamped observations, the received observation values, and the corresponding notification outputs.
To achieve this, a SPARQL CONSTRUCT query is executed. Notably, this approach replaces SWRL (Semantic Web Rule Language) rules, since the owlready2 package does not support this feature. Following the SPARQL query, triples associated with the identified observations, namely hasBradykinisia and hasTremor, are constructed (
Figure 4).
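A hedged sketch of such a CONSTRUCT query executed with RDFlib is given below; it gathers the triples carrying the hasTremor and hasBradykinisiaOfUpperLimp values that the IF-THEN rules later consume. The namespace IRI and file name are assumptions.

```python
# Sketch of a SPARQL CONSTRUCT query (replacing SWRL) executed with RDFlib.
from rdflib import Graph

g = Graph()
g.parse("Wear4PDmove_populated.owl")          # assumed populated ontology file

query = """
PREFIX w4pd: <http://example.org/wear4pdmove#>
CONSTRUCT {
    ?obs w4pd:hasTremor ?tremor ;
         w4pd:hasBradykinisiaOfUpperLimp ?brady .
}
WHERE {
    ?obs a w4pd:Observation ;
         w4pd:hasTremor ?tremor ;
         w4pd:hasBradykinisiaOfUpperLimp ?brady .
}
"""
observation_triples = Graph()
for triple in g.query(query):                 # CONSTRUCT results iterate as triples
    observation_triples.add(triple)
```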
In
Figure 4, triples serve as the foundation for assessing alerts using IF-THEN rules. It is important to emphasize that these rules are formulated within the Python environment, leveraging its capabilities for dynamic and programmatically driven rule creation (
Figure 5). This approach ensures a flexible and adaptable system for assessing alerts and making informed decisions based on the identified observations.
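A minimal sketch of such IF-THEN rules, following the alert semantics described in this section (high when both symptoms are present, medium when exactly one is), could look as follows; the function name is hypothetical.

```python
# Hypothetical IF-THEN alert rules formulated directly in Python.
def assess_alert(has_tremor: bool, has_bradykinisia: bool) -> dict:
    both = has_tremor and has_bradykinisia
    return {
        "high_alert": both,                                  # both symptoms observed
        "medium_alert": (has_tremor or has_bradykinisia) and not both,
    }

print(assess_alert(True, False))   # {'high_alert': False, 'medium_alert': True}
```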
The use of the Python programming language enhances the overall interaction and editing capabilities of the ontology, adapting to modern technological advancements and the evolving landscape of ontology development. Python complements, rather than replaces, the functionality provided by tools like Protégé. This complementary approach of using both Python and Protégé for ontology engineering not only empowers the derivation of meaningful insights and conclusions from the ontology but also encompasses visualization differences. Other frameworks, such as Apache Jena, also exist and add further versatility to ontology management.
5.2. Neurosymbolic AI Approach
In this work, we aim to explore how a GAT network processes graph-structured data effectively. The GAT network is powerful because of its attention mechanisms, which enable it to capture intricate relationships hidden in the data, providing us with a nuanced perspective on the underlying patterns. GCN complements GAT’s capabilities by processing graph-structured data through its convolutional layers. Like GAT, GCN excels in handling nodes and edges representing various entities and relationships within the KG, such as symptoms, treatments, and PHR data. Together, GAT’s attention mechanisms and GCN’s convolutional approach enable a comprehensive analysis of the intricate relationships in the data.
To embed the KG into GNNs, nodes and edges from the KG should be represented in the graph structure used by the PHGNN models. Nodes represent various entities (like symptoms, treatments, and PHR data), while edges represent relationships. To accomplish this task, we use the Deep Graph Library (DGL) to load our data and use them for training a GNN model to perform binary classification over two different alert labels, namely ‘medium’ and ‘high’. These alerts are triggered by employing specific features (hasTremor and hasBradykinisia) related to tremor and bradykinesia, respectively. The ‘medium’ alert is activated when one of these features indicates a positive result. The ‘high’ alert is activated when both features signal a positive outcome. We also evaluate the model accuracy for each level. This algorithm represents the practical implementation of our approach, showcasing how the GNN is embedded in our system to handle and classify data effectively. In summary, our approach involves feature extraction from the ontology, data organization, and the application of GNNs to classify alert levels.
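The following sketch, with assumed file and column names, shows how such a graph could be assembled with DGL, including the alert labels derived from the rules above.

```python
# Hedged sketch: build a DGL graph from exported ontology data for alert classification.
import dgl
import pandas as pd
import torch

nodes = pd.read_csv("observations.csv")   # assumed columns: hasTremor, hasBradykinisia (0/1)
edges = pd.read_csv("edges.csv")          # assumed columns: source, target (node indices)

g = dgl.graph((torch.tensor(edges["source"].values),
               torch.tensor(edges["target"].values)),
              num_nodes=len(nodes))
g = dgl.add_self_loop(g)                  # avoid zero in-degree nodes for the conv layers
g.ndata["feat"] = torch.tensor(
    nodes[["hasTremor", "hasBradykinisia"]].values, dtype=torch.float32)

# Labels follow the rules in the text: 1 = 'high' (both symptoms), 0 = 'medium'.
both = (nodes["hasTremor"] == 1) & (nodes["hasBradykinisia"] == 1)
g.ndata["label"] = torch.tensor(both.astype(int).values, dtype=torch.long)
```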
More specifically, the selection and implementation of these two algorithms were made considering the following:
In our GAT and GCN implementations, we utilize convolutional layers tailored for processing graph-structured data. These convolutional layers are typically associated with classification tasks, which matches our objective of alert-level classification rather than regression.
For the GAT implementation, we employ two GAT convolutional layers, well-suited for classification tasks, and optimize the model using the Cross-Entropy Loss function, a standard choice for classification. This task involves determining ‘medium’ or ‘high’ alerts based on features related to tremor and bradykinesia. Similarly, in the GCN implementation, we incorporate two GCN convolutional layers chosen for their classification effectiveness. Although our data may seem continuous, our focus is on classification and determining alert levels. To facilitate this classification, we use the Mean Squared Error (MSE) loss, traditionally for regression but serving as a proxy for our classification needs.
Both GAT and GCN outputs are further processed to make the final classification decision. We acknowledge that GNNs, including GAT and GCN, excel in classification tasks, often employing methods like Cross-Entropy Loss. Our choice of loss functions may appear regression-like, but they effectively serve our classification objectives. Additionally, in our GCN implementation, we intentionally omit pooling layers. This decision is driven by the unique nature of our graph-structured data and the classification task at hand. Unlike traditional CNNs that rely on pooling to reduce spatial dimensions, our data’s relationships between nodes (patient attributes) are paramount. Pooling could potentially result in information loss. Our goal is to classify alert levels based on intricate relationships among attributes, and the graph convolutional layers aptly capture these relationships while preserving overall structure.
Ultimately, the presence or absence of pooling layers depends on the specific problem and dataset. In our context, omitting pooling layers aligns with our aim of accurately classifying alerts, considering the complex interplay of patient attributes within the graph structure.
The GAT performs node classification within a graph structure. This involves data preparation, constructing the graph, defining the GAT model, training, and evaluating the model’s performance in classifying alert levels.
The paper extends this approach with GCN for effective PD monitoring. Here, GCN processes graph-structured data, with nodes representing PD-specific elements such as symptoms. The convolutional layers of GCN analyze these nodes within their local neighborhoods, effectively capturing key spatial relationships.
In our GCN implementation, we initially utilize the MSE loss function, which is traditionally associated with regression tasks. However, it is important to note that we employ the MSE loss as a means to classify inputs into one of two classes: ‘medium’ alert notification and ‘high’ alert notification. Specifically, larger output values correspond to ‘high’ notifications, while smaller values indicate a ‘medium’ notification classification. This classification process is an integral part of our approach and is designed to determine the appropriate notification level for each input, ensuring that our system effectively responds to varying PD symptoms.
The selection of GAT and GCN for our PD monitoring system is rooted in their demonstrated proficiency in handling complex, graph-structured data typical in patient-related contexts. GAT, with its attention mechanisms, excels in emphasizing critical nodes, while GCN stands out for its effectiveness in learning from graph data. Their combined capabilities enable our system to accurately capture and analyze the intricate relationships within PD data, significantly improving the accuracy of notification classification and contributing to better patient outcomes. This choice is supported by their established success in medical research, particularly in managing neurodegenerative diseases like PD.
Based on all of the above,
Figure 6 presents a general architectural function of PHGNNs. It consists of three main sections: Data Processing, Neural Network Layers, and Training Components. The Data Processing block includes a Data Loader, Ontology (Wear4PDmoveOnto) and a Graph Constructor. The Neural Network Layers block highlights an Attention Mechanism Layer, a GATConv Layer, and a GCNConv Layer, followed by an Activation Function. The Training Components section outlines the process with Training Loop, Loss Calculation, and Backpropagation. The flow of the diagram suggests a sequential process from data handling to model training.
5.2.1. GNNs (GAT Approach)
In the quest to derive meaningful metrics from GNN approaches tailored for patients with PD, we adopt a systematic and structured methodology. The initial phase entails the installation of the DGL package and the incorporation of the PyTorch library, establishing a robust foundation for subsequent tasks.
Moving forward, the data preparation stage involves the extraction of features from the ontology, systematically saving them in a CSV file. To enrich the dataset, two additional fields are introduced, representing distinct alert levels—‘medium’ and ‘high’. These alerts are generated through a decision support system embedded within the service, leveraging ontology-driven rules.
A pivotal aspect of the process revolves around a comprehensive exploration of the GAT algorithm. Specifically, we delve into the intricacies of the GAT mechanism, aiming to grasp its ability to process graph-structured data. The GAT algorithm, with its attention mechanisms, excels in capturing intricate relationships within the data, providing a nuanced perspective on the underlying patterns.
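A minimal two-layer GAT for this node-level alert classification, assuming DGL’s GATConv and illustrative layer sizes (not necessarily the exact configuration used in the experiments), could be defined as follows.

```python
# Hedged sketch: two GAT convolutional layers for alert-level node classification.
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import GATConv

class GAT(nn.Module):
    def __init__(self, in_feats, hidden_feats, num_classes, num_heads=2):
        super().__init__()
        self.conv1 = GATConv(in_feats, hidden_feats, num_heads)
        self.conv2 = GATConv(hidden_feats * num_heads, num_classes, 1)

    def forward(self, graph, features):
        h = self.conv1(graph, features)   # shape: (N, num_heads, hidden_feats)
        h = F.elu(h.flatten(1))           # concatenate the attention heads
        h = self.conv2(graph, h)          # shape: (N, 1, num_classes)
        return h.squeeze(1)
```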
In this study, we conducted a series of experiments employing a GNN approach for monitoring PD patients. The primary objective was to evaluate the system’s performance in predicting medication adherence and recognizing potential lapses in pill dosing.
The training process was executed over multiple epochs, ranging from 100 to 500, each accompanied by distinct values for Medium Alert Loss, High Alert Loss, and Accuracy (
Figure 7). The Medium Alert Loss represents the loss incurred during the prediction of medium-level alerts, while the High Alert Loss pertains to the loss associated with high-level alerts. Accuracy, on the other hand, reflects the overall correctness of the model’s predictions. Over time, the system gradually improved its predictive accuracy, showing a notable decrease in medium and high alert loss, indicating better precision in alert predictions and an enhanced ability to identify different alert levels accurately.
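Combining the sketches above, an illustrative training loop over the epoch range reported here (100 to 500), with Cross-Entropy loss and assumed optimizer settings, could look as follows.

```python
# Illustrative training loop; hyperparameters other than the epoch range and
# the Cross-Entropy loss are assumptions.
import torch

model = GAT(in_feats=2, hidden_feats=8, num_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(1, 501):
    logits = model(g, g.ndata["feat"])
    loss = loss_fn(logits, g.ndata["label"])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        accuracy = (logits.argmax(dim=1) == g.ndata["label"]).float().mean().item()
        print(f"epoch {epoch}: loss={loss.item():.4f}, accuracy={accuracy:.4f}")
```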
The ascending trend observed in the Accuracy metric further reinforces the system’s competence. The consistent increase from 77.78% to 84.55% over the epochs implies an augmentation in the overall correctness of the predictions. This suggests that the GNN model effectively learned and adapted to the intricate relationships within the patient data, showcasing a promising capability to identify instances of missed medication doses.
The provided results in
Figure 8 detail the metric loss for Medium Alert based on exported outcomes from the GAT algorithm. The results vary across different configurations of hidden features (4, 8, and 16) and training epochs (100 to 500). The loss values, representing the dissimilarity between predicted and actual values during training, generally decrease as the number of epochs increases. Notably, the configuration with 8 hidden features and 500 epochs achieves the lowest loss, indicating optimal performance for the Medium Alert classification. This trend suggests that the model progressively improves its predictive capabilities over successive training iterations. Overall, these results reflect the effectiveness of the GAT algorithm in learning representations for Medium Alert classification, with specific configurations demonstrating superior performance.
In
Table 4, the GNN performance metrics showcase the accuracy, precision, recall, and F1 score across different epochs and hidden layers for the medium alert level. For the model with four hidden layers, performance steadily improved from an initial accuracy of 0.8223 to 0.8455 at 500 epochs. The 8 hidden layer model started at 0.8455 and reached 0.8711 at 500 epochs, demonstrating consistent enhancement. In the case of 16 hidden layers, the model began with an accuracy of 0.7778 and progressed to 0.8908 at 500 epochs. However, in relation to
Figure 8, it is observed that the higher accuracy of the 16 hidden layer configuration (although greater than that of 8) is misleading and indicates overfitting, as the loss curves show that optimization converges faster with 8 hidden layers. Trade-offs between precision and recall were observed, emphasizing the importance of tailoring models to specific application requirements. The findings suggest potential avenues for further optimization and underscore the need for continuous evaluation in healthcare contexts.
The presented results in
Figure 9 detail the metric loss for High Alert based on exported outcomes from the GAT algorithm. The configurations encompass varying numbers of hidden features (4, 8, and 16) and training epochs (100 to 500). Across these configurations, the loss consistently diminishes with increasing epochs, underscoring the model’s improving predictive prowess. Particularly noteworthy is the configuration employing 8 hidden features and 500 epochs, which achieves the lowest loss, affirming its efficacy in High Alert classification. This pattern underscores the GAT algorithm’s adeptness in refining its predictive capabilities over successive training iterations for High Alert classifications.
In
Table 5, examining the application of the GAT algorithm with hidden layers set at 4, 8, and 16 for predicting high-level alerts reveals notable trends in performance metrics. The outcomes illustrate a consistent improvement in accuracy, precision, recall, and F1-score as the number of hidden features increases. Particularly, employing eight hidden layers showcases the model’s efficacy in discerning intricate relationships within the data for robust high-alert classification. Across epochs, accuracy steadily ascends from 77.78% to 89.55%, underscoring the algorithm’s proficiency in high alert prediction. Precision, recall, and F1-score similarly exhibit positive trajectories, reaching their zenith at 16 hidden layers. This underscores the effectiveness of a more profound architecture in capturing and leveraging complex data patterns for accurate and reliable high-alert predictions.
In conclusion, while the results indicate promising performance with the GNN architecture, rigorous validation, exploration of model interpretability, and consideration of potential overfitting are imperative for ensuring the reliability and applicability of the model in diverse and real-world scenarios. Further investigations, including hyperparameter tuning and exploration of alternative architectures, are warranted to optimize the model’s performance. Understanding the interpretability of the model and uncovering learned representations within the hidden layers contribute to a more comprehensive understanding of its decision-making process.
5.2.2. GNNs (GCN Approach)
This section implements a GCN for node classification using PyTorch Geometric. The primary goal is to analyze a graph structure based on input data from a CSV file. The dataset is loaded into a Pandas DataFrame, where the ‘source’ and ‘target’ columns are used to create edges for the graph. Nodes are represented by random features, forming the initial node features tensor (denoted as x). The PyTorch Geometric library is leveraged to create a data object that encapsulates the graph structure.
The GCN model is defined as a PyTorch neural network with two GCN layers (conv1 and conv2). These layers are applied successively to the input graph data. The ReLU activation function is used between the layers to introduce non-linearity. The overall goal of the GCN is to learn node representations that capture the graph structure and can be used for downstream tasks. The script includes a training loop where the GCN model is trained using an MSE loss function. The model parameters are updated through backpropagation using the Adam optimizer. The training process is repeated for a specified number of epochs, and the training loss is printed for each epoch.
The mechanism incorporates a list of hidden_channels_values, allowing the user to experiment with different numbers of hidden channels in the GCN layers. For each value in this list, a new GCN model is instantiated and trained independently. This provides flexibility in exploring how the number of hidden channels affects the model’s performance.
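The sketch below, with assumed file names and feature dimensions, mirrors this description: a two-layer GCN built with PyTorch Geometric, trained with MSE loss and the Adam optimizer for each value in hidden_channels_values.

```python
# Hedged sketch of the GCN setup described above (PyTorch Geometric).
import pandas as pd
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

df = pd.read_csv("graph_edges.csv")                        # assumed 'source'/'target' columns
edge_index = torch.tensor(df[["source", "target"]].values.T, dtype=torch.long)
num_nodes = int(edge_index.max()) + 1
x = torch.randn(num_nodes, 8)                              # random initial node features
y = torch.rand(num_nodes, 1)                               # illustrative node targets
data = Data(x=x, edge_index=edge_index, y=y)

class GCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, out_channels)

    def forward(self, data):
        h = F.relu(self.conv1(data.x, data.edge_index))    # ReLU between the two layers
        return self.conv2(h, data.edge_index)

for hidden_channels in [16, 32]:                           # hidden_channels_values
    model = GCN(in_channels=8, hidden_channels=hidden_channels, out_channels=1)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    for epoch in range(200):
        optimizer.zero_grad()
        loss = F.mse_loss(model(data), data.y)
        loss.backward()
        optimizer.step()
    print(f"hidden_channels={hidden_channels}, final MSE loss={loss.item():.4f}")
```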
The presented data showcases the training progress of the GNN for medium alert notifications across different epochs and hidden channel configurations (16 and 32). For the GNN with 16 hidden channels (
Figure 10), the loss steadily decreases over epochs, indicating effective learning and convergence. The decreasing loss implies that the model is refining its predictions, aligning with the ground truth over training iterations. As the GNN progresses through epochs, the decreasing loss trend for the model with 32 hidden channels is also evident. Moreover, it achieves a lower loss compared to the 16 hidden channel configuration, suggesting that a more complex model with additional hidden channels contributes to improved learning and convergence.
Interpreting the results, the decreasing loss signifies the model’s ability to capture underlying patterns in the data, enhancing its predictive accuracy for medium alert notifications. The choice of the number of hidden channels appears crucial, with 32 hidden channels outperforming the 16 hidden channel configuration.
The metrics in the mechanism quantify the GCN model’s performance in predicting node features based on graph topology. MSE measures the squared difference between predicted and actual features, emphasizing larger errors; a lower MSE indicates predictions closer to the actual values. MAE gauges the average absolute difference, indicating error magnitude without direction; a lower MAE suggests better accuracy. R2 quantifies the proportion of variance in node features explained by the model; a higher R2 implies better capture of underlying patterns. Evaluating results involves considering these metrics together: lower MSE and MAE, coupled with a higher R2, indicate effective predictive performance, while higher MSE and MAE, along with a lower R2, suggest potential shortcomings in capturing the graph structure.
Collectively, these metrics offer a comprehensive evaluation of the GCN model’s accuracy and predictive power in the context of the provided graph-structured data. The interpretation of these metrics is task-dependent, and the specific context of the data being modeled should be considered for a nuanced understanding of model performance.
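For illustration, the reported metrics can be computed with scikit-learn from the predictions of the sketch above; this is an assumed evaluation snippet, not the project’s code.

```python
# Illustrative evaluation of the trained GCN with MSE, MAE, and R2.
import torch
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

model.eval()
with torch.no_grad():
    predictions = model(data).numpy()
targets = data.y.numpy()

print("MSE:", mean_squared_error(targets, predictions))
print("MAE:", mean_absolute_error(targets, predictions))
print("R2 :", r2_score(targets, predictions))
```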
In
Table 6, performance metrics for the GCN algorithm with different hidden channel configurations (n = 16 and n = 32) are presented, specifically for medium alert notifications. Three key metrics are reported: MSE, MAE, and R2.
For Mean Squared Error, the model with 16 hidden channels achieves an MSE of 0.2044, while the model with 32 hidden channels has a slightly higher MSE of 0.2045. This indicates a relatively comparable level of error between the two configurations, with no significant advantage observed for either. Similarly, for Mean Absolute Error, the model with 16 hidden channels reports an MAE of 0.4139, and the model with 32 hidden channels has a slightly higher MAE of 0.4141. Again, the difference is minimal, suggesting similar predictive accuracy in terms of absolute error for both configurations. The R2 values provide insights into how well the model’s predictions align with the actual data. In this case, the R2 values for both configurations are low, indicating that the models explain only a small portion of the variance in the data. The model with 16 hidden channels has an R2 of 0.0514, while the model with 32 hidden channels has a slightly lower R2 of 0.0508.
Interpreting the results, the choice of hidden channels (16 or 32) does not significantly impact the Mean Squared Error and Mean Absolute Error, as both configurations exhibit similar levels of predictive accuracy. However, the low R2 values suggest that the models may not fully capture the variance in the medium alert notifications, and additional refinement or feature engineering may be necessary. To enhance the model’s performance, various options can be explored. Hyperparameter tuning, including adjusting the learning rate or exploring different activation functions, could be considered. Experimenting with alternative GNN architectures or increasing the complexity of the model may also contribute to improved performance. Additionally, incorporating additional relevant features or leveraging domain-specific knowledge could enhance the model’s ability to capture the intricacies of medium alert notifications.
In conclusion, the performance metrics indicate comparable predictive accuracy between the GCN models with 16 and 32 hidden channels for medium alert notifications. However, the low R2 values emphasize the need for further refinement and exploration to better capture the underlying patterns in the data.