Intelligent Data Analysis for Medical Diagnosis

A special issue of Diagnostics (ISSN 2075-4418). This special issue belongs to the section "Machine Learning and Artificial Intelligence in Diagnostics".

Deadline for manuscript submissions: closed (31 August 2022) | Viewed by 54137

Special Issue Editors


Guest Editor
School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, UK
Interests: big data and analytics; brain–computer interface; deep learning; transfer learning; non-stationary learning and domain adaptation; artificial intelligence (AI) and eXplainable AI (XAI); EEG and MEG signal processing; AI in decision making for healthcare

Guest Editor
Institute of Health Informatics, University College London, London NW1 2DA, UK
Interests: using text technologies and knowledge graph techniques to analyse electronic health records

Special Issue Information

Dear Colleagues,

A medical diagnosis provides an explanation of a patient’s health problem based on symptoms and signs and informs subsequent healthcare decisions. The diagnostic process involves complex and collaborative activities relating to information gathering and clinical reasoning, and therefore, making an accurate diagnosis is a fundamental challenge for global healthcare systems. A study led by Singh revealed that at least 1 in 20 adults in the USA were misdiagnosed every year, equating to 12 million people per year, with half of these misdiagnoses potentially being harmful. On the other hand, advances in information technologies, particularly the ubiquitous nature of mobile technologies adopted in biomedical and health sciences, have generated mountains of data related to health and wellbeing from a wide range of sources, including electronic health records in primary care and secondary care, genome-wide studies, demographics, doctors’ notes, clinical images, laboratory results, genetic tests, wearable sensors, etc. One way to help healthcare professionals improve their diagnostic accuracy is by supporting them to analyse data more efficiently.

The purpose of this Special Issue is to investigate how intelligent data analysis techniques, such as machine learning/deep learning, data mining, and big data analytics, can deliver on their promise of analysing data more efficiently to extract useful information and improve clinical decision making and medical diagnosis.

Prof. Dr. Shang-Ming Zhou
Dr. Haider Raza
Dr. Honghan Wu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Diagnostics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • medical diagnosis
  • healthcare informatics
  • electronic health records
  • machine learning
  • deep learning
  • big data analytics
  • predictive modelling in healthcare
  • omics data
  • imaging data
  • sensor data
  • clinical notes
  • natural language processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (13 papers)


Research


16 pages, 3616 KiB  
Article
Blood Glucose Prediction Method Based on Particle Swarm Optimization and Model Fusion
by He Xu, Shanjun Bao, Xiaoyu Zhang, Shangdong Liu, Wei Jing and Yimu Ji
Diagnostics 2022, 12(12), 3062; https://doi.org/10.3390/diagnostics12123062 - 6 Dec 2022
Cited by 2 | Viewed by 1720
Abstract
Blood glucose stability in diabetic patients determines the degree of health, and changes in blood glucose levels are related to the outcome of diabetic patients. Therefore, accurate monitoring of blood glucose has a crucial role in controlling diabetes. Aiming at the problem of high volatility of blood glucose concentration in diabetic patients and the limitations of a single regression prediction model, this paper proposes a method for predicting blood glucose values based on particle swarm optimization and model fusion. First, the Kalman filtering algorithm is used to smooth and reduce the noise of the sensor current signal to reduce the effect of noise on the data. Then, the hyperparameter optimization of Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) models is performed using particle swarm optimization algorithm. Finally, the XGBoost and LightGBM models are used as the base learner and the Bayesian regression model as the meta-learner, and the stacking model fusion method is used to achieve the prediction of blood glucose values. In order to prove the effectiveness and superiority of the method in this paper, we compared the prediction results of stacking fusion model with other 6 models. The experimental results show that the stacking fusion model proposed in this paper can accurately predict blood glucose values, and the average absolute percentage error of blood glucose prediction is 13.01%, and the prediction error of the stacking fusion model is much lower than that of the other six models. Therefore, the proposed diabetes blood glucose prediction method in this paper has superiority. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)
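
The "average absolute percentage error" quoted in the abstract above is presumably the mean absolute percentage error (MAPE). The following is a minimal sketch of that metric, not the authors' code; the function name and the glucose readings are hypothetical and chosen only for illustration.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return the MAPE in percent; reference values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when a reference value is zero")
    # Average the relative errors, then express the result as a percentage.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical glucose readings in mmol/L, not data from the paper.
reference = [5.0, 8.0, 10.0, 4.0]
forecast = [5.5, 7.2, 11.0, 4.4]
mape = mean_absolute_percentage_error(reference, forecast)
print(f"MAPE: {mape:.2f}%")
```

Because each error is divided by the reference value, the same absolute error weighs more heavily at low glucose levels than at high ones.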

25 pages, 1808 KiB  
Article
Pneumonia and Pulmonary Thromboembolism Classification Using Electronic Health Records
by Sinhue Siordia-Millán, Sulema Torres-Ramos, Ricardo A. Salido-Ruiz, Daniel Hernández-Gordillo, Tracy Pérez-Gutiérrez and Israel Román-Godínez
Diagnostics 2022, 12(10), 2536; https://doi.org/10.3390/diagnostics12102536 - 19 Oct 2022
Cited by 1 | Viewed by 2440
Abstract
Pneumonia and pulmonary thromboembolism (PTE) are both respiratory diseases; their diagnosis is difficult due to their similarity in symptoms, medical subjectivity, and the large amount of information from different sources necessary for a correct diagnosis. Analysis of such clinical data using computational tools could help medical staff reduce time, increase diagnostic certainty, and improve patient care during hospitalization. In addition, no studies have been found that analyze all clinical information on the Mexican population in the Spanish language. Therefore, this work performs automatic diagnosis of pneumonia and pulmonary thromboembolism using machine-learning tools along with clinical laboratory information (structured data) and clinical text (unstructured data) obtained from electronic health records. A cohort of 173 clinical records was obtained from the Mexican Social Security Institute. The data were preprocessed, transformed, and adjusted to be analyzed using several machine-learning algorithms. For structured data, naïve Bayes, support vector machine, decision trees, AdaBoost, random forest, and multilayer perceptron were used; for unstructured data, a BiLSTM was used. K-fold cross-validation and leave-one-out were used for evaluation of structured data, and hold-out was used for unstructured data; additionally, 1-vs.-1 and 1-vs.-rest approaches were used. Structured data results show that the highest AUC-ROC was achieved by the naïve Bayes algorithm classifying PTE vs. pneumonia (87.0%), PTE vs. control (75.1%), and pneumonia vs. control (85.2%) with the 1-vs.-1 approach; for the 1-vs.-rest approach, the best performance was reported in pneumonia vs. rest (86.3%) and PTE vs. rest (79.7%) using naïve Bayes, and control vs. diseases (79.8%) using decision trees. Regarding unstructured data, the results do not present a good AUC-ROC; however, the best F1-scores were obtained for control vs. disease (72.7%) in the 1-vs.-rest approach and control vs. pneumonia (63.6%) in the 1-vs.-1 approach. Additionally, several decision trees were obtained to identify important attributes for automatic diagnosis for structured data, particularly for PTE vs. pneumonia. Based on the experiments, the structured datasets present the highest values. Results suggest using naïve Bayes and structured data to automatically diagnose PTE vs. pneumonia. Moreover, using decision trees allows the observation of some decision criteria that the medical staff could consider for diagnosis. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)

20 pages, 4456 KiB  
Article
Integrating Health Data-Driven Machine Learning Algorithms to Evaluate Risk Factors of Early Stage Hypertension at Different Levels of HDL and LDL Cholesterol
by Pen-Chih Liao, Ming-Shu Chen, Mao-Jhen Jhou, Tsan-Chi Chen, Chih-Te Yang and Chi-Jie Lu
Diagnostics 2022, 12(8), 1965; https://doi.org/10.3390/diagnostics12081965 - 14 Aug 2022
Cited by 13 | Viewed by 2940
Abstract
Purpose: Cardiovascular disease (CVD) is a major worldwide health burden. Hypertension and hyperlipidemia are the most frequently cited risk factors for CVD. Early stage hypertension in the population with dyslipidemia is an important public health hazard. This study applied data-driven machine learning (ML), which can capture complex relationships between risk factors and outcomes and offers promising predictive performance with vast amounts of medical data, to investigate the association between dyslipidemia and the incidence of early stage hypertension in a large cohort with normal blood pressure at baseline. Methods: This study analyzed annual health screening data for 71,108 people from 2005 to 2017, including data for 27 risk-related indicators, sourced from the MJ Group, a major health screening center in Taiwan. We used five ML methods, namely stochastic gradient boosting (SGB), multivariate adaptive regression splines (MARS), least absolute shrinkage and selection operator regression (Lasso), ridge regression (Ridge), and gradient boosting with categorical features support (CatBoost), to develop a multi-stage ML algorithm-based prediction scheme and then evaluate important risk factors at the early stage of hypertension, especially for groups with high-density lipoprotein cholesterol (HDL-C) and low-density lipoprotein cholesterol (LDL-C) levels within or out of the reference range. Results: Age, body mass index, waist circumference, waist-to-hip ratio, fasting plasma glucose, and C-reactive protein (CRP) were associated with hypertension. The hemoglobin level was also a positive contributor to blood pressure elevation and it appeared among the top three important risk factors in all LDL-C/HDL-C groups; therefore, these variables may be important in affecting blood pressure in the early stage of hypertension. A residual contribution to blood pressure elevation was found in groups with increased LDL-C. This suggests that LDL-C levels are associated with CRP levels, and that the LDL-C level may be an important factor for predicting the development of hypertension. Conclusion: The five prediction models provided similar classifications of risk factors. The results of this study show that an increase in LDL-C is more important than the start of a drop in HDL-C in health screening of sub-healthy adults. The findings of this study should be of value in raising health awareness about hypertension and in informing further discussion and follow-up research. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)

16 pages, 1642 KiB  
Article
A Multi-Branch Convolutional Neural Network with Squeeze-and-Excitation Attention Blocks for EEG-Based Motor Imagery Signals Classification
by Ghadir Ali Altuwaijri, Ghulam Muhammad, Hamdi Altaheri and Mansour Alsulaiman
Diagnostics 2022, 12(4), 995; https://doi.org/10.3390/diagnostics12040995 - 15 Apr 2022
Cited by 58 | Viewed by 5076
Abstract
Electroencephalography-based motor imagery (EEG-MI) classification is a critical component of the brain-computer interface (BCI), which enables people with physical limitations to communicate with the outside world via assistive technology. Regrettably, EEG decoding is challenging because of the complexity, dynamic nature, and low signal-to-noise ratio of the EEG signal. Developing an end-to-end architecture capable of correctly extracting EEG data’s high-level features remains a difficulty. This study introduces a new model for decoding MI known as a Multi-Branch EEGNet with squeeze-and-excitation blocks (MBEEGSE). By clearly specifying channel interdependencies, a multi-branch CNN model with attention blocks is employed to adaptively change channel-wise feature responses. When compared to existing state-of-the-art EEG motor imagery classification models, the suggested model achieves good accuracy (82.87%) with reduced parameters in the BCI-IV2a motor imagery dataset and (96.15%) in the high gamma dataset. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)
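
The channel-wise reweighting that the abstract above attributes to squeeze-and-excitation blocks can be sketched in a few lines. This is a generic squeeze-and-excitation step in plain Python under stated assumptions, not the MBEEGSE architecture: the function name, the omission of biases, and all weights and feature values are hypothetical.

```python
import math


def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Apply one squeeze-and-excitation step.

    feature_maps: list of C channels, each a list of samples.
    w_reduce: R x C weight matrix; w_expand: C x R weight matrix.
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every channel by its gate, a value in (0, 1).
    scaled = [[g * x for x in ch] for g, ch in zip(gates, feature_maps)]
    return scaled, gates


# Hypothetical input: 4 feature channels with 3 time samples each.
features = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [4.0, 0.0, 2.0], [1.0, 1.0, 1.0]]
w_reduce = [[0.2, -0.1, 0.3, 0.0], [-0.3, 0.4, 0.1, 0.2]]    # 2 x 4
w_expand = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0], [0.0, 2.0]]  # 4 x 2
scaled, gates = squeeze_excite(features, w_reduce, w_expand)
print([round(g, 3) for g in gates])
```

In a trained network the two weight matrices are learned, so channels that carry informative features receive gates close to 1 and less useful channels are attenuated.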

12 pages, 2220 KiB  
Article
Patient Perception When Transitioning from Classic to Remote Assisted Cardiac Rehabilitation
by Ștefan-Sebastian Busnatu, Maria-Alexandra Pană, Andreea Elena Lăcraru, Cosmina-Elena Jercălău, Nicolae Paun, Massimo Caprino, Kai Gand, Hannes Schlieter, Sofoklis Kyriazakos, Cătălina Liliana Andrei and Crina-Julieta Sinescu
Diagnostics 2022, 12(4), 926; https://doi.org/10.3390/diagnostics12040926 - 7 Apr 2022
Cited by 5 | Viewed by 2633
Abstract
Cardiac rehabilitation is an individualized outpatient program of physical exercises and medical education designed to accelerate recovery and improve health status in heart disease patients. In this study, we aimed for assessment of patients’ perception of the involvement of technology and remote monitoring devices in cardiac recovery. During the Living Lab Phase of the Virtual Coaching Activities for Rehabilitation in Elderly (vCare) project, we evaluated eleven patients (five heart failure patients and six ischemic heart disease patients). Patient admission in the UMFCD cardiology clinical department served as a shared inclusion criterion for both study groups. In addition, the presence of II or III heart failure NYHA stage status was considered an inclusion criterion for the heart failure study group and patients diagnosed with ischemic heart disease for the second one. We conducted a system usability survey to assess the patients’ perception of the system’s technical and medical functions. The survey had excellent preliminary results in the heart failure study group and good results in the ischemic heart disease group. The limited access of patients to cardiac rehabilitation in Romania has led to increased interest and motivation in this study. The final version of the product is designed to adapt to patient needs and necessities; therefore, patient perception is necessary. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)

9 pages, 1937 KiB  
Article
Application of a Deep Learning System in Pterygium Grading and Further Prediction of Recurrence with Slit Lamp Photographs
by Kuo-Hsuan Hung, Chihung Lin, Jinsheng Roan, Chang-Fu Kuo, Ching-Hsi Hsiao, Hsin-Yuan Tan, Hung-Chi Chen, David Hui-Kang Ma, Lung-Kun Yeh and Oscar Kuang-Sheng Lee
Diagnostics 2022, 12(4), 888; https://doi.org/10.3390/diagnostics12040888 - 2 Apr 2022
Cited by 13 | Viewed by 4500
Abstract
Background: The aim of this study was to evaluate the efficacy of a deep learning system in pterygium grading and recurrence prediction. Methods: This was a single center, retrospective study. Slit-lamp photographs, from patients with or without pterygium, were collected to develop an algorithm. Demographic data, including age, gender, laterality, grading, and pterygium area, recurrence, and surgical methods were recorded. Complex ocular surface diseases and pseudopterygium were excluded. Performance of the algorithm was evaluated by sensitivity, specificity, F1 score, accuracy, and area under the receiver operating characteristic curve. Confusion matrices and heatmaps were created to help explain the results. Results: A total of 237 eyes were enrolled, of which 176 eyes had pterygium and 61 were non-pterygium eyes. The training set and testing set were comprised of 189 and 48 photographs, respectively. In pterygium grading, sensitivity, specificity, F1 score, and accuracy were 80% to 91.67%, 91.67% to 100%, 81.82% to 94.34%, and 86.67% to 91.67%, respectively. In the prediction model, our results showed sensitivity, specificity, positive predictive value, and negative predictive values were 66.67%, 81.82%, 33.33%, and 94.74%, respectively. Conclusions: Deep learning systems can be useful in pterygium grading based on slit lamp photographs. When clinical parameters involved in the prediction of pterygium recurrence were included, the algorithm showed higher specificity and negative predictive value in prediction. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)
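
The four recurrence-prediction figures in the abstract above all derive from a single confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical and were chosen only because they happen to reproduce the reported percentages. The study's actual confusion matrix is not given on this page.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, consistent with the reported percentages but
# not taken from the paper.
counts = dict(tp=2, fp=4, fn=1, tn=18)
for name, value in diagnostic_metrics(**counts).items():
    print(f"{name}: {100 * value:.2f}%")
```

With so few positive cases, a handful of false positives is enough to pull the positive predictive value down to one third even though specificity stays above 80%, which matches the pattern in the reported results.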

19 pages, 10478 KiB  
Article
An Embedded System Using Convolutional Neural Network Model for Online and Real-Time ECG Signal Classification and Prediction
by Wahyu Caesarendra, Taufiq Aiman Hishamuddin, Daphne Teck Ching Lai, Asmah Husaini, Lisa Nurhasanah, Adam Glowacz and Gusti Ahmad Fanshuri Alfarisy
Diagnostics 2022, 12(4), 795; https://doi.org/10.3390/diagnostics12040795 - 24 Mar 2022
Cited by 12 | Viewed by 4963
Abstract
This paper presents an automatic ECG signal classification system that applies a Deep Learning (DL) model to classify four types of ECG signals. In the first part of our work, we present the model development. Four different classes of ECG signals from the PhysioNet open-source database were selected and used. This preliminary study used a DL technique, namely a Convolutional Neural Network (CNN), to classify and predict the ECG signals from four different classes: normal, sudden death, arrhythmia, and supraventricular arrhythmia. The classification and prediction process includes pulse extraction, image reshaping, training dataset, and testing process. In general, the training accuracy reached up to 95% after 100 epochs. However, prediction accuracy differs across ECG signal types. Among the four classes, the results show that the predictions for sudden death ECG waveforms are the highest, i.e., 80 out of 80 samples are correct (100% accuracy). In contrast, the lowest is the prediction for normal sinus ECG waveforms, i.e., 74 out of 80 samples are correct (92.5% accuracy). This is due to the image features of normal sinus ECG waveforms being almost similar to the image features of supraventricular arrhythmia ECG waveforms. However, the model has been tuned to achieve an optimal prediction. In the second part, we present the hardware implementation with the predictive model embedded in an NVIDIA Jetson Nano processor for the online and real-time classification of ECG waveforms. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)

17 pages, 10890 KiB  
Article
Machine Learning in Prediction of Bladder Cancer on Clinical Laboratory Data
by I-Jung Tsai, Wen-Chi Shen, Chia-Ling Lee, Horng-Dar Wang and Ching-Yu Lin
Diagnostics 2022, 12(1), 203; https://doi.org/10.3390/diagnostics12010203 - 14 Jan 2022
Cited by 22 | Viewed by 5328
Abstract
Bladder cancer has been increasing globally. Urinary cytology is considered a major screening method for bladder cancer, but it has poor sensitivity. This study aimed to utilize clinical laboratory data and machine learning methods to build predictive models of bladder cancer. A total of 1336 patients with cystitis, bladder cancer, kidney cancer, uterus cancer, and prostate cancer were enrolled in this study. Two-step feature selection combined with WEKA and forward selection was performed. Furthermore, five machine learning models, including decision tree, random forest, support vector machine, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) were applied. Features, including calcium, alkaline phosphatase (ALP), albumin, urine ketone, urine occult blood, creatinine, alanine aminotransferase (ALT), and diabetes were selected. The LightGBM model obtained an accuracy of 84.8% to 86.9%, a sensitivity of 84% to 87.8%, a specificity of 82.9% to 86.7%, and an area under the curve (AUC) of 0.88 to 0.92 in discriminating bladder cancer from cystitis and other cancers. Our study provides a demonstration of utilizing clinical laboratory data to predict bladder cancer. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)

15 pages, 6388 KiB  
Article
Polyp Detection from Colorectum Images by Using Attentive YOLOv5
by Jingjing Wan, Bolun Chen and Yongtao Yu
Diagnostics 2021, 11(12), 2264; https://doi.org/10.3390/diagnostics11122264 - 3 Dec 2021
Cited by 49 | Viewed by 9025
Abstract
Background: High-quality colonoscopy is essential to prevent the occurrence of colorectal cancers. The data of colonoscopy are mainly stored in the form of images. Therefore, artificial intelligence-assisted colonoscopy based on medical images is not only a research hotspot, but also one of the effective auxiliary means to improve the detection rate of adenomas. This research has become the focus of medical institutions and scientific research departments and has important clinical and scientific research value. Methods: In this paper, we propose a YOLOv5 model based on a self-attention mechanism for polyp target detection. This method treats detection as a regression problem, using the entire image as the input of the network and directly regressing the bounding boxes of targets at multiple positions in the image. In the feature extraction process, an attention mechanism is added to enhance the contribution of information-rich feature channels and weaken the interference of useless channels. Results: The experimental results show that the method can accurately identify polyp images, especially for the small polyps and the polyps with inconspicuous contrasts, and the detection speed is greatly improved compared with the comparison algorithm. Conclusions: This study will be of great help in reducing the missed diagnosis of clinicians during endoscopy and treatment, and it is also of great significance to the development of clinicians’ clinical work. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)

16 pages, 4040 KiB  
Article
Improving Skin Cancer Classification Using Heavy-Tailed Student T-Distribution in Generative Adversarial Networks (TED-GAN)
by Bilal Ahmad, Sun Jun, Vasile Palade, Qi You, Li Mao and Mao Zhongjie
Diagnostics 2021, 11(11), 2147; https://doi.org/10.3390/diagnostics11112147 - 19 Nov 2021
Cited by 29 | Viewed by 4313
Abstract
Deep learning has gained immense attention from researchers in medicine, especially in medical imaging. The main bottleneck is the unavailability of sufficiently large medical datasets required for the good performance of deep learning models. This paper proposes a new framework consisting of one variational autoencoder (VAE), two generative adversarial networks, and one auxiliary classifier to artificially generate realistic-looking skin lesion images and improve classification performance. We first train the encoder-decoder network to obtain the latent noise vector with the image manifold’s information and let the generative adversarial network sample the input from this informative noise vector in order to generate the skin lesion images. The use of informative noise allows the GAN to avoid mode collapse and creates faster convergence. To improve the diversity in the generated images, we use another GAN with an auxiliary classifier, which samples the noise vector from a heavy-tailed student t-distribution instead of a random noise Gaussian distribution. The proposed framework was named TED-GAN, with T from the t-distribution and ED from the encoder-decoder network which is part of the solution. The proposed framework could be used in a broad range of areas in medical imaging. We used it here to generate skin lesion images and have obtained an improved classification performance on the skin lesion classification task, rising from 66% average accuracy to 92.5%. The results show that TED-GAN has a better impact on the classification task because of its diverse range of generated images due to the use of a heavy-tailed t-distribution. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)
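
The abstract above contrasts latent noise drawn from a heavy-tailed Student t-distribution with the usual Gaussian noise. The following sketch illustrates that contrast using only the Python standard library; it is not the TED-GAN implementation, and the function names, the choice of 3 degrees of freedom, the threshold, and the sample size are assumptions made for illustration.

```python
import math
import random


def student_t_noise(dim, df, rng):
    """Draw dim independent samples from a Student t-distribution."""
    # t = Z / sqrt(V / df), with Z ~ N(0, 1) and V ~ chi-square(df).
    # A chi-square(df) variable is Gamma(shape=df / 2, scale=2).
    return [rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
            for _ in range(dim)]


def gaussian_noise(dim, rng):
    """Draw dim independent samples from a standard Gaussian."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def tail_fraction(samples, threshold=3.0):
    """Fraction of samples whose magnitude exceeds the threshold."""
    return sum(abs(x) > threshold for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
t_samples = student_t_noise(20000, df=3, rng=rng)
g_samples = gaussian_noise(20000, rng)
print("t tail fraction:", tail_fraction(t_samples))
print("Gaussian tail fraction:", tail_fraction(g_samples))
```

The t-distributed samples land far from zero much more often than the Gaussian ones, which is the property the abstract credits for the greater diversity of the generated images.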

16 pages, 1757 KiB  
Article
Mining Primary Care Electronic Health Records for Automatic Disease Phenotyping: A Transparent Machine Learning Framework
by Fabiola Fernández-Gutiérrez, Jonathan I. Kennedy, Roxanne Cooksey, Mark Atkinson, Ernest Choy, Sinead Brophy, Lin Huo and Shang-Ming Zhou
Diagnostics 2021, 11(10), 1908; https://doi.org/10.3390/diagnostics11101908 - 15 Oct 2021
Cited by 6 | Viewed by 2985
Abstract
(1) Background: We aimed to develop a transparent machine-learning (ML) framework to automatically identify patients with a condition from electronic health records (EHRs) via a parsimonious set of features. (2) Methods: We linked multiple sources of EHRs, including 917,496,869 primary care records and 40,656,805 secondary care records and 694,954 records from specialist surgeries between 2002 and 2012, to generate a unique dataset. Then, we treated patient identification as a problem of text classification and proposed a transparent disease-phenotyping framework. This framework comprises a generation of patient representation, feature selection, and optimal phenotyping algorithm development to tackle the imbalanced nature of the data. This framework was extensively evaluated by identifying rheumatoid arthritis (RA) and ankylosing spondylitis (AS). (3) Results: Being applied to the linked dataset of 9657 patients with 1484 cases of rheumatoid arthritis (RA) and 204 cases of ankylosing spondylitis (AS), this framework achieved accuracy and positive predictive values of 86.19% and 88.46%, respectively, for RA and 99.23% and 97.75% for AS, comparable with expert knowledge-driven methods. (4) Conclusions: This framework could potentially be used as an efficient tool for identifying patients with a condition of interest from EHRs, helping clinicians in clinical decision-support process. Full article
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)
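The abstract above reports both accuracy and positive predictive value (PPV) on an imbalanced cohort. As a rough illustration of why the two are reported together, the sketch below computes both metrics from confusion-matrix counts and applies them to a hypothetical "always negative" baseline for AS. Only the cohort sizes (9657 patients, 204 AS cases) come from the abstract; the function names and the baseline classifier are assumptions made for illustration and are not the authors' method.

```python
from typing import Optional


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all patients that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def ppv(tp: int, fp: int) -> Optional[float]:
    """Positive predictive value: fraction of predicted cases that are true cases.

    Returns None when no patient was predicted positive (PPV is undefined).
    """
    if tp + fp == 0:
        return None
    return tp / (tp + fp)


# Cohort sizes taken from the abstract.
n_patients = 9657
n_as_cases = 204

# Hypothetical baseline (an assumption, not from the paper):
# a classifier that labels every patient as "not AS".
baseline_acc = accuracy(tp=0, fp=0, tn=n_patients - n_as_cases, fn=n_as_cases)
baseline_ppv = ppv(tp=0, fp=0)

print(f"Baseline accuracy: {baseline_acc:.4f}")
print(f"Baseline PPV: {baseline_ppv}")
```

Because AS cases make up only a small share of the cohort, this trivial baseline already scores a high accuracy while identifying no patients at all, which is why a case-focused metric such as PPV is informative alongside accuracy.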

13 pages, 3110 KiB  
Article
Using Transfer Learning Method to Develop an Artificial Intelligence Assisted Triaging for Endotracheal Tube Position on Chest X-ray
by Kuo-Ching Yuan, Lung-Wen Tsai, Kevin S. Lai, Sing-Teck Teng, Yu-Sheng Lo and Syu-Jyun Peng
Diagnostics 2021, 11(10), 1844; https://doi.org/10.3390/diagnostics11101844 - 6 Oct 2021
Cited by 1 | Viewed by 3123
Abstract
Endotracheal tubes (ETTs) provide a vital connection between the ventilator and patient; however, improper placement can hinder ventilation efficiency or injure the patient. Chest X-ray (CXR) is the most common approach to confirming ETT placement; however, technicians require considerable expertise in the interpretation of CXRs, and formal reports are often delayed. In this study, we developed an artificial intelligence-based triage system to enable the automated assessment of ETT placement in CXRs. Three intensivists performed a review of 4293 CXRs obtained from 2568 ICU patients. The CXRs were labeled “CORRECT” or “INCORRECT” in accordance with ETT placement. A region of interest (ROI) was also cropped out, including the bilateral head of the clavicle, the carina, and the tip of the ETT. Transfer learning was used to train four pre-trained models (VGG16, INCEPTION_V3, RESNET, and DENSENET169) and two models developed in the current study (VGG16_Tensor Projection Layer and CNN_Tensor Projection Layer) with the aim of differentiating the placement of ETTs. Only VGG16 based on ROI images presented acceptable performance (AUROC = 92%, F1 score = 0.87). The results obtained in this study demonstrate the feasibility of using transfer learning to develop AI models that assess the placement of ETTs in CXRs.
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)

Other

10 pages, 1081 KiB  
Study Protocol
The NILS Study Protocol: A Retrospective Validation Study of an Artificial Neural Network Based Preoperative Decision-Making Tool for Noninvasive Lymph Node Staging in Women with Primary Breast Cancer (ISRCTN14341750)
by Ida Skarping, Looket Dihge, Pär-Ola Bendahl, Linnea Huss, Julia Ellbrant, Mattias Ohlsson and Lisa Rydén
Diagnostics 2022, 12(3), 582; https://doi.org/10.3390/diagnostics12030582 - 24 Feb 2022
Cited by 5 | Viewed by 2282
Abstract
Newly diagnosed breast cancer (BC) patients with clinical T1–T2 N0 disease undergo sentinel-lymph-node (SLN) biopsy, although most of them have a benign SLN. The pilot noninvasive lymph node staging (NILS) artificial neural network (ANN) model to predict nodal status was published in 2019, showing the potential to identify patients with a low risk of SLN metastasis. The aim of this study is to assess the performance measures of the model after a web-based implementation for the prediction of a healthy SLN in clinically N0 BC patients. This retrospective study was designed to validate the NILS prediction model for SLN status using preoperatively available clinicopathological and radiological data. The model outputs an estimated probability of a healthy SLN for each study participant. Our primary endpoint is to report on the performance of the NILS prediction model to distinguish between healthy and metastatic SLNs (N0 vs. N+) and compare the observed and predicted event rates of benign SLNs. After validation, the prediction model may assist medical professionals and BC patients in shared decision making on omitting SLN biopsies in patients predicted to be node-negative by the NILS model. This study was prospectively registered in the ISRCTN registry (identification number: 14341750).
(This article belongs to the Special Issue Intelligent Data Analysis for Medical Diagnosis)
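The protocol above mentions comparing the observed and predicted event rates of benign SLNs. One common way to make such a comparison is to set the mean predicted probability against the observed proportion of events; the sketch below shows that calculation under stated assumptions. The abstract does not specify how the comparison will be performed, so the function name `event_rates` and the four-patient toy data are hypothetical and are not study data.

```python
from typing import Sequence, Tuple


def event_rates(
    predicted_probs: Sequence[float],
    observed_outcomes: Sequence[int],
) -> Tuple[float, float]:
    """Return (predicted_rate, observed_rate) for a binary event.

    predicted_probs: model-estimated probability of the event per patient.
    observed_outcomes: 1 if the event occurred (e.g. benign SLN), else 0.
    """
    if len(predicted_probs) != len(observed_outcomes):
        raise ValueError("inputs must have the same length")
    if not predicted_probs:
        raise ValueError("inputs must not be empty")
    n = len(predicted_probs)
    predicted_rate = sum(predicted_probs) / n
    observed_rate = sum(observed_outcomes) / n
    return predicted_rate, observed_rate


# Hypothetical toy data for four patients (illustrative only).
toy_probs = [0.9, 0.8, 0.6, 0.3]
toy_outcomes = [1, 1, 1, 0]

predicted, observed = event_rates(toy_probs, toy_outcomes)
print(f"Predicted event rate: {predicted:.2f}")
print(f"Observed event rate:  {observed:.2f}")
```

A predicted rate close to the observed rate suggests the model is well calibrated on average, although it says nothing about how well individual patients are discriminated.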