Predicting the Onset of Diabetes with Machine Learning Methods
Abstract
1. Introduction
2. Literature Review
3. Steps and Methods
3.1. Research Subjects
3.2. Data Preprocessing
3.3. Data Analysis and Classification
3.4. Model Evaluation Metrics
- True Positives (TP): someone with diabetes was predicted to have diabetes.
- False Positives (FP): someone without diabetes was predicted to have diabetes.
- False Negatives (FN): someone with diabetes was not predicted to have diabetes.
- True Negatives (TN): someone without diabetes was not predicted to have diabetes.
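The four counts above are the usual inputs to the accuracy, precision, recall, and F1 score columns reported for each model. As a sketch, the standard textbook definitions are written out below; the assumption here is that the study uses these conventional forms, which is not stated in the list itself.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + FP + FN + TN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F_1              &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
                  = \frac{2\,TP}{2\,TP + FP + FN}
\end{align}
```

Recall measures how many people with diabetes the model finds, while precision measures how many positive predictions are correct, so a screening setting that wants to miss few cases would typically weight recall more heavily.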
3.5. Machine Learning Model
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Table: Metrics for Evaluation of the Model

Model | True Positive | False Positive | False Negative | True Negative | Accuracy | Precision | Recall | F1 Score | AUC |
---|---|---|---|---|---|---|---|---|---|
Two-Class Logistic Regression | 620 | 229 | 365 | 1786 | 0.802 | 0.73 | 0.629 | 0.676 | 0.87 |
Two-Class Neural Network | 877 | 169 | 108 | 1846 | 0.908 | 0.838 | 0.89 | 0.864 | 0.966 |
Two-Class Decision Jungle | 873 | 95 | 112 | 1920 | 0.931 | 0.902 | 0.886 | 0.894 | 0.976 |
Two-Class Boosted Decision Tree | 917 | 72 | 68 | 1943 | 0.953 | 0.927 | 0.931 | 0.929 | 0.991 |
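The accuracy, precision, recall, and F1 columns in the table can be recomputed from the four confusion-matrix counts in each row. The following is a minimal sketch of that check, assuming the standard definitions of these metrics; the names `ConfusionCounts`, `REPORTED`, and `max_abs_deviation` are hypothetical helpers introduced for illustration and do not come from the article. AUC is left out because it depends on the ranked prediction scores and cannot be derived from the counts alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for one binary classifier."""
    tp: int  # has diabetes, predicted diabetes
    fp: int  # no diabetes, predicted diabetes
    fn: int  # has diabetes, predicted no diabetes
    tn: int  # no diabetes, predicted no diabetes

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def f1(self) -> float:
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)

    def as_tuple(self) -> tuple:
        return (self.accuracy, self.precision, self.recall, self.f1)


# Counts and reported (accuracy, precision, recall, F1) copied from the table.
REPORTED = {
    "Two-Class Logistic Regression": (
        ConfusionCounts(620, 229, 365, 1786), (0.802, 0.73, 0.629, 0.676)),
    "Two-Class Neural Network": (
        ConfusionCounts(877, 169, 108, 1846), (0.908, 0.838, 0.89, 0.864)),
    "Two-Class Decision Jungle": (
        ConfusionCounts(873, 95, 112, 1920), (0.931, 0.902, 0.886, 0.894)),
    "Two-Class Boosted Decision Tree": (
        ConfusionCounts(917, 72, 68, 1943), (0.953, 0.927, 0.931, 0.929)),
}


def max_abs_deviation() -> float:
    """Largest gap between a recomputed metric and its reported value."""
    return max(
        abs(computed - reported)
        for counts, reported_row in REPORTED.values()
        for computed, reported in zip(counts.as_tuple(), reported_row)
    )


if __name__ == "__main__":
    for model, (counts, _) in REPORTED.items():
        acc, prec, rec, f1 = counts.as_tuple()
        print(f"{model}: acc={acc:.3f} prec={prec:.3f} rec={rec:.3f} f1={f1:.3f}")
    print(f"max deviation from reported values: {max_abs_deviation():.5f}")
```

Under these definitions every recomputed value agrees with the reported one to within rounding at three decimal places, and each row sums to the same 3000 evaluated records.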
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chou, C.-Y.; Hsu, D.-Y.; Chou, C.-H. Predicting the Onset of Diabetes with Machine Learning Methods. J. Pers. Med. 2023, 13, 406. https://doi.org/10.3390/jpm13030406