Framework to Diagnose the Metabolic Syndrome Types without Using a Blood Test Based on Machine Learning
Abstract
1. Introduction
- Achieve a mathematical representation to diagnose MetS using HMS criteria.
- Propose a segmentation of MetS using HMS criteria.
- Develop a framework to diagnose the different MetS types according to HMS criteria using a set of variables that doctors can obtain using non-invasive methods in a first consultation.
- Evaluate two machine learning techniques using performance indicators for each MetS type.
2. Methodology
2.1. Review
- Could authors predict the Metabolic Syndrome types or segmentation without a blood test? Y/N.
- What Metabolic Syndrome diagnostic criteria did the authors use? e.g., ATP III, IDF, HMS, or other recognized criteria.
- What ANN configuration did the authors use?
- What validation method did the authors use? e.g., hold out, random subsampling, and others.
- What performance indicators did the authors use? e.g., Sensitivity, Area Under the ROC curve, Specificity.
2.2. Analysis
2.2.1. Design and Study Population
- Age of 20 years or over.
- The subject can understand the instructions explained by the researchers.
- The subject can sign an informed consent.
- The subject resides permanently in the area.
- Are you pregnant?
- Are you bedridden?
2.2.2. Physical Examination and Blood Tests
2.3. Model
2.3.1. Mathematical Representation to Diagnose MetS
- W: Represents the normal(0) or raised(1) status of the dichotomous variable of the waist circumference
- P: Represents the normal(0) or raised(1) status of the dichotomous variable of the blood pressure
- G: Represents the normal(0) or raised(1) status of the dichotomous variable of the fasting plasma glucose
- H: Represents the normal(0) or lowered(1) status of the dichotomous variable of the HDL-C
- T: Represents the normal(0) or raised(1) status of the dichotomous variable of the triglycerides
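The five dichotomous variables above can be combined into a single diagnosis. The following is a minimal sketch, assuming the HMS rule that MetS is present when at least three of the five risk factors are flagged; the function name `mets` is illustrative and does not come from the paper.

```python
from itertools import product

def mets(w: int, p: int, g: int, h: int, t: int) -> int:
    """Return 1 when at least three of the five dichotomous risk factors are 1."""
    return int(w + p + g + h + t >= 3)

# Enumerate all 32 combinations in the order W, P, G, H, T
# (W is the most significant bit of the row number n).
truth_table = [(n, bits, mets(*bits)) for n, bits in enumerate(product((0, 1), repeat=5))]

# Row numbers for which the diagnosis is positive.
positive_rows = [n for n, _, diagnosis in truth_table if diagnosis == 1]
print(positive_rows)
```

Under this assumption, 16 of the 32 combinations are positive: ten with exactly three factors, five with four, and one with all five.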
2.3.2. Proposed Model MetS Segmentation
2.3.3. Framework to Diagnose the MetS Types
- (a)
- Extraction, Transformation, and Load (ETL). In this stage, we collected data from a population of 615 subjects who authorized a blood sample to measure triglycerides, HDL-C, and fasting plasma glucose. The study also recorded anthropometric and clinical variables: Age, Sex, Weight, Height, Waist Circumference (WC), Hip Circumference (HC), Systolic Blood Pressure (SBP), and Diastolic Blood Pressure (DBP).
  Later, through the transformation process, we obtained Body Mass Index, Body Fat Percentage, Waist-to-Hip ratio, Dichotomous Systolic Blood Pressure, Dichotomous Diastolic Blood Pressure, Dichotomous Blood Pressure, Dichotomous Triglycerides, Dichotomous Fasting Blood Sugar, Dichotomous HDL-C, and Dichotomous Waist Circumference, among others.
  Afterward, we used the dichotomous values of the HMS criteria's risk factors to build the MetS types obtained from the segmentation process explained in the previous subsection. This gave the output variables WPG, WPH, WPT, WGH, WGT, WTH, PGT, PGH, PHT, and GHT. Finally, all anthropometric and clinical data were loaded into a dataset of 615 records.
- (b)
- Statistical analysis and dataset balancing. In this stage, we began with a dataset of 615 people with biochemical measurements and their respective MetS diagnosis. A descriptive statistical analysis of the dataset showed that some MetS types were imbalanced, as shown in the Results section.
  This imbalance was caused by the low prevalence of the fasting blood glucose risk factor in the study population, which is expected in a study of MetS [40]. We resolved it with a data balancing technique, the Synthetic Minority Oversampling Technique (SMOTE) [41,42], as implemented in WEKA. We created synthetic data to obtain a balanced dataset of 799 records (615 original plus 184 synthetic) with a better distribution of MetS risk factors, thus improving the quality of discrimination.
- (c)
- Modeling. In this stage, we used an algorithm to select the necessary non-biochemical features. We used Sequential Feature Selection in Matlab to achieve the maximum discrimination of the proposed model's output variables in both datasets (imbalanced and balanced).
  For the following step, we used several Multilayer Perceptron (MLP) ANNs to predict each MetS type: WPG, WPH, WPT, WGH, WGT, WTH, PGT, PGH, PTH, and GHT. Each ANN must be trained before it is used to predict the value of the output (dependent) variable. Each ANN is formed by neurons whose inputs can come from other neurons or from outside the network; Figure 3 shows the basic structure of an ANN.
  Each ANN structure should be initialized according to the propagation rule, and each node has synaptic weights, which represent the degree of communication between neurons, as shown in Equations (7) and (8). The training data are then fed into the network, and the propagation algorithm is applied to obtain the final network parameters. In practice, the algorithm is divided into two parts: network training and network testing. The steps of the propagation algorithm are described in [43].
  Information flows in one direction only, from the inputs to the hidden layer and then to the output layer. It passes through the neurons' activation functions, which determine the current state, and finally converges at the output [33].
  Each ANN has several hidden neurons with an activation function such as the hyperbolic tangent sigmoid, and an output layer with a single neuron whose function can be a log-sigmoid [44,46].
  There are no hard and fast rules for the number of hidden neurons. It can be calculated or found empirically, and it depends heavily on the problem and the dataset [47]. However, we followed the methodology of [48,49,50], described in Equation (9), where the number of hidden neurons (NHN) is based on the number of input variables plus the output variable.
  We used Equation (9) to estimate the number of hidden neurons in order to contribute to research on machine learning for the diagnosis of MetS without biochemical variables, and to describe every detail of the process so that other researchers can continue investigating these models, as Chen [32] did using a different equation to calculate the hidden neurons.
  The other machine learning technique used to diagnose the MetS types was the Random Undersampling Boosted tree ensemble (RUSBoost), chosen because the data from the MetS study are imbalanced [51]. This technique improves the performance indicators of models trained on imbalanced data by applying random undersampling, which randomly removes samples from the majority class [52], as shown in the algorithm detailed in Appendix B with the configuration shown in Table 5.
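Equation (9) is not legible in this version of the text. The sketch below therefore assumes a commonly cited rule of thumb: two thirds of the number of inputs plus the number of outputs, rounded to the nearest integer. Both the 2/3 factor and the rounding are assumptions, not the paper's stated formula, although they agree with the hidden neuron counts listed per MetS type in the tables later in this document.

```python
def hidden_neurons(n_inputs: int, n_outputs: int = 1) -> int:
    """Assumed rule of thumb: round(2/3 * inputs + outputs)."""
    return round(2 * n_inputs / 3 + n_outputs)

# One output neuron per MetS type; feature sets in this study have 1 to 5 inputs.
for n_inputs in range(1, 6):
    print(n_inputs, hidden_neurons(n_inputs))
```

For example, a type predicted from three selected features would get three hidden neurons under this assumption, and a type predicted from five features would get four.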
2.4. Performance Indicators and Model Assessment
2.5. Document
3. Results
3.1. Data Description
3.2. Experiment to Diagnose the Traditional MetS without Biochemical Variables
3.3. Experiments to Diagnose Each MetS Type without a Blood Test
- Approach 1 uses only the ANN technique with a feature selection algorithm.
- Approach 2 uses an ensemble classification algorithm on the dataset, the Random Undersampling Boosted tree (RUSBoost) ensemble.
- Approach 3 uses SMOTE to create additional synthetic data, which we call the dataset with oversampling, and then applies the ANN.
- Approach 4 uses the dataset with oversampling and RUSBoost.
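The oversampling step behind Approaches 3 and 4 can be illustrated with a simplified SMOTE sketch. It assumes numeric feature tuples and Euclidean distance, and it picks base samples at random; WEKA's implementation differs in its details. The function name, parameters, and toy values are illustrative only.

```python
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic points by interpolating between a minority sample and a near neighbor."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [s for s in minority if s is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic

# Toy minority class with two features, e.g. (WC in cm, SBP in mmHg); values are made up.
minority = [(98.0, 130.0), (102.0, 135.0), (95.0, 128.0), (110.0, 140.0)]
new_points = smote(minority, n_synthetic=3, k=2)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, the new records stay inside the range already covered by the minority class.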
3.3.1. Approach 1: Diagnosis of Each MetS Type Using the Original Dataset and ANN
3.3.2. Approach 2: Diagnosis of Each MetS Type Using the Original Dataset and RusBoost
3.3.3. Approach 3: Diagnosis of Each MetS Type Using the Dataset with Oversampling and ANN
3.3.4. Approach 4: Diagnosis of Each MetS Type Using the Dataset with Oversampling and RusBoost
4. Discussion
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
MDPI | Multidisciplinary Digital Publishing Institute |
DOAJ | Directory of open access journals |
IEEE | Institute of Electrical and Electronics Engineers |
ACM | Association for Computing Machinery |
DBLP | Digital Bibliography Library Project |
GCP | Good Clinical Practices |
ICH | International Conference on Harmonization |
WHO | World Health Organization |
NCEP ATP III | National Cholesterol Education Programme Adult Treatment Panel III |
EGIR | European Group for the study of Insulin Resistance |
IDF | International Diabetes Federation |
HMS | Harmonized Metabolic Syndrome |
MetS | Metabolic Syndrome |
MetSG | Metabolic Syndrome General |
CHD | Coronary Heart Disease |
IR | Insulin Resistance |
ICD | International Classification of Diseases |
OR | Odds Ratio |
CI | Confidence Interval |
SS | Sensitivity |
SP | Specificity |
FNR | False Negative Rate |
FPR | False Positive Rate |
AROC | Area under Receiver Operating Characteristic Curve |
WC | Waist Circumference |
BP | Blood Pressure |
HDL-C | High-Density Lipoprotein Cholesterol |
FPG | Fasting Plasma Glucose |
TG | Triglycerides |
WG | Weight |
HG | Height |
HC | Hip Circumference |
WHR | Waist to Hip Ratio |
WSR | Waist to Stature Ratio |
BMI | Body Mass Index |
BFP | Body Fat Percentage |
SBP | Systolic Blood Pressure |
DBP | Diastolic Blood Pressure |
SBPD | Systolic Blood Pressure Dichotomous |
DBPD | Diastolic Blood Pressure Dichotomous |
W | Represents the normal(0) or raised(1) status of the dichotomous variable of the WC |
P | Represents the normal(0) or raised(1) status of the dichotomous variable of the BP |
G | Represents the normal(0) or raised(1) status of the dichotomous variable of the FPG |
H | Represents the normal(0) or lowered(1) status of the dichotomous variable of the HDL-C |
T | Represents the normal(0) or raised(1) status of the dichotomous variable of the TG |
ANN | Artificial Neural Networks |
SMOTE | Synthetic Minority Oversampling Technique |
PCLR | Principal Component Logistic Regression |
RUSBoost | Random Undersampling Boosting |
Appendix A. Solution of Quine–McCluskey Algorithm to Minimize the MetS Types
- All implicants of order 0 that differ in the state of only one variable are grouped together. The group is obtained by eliminating the changed variable, which yields an implicant of order 1. For example, the order 0 implicants number 7 (W’P’GHT) and number 15 (W’PGHT) are grouped together, resulting in W’GHT, which is of order 1.
- Then the implicants of order 1 that differ in the state of only one variable are grouped together by eliminating the changed variable. For example, the implicants 7, 15 (W’GHT) and 23, 31 (WGHT), both of order 1, are grouped together, resulting in GHT, which is of order 2.
- This process is carried out starting from all the implicants of order 0 until all implicants are minimized, as shown in Equation (6).
IMPLICANTS | |||||
---|---|---|---|---|---|
n | Order 0 * | n | Order 1 | n | Order 2 |
7 | W’P’GHT | 7, 15 | W’GHT | 7, 15, 23, 31 | GHT |
11 | W’PG’HT | 7, 23 | P’GHT | 11, 15, 27, 31 | PHT |
13 | W’PGH’T | 11, 15 | W’PHT | 13, 15, 29, 31 | PGT |
14 | W’PGHT’ | 11, 27 | PG’HT | 14, 15, 30, 31 | PGH |
15 | W’PGHT | 13, 15 | W’PGT | 19, 23, 27, 31 | WHT |
19 | WP’G’HT | 13, 29 | PGH’T | 21, 23, 29, 31 | WGT |
21 | WP’GH’T | 14, 15 | W’PGH | 22, 23, 30, 31 | WGH |
22 | WP’GHT’ | 14, 30 | PGHT’ | 25, 27, 29, 31 | WPT |
23 | WP’GHT | 15, 31 | PGHT | 26, 27, 30, 31 | WPH |
25 | WPG’H’T | 19, 23 | WP’HT | 28, 29, 30, 31 | WPG |
26 | WPG’HT’ | 19, 27 | WG’HT | ||
27 | WPG’HT | 21, 23 | WP’GT | ||
28 | WPGH’T’ | 21, 29 | WGH’T | ||
29 | WPGH’T | 22, 23 | WP’GH | ||
30 | WPGHT’ | 22, 30 | WGHT’ | ||
31 | WPGHT | 23, 31 | WGHT | ||
| | | 25, 27 | WPG’T | | |
| | | 25, 29 | WPH’T | | |
| | | 26, 27 | WPG’H | | |
| | | 26, 30 | WPHT’ | | |
| | | 27, 31 | WPHT | | |
| | | 28, 29 | WPGH’ | | |
| | | 28, 30 | WPGT’ | | |
| | | 29, 31 | WPGT | | |
| | | 30, 31 | WPGH | | |
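The grouping procedure of this appendix can be reproduced with a short sketch. It covers only the merging phase of the Quine–McCluskey algorithm (finding prime implicants) and assumes the minterms are the 16 rows with at least three raised risk factors. The helper names are illustrative, `-` marks an eliminated variable, and an ASCII apostrophe marks a negated one.

```python
from itertools import combinations

VARS = "WPGHT"

def merge(a: str, b: str):
    """Merge two implicants that differ in exactly one fixed bit; otherwise return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms, n_vars=5):
    """Repeat the grouping step until no two implicants can be merged."""
    current = {format(m, f"0{n_vars}b") for m in minterms}
    primes = set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(sorted(current), 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        primes |= current - used  # implicants that could not be merged any further
        current = merged
    return primes

def to_term(pattern: str) -> str:
    """Convert a bit pattern such as '--111' into a product term such as 'GHT'."""
    return "".join(v + ("" if bit == "1" else "'") for v, bit in zip(VARS, pattern) if bit != "-")

minterms = [n for n in range(32) if bin(n).count("1") >= 3]
terms = sorted(to_term(p) for p in prime_implicants(minterms))
print(terms)
```

Under these assumptions the merging stops at order 2, leaving ten three-variable terms, one for each combination of three risk factors.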
Appendix B
Algorithm A1 RUSBoost Algorithm (Adapted from [65]). |
Given: Set S of examples $(x_1, y_1), \ldots, (x_m, y_m)$ with a minority class; Weak learner (decision tree), WeakLearn; Number of iterations, T; Desired percentage of total instances to be represented by the minority class, N
|
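The random undersampling step that RUSBoost applies before training each weak learner can be sketched as follows. This is a minimal illustration of that one step, not the full boosting loop; `minority_share` plays the role of the parameter N in Algorithm A1, and the function name and toy data are assumptions.

```python
import random

def random_undersample(samples, labels, minority_label, minority_share, seed=0):
    """Randomly drop majority samples until the minority class reaches `minority_share`."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Majority samples to keep so that minority / total equals minority_share.
    keep = min(len(majority), round(len(minority) * (1 - minority_share) / minority_share))
    kept = sorted(minority + rng.sample(majority, keep))
    return [samples[i] for i in kept], [labels[i] for i in kept]

# Toy data: 10 minority records (label 1) and 90 majority records (label 0).
labels = [1] * 10 + [0] * 90
samples = list(range(100))
x_bal, y_bal = random_undersample(samples, labels, minority_label=1, minority_share=0.5)
print(len(x_bal), y_bal.count(1), y_bal.count(0))
```

All minority records are kept; only majority records are discarded, which is why undersampling shrinks the training set instead of adding synthetic data.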
References
- Kaur, J. A Comprehensive Review on Metabolic Syndrome. Cardiol. Res. Pract. 2014, 1–21.
- Cornier, M.-A.; Dabelea, D.; Hernandez, T.L.; Lindstrom, R.C.; Steig, A.J.; Stob, N.R.; Eckel, R.H. The Metabolic Syndrome. Endocr. Rev. 2008, 29, 777–822.
- Müller-Nordhorn, J.; Willich, S.N. Coronary Heart Disease. In International Encyclopedia of Public Health, 2nd ed.; Academic Press: Cambridge, MA, USA, 2017; Volume 2, pp. 159–167.
- WHO. Global Action Plan for the Prevention and Control of Noncommunicable Diseases 2013–2020; World Health Organization: Geneva, Switzerland, 2013; ISBN 978-9-24150-623-6.
- Navarro Lechuga, E.; Vargas Moranth, R. Metabolic syndrome in the southeast of Barranquilla (Colombia). Salud Uninorte 2008, 24, 40–52.
- Chobanian, A.V.; Bakris, G.L.; Black, H.R.; Cushman, W.C.; Green, L.A.; Izzo, J.L.; Roccella, E.J. Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. Hypertension 2003, 42, 1206–1252.
- Esposito, K.; Chiodini, P.; Colao, A.; Lenzi, A.; Giugliano, D. Metabolic Syndrome and Risk of Cancer: A systematic review and meta-analysis. Diabetes Care 2012, 35, 2402–2411.
- Chen, J.; Muntner, P.; Hamm, L.L.; Jones, D.W.; Batuman, V.; Fonseca, V.; He, J. The Metabolic Syndrome and Chronic Kidney Disease in U.S. Adults. Ann. Intern. Med. 2004, 140, 167.
- Grundy, S.M. Metabolic Syndrome: Connecting and Reconciling Cardiovascular and Diabetes Worlds. J. Am. Coll. Cardiol. 2006, 47, 1093–1100.
- Grundy, S.M. Metabolic Syndrome Pandemic. Arterioscler. Thromb. Vasc. Biol. 2008, 28, 629–636.
- Ford, E.S.; Giles, W.H.; Dietz, W.H. Prevalence of the Metabolic Syndrome Among US Adults. J. Am. Med. Assoc. 2002, 287, 356–359.
- Mozumdar, A.; Liguori, G. Persistent Increase of Prevalence of Metabolic Syndrome Among U.S. Adults: NHANES III to NHANES 1999–2006. Diabetes Care 2011, 34, 216–219.
- Aguilar, M.; Bhuket, T.; Torres, S.; Liu, B. Prevalence of the Metabolic Syndrome in the United States, 2003–2012. JAMA 2015, 313, 1973–1974.
- Lakka, H. The Metabolic Syndrome and Total and Cardiovascular Disease Mortality in Middle-aged Men. JAMA 2002, 288, 2709–2716.
- Grundy, S.M. Metabolic Syndrome: A Multiplex Cardiovascular Risk Factor. J. Clin. Endocrinol. Metab. 2007, 92, 399–404.
- Aschner, P. Metabolic syndrome as a risk factor for diabetes. Expert Rev. Cardiovasc. Ther. 2010, 8, 407–412.
- Gutiérrez-Solis, R.M.; Datta Banik, A.L.; Méndez-González, S. Prevalence of Metabolic Syndrome in Mexico: A Systematic Review and Meta-Analysis. Metab. Syndr. Relat. Disord. 2018, 16, 395–405.
- Navarro, E.; Vargas, R.F. Coronary risk according to Framinghan equation in adults with metabolic syndrome in the city of Soledad, Atlantico, 2010. Rev. Colomb. Cardiol. 2012, 19, 109–118.
- Alberti, K.G.M.M.; Zimmet, P.Z. Definition, Diagnosis and Classification of Diabetes Mellitus and its Complications Part 1: Diagnosis and Classification of Diabetes Mellitus Provisional Report of a WHO Consultation. Diabet. Med. 1998, 15, 539–553.
- Bartlett, J.G.M. Executive summary of the third report of the National Cholesterol Education Program (NCEP) expert panel on detection, evaluation and treatment of high blood cholesterol in adults. Infect. Dis. Clin. Pract. 2001, 10, 287–288.
- Balkau, B.; Charles, M. Comment on the provisional report from the WHO consultation. European Group for the Study of Insulin Resistance (EGIR). Diabet. Med. 1999, 16, 442–443.
- Alberti, K.G.M.M.; Zimmet, P.; Shaw, J. Metabolic syndrome—A new world-wide definition. A Consensus Statement from the International Diabetes Federation. Diabet. Med. 2006, 23, 469–480.
- Alberti, K.G.M.M.; Eckel, R.H.; Grundy, S.M.; Zimmet, P.Z.; Cleeman, J.I.; Donato, K.A.; Smith, S.C. Harmonizing the Metabolic Syndrome International Atherosclerosis Society; and International Association for the Study of Obesity. Circulation 2009, 120, 1640–1645.
- Minsalud. Informe Nacional de Calidad de la Atención en Salud 2015; Ministerio de Salud y Protección Social: Bogotá, Colombia, 2015; p. 217.
- Irving, G.; Neves, A.L.; Dambha-Miller, H.; Oishi, A.; Tagashira, H.; Verho, A.; Holden, J. International variations in primary care Doctor consultation time: A systematic review of 67 countries. BMJ Open 2017, 7, e017902.
- Jover, A.; Corbella, E.; Mun, A.; Pedro-botet, J.; Herna, A.; Zu, M. Prevalence of Metabolic Syndrome and its Components in Patients With Acute Coronary Syndrome. Rev. Española Cardiol. 2011, 64, 579–586.
- De Kroon, M.L.; Renders, C.M.; Kuipers, E.C.; van Wouwe, J.P.; Van Buuren, S.; De Jonge, G.A.; Hirasing, R.A. Identifying metabolic syndrome without blood tests in young adults—The Terneuzen Birth Cohort. Eur. J. Public Health 2008, 18, 656–660.
- Hsiung, D.Y.; Liu, C.W.; Cheng, P.C.; Ma, W.F. Using non-invasive assessment methods to predict the risk of metabolic syndrome. Appl. Nurs. Res. 2015, 28, 72–77.
- Alshehri, A. Metabolic syndrome and cardiovascular risk. J. Fam. Community Med. 2010, 17, 73.
- Barrios, M.; Jimeno, M.; Villalba, P.; Navarro, E. Novel Data Mining Methodology for Healthcare Applied to a New Model to Diagnose Metabolic Syndrome without a blood test. Diagnostics 2019, 9, 192.
- Murguía-Romero, M.; Jiménez-Flores, R.; Méndez-Cruz, A.R.; Villalobos-Molina, R. Predicting Metabolic Syndrome with Neural Networks. In Advances in Artificial Intelligence and Its Applications; Castro, F., Gelbukh, A., González, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 464–472.
- Chen, H.; Xiong, S.; Ren, X. Evaluating the Risk of Metabolic Syndrome Based on an Artificial Intelligence Model. Abstr. Appl. Anal. 2014, 2014, 207268.
- Ivanović, D.; Kupusinac, A.; Stokić, E.; Doroslovački, R.; Ivetić, D. ANN Prediction of Metabolic Syndrome: A Complex Puzzle that will be Completed. J. Med. Syst. 2016, 40, 264.
- Navarro Lechuga, E.; Vargas Moranth, R.F.; Alcocer Olaciregui, A.E. Grasa corporal total como posible indicador de síndrome metabólico en adultos. Rev. Española Nutr. Hum. Dietética 2016, 20, 198.
- Rodríguez, A.S.; Soidan, J.L.G.; Gómez, M.J.A.; Rodríguez, R.L.; del Alonso, A.Á.; Fernández, M.R.P. Metabolic syndrome and visceral fat in women with cardiovascular risk factor. Nutr. Hosp. 2017, 34, 863–868.
- Lean, M.E.J.; Han, T.S.; Deurenberg, P. Predicting body composition by densitometry from simple anthropometric measurements. Am. J. Clin. Nutr. 1996, 63, 4–14.
- Fliotsos, M.; Zhao, D.; Rao, V.N.; Ndumele, C.E.; Guallar, E.; Burke, G.L.; Michos, E.D. Body mass index from Early-, Mid-, and Older-adulthood and risk of heart failure and atherosclerotic cardiovascular disease: MESA. J. Am. Heart Assoc. 2018, 7, e009599.
- Floyd, T.L. Digital Fundamentals, 8th ed.; Pearson Education: New York City, NY, USA, 2002; ISBN 978-0130995278.
- Rosen, K.H. Discrete Mathematics and Its Applications, 5th ed.; McGraw-Hill Higher Education: New York, NY, USA, 2002.
- Perveen, S.; Shahbaz, M.; Keshavjee, K.; Guergachi, A. Metabolic Syndrome and Development of Diabetes Mellitus: Predictive Modeling Based on Machine Learning Techniques. IEEE Access 2019, 7, 1365–1375.
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
- Fernandez, A.; Garcia, S.; Herrera, F.; Chawla, N.V. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary. J. Artif. Intell. Res. 2018, 61, 863–905.
- Kumar, S. Neural Networks, 2nd ed.; Tata McGraw-Hill Education: New York, NY, USA, 2012.
- Bishop, C.M. Pattern Recognition and Machine Learning (Information Science and Statistics); Springer: Secaucus, NJ, USA, 2006.
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning (Second); Springer: New York, NY, USA, 2009.
- Witten, I.H.; Frank, E. Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems); Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2005.
- Andrea, T.A.; Kalayeh, H. Applications of Neural Networks in Quantitative Structure-Activity Relationships of Dihydrofolate Reductase Inhibitors. J. Med. Chem. 1991, 34, 2824–2836.
- Boger, Z.; Guterman, H. Knowledge extraction from artificial neural networks models. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Orlando, FL, USA, 12–15 October 1997; pp. 3030–3035.
- Karsoliya, S. Approximating Number of Hidden layer neurons in Multiple Hidden Layer BPNN Architecture. Int. J. Eng. Trends Technol. 2012, 3, 714–717.
- Panchal, F.S.; Panchal, M. Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network. Int. J. Comput. Sci. Mob. Comput. 2014, 3, 455–464.
- Mounce, S.R.; Ellis, K.; Edwards, J.M.; Speight, V.L.; Jakomis, N.; Boxall, J.B. Ensemble Decision Tree Models Using RUSBoost for Estimating Risk of Iron Failure in Drinking Water Distribution Systems. Water Resour. Manag. 2017, 31, 1575–1589.
- Seiffert, C.; Khoshgoftaar, T.M.; Van Hulse, J.; Napolitano, A. RUSBoost: Improving classification performance when training data is skewed. In Proceedings of the 2008 19th International Conference on Pattern Recognition, Tampa, FL, USA, 8–11 December 2008; pp. 1–4.
- Xu, Q.S.; Liang, Y.Z. Monte Carlo cross validation. Chemom. Intell. Lab. Syst. 2001, 56, 1–11.
- Shao, J. Linear model selection by cross-validation. J. Stat. Plan. Inference 2005, 128, 231–240.
- Fushiki, T. Estimation of prediction error by using K-fold cross-validation. Stat. Comput. 2011, 21, 137–146.
- Berrar, D. Cross-Validation. In Encyclopedia of Bioinformatics and Computational Biology; Elsevier: Amsterdam, The Netherlands, 2019; pp. 542–545.
- Hosmer, D.W.; Lemeshow, S. Assessing the Fit of the Model. In Applied Logistic Regression; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2004; pp. 143–202.
- Guyon, I.; Andre, E. An Introduction to Variable and Feature Selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
- Rückstieß, T.; Osendorfer, C.; van der Smagt, P. Sequential Feature Selection for Classification; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2011; pp. 132–141.
- Duncan, G.E.; Perri, M.G.; Theriaque, D.W.; Hutson, A.D.; Eckel, R.H.; Stacpoole, P.W. Exercise Training, Without Weight Loss, Increases Insulin Sensitivity and Postheparin Plasma Lipase Activity in Previously Sedentary Adults. Diabetes Care 2003, 26, 557–562.
- Bouwmeester, W.; Zuithoff, N.P.; Mallett, S.; Geerlings, M.I.; Vergouwe, Y.; Steyerberg, E.W.; Moons, K.G. Reporting and methods in clinical prediction research: A systematic review. PLoS Med. 2012, 9, e1001221.
- Sun, Y.; Wong, A.K.C.; Kamel, M.S. Classification of imbalanced data: A review. Int. J. Pattern Recognit. Artif. Intell. 2009, 23, 687–719.
- Melillo, P.; Luca, N.D.; Bracale, M.; Pecchia, L. Classification tree for risk assessment in patients suffering from congestive heart failure via long-term heart rate variability. IEEE J. Biomed. Health Inform. 2013, 17, 727–733.
- Bolón-Canedo, V.; Sánchez-Maroño, N.; Alonso-Betanzos, A. A review of feature selection methods on synthetic data. Knowl. Inf. Syst. 2013, 34, 483–519.
- Seiffert, C.; Khoshgoftaar, T.M.; Van Hulse, J.; Napolitano, A. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2010, 40, 185–197.
Risk Factors | HMS Criteria |
---|---|
Central Obesity | Waist Circumference (WC) population and country specific |
Triglycerides (TG) | ≥150 mg/dL |
Fasting Plasma Glucose (FPG) | ≥100 mg/dL |
High-Density Lipoprotein Cholesterol (HDL-C) | <40 mg/dL in males; <50 mg/dL in females |
Blood Pressure | Systolic ≥ 130 mmHg and/or Diastolic ≥ 85 mmHg |
Diagnostic criteria | At least three risk factors |
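The thresholds in the table above translate directly into the dichotomous variables W, P, G, H, and T. The sketch below is illustrative only: the waist circumference cut-off is population and country specific, so it is passed in as a parameter, and the function name, argument names, the use of "greater than or equal" for the waist comparison, and the example subject are assumptions.

```python
def dichotomize(tg, fpg, hdl, sex, sbp, dbp, wc, wc_cutoff):
    """Map raw measurements to dichotomous HMS risk factors (1 = risk factor present)."""
    return {
        "W": int(wc >= wc_cutoff),                  # cut-off is population specific (assumed >=)
        "P": int(sbp >= 130 or dbp >= 85),          # mmHg
        "G": int(fpg >= 100),                       # mg/dL
        "H": int(hdl < (40 if sex == "M" else 50)), # mg/dL, lower threshold for males
        "T": int(tg >= 150),                        # mg/dL
    }

# Hypothetical subject; the 90 cm waist cut-off is only an example value.
factors = dichotomize(tg=160, fpg=92, hdl=38, sex="M", sbp=128, dbp=86, wc=95, wc_cutoff=90)
has_mets = sum(factors.values()) >= 3
print(factors, has_mets)
```

In this example the diastolic value alone is enough to set P to 1, which reflects the "and/or" wording of the blood pressure criterion.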
Authors | Murguia-Romero [31] | Chen [32] | Kupusinac [33] |
---|---|---|---|
Age | E | E | |
Sex | E | E | E |
WG: Weight | E | I | I |
HG: Height | E | I | I |
WC: Waist circumference | E | E | I |
HC: Hip circumference | E | ||
WHR: Waist to Hip ratio | E | ||
WSR: Waist to Stature Ratio | E | ||
BMI: Body Mass Index | E | E | E |
SBP: Systolic blood pressure | E | E | |
DBP: Diastolic blood pressure | E | E | |
Hidden neurons | 25 | 5 | 85 and 96 |
n | W | P | G | H | T | MetS |
---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 1 | 0 |
2 | 0 | 0 | 0 | 1 | 0 | 0 |
3 | 0 | 0 | 0 | 1 | 1 | 0 |
4 | 0 | 0 | 1 | 0 | 0 | 0 |
5 | 0 | 0 | 1 | 0 | 1 | 0 |
6 | 0 | 0 | 1 | 1 | 0 | 0 |
7 | 0 | 0 | 1 | 1 | 1 | 1 |
8 | 0 | 1 | 0 | 0 | 0 | 0 |
9 | 0 | 1 | 0 | 0 | 1 | 0 |
10 | 0 | 1 | 0 | 1 | 0 | 0 |
11 | 0 | 1 | 0 | 1 | 1 | 1 |
12 | 0 | 1 | 1 | 0 | 0 | 0 |
13 | 0 | 1 | 1 | 0 | 1 | 1 |
14 | 0 | 1 | 1 | 1 | 0 | 1 |
15 | 0 | 1 | 1 | 1 | 1 | 1 |
16 | 1 | 0 | 0 | 0 | 0 | 0 |
17 | 1 | 0 | 0 | 0 | 1 | 0 |
18 | 1 | 0 | 0 | 1 | 0 | 0 |
19 | 1 | 0 | 0 | 1 | 1 | 1 |
20 | 1 | 0 | 1 | 0 | 0 | 0 |
21 | 1 | 0 | 1 | 0 | 1 | 1 |
22 | 1 | 0 | 1 | 1 | 0 | 1 |
23 | 1 | 0 | 1 | 1 | 1 | 1 |
24 | 1 | 1 | 0 | 0 | 0 | 0 |
25 | 1 | 1 | 0 | 0 | 1 | 1 |
26 | 1 | 1 | 0 | 1 | 0 | 1 |
27 | 1 | 1 | 0 | 1 | 1 | 1 |
28 | 1 | 1 | 1 | 0 | 0 | 1 |
29 | 1 | 1 | 1 | 0 | 1 | 1 |
30 | 1 | 1 | 1 | 1 | 0 | 1 |
31 | 1 | 1 | 1 | 1 | 1 | 1 |
Type | Diagnostic of MetS |
---|---|
WPT | Increased Waist Circumference, Blood Pressure, and Triglycerides levels |
WPH | Increased Waist Circumference, Blood Pressure, and reduction of HDL-C levels |
WPG | Increased Waist Circumference, Blood Pressure, and Fasting Plasma Glucose levels |
WGT | Increased Waist Circumference, Fasting Plasma Glucose, and Triglycerides levels |
WGH | Increased Waist Circumference, Fasting Plasma Glucose, and decreased HDL-C levels |
WTH | Increased Waist Circumference, Triglycerides, and decreased HDL-C levels |
PGT | Increased Blood Pressure, Fasting Plasma Glucose, Triglycerides levels |
PGH | Increased Blood Pressure, Fasting Plasma Glucose, and decreased HDL-C levels |
PHT | Increased Blood Pressure, Triglycerides and decreased HDL-C levels |
GHT | Increased Fasting Plasma Glucose, Triglycerides and decreased HDL-C levels |
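The ten types in the table above are exactly the combinations of three risk factors out of five. The sketch below enumerates them and lists the types a subject satisfies. It assumes each type variable is 1 when all three of its factors are present, so a subject with four risk factors matches several types. The function name and example subject are illustrative, and the combination W, H, T is written here as `WHT`, which the paper labels WTH.

```python
from itertools import combinations

FACTORS = "WPGHT"

# All three-factor combinations, in the order W, P, G, H, T.
types = ["".join(c) for c in combinations(FACTORS, 3)]

def mets_types(factors: dict) -> list:
    """Return every three-factor type whose factors are all present (assumed definition)."""
    return [t for t in types if all(factors[f] == 1 for f in t)]

# Hypothetical subject with raised W, P, T and lowered HDL-C, but normal glucose.
subject = {"W": 1, "P": 1, "G": 0, "H": 1, "T": 1}
print(types)
print(mets_types(subject))
```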
Learner Type | Decision Tree |
---|---|
Maximum number of splits | 20 |
Number of learners | 30 |
Learning rate | 0.1 |
AROC | Discrimination Ability |
---|---|
AROC = 0.5 | No discrimination |
0.5 < AROC < 0.7 | Regular |
0.7 ≤ AROC < 0.8 | Acceptable |
0.8 ≤ AROC < 0.9 | Excellent |
AROC ≥ 0.9 | Outstanding |
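The scale above can be applied programmatically when reporting results for each MetS type. This is a small sketch of the lookup; the table does not cover values below 0.5, so treating them as "No discrimination" is an assumption, as is the function name.

```python
def discrimination(aroc: float) -> str:
    """Map an AROC value to the discrimination label of the scale above."""
    if not 0.0 <= aroc <= 1.0:
        raise ValueError("AROC must lie between 0 and 1")
    if aroc >= 0.9:
        return "Outstanding"
    if aroc >= 0.8:
        return "Excellent"
    if aroc >= 0.7:
        return "Acceptable"
    if aroc > 0.5:
        return "Regular"
    return "No discrimination"  # assumption: values below 0.5 are treated like 0.5

for value in (0.5, 0.65, 0.7, 0.85, 0.93):
    print(value, discrimination(value))
```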
Variables * | MetS m(SD) | No MetS m(SD) | Total m(SD) | p |
---|---|---|---|---|
TG | 216.94 (112.8) | 121.84 (63.05) | 160.81 (98.67) | <0.001 |
GL | 97.33 (38.82) | 84 (19.56) | 89.47 (29.74) | <0.001 |
HDL-C[W] | 38 (8.22) | 46.97 (13.54) | 43.39 (12.49) | <0.001 |
HDL-C[M] | 36.11 (11.1) | 43.32 (11.63) | 40.27 (11.93) | <0.001 |
Variables | MetS m(SD) | No MetS m(SD) | Total m(SD) | p |
---|---|---|---|---|
Age (year) | 47.62 (17.49) | 38.89 (15.96) | 42.61 (17.17) | <0.001 |
WC (cm) | 99.81 (11.33) | 87.24 (11.91) | 92.59 (13.21) | <0.001 |
HC (cm) | 105.51 (10.56) | 93.73 (12.50) | 98.75 (13.07) | <0.001 |
Weight (kg) | 79.08 (17.11) | 66.59 (13.81) | 71.71 (16.43) | <0.001 |
Height (m) | 1.64 (0.09) | 1.62 (0.09) | 1.63 (0.09) | 0.068 |
BMI (kg/m²) | 29.09 (5.31) | 25.26 (4.74) | 26.89 (5.33) | <0.001 |
WHR * | 0.94 (0.05) | 0.93 (0.09) | 0.94 (0.08) | <0.001 |
BFP (%) | 38.64 (8.46) | 30.86 (10.23) | 34.05 (10.28) | <0.001 |
SBP (mmHg) | 128.52 (18.46) | 112.91 (12.61) | 119.55 (17.19) | <0.001 |
DBP (mmHg) | 78.48 (11.13) | 71.18 (9.21) | 74.29 (10.69) | <0.001 |
Parameter | Value |
---|---|
Training Function | Levenberg–Marquardt back-propagation |
min_grad | 10 |
mu | 10 |
mu_dec | 0.1 |
mu_inc | 10 |
mu_max | 10 |
HL function | hyperbolic tangent sigmoid |
Out function | Log-sigmoid |
Types | Predicting Variables | ||||
---|---|---|---|---|---|
WPT | WC | SBP | DBP | POD | |
WPH | BMI | BFP | HC | SBP | DBP |
WPG | AGE | HC | SBP | ||
WGT | WC | ||||
WGH | BFP | HC | |||
WTH | BFP | WC | |||
PGH | SBP | POD | |||
PGT | AGE | WG | SBP | ||
PTH | HC | SBP | DBP | ||
GHT | WSR | ||||
MetSG | AGE | WC | WHR | SBP |
WPT | WPH | WPG | WGT | WGH | WTH | PGH | PGT | PTH | GHT | MetSG |
---|---|---|---|---|---|---|---|---|---|---|
4 | 4 | 3 | 2 | 2 | 2 | 2 | 3 | 3 | 2 | 4 |
Target | Predicting Variables | ||||
---|---|---|---|---|---|
WPT | WC | SBP | DBP | ||
WPH | BFP | HG | WHR | SBP | DBP |
WPG | AGE | POD | WG | SBP | |
WGT | WC | POD | |||
WGH | BFP | HC | WHR | DBP | |
WTH | BFP | WC | |||
PGH | AGE | POD | WG | SBP | |
PGT | AGE | WG | SBP | ||
PTH | BFP | HC | SBP | DBP | |
GHT | HC | POD | |||
MetSG | AGE | WC | WHR | SBP |
WPT | WPH | WPG | WGT | WGH | WTH | PGH | PGT | PTH | GHT | MetSG |
---|---|---|---|---|---|---|---|---|---|---|
3 | 4 | 4 | 2 | 4 | 2 | 4 | 3 | 4 | 2 | 4 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Barrios, M.; Jimeno, M.; Villalba, P.; Navarro, E. Framework to Diagnose the Metabolic Syndrome Types without Using a Blood Test Based on Machine Learning. Appl. Sci. 2020, 10, 8404. https://doi.org/10.3390/app10238404