Identification of Predictors of Sarcopenia in Older Adults Using Machine Learning: English Longitudinal Study of Ageing

Pavón-Pulido, Nieves; Dominguez, Ligia; Blasco-García, Jesús Damián; Veronese, Nicola; Lucas-Ochoa, Ana-María; Fernández-Villalba, Emiliano; González-Cuello, Ana-María; Barbagallo, Mario; Herrero, Maria-Trinidad

doi:10.3390/jcm13226794

Open AccessArticle

Identification of Predictors of Sarcopenia in Older Adults Using Machine Learning: English Longitudinal Study of Ageing

by

Nieves Pavón-Pulido

²

,

Ligia Dominguez

^3,4,

Jesús Damián Blasco-García

¹,

Nicola Veronese

³

,

Ana-María Lucas-Ochoa

¹,

Emiliano Fernández-Villalba

¹,

Ana-María González-Cuello

¹

,

Mario Barbagallo

³

and

Maria-Trinidad Herrero

^1,*

on behalf of GOING FWD Investigators

¹

Clinical and Experimental Neuroscience (NiCE), Institute for Aging Research, Biomedical Institute for Bio-Health Research of Murcia (IMIB-Arrixaca), School of Medicine, Campus Mare Nostrum, UniWell, University of Murcia, 30100 Murcia, Spain

²

Department of Automation, Electrical Engineering and Electronic Technology, Campus Muralla del Mar, Technical University of Cartagena, 30202 Cartagena, Murcia, Spain

³

Geriatric Unit, Department of Medicine, University of Palermo, 90100 Palermo, Italy

⁴

Faculty of Medicine and Surgery, University of Enna “Kore”, 94100 Enna, Italy

^*

Author to whom correspondence should be addressed.

J. Clin. Med. 2024, 13(22), 6794; https://doi.org/10.3390/jcm13226794

Submission received: 27 August 2024 / Revised: 31 October 2024 / Accepted: 7 November 2024 / Published: 12 November 2024

(This article belongs to the Special Issue Artificial Intelligence (AI)-Based Diagnosis in Clinical Practice)

Download

Browse Figures

Versions Notes

Abstract

:

Background: After its introduction in the ICD-10-CM in 2016, sarcopenia is a condition widely considered to be a medical disease with important consequences for the elderly. Considering its high prevalence in older adults and its detrimental effects on health, it is essential to identify its risk factors to inform targeted interventions. Methods: Taking data from wave 2 of the ELSA, using ML-based methods, this study investigates which factors are significantly associated with sarcopenia. The Minimum Redundancy Maximum Relevance algorithm has been used to allow for an optimal set of features that could predict the dependent variable. Such a feature is the input of a ML-based prediction model, trained and validated to predict the risk of developing or not developing a disease. Results: The presented methods are suitable to identify the risk of acquired sarcopenia. Age and other relevant features related with dementia and musculoskeletal conditions agree with previous knowledge about sarcopenia. The present classifier has an excellent performance since the “true positive rate” is 0.81 and the low “false positive rate” is 0.26. Conclusions: There is a high prevalence of sarcopenia in elderly people, with age and the presence of dementia and musculoskeletal conditions being strong predictors. The new proposed approach paves the path to test the prediction of the incidence of sarcopenia in older adults.

Keywords:

artificial intelligence; machine learning; decision tree; sarcopenia; English Longitudinal Study of Ageing; older adults; epidemiology; aging; cohort; prospective

1. Introduction

Sarcopenia traditionally refers to “age-related muscle loss, affecting a combination of appendicular muscle mass, muscle strength, and/or physical performance measures” [1,2]. This condition is now widely considered to be a medical disease after its introduction in the ICD-10-CM in 2016 [3]. Even though it is generally studied in geriatrics, it is still underdiagnosed, since there is no real consensus on how it should be diagnosed [4]. The prevalence of sarcopenia is still discussed. A recent meta-analysis including only community-dwelling adults aged ≥ 55 years reported a prevalence of at least 10%, but it depends on the method used for assessing sarcopenia [5]. Sarcopenia may have important consequences for older people. For example, highly suggestive evidence supports the idea that sarcopenia is associated with the risk of mortality, disability, and falls, as well as relevant outcomes for older people [6]. Therefore, sarcopenia is associated with significant health care costs [6]. Considering the high prevalence of sarcopenia in older adults, the adverse effects on health, and the significant health care costs, it is important to identify the risk factors of sarcopenia in older adults to achieve targeted interventions.

Artificial intelligence (AI) is the term employed to describe the “use of computers and technology to simulate intelligent behavior and critical thinking comparable to a human being” [7]. In this regard, AI may help with a better identification of factors associated with a medical condition (e.g., sarcopenia) using complex databases and big-data information. The studies regarding AI and sarcopenia are mainly focused on how to diagnose sarcopenia using this tool, particularly for the assessment of body composition [8]. On the contrary, only one study explored the risk factors potentially associated with sarcopenia using artificial intelligence; this should be explored further, as risk factors are usually interrelated and only the use of conventional statistical methods could complicate the detection of sarcopenia [9].

Given this background, the aim of the present study was to investigate which factors were significantly associated with sarcopenia using artificial intelligence in a large representative sample population of English older adults.

2. Materials and Methods

2.1. Study Population

This study used data from wave 2 of the English Longitudinal Study of Ageing (ELSA), a prospective and nationally representative cohort of men and women living in England [10]. Wave 2 (baseline survey) was conducted between 2004 and 2005. The ELSA was approved by the London Multicentre Research Ethics Committee (MREC/01/2/91). Written informed consent was obtained from all participants.

2.2. Sarcopenia (Dependent Variable)

Following the criteria of the Revised European Consensus on the definition and diagnosis of sarcopenia [11,12], this condition was defined as having weak handgrip strength (defined as <27 kg for men and <16 kg for women using the average value of three handgrip measurements of the dominant hand) [11] and low skeletal muscle mass (SMM), as reflected by the lower skeletal mass index (SMI). SMM was calculated based on the equation proposed by Lee and colleagues [13]: SMM = 0.244 × weight + 7.8 × height + 6.6 × sex − 0.098 × age + race − 3.3 (where female = 0 and male = 1; race = 0 [White and Hispanic], race = 1.9 [Black], and race = −1.6 [Asian]). SMM was further divided by body mass index (BMI), based on weight and height measured by a trained nurse, to create an SMI [14]. Low SMM was defined as the lowest quartile of the SMI, based on sex-stratified values [15].

2.3. Other Factors

The selection of factors (predictor variables), possibly associated with sarcopenia, was based on the literature. The following factors were included: age, sex, years of schooling, number of alcoholic drinks consumed per week, retirement age (all considered as continuous variables), CASP-19 (Control, Autonomy, Self-realization, and Pleasure) indicating quality of life [16], CES-D (Center for Epidemiologic Studies Depression Scale) indicating the presence of depressive symptoms [17], body mass index (as a continuous variable), smoking status (ever, present, never), marital status (married, widowed, other status), difficulty in activities of daily living (ADL) and in instrumental ADL, physical activity level (high, moderate, low, sedentary), and the number of comorbidities. All of the factors are extensively reported in Appendix A.

2.4. Methods and Tools Used for Selecting Features and Obtaining a Prediction Model

MATLAB 2022 (the programming and numeric computing platform) was used as the tool for carrying out all of the experiments. The Statistics and Machine Learning Toolbox 2022 was selected, since it allows for multidimensional data analysis and feature extraction to be carried out, providing dimensionality reduction and feature selection methods to identify those variables with maximum predictive power. Furthermore, the toolbox provided supervised, semi-supervised, and unsupervised algorithms based on machine learning (ML), including boosted decision trees, ready to be programmed, configured, and applied over raw data in multiple formats. The database was imported from an Excel 2022-formatted file. The following steps were carried out to select a collection of relevant features, which had been used as inputs in an ML-based prediction model: selection of relevant features, statistical tests for demonstrating that such features allow for the generation of a prediction model with at least the same accuracy than the one including all of the features, and the training and validation of the model.

2.4.1. Selection of Relevant Features

The fscmrmr MATLAB function was selected in the present study. The function implements the Minimum Redundancy Maximum Relevance (MRMR) algorithm [18], which allowed features to be ranked for classification. It found an optimal set of features that was mutually and maximally dissimilar and could represent the response variable effectively. The algorithm minimized the redundancy and maximized the relevance of a feature set to the response variable. It quantified the redundancy and relevance using the mutual information on variables pairwise, mutual information on features, and mutual information on a feature and its response. It was useful for application in a previous preprocessing stage and in the classification problems.

The goal of the MRMR algorithm was finding an optimal collection of predictors C that maximized the relevance and minimized the redundancy of C, denoted as W_C, where

V_{C} = \frac{1}{|C|} \sum_{x ϵ C} I (x, y)

(1)

W_{C} = \frac{1}{{|C|}^{2}} \sum_{x, z ϵ C} I (x, z)

(2)

I is the mutual information between two variables that measured how much the uncertainty of one variable could be reduced when knowing the other variable:

I (X, Z) = \sum_{i, j} P (X = x_{i}, Z = z_{j}) \log \frac{P (X = x_{i}, Z = z_{j})}{P (X = x_{i}) P (Z = z_{j})}

(3)

X and Z are two discrete random variables. If X and Z are independent, I is 0. If X and Z are the same random variables, they are the entropy of X. The fscmrmr MATLAB function uses the concept of mutual information to compute such a value for both categorical and continuous (numerical) variables, discretizing numerical ones into 256 bins or the number of unique values in the variable if it was less than 256. It also found optimal bivariate bins for each pair of features using the adaptive algorithm described in [19].

The fscmrmr function allowed for the missing values to be considered for ranking features.

2.4.2. Prediction Model

The database was split into two tables, training (2995 samples) and validation (1999), meaning that 40% of the samples were reserved for validation. As a previous step before training the selected model, a test was carried out to prove that training a model with only the features selected using the MRMR algorithm, compared to training it with all the features included in the database, does not lose accuracy. The MATLAB function testckfold was used for this purpose, since it statistically assesses the accuracies of two classification models by repeatedly cross-validating such models, calculating the differences in the classification loss and formulating the test statistics by combining the classification loss differences. The function returned a logical value of 0 with a specific p-value, if rejecting the null hypothesis failed. The null hypothesis was that the model with all of the predictors was, at most, as accurate as the model with the selected ones.

Finally, a ML-based model for prediction purposes was selected, in particular, RUSBoosted Trees, implemented in MATLAB using the function fitcensemble, passing the method ‘RUSBoost’ to it as a parameter. Random under-sampling boosting (RUSBoost) was especially effective at classifying imbalanced data, that is, when a class had much fewer members than the another. In these cases, the unbalance was evident, such as in the case of 4994 samples, where only 460 were individuals diagnosed with sarcopenia. Therefore, class 1 (healthy individuals) contained 4534 observations and class 2 (individuals with sarcopenia) involved 460 observations. The RUS (Random Under Sampling) algorithm took N, the number of members in the class with the fewest members in the training data, as the basic unit for sampling; then, the classes with more members were under sampled by taking only N observations of every class. In this case, there were 2 classes; then, for each weak learner in the ensemble, RUSBoost took a subset of the data with N observations from each of the 2 classes.

3. Results

Among the 9432 participants present at wave two, 3186 were excluded since they aged less than 60 years, 1157 since no information regarding weight or height was available, and 95 as no information regarding handgrip strength was present. Therefore, 4994 participants were included in the analysis. The mean age of the participants was 70.9± SD 7.7 years (range: 60–90), and 45.3% were men. The overall prevalence of sarcopenia was 9.2% (95%CI: 8.4–10.0%) with a significant higher presence of this condition in women than in men (10.2 vs. 8%, p = 0.001).

Only the nine best ranked features were selected because the model trained offered the best results in terms of accuracy with those features. Figure 1 shows the nine best features ranked using the MRMR-based algorithm.

Two training datasets, one including all of the forty-seven features and another one only including the selected nine features described in Table 1, were used as inputs when the “testckfold” function was applied. Their outputs proved the null hypothesis that the model with all of the predictors was, at most, as accurate as the model with the nine ones. In this case, the output of the function was 0, indicating the failure to reject the null hypothesis, with a p-value of 0.9918. This could be interpreted as meaning that, when using the model with the nine predictors, it did not result in loss of accuracy compared to the model with all of the predictors.

Figure 2 and Figure 3 show the results obtained after validating the model with the training and test datasets. Figure 2 shows the performance of the model when it was applied using data from the training dataset. Figure 3 offers the performance of the model when it was applied using data from the validation dataset.

Figure 4 demonstrates the ROC (Receiving Operating Characteristic) curve, which allows for the true positive rate (TPR) against the false positive rate (FPR) to be represented. The ROC curve obtained after testing the model demonstrates that the attained classifier shows a good behavior because the “true positive rate” was high (0.81) and the “false positive rate” was moderately low (0.26). Then, the model could be appropriate for applying automatic initial screening of potential patients.

4. Discussion

In this work, which included about 5000 elderly participants, we found that the prevalence of sarcopenia was high, affecting 1/10 elderly people. Using an ML-based approach, the main factors associated with sarcopenia were age and the presence of dementia. Of the other seven remaining factors included, we found that Parkinson’s Disease, congestive heart failure, diabetic eye disease, osteoporosis, disability in IADL, and arthritis were also significantly associated, although to a lesser extent, with the risk of sarcopenia.

We observed that sarcopenia is a common condition in the ELSA study, since about one person out of ten was affected. These data were in line with other epidemiological works, even if great differences were present due to the different definitions and settings in which these studies were conducted [20]. Moreover, among the several available factors present in the ELSA study, only a few were associated with a higher presence of sarcopenia. In particular, dementia (especially Alzheimer’s disease) was significantly present in sarcopenia, supported by other studies. For example, in a systematic review of 13 studies, the authors found a higher prevalence of sarcopenia in participants with dementia compared to controls [21]. This epidemiological evidence could be confirmed by several justifications, including that people having dementia were commonly more sedentary than cognitively intact counterparts [22]. Moreover, muscle mass loss has been associated with Alzheimer’s disease related to brain atrophy, indicating that brain atrophy and loss of muscle mass may co-occur [23]. In addition, our study suggests that sarcopenia could be strongly associated with some musculoskeletal conditions such as osteoporosis or arthritis. Our findings are supported by the previous literature [24,25]. It is also worth highlighting that sarcopenia was associated with the presence of congestive heart failure; this association between sarcopenia and congestive heart failure seems to start at a younger age, differently from the conditions above [26].

The use of ML-based techniques for extracting relevant features from the dataset confirmed several of the mentioned assumptions. This is a novel finding, since the data themselves include knowledge that has stayed hidden from the human eye, but it could be extracted by applying the suitable algorithms. In this work, we obtained a set of relevant features by applying the MRMR algorithm [18]. That means that each selected feature was calculated as relevant according to the data, and then it had an influence on the behavior of the response variable. In the present case, the method was capable of detecting that age was relevant, agreeing with previous knowledge about sarcopenia. In addition, the MRMR method also had relevant characteristics related to dementia and musculoskeletal conditions. We also demonstrated that it was possible to generate a model for predicting Sarcopenia by only considering the extracted features with the MRMR method. As the data were not balanced and the number of healthy individuals was considerably greater than subjects with sarcopenia, the Random Under-Sampling boost (RUSboost) selected for classifying inputs, and the results were good both for training and validation data, offering an appropriate value of AUC, obtaining a high “true positive rate” (0.81) and moderately low “false positive rate” (0.26).

Decision tree models effectively handle a wide variety of types of data, including skewed data and data requiring slight pre-processing and manipulation, which differs from traditional statistical techniques like regression. These kind of models were more transparent regarding interpretation than black-box models, and this is an advantage in the field of health, since the interpretability of the model could be necessary to comprehend the mechanism of illness. In particular, decision tree algorithms are classifier models that learn from a training dataset and generate a flowchart-like tree structure illustrating the relationships between input predictors and the target class. Any specific input is classified by starting at the root of the tree, and the output (the class) is provided depending on the values of the attributes, tracing a path down through the branches of the tree. As decision trees could be unstable, since small changes in the data could generate a completely different tree, they are particularly well-suited for ensembles. Ensemble learning allows for multiple classifiers to be used and weighted and then combined for obtaining a better classifier with better performance in comparison to individual classifiers. There are different ensemble methods, and boosting is a general method for improving the performance of a weak learner (such as a decision tree).

RUSboost demonstrated itself to be specifically designed for solving this kind of problem, particularly when it found skewed data (many more observations of one class), as occurred in the sarcopenia database. Moreover, as in all of the boost algorithms, RUSboost uses shallow trees, and this construction could be better in terms of time and the usage of computational resources. Therefore, by combining the feature extraction method that generated the more relevant features in the context of predicting sarcopenia and the RUSboost model, the results were promising, even when the dataset was very unbalanced, and it was possible to retrain the system if new training data were included in the dataset, allowing for the model to be improved if new studies were made. The present findings demonstrate that machine learning allowed for knowledge to be extracted without carrying out tedious preprocessing tasks, and it offered a robust predictor in terms of true and false positives, as the ROC curve shows in Figure 4, which is useful for applying automatic initial screenings of potential patients.

The findings of our study should be interpreted within its limitations. First, the definition of sarcopenia that was made in the ELSA study using handgrip strength and an equation derived from anthropometric measures was not validated among UK people [27]. Second, the cross-sectional nature of this study may preclude any of the casual inferences of our findings. Third, the skeletal muscle mass (SMM) was calculated based on an equation, and this could be inaccurate. Finally, we did not consider, even if they were important according to the literature, several potential risk factors for sarcopenia, including information regarding diet [28].

5. Conclusions

The present study, conducted in the context of the ELSA cohort and using a machine learning approach, suggests that the prevalence of sarcopenia in elderly people is high. In particular, age and the presence of dementia are among the strongest predictors of sarcopenia in older people, by means of new methodology and analyses, using artificial intelligence with four different algorithms. Despite these factors being already extensively studied and included regularly in standard comprehensive geriatric assessments, this new approach opens new avenues to test predictions of the incidence of sarcopenia in older adults, a condition that is highly associated with disability and mortality.

Author Contributions

N.P.-P. drafted the manuscript and contributed for review layout. L.D. contributed to the conceptualization of the design of this paper and drafted it. J.D.B.-G. contributed to the review layout and critically reviewed the manuscript. N.V. contributed to the conceptualization of the design of this paper and drafted it. A.-M.L.-O. contributed to the review layout and critically reviewed the manuscript. E.F.-V. contributed to the review layout and critically reviewed the manuscript. A.-M.G.-C. contributed to the review layout and reviewed the manuscript. M.B. contributed to the review layout and critically reviewed the manuscript. M.-T.H. contributed to the conceptualization of the design of this paper and both drafted and reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the grants from: the “Ministerio de Ciencia y Tecnología. Agencia Estatal e Investigación”, Spain, (Project Ref. Number: TED2021-130942B-C21//MICIU/AEI/10.13039/501100011033 and TED2021-130942A-C22); the “Fundación Primafrío” with the code number 39747; the COST Participatory Approaches with Older Adults (PAAR-net), with the code number CA22167, and La Caixa Foundation (ID 100010434, with code LCF/PR/DE18/52010001) as part of the GOING-FWD Consortium funded by the GENDER-NET Plus ERA-NET Initiative (Project Ref. Number: GNP-78).

Institutional Review Board Statement

The ELSA was approved by the London Multicentre Research Ethics Committee (MREC/01/2/91) on 14 March 2002.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data are contained within the article and Appendix A.

Acknowledgments

The ELSA was developed by a team of researchers based at University College London, the National Centre for Social Research, and the Institute for Fiscal Studies. The data were collected by the National Centre for Social Research. The funding was provided by the National Institute of Aging in the USA, and a consortium of UK government departments coordinated by the Office for National Statistics. The developers and funders of the ELSA and the UK Data Archive do not bear any responsibility for the analyses or interpretations presented here.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Potential predictors (independent variables) included in the database containing the ELSA study’s data.

Name	Description	Type
r2agey	Age of patient	Numerical
gender	Gender	Categorical
bmi	Body Mass Index (BMI)	Numerical
raedyrs_e	Years of education	Numerical
r2mstat	Marital status	Numerical
r2smokev	Smoke ever	Categorical
r2smoken	Smokes now	Categorical
r2adla	Some Diff-ADLs/0–5	Numerical
r2cesd	CESD score	Numerical
hedimbp	Ever reported high blood pressure (HBP) (diagnosed)	Categorical
hediman	Ever reported angina (diagnosed)	Categorical
hedimmi	Ever reported myocardial infarction (diagnosed)	Categorical
hedimhf	Ever reported congestive heart failure (diagnosed)	Categorical
hedimhm	Ever reported heart murmur (diagnosed)	Categorical
hedimar	Ever reported arrhythmia (diagnosed)	Categorical
hedimdi	Ever reported diabetes or high blood sugar (HBS) (diagnosed)	Categorical
hedbts	Ever reported diabetes (diagnosed)	Categorical
hedimst	Ever reported stroke (diagnosed)	Categorical
cvd7dihb	Ever reported any of 7 cvd-related diseases (excluding HBP)	Categorical
cvd7dbts	Ever reported any of 7 cvd-related diseases (excluding HBS and HBP)	Categorical
hediblu	Ever reported hedibonic lung disease (diagnosed)	Categorical
hedibas	Ever reported asthma (diagnosed)	Categorical
hedibar	Ever reported arthritis (diagnosed)	Categorical
hedibos	Ever reported osteoporosis (diagnosed)	Categorical
hedibca	Ever reported cancer (diagnosed)	Categorical
hedibpd	Ever reported Parkinson’s Disease (diagnosed)	Categorical
hedibps	Ever reported psychiatric disorder (diagnosed)	Categorical
hedibad	Ever reported Alzheimer’s Disease (diagnosed)	Categorical
hedibde	Ever reported dementia or memory impairment (diagnosed)	Categorical
heoptgl	Ever reported glaucoma (diagnosed)	Categorical
heoptdi	Ever reported diabetic eye disease (diagnosed)	Categorical
heoptmd	Ever reported macular degeneration (diagnosed)	Categorical
heoptca	Ever reported cataract (diagnosed)	Categorical
palevel	Physical activity summary	Categorical
raracem	Race—masked	Categorical
rarelig_e	Religion	Categorical
r2iadlza	r2iadlza:w2 R Some Diff-IADLs:/0–5	Categorical
r2drinkd_e	Days/week drinks	Numerical
s2readrc	Word recall list read by	Numerical
s2imrc	Immediate word recall	Numerical
s2dlrc	Delayed word recall	Numerical
s2orient	Cognition orient (summary date naming)	Numerical
s2verbf	Verbal fluency score	Numerical
s2tr20	Recall summary score	Numerical
s2prmt1	Prospective memory task 1	Numerical
r2retage	Retirement age	Numerical
CASP2	CASP-19 quality of life scale	Numerical

References

Woo, J. Sarcopenia. Clin. Geriatr. Med. 2017, 33, 305–314. [Google Scholar] [CrossRef] [PubMed]
Ben Kirk, B.; Cawthon, P.M.; Arai, H.; A Ávila-Funes, J.; Barazzoni, R.; Bhasin, S.; Binder, E.F.; Bruyere, O.; Cederholm, T.; Chen, L.-K.; et al. The Conceptual Definition of Sarcopenia: Delphi Consensus from the Global Leadership Initiative in Sarcopenia (GLIS). Age Ageing 2024, 53, afae052. [Google Scholar] [CrossRef] [PubMed]
Falcon, L.J.; Harris-Love, M.O. Sarcopenia and the new ICD-10-CM code: Screening, staging, and diagnosis considerations. Fed. Pract. 2017, 34, 24. [Google Scholar] [PubMed]
Evans, W.J.; Guralnik, J.; Cawthon, P.; Appleby, J.; Landi, F.; Clarke, L.; Vellas, B.; Roubenoff, L.F.R. Sarcopenia: No consensus, no diagnostic criteria, and no approved indication-How did we get here? Geroscience 2024, 46, 183–190. [Google Scholar] [CrossRef] [PubMed]
Mayhew, A.J.; Amog, K.; Phillips, S.; Parise, G.; McNicholas, P.D.; de Souza, R.J.; Thabane, L.; Raina, P. The prevalence of sarcopenia in community-dwelling older adults, an exploration of differences between studies and within definitions: A systematic review and meta-analyses. Age Ageing 2019, 48, 48–56. [Google Scholar] [CrossRef]
Veronese, N.; Demurtas, J.; Soysal, P.; Smith, L.; Torbahn, G.; Schoene, D.; Schwingshackl, L.; Sieber, C.; Bauer, J.; Cesari, M.; et al. Sarcopenia and health-related outcomes: An umbrella review of observational studies. Eur. Geriatr. Med. 2019, 10, 853–862. [Google Scholar] [CrossRef]
Amisha, P.M.; Pathania, M.; Rathaur, V.K. Overview of artificial intelligence in medicine. J. Fam. Med. Prim. Care 2019, 8, 2328. [Google Scholar] [CrossRef]
Rozynek, M.; Kucybała, I.; Urbanik, A.; Wojciechowski, W. The use of artificial intelligence in the imaging of sarcopenia: A narrative review of current status and perspectives. Nutrition 2021, 89, 111227. [Google Scholar] [CrossRef]
Kang, Y.-J.; Yoo, J.-I.; Ha, Y.-C. Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study. Medicine 2019, 98, e17699. [Google Scholar] [CrossRef]
Steptoe, A.; Breeze, E.; Banks, J.; Nazroo, J. Cohort profile: The English longitudinal study of ageing. Int. J. Epidemiol. 2013, 42, 1640–1648. [Google Scholar] [CrossRef]
Cruz-Jentoft, A.J.; Bahat, G.; Bauer, J.; Boirie, Y.; Bruyère, O.; Cederholm, T.; Cooper, C.; Landi, F.; Rolland, Y.; Sayer, A.A.; et al. Sarcopenia: Revised European consensus on definition and diagnosis. Age Ageing 2019, 48, 16–31. [Google Scholar] [CrossRef] [PubMed]
Veronese, N.; Smith, L.; Cereda, E.; Maggi, S.; Barbagallo, M.; Dominguez, L.J.; Koyanagi, A. Multimorbidity increases the risk for sarcopenia onset: Longitudinal analyses from the English Longitudinal Study of Ageing. Exp. Gerontol. 2021, 156, 111624. [Google Scholar] [CrossRef] [PubMed]
Lee, R.C.; Wang, Z.; Heo, M.; Ross, R.; Janssen, I.; Heymsfield, S.B. Total-body skeletal muscle mass: Development and cross-validation of anthropometric prediction models. Am. J. Clin. Nutr. 2000, 72, 796–803. [Google Scholar] [CrossRef] [PubMed]
Studenski, S.A.; Peters, K.W.; Alley, D.E.; Cawthon, P.M.; McLean, R.R.; Harris, T.B.; Ferrucci, L.; Guralnik, J.M.; Fragala, M.S.; Kenny, A.M.; et al. The FNIH sarcopenia project: Rationale, study description, conference recommendations, and final estimates. J. Gerontol. A Biol. Sci. Med. Sci. 2014, 69, 547–558. [Google Scholar] [CrossRef]
Tyrovolas, S.; Koyanagi, A.; Olaya, B.; Ayuso-Mateos, J.L.; Miret, M.; Chatterji, S.; Tobiasz-Adamczyk, B.; Koskinen, S.; Leonardi, M.; Haro, J.M. Factors associated with skeletal muscle mass, sarcopenia, and sarcopenic obesity in older adults: A multi-continent study. J. Cachexia Sarcopenia Muscle 2016, 7, 312–321. [Google Scholar] [CrossRef]
Hyde, M.; Wiggins, R.D.; Higgs, P.; Blane, D.B. A measure of quality of life in early old age: The theory, development and properties of a needs satisfaction model (CASP-19). Aging Ment. Health 2003, 7, 186–194. [Google Scholar] [CrossRef]
Cosco, T.D.; Lachance, C.C.; Blodgett, J.M.; Stubbs, B.; Co, M.; Veronese, N.; Wu, Y.-T.; Prina, A.M. Latent structure of the Centre for Epidemiologic Studies Depression Scale (CES-D) in older adult populations: A systematic review. Aging Ment. Health 2020, 24, 700–704. [Google Scholar] [CrossRef]
Ding, C.; Peng, H. Minimum redundancy feature selection from microarray gene expression data. J. Bioinform. Comput. Biol. 2005, 3, 185–205. [Google Scholar] [CrossRef]
Darbellay, G.A.; Vajda, I. Estimation of the information by an adaptive partitioning of the observation space. IEEE Trans. Inf. Theory 1999, 45, 1315–1321. [Google Scholar] [CrossRef]
Shafiee, G.; Keshtkar, A.; Soltani, A.; Ahadi, Z.; Larijani, B.; Heshmat, R. Prevalence of sarcopenia in the world: A systematic review and meta-analysis of general population studies. J. Diabetes Metab. Disord. 2017, 16, 21. [Google Scholar] [CrossRef]
Waite, S.J.; Maitland, S.; Thomas, A.; Yarnall, A.J. Sarcopenia and frailty in individuals with dementia: A systematic review. Arch. Gerontol. Geriatr. 2021, 92, 104268. [Google Scholar] [CrossRef] [PubMed]
Hartman, Y.A.; Karssemeijer, E.G.; van Diepen, L.A.; Rikkert, M.G.O.; Thijssen, D.H. Dementia patients are more sedentary and less physically active than age-and sex-matched cognitively healthy older adults. Dement. Geriatr. Cogn. Disord. 2018, 46, 81–89. [Google Scholar] [CrossRef] [PubMed]
Burns, J.M.; Johnson, D.K.; Watts, A.; Swerdlow, R.H.; Brooks, W.M. Reduced lean mass in early Alzheimer disease and its association with brain atrophy. Arch. Neurol. 2010, 67, 428–433. [Google Scholar] [CrossRef] [PubMed]
Reginster, J.-Y.; Beaudart, C.; Buckinx, F.; Bruyère, O. Osteoporosis and sarcopenia: Two diseases or one? Curr. Opin. Clin. Nutr. Metab. Care 2016, 19, 31. [Google Scholar] [CrossRef]
Pickering, M.-E.; Chapurlat, R. Where two common conditions of aging meet: Osteoarthritis and sarcopenia. Calcif. Tissue Int. 2020, 107, 203–211. [Google Scholar] [CrossRef]
Springer, J.; Springer, J.I.; Anker, S.D. Muscle wasting and sarcopenia in heart failure and beyond: Update 2017. ESC Heart Fail. 2017, 4, 492–498. [Google Scholar] [CrossRef]
Veronese, N.; Koyanagi, A.; Cereda, E.; Maggi, S.; Barbagallo, M.; Dominguez, L.J.; Smith, L. Sarcopenia reduces quality of life in the long-term: Longitudinal analyses from the English longitudinal study of ageing. Eur. Geriatr. Med. 2022, 13, 633–639. [Google Scholar] [CrossRef]
Sieber, C.C. Malnutrition and sarcopenia. Aging Clin. Exp. Res. 2019, 31, 793–798. [Google Scholar] [CrossRef]

Figure 1. List of the nine best features ranked using the MRMR-based method.

Figure 2. Confusion matrix for the training dataset.

Figure 3. Confusion matrix for the test dataset.

Figure 4. ROC curve.

Table 1. List of ranked features with their descriptions and types.

Name	Description	Type
Age (r2agey)	Age of patient	Numerical
Alzheimer’s disease (hedibad)	Ever reported Alzheimer’s Disease (diagnosed)	Categorical
Parkinson disease (hedibpd)	Ever reported Parkinson’s Disease (diagnosed)	Categorical
Congestive heart failure (hedimhf)	Ever reported congestive heart failure (diagnosed)	Categorical
Dementia (hedibde)	Ever reported dementia or memory impairment (diagnosed)	Categorical
Diabetes (heoptdi)	Ever reported diabetic eye disease (diagnosed)	Categorical
Osteoporosis (hedibos)	Ever reported osteoporosis (diagnosed)	Categorical
Diff-IADL (r2iadlza)	r2iadlza:w2 R Some Diff-IADLs:/0–5	Categorical
Arthritis (hedibar)	Ever reported arthritis (diagnosed)	Categorical

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Pavón-Pulido, N.; Dominguez, L.; Blasco-García, J.D.; Veronese, N.; Lucas-Ochoa, A.-M.; Fernández-Villalba, E.; González-Cuello, A.-M.; Barbagallo, M.; Herrero, M.-T., on behalf of GOING FWD Investigators. Identification of Predictors of Sarcopenia in Older Adults Using Machine Learning: English Longitudinal Study of Ageing. J. Clin. Med. 2024, 13, 6794. https://doi.org/10.3390/jcm13226794

AMA Style

Pavón-Pulido N, Dominguez L, Blasco-García JD, Veronese N, Lucas-Ochoa A-M, Fernández-Villalba E, González-Cuello A-M, Barbagallo M, Herrero M-T on behalf of GOING FWD Investigators. Identification of Predictors of Sarcopenia in Older Adults Using Machine Learning: English Longitudinal Study of Ageing. Journal of Clinical Medicine. 2024; 13(22):6794. https://doi.org/10.3390/jcm13226794

Chicago/Turabian Style

Pavón-Pulido, Nieves, Ligia Dominguez, Jesús Damián Blasco-García, Nicola Veronese, Ana-María Lucas-Ochoa, Emiliano Fernández-Villalba, Ana-María González-Cuello, Mario Barbagallo, and Maria-Trinidad Herrero on behalf of GOING FWD Investigators. 2024. "Identification of Predictors of Sarcopenia in Older Adults Using Machine Learning: English Longitudinal Study of Ageing" Journal of Clinical Medicine 13, no. 22: 6794. https://doi.org/10.3390/jcm13226794

APA Style

Pavón-Pulido, N., Dominguez, L., Blasco-García, J. D., Veronese, N., Lucas-Ochoa, A. -M., Fernández-Villalba, E., González-Cuello, A. -M., Barbagallo, M., & Herrero, M. -T., on behalf of GOING FWD Investigators. (2024). Identification of Predictors of Sarcopenia in Older Adults Using Machine Learning: English Longitudinal Study of Ageing. Journal of Clinical Medicine, 13(22), 6794. https://doi.org/10.3390/jcm13226794

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Identification of Predictors of Sarcopenia in Older Adults Using Machine Learning: English Longitudinal Study of Ageing

Abstract

1. Introduction

2. Materials and Methods

2.1. Study Population

2.2. Sarcopenia (Dependent Variable)

2.3. Other Factors

2.4. Methods and Tools Used for Selecting Features and Obtaining a Prediction Model

2.4.1. Selection of Relevant Features

2.4.2. Prediction Model

3. Results

4. Discussion

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI