Determining Risk Factors Associated with Depression and Anxiety in Young Lung Cancer Patients: A Novel Optimization Algorithm
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Study Database
2.2. Ethics Statement
2.3. Study Population and Possible Risk Factors Selection
2.4. Combining Multiple Correspondence Analysis and the K-Means Clustering Algorithm with v-Fold Cross-Validation (MCA–k-Means Clustering Algorithm)
2.4.1. Step 1. Multiple Correspondence Analysis
- (1) Transform the raw data matrix into a Burt matrix:
  - If a categorical variable is binary, place it in the Burt matrix as an original variable matrix.
  - If a categorical variable has more than two levels (i.e., $J_k > 2$ levels), convert it into index variables (containing only 0 and 1); this forms an $I \times J_k$ indicator matrix in which each column is an index variable coded 0 or 1.
  - Place all index variable columns together to form the indicator matrix $X_{I \times J}$.
  - Calculate the Burt matrix as $(X_{I \times J})' \cdot X_{I \times J}$.
- (2) Calculate the column and row coordinates as follows:
  - The grand total of $M_{I \times K}$, denoted $N$, is computed, and the probability matrix is defined as $P = N^{-1}X$.
  - Define $r$ as the vector of the row totals of $P$ (i.e., $r = P\mathbf{1}$, where $\mathbf{1}$ is a vector of ones) and define $c$ as the vector of the column totals of $P$. Then, $D_c = \mathrm{diag}\{c\}$ and $D_r = \mathrm{diag}\{r\}$.
  - Calculate the Euclidean coordinates by using a singular value decomposition method as follows:
- (3) The number of dimensions is determined using an inertia value as follows:
  - The inertia value is calculated based on a Pearson chi-squared ($\chi^2$) value from the rows and columns to identify their coordinate centers as follows:
  - If a subset of $F$ or $G$ is selected, then the inertia values for the row and column coordinates are calculated as:
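The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: it assumes the standard correspondence-analysis formulation (a singular value decomposition of the standardized residuals $D_r^{-1/2}(P - rc')D_c^{-1/2}$, with the squared singular values as the inertia of each dimension), it applies the decomposition to the indicator matrix, and the function names `indicator_matrix` and `correspondence_coordinates` as well as the toy data are hypothetical.

```python
import numpy as np

def indicator_matrix(columns):
    """One-hot encode categorical columns into an I x J indicator matrix."""
    blocks, labels = [], []
    for name, values in columns.items():
        levels = sorted(set(values))
        # One 0/1 column per level of the categorical variable.
        block = np.array([[1.0 if v == lev else 0.0 for lev in levels]
                          for v in values])
        blocks.append(block)
        labels.extend(f"{name}={lev}" for lev in levels)
    return np.hstack(blocks), labels

def correspondence_coordinates(X):
    """Row/column coordinates and per-dimension inertia (standard CA, assumed)."""
    N = X.sum()                    # grand total
    P = X / N                      # probability matrix P = N^{-1} X
    r = P.sum(axis=1)              # row totals
    c = P.sum(axis=0)              # column totals
    # Standardized residuals D_r^{-1/2} (P - r c') D_c^{-1/2}.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, delta, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U * delta) / np.sqrt(r)[:, None]      # row coordinates
    G = (Vt.T * delta) / np.sqrt(c)[:, None]   # column (level) coordinates
    inertia = delta ** 2                       # inertia per dimension
    return F, G, inertia

# Toy data (hypothetical, not taken from the study database).
columns = {
    "sex": ["F", "M", "F", "M", "F", "M"],
    "cci": ["0", "0", "1", "2", "0", "1"],
    "depression": ["no", "no", "yes", "no", "no", "yes"],
}
X, labels = indicator_matrix(columns)
burt = X.T @ X                     # Burt matrix X'X
F, G, inertia = correspondence_coordinates(X)
```

In this sketch the rows of `G` are the coordinates of the individual levels (for example `depression=yes`), which is the kind of input the clustering step operates on. For an indicator matrix the total inertia equals the number of levels divided by the number of variables, minus one, which gives a quick sanity check.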
2.4.2. Step 2. K-Means Clustering with v-Fold Cross-Validation
- (1) Determination of the range of the number of clusters for the k-means clustering algorithm: in this study, the number was set from k = 2 to n, where n ≤ 10;
- (2) Determination of the initial cluster centers: the initial cluster centers were selected at random;
- (3) Iteration scheme: all index variables were assigned to their nearest cluster centers, with the Euclidean distance used as the distance measure in the iterative classification scheme;
- (4) Determination of the optimal clustering: v-fold cross-validation was applied to estimate the optimal number of clusters and the optimal clustering. The details of the v-fold cross-validation are as follows:
  - (a) Divide $F$ or $G$ into v folds (denoted $F_i$ or $G_i$, i = 1, …, v); in this study, we set v = 5;
  - (b) For i = 1 to v, take $F_i$ or $G_i$ as the testing set and $\{F\} \setminus F_i$ or $\{G\} \setminus G_i$ as the training set;
  - (c) Compute the mean Euclidean distances (called the clustering costs in this study) within each cluster of the training sets, set these as the new cluster centers, and replace the cluster centers of the previous step;
  - (d) Compute the mean Euclidean distance of each index variable (i.e., each level of the categorical variables) of the testing set from the new cluster centers derived from the training sets;
- (5) Iterate from (1);
- (6) If k = j yields the minimum mean Euclidean distance (i.e., the minimal clustering cost) for the index variables of the testing set, then j is the optimum number of clusters;
- (7) Clustering stopping rule: If || < 0.01, then stop further dividing and clustering.
- (8) To determine the number of clusters, we adopted the method proposed by Wang [25]: the optimization algorithm iterates to classify the factors into different numbers of clusters, calculates the cluster cost (in this study, the mean sum of squares within clusters), and compares the sums of squares between clusters. If the sum of squares for k clusters did not differ significantly from that for k + 1 clusters, the optimal number of clusters was determined as k.
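The cross-validated clustering described above can be sketched as follows. This is an illustrative reading, not the authors' code: it uses a plain Lloyd-style k-means with random initial centers and Euclidean distance, takes the mean distance of held-out points to their nearest training center as the clustering cost, and selects k by the minimal cost only (item 6). The stopping rule in item (7) and the significance comparison in item (8) are not reproduced. The function names `kmeans`, `cv_cost`, and `choose_k`, the seeds, and the synthetic points are all hypothetical.

```python
import numpy as np

def kmeans(points, k, rng, n_iter=100):
    """Lloyd-style k-means: random initial centers, Euclidean distance."""
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            points[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers

def cv_cost(points, k, v=5, seed=0):
    """Mean held-out distance to the nearest training center over v folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(points)), v)
    costs = []
    for i in range(v):
        test = points[folds[i]]
        train = points[np.concatenate([folds[j] for j in range(v) if j != i])]
        centers = kmeans(train, k, rng)
        d = np.linalg.norm(test[:, None, :] - centers[None, :, :], axis=2)
        costs.append(d.min(axis=1).mean())
    return float(np.mean(costs))

def choose_k(points, k_max=10, v=5, seed=0):
    """Evaluate k = 2..k_max and return the k with minimal clustering cost."""
    costs = {k: cv_cost(points, k, v, seed) for k in range(2, k_max + 1)}
    best = min(costs, key=costs.get)
    return best, costs

# Synthetic 2-D coordinates standing in for the MCA level coordinates.
rng = np.random.default_rng(42)
blobs = np.vstack([rng.normal(loc=m, scale=0.1, size=(10, 2))
                   for m in (0.0, 3.0, 6.0)])
best_k, costs = choose_k(blobs, k_max=6, v=5)
```

Because the held-out cost tends to keep falling as k grows, selecting the raw minimum can favor large k; a rule such as the comparison between k and k + 1 clusters in item (8) is what keeps the chosen number of clusters small.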
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Cheng, T.-Y.D.; Cramb, S.M.; Baade, P.D.; Youlden, D.R.; Nwogu, C.; Reid, M.E. The international epidemiology of lung cancer: Latest trends, disparities, and tumor characteristics. J. Thorac. Oncol. 2016, 11, 1653–1671.
2. Cho, J.H.; Zhou, W.; Choi, Y.-L.; Sun, J.-M.; Choi, H.; Kim, T.-E.; Dolled-Filhart, M.; Emancipator, K.; Rutkowski, M.A.; Kim, J. Retrospective molecular epidemiology study of PD-L1 expression in patients with EGFR-Mutant non-small cell lung cancer. Cancer Res. Treat. 2018, 50, 95–102.
3. Christiani, D.C. Smoking and the molecular epidemiology of lung cancer. Clin. Chest Med. 2000, 21, 87–93.
4. Ha, S.Y.; Choi, S.-J.; Cho, J.H.; Choi, H.J.; Lee, J.; Jung, K.; Irwin, D.; Liu, X.; Lira, M.E.; Mao, M.; et al. Lung cancer in never-smoker Asian females is driven by oncogenic mutations, most often involving EGFR. Oncotarget 2015, 6, 5465–5474.
5. Bray, F.; Ferlay, J.; Soerjomataram, I.; Siegel, R.L.; Torre, L.A.; Jemal, A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2018, 68, 394–424.
6. Enstone, A.; Greaney, M.; Povsic, M.; Wyn, R.; Penrod, J.R.; Yuan, Y. The economic burden of small cell lung cancer: A systematic review of the literature. Pharm. Open 2017, 2, 139–152.
7. De Groot, P.M.; Wu, C.C.; Carter, B.W.; Munden, R.F. The epidemiology of lung cancer. Transl. Lung Cancer Res. 2018, 7, 220–233.
8. van der Meer, D.J.; Karim-Kos, H.E.; van der Mark, M.; Aben, K.K.H.; Bijlsma, R.M.; Rijneveld, A.W.; van der Graaf, W.T.A.; Husson, O. Incidence, survival, and mortality trends of cancers diagnosed in adolescents and young adults (15–39 Years): A population-based study in The Netherlands 1990–2016. Cancers 2020, 18, 3421.
9. Sacco, P.C.; Maione, P.; Guida, C.; Gridelli, C. The combination of new immunotherapy and radiotherapy: A new potential treatment for locally advanced non-small cell lung cancer. Curr. Clin. Pharmacol. 2017, 12, 4–10.
10. Hirsh, V. New developments in the treatment of advanced squamous cell lung cancer: Focus on afatinib. OncoTargets Ther. 2017, 10, 2513–2526.
11. Wang, H.; Zhang, J.; Shi, F.; Zhang, C.; Jiao, Q.; Zhu, H. Better cancer specific survival in young small cell lung cancer patients especially with AJCC stage III. Oncotarget 2017, 8, 34923–34934.
12. Arnold, B.N.; Thomas, D.C.; Rosen, J.E.; Salazar, M.C.; Blasberg, J.D.; Boffa, D.J.; Detterbeck, F.C.; Kim, A.W. Lung cancer in the very young: Treatment and survival in the national cancer data base. J. Thorac. Oncol. 2016, 11, 1121–1131.
13. Liu, M.; Cai, X.; Yu, W.; Lv, C.; Fu, X. Clinical significance of age at diagnosis among young non-small cell lung cancer patients under 40 years old: A population-based study. Oncotarget 2015, 6, 44963–44970.
14. Liu, B.; Quan, X.; Xu, C.; Lv, J.; Li, C.; Dong, L.; Liu, M. Lung cancer in young adults aged 35 years or younger: A full-scale analysis and review. J. Cancer 2019, 10, 3553–3559.
15. Rich, A.L.; Khakwani, A.; Free, C.M.; Tata, L.J.; Stanley, R.A.; Peake, M.D.; Hubbard, R.B.; Baldwin, D.R. Non-small cell lung cancer in young adults: Presentation and survival in the English National Lung Cancer Audit. QJM Int. J. Med. 2015, 108, 891–897.
16. Arrieta, Ó.; Angulo, L.P.; Núñez-Valencia, C.; Dorantes-Gallareta, Y.; Macedo, E.O.; Martínez-López, D.; Alvarado, S.; Corona-Cruz, J.-F.; Oñate-Ocaña, L.F. Association of depression and anxiety on quality of life, treatment adherence, and prognosis in patients with advanced non-small cell lung cancer. Ann. Surg. Oncol. 2012, 20, 1941–1948.
17. Yan, X.; Chen, X.; Li, M.; Zhang, P. Prevalence and risk factors of anxiety and depression in Chinese patients with lung cancer: A cross-sectional study. Cancer Manag. Res. 2019, 11, 4347–4356.
18. Johnson, C.G.; Brodsky, J.L.; Cataldo, J.K. Lung cancer stigma, anxiety, depression, and quality of life. J. Psychosoc. Oncol. 2014, 32, 59–73.
19. Park, S.; Kang, C.H.; Hwang, Y.; Seong, Y.W.; Lee, H.J.; Park, I.K.; Kim, Y.T. Risk factors for postoperative anxiety and depression after surgical treatment for lung cancer. Eur. J. Cardiothorac. Surg. 2016, 49, e16–e21.
20. Ting, C.-T.; Kuo, C.-J.; Hu, H.-Y.; Lee, Y.-L.; Tsai, T.-H. Prescription frequency and patterns of Chinese herbal medicine for liver cancer patients in Taiwan: A cross-sectional analysis of the National Health Insurance Research Database. BMC Complement. Altern. Med. 2017, 17, 1–11.
21. Jung, J.Y.; Lee, J.M.; Kim, M.S.; Shim, Y.M.; Zo, J.I.; Yun, Y.H. Comparison of fatigue, depression, and anxiety as factors affecting posttreatment health-related quality of life in lung cancer survivors. Psych. Oncol. 2018, 27, 465–470.
22. Ambrogi, F.; Biganzoli, E.; Boracchi, P. Multiple correspondence analysis in S-PLUS. Comput. Methods Programs Biomed. 2005, 79, 161–167.
23. Shrivastav, M.; Iaizzo, P. Discrimination of ischemia and normal sinus rhythm for cardiac signals using a modified k means clustering algorithm. In Proceedings of the 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Lyon, France, 22–26 August 2007; Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA, 2007; Volume 2007, pp. 3856–3859.
24. Saatchi, M.; McClure, M.C.; McKay, S.D.; Rolf, M.M.; Kim, J.; Decker, J.E.; Taxis, T.M.; Chapple, R.H.; Ramey, H.R.; Northcutt, S.L.; et al. Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation. Genet. Sel. Evol. (GSE) 2011, 43, 40.
25. Wang, J. Consistent selection of the number of clusters via cross validation. Biometrika 2010, 97, 893–904.
26. Hagen, K.B.; Aas, T.; Kvaloy, J.T.; Eriksen, H.R.; Soiland, H.; Lind, R. Fatigue, anxiety and depression overrule the role of oncological treatment in predicting self-reported health complaints in women with breast cancer compared to healthy controls. Breast 2016, 28, 100–106.
27. Gorman, J.R.; Su, H.I.; Roberts, S.C.; Dominick, S.A.; Malcarne, V.L. Experiencing reproductive concerns as a female cancer survivor is associated with depression. Cancer 2015, 121, 935–942.
28. King, G.; Zeng, L. Logistic Regression in Rare Events Data. Politi. Anal. 2001, 9, 137–163.
29. Westphal, C. Logistic regression for extremely rare events: The case of school shootings. SSRN Electron. J. 2013.
30. Nations, J.A.; Nathan, S.D. Comorbidities of Advanced Lung Disease. Mt. Sinai J. Med. A J. Transl. Pers. Med. 2009, 76, 53–62.
31. Sculier, J.P.; Botta, I.; Bucalau, A.M.; Compagnie, M.; Eskenazi, A.; Fischler, R.; Gorham, J.; Mans, L.; Rozen, L.; Speybrouck, S.; et al. Medical anticancer treatment of lung cancer associated with comorbidities: A review. Lung Cancer 2015, 87, 241–248.
32. Paal, B.V. A Comparison of Different Methods for Modelling Rare Events Data; Universiteit Gent: Brussel, Belgium, 2014.
33. Seib, C.; Porter-Steele, J.; Ng, S.K.; Turner, J.; McGuire, A.; McDonald, N.; Balaam, S.; Yates, P.; McCarthy, A.; Anderson, D. Life stress and symptoms of anxiety and depression in women after cancer: The mediating effect of stress appraisal and coping. Psychooncology 2018, 27, 1787–1794.
34. Hong-Jhe, C.; Chin-Yuan, K.; Ming-Shium, T.; Fu-Wei, W.; Ru-Yih, C.; Kuang-Chieh, H.; Hsiang-Ju, P.; Ming-Yueh, C.; Pan-Ming, C.; Chih-Chuan, P. The incidence and risk of osteoporosis in patients with anxiety disorder: A Population-based retrospective cohort study. Medicine 2016, 95, e4912.
35. Yeh, M.J.; Chang, H.H. National health insurance in Taiwan. Health Aff. 2015, 34, 1067.
36. Shi, Q.; Li, K.J.; Treuer, T.; Wang, B.C.M.; Gaich, C.L.; Lee, C.H.; Wu, W.S.; Furnback, W.; Tang, C.H. Estimating the response and economic burden of rheumatoid arthritis patients treated with biologic disease-modifying antirheumatic drugs in Taiwan using the National Health Insurance Research Database (NHIRD). PLoS ONE 2018, 13, e0193489.
37. Lu, T.; Yang, X.; Huang, Y.; Zhao, M.; Li, M.; Ma, K.; Yin, J.; Zhan, C.; Wang, Q. Trends in the incidence, treatment, and survival of patients with lung cancer in the last four decades. Cancer Manag. Res. 2019, 11, 943–953.
Variable | n | (%) |
---|---|---|
Sex | ||
Female | 502 | 49.1 |
Male | 520 | 50.9 |
Age | ||
20–29 y | 154 | 15.1 |
30–39 y | 868 | 84.9 |
Charlson comorbidity index (CCI) | ||
CCI = 0 | 870 | 85.1 |
CCI = 1 | 91 | 8.9 |
CCI ≥ 2 | 61 | 6.0 |
Diabetes mellitus (DM) | ||
Yes | 23 | 2.3 |
No | 999 | 97.7 |
Hypertension | ||
Yes | 23 | 2.3 |
No | 999 | 97.7 |
Asthma | ||
Yes | 16 | 1.6 |
No | 1006 | 98.4 |
Liver cirrhosis | ||
Yes | 9 | 0.9 |
No | 1013 | 99.1 |
Chronic obstructive pulmonary disease (COPD) | ||
Yes | 51 | 5.0 |
No | 971 | 95.0 |
Autoimmune diseases | ||
Yes | 8 | 0.8 |
No | 1014 | 99.2 |
Cerebral diseases | ||
Yes | 11 | 1.1 |
No | 1011 | 98.9 |
Heart failure | ||
Yes | 2 | 0.2 |
No | 1020 | 99.8 |
Hepatitis B virus (HBV) | ||
Yes | 34 | 3.3 |
No | 988 | 96.7 |
Renal diseases | ||
Yes | 6 | 0.6 |
No | 1016 | 99.4 |
Osteoporosis | ||
Yes | 16 | 1.6 |
No | 1006 | 98.4 |
Depression | ||
Yes | 25 | 2.4 |
No | 997 | 97.6 |
Anxiety | ||
Yes | 15 | 1.5 |
No | 1007 | 98.5 |
Variable | Final Classification |
---|---|
Autoimmune disease = Yes | 1 |
Cerebral disease = Yes | 1 |
Heart failure = Yes | 1 |
Osteoporosis = Yes | 2 |
Anxiety = Yes | 2 |
Depression = Yes | 3 |
DM = No | 3 |
Age: 20–29 y | 3 |
Age: 30–39 y | 3 |
CCI = 0 | 3 |
Sex = Female | 3 |
DM = Yes | 4 |
Hypertension = Yes | 4 |
Asthma = Yes | 4 |
Liver cirrhosis = Yes | 4 |
COPD = Yes | 4 |
HBV = Yes | 4 |
CCI ≥ 2 | 4 |
Depression = No | 5 |
Hypertension = No | 5 |
Asthma = No | 5 |
Liver cirrhosis = No | 5 |
COPD = No | 5 |
Autoimmune disease = No | 5 |
Cerebral disease = No | 5 |
Heart failure = No | 5 |
HBV = No | 5 |
Osteoporosis = No | 5 |
Anxiety = No | 5 |
CCI = 1 | 5 |
Sex = Male | 5 |
(a)

Variable | Score (DV = Depression) | p-Value (DV = Depression) | Score (DV = Anxiety) | p-Value (DV = Anxiety) |
---|---|---|---|---|
Sex: Male vs. Female | 0.485 | 0.486 | 0.108 | 0.742 |
Age: 30–39 vs. 20–29 years | 0.017 | 0.895 | 0.036 | 0.85 |
CCI = 1 vs. CCI = 0 | 2.505 | 0.113 | 0.368 | 0.544 |
CCI ≥ 2 vs. CCI = 0 | 1.661 | 0.197 | 0.966 | 0.326 |
DM: Yes vs. No | 0.59 | 0.442 | 0.35 | 0.554 |
Hypertension: Yes vs. No | 0.357 | 0.55 | 0.35 | 0.554 |
Asthma: Yes vs. No | 0.408 | 0.523 | 2.571 | 0.109 |
Liver cirrhosis: Yes vs. No | 0.228 | 0.633 | 0.135 | 0.713 |
COPD: Yes vs. No | 0.053 | 0.818 | 0.09 | 0.764 |
Autoimmune: Yes vs. No | 0.202 | 0.653 | 0.12 | 0.729 |
Cerebral diseases: Yes vs. No | 0.279 | 0.597 | 0.166 | 0.684 |
Heart failure: Yes vs. No | 0.05 | 0.823 | 0.03 | 0.863 |
HBV: Yes vs. No | 1.74 | 0.187 | 0.524 | 0.469 |
Renal diseases: Yes vs. No | 0.151 | 0.697 | 0.09 | 0.764 |
Osteoporosis: Yes vs. No | 0.408 | 0.523 | 2.571 | 0.109 |
(b)

Variable | Beta (DV = Depression) | S.E. (DV = Depression) | Odds Ratio (OR) (DV = Depression) | p-Value (DV = Depression) | Beta (DV = Anxiety) | S.E. (DV = Anxiety) | Odds Ratio (OR) (DV = Anxiety) | p-Value (DV = Anxiety) |
---|---|---|---|---|---|---|---|---|
Sex: Male vs. Female | −0.352 | 0.418 | 0.703 | 0.399 | −0.122 | 0.530 | 0.885 | 0.818 |
Age: 30–39 vs. 20–29 years | −0.010 | 0.561 | 0.990 | 0.986 | 0.038 | 0.781 | 1.038 | 0.961 |
CCI = 1 vs. CCI = 0 | −17.329 | 4055.844 | <0.001 | 0.997 | 0.302 | 0.872 | 1.353 | 0.729 |
CCI ≥ 2 vs. CCI = 0 | 1.377 | 0.829 | 3.964 | 0.097 | −15.014 | 4327.707 | <0.001 | 0.997 |
DM: Yes vs. No | −18.980 | 7844.428 | <0.001 | 0.998 | −14.048 | 6665.142 | <0.001 | 0.998 |
Hypertension: Yes vs. No | 1.217 | 1.092 | 3.377 | 0.265 | −16.095 | 7460.922 | <0.001 | 0.998 |
Asthma: Yes vs. No | −17.069 | 8269.606 | <0.001 | 0.998 | 18.338 | 6580.883 | 92,044,936.212 | 0.998 |
Liver cirrhosis: Yes vs. No | −16.960 | 11,705.571 | <0.001 | 0.999 | −16.057 | 11,754.148 | <0.001 | 0.999 |
COPD: Yes vs. No | −0.177 | 1.116 | 0.838 | 0.874 | −16.910 | 6580.883 | <0.001 | 0.998 |
Autoimmune: Yes vs. No | −17.528 | 12,892.859 | <0.001 | 0.999 | −17.150 | 13,490.401 | <0.001 | 0.999 |
Cerebral diseases: Yes vs. No | −17.579 | 11,033.665 | <0.001 | 0.999 | −16.307 | 11,052.028 | <0.001 | 0.999 |
Heart failure: Yes vs. No | −18.191 | 25,475.907 | <0.001 | 0.999 | −16.436 | 26,494.679 | <0.001 | 1.000 |
HBV: Yes vs. No | 0.225 | 0.962 | 1.252 | 0.815 | −16.248 | 6243.383 | <0.001 | 0.998 |
Renal diseases: Yes vs. No | −18.808 | 16,186.569 | <0.001 | 0.999 | −14.876 | 14,129.940 | <0.001 | 0.999 |
Osteoporosis: Yes vs. No | −17.344 | 9805.003 | <0.001 | 0.999 | 1.507 | 1.131 | 4.511 | 0.183 |
Constant | −3.468 | 0.561 | 0.031 | <0.001 | −4.152 | 0.778 | 0.016 | <0.001 |
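As a reading aid for panel (b): in logistic regression the odds ratio is the exponential of the coefficient, OR = exp(Beta). The short sketch below recomputes a few odds ratios from the tabulated Beta values and compares them with the tabulated OR column. The dictionary `reported`, the helper `odds_ratio`, and the choice of rows are illustrative; small differences are expected because the tabulated Beta values are rounded to three decimals.

```python
import math

# (Beta, tabulated OR) pairs copied from panel (b).
reported = {
    "Sex: Male vs. Female (DV = Depression)": (-0.352, 0.703),
    "CCI >= 2 vs. CCI = 0 (DV = Depression)": (1.377, 3.964),
    "Hypertension: Yes vs. No (DV = Depression)": (1.217, 3.377),
    "Osteoporosis: Yes vs. No (DV = Anxiety)": (1.507, 4.511),
}

def odds_ratio(beta):
    """Odds ratio implied by a logistic-regression coefficient."""
    return math.exp(beta)

# Recompute each OR from its Beta and print it next to the tabulated value.
recomputed = {name: odds_ratio(beta) for name, (beta, _) in reported.items()}
for name, (beta, table_or) in reported.items():
    print(f"{name}: exp({beta}) = {recomputed[name]:.3f} (table: {table_or})")
```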
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fang, Y.-W.; Liu, C.-Y. Determining Risk Factors Associated with Depression and Anxiety in Young Lung Cancer Patients: A Novel Optimization Algorithm. Medicina 2021, 57, 340. https://doi.org/10.3390/medicina57040340