A Natural Language Processing Algorithm to Improve Completeness of ECOG Performance Status in Real-World Data
Abstract
Featured Application
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Data Source
3.2. Components of the NLP-Extracted ECOG PS Variable
3.3. Study Cohorts
3.4. Performance Analyses
4. Results
4.1. Overall Study Population and Impact on ECOG PS Completeness
4.2. Algorithm Performance in Training and Testing Cohorts
4.3. Analysis of an aNSCLC Cohort: Impact on Sample Availability and Prognostic Value
5. Discussion
Limitations
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
| Characteristic | Category | Testing (N = 5341 Unique Patient-Disease) | Training (N = 2519 Unique Patient-Disease) | p-Value 1 |
|---|---|---|---|---|
| Age at 1L | 18–64 | 2232 (41.8%) | 1028 (40.8%) | >0.05 |
| | 65–74 | 1732 (32.4%) | 781 (31.0%) | |
| | 75 and older | 1377 (25.8%) | 710 (28.2%) | |
| Race | Asian | 112 (2.1%) | 45 (1.8%) | 0.004 |
| | Black or African American | 469 (8.8%) | 222 (8.8%) | |
| | Other Race | 619 (11.6%) | 297 (11.8%) | |
| | Unknown | 490 (9.2%) | 300 (11.9%) | |
| | White | 3651 (68.5%) | 1655 (65.7%) | |
| Ethnicity | Hispanic or Latino | 332 (6.2%) | 203 (8.0%) | 0.002 |
| | Unknown/Non-Hispanic | 5009 (93.8%) | 2316 (92.0%) | |
| Gender | F | 2769 (51.9%) | 1307 (51.9%) | >0.9 |
| | M | 2571 (48.1%) | 1212 (48.1%) | |
| | (Missing) | 1 | 0 | |
| Practice Type | Academic | 1186 (22.2%) | 221 (8.8%) | <0.001 |
| | Community | 4155 (77.8%) | 2298 (91.2%) | |
| Year of Initial/Adv/Met Diagnosis/First Treatment 2 | <2018 | 4323 (80.9%) | 2151 (85.4%) | <0.001 |
| | ≥2018 | 1018 (19.1%) | 368 (14.6%) | |
| Group Stage (if applicable) | 0 | 1 (<0.1%) | 1 (<0.1%) | Not Applicable 3 |
| | I | 284 (5.3%) | 114 (4.5%) | |
| | II | 387 (7.2%) | 185 (7.3%) | |
| | III | 815 (15.3%) | 405 (16.1%) | |
| | IV | 2090 (39.1%) | 1066 (42.3%) | |
| | Not Applicable | 1764 (33.0%) | 748 (29.7%) | |
| Year of start of 1L | <2018 | 3847 (72.0%) | 2008 (79.7%) | <0.001 |
| | ≥2018 | 1494 (28.0%) | 511 (20.3%) | |

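
The p-values in the table compare how each characteristic is distributed in the testing and training cohorts. The test behind footnote 1 is not stated in this excerpt. The sketch below assumes a Pearson chi-square test without continuity correction and applies it to three of the two-level characteristics, using counts copied from the table. The function name `chi_square_2x2` is hypothetical, and this is an illustration of the comparison, not the authors' analysis code.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom, the upper-tail
    probability of a chi-square value x equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs a non-zero total")
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, math.erfc(math.sqrt(stat / 2))


# Each call passes (testing, training) counts for the first level,
# then (testing, training) counts for the second level.
practice = chi_square_2x2(1186, 221, 4155, 2298)   # Academic vs Community
gender = chi_square_2x2(2769, 1307, 2571, 1212)    # F vs M, missing row excluded
ethnicity = chi_square_2x2(332, 203, 5009, 2316)   # Hispanic or Latino vs Unknown/Non-Hispanic

for name, (stat, p) in [("Practice Type", practice),
                        ("Gender", gender),
                        ("Ethnicity", ethnicity)]:
    print(f"{name}: chi2 = {stat:.2f}, p = {p:.4f}")
```

Under this assumption the results agree with the reported column: Practice Type falls well below 0.001, Gender lies above 0.9, and Ethnicity comes out at roughly 0.002. Characteristics with more than two levels (Age, Race) would need a general r × 2 chi-square test, which this sketch does not cover.
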
| Cohort | ECOG PS Value | Accuracy a | Sensitivity | PPV | F1-Score |
|---|---|---|---|---|---|
| ECOG 0–4 in Testing set | | 0.93 (0.92–0.94) | 0.88 (0.87–0.89) | 0.88 (0.87–0.89) | 0.88 (0.87–0.89) |
| ECOG 0–4 in Training set b | | 0.83 (0.82–0.84) | 0.80 (0.78–0.82) | 0.75 (0.73–0.77) | 0.77 (0.76–0.78) |
| Testing Cohort | ECOG PS 0 | 0.98 (0.98–0.98) | 0.90 (0.88–0.92) | 0.89 (0.87–0.91) | 0.90 (0.89–0.91) |
| | ECOG PS 1 | 0.96 (0.96–0.96) | 0.88 (0.86–0.90) | 0.88 (0.86–0.90) | 0.88 (0.87–0.89) |
| | ECOG PS 2 | 0.98 (0.98–0.98) | 0.85 (0.81–0.89) | 0.84 (0.80–0.88) | 0.84 (0.83–0.85) |
| | ECOG PS 3 | 1.00 (1.00–1.00) | 0.75 (0.67–0.83) | 0.89 (0.83–0.95) | 0.81 (0.80–0.82) |
| | ECOG PS 0–1 | 0.95 (0.95–0.95) | 0.91 (0.90–0.92) | 0.91 (0.90–0.92) | 0.91 (0.90–0.92) |
| | ECOG PS 2–4 | 0.98 (0.98–0.98) | 0.84 (0.81–0.87) | 0.85 (0.82–0.88) | 0.85 (0.84–0.86) |
| Training Cohort | ECOG PS 0 | 0.95 (0.94–0.96) | 0.74 (0.70–0.78) | 0.84 (0.80–0.88) | 0.79 (0.78–0.80) |
| | ECOG PS 1 | 0.94 (0.93–0.95) | 0.78 (0.75–0.81) | 0.87 (0.84–0.90) | 0.83 (0.82–0.84) |
| | ECOG PS 2 | 0.98 (0.98–0.98) | 0.81 (0.76–0.86) | 0.88 (0.83–0.93) | 0.84 (0.83–0.85) |
| | ECOG PS 3 | 0.97 (0.96–0.98) | 0.95 (0.92–0.98) | 0.69 (0.63–0.75) | 0.80 (0.79–0.81) |
| | ECOG PS 0–1 | 0.90 (0.89–0.91) | 0.79 (0.76–0.82) | 0.89 (0.87–0.91) | 0.84 (0.83–0.85) |
| | ECOG PS 2–4 | 0.92 (0.91–0.93) | 0.94 (0.92–0.96) | 0.64 (0.60–0.68) | 0.76 (0.75–0.77) |

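
The four metrics in the table are standard classification measures. As a minimal sketch, the Python below computes them for a single ECOG PS value treated as the positive class (one-vs-rest). It assumes accuracy, sensitivity and PPV are derived from true/false positive and negative counts in the usual way, and that the F1-score is the harmonic mean of PPV and sensitivity. The exact definition behind footnote a is not given in this excerpt. The function names and the toy labels are hypothetical and are not study data.

```python
def f1_from(ppv, sensitivity):
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity, PPV and F1 with `positive` as the target class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "ppv": ppv,
        "f1": f1_from(ppv, sensitivity),
    }


# Toy example: abstracted (reference) ECOG PS vs. extracted ECOG PS.
reference = [0, 0, 1, 1, 1, 2, 2, 3]
extracted = [0, 1, 1, 1, 2, 2, 2, 3]
metrics_ps1 = one_vs_rest_metrics(reference, extracted, positive=1)
print(metrics_ps1)

# Consistency check against one reported row (Testing Cohort, ECOG PS 3):
# PPV 0.89 and sensitivity 0.75 give an F1-score that rounds to 0.81.
print(round(f1_from(0.89, 0.75), 2))
```

Because each class is scored against all others, the large number of true negatives can keep accuracy high even when sensitivity is modest. The Testing Cohort row for ECOG PS 3 shows this pattern, with accuracy 1.00 and sensitivity 0.75.
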
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cohen, A.B.; Rosic, A.; Harrison, K.; Richey, M.; Nemeth, S.; Ambwani, G.; Miksad, R.; Haaland, B.; Jiang, C. A Natural Language Processing Algorithm to Improve Completeness of ECOG Performance Status in Real-World Data. Appl. Sci. 2023, 13, 6209. https://doi.org/10.3390/app13106209