Challenging ChatGPT 3.5 in Senology—An Assessment of Concordance with Breast Cancer Tumor Board Decision Making
Abstract
1. Introduction
2. Patients and Methods
2.1. Patient Profiles
2.2. Extended Input Model
2.3. Model Execution
2.4. Concordance Assessment
3. Results
3.1. Treatment Recommendation Frequency
3.2. Concordance Assessment Per Patient Profile
3.3. Concordance Assessment Per Treatment Option
3.4. Comparative Results of LLM and MTB Treatment Decisions
4. Discussion
4.1. Main Findings
4.2. Further Findings
4.2.1. Garbage in–Garbage Out
4.2.2. Lack of Consistency in Health Data Use
4.2.3. Stepping into the Trip Trap
4.3. Limitations and Suggestions for the Future
5. Conclusions
- Authors: “Please provide a short argumentation of 100 words on why the use of ChatGPT should, in perspective, be integrated in breast cancer tumor boards?”.
- ChatGPT 3.5: “Integrating ChatGPT into breast cancer tumor boards offers invaluable benefits. Firstly, it ensures access to the latest medical knowledge, keeping oncologists up-to-date with rapidly evolving treatments and research. Secondly, ChatGPT can assist in analyzing complex patient data, improving diagnostic accuracy and treatment planning. Its ability to sift through vast medical literature and patient records swiftly aids in personalized care. Furthermore, it enhances communication within multidisciplinary teams, streamlining information sharing and decision-making. Overall, ChatGPT’s integration empowers healthcare professionals with cutting-edge insights, potentially leading to more informed, efficient, and effective breast cancer care”.
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Patient Profiles

Immunohistochemical and Molecular Subtype | Postmenopausal, Nodal Negative | Postmenopausal, Nodal Positive | Premenopausal, Nodal Negative | Premenopausal, Nodal Positive |
---|---|---|---|---|
Luminal A | P1 | P2 | P3 | P4 |
Luminal B | P5 | P6 | P7 | P8 |
Her2 positive | P9 | P10 | P11 | P12 |
Triple negative | P13 | P14 | P15 | P16 |
DCIS | P17 (postmenopausal) | | P18 (premenopausal) | |
DCIS with narrow resection margin | P19 (postmenopausal) | | | |
Inflammatory breast cancer | P20 | | | |
Treatment Option (Model Execution) | ST (LLM) | ST (MTB) | ET (LLM) | ET (MTB) | CT (LLM) | CT (MTB) | RT (LLM) | RT (MTB) | GT (LLM) | GT (MTB) |
---|---|---|---|---|---|---|---|---|---|---|
Recommendation frequency | 2 | 3 | 13 | 8 | 13 | 11 | 16 | 15 | 17 | 11 |
PP | No. | ST | ET | CT | RT | GT | CC per PP |
---|---|---|---|---|---|---|---|
Postmenopausal Luminal A N− | 1 | yes | yes | yes | yes | no | no |
Postmenopausal Luminal A N+ | 2 | yes | yes | no | yes | no | no |
Premenopausal Luminal A N− | 3 | yes | yes | yes | yes | yes | yes |
Premenopausal Luminal A N+ | 4 | yes | yes | yes | yes | yes | yes |
Postmenopausal Luminal B Her2− N− | 5 | yes | yes | yes | yes | yes | yes |
Postmenopausal Luminal B Her2− N+ | 6 | yes | yes | yes | yes | no | no |
Premenopausal Luminal B Her2− N− | 7 | yes | yes | n.a. | yes | no | no |
Premenopausal Luminal B Her2+ N+ | 8 | yes | yes | yes | yes | yes | yes |
Postmenopausal Her2+ ER/PR− N− | 9 | yes | no | yes | yes | no | no |
Postmenopausal Her2+ ER/PR− N+ | 10 | yes | yes | yes | yes | no | no |
Premenopausal Her2+ ER/PR− N− | 11 | yes | yes | yes | yes | yes | yes |
Premenopausal Her2+ ER/PR− N+ | 12 | yes | yes | yes | yes | yes | yes |
Postmenopausal Triple Negative N− | 13 | yes | yes | yes | yes | yes | yes |
Postmenopausal Triple Negative N+ | 14 | yes | yes | yes | yes | yes | yes |
Premenopausal Triple Negative N− | 15 | yes | yes | yes | yes | yes | yes |
Premenopausal Triple Negative N+ | 16 | yes | no | yes | yes | yes | no |
Postmenopausal DCIS, clear resection margin | 17 | yes | no | yes | no | yes | no |
Premenopausal DCIS, clear resection margin | 18 | yes | no | yes | yes | yes | no |
Postmenopausal DCIS, narrow resection margin | 19 | no | no | yes | yes | yes | no |
Inflammatory Breast Cancer | 20 | yes | yes | yes | yes | yes | yes |
CC per TO | | 95.0% | 75.0% | 94.7% | 95.0% | 70.0% | 50.0% |
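
The percentages in the CC per TO row are consistent with reading concordance as a simple agreement rate over the evaluable profiles. The following is a sketch under that assumption; the symbols $a_{\mathrm{TO}}$ and $n_{\mathrm{TO}}$ are introduced here for illustration only and are not taken from the article.

```latex
% a_TO: number of patient profiles marked "yes" (LLM and MTB agree) for a treatment option
% n_TO: number of patient profiles with an evaluable entry (rows marked n.a. excluded)
\[
  \mathrm{CC}_{\mathrm{TO}} = \frac{a_{\mathrm{TO}}}{n_{\mathrm{TO}}} \times 100\%
\]
% Worked from the table above:
\[
  \mathrm{CC}_{\mathrm{CT}} = \frac{18}{19} \times 100\% \approx 94.7\%, \qquad
  \mathrm{CC}_{\mathrm{ET}} = \frac{15}{20} \times 100\% = 75.0\%
\]
```

For CT, profile 7 is marked n.a. and profile 2 is the only disagreement, which gives 18 of 19. The final 50.0% follows the same pattern at the profile level: 10 of the 20 profiles show "yes" in the CC per PP column.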
PP | No. | ST (LLM) | ST (MTB) | ET (LLM) | ET (MTB) | CT (LLM) | CT (MTB) | RT (LLM) | RT (MTB) | GT (LLM) | GT (MTB) |
---|---|---|---|---|---|---|---|---|---|---|---|
Postmenopausal Luminal A N− | 1 | no | no | yes | yes | no | no | yes | yes | yes | no |
Postmenopausal Luminal A N+ | 2 | no | no | yes | yes | yes | no | yes | yes | yes | no |
Premenopausal Luminal A N− | 3 | no | no | yes | yes | no | no | yes | yes | no | no |
Premenopausal Luminal A N+ | 4 | yes | yes | yes | yes | no | no | yes | yes | no | no |
Postmenopausal Luminal B Her2− N− | 5 | no | no | yes | yes | no | no | yes | yes | yes | yes |
Postmenopausal Luminal B Her2− N+ | 6 | no | no | yes | yes | yes | yes | yes | yes | yes | no |
Premenopausal Luminal B Her2− N− | 7 | yes | yes | yes | yes | yes | n.a. | yes | yes | yes | no |
Premenopausal Luminal B Her2+ N+ | 8 | no | no | yes | yes | yes | yes | yes | yes | yes | yes |
Postmenopausal Her2+ ER/PR− N− | 9 | no | no | yes | no | yes | yes | yes | yes | yes | no |
Postmenopausal Her2+ ER/PR− N+ | 10 | no | no | no | no | yes | yes | yes | yes | yes | no |
Premenopausal Her2+ ER/PR− N− | 11 | no | no | no | no | yes | yes | yes | yes | yes | yes |
Premenopausal Her2+ ER/PR− N+ | 12 | no | no | no | no | yes | yes | yes | yes | yes | yes |
Postmenopausal Triple Negative N− | 13 | no | no | no | no | yes | yes | yes | yes | yes | yes |
Postmenopausal Triple Negative N+ | 14 | no | no | no | no | yes | yes | no | no | yes | yes |
Premenopausal Triple Negative N− | 15 | no | no | no | no | yes | yes | no | no | yes | yes |
Premenopausal Triple Negative N+ | 16 | no | no | yes | no | yes | yes | no | no | yes | yes |
Postmenopausal DCIS, clear resection margin | 17 | no | no | yes | no | no | no | yes | no | yes | yes |
Premenopausal DCIS, clear resection margin | 18 | no | no | yes | no | no | no | yes | yes | yes | yes |
Postmenopausal DCIS, narrow resection margin | 19 | no | yes | yes | no | no | no | yes | yes | no | no |
Inflammatory Breast Cancer | 20 | no | no | no | no | yes | yes | no | no | yes | yes |
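
The recommendation frequencies and concordance figures reported in the tables can be re-derived from the LLM and MTB decisions listed above. The Python sketch below is illustrative and not the authors' method: the compact encoding (`y`, `n`, and `x` for n.a.), the function names, and the choice to exclude n.a. entries from the denominator are assumptions made here. That last choice is the reading that reproduces 94.7% for CT.

```python
# Decisions transcribed from the comparative table, one row per patient
# profile (1-20). Each treatment option holds an "LLM/MTB" pair:
# y = yes, n = no, x = n.a. Column order: ST, ET, CT, RT, GT.
ROWS = """
1  n/n y/y n/n y/y y/n
2  n/n y/y y/n y/y y/n
3  n/n y/y n/n y/y n/n
4  y/y y/y n/n y/y n/n
5  n/n y/y n/n y/y y/y
6  n/n y/y y/y y/y y/n
7  y/y y/y y/x y/y y/n
8  n/n y/y y/y y/y y/y
9  n/n y/n y/y y/y y/n
10 n/n n/n y/y y/y y/n
11 n/n n/n y/y y/y y/y
12 n/n n/n y/y y/y y/y
13 n/n n/n y/y y/y y/y
14 n/n n/n y/y n/n y/y
15 n/n n/n y/y n/n y/y
16 n/n y/n y/y n/n y/y
17 n/n y/n n/n y/n y/y
18 n/n y/n n/n y/y y/y
19 n/y y/n n/n y/y n/n
20 n/n n/n y/y n/n y/y
"""

OPTIONS = ["ST", "ET", "CT", "RT", "GT"]


def parse(rows):
    """Return {profile number: {option: (llm, mtb)}}."""
    table = {}
    for line in rows.strip().splitlines():
        number, *pairs = line.split()
        decisions = (tuple(pair.split("/")) for pair in pairs)
        table[int(number)] = dict(zip(OPTIONS, decisions))
    return table


def evaluable(pair):
    """Assumption: a pair containing n.a. is left out of the comparison."""
    return "x" not in pair


def frequencies(table):
    """Count 'yes' recommendations per option as (LLM count, MTB count)."""
    return {
        option: (
            sum(row[option][0] == "y" for row in table.values()),
            sum(row[option][1] == "y" for row in table.values()),
        )
        for option in OPTIONS
    }


def concordance_per_option(table):
    """Percentage of evaluable profiles where LLM and MTB agree."""
    result = {}
    for option in OPTIONS:
        pairs = [row[option] for row in table.values() if evaluable(row[option])]
        agreeing = sum(llm == mtb for llm, mtb in pairs)
        result[option] = round(100 * agreeing / len(pairs), 1)
    return result


def concordant_profiles(table):
    """Profiles where every evaluable treatment option agrees."""
    return [
        number
        for number, row in table.items()
        if all(llm == mtb for llm, mtb in row.values() if evaluable((llm, mtb)))
    ]


table = parse(ROWS)
print(frequencies(table))
print(concordance_per_option(table))
print(len(concordant_profiles(table)) / len(table) * 100)
```

Run against the transcribed rows, the three printed results can be compared directly with the recommendation frequency row, the CC per TO row, and the overall CC per PP value in the tables above.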
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Griewing, S.; Gremke, N.; Wagner, U.; Lingenfelder, M.; Kuhn, S.; Boekhoff, J. Challenging ChatGPT 3.5 in Senology—An Assessment of Concordance with Breast Cancer Tumor Board Decision Making. J. Pers. Med. 2023, 13, 1502. https://doi.org/10.3390/jpm13101502