Detecting the Use of ChatGPT in University Newspapers by Analyzing Stylistic Differences with Machine Learning
Abstract
1. Introduction
2. Experimental Details
2.1. Data Set Selection
2.2. Development of Features
2.3. Data Processing
3. Results
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Feature Number | Feature Type | Short Description | Greater in |
---|---|---|---|
1 | 1 | sentences per article | Human |
2 | 1 | words per article | Human |
3 | 2 | “-” present | ChatGPT |
4 | 2 | “;” or “:” present | Human |
5 | 2 | “?” present | Human |
6 | 2 | number of quotation marks | Human |
7 | 3 | standard deviation in sentence length | Human |
8 | 3 | length differences in consecutive sentences | Human |
9 | 4 | “which” present | Human |
10 | 4 | number of occurrences of “said” | Human |
11 | 4 | number of occurrences of “but” | Human |
12 | 4 | number of occurrences of “this” | Human |
13 | 4 | number of occurrences of “freshman”, “sophomore”, “junior”, or “senior” | Human |
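
The thirteen features in the table above can be illustrated with a short Python sketch. This is not the authors' implementation: the sentence splitter, the tokenizer, the use of the population standard deviation for feature 7, the use of the mean absolute difference for feature 8, and all function and key names (`split_sentences`, `extract_features`, and so on) are assumptions made here for illustration only.

```python
import re
import statistics

# Class-year words counted together for feature 13.
CLASS_YEAR_WORDS = ("freshman", "sophomore", "junior", "senior")


def split_sentences(text):
    # Naive splitter (an assumption): break on whitespace that follows ., ! or ?
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def extract_features(text):
    sentences = split_sentences(text)
    # Sentence length measured in whitespace-separated tokens (an assumption).
    lengths = [len(s.split()) for s in sentences]
    # Lowercased word tokens used for the keyword features.
    words = re.findall(r"[a-z']+", text.lower())
    # Absolute length differences between consecutive sentences.
    diffs = [abs(a - b) for a, b in zip(lengths, lengths[1:])]
    return {
        # Type 1: article size
        "sentences_per_article": len(sentences),
        "words_per_article": len(text.split()),
        # Type 2: punctuation
        "has_hyphen": int("-" in text),
        "has_semicolon_or_colon": int(";" in text or ":" in text),
        "has_question_mark": int("?" in text),
        "quotation_mark_count": sum(text.count(q) for q in ('"', "“", "”")),
        # Type 3: sentence-length variability
        "sentence_length_std": statistics.pstdev(lengths) if lengths else 0.0,
        "mean_consecutive_length_diff": sum(diffs) / len(diffs) if diffs else 0.0,
        # Type 4: keyword presence and counts
        "has_which": int("which" in words),
        "said_count": words.count("said"),
        "but_count": words.count("but"),
        "this_count": words.count("this"),
        "class_year_count": sum(words.count(w) for w in CLASS_YEAR_WORDS),
    }


sample = 'She is a junior at the school. "But why?" he said. Done: yes.'
features = extract_features(sample)
print(features)
```

Each article would yield one such feature dictionary, which could then be passed to an off-the-shelf classifier; the choice of classifier is outside the scope of this sketch.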

Article-Level (% Correct)

Data Set | Articles | Percent Correct | AUC |
---|---|---|---|
Training Set (Human) | 120 | 94.2% | |
Training Set a (ChatGPT) | 120 | 92.5% | |
Overall Accuracy | | 93.3% | 0.933 |
Test Set 1 (Human) | 50 | 92% | |
Test Set 1 a (ChatGPT) | 50 | 98% | |
Overall Accuracy | | 95% | 0.95 |
Test Set 2 b (ChatGPT) | 50 | 78% | |
Overall Accuracy | | 85% | 0.85 |
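
The overall accuracies in the table above can be checked from the per-class rows. The sketch below is a worked arithmetic check rather than the authors' code; the helper name `overall_accuracy` is hypothetical. It also assumes that the overall figure for Test Set 2 pairs the Test Set 2 ChatGPT articles with the Test Set 1 human articles, which the table does not state explicitly but which is consistent with the reported 85%.

```python
def overall_accuracy(groups):
    """Pool per-class results given as (number of articles, fraction correct)."""
    # Recover whole-article counts from the rounded percentages.
    correct = sum(round(n * acc) for n, acc in groups)
    total = sum(n for n, _ in groups)
    return correct / total


# Per-class values copied from the table.
training = overall_accuracy([(120, 0.942), (120, 0.925)])  # 113 + 111 of 240
test_1 = overall_accuracy([(50, 0.92), (50, 0.98)])        # 46 + 49 of 100
# Assumption: Test Set 2 (ChatGPT) is paired with the Test Set 1 human articles.
test_2 = overall_accuracy([(50, 0.92), (50, 0.78)])        # 46 + 39 of 100

print(f"{training:.1%} {test_1:.0%} {test_2:.0%}")
```

With equal numbers of human and ChatGPT articles in each set, the overall accuracy is simply the mean of the two per-class accuracies.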
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kim, M.-G.; Desaire, H. Detecting the Use of ChatGPT in University Newspapers by Analyzing Stylistic Differences with Machine Learning. Information 2024, 15, 307. https://doi.org/10.3390/info15060307