Automatic Generation of Medical Case-Based Multiple-Choice Questions (MCQs): A Review of Methodologies, Applications, Evaluation, and Future Directions
Abstract
1. Introduction
- Provide a concise summary of the format of medical case-based multiple-choice questions (MCQs).
- Offer a comprehensive examination of approaches used for generating medical case-based MCQs automatically.
- Present applications of medical case-based MCQ auto-generation.
- Give insight into evaluation and validation of automatically generated medical case-based MCQs.
- Provide a concise overview of potential improvements and future research directions.
2. Methods
3. Background and Context
4. Case-Based Reasoning (CBR) in Medical Education
- Understanding the Patient’s Problem: The first step in CBR involves thoroughly understanding the patient’s symptoms and medical history, which helps in forming an initial idea about the possible medical conditions the patient might have.
- Knowledge Application: Students apply their knowledge of anatomy, organ systems, and pathology to reason about the disease processes that could explain the patient’s symptoms, which is crucial for accurate diagnosis.
- Pattern Recognition: Students learn to recognize patterns in patient problems and compare them with illness scripts, which are mental representations of diseases based on previous cases they have studied or encountered.
- Systematic Discussion: Through systematic discussion, students elaborate on the possible courses of action from the initial presentation of the patient to the final steps of clinical management, which helps in refining their clinical reasoning skills.
- Decision-Making Practice: CBR also involves training students in decision-making from different perspectives, such as considering the burden on the patient and the cost for the hospital, which is essential for holistic patient care.
- Case Vignettes: Students work with case vignettes that present different medical scenarios, helping them practice and apply their clinical reasoning skills in a controlled, educational environment.
5. Structure of Case-Based MCQs
6. Approaches for MCQ Generation
6.1. Introduction to Medical Ontologies
- Field-Specific Ontologies: Focused on particular areas of medicine, such as gene ontology (GO) and human phenotype ontology (HPO), these ontologies explore specific topics like gene functions and phenotypic abnormalities.
- General Medical Knowledge Ontologies: These include comprehensive terminologies like Systematized Nomenclature of Medicine—Clinical Terms (SNOMED CT) and International Classification of Diseases (ICD), which cover a broad range of diseases, clinical findings, and procedures.
- Ontologies Addressing Common Misconceptions: These are designed to clarify and correct frequently misunderstood information in medical fields, thus preventing errors in clinical practice and improving patient outcomes.
6.2. Ontology-Based Approaches
6.2.1. How Does Ontology Work to Generate MCQs?
- Ontology Components:
6.2.2. Rule-Based Generation
- Ontology Design: The ontology design table (see the worked example tables later in this article) shows how a rule-based method structures an ontology for creating MCQs from medical cases [1]. Ontology design involves structuring and defining entities, properties, and rules to represent knowledge in the medical domain. Entities are the core concepts within the domain that are relevant to MCQ creation. In the medical domain, examples include patients, the individuals receiving medical care; symptoms, the signs or indications of a condition or disease; diseases, the medical conditions that affect the patient; treatments, the medical interventions used to treat diseases; and tests, the diagnostic procedures used to identify or monitor diseases.
- Properties: Properties describe the attributes or relationships between entities. In the medical domain, examples include the property hasSymptom, which indicates the symptoms experienced by the patient; hasTestResult, which denotes the results of diagnostic tests conducted on the patient; and givenTreatment, which specifies the treatments administered to the patient.
- Rules: Rules are guidelines that specify how entities and properties combine to derive new knowledge or actions, and they are the core of a rule-based ontology system. In the medical domain, a rule could read: if a patient shows symptoms ‘X’ and tests positive for ‘Y’, then they could potentially have condition ‘Z’. This rule uses information about symptoms and test results to suggest an illness based on the patient’s situation. For instance, if a patient has a fever and tests positive for a specific virus, the rule would suggest an illness consistent with those symptoms and test outcomes. A minimal code sketch of this rule-firing approach follows.
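The sketch below uses the fever/rash-to-measles rule from the worked example in this review; the Rule class, the tiny rule base, and the distractor policy are illustrative assumptions rather than a published system:

```python
# Minimal sketch of rule-based MCQ generation from a toy ontology.
# The fever/rash -> measles rule mirrors the worked example in this review;
# everything else is an illustrative assumption.
from dataclasses import dataclass

@dataclass
class Rule:
    symptoms: frozenset   # antecedent: the symptoms that must be observed
    disease: str          # consequent: the candidate diagnosis

# Toy rule base (in practice, rules would be mined from SNOMED CT, HPO, etc.).
RULES = [
    Rule(frozenset({"fever", "rash"}), "measles"),
    Rule(frozenset({"fever", "cough", "dyspnea"}), "pneumonia"),
    Rule(frozenset({"sneezing", "runny nose"}), "common cold"),
]

def generate_mcq(patient_symptoms: set) -> dict:
    """Fire the first matching rule and build a one-best-answer MCQ."""
    matched = [r for r in RULES if r.symptoms <= patient_symptoms]
    if not matched:
        raise ValueError("No rule covers this presentation.")
    key = matched[0].disease
    # Distractors: diagnoses from the non-firing rules (plausible but wrong).
    distractors = [r.disease for r in RULES if r.disease != key][:3]
    stem = (f"A patient presents with {', '.join(sorted(patient_symptoms))}. "
            "What is the most likely diagnosis?")
    return {"stem": stem, "key": key, "distractors": distractors}

print(generate_mcq({"fever", "rash"}))   # key: 'measles'
```

Here the distractors are simply the diagnoses of the non-firing rules; a production system would rank candidate distractors by their closeness to the key.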
6.2.3. Template-Based Generation
6.2.4. Case-Based MCQ Based on Ontology Applications
6.3. MCQ Generation Using Artificial Intelligence
6.3.1. Machine Learning Approaches
Supervised Learning in MCQ Generation
- Recognizing Patterns and Learning: The main goal of SL is to recognize patterns in medical cases and in how questions are formulated from those cases [42]. For example, an algorithm learns to generate questions from patient scenarios by recognizing specific clinical patterns, such as symptoms or test results that match known disease profiles [20]. Using supervised learning, the system is trained on labeled medical cases with the correct diagnoses, allowing it to learn patterns in the data and generate questions that assess a learner’s ability to recognize symptoms and apply clinical knowledge. In this context, pattern recognition questions (PRQs) focus on identifying diseases from clinical patterns, requiring the examinee to match presented symptoms and test results with known disease patterns to reach a diagnosis. In a study by Swe (2019), a decision tree supervised learning model recognizes medical patterns through a hierarchical structure [41]: the algorithm splits data into branches based on feature values (such as symptoms or test results), leading to a decision or classification at the tree’s leaves. This structure simplifies complex decision-making by breaking it into smaller, more manageable parts, helping the model learn key patterns in medical data for improved diagnosis or classification (see the decision-tree sketch after this list).
- Extracting Features: Feature extraction is the phase in which learning algorithms derive crucial characteristics from the training data [43]. For MCQs, these key elements include clinical scenarios, diagnostic criteria, treatment options, and potential patient outcomes. This process is central to capturing the nuances of case presentations and managing their complexity when crafting questions. For example, decision tree and random forest algorithms are effective for feature extraction because they perform feature selection inherently during model training [41]. Random forest is an ensemble learning method used primarily for classification and regression: it constructs multiple decision trees during training and outputs the mode of the classes (for classification) or the mean prediction (for regression) of the individual trees. These algorithms split data on the most informative features, which helps identify and prioritize the features that contribute most to the decision-making process.
- Training and Testing: In the study by Yuan et al. (2017), a two-phase model is proposed to generate questions using supervised learning and reinforcement learning techniques [44]. Initially, the model is trained with teacher forcing to ensure it learns the correct sequence of outputs by maximizing the likelihood of ground-truth sequences. It is later fine-tuned with policy gradient reinforcement learning to allow improvements on sequences not encountered during training. The model’s effectiveness is evaluated using the SQuAD dataset, and this approach can be adapted to generate medical MCQs by incorporating clinical datasets for relevance and accuracy in medical education.
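A minimal sketch of the pattern-recognition setup described above, assuming a toy feature set and an invented three-case training table (scikit-learn’s DecisionTreeClassifier stands in for the tree-based models discussed):

```python
# Hedged sketch: a decision tree trained on labeled cases generates a
# pattern-recognition question (PRQ). Features, diagnoses, and the tiny
# dataset are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["fever", "rash", "cough", "positive_pcr"]
# Each row encodes presence/absence of the features above; labels are diagnoses.
X = [
    [1, 1, 0, 0],   # fever + rash            -> measles
    [1, 0, 1, 1],   # fever + cough + PCR(+)  -> covid-19
    [0, 0, 1, 0],   # isolated cough          -> common cold
]
y = ["measles", "covid-19", "common cold"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

def case_to_prq(case: list) -> dict:
    """Turn a feature vector into a pattern-recognition question."""
    present = [f for f, v in zip(FEATURES, case) if v]
    key = clf.predict([case])[0]
    distractors = [d for d in clf.classes_ if d != key]
    stem = (f"A patient presents with {', '.join(present)}. "
            "Which diagnosis best fits this pattern?")
    return {"stem": stem, "key": key, "distractors": list(distractors)}

print(case_to_prq([1, 1, 0, 0]))   # key should be 'measles'
```

In a real system, the feature vectors would come from coded findings in clinical cases, and the trained tree’s decision paths could also be inspected to justify why a given diagnosis is the key.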
Unsupervised Learning in MCQ Generation
- Extraction of Features from Unlabeled Data: UL excels at extracting features from unlabeled content. The system autonomously identifies concepts such as disease symptoms, diagnostic methods, or treatment options and incorporates them into relevant questions [45]. This capability is particularly useful for covering a wide array of topics and ensuring comprehensive educational resources. The process exemplifies unsupervised learning because it discovers patterns and features in unlabeled data without predefined labels or categories.
- Discovery of Patterns and Clustering: The discovery of patterns and clustering, particularly through algorithms like hierarchical clustering, plays a crucial role in unsupervised learning (UL) for generating MCQs in medical education [46] (a minimal clustering sketch follows this list). These algorithms group similar data, aiding in the identification of both common and rare cases for question creation. Additionally, unsupervised information extraction (IE) techniques identify important semantic relations between concepts in unannotated texts without relying on pre-defined rules or patterns. This approach improves the flexibility and quality of MCQ generation, making it especially useful where manually annotated data are unavailable or costly.
- Anomaly Detection: One appealing aspect of unsupervised learning (UL) is its ability to identify anomalies [47]. In the context of MCQ generation, this entails pinpointing atypical or outlying cases; such instances can serve as the foundation for demanding MCQs that assess a student’s competence in managing uncommon scenarios.
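A minimal clustering sketch under stated assumptions: four invented case texts, TF-IDF features, and agglomerative clustering standing in for any hierarchical UL algorithm:

```python
# Sketch of the clustering step: group unlabeled case texts so that common
# and rare presentations can seed different MCQs. Case texts are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering

cases = [
    "fever and rash spreading from the face downwards",
    "high fever, rash, and conjunctivitis in a child",
    "chronic cough with weight loss and night sweats",
    "productive cough, fever, and pleuritic chest pain",
]

X = TfidfVectorizer().fit_transform(cases).toarray()
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)

for cluster in set(labels):
    members = [c for c, l in zip(cases, labels) if l == cluster]
    # A large cluster suggests a 'typical presentation' MCQ; a small or
    # outlying cluster is a candidate for a rarer, more demanding item.
    print(f"cluster {cluster} ({len(members)} cases):", members)
```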
6.3.2. Deep Learning Approaches
6.3.3. Natural Language Processing (NLP)
Medical Case Scenario Analysis and Extraction of Key Information
- Entity Recognition: Advanced NLP models perform Named Entity Recognition (NER) to identify and classify specific entities (e.g., diseases, drugs, symptoms) within the text [6] (see the sketch after this list). This process helps categorize information that can later be used to construct the stem of an MCQ.
- Contextual Understanding: NLP techniques used in generative models, like BERT, excel in interpreting the context of medical text [57]. For instance, BERT can differentiate between multiple meanings of the word “cold” (virus or temperature) based on the surrounding context. Its bidirectional architecture allows it to understand text from both left-to-right and right-to-left, which is essential for accurate medical text generation. By using these capabilities, BERT enhances the quality of automatically generated medical MCQs, ensuring contextually appropriate and accurate question formation.
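A hedged sketch of the NER step using the Hugging Face transformers pipeline; the model name below is an assumption for illustration, and any token-classification model fine-tuned on clinical entities could be substituted:

```python
# Sketch: extract clinical entities from a case text so they can populate
# the stem and answer options of an MCQ.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="d4data/biomedical-ner-all",   # assumed biomedical NER model
    aggregation_strategy="simple",       # merge sub-tokens into entity spans
)

case = ("A 45-year-old male presents with fever, cough, and difficulty "
        "breathing after travel to a high-risk COVID-19 region.")

for entity in ner(case):
    # Each hit carries the span text, a label (e.g., a sign/symptom tag),
    # and a confidence score.
    print(entity["word"], entity["entity_group"], round(entity["score"], 2))
```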
Creation of Consistent, Authentic MCQ Prompts
- Scenario Simulation: NLP can be used to simulate clinical scenarios that are realistic and relevant to the curriculum [58]. This involves creatively integrating different pieces of extracted information to form a scenario that mirrors real-life clinical situations.
- Generation of Content: Content generation involves the use of tools like GPT, which have been trained on extensive text data to produce coherent and contextually relevant text [58]. Through natural language processing (NLP), these models generate content by predicting the next word in a sequence based on the preceding words. This capability allows for the construction of statements that adhere to typical medical evaluation standards while maintaining flexibility in expression. By leveraging the predictive power of NLP, these tools can generate questions and educational content that are both accurate and reflective of common medical scenarios, thereby enhancing the quality and relevance of the material.
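A minimal sketch of next-word prediction driving stem completion; GPT-2 is used here only because it runs locally, and a production system would use a larger instruction-tuned model or a hosted API. The prompt and parameters are illustrative:

```python
# Sketch: autoregressive generation completes a clinical question by
# predicting the next tokens from the preceding context.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = ("Clinical vignette: A 5-year-old patient presents with a 3-day "
          "history of fever followed by a rash that started on the face. "
          "Question: What is the most likely")

out = generator(prompt, max_new_tokens=20, do_sample=False)  # greedy decoding
print(out[0]["generated_text"])
```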
Generation of Plausible Incorrect Options (Distractors)
- Similarity and Distinction of Semantic Content: Techniques like semantic analysis enable NLP models to produce distractors that are similar to, yet clearly distinct from, the correct response [59] (see the sketch after this list). For instance, in a question about a treatment, the distractors may be pharmaceuticals suitable for related problems but not for the specific case described in the question.
- Analysis of Error Patterns: NLP can evaluate patterns in students’ misinterpretations of comparable instances, enabling the creation of distractors that mirror prevalent mistakes in the medical domain [60].
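A minimal sketch of similarity-banded distractor selection, assuming a general-purpose sentence encoder; the model name and the similarity thresholds are illustrative choices, not values from the cited studies:

```python
# Sketch: keep distractor candidates that are semantically related to the
# key (plausible) but not near-paraphrases of it (distinct).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # generic encoder; a
# biomedical encoder could be substituted for clinical text.

key = "oral doxycycline"
candidates = ["oral azithromycin", "topical hydrocortisone",
              "intravenous vancomycin", "oral doxycycline 100 mg"]

key_emb = model.encode(key, convert_to_tensor=True)
cand_embs = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(key_emb, cand_embs)[0]

# Similarity band: above the floor = related, below the ceiling = distinct.
FLOOR, CEILING = 0.35, 0.85
distractors = [c for c, s in zip(candidates, scores)
               if FLOOR < float(s) < CEILING]
print(distractors)
```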
6.3.4. MCQ Generation Using AI Transformers
- Understanding Context: Transformers can grasp shades of meaning in terms and concepts, setting them apart from other models in a case-study setting [62]. This deep understanding allows for the creation of MCQs that are clinically precise and intricately linked to context, assessing students’ capacity to apply medical knowledge in challenging real-world situations.
- Data Scalability: Experts can significantly enhance the capabilities of transformers by training them using a diverse mix of textbooks, journals, and case studies. This method allows transformers to capture a broader and more comprehensive range of information compared to traditional approaches like rule-based systems and long short-term memory (LSTM) networks, which often struggle with complex and varied medical data [63]. By leveraging this training, transformers can generate MCQs across different medical fields and specialties, offering flexible and scalable solutions to meet the growing demand for high-quality educational materials.
- Innovative Question Formulation: Transformers have sophisticated NLP capabilities that enable them to generate innovative and creative questions and answers [64]. Automated systems can generate subtle distractors (incorrect answers) that closely resemble typical misunderstandings or mistakes in clinical reasoning. This is achieved by training the models on large datasets that include examples of common errors and misconceptions in medical practice; by analyzing these patterns, the models can produce distractors that are both plausible and challenging, which improves the instructional value of the MCQs by encouraging critical and discriminative thinking.

Several complementary methods assess the effectiveness of such distractors. A common one is item analysis, which applies statistical techniques to examine how test-takers respond to each option [65], helping identify which distractors work well and which do not. In addition, subject matter experts review the distractors [66], assessing whether they are plausible and relevant to the question and whether they align with the guidelines of standardized exams such as the United States Medical Licensing Examination (USMLE) and the Comprehensive Osteopathic Medical Licensing Examination (COMLEX) of the United States.

Student interaction data provide further insight. If a distractor is never chosen, it may be too obviously incorrect; if it is chosen too frequently, it may be misleading or too similar to the correct answer. Tracking how often each distractor is selected therefore helps identify which options are effective and which need revision. Engagement with practice quizzes is also informative: if students frequently access quizzes and attempt questions multiple times, the distractors are likely challenging enough to encourage repeated practice. Direct feedback from students, such as reports that certain distractors were confusing or too easy to eliminate, helps refine questions further. Finally, statistical measures such as the discrimination index quantify how well a distractor differentiates between high-performing and low-performing students; a good distractor should be chosen mainly by students who do not know the correct answer. A minimal item-analysis sketch follows.
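This sketch assumes a synthetic response vector and total-score vector in place of real exam delivery logs:

```python
# Sketch of distractor item analysis: selection frequency and a per-option
# discrimination contrast between high- and low-scoring examinees.
import numpy as np

options = ["A", "B", "C", "D"]
key = "B"
# Rows: examinees; values: chosen option. Total scores rank examinees.
responses = np.array(list("BBABBCBDBBABBCBB"))
total_scores = np.array([9, 8, 3, 7, 9, 2, 8, 1, 7, 9, 4, 8, 6, 2, 9, 5])

order = np.argsort(total_scores)
n = len(responses) // 3                  # upper/lower ~third of the cohort
low, high = order[:n], order[-n:]

for opt in options:
    p = np.mean(responses == opt)        # selection frequency
    d = np.mean(responses[high] == opt) - np.mean(responses[low] == opt)
    flag = "key" if opt == key else ("never chosen" if p == 0 else "")
    # A functional distractor has p > 0 and negative discrimination
    # (weaker examinees pick it more often than stronger ones).
    print(f"option {opt}: chosen {p:.0%}, discrimination {d:+.2f} {flag}")
```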
6.3.5. Case-Based MCQs Based on AI Applications
7. Evaluation and Validation of Automatically Generated MCQs
- Precision
- Difficulty Score
- Discrimination Index
- Significance (Relevance)
- Guessing Factor
- Feedback Analysis
- ROUGE Metric
- BLEU Metric
- Unweighted Kappa Metric
- Kruskal–Wallis Test
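Several of these metrics reduce to short, standard computations. The sketch below illustrates three of them on synthetic data (BLEU overlap, unweighted Cohen's kappa, and the Kruskal–Wallis test) using nltk, scikit-learn, and scipy; all inputs are invented for illustration:

```python
# Sketch: BLEU between a generated stem and a human-written reference,
# unweighted kappa between two expert reviewers, and a Kruskal-Wallis test
# comparing quality ratings across three hypothetical generators.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from sklearn.metrics import cohen_kappa_score
from scipy.stats import kruskal

reference = "what is the most likely diagnosis for this patient".split()
generated = "what is the most likely diagnosis".split()
bleu = sentence_bleu([reference], generated,
                     smoothing_function=SmoothingFunction().method1)

# Two experts label each generated MCQ; cohen_kappa_score is unweighted
# by default, matching the 'unweighted kappa' metric listed above.
rater_a = ["keep", "keep", "revise", "reject", "keep", "revise"]
rater_b = ["keep", "revise", "revise", "reject", "keep", "keep"]
kappa = cohen_kappa_score(rater_a, rater_b)

# 1-5 validity ratings for items from three hypothetical generators.
model_1 = [3, 4, 4, 5, 3]
model_2 = [2, 3, 3, 4, 2]
model_3 = [4, 4, 5, 5, 4]
stat, p_value = kruskal(model_1, model_2, model_3)

print(f"BLEU={bleu:.2f}  kappa={kappa:.2f}  Kruskal-Wallis p={p_value:.3f}")
```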
8. Comparative Analysis and Evaluation of Automatic Case-Based MCQ Generation Applications
9. Practical Implementation and Educational Impact
10. Research Gaps and Limitations
11. Potential Areas for Further Investigation (Future Research Directions)
11.1. Development of Data Sources and Interoperability
11.2. AI–Human Content Collaboration
11.3. Adaptive Learning Systems and Personalization
11.4. Research and Collaborative Development
11.5. NLP Developments
12. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Kumar, A.P.; Nayak, A.; Chaitanya, M.S.K.; Ghosh, K. A Novel Framework for the Generation of Multiple Choice Question Stems Using Semantic and Machine-Learning Techniques. Int. J. Artif. Intell. Educ. 2023, 34, 332–375.
- Kurdi, G.; Leo, J.; Parsia, B.; Sattler, U.; Al-Emari, S. A Systematic Review of Automatic Question Generation for Educational Purposes. Int. J. Artif. Intell. Educ. 2020, 30, 121–204.
- Falcão, F.; Costa, P.; Pêgo, J.M. Feasibility assurance: A review of automatic item generation in medical assessment. Adv. Health Sci. Educ. 2022, 27, 405–425.
- Al Shuriaqi, S.; Aal Abdulsalam, A.; Masters, K. Generation of Medical Case-Based Multiple-Choice Questions. Int. Med. Educ. 2023, 3, 12–22.
- Cohen Aubart, F.; Lhote, R.; Hertig, A.; Noel, N.; Costedoat-Chalumeau, N.; Cariou, A.; Meyer, G.; Cymbalista, F.; De Prost, N.; Pottier, P.; et al. Progressive clinical case-based multiple-choice questions: An innovative way to evaluate and rank undergraduate medical students. Rev. Méd. Interne 2021, 42, 302–309.
- Leo, J.; Kurdi, G.; Matentzoglu, N.; Parsia, B.; Sattler, U.; Forge, S.; Donato, G.; Dowling, W. Ontology-Based Generation of Medical, Multi-term MCQs. Int. J. Artif. Intell. Educ. 2019, 29, 145–188.
- Bansal, A.; Dubey, A.; Singh, V.K.; Goswami, B.; Kaushik, S. Comparison of traditional essay questions versus case based modified essay questions in biochemistry. Biochem. Mol. Biol. Educ. 2023, 51, 494–498.
- Gartmeier, M.; Pfurtscheller, T.; Hapfelmeier, A.; Grünewald, M.; Häusler, J.; Seidel, T.; Berberat, P.O. Teacher questions and student responses in case-based learning: Outcomes of a video study in medical education. BMC Med. Educ. 2019, 19, 455.
- Basuki, S.; Rizky, A.; Wicaksono, G.W. Case Based Reasoning (CBR) for Medical Question Answering System. Kinet. Game Technol. Inf. Syst. Comput. Netw. Comput. Electron. Control 2018, 3, 113–118.
- Majumder, M.; Saha, S.K. A System for Generating Multiple Choice Questions: With a Novel Approach for Sentence Selection. In Proceedings of the 2nd Workshop on Natural Language Processing Techniques for Educational Applications, Beijing, China, 31 July 2015; Association for Computational Linguistics: Stroudsburg, PA, USA, 2015; pp. 64–72.
- Madri, V.R.; Meruva, S. A comprehensive review on MCQ generation from text. Multimed. Tools Appl. 2023, 82, 39415–39434.
- Moon, H.; Yang, Y.; Shin, J.; Yu, H.; Lee, S.; Jeong, M.; Park, J.; Kim, M.; Choi, S. Evaluating the Knowledge Dependency of Questions. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 10512–10526.
- Moore, S.; Costello, E.; Nguyen, H.A.; Stamper, J. An Automatic Question Usability Evaluation Toolkit. In Artificial Intelligence in Education; Olney, A.M., Chounta, I.-A., Liu, Z., Santos, O.C., Bittencourt, I.I., Eds.; Lecture Notes in Computer Science; Springer Nature: Cham, Switzerland, 2024; Volume 14830, pp. 31–46.
- Manoj, D.; Maria John, P. Natural language processing based question and answer generator. Int. Adv. Res. J. Sci. Eng. Technol. 2024, 11, 135–141.
- Dhanya, N.M.; Balaji, R.K.; Akash, S. AiXAM—AI assisted Online MCQ Generation Platform using Google T5 and Sense2Vec. In Proceedings of the 2022 Second International Conference on Artificial Intelligence and Smart Energy (ICAIS), Coimbatore, India, 23–25 February 2022; pp. 38–44.
- Maheen, F.; Asif, M.; Ahmad, H.; Ahmad, S.; Alturise, F.; Asiry, O.; Ghadi, Y.Y. Automatic computer science domain multiple-choice questions generation based on informative sentences. PeerJ Comput. Sci. 2022, 8, e1010.
- Paul, R.J.; Jamal, S.; Bejoy, S.; Daniel, R.J.; Aju, N. QGen: Automated Question Paper Generator. In Proceedings of the 2024 5th International Conference on Innovative Trends in Information Technology (ICITIIT), Kottayam, India, 15–16 March 2024; pp. 1–4.
- Ten Cate, O.; Custers, E.J.F.M.; Durning, S.J. (Eds.) Principles and Practice of Case-Based Clinical Reasoning Education; Innovation and Change in Professional Education; Springer International Publishing: Cham, Switzerland, 2018; Volume 15.
- Al-Rukban, M. Guidelines for the construction of multiple choice questions tests. J. Fam. Community Med. 2006, 13, 125.
- Freiwald, T.; Salimi, M.; Khaljani, E.; Harendza, S. Pattern recognition as a concept for multiple-choice questions in a national licensing exam. BMC Med. Educ. 2014, 14, 232.
- Family Medicine Modular Subject Exam—Content Outline. Available online: https://www.nbme.org/sites/default/files/2022-01/Family_Medicine_Sample_Items.pdf (accessed on 16 January 2024).
- Ashburner, M.; Ball, C.A.; Blake, J.A.; Botstein, D.; Butler, H.; Cherry, J.M.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene Ontology: Tool for the unification of biology. Nat. Genet. 2000, 25, 25–29.
- El-Sappagh, S.; Franda, F.; Ali, F.; Kwak, K.-S. SNOMED CT standard ontology based on the ontology for general medical science. BMC Med. Inform. Decis. Mak. 2018, 18, 76.
- Bernasconi, A.; Masseroli, M. Biological and Medical Ontologies: Human Phenotype Ontology (HPO). In Encyclopedia of Bioinformatics and Computational Biology; Ranganathan, S., Gribskov, M., Nakai, K., Schönbach, C., Eds.; Academic Press: Oxford, UK, 2019; pp. 848–857.
- Mulla, N.; Gharpure, P. Automatic question generation: A review of methodologies, datasets, evaluation metrics, and applications. Prog. Artif. Intell. 2023, 12, 1–32.
- Wang, W.; Hao, T.; Liu, W. Automatic Question Generation for Learning Evaluation in Medicine. In Advances in Web Based Learning—ICWL 2007; Leung, H., Li, F., Lau, R., Li, Q., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2008; Volume 4823, pp. 242–251.
- Ladas, N.; Borchert, F.; Franz, S.; Rehberg, A.; Strauch, N.; Sommer, K.K.; Marschollek, M.; Gietzelt, M. Programming techniques for improving rule readability for rule-based information extraction natural language processing pipelines of unstructured and semi-structured medical texts. Health Inform. J. 2023, 29, 146045822311646.
- Xue, X.; Wu, Q.; Ye, M.; Lv, J. Efficient Ontology Meta-Matching Based on Interpolation Model Assisted Evolutionary Algorithm. Mathematics 2022, 10, 3212.
- Das, R.; Ray, A.; Mondal, S.; Das, D. A rule based question generation framework to deal with simple and complex sentences. In Proceedings of the 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India, 21–24 September 2016; pp. 542–548.
- Rao, P.R.; Jhawar, T.N.; Kachave, Y.A.; Hirlekar, V. Generating QA from Rule-based Algorithms. In Proceedings of the 2022 International Conference on Electronics and Renewable Systems (ICEARS), Tuticorin, India, 16–18 March 2022; pp. 1697–1703.
- Zhang, R.; Guo, J.; Chen, L.; Fan, Y.; Cheng, X. A Review on Question Generation from Natural Language Text. ACM Trans. Inf. Syst. 2022, 40, 1–43.
- Patil, P.M.; Bhavsar, R.P.; Pawar, B.V. A Review on Natural Language Processing based Automatic Question Generation. In Proceedings of the 2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS), Trichy, India, 24–26 November 2022; pp. 1–6.
- Mehta, P.K.; Jain, P.; Makwana, C.; Raut, D.C.M. Automated MCQ Generator using Natural Language Processing. Int. Res. J. Eng. Technol. 2021, 8, 2705–2710.
- Karamanis, N.; Ha, L.A.; Mitkov, R. Generating Multiple-Choice Test Items from Medical Text: A Pilot Study. In Proceedings of the Fourth International Natural Language Generation Conference, Sydney, Australia, 15–16 July 2006; Association for Computational Linguistics: Stroudsburg, PA, USA, 2006; pp. 111–113.
- Mitkov, R.; An Ha, L.; Karamanis, N. A computer-aided environment for generating multiple-choice test items. Nat. Lang. Eng. 2006, 12, 177–194.
- Gierl, M.J.; Lai, H.; Turner, S.R. Using automatic item generation to create multiple-choice test items. Med. Educ. 2012, 46, 757–765.
- Khodeir, N.; Wanas, N.; Darwish, N.; Hegazy, N. Bayesian based adaptive question generation technique. J. Electr. Syst. Inf. Technol. 2014, 1, 10–16.
- Mendonça, M.O.K.; Netto, S.L.; Diniz, P.S.R.; Theodoridis, S. Chapter 13—Machine learning: Review and trends. In Signal Processing and Machine Learning Theory; Diniz, P.S.R., Ed.; Academic Press: Cambridge, MA, USA, 2024; pp. 869–959.
- Ono, S.; Goto, T. Introduction to supervised machine learning in clinical epidemiology. Ann. Clin. Epidemiol. 2022, 4, 63–71.
- Uddin, S.; Khan, A.; Hossain, M.E.; Moni, M.A. Comparing different supervised machine learning algorithms for disease prediction. BMC Med. Inform. Decis. Mak. 2019, 19, 281.
- Swe, T.T. Analysis of Tree Based Supervised Learning Algorithms on Medical Data. Int. J. Sci. Res. Publ. 2019, 9, p8817.
- Mondal, N.; Lohia, M. Supervised Text Classification using Text Search. arXiv 2020, arXiv:2011.13832.
- Ahmadi, S.A.; Mehrshad, N.; Razavi, S.M. Supervised feature extraction method based on low-rank representation with preserving local pairwise constraints for hyperspectral images. Signal Image Video Process. 2019, 13, 583–590.
- Yuan, X.; Wang, T.; Gulcehre, C.; Sordoni, A.; Bachman, P.; Zhang, S.; Subramanian, S.; Trischler, A. Machine Comprehension by Text-to-Text Neural Question Generation. In Proceedings of the 2nd Workshop on Representation Learning for NLP, Vancouver, BC, Canada, 3 August 2017; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 15–25.
- Talukdar, J.; Singh, T.P.; Barman, B. Unsupervised Learning. In Artificial Intelligence in Healthcare Industry; Talukdar, J., Singh, T.P., Barman, B., Eds.; Springer Nature: Singapore, 2023; pp. 87–107.
- Afzal, N.; Mitkov, R. Automatic generation of multiple choice questions using dependency-based semantic relations. Soft Comput. 2014, 18, 1269–1281.
- Yousefpour, A.; Shishehbor, M.; Foumani, Z.Z.; Bostanabad, R. Unsupervised Anomaly Detection via Nonlinear Manifold Learning. arXiv 2023, arXiv:2306.09441.
- Shen, S.; Li, Y.; Du, N.; Wu, X.; Xie, Y.; Ge, S.; Yang, T.; Wang, K.; Liang, X.; Fan, W. On the Generation of Medical Question-Answer Pairs. arXiv 2019, arXiv:1811.00681.
- Shen, F.; Lee, Y. MedTQ: Dynamic Topic Discovery and Query Generation for Medical Ontologies. arXiv 2018, arXiv:1802.03855.
- Bas, A.; Topal, M.O.; Duman, C.; Van Heerden, I. A Brief History of Deep Learning-Based Text Generation. In Proceedings of the 2022 International Conference on Computer and Applications (ICCA), Cairo, Egypt, 20–22 December 2022; pp. 1–4.
- Hu, Y.; Han, G.; Liu, X.; Li, H.; Xing, L.; Gu, Y.; Zhou, Z.; Li, H. Design and Implementation of a Medical Question and Answer System Based on Deep Learning. Math. Probl. Eng. 2022, 2022, 1–6.
- Zou, H. AIADA: Accuracy Impact Assessment of Deprecated Python API Usages on Deep Learning Models. J. Softw. 2022, 17, 269–281.
- Reddy, S.; Raghu, D.; Khapra, M.M.; Joshi, S. Generating Natural Language Question-Answer Pairs from a Knowledge Graph Using a RNN Based Question Generation Model. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Valencia, Spain, 3–7 April 2017; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 376–385.
- Mitra, N.K.; Chitra, E. Glimpses of the Use of Generative AI and ChatGPT in Medical Education. Educ. Med. J. 2024, 16, 155–164.
- He, X.; Nassar, I.; Kiros, J.; Haffari, G.; Norouzi, M. Generate, Annotate, and Learn: NLP with Synthetic Text. Trans. Assoc. Comput. Linguist. 2022, 10, 826–842.
- Biswas, D.; Nadipalli, S.; Sneha, B.; Gupta, D.; Amudha, J. Natural Question Generation using Transformers and Reinforcement Learning. In Proceedings of the 2022 OITS International Conference on Information Technology (OCIT), Bhubaneswar, India, 14–16 December 2022; pp. 283–288.
- Ferrando, J.; Gállego, G.I.; Tsiamas, I.; Costa-jussà, M.R. Explaining How Transformers Use Context to Build Predictions. arXiv 2023, arXiv:2305.12535.
- Kıyak, Y.S. A ChatGPT Prompt for Writing Case-Based Multiple-Choice Questions. Rev. Esp. Educ. Méd. 2023, 4, 98–103.
- Nemani, P.; Vollala, S. A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A Transformer-based Approach. In Proceedings of the 2022 IEEE 19th India Council International Conference (INDICON), Kochi, India, 24–26 November 2022; pp. 1–6.
- Yunjiu, L.; Wei, W.; Zheng, Y. Artificial Intelligence-Generated and Human Expert-Designed Vocabulary Tests: A Comparative Study. SAGE Open 2022, 12, 215824402210821.
- Tay, Y.; Bahri, D.; Metzler, D.; Juan, D.-C.; Zhao, Z.; Zheng, C. Synthesizer: Rethinking Self-Attention in Transformer Models. arXiv 2021, arXiv:2005.00743.
- Miller, K. Comprehension of Contextual Semantics Across Clinical Healthcare Domains. In Proceedings of the 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI), Rochester, MN, USA, 11–14 June 2022; pp. 479–480.
- Chandraju, A.V.; Gnanasigamani, L.J. Transformer-Based Abstract Generation of Medical Case Reports. Int. J. Eng. Adv. Technol. 2022, 12, 110–113.
- Rodriguez-Torrealba, R.; Garcia-Lopez, E.; Garcia-Cabot, A. End-to-End generation of Multiple-Choice questions using Text-to-Text transfer Transformer models. Expert Syst. Appl. 2022, 208, 118258.
- Rao, M.C.; Sreedhar, P.; Bhanurangarao, M.; Sujatha, G. Automatic Multiple-Choice Question and Answer (MCQA) Generation Using Deep Learning Model. In Proceedings of the 2nd International Conference on Cognitive and Intelligent Computing, Hyderabad, India, 27–28 December 2022; Kumar, A., Ghinea, G., Merugu, S., Eds.; Springer Nature: Singapore, 2023; pp. 1–8.
- Berman, J.; McCoy, L.; Camarata, T. LLM-Generated Multiple Choice Practice Quizzes for Pre-Clinical Medical Students; Use and Validity. Physiology 2024, 39, 376.
- Moradi, M.; Samwald, M. Improving the robustness and accuracy of biomedical language models through adversarial training. J. Biomed. Inform. 2022, 132, 104114.
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 2022, 54, 1–35.
- Denecke, K.; May, R.; Rivera-Romero, O. Transformer Models in Healthcare: A Survey and Thematic Analysis of Potentials, Shortcomings and Risks. J. Med. Syst. 2024, 48, 23.
- Cheung, B.H.H.; Lau, G.K.K.; Wong, G.T.C.; Lee, E.Y.P.; Kulkarni, D.; Seow, C.S.; Wong, R.; Co, M.T.H. ChatGPT Versus Human in Generating Medical Graduate Exam Questions—An International Prospective Study; Medical Education: Tokyo, Japan, 2023.
- Agarwal, M.; Sharma, P.; Goswami, A. Analysing the Applicability of ChatGPT, Bard, and Bing to Generate Reasoning-Based Multiple-Choice Questions in Medical Physiology. Cureus 2023, 15, e40977.
- Huang, K.; Ji, F.; Lu, W.; Xiao, Y. Research on Text Generation of Medical Intelligent Question and Answer Based on Bi-LSTM and Neural Network Technology. In Proceedings of the 2022 IEEE/ACIS 22nd International Conference on Computer and Information Science (ICIS), Zhuhai, China, 26–28 June 2022; pp. 54–59.
- Sileo, D.; Uma, K.; Moens, M.-F. Generating Multiple-Choice Questions for Medical Question Answering with Distractors and Cue-Masking. arXiv 2023, arXiv:2303.07069.
- Sykes, B.; Simon, L.; Rabin, J. Unifying and Extending Precision Recall Metrics for Assessing Generative Models. arXiv 2024, arXiv:2405.01611.
- Embretson, S.E.; Reise, S.P. Item Response Theory for Psychologists; Lawrence Erlbaum Associates Publishers: Mahwah, NJ, USA, 2000; p. xi, 371.
- Isnawati, I.; Sriyati, S.; Agustin, R.R.; Supriyadi, S.; Kasi, Y.F.; Ismail, I. Analysis of Question Difficulty Levels Based on Science Process Skills Indicators Using the Rasch Model. Tadris J. Kegur. Dan Ilmu Tarb. 2024, 9, 31.
- Demaidi, M.N.; Gaber, M.M.; Filer, N. Evaluating the quality of the ontology-based auto-generated questions. Smart Learn. Environ. 2017, 4, 7.
- Rezigalla, A.A. AI in medical education: Uses of AI in construction type A MCQs. BMC Med. Educ. 2024, 24, 247.
- Alqahtani, S. Multiple choice questions as a tool for summative assessment in medical schools. Bull. Egypt. Soc. Physiol. Sci. 2024, 44, 29–38.
- Mahjabeen, W.; Alam, S.; Hassan, U.; Zafar, T.; Butt, R.; Konain, S.; Rizvi, M. Difficulty Index, Discrimination Index and Distractor Efficiency in Multiple Choice Questions. Ann. PIMS 2017, 13, 310–315.
- Kurdi, G.; Parsia, B.; Sattler, U. An Experimental Evaluation of Automatically Generated Multiple Choice Questions from Ontologies. In OWL: Experiences and Directions—Reasoner Evaluation; Dragoni, M., Poveda-Villalón, M., Jimenez-Ruiz, E., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 24–39.
- Cooper, B.; Foy, J.M. Guessing in Multiple-choice Tests. Med. Educ. 1967, 1, 212–215.
- May, K. Book Review: Fundamentals of Item Response Theory, by Ronald K. Hambleton, H. Swaminathan, and H. Jane Rogers. Newbury Park, CA: Sage, 1991, 174 pp. Appl. Psychol. Meas. 1993, 17, 293–294.
- Rai, N.; Rai, N. Multiple choice questions: As formative assessment. Int. J. Med. Biomed. Stud. 2019, 3, 75–79.
- Das, B.; Majumder, M.; Phadikar, S.; Sekh, A.A. Automatic question generation and answer assessment: A survey. Res. Pract. Technol. Enhanc. Learn. 2021, 16, 5.
- Shaheer, S.; Hossain, I.; Sarna, S.N.; Kabir Mehedi, M.H.; Rasel, A.A. Evaluating Question generation models using QA systems and Semantic Textual Similarity. In Proceedings of the 2023 IEEE 13th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–11 March 2023; pp. 431–435.
- Sellam, T.; Das, D.; Parikh, A.P. BLEURT: Learning Robust Metrics for Text Generation. arXiv 2020, arXiv:2004.04696.
- Mishra, S.; Nitika. Understanding the calculation of the kappa statistic: A measure of inter-observer reliability. Int. J. Acad. Med. 2016, 2, 217.
- Bobbitt, Z. Kruskal-Wallis Test: Definition, Formula, and Example. Available online: https://www.statology.org/kruskal-wallis-test/ (accessed on 16 January 2024).
- Kıyak, Y.S.; Kononowicz, A.A. Case-based MCQ generator: A custom ChatGPT based on published prompts in the literature for automatic item generation. Med. Teach. 2024, 46, 1018–1020.
- Moore, S.; Schmucker, R.; Mitchell, T.; Stamper, J. Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions. In Proceedings of the Eleventh ACM Conference on Learning @ Scale, Atlanta, GA, USA, 18–20 July 2024; ACM: New York, NY, USA, 2024; pp. 122–133.
- Indran, I.R.; Paranthaman, P.; Gupta, N.; Mustafa, N. Twelve tips to leverage AI for efficient and effective medical question generation: A guide for educators using Chat GPT. Med. Teach. 2024, 46, 1021–1026.
- Kıyak, Y.S.; Emekli, E. ChatGPT prompts for generating multiple-choice questions in medical education and evidence on their validity: A literature review. Postgrad. Med. J. 2024, qgae065.
- Murphy Lonergan, R.; Curry, J.; Dhas, K.; Simmons, B.I. Stratified Evaluation of GPT’s Question Answering in Surgery Reveals Artificial Intelligence (AI) Knowledge Gaps. Cureus 2023, 15, e48788.
- Abdallah, A.; Kasem, M.; Hamada, M.A.; Sdeek, S. Automated Question-Answer Medical Model based on Deep Learning Technology. In Proceedings of the 6th International Conference on Engineering & MIS 2020, Almaty, Kazakhstan, 14–16 September 2020; ACM: New York, NY, USA, 2020; pp. 1–8.
- Ahamed, S.H.; Reddy, K.R.K.; Shoba, L.K. Enhancing Education with NLP-through AI-Enhanced Q&A Evaluation and Testing using Leveraging algorithms. In Proceedings of the 2024 International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI), Chennai, India, 9–10 May 2024; pp. 1–7.
- MacLeod, A.; Luong, V.; Cameron, P.; Burm, S.; Field, S.; Kits, O.; Miller, S.; Stewart, W.A. Case-Informed Learning in Medical Education: A Call for Ontological Fidelity. Perspect. Med. Educ. 2023, 2, 120–128.
- Pugh, D.; De Champlain, A.; Gierl, M.; Lai, H.; Touchie, C. Can automated item generation be used to develop high quality MCQs that assess application of knowledge? Res. Pract. Technol. Enhanc. Learn. 2020, 15, 12.
| Case-Based MCQ Example | Constituent |
|---|---|
| “A 23-year-old man comes to the physician because of a 1-week history of painful urination and a clear urethral discharge. One month ago, he had similar symptoms and completed a course of doxycycline therapy for a chlamydial infection. He has no previous history of sexually transmitted diseases. He has been sexually active with one female partner for 2 years, and she takes an oral contraceptive. The examination shows no abnormalities. A urine polymerase chain reaction test is positive for Chlamydia trachomatis. Which of the following is the most likely explanation for this patient’s current infection?” [21] | Stem |
|  | Distractor 1 |
|  | Distractor 2 |
|  | Distractor 3 |
|  | Answer (or key) |
|  | Distractor 4 |
| Ontology Design | Example in Medical Domain |
|---|---|
| Entities | Patient, Symptoms, Diseases, Treatments, Tests, etc. |
| Properties | ‘hasSymptom’, ‘hasTestResult’, ‘givenTreatment’, etc. |
| Rules | These are the critical aspects of the rule-based ontology. For instance, a rule could be: “If a patient hasSymptom ‘X’ and hasTestResult ‘Y’, then they might haveDisease ‘Z’.” |

| Worked Example (Rule-Based Generation) | |
|---|---|
| Ontology Rule | “If a patient hasSymptom ‘fever’ and ‘rash’, they might haveDisease ‘measles’.” |
| MCQ Generation | Scenario: “A 5-year-old patient presents with a 3-day history of fever followed by a rash that started on the face and spread downwards.” |
| Stem (Question) | “What is the most likely diagnosis?” |
| Options | (A) Common cold (B) Measles (correct) (C) Chickenpox (D) Rosacea |
| Ontology Design | Example in Medical Domain |
|---|---|
| Entities | Diseases, symptoms, treatments, etc. |
| Attributes | Characteristics of the diseases like onset time, severity, etc. |
| Relations | Links between entities, like a disease causing certain symptoms. |

| Worked Example (Template-Based Generation) | |
|---|---|
| MCQ Template | “A patient presents with [Symptom1], [Symptom2], and [Symptom3]. What is the most likely diagnosis?” |
| Entities | Diseases, e.g., common cold, influenza, allergies |
| Attributes | Common Cold: sneezing, runny nose, mild fever. Influenza: high fever, muscle aches, fatigue. Allergies: sneezing, itchy eyes, runny nose. |
| MCQ Generation | “A patient presents with sneezing, runny nose, and mild fever. What is the most likely diagnosis?” |
| Correct Answer | Common Cold |
| Distractors (wrong choices) | Influenza, Allergies |
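To make the template-based workflow above concrete, here is a minimal sketch that fills the stem template from a small hand-written disease-to-symptom map mirroring the table; the names and structure are illustrative assumptions, not taken from any cited system:

```python
# Sketch of template-based MCQ generation: a fixed stem template is filled
# from a disease->symptoms map, and the remaining diseases become distractors.
TEMPLATE = ("A patient presents with {0}, {1}, and {2}. "
            "What is the most likely diagnosis?")

KNOWLEDGE = {
    "Common Cold": ["sneezing", "runny nose", "mild fever"],
    "Influenza":   ["high fever", "muscle aches", "fatigue"],
    "Allergies":   ["sneezing", "itchy eyes", "runny nose"],
}

def fill_template(disease: str) -> dict:
    symptoms = KNOWLEDGE[disease]
    return {
        "stem": TEMPLATE.format(*symptoms),
        "key": disease,
        "distractors": [d for d in KNOWLEDGE if d != disease],
    }

mcq = fill_template("Common Cold")
print(mcq["stem"])                       # filled stem
print(mcq["key"], mcq["distractors"])    # 'Common Cold' ['Influenza', 'Allergies']
```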
| Pipeline Step | Example |
|---|---|
| Case Report | “A 45-year-old male patient presents with fever, cough, and difficulty breathing. Chest X-ray revealed bilateral pneumonia. The patient traveled recently to a region with a high number of COVID-19 cases.” |
| Data Preprocessing |  |
| Information Extraction |  |
| MCQ Template Creation | “A [age] year old [gender] with a history of [history] presents with [symptom]. What is the most likely diagnosis?” |
| Stem Generation | “A 45-year-old male with a history of traveling to a high-risk COVID-19 region presents with fever, cough, and difficulty breathing. What is the most likely diagnosis?” |
| Distractor Generation |  |
| Generated MCQ | “A 45-year-old male with a history of traveling to a high-risk COVID-19 region presents with fever, cough, and difficulty breathing. What is the most likely diagnosis?” |
| Study Reference | Techniques Used | Performance | Dataset Description |
|---|---|---|---|
| Karamanis et al. (2006) [34] | Rule-based approaches | The average time taken per MCTI was around 3 min, which is significantly faster than manual production estimates by experts. | Text and the Unified Medical Language System (UMLS) |
| Wang et al. (2008) [26] | Template-based approaches | Experiments conducted on 100 medical articles using 23 question templates on headache aspects showed 88 accurate questions generated, with 83 correctly answered. Mistakes in question generation were mainly due to insufficiently defined entries and keywords in templates. | A mix of text on diseases, symptoms, causes, therapies, medicines, and devices. |
| Gierl et al. (2012) [36] | Template-based approaches | The AIG process generated 1248 multiple-choice items for diagnosing complications with postoperative fever. The 1248 items were produced in a total of 6 h across three stages: Stage 1 (3 h), Stage 2 (2 h), and Stage 3 (1 h). | Medical case-based questions. |
| Khodeir et al. (2014) [37] | Rule-based approaches | There is a significant improvement in the approximation accuracy of the student model: its ability to estimate or predict outcomes or behaviors is enhanced by 40%. Additionally, the paper reports a 35% reduction in the number of assessing questions needed when adapted generated questions are utilized. | Medical case-based diagnostic questions. |
| Leo et al. (2019) [6] | Template-based approach | The study generated 3,407,493 questions using an approach implemented by EMCQG. | Clinical dataset from EMMeT |
| Study | Technique Used | Dataset | Performance | Evaluation Metrics Used | Key Findings |
|---|---|---|---|---|---|
| Leo et al. (2019) [6] | Ontology-based approach | Clinical dataset from EMMeT | Application generated over 3 million questions across four physician specialties and conducted a user study involving 15 medical experts to evaluate the approach. The evaluation revealed that 129 questions (30%) were deemed appropriate for exam use by both experts, while an additional 216 questions (50%) were considered suitable by at least one expert. | Unweighted Kappa, Feedback analysis, Difficulty Score | The key findings of the study show that the ontology-based approach is effective in generating high-quality, complex MCQs that are suitable for medical education and assessment. |
| Huang et al. (2022) [72] | Bi-LSTM (Bidirectional Long Short-Term Memory), neural network technology | The dataset comprises queries and answers related to double eyelid surgery | This overview indicates that the application’s performance is assessed through its optimization strategy, accuracy, loss rate, and user interaction capabilities, aiming to provide effective and reliable medical information. | Accuracy, Loss rate | The study does not explicitly detail the key findings or the exact figures for model accuracy and loss rate, but it emphasizes the importance of these metrics in evaluating the model’s performance. The use of a specific medical dataset suggests a focused approach to improving AI-driven medical consultations. |
| Cheung et al. (2023) [70] | LLM (ChatGPT) | Two standard undergraduate medical textbooks (Harrison’s, and Bailey & Love’s) | ChatGPT was able to produce 50 questions in 20 min and 25 s, significantly faster than the 211 min and 33 s required by human examiners for the same number of questions. However, in the relevance domain, AI-generated questions scored slightly lower than human-generated ones. | Appropriateness, clarity and specificity, relevance | The study found no significant difference in the overall quality of questions between those generated by AI and humans, except in the relevance domain where AI was slightly inferior. AI-generated questions showed a wider range of scores, indicating variability in quality compared to the more consistent human-generated questions. |
| Y. S. Kıyak (2023) [58] | LLM (ChatGPT) | Not specified | The ChatGPT prompt introduced in the paper can generate a large number of high-quality case-based multiple-choice questions (MCQs) quickly, which significantly reduces the effort required by subject matter experts in medical education. | Relevance, Feedback analysis | The paper finds that the ChatGPT prompt can generate a large number of high-quality case-based MCQs efficiently, significantly reducing the effort required by subject matter experts. |
| Agarwal et al. (2023) [71] | LLM (ChatGPT, Bard, and Bing) | Large datasets of text from the internet | ChatGPT generated 110 MCQs, Bard generated 110 MCQs, and Bing generated 100 MCQs, as it failed to generate questions for two competencies. | Kruskal–Wallis Test, Unweighted Kappa | ChatGPT produced the most valid MCQs with a median validity score of 3, while Bard and Bing had slightly lower validity scores, indicating that ChatGPT’s questions were more accurate and relevant. ChatGPT’s questions were the least difficult, with a median difficulty score of 1, whereas Bard and Bing had slightly higher difficulty scores. The reasoning ability required to answer the MCQs was rated similarly across all three AI models, with no significant difference, indicating that none of the models could generate questions that required a high level of subject understanding. |
| Kiyak et al. (2024) [90] | LLM (ChatGPT) | Prompts published in medical education literature; source not specified | The performance of the case-based MCQ generator is enhanced by its ability to produce contextually relevant and high-quality MCQs efficiently, surpassing the capabilities of the standard ChatGPT. | Relevance, Efficiency | The custom ChatGPT significantly streamlines the MCQ creation process by eliminating the need for manual prompt input, making it easier for medical educators to generate questions. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).