1. Introduction
As the development of artificial intelligence (AI) accelerates, both its real and expected impacts on society inevitably increase. This effect is not limited to the economy, where the impact may reach the scale of USD 13 trillion by 2030, but extends to fields such as technology, culture, and politics [1]. The release of ChatGPT by OpenAI is no exception to this trend: it is already disrupting multiple fields of society, and people anticipate its effects with a mixture of optimism and apprehension. Among the affected fields, education may incur one of the biggest impacts, partly because it is founded on learning and teaching “knowledge” and partly because it relies on text-based methods of assessment and expression; both elements are more or less called into question by the rise of ChatGPT and related AI technologies. There is, therefore, an increasing need to address questions like “What is the nature of education?”, “How will AI affect higher education and the learning uptake of students?”, and “How can standard frameworks of learning, such as Dewey’s Reflective-Thought-and-Action model and Bloom’s taxonomy, be used to evaluate AI’s impact for better regulation?”. This opinion paper analyzes the state of ChatGPT and its expected impact on higher education through the lens of educational theories and proposes possible evaluation criteria for the optimal usage of ChatGPT in education. It thus comprises a comprehensive review of the existing literature on ChatGPT and an analysis of Dewey’s and Bloom’s widely accepted models of learning, and it presents findings in terms of expected consequences and strategy proposals for instructors and administration.
1.1. Functions of ChatGPT
ChatGPT is one of a series of generative pre-trained transformers and was developed and released in November 2022 by OpenAI. Typically categorized as a large language model (LLM) that utilizes deep learning to generate human-like text, it is fine-tuned from its predecessors GPT-3 and GPT-3.5 (the former having 175 billion parameters and the latter 6.7 billion) [2]. While natural language processing (NLP) systems of this kind used to require a large corpus of texts and laborious supervised “data-labeling” processes to be trained, ChatGPT has inherited from GPT-3 the ability to learn from any text without specific training and is capable of rapid improvement and adjustment of information [3,4]. Yet, it also differs from its predecessors in its wider range of available information and its improved capabilities for fine-tuning to specific languages/tasks [5]. Hence, as an NLP system, ChatGPT and related AI technologies will most likely revolutionize the educational landscape in terms of learning, teaching, and assessment, whether the consequence is an enhancement or a stagnation of the knowledge creation process.
1.2. ChatGPT in Education: Why Research Is Necessary
Taking into account the various reactions against ChatGPT, the higher education sphere can be described as “confused” at best, with optimistic futurism on the one hand and pessimistic Luddism on the other. Setting aside the debate for a moment, what must be recognized here is that the influence made by ChatGPT and related technologies will persist in any field virtually indefinitely once it is introduced. Updating and improving ChatGPT, and by extension, the fields in which it is embedded can be expected to continue at an exponential pace considering the demand for further efficiency and productivity in modern society. However, the field of education is unique in that, unlike other fields, it intrinsically changes the future of our species by virtue of shaping what our future generations learn. More specifically, how human and material resources are used, along with everything that will be generated, constructed, and preserved by the educated, are what determine the trajectories of progress in every other field [
6,
7,
8]. The facts that education serves as the foundation of society and that education is one of the fields that is expected to be and is already largely disturbed by the introduction of these technologies suggest that there is a need to analyze the highly advanced AIs such as ChatGPT more deeply as soon as possible.
Hence, the optimal usage of ChatGPT in education must follow a measure that functions as a guideline of how these technologies can be used to maximize their potential in enhancing the learning and teaching process while also preventing or at least mitigating its negative influence as it is just a matter of time that these technologies will be fully incorporated into society. As mentioned above, there is no turning back once ChatGPT and related technologies alter the environment. Regarding the higher educational field and its fundamental contribution to the knowledge generation process in society, it implies that if these technologies are not managed with necessary levels of caution in their educational use, there is even a risk of compromising the entire cohort of human resources and knowledge, which is a consequence that no being could take responsibility for. Although the actual outcome may not be the extreme case of loss/stagnation of the knowledge creation process, understanding the potential impact of ChatGPT and related AI technologies in the educational field is nonetheless essential for their optimal utilization. Referring to Frantz Rowe’s comment on ChatGPT, it is possible that “in this Human-Computer Interaction oriented towards learning, neither agent does learn (…) the more we use transformers like ChatGPT, the more they are likely to get close to providing a correct answer. Conversely, the more users may lose their ability to reflect and discern alternatives and write in an original way” (p. 36) [
9]. Therefore, it cannot be stressed enough that these tools must be optimally regulated to enhance their gains or mitigate their losses in education, which may require the formation of reliable evaluative criteria.
2. Literature Review and New Perspective
2.1. Positive Expectations
2.1.1. Personalized Learning
Starting from the positive expectation for ChatGPT in the existing literature, one of the main elements frequently mentioned is its capability of enhancing the overall quality of learning through the personalization of students’ learning. For instance, while higher education providers already routinely incorporate Universal Design for Learning (UDL) principles, including multiple modes of engagement, representation, and action and expression [
10], it is generally expected that ChatGPT will enrich the learning process by providing the basic information that students need without going through the process of asking the teachers/professors or doing comprehensive research on the background [
11]. This, in turn, is expected to make the students’ learning more targeted and customized to their own abilities as ChatGPT removes the necessity of asking inessential questions, ultimately resulting in better motivation of the students towards learning. This moves the focus of student learning away from traditional, albeit still useful, forms of information uptake to more organic forms of self-reflection and meta-cognition. The key feature of ChatGPT that will contribute to this enhancement is its capability of fine-tuning; as ChatGPT has a wide coverage of knowledge, it is theoretically possible to construct a Personalized Adaptive Learning (PAL) system for any field of knowledge for any kind of student [
12]. Because these benefits apply not only to students’ learning processes but also to lecturers’ teaching processes, ChatGPT is expected to enhance learning from both sides of the interaction.
2.1.2. Enhanced Practicality and Efficiency
In addition to personalizing students’ education, it is also pointed out that ChatGPT and related technologies will shift the format and mode of their learning. While conventional education often relies on theory-based materials (e.g., knowledge taught on text), the introduction of ChatGPT into education would allow students to outsource certain knowledge depending on their needs/wants and focus more on “hands-on” learning that will give them direct experience in their field of interest [
4]. Essentially, ChatGPT and related technologies are expected to reduce the more tedious learning processes, such as “cramming in” information that students would use only for their exams and would not retain for long. This would also incentivize student-centered pedagogies, in which students identify what they need to learn to achieve their goals and teachers support them more effectively with the assistance of ChatGPT’s personalization mentioned above. As this paper focuses on higher education, this expectation for ChatGPT could be seen as a milestone for implementing student-centered learning in an environment with an adequate foundation for students to think on their own and sufficient avenues for exploration and practical learning.
2.1.3. Assessment Reforms
Building on the above, ChatGPT and related technologies may also influence the method of assessment as a byproduct of enhancing learning and teaching. This is expected to be achieved mainly through the de-prioritization of “obsolete” assessments such as traditional essays and cram-in exams, where ChatGPT’s impacts have been assessed to be most drastic and radical [
12]. An argument for adopting ChatGPT as an innovator for assessment techniques is this: if students can achieve an A by using ChatGPT (not considering the problems of plagiarism), then they are not learning anything valuable in the first place. While one may fault the students for using ChatGPT, the fact that the class could be passed using ChatGPT nonetheless undermines the contents, value, and meaning of the class itself, revealing an underlying defect in traditional assessment: it does not accurately capture levels of student learning. Of course, these allegedly “obsolete” assessments may still prove to be useful in the future for validating basic knowledge retention and regurgitation, especially since the UDL principles emphasize inclusivity and thus do not warrant a complete jettisoning of traditional assessment formats. Still, the transformation of these assessments is expected to take place at a more advanced level; that is, future ChatGPT-conscious assessments will focus instead on heightened instructor-student interaction with methods like real-time proctoring, AI-proof assessment, AI-complementary assessment, and so on. Many of these new tools are discussed later in our strategy recommendations. Hence, from the positive point of view, ChatGPT is seen as a reformer of traditional education that enables innovation, allowing the field to focus more on acquiring higher-order skills and creating new values.
2.2. Negative Expectations
2.2.1. Overreliance
While the above-mentioned works of literature claim the future of education to be brightly lit by ChatGPT, some expectations point out the possible obstacles to introducing ChatGPT into the educational field. To begin with, the concern for the deterioration of lower-order skills must be mentioned. The ability of ChatGPT may allow outsourcing of the cumbersome process of going through basic knowledge, but this also is a double-edged sword as students who have not acquired the necessary skills could become over-reliant on these tools, failing to learn. This problem is further magnified by the lack of consistency in the information provided by ChatGPT (whether due to the probabilistic nature of GPTs or the bias in the trained data set), where students without proper user ability, who often are the ones that become over-reliant, would suffer more loss [
13]. The issue with ChatGPT is that it does not always provide “right” answers; the closest analogy might be “a smart friend” who is “knowledgeable” but “not sure of where they gained the information” yet “is willing to tell” about it. Viewed from this perspective, the problem with introducing ChatGPT and related technologies into education with simple-minded optimism may be easier to comprehend.
2.2.2. Loss of Creative Thinking
Moreover, the negative disruption is not only expected to happen in lower-order skills but could also influence higher-order skills such as creativity, leading to a diminution of “originality” and the importance of knowledge itself. The concern here is that if every user relies on ChatGPT to learn, then the outcome of education will inevitably be shared (i.e., the same) with all other users at a certain point in time [
9]. This indicates that in the long run, there is a risk of eradicating the generation of new knowledge, hence critically undermining the raison d’être of education. Of course, this may not be critical if students (or users in general) could retain their creativity and treat the information provided as a mere sample or a possible guideline for new ideas and not as answers. However, Krügel et al. suggest that AI is capable of influencing human decisions to the same degree as human advisors, and that the majority of advised people do not notice that their thoughts are being influenced by AI [
14,
15]. In addition, most people do not seem to care about the legitimacy of the source of information that AIs are providing [
16,
17]. As ChatGPT is not an exception to this case, the combination of unconscious influence on ideas and the belief of “garbage in, gospel out” may prove critical in the field of education, even for users with higher-order skills.
2.2.3. Lack of Authenticity and/or Accuracy
In connection to the learning side of education, identification of authorship is also being put at risk due to these AI tools. As already discussed, ChatGPT generally generates answers based on the probability or “likeliness” of a certain term to appear next to the previously generated word/content. According to criticism by Noam Chomsky, ChatGPT is “incapable of distinguishing the possible from the impossible (…) systems can learn both that the earth is flat and that the earth is round. They trade merely in probabilities that change over time” [
18]; essentially, it does not “understand” what it is producing. This problematic feature is not only observed in the general content that ChatGPT generates but also is reflected in its irresponsibility in citations (which for humans is not “generation”, but for ChatGPT it is), where this “irresponsibility” even extends to an issue of source fabrication. Yet, the more serious problem may be that ChatGPT can be seen as an entity that “writes in their word”; under current logic, it is theoretically not impossible to reference ChatGPT itself as a source of information or even list it as an author/co-author of research [
19]. To this argument, major institutions (aside from many of those who are keeping their silence) such as Taylor & Francis, Springer Nature, Science, Elsevier, and ICML have announced that ChatGPT cannot be regarded as an author of research mostly due to its lack of both reliability and ability to take responsibility [
9]. However, in terms of reliability, it must be remembered that human authors make similar mistakes. The real barrier to referring to ChatGPT as an academic information source, according to Rowe again (p. 36), is that:
“it should be also transparent about how ideas have been derived and articulated from the literature and how the methods have been designed and used in the particular instance of [the] research. This notably requires that researchers cite their sources of inspiration both in order to demonstrate integrity and to facilitate further research through possible contestation. Something that is currently missing with ChatGPT. (…) If we lose the ability to identify the literature background, we lose the capacity to assess the value of the contribution”.
Hence, the current issue of ChatGPT regarding authorship seems intrinsically rooted in its inconsideration towards future research and learning; for education, this lack of long-run perspective is fatal. Yet focusing only on protecting academic authorship from ChatGPT and related technologies as a reaction may cause an infinite cat-and-mouse game of AI technology being used to review AI-influenced work ad infinitum, wasting resources and humans learning nothing in the end. Therefore, the issues with the contribution of ChatGPT to education must be analyzed comprehensively to understand all of the possible risks stated above and prevent expected disturbances.
2.3. New Perspectives and Application of Educational Theories
Considering the two sides of the argument, the following question must be asked again: Why is there a large debate about ChatGPT in education? If the concern is simply that it “disturbs” the current education system and that institutions/teachers/students may therefore not be able to keep up with the change, the argument for complete prohibition seems insufficient; revolutionary technologies in the past have met with more or less similar reactions, and with adequate regulation, ChatGPT and related technologies could bring drastic innovation to education. In addition, as ChatGPT is already out in the world, sooner or later, it will directly affect the education process due to its nature as a cutting-edge technology. Another concern may be its influence on educational “outcomes” where, for instance, ChatGPT and related AI tools could be used to cheat or substantially alter the result of students’ assessments [
20]. This perspective is more concrete compared to the argument made above, as it indirectly indicates the critical issue with the modern education system: the discrepancy between “learning” and “outcomes”. Students are incentivized to focus on the quantifiable result (e.g., grades), which is supposed to be an indicator of their acquired skills but is often not, rather than the learning itself [
21]. Hence, ChatGPT and related AI tools could make this situation worse, as they could reinforce this “result-focused” trend and may end up destroying knowledge/content generation. Yet this concern also seems to be missing the point, as the mere existence of tools such as ChatGPT does not necessarily lead to the act of cheating. Such inappropriate uses are induced by the environment rather than by the technology itself. Still, these concerns underline the importance of converting the current “result-focused” trend into a “learning-focused” one, as ChatGPT will only improve its abilities, while the environment constitutes a major element in determining how the students who will be using these AI tools understand “education”.
To optimize the usage of ChatGPT in education before the outcomes are catastrophic, it is not enough to come up with general expectations or reconstruct the meaning of “education”; the former is “too general”, and the latter requires time. Statements such as focusing more on “critical thinking” and “creativity” in the times of education with AI assistance may point out the overall flow, but they do not provide any process (the “how”) for achieving that goal. Ideas such as the importance of recognizing digital literacy, data literacy, and AI literacy [
22] provide concrete goals, but these concepts do not cover the entire impact that ChatGPT would have in the educational field. The aforementioned reconstruction of what “education” means to students, teachers, and society may more or less mitigate the influence of ChatGPT if the trend can successfully be shifted to a “learning-focused” one, but the effects would only be observed in the long run. Thus, education, in the face of rising AI tools, could be seen as lacking guidelines that can be used immediately and flexibly. Taking that into account, this paper aims to provide such guidelines in the format of evaluative criteria that use relevant major educational theories: John Dewey’s Reflective-Thought-and-Action model (which will be abbreviated as RTA theory) and revised Bloom’s taxonomy. Although the reasons for choosing these two theories are explained in detail in the sections below, the attempt to apply these theories as an evaluation method is justified by their validated contribution to learning. As discussed above, one of the major issues in evaluating ChatGPT and related AI technology in education is that, because these tools were introduced only recently, there are no widely acknowledged and implemented frameworks for evaluating them specifically in the higher educational field. Hence, although these theories are relatively dated and were not specifically designed for evaluating learning under the influence of artificial intelligence, analyzing the potential impacts of ChatGPT on learning under them would be meaningful, assuming that the fundamental principles/meaning of education have not changed significantly from the past.
2.3.1. John Dewey’s Reflective-Thought-and-Action Model
The RTA theory proposed by John Dewey (1859–1952), a prominent American educational reformer associated with pragmatism and experiential learning, observes the process of an individual learning knowledge as the result of following five chronological steps [
23]:
(1) Encountering disturbance and uncertainty;
(2) Intellectualization and definition of the problem;
(3) Studying the conditions of the situation and formulating a working hypothesis;
(4) Reasoning the hypothesis (Expecting the results);
(5) Testing the hypothesis.
Only after completing these steps is the learner able either to gain the solution to the problem they had encountered initially or to obtain an idea/concept that helps them to solve the problem in the future, where further learning will be conducted following the same five steps [
24]. One important aspect that must be mentioned is that because experiential learning has the disadvantage of having relative difficulties in the evaluation of information soundness (as it is heavily dependent on the environment and derived experience), RTA theory itself does not guarantee the quality of the solutions or the ideas gained by the learners [
25,
26].
Despite the risks that RTA theory presents, it could function as a robust benchmark for evaluating the effect of ChatGPT and related technologies in education. From its structure, Dewey’s RTA theory is viewed as a step-based approach; thus, if one of the steps fails due to certain factors, the following steps would also be compromised, disrupting the successful knowledge acquisition of the learner. For instance, if the learner cannot identify the problem (i.e., failure in step 2), there would, first of all, be no “problem” that could be solved; hence, there would be no learning for the learner. A more concrete example provided by Dewey is during step 3, where he claims that the “chief function of philosophy is not to find out what difference ready-made formulae make, if true, but to arrive at and to clarify their means as programs of behaviour for modifying the existent world” [
23]. Dewey’s focus on the experiential learning and formulation of “hypothesis”, and less emphasis on conceptual knowledge, derives from his view that if the learner cannot hypothesize a solution, then they would not be able to learn [
27]. Considering these characteristics of RTA theory, although ChatGPT could generate “solutions” according to the provided prompts, it is questionable whether the users are “learning” with enhancements or are merely “outsourcing” the necessary knowledge-gaining experience. As education requires the learner to formulate and retain their knowledge as part of the learning, if specific stages in the RTA theory that are expected to largely (and constantly) be enhanced/disrupted by ChatGPT could be identified, it is worth evaluating the stage to implement an optimal regulation.
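The step-dependence described above can be illustrated with a minimal sketch in Python. It is only a simplified reading of the model and not part of Dewey’s theory: the function names `completed_steps` and `learning_occurred` are hypothetical, and the sketch assumes that a failure at one step compromises every later step and that learning requires all five steps to be completed.

```python
from typing import Optional

# The five RTA steps, in the chronological order given in the text.
RTA_STEPS = [
    "Encountering disturbance and uncertainty",
    "Intellectualization and definition of the problem",
    "Studying the conditions and formulating a working hypothesis",
    "Reasoning the hypothesis",
    "Testing the hypothesis",
]


def completed_steps(first_failed_step: Optional[int]) -> list[str]:
    """Return the steps a learner completes (hypothetical helper).

    first_failed_step is the 1-based number of the first step that fails,
    or None when every step succeeds. Assumption: because each step builds
    on the previous one, every step from the failure onward is compromised.
    """
    if first_failed_step is None:
        return list(RTA_STEPS)
    if not 1 <= first_failed_step <= len(RTA_STEPS):
        raise ValueError("step number must be between 1 and 5")
    return RTA_STEPS[: first_failed_step - 1]


def learning_occurred(first_failed_step: Optional[int]) -> bool:
    """Assumption: learning requires all five steps to be completed."""
    return len(completed_steps(first_failed_step)) == len(RTA_STEPS)
```

Under these assumptions, a learner who cannot define the problem (step 2) completes only the first step, so `learning_occurred(2)` is `False`, whereas `learning_occurred(None)` is `True`.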
Another aspect of RTA theory that is significant for the topic of generative AI comes from Dewey’s ideology of experiential education. According to Dewey, “an experience may be immediately enjoyable and yet promote the formation of a slack and careless attitude; this attitude then operates to modify the quality of subsequent experiences so as to prevent a person from getting out of them what they have to give” [
28]. This closely resembles the critical attitude towards implementing ChatGPT in education, where the concern is weighted against the consequence of future knowledge generation. As Dewey claims that the quality of the experience depends on (1) immediate agreeableness/disagreeableness but also on (2) latent influence upon later experience [
28], it should always be remembered that for an education to be successful, there is an inherent need to select the educational experience that positively influences the subsequent experience of the learner. While ChatGPT may create new opportunities for experiential education by outsourcing certain knowledge (as mentioned in the literature review above), it does not change the fact that ChatGPT does so by providing conceptual “solutions” that it has been fed as input; the learner may be satisfied at the level of immediate agreeableness and thus stop learning beyond what they are provided, but because this does not connect to a positive later experience (i.e., promotion of further learning), the supposed new opportunity could, in reality, hinder education. Incorporating these perspectives of Dewey’s ideology and theory into the evaluation of ChatGPT may allow the construction of a more comprehensive regulatory framework.
2.3.2. Revised Bloom’s Taxonomy
Bloom’s taxonomy is a set of educational hierarchical models with three domains, (1) cognitive, (2) affective, and (3) psychomotor, which was proposed by the team of educators led by Benjamin Bloom to classify the learning objectives by the types and difficulty of the tasks [
29]. The first (cognitive) domain focuses on knowledge in particular, and the model was revised in 2001 to the following six levels [
29,
30]:
(1) Remembering;
(2) Understanding;
(3) Applying;
(4) Analyzing;
(5) Evaluating;
(6) Creating.
The stages are ordered from lower to higher cognitive complexity, and, similar to Dewey’s RTA theory, the learner must successfully acquire knowledge at the lower levels to advance to the higher levels. For example, if the learners want to learn mathematics, they must first be able to memorize numbers and equations before understanding what they mean, which they can afterward apply to the given problems and proceed to experiments and related critiques, perhaps leading them to the final stage where they can create new theorems. Although this model is not without criticism, such as it being an imperfect hierarchy or reality having less clear-cut lines between the levels, revised Bloom’s taxonomy is still widely used in today’s educational setting as a reference [
31,
32]; this paper will also propose it to be used in the evaluation of ChatGPT and the related technologies in education.
Yet, even without considering its popularity, the revised Bloom’s taxonomy is essentially relevant to the evaluation criteria. As the taxonomy is structured in a way specific to certain skills and complexities, it is possible to evaluate the case where ChatGPT is fine-tuned to each level. The learners who use these AI tools may be able to enhance their learning by raising efficiency in the acquisition of knowledge, where because each stage exists as an extension of the previous level, the learner (in the ideal situation) can also visualize the learning trajectory that they followed and will be following. The issue of overreliance may be mitigated through various methods, where, for instance, the confidence levels and accuracy of ChatGPT could be lowered by combining the keywords (used in the prompt) attributed to different levels of Bloom’s taxonomy [
33]; this could not only minimize dependence on generative AI tools but simultaneously encourage an active engagement in education. However, these measures do not eliminate the risk of the entire learning process collapsing if ChatGPT largely disrupts the lower stage of knowledge acquisition. Hence, through combining similar yet different educational theories, Dewey’s RTA theory, which focuses on the micro-level individual learning experience, and revised Bloom’s taxonomy, which focuses on the macro-level skill-dependent learning stages, this paper proposes multi-theory evaluation criteria that could become a starting point for the future regulatory framework.
3. Expectations of the Impact of ChatGPT on Students’ Learning Uptake
Observing the effect of ChatGPT and related AI technology on education is difficult, given factors such as its recent development, the lack of sufficient legal controls, and the absence of past resources for examination. Moreover, given the role of education in individuals’ learning and creation of knowledge, the method of obtaining data requires attentiveness to the consequences. For instance, experimentation that aims to obtain certain types of empirical data, such as comparing the educational achievement of intentionally differentiated users and non-users of ChatGPT, not only carries potential ethical risks but, depending on the size of the cohort, could also incur material risks to society. Hence, to understand the possible arrangements of the evaluation criteria, surveys of users involved in education (professors/lecturers as well as students, since this paper focuses on higher education) regarding the stages/levels of the two educational theories mentioned would be recommended. Although the expectations of the educators and students would not be perfect, they should nonetheless reflect, through average demand, which stages should be the initial focus of regulation, contributing to the construction of the evaluation criteria. The criteria created through this method could not only serve as a temporary measure until empirical data on ChatGPT in education can be quantified and visualized but should also be combined with that data afterward to create detailed criteria that express both qualitative and quantitative aspects of the evaluation.
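The survey-based approach described above could be summarized as in the following Python sketch. The rating scale, the stage names used as keys, and the function `rank_stages_by_demand` are assumptions made purely for illustration; the paper does not prescribe a particular survey instrument or any aggregation method beyond average demand.

```python
from collections import defaultdict
from statistics import mean


def rank_stages_by_demand(
    responses: list[dict[str, int]],
) -> list[tuple[str, float]]:
    """Rank stages/levels by the average regulation demand of respondents.

    Each response maps a stage name to a rating, where a higher rating is
    assumed to mean a stronger demand for regulation at that stage.
    """
    ratings = defaultdict(list)
    for response in responses:
        for stage, rating in response.items():
            ratings[stage].append(rating)
    # Average the ratings per stage, then sort from highest to lowest demand.
    averages = {stage: mean(values) for stage, values in ratings.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical responses from two survey participants on a 1-5 scale.
example = [
    {"Remembering": 5, "Creating": 2},
    {"Remembering": 4, "Creating": 1},
]
ranking = rank_stages_by_demand(example)
```

In this made-up example, “Remembering” would rank first, which would suggest it as an initial focus of regulation; real survey data could, of course, point elsewhere.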
Among the potential scenarios of expectations, one of the simplest could be that both models would express a larger disturbance in the lower stages. This is because people who become over-reliant or experience disruption in their learning may lack the necessary knowledge that is required to use these tools adequately in the first place, which indicates that they are at the lower levels/stages of the educational theories. For example, there would be a clear difference in the learning outcome between a user who used ChatGPT according to the questions they have internalized/formulated themselves and a user who merely inputted a given prompt without recognizing the meaning behind the problem. Although there is a chance that both of these users will receive a similar score for their output, there is a critical difference in terms of knowledge acquisition, which is one of the main purposes of education. Due to the nature of the educational theories that this paper refers to, the magnitude of disturbance should decrease (i.e., the potential of enhancement should increase) as the theoretical stages progress according to this scenario; the lower the stages/levels the learner is in, the higher the chance of them failing the knowledge acquisition. Hence, according to this scenario, the use of these AI tools should be heavily regulated when the learner is in the lower stages/levels.
However, the above-mentioned scenario may be misleading as it oversimplifies the types of risks that learners in each of the stages/levels have. With its current capabilities, ChatGPT has the following pernicious characteristics: (1) able to provide “solutions” without reasoning or confidence [4,9] and (2) able to directly/indirectly manipulate the knowledge of the user [14,15]. In (1), ChatGPT is liable to give solutions to a prompt without giving a full explanation to the user unless specifically asked to and is prone to data hallucination, while in (2), users are likely to trust and conform to generated solutions because of their perceived reliability. While the former is an avoidable risk for learners with higher knowledge capabilities, the latter is almost unavoidable in any of the stages/levels and can be insidious. These factors may introduce complexities into the scenario, where each stage/level could be enhanced or disrupted for different reasons. For instance, Dewey’s RTA theory could perhaps exhibit the following state (Table 1):
Revised Bloom’s taxonomy, in turn, could exhibit the following (Table 2):
Our analysis through these tables suggests that ChatGPT can indeed be a useful tool even at higher stages of learning, given sufficiently conducive conditions such as the learner’s own initiative and propensity for self-reflection. It has to be said that our analysis is merely one of several possible cases for ChatGPT’s impact on a learner in each learning model and, hence, is not conclusive evidence for or against ChatGPT’s use in higher education. For example, results may vary across disciplines, and what might be a disturbance under Bloom’s taxonomy in a humanities discipline may present as an enhancement in a technical programming discipline; in the former, core concepts must be comprehensively understood before any useful application in real-life contexts can be carried out, whereas, in the latter, ChatGPT’s simplifying summary of concepts and amenability to problem-solving can help students grasp the concepts in the first place. ChatGPT’s impact in this example is effectively reversed across the two disciplines. Yet, it must also be acknowledged that the expectations derived from the educational theories may provide significant insight into developing optimal regulation for using ChatGPT and related AI tools in the field of education.
4. Potential Strategies for Regulation
Given what has been said about the consequences of a general acceptance of ChatGPT among both instructors and students, it is imperative that strategies for its regulation in the educational context are discussed. First, let it be noted that the strategies proposed in this section are non-exhaustive; they only point to the practices most relevant to what has been discussed so far, namely, student learning processes and instructor–student dynamics. Secondly, ‘regulation’ here refers not only to methods of mitigation and harm reduction but also to processes of transformation and edification that could be useful to implement, considering the promising applications of ChatGPT in higher education. This is especially pertinent, seeing as students have “generally positive attitudes toward using ChatGPT for learning” [34] despite the dangers of over-reliance on, and being misled by, potentially inaccurate generated text, indicating an increasingly common acceptance of its use in universities.
The strategies we suggest for the mitigation of ChatGPT’s harmful effects are (1) different and more innovative assessment tools and (2) a greater focus on personalized instruction. Assessment will drastically change in the face of ChatGPT’s capability to provide students with sufficiently accurate information rapidly. Instructors, therefore, should adapt to this change by developing new tools to measure students’ progress or by embracing existing tools that reduce students’ over-reliance on ChatGPT and the chances of their being misguided by imprecise information. These tools include reflective journals that can track a student’s uptake of knowledge over a longer time frame [35]; ChatGPT-proof assessment questions that have been tailored against the more generic and superficial responses that current LLMs are capable of (with an added safeguard of using AI tools to check for AI intervention in student submissions); or even a new assessment format such as interviews or real-time, personal problem-solving, which comprehensively tests students’ higher-level understanding of concepts and procedural knowledge. Next, personalizing students’ learning will be important to ensure that they use ChatGPT as effectively and beneficially as they can. It has been noted that ChatGPT can provide feedback on student work quickly and reliably in various disciplines [36]. This will leave instructors more room to work personally with students and give advanced guidance in higher-order skills, such as essay conceptualization, conceptual clarification, technical rectification (as in programming fields), and so on. Office hours need no longer be limited to basic knowledge verification but can extend to protracted discussions of students’ more esoteric questions, which can enhance their knowledge base and deepen their understanding of course material.
Moving on to areas of edification, it is possible to capitalize on the educational benefits of ChatGPT by (1) co-opting flipped or hybrid classrooms for higher education settings and (2) increasing guidance for students on the use of ChatGPT (or AI tools generally). Flipped classrooms are ones that prioritize student collaboration, real-time practice and testing, and the development of hands-on skills during class time, whereas out-of-class time is reserved mainly for knowledge uptake and information distribution through online resources or lectures. This practice has been shown to improve the quality of student discussions and inspire higher levels of initiative and productivity among students, given enough pedagogical innovation and rigor [37,38]. With the aid of ChatGPT, students can make full use of out-of-class time for basic information retention, while in-class time can be reserved for practicing higher-order skills that apply such information to more advanced problems, thereby positively augmenting the flipped classroom approach. A hybrid classroom may also be considered, where stages of learning corresponding to the lower levels of Dewey’s RTA and Bloom’s taxonomy can take place online using ChatGPT as an instructional resource, whereas learning at the higher levels relies more heavily on personalized instruction. Finally, an important milestone for higher education will be syllabi catered specifically to AI guidance for students. Given the threats that ChatGPT poses to student learning, it is essential that students are taught how to use such tools appropriately and what to use them for, not least because “behavioral intention” and “habit” are among the leading factors for why students turn to ChatGPT [39,40]. By ensuring that students approach ChatGPT with the requisite training and a mindset conducive to their learning, higher education providers will be able to better utilize ChatGPT as an educational tool rather than shy away from it. Crucially, the aim of these suggested practices, and others like them, is to reduce the disruption and promote the enhancement that ChatGPT presents to students’ learning, as established above in Dewey’s RTA and Bloom’s taxonomy learning models.
5. Conclusions
To appropriately manage the use of ChatGPT and related generative AI in education, it is crucial to identify the benefits and losses in terms of learning as soon as possible, since further development of the technology is inevitable given the nature of these tools and the intention behind their development. From the literature reviewed and the scenarios considered, ChatGPT seems to hold the potential of polarizing learning by enlarging the gap between disturbance and enhancement. Although it is still unclear whether this “polarization” is a necessary cost of innovation that brings a net positive effect to society, what is clear is that the cost should be minimized where it can be and while it is still possible. Considering the potential strategies suggested in the previous section, if the costs and benefits can be identified accordingly, disturbance for learners could be minimized while the enhancement in knowledge attainment/generation could be maximized in the best-case scenario. Hence, the evaluation criteria that this paper proposes, built on validated educational theories and the expectations of educators/learners, could become the starting point for understanding the effect of ChatGPT in a manner that is familiar to people. Of course, the evaluation criteria are far from perfect, as this paper did not consider important factors such as the types of tasks, the field of study, the environment of the learner and their notion of “education”, other educational theories, etc. Yet, as already mentioned, this paper proposes a “starting point” that allows further development by building on the theories and expectations as the foundation; these evaluation criteria could become reliable, comprehensive measures for optimal regulation in the future.
Still, looking at the bigger picture, it must be understood that these evaluation criteria are a “short-term” solution in the field of education, where the fundamental issue that must be addressed is the meaning of education in society. As mentioned previously, the debate on how ChatGPT and related technology would disrupt education is primarily focused on the visible “outcome” (e.g., quality of essays and exams). Taking this into account, it is logical to claim that the introduction of ChatGPT would make the entire education process more efficient while also eliminating certain assessment methods because they are no longer a fit measure. However, there seems to be less debate on “why” people use ChatGPT in education in a manner that may not lead to learning in the first place. The answer may be that the “learning” and the “outcome” of education are more or less separated. When people know that they are assessed by the “outcome” and not by what they learned, the outsourcing of knowledge to the “better” option is almost an inevitable consequence. There is limited space for education as a process and a field of generating new knowledge if the notion of “education” among learners is not learning but rather producing results that are convincing on the surface. Even if learners use ChatGPT to increase their “efficiency” and truly enhance their learning, if they are not aware that becoming more “efficient” by using ChatGPT does not necessarily mean that they are learning better than before, they may face the consequences of ChatGPT’s indirect interference in their ideas/judgment. If the process does not lead to a conscious generation of new ideas and remains a repetition of existing information, what is the value of “learning”?
One of the worst-case scenarios may be that the fundamental principles of learning, whether intended or not, could be altered as a result of the advancement of these technologies and their usage without awareness of the consequences. In this case, any attempt to evaluate the degree of learning based on the existing theories and measures will lose its significance, whereas reinstating the lost value of learning will require not only time but also an immeasurable amount of resources and effort. Fortunately, concern about these technologies is more or less a global phenomenon. Passed in the European Parliament on 13 March 2024, the European Union (EU) Artificial Intelligence Act proposed comprehensive regulation of AI technologies according to their risk categories, from “minimal” to “unacceptable”, where the revised category for general-purpose AI (e.g., ChatGPT) was newly added in 2023. These adjustments, and the fact that the application of AI in “Education and vocational training” is specifically mentioned in the “high-risk” category, represent the caution expressed in public sentiment toward the application of AI technology in education [41]. Since the implementation of the EU AI Act is a relatively recent event, its effects have yet to be observed in a recognizable format, specifically regarding education. Yet, the potential impacts of the Act are worth following in future research, where the suggestions of this paper may perhaps be developed further depending on the outcomes. After all, it is not only technology that changes education but the environment, actions, and beliefs of the people who use it. As it is unknown whether the focus on the meaning of education, or even a shift in it, would be fully perceived by society, the evaluation will be difficult due to these new technologies and the expected adaptive effort of the educational system as an entity. Nonetheless, this paper hopes that the proposed attempt may be one of the first steps to construct clear criteria for evaluation in the new era of education, whose meaning needs constant reflection and updating.