1. Introduction
Generative artificial intelligence uses deep learning algorithms to create new text, images, video, or audio content, showing outstanding creative potential and significantly improving user productivity [
1]. Generative AI can offer richer and more natural interaction methods for HCI. However, service failures are an unavoidable phenomenon in any high-tech, boundary-pushing domain, especially within the rapidly evolving and experimental realm of AI applications [
2]. Service failure refers to the gap between users’ expectations of service outcomes and their actual experiences [
3]. Service failures are highly likely to reduce user trust and may even lead users to abandon the service altogether [
4]. The competitive market environment and users’ continuous pursuit of high-standard service experiences make service failure in AI-driven interaction systems a critical issue that could affect brand loyalty and user satisfaction [
5]. Unstable user experiences may drive consumers to switch to other brands, a risk that is particularly salient in the generative AI market, which is characterized by intense homogeneous competition. Therefore, detailed analysis and improvement of the user experience surrounding service failures in generative AI are indispensable to continuous service quality management and user retention.
Research on service failures in the field of human–computer interaction primarily focuses on the discrepancies between system performance and user perceptions. Current studies predominantly concentrate on the reliability of technology and the inconveniences users face due to interruptions [
6] as well as investigating the impacts of service failures on user experience from the perspectives of users’ emotional and behavioral responses [
7]. However, there is a paucity of research delving into users’ cognitive processing, especially indicators of cognitive dissonance, in scenarios of service failure. Cognitive conflict may be related to users’ mental models: a discrepancy between users’ mental models of AI and the actual capabilities of current generative AI [
8] leads to a gap between user expectations and the service actually delivered. Moreover, a deeper understanding of the underlying cognitive mechanisms is of significant theoretical and practical importance for optimizing human–computer collaboration systems. The cognitive dissonance caused by inadequate service is often latent, as the overall effectiveness of generative AI technologies typically motivates users to continue their usage [
6]. Nonetheless, this subconscious cognitive dissonance and the resulting dissatisfaction can significantly affect users’ overall experience [
9], impacting user loyalty and the brand image of the firm. Particularly in today’s intensely competitive market, exploring the cognitive dissonance experienced by users in the face of service failures with generative AI becomes especially crucial, not only for the continuous iteration and optimization of generative AI products themselves but also for practically improving user experiences and thereby increasing market share.
Generative artificial intelligence (AI) fundamentally differs from most earlier automation technologies [
10], demonstrating a significant improvement in handling complex, emotionally charged tasks. Such tasks, traditionally thought to be challenging for AI technology, have seen many exciting applications in empathy, emotional expression, and the creation of emotion-related content through generative AI [
11]. Emotional tasks, inherently different from mechanical tasks, are particularly sensitive in human–computer interaction, encompassing but not limited to social interaction, emotional communication, and the perception of companion robots [
12], involving factors such as subjectivity, diversity, uncertainty, and unpredictability. The failure of emotion-oriented services has a different impact on consumer psychology compared to the failure of mechanical tasks. Users may lower their expectations, believing that AI lacks true empathy and emotional understanding capabilities [
13]. Therefore, the type of task handled by generative artificial intelligence is crucial, as it determines users’ expectations of and standards for the service, as well as the consequences and impacts of service failures. Despite the potential demonstrated by generative artificial intelligence in dealing with emotional tasks, user experience under different task types remains largely unexplored.
In the field of human–computer interaction, stance attribution refers to how people interpret and predict the behavior of artificial intelligence (AI) or robots [
14]. Design stance and intentional stance are shortcuts to understanding and predicting the behavior of complex systems. The design stance focuses more on the system’s function and purpose, whereas the intentional stance assigns to the system a form of “mental” state to predict its behavior [
15]. It is about interpreting the behavior of AI or robots either as stemming from the operations of a mind (intentional stance interpretation) or as the result of mechanical design (design stance interpretation) [
14]. The intentional stance attributes robot or AI behavior to human-like intentions, desires, or emotions, that is, “mental states” [
14]. Such attribution tendencies stem from theory of mind, the ability of individuals to understand that other agents have independent minds and, based on this ability, to infer and comprehend their behavior [
16]. Stance attribution involves evaluating and interpreting others’ behavior, and the attitudes and motivations behind it, during interactions. When interacting with machines, people attribute certain behaviors to intentions, purposefulness, or emotions, and these attributions significantly impact users’ expectations, trust, and ways of interacting with AI [
17]. Studies have found that people attribute “stances” to AI and robots, believing these artificial agents possess a degree of autonomous intentions and emotional states, which directly affects the interaction experience with AI [
18]. However, existing research has largely focused on users’ attitudes toward stance attribution to AI or robots—how people attribute rational or emotional states to AI. Far less explored is how different stance attribution explanations, viewed from an interaction design perspective, might impact users’ perceptions and experiences.
Furthermore, when performing tasks with a high demand for emotional or mental interpretation, such as mental health counseling or educational tutoring, the use of a “mindful” mechanism by artificial intelligence may have a profound impact on users. Goetz et al. (2003) argued that the design of intelligent systems should match the task [
19]. It has been found that the type of task also affects users’ attitudes toward AI [
20]. Highly anthropomorphized service robots performing emotionally related tasks (compared to mechanical tasks) can effectively mitigate the negative reactions or aversion such robots may otherwise provoke in consumers [
21]. Affective tasks typically involve emotional support and social interaction and are considered important for the user’s affective state and social connectedness, whereas mechanical tasks focus on helping the user accomplish a specific, explicitly functional goal, such as product recommendation or information retrieval [
21]. It is therefore necessary to clarify stance attribution explanation strategies under different task types to enhance user experience and calibrate expectations. At present, there is a lack of experimental research exploring the joint effect of attribution explanation and task type, and researchers have not sufficiently discussed how the two interact.
Previous studies on the failures of robot and artificial intelligence services have predominantly utilized questionnaire self-report methods to collect users’ subjective attitudes and satisfaction levels after service failures [22,23,24]. However, these self-reported data are often based on recollection rather than immediate responses at the time of the event, and thus may be influenced by recall bias [
25]. In the field of generative AI, given the immediacy of interactions with AI, users’ instantaneous attitudinal reactions are particularly important. Studying immediate attitudes toward service failures can offer a new perspective for understanding user experiences. Dual-process theory posits that unconscious processes (System 1) are equally important in attitude formation and can even be more effective in predicting decisions and behavior than conscious processing (System 2) [
26]. In some complex service interaction scenarios, especially those involving highly personalized and dynamically changing generative AI services, users’ attitudes might be influenced by intuitive reactions related to the specific features of the service, which are processed through System 1. These processes are often rapid, automatic, and difficult to consciously perceive. However, most existing studies have not delved deeply into the micro-psychological processes of users facing AI service failures. It is proposed that approaches based on behavioral and physiological neuroscience, such as motion/eye-tracking, electroencephalography (EEG), and functional near-infrared spectroscopy, should be used in human–computer interaction research to design intelligent systems that can understand and adapt to human users’ needs, emotions, and intentions [
27].
To bridge the aforementioned research gap, this study aims to explore users’ unconscious attitudes toward generative AI service failures by employing mental model theory and dual-process theory. Mental model theory, widely applied in the field of human–computer interaction, reveals how individuals construct models based on experience and knowledge to understand and predict events related to artificial intelligence. It posits that user expectations for interactions are based on their mental models of how they believe the system operates—cognitive frameworks constructed by individuals to predict their surroundings [
28]—which subsequently influence user expectations and attitudes toward artificial intelligence [
29]. The current research employs an experimental paradigm of event-related potentials with a modified oddball task to detect the unconscious cognitive processes of users when confronted with failures of generative AI services. Event-related potentials are direct responses of the brain to specific events or stimuli, offering a technique for providing real-time data on the cognitive processes of subjects toward complex tasks, as explored by Hinz et al. (2021) in their examination of action planning and outcome monitoring in human–computer interaction [
30]. The oddball paradigm, a commonly used ERP experimental design, focuses on studying human attention and cognitive processes. The fundamental concept behind the oddball paradigm is the random insertion of low-probability deviant stimuli into a sequence of high-probability standard stimuli, which can assess abilities such as working memory, stimulus discrimination, and response inhibition [
31]. By adapting this paradigm to human–computer interaction, the present approach effectively overcomes the recall bias associated with self-report studies, revealing users’ genuine and reliable attitudes during the human–computer interaction process.
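To make the modified paradigm concrete, the short sketch below (in Python) illustrates how such a trial sequence might be generated, with high-probability standard trials representing successful generative AI responses and low-probability deviant trials representing service failures. The trial count, deviant probability, and no-repetition constraint are illustrative assumptions rather than the study’s actual parameters.

```python
import random

def build_oddball_sequence(n_trials=200, p_deviant=0.2, seed=42):
    """Generate a pseudo-random oddball trial sequence for an HCI experiment.

    'standard' = successful generative AI response (high probability),
    'deviant'  = service-failure response (low probability).
    Deviants are placed so that no two deviant trials occur back to back,
    a constraint commonly used in oddball designs.
    """
    rng = random.Random(seed)
    n_dev = int(round(n_trials * p_deviant))
    n_std = n_trials - n_dev
    if n_dev > n_std + 1:
        raise ValueError("Too many deviants to avoid adjacent repetitions.")
    # Assign each deviant to a distinct "gap" before, between, or after the
    # standards; one deviant per gap guarantees no two deviants are adjacent.
    gaps = set(rng.sample(range(n_std + 1), n_dev))
    sequence = []
    for gap in range(n_std + 1):
        if gap in gaps:
            sequence.append("deviant")
        if gap < n_std:
            sequence.append("standard")
    return sequence

if __name__ == "__main__":
    seq = build_oddball_sequence()
    print(seq[:15])
    print("deviant proportion:", seq.count("deviant") / len(seq))
```

In an experiment of this kind, each condition (stance explanation × task type) would use its own sequence, and EEG epochs would be time-locked to the onset of the standard and deviant feedback stimuli.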
This study explores whether the design of stance attribution explanations affects users’ unconscious-level mental models from the perspective of AI psychology, and examines whether task type and stance attribution explanations jointly affect users’ unconscious cognitive conflicts and emotional changes when generative AI services fail, in light of a distinctive feature of generative AI, namely its prominent affective task capabilities. The innovation of this study lies in treating stance attribution explanations as a human–machine interaction design factor that affects users and integrating them with the distinctive affective capabilities of generative AI services, combining stance attribution explanations with task types for the first time to study users’ unconscious cognitive processes during generative AI service failures. Further, this study innovatively introduces the traditional oddball paradigm into the human–computer interaction field, exploring users’ immediate attitudes toward explanations of generative AI service failures and the underlying micro-level psychological processes through millisecond-level measurements of neural activity, thereby avoiding the recall bias inherent in traditional self-report research methods. This research offers a new perspective for the application and extension of mental model theory, introduces a new research paradigm to the human–computer interaction field, and provides valuable insights and guidance for designers and developers of generative AI, suggesting the flexible adjustment of stance attribution explanation cues in alignment with task types to minimize users’ cognitive conflicts and emotional dissatisfaction during service failures, thereby enhancing overall user satisfaction and trust and ultimately increasing market competitiveness.
5. Discussion
Few studies have explored the unconscious cognitive conflicts that arise during human–computer interaction, nor have they considered stance attribution explanations as a factor in human–computer interaction design. To understand people’s reactions when generative AI services do not meet expectations, it is essential to delve into human cognitive processes and our interactions with technology. This research adapts the oddball paradigm and introduces it to the field of AI psychology, eliminating the recall bias associated with traditional self-report research methods and revealing how the task type and stance attribution explanations of generative AI affect users’ attitudes and behaviors when the service fails, as well as the micro-psychological and cognitive processing mechanisms involved in this process.
Event-Related Spectral Perturbation (ERSP) values capture event-related changes in spectral power, reflecting the brain’s frequency-domain response to stimuli and allowing analysis of changes across different neural oscillation bands. The theta band (4–7 Hz) is a low-frequency neural oscillation observed primarily over the frontal midline and is associated with attention, memory, emotion, and other cognitive functions [
44]. Increased theta power in this region is thought to reflect greater engagement of these cognitive control mechanisms. Therefore, the observed differences in ERSP values suggest that the interaction between stance interpretation and task type modulates the level of cognitive conflict experienced in response to AI service failures. This study found that stance attribution and task type jointly affect the theta-band ERSP energy values induced in users when generative AI services fail. Since theta-band ERSP is associated with cognitive control and conflict, our findings indicate that under a design stance explanation, the cognitive conflict induced by failures in emotional tasks is significantly lower than that in mechanical tasks, whereas under an intentional stance explanation, the cognitive conflict induced by failures in emotional tasks is significantly higher than that in mechanical tasks. Similarly, in the scenario of failures in emotional tasks, the cognitive conflict induced by failed generative AI services under an intentional stance explanation is far greater than that under a design stance explanation. In instances of mechanical task failures, users experience greater cognitive conflict in response to failed generative AI services explained by a design stance. Our results supported H1, which predicted a significant interaction between task type and stance interpretation type on the ERSP values in the theta frequency band. This significant interaction indicates that cognitive processing of AI service failures is influenced by both stance attribution explanation and task type. This finding aligns with mental model theory, which suggests that users’ internal representations of the AI system are shaped by both the task context and the provided explanations, leading to different expectations and responses to system failures. Further, the findings from the simple effects analysis supported Hypotheses 2 and 3. H2 proposed that under the intentional stance interpretation, failures in emotional tasks would induce higher cognitive conflict than mechanical tasks, reflected in higher ERSP energy values. This was marginally supported, with the energy induced by failures in emotional tasks being marginally significantly higher than that in mechanical task failures under the intentional stance. H3 posited that under the design stance interpretation, failures in mechanical tasks would induce higher cognitive conflict than emotional tasks. This was supported by the finding that under the design stance interpretation, the energy induced by failures in emotional tasks was significantly lower than that in mechanical tasks.
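As a minimal illustration of how condition-wise theta-band ERSP values of this kind can be computed, the sketch below assumes preprocessed, feedback-locked epochs in MNE-Python; the electrode, frequency step, baseline window, and analysis window are illustrative assumptions, not the study’s exact pipeline.

```python
import numpy as np
import mne  # assumes MNE-Python is installed and epochs are already preprocessed

def theta_ersp(epochs, channel="FCz", fmin=4.0, fmax=7.0,
               baseline=(-0.3, 0.0), tmin=0.3, tmax=0.6):
    """Mean theta-band ERSP (in dB) for one condition's feedback-locked epochs.

    `epochs` is an mne.Epochs object whose time window covers the baseline.
    All parameter values here are placeholders for illustration.
    """
    freqs = np.arange(fmin, fmax + 0.5, 0.5)
    power = mne.time_frequency.tfr_morlet(
        epochs, freqs=freqs, n_cycles=freqs / 2.0,
        return_itc=False, average=True, picks=[channel])
    # Express power as event-related spectral perturbation relative to baseline
    power.apply_baseline(baseline=baseline, mode="logratio")
    window = power.copy().crop(tmin=tmin, tmax=tmax, fmin=fmin, fmax=fmax)
    return 10.0 * window.data.mean()  # log10 ratio -> dB, averaged over time/freq
```

Values obtained in this way for each participant and condition would then enter the 2 (stance explanation) × 2 (task type) analysis described above.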
Phase Locking Value (PLV) quantifies the synchronization of neural oscillations between brain regions, with higher PLV values indicating more effective communication and information integration between them. Synchronization between specific areas, such as the medial frontal cortex (MFC) and the lateral frontal cortex (LFC), is considered crucial for cognitive functions such as conflict processing, error detection, attention, and working memory [
46]. Increased theta-band PLV between the frontal midline (FCz) and the lateral prefrontal cortex (F6) suggests enhanced communication and integration of information related to conflict monitoring and cognitive control [
46]. The observed differences in PLV values therefore indicate that stance interpretation and task type influence the degree of functional connectivity between these regions during the processing of AI service failures. This study measured PLV between the medial frontal cortex (e.g., the FCz electrode) and the lateral frontal cortex (e.g., the F6 electrode): the medial frontal cortex is generally associated with error monitoring (e.g., functions of the anterior cingulate cortex, ACC) and conflict monitoring (related to task switching and social signal processing), while the lateral frontal cortex is related to goal setting and maintenance and participates in a broader range of cognitive control. The functional connectivity results showed that, under the design stance interpretation, failures in mechanical tasks elicited stronger neural activity associated with unconscious cognitive conflict and error monitoring than failures in emotional tasks. Moreover, for service failures during emotional tasks, the intentional stance interpretation elicited unconscious cognitive conflict and error monitoring more strongly than the design stance interpretation. These findings corroborate the time–frequency analysis results reported above.
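For readers interested in how such connectivity values are obtained, the sketch below outlines, under assumed data shapes, a Hilbert-transform computation of the theta-band PLV between a frontal-midline channel (e.g., FCz) and a lateral frontal channel (e.g., F6); the filter settings and band limits are illustrative rather than the study’s exact parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def theta_plv(x, y, sfreq, fmin=4.0, fmax=7.0):
    """Theta-band phase locking value between two channels.

    x, y  : arrays of shape (n_epochs, n_times), e.g., FCz and F6 epochs
    sfreq : sampling frequency in Hz
    Returns a value in [0, 1]; higher values indicate stronger phase
    synchronization across epochs between the two channels.
    """
    b, a = butter(4, [fmin, fmax], btype="bandpass", fs=sfreq)
    phase_x = np.angle(hilbert(filtfilt(b, a, x, axis=-1), axis=-1))
    phase_y = np.angle(hilbert(filtfilt(b, a, y, axis=-1), axis=-1))
    # Consistency of the phase difference across epochs, then averaged over time
    plv_over_time = np.abs(np.mean(np.exp(1j * (phase_x - phase_y)), axis=0))
    return plv_over_time.mean()
```

In a design like the present one, this value would be computed separately for each stance × task condition and then compared across conditions.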
Our research findings suggest that the stance attribution explanations of generative AI during human–machine interaction can significantly influence users’ mental models of and expectations toward AI agents, as reflected in neural activity associated with unconscious processes. Stance attribution explanations can be regarded as a guiding mechanism that directs how users perceive and evaluate the actions of artificial intelligence. They profoundly shape users’ mental models: the internal representations based on which users interpret, predict, and rationalize the behaviors of AI. These mental models play a decisive role in how users react to AI service failures. Mental model theory holds that individuals construct internal representations of systems, including artificial intelligence and robots, based on information and experience [
3]. Mental models are dynamic, adaptive cognitive structures that help people understand complex systems such as AI agents and interact with them, serving as guides for how they comprehend and engage with AI systems [
49]. Mental models can also affect users’ expectations and attitudes toward artificial intelligence [
29], thereby influencing the cognitive conflict users experience during service failures in different tasks. Research has found that priming statements can influence how individuals construct mental models of generative AI using their past perceptions and expectations [
10]. In our study, stance attribution explanations play a similar role, influencing the psychological representations of artificial intelligence among users.
Belief priming can affect the construction of users’ mental models [
10], and within human–computer interaction, stance attribution explanations can perform a similar function of belief implantation and suggestion, leading users to naturally interpret AI behaviors either through an “intentional stance” (attributing mental states) or from a “design stance” perspective. Task type, that is, whether the tasks users perform with AI involve emotional exchange or expression (emotional tasks) or information acquisition and processing (mechanical tasks), influences users’ needs and evaluations of AI [
21]. Emotion-oriented services require empathy, communication, and emotional engagement, whereas tool-oriented tasks are more objective, logical, and quantifiable and do not involve emotion [
21].
When individuals receive interpretative explanations of AI behaviors in terms of intentions, they tend to project human-like psychological states (such as intentions, desires, and beliefs) onto the AI. During human–computer interactions involving emotional services with AI, if the AI is designed to lean more toward intentional stances, it is easier to evoke mental states associated with interacting with other humans [
51]; thus, any failure can trigger more severe cognitive dissonance. Interpretations focusing on the design stance emphasize functional and system design aspects, demonstrating the rationale behind behaviors in a mechanical or predetermined manner [
14]. Furthermore, it is commonly believed that robots are more appropriately linked with mechanically oriented tasks [
53]. Explanations based on the design stance help people connect AI behavior with explicit programming and specific functionalities. Hence, any failure in mechanical tasks can lead to significant cognitive dissonance among users. Designers and developers of intelligent systems should delve into users’ micro-psychological understanding to create AI products that better align with users’ mental models.
User characteristics, such as prior experience with AI technology [
54], psychological distance [
55], technical proficiency [
56], and individual differences in attachment styles [
57], may significantly influence how users interact with AI systems and react to service failures. Prior research has shown that positive prior experiences with AI technology can play a positive role in human–computer interaction, while perceived risk acts as a deterrent [
54]. In addition, personality traits may influence users’ perceptions and interactions with AI agents [
58]. Desirability and youthfulness predict a more positive attitude toward AI technology, while sensitivity to conspiracy theory beliefs leads to a more negative attitude [
59]. Customer inoculation measures can be effective in increasing satisfaction with service remedies after AI service failure [
55]. While our study focuses on the neural mechanisms behind cognitive conflict when AI services fail, different user characteristics may modulate these neural responses. Combining assessments of users’ prior experience with AI, technological literacy, and personality traits may provide deeper insights into the variability of user interactions with AI systems.
In summary, this study found that cognitive dissonance in response to AI service failures is deeply influenced by the stance attribution explanation provided (intentional vs. design stance) and the nature of the task (emotional vs. mechanical). Mental model theory underscores this relationship, illustrating how mismatched expectations manifest in user experience. In emotional tasks, explanations from an intentional stance lead users to expect interactions more similar to those with humans, thereby generating a stronger sense of dissonance when the AI errs. In contrast, the design stance offers a clear, mechanical account of the outcomes of mechanical tasks; when the resulting expectations are unmet, the failure signals a gap in the AI’s functional capabilities, affecting user trust and showing how deviation from the expected framework drives cognitive imbalance. This study also demonstrates that users’ understanding of AI is a subjective mental model and that the presentation of AI is crucial [
10]. The user experience of generative AI largely depends on the mental models constructed by users themselves. This study demonstrates the importance of fostering accurate user mental models in human–computer interaction, which is of significant value for the design of artificial intelligence and the enhancement of user experience.
5.1. Theoretical Implication
This study explores mental model theory in human–computer interaction using cognitive neuroscience methods and examines the role of stance attribution in shaping mental models, with a view to understanding how mental models influence human–AI interaction. The study further explores the concepts of the intentional stance and the design stance in the psychology of artificial intelligence and analyzes the neural mechanisms of the cognitive conflicts induced when these mental models are violated. In addition, this study designed a new experimental method based on the classical oddball paradigm and applied it to AI psychology research. By measuring brain neural activity at the millisecond level, the method circumvents the recall bias that may affect traditional self-report methods and provides a new perspective for studying human–computer interaction.
5.2. Managerial Implication
The findings of this study provide crucial guidance for the design of AI systems and their interactive interfaces. The design of stance attribution explanations influences users’ mental models, thereby affecting their cognitive conflicts and emotional responses. Designers of generative AI can therefore incorporate appropriate stance attribution explanations into the explanations of different task failures to mitigate users’ negative experiences. Such stance-based adjustments fundamentally shape users’ mental models of AI products: a design-stance framing (e.g., “This assistant can calculate the optimal response”) and an intentional-stance framing (e.g., “This assistant can understand your needs”) create different expectations. Through such cues, companies can guide users toward a more tolerant and understanding attitude regarding the limitations of artificial intelligence, enhancing user experience and satisfaction with human–computer interaction.
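As a purely illustrative sketch of this recommendation (the wording, task labels, and function below are hypothetical and not drawn from the study’s stimuli), failure explanations could be parameterized by stance and task type:

```python
# Hypothetical explanation templates keyed by (stance, task type); the wording
# and labels are illustrative, not taken from the study's experimental stimuli.
FAILURE_EXPLANATIONS = {
    ("design", "mechanical"): (
        "The retrieval module could not complete this request because its "
        "index does not yet cover this topic."
    ),
    ("design", "emotional"): (
        "The response model is tuned mainly for factual queries, so its "
        "coverage of emotional-support scenarios is still limited."
    ),
    ("intentional", "mechanical"): (
        "I wasn't able to work out the answer you needed this time."
    ),
    ("intentional", "emotional"): (
        "I'm sorry, I didn't fully understand how you were feeling just now."
    ),
}

def explain_failure(stance: str, task_type: str) -> str:
    """Return a stance-consistent failure explanation for the given task type."""
    return FAILURE_EXPLANATIONS[(stance, task_type)]
```

Consistent with the pattern reported above, a designer might, for example, favor design-stance wording when an emotional task fails and intentional-stance wording when a mechanical task fails, so that the explanation aligns with the mental model least likely to trigger cognitive conflict.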
5.3. Limitations and Directions for Future Research
However, this study has its limitations. It reveals, to an extent, the impact of generative AI stance attribution explanations on mental models through indirect evidence from cognitive responses across different task types, without a direct measure that could better demonstrate changes in mental models at the unconscious level. Additionally, the study categorizes service content into emotional and mechanical types, a broad classification that may restrict the applicability of the findings to real-world scenarios. Future research could pioneer new paradigms for measuring users’ unconscious mental models, for example through brain–computer interfaces and machine learning techniques. Further research could also divide AI service tasks into more detailed categories to investigate users’ expectations and experiences during different tasks. Future studies might also consider the ethical implications and social impacts of stance attribution in artificial intelligence, such as understanding how different attributions affect dependence on or bias against AI systems.
A key limitation of this study is the indirect nature of measuring unconscious cognitive processes. While EEG offers valuable insights into neural activity associated with cognitive conflict and error monitoring, it does not provide direct access to unconscious thought. The observed theta-band ERSP and PLV changes are interpreted as correlates of these processes, but further research using complementary methods, such as implicit association tests, could provide a more nuanced understanding of the unconscious mechanisms at play. Although the sample size of this study was validated by a statistical power analysis, we recognize that a larger sample may yield more robust results. Future studies should consider expanding the sample size to further validate the present findings. While this study focused specifically on FCz and F6 based on their established association with cognitive conflict processing, future research could explore the activity and connectivity of other electrodes to gain a more comprehensive understanding of the neural network dynamics underlying HCI. This broader perspective could reveal additional insights into the complex interplay of brain regions involved in AI stance attribution explanation.
This study also acknowledges certain limitations regarding the generalizability of its findings. We did not collect detailed demographic information or assess user experience with AI/LLM technology, which could influence responses to system failures. Future research should explore the impact of user characteristics, such as prior experience with AI, tech savviness, and attitudes toward AI, on reactions to LLM errors. As user familiarity with and acceptance of AI/LLM technology evolve, expectations and responses to system failures might also change. Longitudinal studies are needed to understand how user perceptions and behaviors adapt over time in response to advancements in AI technology.