6.1. Self-Evaluation Accuracy
To fill this research gap, the current study examined (a) the influence of cognitive IDs on the accuracy of L2 speech self-evaluation (i.e., RQ1), and (b) how cognitive IDs affect the type of mismatch in self-evaluation (i.e., RQ2). According to the results, the participants’ self-evaluations and L1 English listeners’ evaluations of comprehensibility were weakly correlated, and the correlation was also weak in terms of accentedness. Against the backdrop of existing L2 speech self-evaluation studies, which have yielded mixed results in terms of comprehensibility (Trofimovich et al. 2016; Li 2018; Strachan et al. 2019; Saito et al. 2020b; Isbell and Lee 2022) and accentedness (Trofimovich et al. 2016; Li 2018; Isbell and Lee 2022), the current study supports the findings reported by Trofimovich et al. (2016) and Saito et al. (2020b) that L2 learners do not accurately perceive their own comprehensibility and accentedness. Moreover, the current study demonstrated that this mismatch still seems to occur even when learners can listen to their own speech. A comparison of the correlation coefficients for comprehensibility and accentedness suggests that assessing comprehensibility is somewhat more challenging for L2 learners than assessing accentedness.
With regard to the relationship between the participants’ confidence scores (i.e., how much higher/lower a learner’s self-assessment was compared to a listener-based assessment of that learner) and their actual scores (i.e., English listeners’ evaluations), strong negative correlations were found for both comprehensibility and accentedness. The findings corroborate the results of previous studies that have identified a strong negative link between confidence scores and others’ evaluations (e.g., Saito et al. 2020b), indicating that L2 learners who were rated as being more comprehensible and less accented by listeners often underestimated their own speech, whereas those who were perceived less favorably by others had a tendency to overestimate their own capabilities. This finding further confirms the presence of the Dunning–Kruger effect in L2 speech evaluations (cf. Kruger and Dunning 1999; Li and Zhang 2021; Ross 1998). Turning the focus to the distance scores (i.e., absolute differences between self-assessments and native speakers’ assessments), the current study reveals that the correlations between the distance scores and the native speakers’ evaluations differed depending on the construct. While there did not appear to be an association in terms of accentedness, the result shows a strong negative association in terms of comprehensibility; that is, more comprehensible L2 learners tended to make self-assessments that differed more from the listeners’ assessments. This finding differs from that of Isbell and Lee (2022), who found that more comprehensible speakers could self-assess their comprehensibility accurately. A follow-up Welch’s t-test analysis, which compared the absolute distances of the discrepancies between the overconfident group and the underconfident group, suggested that the underconfident group was further from the L1 English listeners’ evaluations (M = 2.5, SD = 1.9) than the overconfident group was (M = 1.4, SD = 1.4). This result further suggests that the participants tended to perceive their speech to be poor and less comprehensible even though they were actually reasonably comprehensible.
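For readers who wish to reproduce this type of analysis, the definitions above can be sketched in a few lines of Python. This is a minimal illustration only and not the analysis script used in the study: the ratings are hypothetical, the function names are invented for the example, and the code assumes a scale on which higher values are more favorable, so that a positive confidence score indicates overestimation.

```python
from math import sqrt
from statistics import mean, variance


def confidence_score(self_rating, listener_rating):
    """Signed gap; positive = overestimation (assumes higher = more favorable)."""
    return self_rating - listener_rating


def distance_score(self_rating, listener_rating):
    """Absolute gap between self- and listener-based ratings."""
    return abs(self_rating - listener_rating)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical (self, listener) ratings on a 9-point scale; not study data.
ratings = [(3, 7), (4, 8), (5, 6), (2, 6), (7, 5), (6, 5), (8, 6), (5, 4)]

conf = [confidence_score(s, l) for s, l in ratings]
dist = [distance_score(s, l) for s, l in ratings]

# Split the absolute distances by the direction of the miscalibration.
under = [d for c, d in zip(conf, dist) if c < 0]
over = [d for c, d in zip(conf, dist) if c > 0]

t, df = welch_t(under, over)
print(f"underconfident M = {mean(under):.2f}, overconfident M = {mean(over):.2f}")
print(f"Welch t = {t:.2f}, df = {df:.2f}")
```

Welch’s test is used here rather than Student’s t-test because the two groups need not have equal variances or equal sizes, which is typical when learners are split by the direction of their miscalibration.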
6.2. The Role of IDs in the Miscalibration of L2 Speech Evaluation
With regard to the first research question, which explored the role of IDs in the miscalibration of L2 speech evaluation, the current study formulated a hypothesis based on Hulstijn’s (2015, 2019) claim that cognitive factors play a minor role in L2 listening comprehension in comparison to linguistic knowledge. Since self-evaluation also involves listening to and analyzing one’s own speech, it was hypothesized that, while the impact of cognitive IDs might not be as significant as the effect of the learners’ linguistic knowledge or their experience IDs, these cognitive factors would still have an influence on the accuracy of the learners’ self-evaluations of their L2 speech. The mixed-effect model comparisons reveal that, among L2 linguistic proficiency (grammar, vocabulary, and pronunciation), experience (age of learning, hours of English classes per week, study abroad experience, and hours of conversation in English per week), and cognitive IDs (perceptual acuity, audio-motor integration, working memory, phonological memory, and implicit learning), the distance scores for comprehensibility could be best explained by a model with L2 linguistic proficiency variables, whereas those for accentedness were best explained by cognitive IDs. Overall, such a link may be explained by the differences in the constructs of comprehensibility and accentedness. L2 speech research shows that listeners’ attention is drawn to various linguistic aspects of speech, ranging from pronunciation to lexical and grammatical accuracy, when judging comprehensibility, while pronunciation alone tends to explain the judgment of accentedness (see Crowther et al. 2015; Kang et al. 2010; Saito et al. 2016, for instance). According to studies that explore the roles of cognitive IDs in L2 learning, explicit and implicit learning aptitude may contribute to improving learners’ representation of L2 systems and their processing abilities (e.g., Abrahamsson and Hyltenstam 2008; Granena 2013). Such tendencies also appear to hold for L2 pronunciation. For instance, learners with better aptitude profiles have demonstrated superior segmental and prosodic perception (Kachlicka et al. 2019; Saito et al. 2020a) and production (e.g., Saito et al. 2019). In the case of the current study, phonological memory (i.e., the ability to retain phonological information for more thorough sound decoding) was found to have a primary influence on accurate calibration. Therefore, the current study corroborates such evidence, suggesting that cognitive IDs may play a role in the self-evaluation of accentedness, which appears to require fine-grained segmental and phonological analysis to detect the influence of one’s own L1 on L2 performance. In contrast, cognitive IDs did not appear to significantly influence the self-evaluation of comprehensibility, as this construct is considered to encompass factors beyond pronunciation components (Saito et al. 2016).
With respect to implicit learning, it was found that the participants with better implicit learning tended to miscalibrate their accentedness. Implicit learning of sequences (i.e., the internalization of language patterns without conscious learning) may have led to inaccurate accentedness evaluation due to the learners’ familiarity with their own speech. Learners who are frequently exposed to their own speech patterns seem to develop familiarity with their voice features, resulting in their perceiving their own speech to be highly intelligible (Mitterer et al. 2020). In EFL contexts (the setting of the current participants), where exposure to the target language is often limited to the classroom, L2 learners who excel at implicit learning are continuously exposed to and practice their own speech patterns, rather than hearing speech produced by other users of English. This learning condition could lead to a greater familiarity with their own specific manner of speaking, instead of the internalization of target language patterns through others’ speech.
In the case of comprehensibility distance scores (i.e., the size of the gap between self- and other-assessment scores), speech proficiency (i.e., the participants’ actual comprehensibility) was found to contribute significantly to the miscalibration. This means that the higher the participants’ speech proficiency, the poorer their estimation was. As reported in the previous section, the participants’ self-assessment of speech exhibited the Dunning–Kruger effect (also see Saito et al. 2020b). Therefore, the current result may have been observed because the proficient participants wrongly underestimated their comprehensibility level. In turn, the finding could suggest that, compared to accentedness, where learners can concentrate on self-assessing the phonological features of their own speech, comprehensibility judgment may be more susceptible to the Dunning–Kruger effect.
Unlike L2 speech proficiency, the participants’ grammar and vocabulary knowledge appeared to contribute to a smaller distance score (i.e., better calibration). While comprehensibility judgment requires listeners to make a holistic and impressionistic judgment of the ease of understanding (or the amount of effort required to understand the speech), those who have better grammar and vocabulary knowledge could pay attention to the details of how well they could use grammatical and lexical items to deliver intended meanings. Due to the crucial role of lexicogrammatical accuracy in achieving better comprehensibility (cf. Saito et al. 2016), the participants may have needed to have a high level of grammar and vocabulary knowledge to globally and accurately assess how well they could convey the message they wanted to deliver.
6.3. Impact of IDs on the Overconfidence and Underconfidence in Self-Evaluation
Concerning the second research question, an exploratory approach was adopted in the current study to examine which learner IDs contributed to overconfidence and underconfidence in the self-evaluations. On one hand, the multiple regression analyses revealed that experience of having studied abroad, vocabulary knowledge, perceptual acuity, and implicit learning were associated with the overestimation of comprehensibility and accentedness. On the other hand, the number of English classes per week, grammar knowledge, speech proficiency, audio-motor integration, and phonological memory were linked to underestimation. Such patterns can be speculated upon from various cognitive and experiential perspectives. For example, L2 learners with experience of having studied abroad may have overestimated their skills due to immersion and intensive interactions in the target language environment, which might have led to inflated perceptions of their linguistic abilities. Enhanced vocabulary knowledge may have contributed to overconfidence because these learners may have felt more equipped to understand and make themselves understood, and may have overlooked the finer nuances of language use that still needed improvement. Furthermore, perceptual acuity may have fostered overconfidence because the learners with refined auditory discrimination could have mistaken their ability to detect subtle phonetic differences for greater proficiency in producing those sounds. This sensitivity to nuances of sound may have given the learners false confidence in their speech abilities and led them to believe that they were replicating the sounds accurately, when, in reality, the precision of their production may not have matched their perception.
As discussed in the RQ1 section, better implicit learning of sequences may have helped L2 learners effortlessly familiarize themselves with their own speech patterns. This familiarity may have fostered a sense of ease and confidence in their language abilities, as they became accustomed to the rhythms and sounds of their own speech. Consequently, this comfort could have caused them to overestimate their speaking abilities, mistaking familiarity for proficiency (see Ortega et al. 2022 for a similar discussion). Therefore, the participants in the current study might not have recognized the difference between their accustomed speech patterns and the native or target language features/patterns, leading to a gap between their perceived and actual proficiency.
Conversely, spending more time learning English in regular classes could have instilled a sense of underconfidence, as learners are constantly exposed to the complexities of the language, and the emphasis is on areas that need improvement. In fact, previous studies of L2 speech self-evaluations revealed that L2 learners’ perceived satisfaction with their performances strongly influenced how they evaluated themselves (e.g.,
Isbell and Lee 2022). As they were constantly reminded of the gaps in their knowledge and skills, this might have contributed to them having less confidence. Moreover, better grammar knowledge might have led to underestimation because the learners who were more aware of the detailed grammatical rules may have become more self-conscious about making errors. In the current study, the participants had access to their own speech and paid close attention to what they said. Therefore, when striving for accuracy, their focus on any mistakes that they heard might have caused them to notice their weak points more overtly, thus causing them to judge their speaking skills more harshly.
Two cognitive IDs, audio-motor integration and phonological memory, were also associated with underconfidence. Learners with advanced audio-motor integration are more likely to be adept at synchronizing their motor processes with auditory inputs, which is a pivotal skill for language production. This synchronization not only assists in perceiving sounds, but also in replicating sound sequences and patterns accurately, which is essential in language learning and in musical training (Patel and Iversen 2014). Given that audio-motor integration underpins the coordination between auditory perception and the motor planning required for sound production, we might assume that learners with well-developed audio-motor integration abilities would be particularly skilled at emulating the speech patterns of a target language. However, this precise mimicry could increase their awareness of discrepancies between their own speech production and that of native speakers, potentially leading to an underestimation of their own language skills as they focus on even minor deviations from the ideal pronunciation (
Flaugnacco et al. 2014;
Gordon et al. 2015). L2 learners with stronger phonological memories have the advantage of retaining phonological information for more in-depth decoding. Therefore, the participants in this study may have been more able to critically evaluate their speech compared to the benchmarks due to their enhanced memory capacity. Research has shown that the integration of sound and motor execution is essential for success in L2 acquisition, and is beneficial for different types of linguistic training (
Brekelmans et al. 2022;
Li and DeKeyser 2017;
Saito et al. 2021). However, this detailed understanding and increased awareness might lead learners to set extremely high goals. If they compare their speech to that of native speakers and fall short, they might think that their skills do not meet the desired level of quality.
The age of learning and the amount of conversation per week had different influences depending on the dimensions. Specifically, these factors were associated with a tendency toward overconfidence in the self-assessments of comprehensibility (e.g., see
Li 2018 for a similar result with age), yet these same factors were linked to underconfidence when evaluating accentedness. This suggests that earlier exposure to language learning and frequent conversational practice may boost learners’ self-perceptions of their ability to be understood, possibly due to the cumulative effect of extended practice over time. However, these factors did not appear to translate into confidence regarding accent, as the learners may have considered the elusive nuances of native-like pronunciation to be a challenge. By contrast, working memory showed the opposite trend, as it was linked to underconfidence in comprehensibility but overconfidence in accentedness. This could imply that learners with stronger working memory capacities may have been more critical of their ability to convey meaning effectively, perhaps due to more extensive self-monitoring, and they may also have overestimated their pronunciation skills, potentially overlooking subtle phonological details that characterize native speakers’ accents.
Overall, the variables linked to overestimation may foster a general sense of communicative efficacy, while those associated with underestimation may reflect a heightened awareness of linguistic precision and the gap between a learner’s current abilities and the target language norms. Having said this, the variance explained by the variables in each model was small (4.1% and 3.1%, respectively), and not all the ID variables were found to be statistically significant in either the mixed-effect modeling or the regression analyses. Therefore, the influence of learner variables such as experiential and cognitive IDs might be small to medium at most (
Ma and Winke 2019;
Ross 1998;
Suzuki 2015). Since studies in the field of psychology have indicated that the perceived difficulty of completing a task and the metacognitive awareness of the skill influence the level of confidence (e.g.,
Dunning et al. 1989;
Burson et al. 2006), the main factor in the miscalibration may have been psychological, such as the L2 learners’ degree of satisfaction with their own pronunciation and/or metacognitive profile; for example, the value they placed on L2 speaking and pronunciation (
Isbell and Lee 2022). In order to paint a fuller picture of the factors that affect the self-evaluation biases, further research with a wider range of IDs is required.
6.4. Limitations and Future Research Directions
Several methodological limitations need to be addressed. First, the participant population, drawn from a single demographic, may not capture the broader spectrum of L2 learners, which limits the generalizability of the findings (
Suzuki 2015;
Trofimovich et al. 2016). Second, the study’s methodology, which emphasized certain cognitive IDs, may have neglected other influential psychological or sociocultural factors that have an impact on self-evaluations of L2 speech. Third, the results related to the role of phonological memory may be inconclusive because the language of the cognitive tests used to measure participants’ phonological memory (adopted from
Gathercole et al. 2001) was only in L2 (i.e., English). Existing research on the nature and impact of learners’ phonological memory on L2 learning has suggested that the scores of phonological memory measured through the non-word stimuli created based on the L2 phonological system may not be an accurate representation of its language-independent component due to the influence of learners’ L2 knowledge (e.g., phonological features and semantic aspects of L2 lexical items; see
French and O’Brien 2008;
Van Der Lely and Gallon 2006). Therefore, to capture learners’ language-independent phonological memory, both L2-based non-word items and non-words generated from a language unfamiliar to the participants need to be prepared for the task. These two types of phonological memory scores (i.e., language-dependent and language-independent phonological memory scores) may help us further understand the role of phonological memory in learners’ self-evaluation behavior. In addition, the methodology allowed the participants to listen to their own speech for self-evaluation. However, other studies have often asked participants to self-evaluate without access to their speech (e.g.,
Saito et al. 2020b). This methodological choice may have influenced the self-assessment outcomes, which suggests that future studies should investigate the impact of the participants’ access or lack of access to their own speech on the accuracy of self-evaluations (
Isbell and Lee 2022). Expanding the research to include a more diverse participant base and comparing different self-assessment conditions could provide a more comprehensive understanding of the processes involved in self-evaluations of L2 speech.