Speech Analysis and Tools in L2 Pronunciation Acquisition

A special issue of Languages (ISSN 2226-471X).

Deadline for manuscript submissions: closed (20 July 2023)

Special Issue Editors


Dr. Paolo Mairano
Guest Editor
Faculté des Langues, Cultures et Sociétés, Département d'études Anglophones, University of Lille, Lille, France
Interests: second language acquisition; phonetics; prosody; natural language

Dr. Sandra Schwab
Guest Editor
Institut de Français, University of Bern, Bern, Switzerland
Interests: second language acquisition; prosody; phonetics; psycholinguistics

Special Issue Information

Dear Colleagues,

Speech analysis techniques and tools are increasingly used in L2 pronunciation learning and teaching studies. They have been used extensively in research on the acquisition of L2 speech patterns to explore L2 perception and production, and are now the norm within this domain. They include spectral measures such as vowel formants, temporal measures such as voice onset time (VOT), and articulatory measures such as ultrasound tongue imaging (UTI), electroglottography (EGG) and electromagnetic articulography (EMA). However, even classical acoustic measurements such as vowel formants may pose challenges when applied to L2 speech: for example, Ferragne (2013) advises against the automatic extraction of formants for L2 data, since realisations may deviate considerably from the expected target; the automatic extraction of suprasegmental parameters, such as duration, pitch and intensity, may be more robust. Additionally, many available speech tools, such as automatic aligners, are problematic when used with L2 data (Ballier & Martin, 2013).
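To make the acoustic side of this concrete, the sketch below synthesises a vowel-like signal with known formants and recovers them with linear predictive coding (LPC), the technique underlying most formant trackers. It is a toy illustration in Python (NumPy only): the sample rate, bandwidths and LPC order are assumptions chosen for the demonstration and, as noted above, real L2 recordings are precisely where such automatic estimates become unreliable.

```python
import numpy as np

FS = 10_000  # sample rate in Hz (illustrative choice)

def synth_vowel(formants, bandwidths, f0=100, dur=0.5, fs=FS):
    """Crude vowel synthesis: a glottal impulse train fed through a
    cascade of two-pole resonators at the given formant frequencies."""
    n = int(dur * fs)
    x = np.zeros(n)
    x[::fs // f0] = 1.0  # impulse train at the fundamental frequency
    for f, bw in zip(formants, bandwidths):
        r = np.exp(-np.pi * bw / fs)
        a1 = -2 * r * np.cos(2 * np.pi * f / fs)
        a2 = r * r
        y = np.zeros(n)
        for i in range(n):  # all-pole resonator, direct form
            y[i] = x[i] - a1 * y[i - 1] - a2 * y[i - 2]
        x = y
    return x

def lpc_formants(x, order=6, fs=FS):
    """Estimate formant frequencies from the roots of an LPC polynomial
    fitted with the autocorrelation method."""
    x = x * np.hamming(len(x))
    r = np.array([x[:len(x) - k] @ x[k:] for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)]
                  for i in range(order)])
    a = np.linalg.solve(R, r[1:])  # predictor coefficients
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[roots.imag > 0]  # keep one root per conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    return sorted(f for f in freqs if 90 < f < fs / 2 - 50)

vowel = synth_vowel([500, 1500, 2500], [60, 90, 150])
print([round(f) for f in lpc_formants(vowel)])  # close to 500, 1500, 2500
```

With real recordings, one would frame the signal, pre-emphasise it, and choose the LPC order as a function of the sample rate; dedicated tools such as Praat remain the safer option for L2 data.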

Speech analysis is no longer restricted to studies on the acquisition of L2 sounds; recently, scholars have started introducing such tools into the classroom, usually to provide visual reinforcement for target pronunciation patterns, or to enable learners to focus on proprioception and self-correction. Among other tools, pitch visualisation has been used in the classroom to teach L2 intonation patterns (Herment, 2018) or tones (Chen, 2022), with the aim of illustrating target-like pitch contours or the pitch curve of learners’ own realisations. Similarly, visualisations of VOT have been used effectively in teaching interventions (Olson & Offerman, 2021) and to provide online pronunciation feedback via the Moodle platform (Wilson, 2008). The use of automatic formant plots to provide feedback on L2 vowel pronunciation was found to be too complex with the technology available less than twenty years ago (Setter & Jenkins, 2005), but recent studies have reported beneficial effects of advances in speech analysis tools (Rehman & Das, 2020). Among articulatory techniques, many recent studies have reported benefits of using ultrasound tongue imaging (UTI) to teach segmental contrasts involving different tongue configurations in a variety of L2s (Gick et al., 2008; Pillot-Loiseau et al., 2015; Sisinni et al., 2016).
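The pitch curves that such visualisations display can be obtained, in the simplest case, by frame-wise autocorrelation. The sketch below is a deliberately naive tracker run on a synthetic rising contour; all parameter values (frame length, hop, search range) are illustrative assumptions, and classroom tools such as Praat use far more robust algorithms.

```python
import numpy as np

FS = 16_000  # sample rate in Hz (illustrative choice)

def f0_contour(x, fs=FS, frame=0.04, hop=0.01, fmin=75, fmax=400):
    """Frame-wise F0 estimation: pick the autocorrelation peak within
    the lag range corresponding to [fmin, fmax] Hz."""
    n, h = int(frame * fs), int(hop * fs)
    lo, hi = int(fs / fmax), int(fs / fmin)
    track = []
    for start in range(0, len(x) - n, h):
        w = x[start:start + n] * np.hanning(n)
        ac = np.correlate(w, w, mode='full')[n - 1:]
        track.append(fs / (lo + np.argmax(ac[lo:hi])))
    return np.array(track)

# A synthetic "intonation contour": F0 rising from 120 to 180 Hz.
t = np.arange(FS) / FS  # one second of samples
inst_f0 = 120 + 60 * t
x = np.sign(np.sin(2 * np.pi * np.cumsum(inst_f0) / FS))  # pulse-like source
contour = f0_contour(x)  # values rise from about 120 to about 180 Hz
```

Plotting `contour` against time yields exactly the kind of rising curve a learner would be shown alongside a target realisation.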

Speech analysis techniques have also been used to develop L2 Computer-Assisted Pronunciation Teaching (CAPT) software, designed to provide corrective automatic scoring and/or feedback, both at the segmental (Saleh & Gilakjani, 2021) and prosodic (Schwab & Goldman, 2018) levels. Many of these tools are based on acoustic measurements of L2 speech. However, even in this case, specific issues arise; for example, should CAPT systems provide generalised or customised feedback to learners (Rogerson-Revell, 2021)? Should CAPT systems be based on a specific native model used as the target reference (e.g., Southern British English, General American English, etc.), a language-independent set of properties designed to reflect L2 speech comprehensibility (cf. Mairano et al., 2019), or a combination of both? Moreover, should they use predefined expert features such as voice onset time and vowel formants (which can be used to provide precise feedback to learners), or should they use machine learning-based approaches?
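As a toy illustration of the expert-feature route, the snippet below maps a measured VOT value onto categorical feedback. The reference ranges are made-up placeholders, not empirically validated targets, and a real CAPT system would first need to measure VOT reliably from the signal.

```python
# Hypothetical target VOT ranges in milliseconds (illustrative only).
VOT_TARGETS = {
    ('en', 'p'): (40, 90),   # aspirated English /p/
    ('fr', 'p'): (0, 30),    # short-lag French /p/
}

def vot_feedback(vot_ms, lang, stop):
    """Return rule-based pronunciation feedback for one measured VOT."""
    lo, hi = VOT_TARGETS[(lang, stop)]
    if vot_ms < lo:
        return f"VOT {vot_ms} ms is too short: try adding aspiration to /{stop}/."
    if vot_ms > hi:
        return f"VOT {vot_ms} ms is too long: try reducing aspiration on /{stop}/."
    return f"VOT {vot_ms} ms is within the target range for /{stop}/."

print(vot_feedback(25, 'en', 'p'))  # flags the VOT as too short
```

Real systems would derive such thresholds from reference data, handle measurement uncertainty, and decide whether the thresholds should be native-model-based or comprehensibility-based, which is precisely the open question raised above.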

The goal of this Special Issue is to showcase the state of the art of speech analysis tools and techniques by compiling fundamental studies on L2 speech acquisition, as well as studies with an applied perspective that use speech analysis techniques and tools to teach and/or evaluate L2 production or perception. We welcome contributions on a range of topics, including the following:

  • Innovative speech analysis techniques, methodologies or tools to study the acquisition of L2 speech patterns (production and/or perception), or discussions of the limits of specific L1-designed tools and methodologies for studying L2 speech data;
  • The development and/or use of speech analysis tools (acoustic or articulatory) in the context of L2 classrooms to improve learners’ pronunciation;
  • Innovations in the field of CAPT software for L2 pronunciation feedback and/or scoring.

We request that authors initially submit a proposed title and an abstract of 400–600 words summarising their intended contribution. Please send submissions to the Guest Editors ([email protected] and [email protected]) and to the Languages editorial office ([email protected]). Abstracts will be reviewed by the Guest Editors to assess their relevance to the Special Issue. Full manuscripts will then undergo double-blind peer review.

  • Abstract Submission Deadline: 30 January 2023
  • Maximum article length: 7000 words (excluding bibliography)

References

Ballier, N., & Martin, P. (2013). Developing corpus interoperability for phonetic investigation of learner corpora. In A. Díaz-Negrillo, N. Ballier, & P. Thompson (Eds.), Automatic Treatment and Analysis of Learner Corpus Data (pp. 151-168). Amsterdam: John Benjamins.

Chen, M. (2022). Computer-aided feedback on the pronunciation of Mandarin Chinese tones: using Praat to promote multimedia foreign language learning. Computer Assisted Language Learning, 1-26.

Ferragne, E. (2013). Automatic suprasegmental parameter extraction in learner corpora. In A. Díaz-Negrillo, N. Ballier, & P. Thompson (Eds.), Automatic Treatment and Analysis of Learner Corpus Data (pp. 151-168). Amsterdam: John Benjamins.

Gick, B., Bernhardt, B., Bacsfalvi, P., & Wilson, I. (2008). Ultrasound imaging applications in second language acquisition. In J. G. Hansen Edwards and M. L. Zampini (eds.), Phonology and Second Language Acquisition (pp. 309-322). Amsterdam: John Benjamins.

Herment, S. (2018). Apprentissage et enseignement de la prosodie : l’importance de la visualisation. Revue française de linguistique appliquée, 23(1), 73-88.

Mairano, P., Bouzon, C., Capliez, M., & De Iacovo, V. (2019). Acoustic distances, Pillai scores and LDA classification scores as metrics of L2 comprehensibility and nativelikeness. In Proceedings of ICPhS 2019 (International Congress of Phonetic Sciences) (pp. 1104-1108), Melbourne (Australia), 5-9 August 2019.

Olson, D. J., & Offerman, H. M. (2021). Maximizing the effect of visual feedback for pronunciation instruction: A comparative analysis of three approaches. Journal of Second Language Pronunciation, 7(1), 89-115.

Pillot-Loiseau, C., Kamiyama, T., & Antolík, T. K. (2015). French /y/-/u/ contrast in Japanese learners with/without ultrasound feedback: vowels, non-words and words. In Proceedings of the International Congress of Phonetic Sciences (ICPhS), Glasgow (UK), 10-15 August 2015, 1-5.

Rehman, I., & Das, A. (2020). Real-time visual acoustic feedback for non-native vowel production training. The Journal of the Acoustical Society of America, 148(4), 2657-2657.

Rogerson-Revell, P. M. (2021). Computer-assisted pronunciation training (CAPT): Current issues and future directions. RELC Journal, 52(1), 189-205.

Saleh, A. J., & Gilakjani, A. P. (2021). Investigating the impact of computer-assisted pronunciation teaching (CAPT) on improving intermediate EFL learners’ pronunciation ability. Education and Information Technologies, 26(1), 489-515.

Schwab, S., & Goldman, J.-P. (2018). MIAPARLE: Online training for the discrimination and production of stress contrasts. In Proceedings of the 9th Speech Prosody Conference (pp. 572-576), Poznan (Poland), 13-16 June 2018.

Setter, J. & Jenkins, J. (2005) State-of-the-art review article: Pronunciation. Language Teaching, 38, 1-17.

Sisinni, B., d’Apolito, S., Fivela, B. G., & Grimaldi, M. (2016). Ultrasound articulatory training for teaching pronunciation of L2 vowels. In Proceedings of ICT for language learning (pp. 265-270).

Wilson, I. (2008). Using Praat and Moodle for teaching segmental and suprasegmental pronunciation. In Proceedings of the 3rd international WorldCALL Conference: Using Technologies for Language Learning (pp. 112-115).

Dr. Paolo Mairano
Dr. Sandra Schwab
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form to submit a manuscript. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a double-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Languages is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • speech analysis
  • speech tools
  • L2 acquisition
  • L2 pronunciation
  • L2 teaching
  • CAPT

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the journal's website.

Published Papers (12 papers)


Research

Article
Could You Say [læp˺ tɒp˺]? Acquisition of Unreleased Stops by Advanced French Learners of English Using Spectrograms and Gestures
by Maelle Amand and Zakaria Touhami
Languages 2024, 9(8), 257; https://doi.org/10.3390/languages9080257 - 25 Jul 2024
Abstract
The present study analyses the production rates of stop-unrelease amongst advanced French learners of English before and after training. Although stop-unrelease may be regarded as a minor issue in English pronunciation teaching, it has received some attention in recent years. Earlier studies showed that amongst “phonetically naive English listeners”, the lack of release of /p/, /t/ and /k/ leads to lower identification scores. The present study analyses the speech of 31 French university students majoring in English to measure the efficiency of an awareness approach on the production of stop-unrelease. The experiment comprised three phases with a test and a control group. During Phase 1, both groups were asked to read pairs of words and sentences containing medial and final voiceless stops. We chose combinations of two identical stops (homorganic) or stops with different places of articulation (heterorganic), as well as stops in utterance-final position, e.g., “wait for me at that table over there”, “that pan”, or “I like that truck”. In Phase 2, one group watched an explanatory video to raise awareness of stop-unrelease in English before reading the Phase 1 words and sentences a second time; the remaining group was the control group and did not receive any training. Among the participants, 17 read a French text containing pairs of stops in positions similar to those in the English one, which served as an L1 baseline. In total, six students continued until Phase 3 (reading the same stimuli a month later; three in the control group and three in the test group). The results showed that sentence-final stops were overwhelmingly released (above 90%) in both English and French in Phase 1. Training had a significant impact on sentence-final stop-unrelease (p < 0.001), which rose from 9.65% to 72.2%. Progress was also visible in other contexts, such as heterorganic pairs of stops.
Based on these results, we strongly recommend the combined use of spectrograms and gestures to raise awareness in a classroom or for online learning so as to reach multiple learner profiles and further increase efficiency in pronunciation learning. Full article
(This article belongs to the Special Issue Speech Analysis and Tools in L2 Pronunciation Acquisition)

Article
Online Assessment of Cross-Linguistic Similarity as a Measure of L2 Perceptual Categorization Accuracy
by Juli Cebrian and Joan C. Mora
Languages 2024, 9(5), 152; https://doi.org/10.3390/languages9050152 - 23 Apr 2024
Abstract
The effect of cross-linguistic similarity on the development of target-like categories in a second or additional language is widely attested. Research also shows that second-language speakers may access both their native and the second-language lexicons when processing second-language speech. Forty-three Catalan learners of English performed a perceptual assimilation task evaluating the perceived similarity between English and Catalan vowels and also participated in a visual world eye-tracking experiment investigating between-language lexical competition. The focus of the study was the English vowel contrasts /iː/-/ɪ/ and /æ/-/ʌ/. The perceptual task confirmed that English /iː/ and /æ/ were perceptually closer to native Catalan categories than English /ɪ/ and /ʌ/. The results of the spoken word recognition task indicated that learners experienced greater competition from native words when the target words contained English /iː/ and /æ/, illustrating a close link between the two types of tasks. However, differences in the magnitude of cross-language lexical competition were found to be only weakly related to learners’ degree of perceived similarity to native categories at an individual level. We conclude that online tasks provide a potentially effective method of assessing cross-linguistic similarity without the concerns inherent to more traditional offline approaches. Full article

Article
Enhancing Language Learners’ Comprehensibility through Automated Analysis of Pause Positions and Syllable Prominence
by Sylvain Coulange, Tsuneo Kato, Solange Rossato and Monica Masperi
Languages 2024, 9(3), 78; https://doi.org/10.3390/languages9030078 - 28 Feb 2024
Abstract
This research paper addresses the challenge of providing effective feedback on spontaneous speech produced by second language (L2) English learners. As the position of pauses and lexical stress is often considered a determinative factor for easy comprehension by listeners, an automated pipeline is introduced to analyze the position of pauses in speech, the lexical stress patterns of polysyllabic content words, and the degree of prosodic contrast between stressed and unstressed syllables, on the basis of F0, intensity, and duration measures. The pipeline is applied to 11 h of spontaneous speech from 176 French students with B1 and B2 proficiency levels. It appeared that B1 students make more pauses within phrases and fewer pauses between clauses than B2 speakers, with a large diversity among speakers at both proficiency levels. Overall, lexical stress is correctly placed in only 35.4% of instances, with B2 students achieving a significantly higher score (36%) than B1 students (29.6%). However, great variation among speakers is also observed, ranging from 0% to 68% in stress position accuracy. Stress typically falls on the last syllable regardless of prosodic expectations, strongly influenced by syllable duration. Only proficient speakers show substantial F0 and intensity contrasts. Full article
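The lexical-stress component of a pipeline like the one described above can be sketched in a few lines. The cue set matches the abstract (F0, intensity, duration), but the z-score-and-sum decision rule and the numbers in the example are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def predicted_stress(syllables):
    """Guess which syllable of a word is stressed: z-score each cue
    (mean F0 in Hz, mean intensity in dB, duration in s) across the
    word's syllables and pick the syllable with the largest summed score."""
    m = np.asarray(syllables, dtype=float)      # shape (n_syllables, 3)
    z = (m - m.mean(axis=0)) / (m.std(axis=0) + 1e-12)
    return int(np.argmax(z.sum(axis=1)))

# 'TEAcher'-like word: first syllable higher, louder and longer.
word = [(220.0, 70.0, 0.18), (180.0, 62.0, 0.12)]
print(predicted_stress(word))  # 0 (initial stress)
```

Comparing the predicted index against the dictionary stress position, word by word, yields exactly the kind of stress-placement accuracy rate reported in the study.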

Article
Discrimination of Degrees of Foreign Accent across Different Speakers
by Rubén Pérez-Ramón
Languages 2024, 9(3), 72; https://doi.org/10.3390/languages9030072 - 23 Feb 2024
Abstract
Second-language learners often encounter communication challenges due to a foreign accent (FA) in their speech, influenced by their native language (L1). This FA can affect rhythm, intonation, stress, and the segmental domain, which consists of individual language sounds. This study looks into the segmental aspect of FA, exploring listeners’ perceptions when Spanish interacts with English. Utilizing the SIAEW corpus, which replaces segments of English words with anticipated Spanish-accented realizations, we assess the ability of non-native listeners to discriminate degrees of accent across male and female voices. This research aims to determine the impact of voice consistency on detecting accentedness variations, testing L1 Japanese and L1 Spanish participants. Results show that, while listeners are generally able to discriminate degrees of foreign accent across speakers, some segmental transformations convey a clearer distinction depending on the phonological representations of the native and accented realisations in the listener’s system. Another finding is that listeners tend to better discriminate degrees of accent when words sound more native-like. Full article

Article
Exploring the Accent Mix Perceptually and Automatically: French Learners of English and the RP–GA Divide
by Emmanuel Ferragne, Anne Guyot Talbot, Hannah King and Sylvain Navarro
Languages 2024, 9(2), 50; https://doi.org/10.3390/languages9020050 - 29 Jan 2024
Abstract
Acquiring a consistent accent and targeting a native standard like Received Pronunciation (RP) or General American (GA) are prerequisites for French learners who plan to become English teachers in France. Reliable methods to assess learners’ productions are therefore extremely valuable. We recorded a little over 300 students from our English Studies department and performed auditory analysis to investigate their accents and determine how close to native models their productions were. Inter-rater comparisons were carried out; they revealed overall good agreement scores which, however, varied across phonetic cues. Then, automatic speech recognition (ASR) and automatic accent identification (AID) were applied to the data. We provide exploratory interpretations of the ASR outputs, and show to what extent they agree with and complement our auditory ratings. AID turns out to be very consistent with our perception, and both types of measurements show that two thirds of our students favour an American, and the remaining third, a British pronunciation, although most of them have mixed features from the two accents. Full article

Article
Visible Vowels as a Tool for the Study of Language Transfer
by Wilbert Heeringa and Hans Van de Velde
Languages 2024, 9(2), 35; https://doi.org/10.3390/languages9020035 - 23 Jan 2024
Abstract
In this paper, we demonstrate the use of Visible Vowels to detect formant and durational differences between L2 and L1 speakers. We used a dataset that contains vowel measures from L1 speakers of French and from L2 learners of French with Italian, Spanish and English as L1. We found that vowels that are not part of the L1 phonological system are often pronounced differently by L2 speakers. Inspired by the Native Language Magnet Theory, introduced by Patricia Kuhl in 2000, we introduced magnet plots that relate the vowels shared by the French phonological system and the learners’ phonological system (the magnet vowels) to the vowels found only in the French phonological system. At a glance, it can be seen which vowels are attracted to the magnets and which vowels move further away from them. When comparing vowel spaces, we found that the shape of the French vowel space of the English learners differed most from that of the L1 speakers. Finally, the vowel durations of the L2 speakers were found to be longer than those of the L1 speakers of French, especially for the English learners of French. Full article

Article
After Self-Imitation Prosodic Training L2 Learners Converge Prosodically to the Native Speakers
by Elisa Pellegrino
Languages 2024, 9(1), 33; https://doi.org/10.3390/languages9010033 - 22 Jan 2024
Abstract
Little attention is paid to prosody in second language (L2) instruction, but computer-assisted pronunciation training (CAPT) offers learners solutions to improve the perception and production of L2 suprasegmentals. In this study, we extend, with an acoustic analysis, previous research showing the effectiveness of self-imitation training on the prosodic improvements of Japanese learners of Italian. In light of the increased degree of correct match between intended and perceived pragmatic functions (e.g., speech acts), we aimed at quantifying the degree of prosodic convergence towards the L1 Italian speakers used as a model for self-imitation training. To measure convergence, we calculated the difference in duration, F0 mean, and F0 max, syllable-wise, between L1 utterances and the corresponding L2 utterances produced before and after training. The results showed that after self-imitation training, L2 learners converged to the L1 speakers. The extent of the effect, however, varied based on the speech act, the acoustic measure, and the distance between L1 and L2 speakers before the training. The findings from the perceptual and acoustic investigations, taken together, show the potential of self-imitation prosodic training as a valuable tool to help L2 learners communicate more effectively. Full article

Article
The ProA Online Tool for Prosody Assessment and Its Use for the Definition of Acoustic Models for Prosodic Evaluation of L2 Spanish Learners
by Juan-María Garrido and Daniel Ortega
Languages 2024, 9(1), 28; https://doi.org/10.3390/languages9010028 - 15 Jan 2024
Abstract
Assessment of prosody is not usually included in the evaluation of the oral expression skills of L2 Spanish learners. Among the factors that probably explain this are the lack of adequate materials, correctness models and tools to carry out this assessment. This paper describes one of the results of the ProA (Prosody Assessment) project: a web tool for the online assessment of Spanish prosody. The tool allows the online development of evaluation tests and rubrics, the completion of these tests and their remote scoring. An example of the use of this tool for research purposes is also presented. Three prosodic parameters (global energy, speech rate, F0 range) of a set of oral productions by two L2 Spanish learners, collected using the tests developed in the project, were evaluated by three L2 Spanish teachers using the web tool and the ProA rubrics. The ratings were then compared with the results of the acoustic analysis of the same parameters to determine to what extent the evaluators’ judgements correlated with the prosodic parameters. The results obtained may be of interest, for example, for the development of future automatic prosody assessment systems. Full article

Article
An Open CAPT System for Prosody Practice: Practical Steps towards Multilingual Setup
by John Blake, Natalia Bogach, Akemi Kusakari, Iurii Lezhenin, Veronica Khaustova, Son Luu Xuan, Van Nhi Nguyen, Nam Ba Pham, Roman Svechnikov, Andrey Ostapchuk, Dmitrei Efimov and Evgeny Pyshkin
Languages 2024, 9(1), 27; https://doi.org/10.3390/languages9010027 - 12 Jan 2024
Abstract
This paper discusses the challenges posed in creating a Computer-Assisted Pronunciation Training (CAPT) environment for multiple languages. By selecting one language from each of three different language families, we show that a single environment may be tailored to cater for different target languages. We detail the challenges faced during the development of a multimodal CAPT environment comprising a toolkit that manages mobile applications using speech signal processing, visualization, and estimation algorithms. Since the underlying mathematical and phonological models, as well as the feedback production algorithms, are based on sound signal processing and modeling rather than on particular languages, the system is language-agnostic and serves as an open toolkit for developing phrasal intonation training exercises for an open selection of languages. However, it was necessary to tailor the CAPT environment to language-specific particularities in the multilingual setups, especially the additional requirements for adequate and consistent speech evaluation and feedback production. In this work, we describe our response to the challenges of visualizing and segmenting recorded pitch signals and of modeling the language melody and rhythm necessary for such a multilingual adaptation, particularly for tonal, syllable-timed and mora-timed languages. Full article

Article
Can L2 Pronunciation Be Evaluated without Reference to a Native Model? Pillai Scores for the Intrinsic Evaluation of L2 Vowels
by Paolo Mairano, Fabián Santiago and Leonardo Contreras Roa
Languages 2023, 8(4), 280; https://doi.org/10.3390/languages8040280 - 28 Nov 2023
Abstract
In this article, we explore the possibility of evaluating L2 pronunciation, and, more specifically, L2 vowels, without referring to a native model, i.e., intrinsically. Instead of comparing L2 vowel productions to native speakers’ productions, we use Pillai scores to measure the overlap between target vowel categories in L2 English (/iː/-/ɪ/, /ɑː/-/æ/, /ɜː/-/ʌ/, /uː/-/ʊ/) for L1 French, L1 Spanish, and L1 Italian learners (n = 40); and in L2 French (/y/-/u/, /ø/-/o/, /ø/-/e/, /ɛ̃/-/e/, /ɑ̃/-/a/, /ɔ̃/-/o/) for L1 English, L1 Spanish, and L1 Italian learners (n = 48). We assume that a greater amount of overlap within a contrast indicates assimilated categories in a learner’s production, whereas a smaller amount of overlap indicates the establishment of phonological categories and distinct realisations for members of the contrast. Pillai scores were significant predictors of native ratings of comprehensibility and/or nativelikeness for many of the contrasts considered. Despite some limitations and caveats, we argue that Pillai scores and similar methods for the intrinsic evaluation of L2 pronunciation can be used, (i) to avoid direct comparisons of L2 users’ performance with native monolinguals, following recent trends in SLA research; (ii) when comparable L1 data are not available; (iii) within longitudinal studies to track the progressive development of new phonological categories. Full article
(This article belongs to the Special Issue Speech Analysis and Tools in L2 Pronunciation Acquisition)
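As an illustration of the overlap measure the abstract describes (a sketch, not code from the article itself), a Pillai score for a two-vowel contrast can be computed from F1/F2 measurements as the Pillai–Bartlett trace of a one-way MANOVA. The function and the sample tokens below are hypothetical:

```python
import numpy as np

def pillai_score(formants, labels):
    """Pillai-Bartlett trace for a two-category vowel contrast.

    formants: (n, 2) array of F1/F2 values; labels: vowel category per token.
    Returns a value in [0, 1]: ~0 = heavily overlapping (assimilated)
    categories, ~1 = well-separated categories.
    """
    X = np.asarray(formants, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    H = np.zeros((X.shape[1], X.shape[1]))  # between-group SSCP matrix
    E = np.zeros_like(H)                    # within-group SSCP matrix
    for g in np.unique(labels):
        Xg = X[labels == g]
        d = Xg.mean(axis=0) - grand_mean
        H += len(Xg) * np.outer(d, d)
        centred = Xg - Xg.mean(axis=0)
        E += centred.T @ centred
    return float(np.trace(H @ np.linalg.inv(H + E)))

# Hypothetical F1/F2 tokens (Hz) for an /i:/-/I/ contrast
rng = np.random.default_rng(1)
tense = rng.normal([300, 2300], 40, size=(40, 2))
lax = rng.normal([450, 1900], 40, size=(40, 2))
score = pillai_score(np.vstack([tense, lax]), ["i"] * 40 + ["I"] * 40)
```

With two categories, the hypothesis matrix has rank one and the trace reduces to λ/(1+λ) for the single non-zero eigenvalue, which matches the "Pillai" statistic that R's `summary(manova(cbind(F1, F2) ~ vowel))` reports for a two-group design.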

18 pages, 6058 KiB  
Article
Quantitative Methods for Analyzing Second Language Lexical Tone Production
by Alexis Zhou and Daniel J. Olson
Languages 2023, 8(3), 209; https://doi.org/10.3390/languages8030209 - 5 Sep 2023
Viewed by 2569
Abstract
The production of L2 lexical tone has proven difficult for learners of tonal languages, leading to the testing of different tone training techniques. To test the validity of these techniques, it is first necessary to capture the differences between L1 and L2 tone datasets. The current study explores three analyses designed to compare L1 and L2 tone: (1) using a single deviation score, (2) using deviation score calculations for specific regions of tone productions, and (3) applying a complexity-invariant distance measure to the two time series datasets. These three analyses were tested using datasets sampled from a previous study testing the effects of a visual feedback paradigm on the production of L2 Mandarin tone. Results suggest the first two analyses, although useful for providing an overall evaluation of how L2 speakers’ pretest versus posttest productions compare to L1 speakers, lose critical information about tone, namely pitch height, contour, and the timing of the production. The third analysis, applying the complexity-invariant distance measure to the datasets, can provide the pertinent information lost from the first two analyses in a more robust manner. Full article
(This article belongs to the Special Issue Speech Analysis and Tools in L2 Pronunciation Acquisition)
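The third analysis the abstract mentions, the complexity-invariant distance (CID), is straightforward to sketch. The implementation below is an illustrative reconstruction, not code from the study, and assumes the two pitch contours have already been time-normalised to equal length:

```python
import numpy as np

def complexity_invariant_distance(x, y):
    """CID between two equal-length pitch contours (e.g., f0 in semitones).

    The Euclidean distance is scaled by the ratio of the contours'
    complexities, so that a flattened L2 rendition is not judged
    spuriously close to a wiggly L1 target of similar average height.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ed = np.linalg.norm(x - y)  # plain Euclidean distance

    def complexity(s):
        # length of the contour when its ups and downs are stretched out
        return np.sqrt(np.sum(np.diff(s) ** 2))

    cx, cy = complexity(x), complexity(y)
    cf = max(cx, cy) / max(min(cx, cy), 1e-12)  # correction factor >= 1
    return ed * cf
```

Identical contours give 0, and a learner's flat production of a dipping tone is penalised more heavily than the plain Euclidean distance alone would suggest, which is the "complexity-invariant" behaviour the abstract refers to.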

Review

13 pages, 694 KiB  
Review
Automatic Speech Recognition in L2 Learning: A Review Based on PRISMA Methodology
by Mireia Farrús
Languages 2023, 8(4), 242; https://doi.org/10.3390/languages8040242 - 20 Oct 2023
Cited by 3 | Viewed by 3428
Abstract
The language learning field also stands to benefit from the recent techniques that have revolutionised speech technologies. L2 learning, especially of the world's most widely spoken languages, increasingly relies on automated methods to assess linguistic aspects and provide feedback to learners, especially on pronunciation. On the one hand, only a few of these systems integrate automatic speech recognition as a helping tool for pronunciation assessment. On the other hand, most computer-assisted language pronunciation tools focus on the segmental level of the language, providing feedback on specific phonetic pronunciations while disregarding suprasegmental features such as intonation. The current review, based on the PRISMA methodology for systematic reviews, surveys the existing tools for L2 learning, classifying them by assessment level (grammatical, lexical, phonetic, and prosodic), and tries to explain why so few tools are nowadays dedicated to evaluating intonation. Moreover, the review also addresses existing commercial systems, as well as the gap between those tools and the research developed in this area. Finally, the manuscript closes with a discussion of the main findings and outlines future lines of research. Full article
(This article belongs to the Special Issue Speech Analysis and Tools in L2 Pronunciation Acquisition)