“It’s a Bit Tricky, Isn’t It?”—An Acoustic Study of Contextual Variation in /ɪ/ in the Conversational Speech of Young People from Perth

Docherty, Gerard; Foulkes, Paul; Gonzalez, Simon

doi:10.3390/languages9110343

by Gerard Docherty ¹,*, Paul Foulkes ² and Simon Gonzalez ³

¹ School of Humanities, Languages and Social Sciences, Griffith University, Brisbane, QLD 4111, Australia
² Department of Language and Linguistic Science, University of York, York YO10 5DD, UK
³ School of Literature, Languages and Linguistics, Australian National University, Canberra, ACT 2600, Australia
* Author to whom correspondence should be addressed.

Languages 2024, 9(11), 343; https://doi.org/10.3390/languages9110343

Submission received: 28 April 2024 / Revised: 10 October 2024 / Accepted: 15 October 2024 / Published: 31 October 2024

(This article belongs to the Special Issue Advances in Australian English)

Abstract

This study presents an acoustic analysis of vowel realisations in contexts where, in Australian English, a historical contrast between unstressed /ɪ/ and /ə/ has largely diminished in favour of a central schwa-like variant. The study is motivated by indications that there is greater complexity in this area of vowel variation than has been conventionally set out in the existing literature, and our goal is to shed new light by studying a dataset of conversational speech produced by 40 young speakers from Perth, WA. In doing so, we also offer some critical thoughts on the use of Wells’ lexical sets as a framework for analysis in work of this kind, in particular with reference to the treatment of items in unstressed position, and of grammatical (or function) words. The acoustic analysis focused on the realisation in F1/F2 space of a range of /ɪ/ and /ə/ variants in both accented and unaccented syllables (thus a broader approach than a focus on stressed kit vowels). For the purposes of comparison, we also analysed tokens of the fleece and happy-tensing lexical sets. Grammatical and non-grammatical words were analysed independently in order to understand the extent to which a high-frequency grammatical word such as it might contribute to the overall pattern of vowel alternation. Our findings are largely consistent with the small amount of previous work that has been carried out in this area, pointing to a continuum of realisations across a range of accented and unaccented contexts. The data suggest that the reduced historical /ɪ/ vowel encountered in unaccented syllables cannot be straightforwardly analysed as a merger with /ə/. We also highlight the way in which the grammatical word it participates in this alternation.

Keywords: Australian English; vowel variation; acoustic properties; vowel weakening; grammatical words

1. Introduction¹

Descriptive accounts of Australian English vowels are commonly framed with reference to the lexical sets devised by Wells (1982). For example, there are discussions of a chain shift involving the short front vowels kit, dress and trap (e.g., Cox and Palethorpe 2008; Docherty et al. 2019; Grama et al. 2019) and of emergent regional variation in the realisation across the lexical sets goat, near, goose and thought (Cox and Palethorpe 2019). The lexical set categories are typically presented as the basic components of a vowel system and analysed and distributed accordingly in two-dimensional acoustic or auditory space.

However, although the Wells lexical sets are widely used and in many ways convenient, it should be borne in mind that they were not initially conceived as equivalent to phonological categories. They were devised primarily as a heuristic to counter “the incoherent mess of symbols used in … contemporary publications” at the time (Wells 2010). There are several limitations to the lexical sets. For example, they are based on the standard accents of the UK and US and have been acknowledged to need modification to serve as an adequate frame of reference for other varieties (e.g., D’Onofrio et al. 2019; Grama et al. 2019). Furthermore, with the exception of comma, letter and happy, they refer only to the vowel bearing primary stress in the citation form of a word. While these issues are well-known—albeit rather tacitly—the lexical set system is frequently invoked in studies of phonological variation and change (e.g., Kerswill et al. 2008; Sóskuthy et al. 2015; Loakes et al. 2017; Penney et al. 2023), leaving a number of issues unclear. Among these are the following: To what extent do lexical sets capture the cognitive and phonetic categories used by speakers and listeners? Are the lexical sets adequate to capture the systematic influences on phonetic variation that speaker–listeners experience? To what extent is sound change in vowels enacted through phonological categories equivalent to lexical sets? How does word frequency interact with macro-level category labels such as lexical sets? Intersecting with this last point, how do grammatical (or function) words behave relative to lexical words? There is no prima facie reason why grammatical words such as it, them and that should not be included within the kit, dress and trap lexical sets, respectively. However, as they are typically produced in unaccented form, they are generally excluded from variationist and corpus-based phonetic studies. 
They have therefore almost entirely fallen out of the scope of previous investigations (some exceptions are discussed in Section 2 below). That they are also often highly frequent words potentially obscures the role that lexical frequency or predictability might play in sound change, for example.

Recent work has pointed persuasively to speaker–listeners’ representation and processing of phonological categories being influenced by a wide range of factors woven through their experience as participants in natural spoken communication—factors such as lexical frequency, predictability, phonological environment and social-indexicality (e.g., Cohen Priva 2017; Foulkes and Docherty 2006; Foulkes and Hay 2015; Shaw and Kawahara 2018). It therefore seems prudent to question the extent to which the application of monolithic analytic categories such as lexical sets, especially when judged across large and diverse corpora, can provide the explanatory power that is needed in accounts of variation and change. In this article, we illustrate such issues, with reference to the case of kit in Australian English. In particular, we consider the relationship between kit and /ə/ (i.e., comma and letter in the Wells lexical set system) and whether they display a merger in certain contexts.

In the next section (Section 2), we summarise what is known about the kit–schwa relationship in Australian English. We highlight a number of complexities that cause us to look beyond the lexical set as an analytic category, in particular, with reference to the place of grammatical words in broader sound changes. Section 3 summarises the aims of this study in detail. In Section 4, we outline the methodology adopted to analyse kit in a corpus gathered in Perth, Western Australia. We consider various sub-categories of kit, including the pronoun it, and compare them to other vowels as reference points (fleece and /ə/ in a range of phonological contexts). The Perth data are then illustrated and discussed in Section 5. The implications of the data are discussed in Section 6, and concluding comments are drawn in Section 7.

2. kit–Schwa in Australian English

Conventional accounts of Australian English vowels (e.g., Cox and Palethorpe 2007; Cox and Fletcher 2017; Cox 2019) state uncontroversially that there is a class of lexical items in that variety that are realised with [ɪ] in the accented nucleus (words such as bid, bitter, skill). Collectively, this set of words makes up the kit lexical set. It is also often observed that the [ɪ] realisation is not found in unstressed syllables in Australian varieties. Cox (2019, p. 22), for instance, notes that “schwa /ə/ is most commonly used, and it does not functionally contrast with kit in this context”. Examples of words that would have this more central unstressed nucleus include rabbit, waited, races, etc. Wells (1982, p. 167) labels this absence of contrast between /ə/ and /ɪ/ in unstressed syllables as “Weak Vowel Merger”. This generates pairs of items that are not contrastively differentiated in Australian English (although they might be for many speakers of other varieties, such as some in the UK; Fabricius 2002; Tasker 2020; Butcher and Stoakes 2024): for example, Rosa’s v. roses, Lennon v. Lenin, boxers v. boxes. Wells’ use of the term “Merger” in this context is unpacked to a degree by Trudgill (2006, p. 117ff.), who notes that Australia’s early colonial settlers included speakers from areas in the British Isles marked by a presence of an /ə/-/ɪ/ contrast (in unstressed syllables) and others from areas where that contrast was absent, with the latter vowel configuration becoming prevalent in the variety that became established across the settler population.

The quality of the vowel in unstressed syllables is not phonetically [ə] in every instance, however. A close front [ɪ] realisation is found in unstressed nuclei preceding a velar or palato-alveolar coda (e.g., panic, earwig, kidding, radish, cabbage; Wells 1982, p. 601; Cox 2019, p. 23), all contexts where those historical varieties with an unstressed /ə/-/ɪ/ contrast would presumably have retained an unstressed /ɪ/ in the final syllable. The examples of this exception condition provided by Cox (2019) also include the item stomach, suggesting that the [ɪ] realisation might also be found in items that, in other varieties of English, would have an unstressed /ə/ as the coda nucleus. Thus, the picture that emerges from descriptive accounts is one of schwa being the norm for all unstressed nuclei, irrespective of any historical differentiation, apart from in the exceptional pre-velar and palato-alveolar contexts where a closer [ɪ] realisation is found.

Our understanding of these vowel alternations in Australian English is limited by the fact that they have been subject to relatively little quantitative investigation. Cox and Palethorpe (2018) focus on the acoustic characteristics of word-final schwa realisations, comparing F1 and F2 of (a) word-final tokens of lexically determined schwa (e.g., Rosa), (b) tokens with lexical coda schwa that were followed by a possessive suffix (e.g., Rosa’s), and (c) other tokens with an unaccented coda plural suffix /-əz/ (e.g., roses). This last category reflects the context in which the progenitor varieties of Australian English would typically have a realisation akin to [ɪ] (Fabricius 2002; Tasker 2020) but in which conventional accounts of Australian English varieties would predict a more schwa-like realisation. Cox and Palethorpe noted variable realisations across the three experimental conditions but reported no significant difference between the latter two contexts (Rosa’s and roses), a finding that is in line with the predictions of a loss of contrast in those environments. However, the findings also suggest that the realisation of the nucleus in those two contexts yielded a more [ɨ]-like realisation that differed significantly from that of the word-final lexical schwa tokens (i.e., the Rosa context). Cox and Palethorpe’s findings are echoed in an acoustic study by Butcher and Stoakes (2024). Also based on F1 and F2 estimates of unstressed schwa nuclei, this study found no significant difference between “lexical” vs. “affix” schwa realisations (both with a following consonant). Butcher and Stoakes also identified a more open and retracted realisation of schwa when it occurs in word-final position (e.g., in butter as opposed to its occurrence in buttered).

The results of both studies suggest that the vowel realisation patterns in the unstressed contexts investigated may not be quite as straightforward as the conventional account would have it. For example, while Cox and Palethorpe can point to a neutralisation of the kit/schwa contrast in the Rosa’s/roses contexts, the realisation is not the same as that found for /-ə/ in word-final/pre-pausal contexts (Rosa); indeed, the [ɨ] realisation highlighted by Cox and Palethorpe suggests a quality that is closer to that found in the realisation of kit (but since the scope of that study did not extend to kit, it is not possible to know for sure). Of course, as a phonological category, /ə/ is well-known for its variable realisation (e.g., Van Bergem 1994; Bates 1995; Flemming 2009), and Cox and Palethorpe rightly note the possibility of the closer realisation found in the Rosa’s/roses contexts being driven by the coronal place of articulation of the surrounding consonants. Likewise, they note (as do Butcher and Stoakes 2024) that the more open word-final Rosa context is consistent with reports that phrase-final comma and letter in some varieties of Australian English can be realised with a vowel variant that is rather more open than [ə], towards [ɐ] (e.g., Cox 2019, p. 22; Grama et al. 2020).

Cox and Palethorpe’s study did not consider the realisation of pre-velar unstressed tokens. However, Butcher and Stoakes suggest that in this context, too, the realisation of the unstressed nucleus may be less stable than conventionally thought. Inter-speaker variability was observed in the realisation of the unstressed nucleus (panic was consistently realised with what Butcher and Stoakes describe as [ɛ], but there was considerable inter-speaker variability with respect to paddock and stomach). This, in turn, raises the question of whether the more advanced and possibly also raised pre-velar tokens are simply contextually conditioned variants of a neutralised kit/schwa contrast or whether they are genuine exceptions to the general merger process.

Many of the questions emerging from this existing work are reminiscent of those arising in studies of unstressed vowels in other varieties of English (Fabricius 2002; Flemming and Johnson 2004; Gordon et al. 2004; Tasker 2020). In the context of work on variable /ɪ/~/ə/ neutralisation in unstressed syllables in UK varieties of English, Tasker (2020, p. 27) notes that “[d]escriptions of variation in unstressed vowels have … implied that it is categorical; that speakers always aim for either /ɪ/ or /ə/. This idea implies that all other vowel differences are completely neutralised, and does not consider the idea that there could be any other intermediate variants. These ideas are based more on native speaker intuition than empirical evidence, and there is reason to suppose that there could be more fine-grained variation in unstressed vowels”. In this light, alongside the aforementioned questions arising from Cox and Palethorpe (2018) and Butcher and Stoakes (2024), the present investigation focuses on the nature of those unaccented realisations in Australian speakers of English and their relationship to the realisations found in the accented syllables (to which they are related at least historically if not synchronically).

As a further dimension to this study, we note that previous studies of these vowel alternations in Australian English have, to date, not considered a range of other factors that elsewhere have been shown to play a role in the development of sound changes. These factors include lexical frequency and grammatical word class. While lexical frequency has generated considerable debate (e.g., Dinkin 2008; Hay and Foulkes 2016; Hay et al. 2015; Labov 1994; Phillips 2006; Pierrehumbert 2001), very few studies have considered the place of grammatical words in sound change. Among those who have are Bybee (2002, 2017) and Phillips (1983, with reference to text-based evidence regarding distant historical changes). A number of other works have analysed phonetic variation within grammatical words (e.g., Bell et al. 2003; Grama et al., submitted; Shi et al. 2005). Working within an exemplar-based framework, Bybee (2002, 2017) hypothesises that sound change should operate faster on words or phrases that occur frequently in the favouring context. The change is then generalised to other words and contexts. This could in principle include grammatical words if they are biased to occur in favouring contexts. In support of this hypothesis, Bybee (2002) shows that /-t, -d/ deletion in English coda obstruent clusters applies to reduced negatives (e.g., didn’t) more frequently than phonologically parallel lexical words (such as student). This is not simply because negatives are frequent, but because they tend to occur in the context most conducive to deletion, i.e., preceding consonants. In her data, negative verb forms occurred far more often before consonants than did the other lexical classes analysed (80% of tokens against 42–47%), and had the highest deletion rate. Phillips (1983) offers a more nuanced view, concluding that frequency and grammatical word class act independently as factors in sound change. 
In her historical (text-based) data, grammatical words are sometimes affected early in sound change but sometimes late. She observes a pattern in terms of the type of change involved. Interestingly, she notes that cases where grammatical words are in the vanguard of a change are invariably weakening processes—precisely the kind of change that has been described as (historical) weak vowel merger for kit/schwa (see also Phillips 2006 for a more detailed argument on the role of frequency in change). We are unaware of any quantitative studies of kit which include high-frequency grammatical items such as it, this, with. Such words are typically excluded from variationist and corpus-based phonetic studies (as they were in Docherty et al. 2019 and Grama et al. 2019, for example). In connected speech, these words are typically produced as unaccented syllables, which justifies their exclusion in an analysis based on lexical sets. However, they can also be produced as accented (e.g., for the purposes of emphasis—that’s it! or this is the one), in which case they would potentially meet the criteria for inclusion and would unambiguously have an [ɪ] realisation.

One further point of comparison relevant to contextualising our understanding of the realisation of short vowels in the close front quadrant of the Australian English vowel space is the close [i] variant found in unaccented syllables in the so-called happy-tensing environments, such as happy, movie, ready. These are characterised by Wells (1982, p. 602) and Cox (2019) as being phonetically aligned to the /i:/ phoneme or the fleece lexical set. However, the fact that they are not prone to diphthongisation in the same way as other fleece nuclei and are deemed “metrically weak” (Cox and Fletcher 2017, p. 119) suggests that they should perhaps not be classified straightforwardly as members of the more general /i:/ vowel category associated with the fleece lexical set in this variety, albeit they are rarely discussed in that light, if at all. An alternative account is offered by Butcher and Stoakes (2024), who suggest that the close variant arising from happy-tensing can best be described as an allophonic variant of the unstressed nucleus, although this is beyond the scope of their empirical study. More generally, the configuration of the kit and fleece lexical sets is also of interest for other reasons. Some accounts suggest that kit retains a relatively raised quality (“a fronter, higher position than the nucleus of fleece”; Purser et al. 2020, p. 278) or is differentiated in quality from the long close front fleece vowel primarily by the target for the latter being achieved later following an on-glide (e.g., Cox 2019). Several other studies have reported lowering of kit as part of the short front vowel chain shift (e.g., Cox and Palethorpe 2008; Grama et al. 2019).

In sum, while overview accounts of Australian English /ɪ/ and its relationship to schwa in unstressed contexts typically give the impression that there is relatively little complexity, the sparse quantitative data that are available suggest that the picture may in fact not be quite so straightforward. Further investigation is therefore warranted to enhance our understanding of short front vowel variability in this variety but also to contribute to exploration of the superordinate questions discussed above regarding the explanatory power of lexical set-based analytic categories and the position of grammatical words in sound changes.

3. Aims of This Study

In this study, we aim to shed new light on variation and potential change in the realisations of /ɪ/ from an analysis of the conversational speech of speakers from Perth, Western Australia. Our focus is on comparing vowel realisations across two groups of items: those which are uncontroversially classified as /ɪ/ and which have retained a realisation around [ɪ], and those where the contrast between /ɪ/ and /ə/ is conventionally said to have been lost but where the limited quantitative data available point to greater complexity.

Our study departs in a number of ways from the two previous studies with a specific focus on this topic, Cox and Palethorpe (2018) and Butcher and Stoakes (2024). First, we compare the contexts thought to be associated with the loss of contrast with other contexts where the contrast is maintained (not simply with /ə/, as in both previous studies; Butcher and Stoakes do briefly refer to the acoustic properties of /ɪ/ but with data from an earlier study with different participants). We also report comparisons with the realisations of the fleece lexical set and with the unstressed coda nuclei of items with word-final happy-tensing. Second, our data are drawn from conversational speech. Previous work has largely focused on analysis of /hVd/ keywords or, in the case of Cox and Palethorpe (2018), on a set of 12 isolated words identified to represent the phonological environments under investigation. Likewise, Butcher and Stoakes’ (2024) dataset consists of repetitions of 18 lexical items produced as the final item in a short carrier sentence. There is value in investigating highly controlled contexts such as these, but doing so unavoidably limits the dataset to contexts bearing a phrase-level accent and/or citation-form data, thereby limiting the phonetic variation that can potentially be observed across different phonological environments. However, in order to understand variation and change in general, and of /ɪ/ versus /ə/ in particular, it is valuable to see what happens to a vowel when its prosodic and phonological circumstances vary. Third, we have based this study on a sample of speakers of English from Perth, a location which has not previously been studied in relation to the variation that is in focus here (indeed, there has been very little research on this variety at all, but see, for example, Docherty et al. 2015, 2018, 2019; Cox and Palethorpe 2019).
Finally, we include tokens from the high-frequency grammatical item it in order to test the extent to which grammatical words participate in the loss of contrast with /ə/. As noted above, grammatical words are typically excluded from studies of phonological variation and change. This is despite their high frequency and the fact that, in principle, they meet the same structural conditions determining variation as are found in lexical words. We decided to consider a single grammatical word at this stage of our investigation in order to test how such items pattern in relation to the alternation in focus and thereby assess the value of sampling from a wider range of grammatical items in future work. The word it was chosen because it was the most frequent grammatical word in the kit category. It also posed fewer segmentation difficulties than other candidates such as in or with, where the target vowel is flanked by approximants, liquids or nasals. Furthermore, it is not subject to the types of categorical reduction processes (potentially including full vowel deletion) that affect some other grammatical words to the extent that they are reflected in spelling (e.g., will > -’ll, is > -’s). Extracting reliable vowel measurements in such cases would therefore depend on the precision or otherwise of the segmentation used as the basis for the forced alignment. Note that in this study, all of the alignment was manually checked (see further in Section 4).

The three questions we address are the following:

(1): What is the relationship between the acoustic properties of the vowels associated with the fleece and kit lexical sets for contemporary speakers of English in Perth?
(2): Is the realisation of unstressed vowels consistent with conventional accounts of a loss of contrast between /ɪ/ and /ə/ arising from a process of “weak vowel merger”?
(3): Does the status of the word as grammatical or lexical impact significantly the realisation of the unaccented syllables? To what extent do grammatical words participate in the putative vowel weakening?

4. Materials and Methods

The material analysed in this study was drawn from a corpus of recordings collected in Perth in 2014–2016. The materials consist of twenty pairs of young speakers (aged 18–22) engaged in same-sex unscripted conversations. Each of the conversations lasted around 30 min. There were equal numbers of males and females, and all of the participants had been fully schooled in Perth (from age 5). While social class is not a focus of the investigation reported here, twenty of the speakers were residents of neighbourhoods ranked by the Australian Bureau of Statistics to be in the top socio-economic decile, and the remaining twenty were from neighbourhoods with a lower socio-economic ranking. (Social class effects on short front vowel realisations are reported in Docherty et al. 2018, 2019.) The majority of speaker pairs knew each other in advance but to varying degrees. A fieldworker was present in the same room as the participants in order to initiate and conclude the conversation recording process but only intervened if the participants’ conversation subsided and they were in need of a prompt.

The recordings (44 kHz, 16-bit) were made using Sennheiser EW112-P-G3 lapel microphones and an Edirol R44 digital recorder. Conversations were transcribed with ELAN (2022), starting five minutes into each recording, thus skipping over the initial negotiation of the nature of the task and allowing the participants to relax into the conversation. They were then force-aligned within LaBB-CAT (Fromont and Hay 2012) using HTK (Young et al. 2006), with manual correction of misaligned segment boundaries.

In order to address the research questions itemised above, we created a subset of the corpus with tokens occurring across a range of relevant contexts. Each context is identified henceforth by the label given at the start of its entry in the list below.

Three of these were contexts where conventional accounts would predict an [ɪ] realisation and where we did not expect to encounter any centralisation of the vowel nucleus in focus:

(a): MONO: nucleus of monosyllabic content word (e.g., bid, trip).
(b): POLY_ACC: accented nucleus of a polysyllabic content word (e.g., bitter, issue).
(c): PREVEL_UNACC: unaccented nucleus in polysyllabic lexical items with a following velar or post-alveolar context (e.g., panic, earwig, kidding, radish).

A fourth context comprised tokens in unstressed environments where previous accounts typically indicate that there has been a loss of contrast between /ɪ/ and /ə/:

(d): UNACC: unaccented nuclei contained within a polysyllabic lexical item (e.g., massive², rabbit, races). Note that this condition excludes any pre-velar contexts, as these are covered by condition (c).

Two subsets of the grammatical word it were generated. They were differentiated because initial auditory analysis of the data suggested that it in phrase-final position might be associated with greater levels of reduction in the quality of the vowel nucleus.

(e): PHRINT_IT: tokens of grammatical item it occurring phrase-internally (e.g., because it was a cool name).
(f): PREPAUS_IT: tokens of grammatical item it occurring phrase-finally (in most, but not all cases, also pre-pausally). Such cases are typically either clitics (e.g., then you can do it #) or tags (e.g., it will be like permanently cancelled won’t it #; NB # is used here to indicate a pause).

Three further conditions were investigated in order to provide points of reference for our comparative analysis of the /ɪ/ conditions:

(g): FLEECE: tokens of the fleece lexical set realised in monosyllabic lexical words (e.g., beach, keep, see).
(h): HAPPY_T: tokens where we expected that the vowel nucleus would be raised and fronted as per the pattern of unstressed happy-tensing reported for this and other varieties of English (e.g., city, movie, ready).
(i): SCHWA: tokens of lexical /-ə/ in unstressed syllables of polysyllabic content words, equating to the comma and letter lexical sets (undifferentiated in Australian English, e.g., bitter, wonder, pasta, Asia, society, fatigued).
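
For readers who wish to code tokens along similar lines, the nine conditions listed above can be held in a small lookup table. The Python sketch below is illustrative only: the group names (`predicted_I`, `contrast_loss`, `grammatical_it`, `reference`) and the helper `conditions_in_group` are hypothetical labels introduced here, and are not taken from the analysis pipeline used in the study.

```python
from typing import Dict, List

# Condition labels follow the list above.
# The group names on the right are hypothetical, chosen for illustration.
CONDITIONS: Dict[str, str] = {
    "MONO": "predicted_I",          # (a) monosyllabic content word
    "POLY_ACC": "predicted_I",      # (b) accented nucleus, polysyllabic word
    "PREVEL_UNACC": "predicted_I",  # (c) unaccented, pre-velar/post-alveolar
    "UNACC": "contrast_loss",       # (d) unaccented, other contexts
    "PHRINT_IT": "grammatical_it",  # (e) phrase-internal "it"
    "PREPAUS_IT": "grammatical_it", # (f) phrase-final "it"
    "FLEECE": "reference",          # (g) comparison point
    "HAPPY_T": "reference",         # (h) comparison point
    "SCHWA": "reference",           # (i) comparison point
}


def conditions_in_group(group: str) -> List[str]:
    """Return the condition labels assigned to one (hypothetical) group."""
    return [label for label, g in CONDITIONS.items() if g == group]
```

A call such as `conditions_in_group("grammatical_it")` returns the two it conditions in the order in which they are listed above.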

Using default settings in Praat (Boersma and Weenink 2018), F1 and F2 values were estimated for each monophthong at the midpoint of each token (see Cox and Docherty 2023 for an overview of the caveats that apply to this static approach to vowel description). As an exception to this measurement protocol, F1 and F2 estimates for fleece were extracted at the point 80% through the duration of the token. This allows for the on-glide that tends to characterise realisations of this vowel in Australian English speakers (Cox et al. 2014) and ensures that the estimates were taken close to where F2 reached its peak. Pre-/l, w, j/, pre-nasal and post-/w, j, r/ environments were excluded, as well as post-nasal and post-lateral tokens where segmentation could not be undertaken reliably. In order to allow for comparison with Cox and Palethorpe (2018) and Butcher and Stoakes (2024), the formant estimates were not normalised, and consequently the findings for male and female speakers are reported separately below (note that for both of the aforementioned previous studies, the speech sample consisted exclusively of female speakers). Vowel duration measurements were also extracted, but (in contrast to the analysis presented by Docherty et al. (2019) focusing on the realisation of kit tokens) statistical analysis revealed no correlation between duration and the vowel quality of the set of /ɪ/ tokens.³ For the sake of exposition, we therefore ignore duration in the data and discussion that follow.
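
The timing rule described above (midpoint for most vowels, 80% of the way through the token for fleece) can be sketched in a few lines of Python. This is a minimal illustration rather than the script used in the study: the function name `measurement_time` and its arguments are hypothetical, and the sketch assumes that only tokens labelled FLEECE take the later measurement point.

```python
def measurement_time(start_s: float, end_s: float, condition: str) -> float:
    """Return the time (in seconds) at which F1/F2 would be sampled.

    Assumption: only the FLEECE condition is measured at 80% of the
    token's duration; every other condition is measured at the midpoint.
    """
    if end_s <= start_s:
        raise ValueError("token end must be later than token start")
    proportion = 0.8 if condition.upper() == "FLEECE" else 0.5
    return start_s + proportion * (end_s - start_s)
```

For a token spanning 1.0 s to 1.2 s, the function returns a point at roughly 1.16 s if the token is FLEECE and roughly 1.1 s otherwise. The formant estimation itself would then be carried out at that time point in Praat.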

Table 1 provides a summary of the dataset, itemising the number and percentage of tokens in each category.

5. Findings

Figure 1 shows separately for female and male speakers the distribution in F1/F2 space (Hz) of the realisations corresponding to all of the conditions set out above. The condition labels are centred on the mean F1/F2 value with ellipses plotted at ±1 standard deviation.
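
The label positions and ellipse extents described above rest on per-condition means and standard deviations. A minimal Python sketch of that summary step follows, assuming tokens are available as (condition, F1, F2) tuples. The function name `summarise_conditions` and the input layout are hypothetical, and the sketch computes separate F1 and F2 sample standard deviations rather than reproducing whatever ellipse-fitting routine was used for Figure 1.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, List, Tuple

Token = Tuple[str, float, float]  # (condition, F1 in Hz, F2 in Hz)


def summarise_conditions(tokens: Iterable[Token]) -> Dict[str, Dict[str, float]]:
    """Return the mean and sample SD of F1 and F2 for each condition."""
    grouped: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for condition, f1, f2 in tokens:
        grouped[condition].append((f1, f2))

    summary: Dict[str, Dict[str, float]] = {}
    for condition, values in grouped.items():
        f1s = [f1 for f1, _ in values]
        f2s = [f2 for _, f2 in values]
        summary[condition] = {
            "mean_f1": mean(f1s),
            "mean_f2": mean(f2s),
            # stdev needs at least two values; report 0.0 for a single token.
            "sd_f1": stdev(f1s) if len(f1s) > 1 else 0.0,
            "sd_f2": stdev(f2s) if len(f2s) > 1 else 0.0,
        }
    return summary
```

Because the formant values in this study were not normalised, a summary of this kind would be run separately for female and male speakers.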

An immediate observation is that a representation in F1/F2 space creates challenges for visualising the relationship between the individual conditions. This is not only due to the number of conditions contained within the plots but also to a large extent due to the considerable overlap of the realisations across those distributions, something which was expected given that the material analysed is from conversational speech where the factors driving token-to-token variation are more prevalent than in isolated word tokens. Therefore, for our analysis of the acoustic data, we adopted an approach previously deployed by Labov et al. (2013) and Grama et al. (2019), calculating F2 − (2 × F1) as a derivative indicator of relative location along the front diagonal of the vowel space. The F2 − (2 × F1) value provides a single metric of the acoustic properties of each token and, in the process, provides clearer comparative visualisations of the various conditions under investigation. Use of this unitary metric also has the advantage of allowing us to measure the dimension that we are interested in without having to carry out quantitative analyses of F1 and F2 separately and assuming no covariance, when in fact they are both closely determined by the overall shape of the vocal tract. We refer to this measure henceforth as F2deriv (Hz). As explained in detail by Labov et al. (2013, p. 40), higher values of F2deriv equate to a relatively closer and fronter vowel quality, precisely the dimension that is the focus of this study, as is evident from the overall distributions in Figure 1.
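
The F2deriv metric is simple enough to state directly in code. The Python sketch below applies the formula F2 − (2 × F1) to a single token; the function name `f2deriv` and the formant values in the usage note are hypothetical and chosen only to show the direction of the effect, not drawn from the Perth data.

```python
def f2deriv(f1_hz: float, f2_hz: float) -> float:
    """Return F2 - (2 * F1) in Hz for one vowel token.

    Higher values correspond to a relatively closer and fronter vowel
    quality along the front diagonal of the vowel space.
    """
    return f2_hz - 2.0 * f1_hz
```

With illustrative values, a close front token with F1 = 400 Hz and F2 = 2200 Hz gives `f2deriv(400, 2200)` = 1400.0, whereas a more open and central token with F1 = 550 Hz and F2 = 1500 Hz gives `f2deriv(550, 1500)` = 400.0. The closer, fronter token therefore receives the higher score, as expected.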

Figure 2 shows violin plots of the distributional density of F2deriv for each of the full set of conditions making up the dataset (females in the top panel, males in the lower panel). Viewing the data through the F2deriv lens in this way provides a more tractable means of addressing our research questions.

While the overlap across conditions that was evident in Figure 1 is still readily apparent within the violin plots, Figure 2 brings to the foreground a degree of clustering that substantially aligns with previous accounts of variation across the different contexts investigated. Thus, for female speakers, the FLEECE and HAPPY_T tokens (plots 1 and 2 to the extreme left of Figure 2, top panel) have the highest distribution of F2deriv values, with a second somewhat more open cluster being formed by the three conditions in which the kit/schwa contrast is said to be maintained (MONO, POLY_ACC and PREVEL_UNACC tokens; plots 3, 4 and 5). Less close and front realisations are found for the three conditions associated with a loss of the kit/schwa contrast (UNACC, PHRINT_IT and PREPAUS_IT: plots 6, 7 and 8), and the lowest F2deriv distribution—but also the most variable—is found for SCHWA (plot 9 on the extreme right). For male speakers, the patterns are largely the same. The principal difference is for PREPAUS_IT tokens, such that the nucleus of phrase-final it has a distribution that is skewed somewhat lower in F2deriv than is found in the other two environments associated with loss of the kit/schwa contrast (UNACC and PHRINT_IT). This distribution is also somewhat lower than is found for the female speakers; the distribution appears to reflect a higher proportion of PREPAUS_IT tokens closer to the centre of gravity of the SCHWA condition than is found in the PHRINT_IT condition.

The most strongly anticipated contrast-loss condition, UNACC, shows a good deal of overlap with SCHWA (as shown by comparing plots 6 and 9 in each panel). However, with most of the UNACC tokens falling within the higher end of the range of the SCHWA distribution and many SCHWA tokens falling at the lower end or outside of the UNACC range, it seems unlikely (at least based on visual scrutiny of the plots) that the UNACC and SCHWA tokens are components of a single distribution as might be expected if this alternation could faithfully be referred to as a weak vowel merger. This difference could reflect the fact that a good number of SCHWA tokens occur phrase-finally, eliciting a vowel variant that is more open than [ə], towards [ɐ]. (We did not explore this issue quantitatively within our data set, but the spread of data in Figure 2 suggests that a good number of SCHWA tokens were relatively open and also slightly back; our impression from auditory analysis is that open variants are found variably across speakers and perhaps less consistently than reported for other Australian varieties.) A further potential factor here is the impact on the SCHWA realisations of differences in place of articulation of adjacent consonants; in this regard, we note Penney et al.’s (2021) finding that /ə/ is somewhat retracted under the influence of adjacent bilabial plosives.

In order to gauge the contribution of the various conditions to the overall distribution of F2deriv across the full dataset, linear mixed-effects models were calculated using the lmer function as part of the lme4 package (Bates et al. 2015) in R (2020). Probability values were calculated using the lmerTest package (Kuznetsova et al. 2017). F2deriv was configured as the dependent variable. Speaker and word were included as random effects, and the condition (with its nine levels) was included as a fixed effect. MONO was chosen as the reference predictor for the model, as it is the archetypal context in which [ɪ] is encountered, and it therefore provides a useful basis on which to make statistical comparisons across conditions. The data for females and males were modelled separately.

The parameters of the models that were generated can be found in Appendix A. For ease of interpretation, they are depicted visually in Figure 3 (drawn using sjPlot—Lüdecke 2018). The quantities shown for each condition in Figure 3 are the estimates for F2deriv, indicating the difference between each condition and the reference condition MONO (i.e., kit in stressed monosyllabic words), which is represented in Figure 3 as the vertical zero line. The length of the horizontal line for each estimate indicates the 95% confidence interval (CI), the full details of which are provided in Appendix A. Estimates that fall below and above the reference intercept are shown in different colours. Thus, for example, the estimate for F2deriv of FLEECE for the female speakers was 336 Hz higher (shown in blue) than the reference value for MONO, while the estimate for POLY_ACC was 16 Hz lower (shown in red) than MONO. The asterisks indicate the level of the probability value associated with each of the predictors (* p < 0.05, ** p < 0.01, *** p < 0.001).
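The way these estimates are read, as offsets from a reference condition fixed at zero, can be illustrated with a deliberately simplified sketch. It uses raw condition means on invented values and does not fit a mixed-effects model (there are no random effects for speaker or word), so it shows only the logic of reference-level comparison, not the analysis reported here. The function name and the toy data are assumptions.

```python
from statistics import mean


def offsets_from_reference(values_by_condition, reference="MONO"):
    """Difference between each condition's mean F2deriv and the reference mean.

    Simplification: raw means only; speaker and word effects are ignored.
    """
    ref_mean = mean(values_by_condition[reference])
    return {
        condition: mean(values) - ref_mean
        for condition, values in values_by_condition.items()
        if condition != reference
    }


# Invented F2deriv values (Hz), for illustration only.
toy = {
    "MONO": [1400.0, 1500.0],
    "FLEECE": [1750.0, 1850.0],
    "SCHWA": [500.0, 700.0],
}

print(offsets_from_reference(toy))
print(offsets_from_reference(toy, reference="SCHWA"))
```

Passing a different `reference` changes which condition sits at zero, while the distances between conditions stay the same; only the comparisons expressed directly as offsets change.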

The statistical analysis provides a clear indication of the divergence of MONO, FLEECE and SCHWA conditions, notwithstanding the overlap in F2deriv distributions evident in Figure 2. The comparison of MONO with FLEECE is in line with accounts of changes in Australian English short front vowels that point to the kit lexical set lowering and distancing from the fleece lexical set, although the extent of overlap evident in Figure 1 and Figure 2 suggests that in this variety, the change is at a relatively early stage: Many MONO and FLEECE tokens yield similar F1/F2 values. Note also that, in line with the clustering shown in Figure 2, the estimates for HAPPY_T pattern closely with FLEECE. This does not indicate that the two vowels are identical, however; recall that values for HAPPY_T were taken at the midpoint, while those for FLEECE were taken 80% through the duration in order to avoid possible onglides. There is also a difference in duration, as expected, with HAPPY_T shorter than FLEECE overall, although the margin of that difference is relatively modest (mean durations of 104 ms v. 125 ms for females and 83 ms v. 113 ms for males).

The estimates for the two conditions where [ɪ] is anticipated (POLY_ACC, PREVEL_UNACC) do not diverge significantly from the MONO reference intercept. This is not surprising in the case of POLY_ACC since tokens in this category, along with those contained within the MONO context, sit transparently within the kit lexical set. It is arguably more interesting in the case of PREVEL_UNACC. Despite being unstressed, tokens in this category also appear to align phonetically with vowels in the kit lexical set, thereby suggesting the retention of an underlying /ɪ/ in those items. This finding supports previous accounts (e.g., Cox 2019) indicating that the PREVEL_UNACC condition is a straightforward exception to the historical unstressed centralisation/merger process.

The modelling of the two it conditions paints a more complex picture. While the distribution of PHRINT_IT tokens (i.e., phrase-internal it) is skewed lower than MONO, as can be seen in both Figure 2 and Figure 3, the modelling suggests that this difference is not significant for either males or females. This seems to be largely attributable to the much lower precision associated with the estimate for PHRINT_IT, which in turn is suggestive of high variability in the realisation of that condition, possibly reflecting the fact that this condition does not differentiate across the various functions associated with phrase-internal it (see below for further discussion). For PREPAUS_IT tokens (phrase-final it), on the other hand, the model delivers a significant difference between the relevant estimate and that for MONO, reflecting (along with Figure 2) the somewhat lower distribution of realisations as measured by F2deriv.

Finally, in order to test for the statistical relationship between the SCHWA condition and UNACC condition, we ran an additional mixed-effects model using the same specification as described above, but in this case, with SCHWA chosen as the reference predictor. The data for females and males were again modelled separately. The parameters of the models that were generated can be found in Appendix B. The comparison of the F2deriv estimates for the reference vowel (SCHWA) with the UNACC condition yielded a significant difference for both males and females, suggesting that despite the substantial overlap in their distributions, the two conditions were generating divergent patterns of realisation. The difference is also clearly visible in Figure 2. This divergence was in the same direction as found in the earlier study by Cox and Palethorpe (2018), with the UNACC tokens tending to be closer and fronter overall than the SCHWA tokens. The SCHWA-referenced mixed-effects model also shows that the estimates for PREPAUS_IT tokens do not differ significantly from those of the SCHWA condition, a finding that is in line with the somewhat lower F2deriv values for PREPAUS_IT tokens referred to above. This is the case for both male and female speakers.

6. Discussion

In this study, we set out to address three questions:

(1): What is the relationship between the fleece and kit lexical sets for contemporary speakers of English in Perth?

Our data suggest that there is only a modest lowering and retracting of kit vis-à-vis fleece in this variety. The F1/F2 values for kit (as represented by the MONO and POLY_ACC conditions) largely overlap those of fleece but with kit realisations concentrated in the lower end of the fleece distribution. Despite this overlap, statistical comparison of these distributions suggests that the realisations corresponding to the two lexical sets are not samples of the same distribution. This result is broadly in line with comparisons of kit vis-à-vis fleece made in some previous studies (e.g., Billington 2011; Cox and Palethorpe 2008; Grama et al. 2019). These studies report varying degrees of divergence between the two, with kit positioned in a lower and more open area of F1/F2 space while retaining a high level of acoustic proximity (and presumably significant overlap, although that is not so easy to discern when visualisations focus solely on F1/F2 means). One caveat applying here is the need to consider whether the 80% measurement point deployed for FLEECE tokens is the optimal basis on which to gauge the quality of that vowel category. Comparisons between kit and fleece are heavily contingent on differing assumptions made about the nature of the fleece realisations. For example, fleece is classified as a member of the set of monophthongs in the “HCE” vowel taxonomy that has become the de facto standard for describing Australian English (Harrington et al. 1997), but as a diphthong in some other studies (e.g., Elvin et al. 2016; Grama et al. 2021; Penney et al. 2023)—and analysed as such. This variability can render a direct comparison with kit a little problematic. When classified as a diphthong, the acoustic measures capturing the starting point and trajectory of a glide do not provide a consistent basis for comparing with the midpoint of monophthongs.

(2): Is the realisation of unstressed vowels consistent with conventional accounts of a loss of contrast between /ɪ/ and /ə/ arising from a process of “weak vowel merger”?

In general, the findings are in line with existing accounts of the conditions under which the contrast between /ɪ/ and /ə/ is said to diminish and with the findings of the previous acoustic study by Cox and Palethorpe (2018). MONO and POLY_ACC tokens are, as predicted, fronter and closer overall than those found in unaccented syllables in the UNACC content words (with the anticipated exception of the PREVEL_UNACC condition, which aligns with the realisation of MONO and POLY_ACC tokens).

The comparison between the UNACC and SCHWA conditions directly addresses Cox and Palethorpe’s (2018) observation that the reduction of contrast in unstressed syllables often results in a vowel which has more of an [ɨ] quality than [ə]. While there is a good deal of overlap in the vowel realisations across these two conditions, the statistical analysis points to there being a significant difference between the UNACC and SCHWA distributions, with the former condition being associated with a closer vowel quality overall. As noted above, one potential explanatory factor for this difference is the extent to which the realisation of the SCHWA condition is itself influenced by a tendency for a more open vowel in word-final/pre-pausal contexts (reported for many speakers of Australian English, e.g., by Grama et al. 2020, as well as by both Cox and Palethorpe 2018; Butcher and Stoakes 2024). Further investigation is needed to ascertain the extent to which this is a factor in the current dataset, especially given that in both previous studies, the word-final tokens are also pre-pausal, thus making it difficult to discriminate between an explanation focused on lexical vs. phrasal phonological context. But, overall, the present findings do suggest that caution is needed in conceiving of the unstressed realisations as a simple merger of /ɪ/ and /ə/. If the contrast is functionally absent as suggested by Cox (2019) but subject to some predictable allophonic variation as proposed by Butcher and Stoakes (2024), there are evidently a number of factors at play in determining the phonetic properties of the phonological category arising from the fusion of /ɪ/ and /ə/.

One of the most prominent features of the results is the extent of variability and overlap across all of the conditions (not least across conditions which are differentiated statistically). For both female and male speakers, there is a substantial range of F2deriv values that is shared by all of the conditions investigated. This is perhaps not too surprising given that the data have been sampled from conversational speech, which is inherently more variable than the more controlled isolated word material that is prevalent in previous work, and also given the known proclivity of /ə/ variants to be strongly context-dependent (Cohen Priva and Strand 2023; Penney et al. 2021; Tasker 2020). These overlapping distributions, however, do beg questions for further research regarding the factors that drive this variability and their impact on the likelihood of a token being closer and fronter. It would be instructive to consider not only the immediate phonological environment, as tested to an extent in this study, but also phonological and social factors such as speaker, speech rate, prosodic conditioning, etc. We should also note that while we have referred to “conversational” speech (in contrast to the read passages or isolated word styles that characterise a lot of existing work in this area), in reality, each of our conversations comprises a number of sub-styles (e.g., “banter” between the interlocutors, narrative story-telling, sharing of information, etc.). It is important not to simply assume that conversational speech is a unitary style. The validity of pooling tokens across long conversations such as those used in this study is a matter for further investigation.

(3): Does the status of the word as grammatical or lexical impact significantly on the realisation of the unaccented syllables? To what extent do grammatical words participate in the putative vowel weakening?

Tokens of the grammatical word it do appear to pattern overall with the F2deriv distribution found in the UNACC condition where a reduced nucleus is found, although the quantitative modelling is not conclusive about whether the it distributions are divergent from those of the MONO condition. The model estimates for phrase-final it conditions in Figure 3 (PREPAUS_IT) do reach significance, while those for phrase-internal it tokens (PHRINT_IT) do not—although the tendency observable in this data set is clearly intermediate between MONO and the significantly weakened conditions. In both cases, the confidence of the fit to the mixed-effects model is relatively low (as reflected by the CIs observable in Figure 3). As noted above, this may reflect complexity within the distribution of realisations arising from the range of different grammatical functions being undertaken by it; for example, tokens include it acting as subject pronouns, thus bearing a degree of stress, and others where it is cliticised and unstressed as in “I chose to believe it” or “you’ve gotta love it”.

What do the data suggest about the place of it in relation to the overall unstressed “merger” process? Recall that Bybee (2002, 2017) argues that change should operate faster on units (sounds, words or phrases) that occur frequently in favouring contexts. In principle, this could include grammatical words. Adopting a more nuanced position, Phillips (1983, 2006) suggests that grammatical words are likely to be affected early only by weakening or lenition changes. A similar interpretation might be made of our Perth data. In the categories of data that we analysed for signs of participation in the putative weak vowel “merger”, the most frequent is unaccented /ɪ/ (UNACC). Note in Table 1 that this category provides 16.5% of the female data analysed and 14.3% of the male data. Tokens in our PREPAUS_IT condition are much less frequent (just under 4% for both sexes) and are also less frequent than it in phrase-internal contexts (PHRINT_IT). But note also that, with the exception of the phrase boundary, the UNACC and PREPAUS_IT categories largely share the same prosodic structure: In both cases, the vowel in focus is unaccented and in the second (weak) syllable of a trochaic foot (e.g., ˈraces, ˈdo it #). These two categories also show the lowest F2deriv estimate in our statistical model, closest to lexical SCHWA (Figure 3). Frequency alone might therefore help explain the development of a partial weak vowel merger, as the /ɪ/-/ə/ contrast diminishes in the very frequent unaccented position within the foot. The grammatical word it clearly follows the general pattern, occurring with high frequency in unaccented positions. It can thus be seen to be participating in the weakening process. The high overall item frequency of it may further contribute to the spread of the reduced variant. Of interest, too, is the relative position of it in phrase-internal contexts (PHRINT_IT). Although the F2deriv estimates for this category were not significantly different from the MONO reference condition, the estimate values are intermediate between MONO and the two clearly differentiated categories just discussed. For the female speakers, for example, the F2deriv estimate for PHRINT_IT is −155 Hz (cf. −284 Hz and −312 Hz for UNACC and PREPAUS_IT, respectively). The data for PHRINT_IT also display a relatively wide confidence interval (presented as the wide lines in Figure 3). This wide CI likely reflects prosodic variability in the underlying data: Tokens in the PHRINT_IT condition were not differentiated by stress in the analysis and therefore combine both stressed and unstressed examples. The latter again typically occur in the weak position within the foot. Although we did not test for this explicitly, we can reasonably assume that the unstressed tokens contained more centralised schwa-like variants.

It is not possible, given the data analysed here, to draw any firm inferences on cause-and-effect in how the weakening process has spread. One possibility is that it is simply the foot structure that is responsible for the weakening: Any weak environment is opportune for the weakening of any /ɪ/ vowel, including grammatical words such as it. Another intriguing possibility, though, is that there is a contribution from the most frequent words occurring in that prosodic context. Although the token numbers for the PREPAUS_IT condition are relatively small in our data set, if we consider the trochees in aggregate across the PREPAUS_IT + UNACC conditions, it is the most frequent lexical item in these categories. This leaves open the possibility that the word it is in fact a driver behind the weakening change. Its frequent occurrence in unstressed and phrase-final contexts is especially conducive to reduction. Once weakening has taken hold in that context, the reduced form might then have spread to it in other contexts (i.e., those contained in the PHRINT_IT category, the diverse functions of which provide a less consistently trochaic metrical setting for it), subsequently generalising to other words in the same prosodic context (UNACC). It is noteworthy too that the collapse of this contrast could potentially be facilitated by the existence of very few minimal pairs of the type market~mark it, planet~plan it, fillet~fill it, thus minimising the risks of misinterpretation on the part of listeners. While this account is speculative and would require testing with further analysis, the participation of it in the change accords with Bybee (2002, 2017) and Phillips (1983): The grammatical word occurs frequently in the context conducive to change, and as this is a weakening change, we can expect grammatical words to be affected early in the life cycle of the change.

As a general point, it is certainly of interest to consider the potential role of grammatical words as a participant—or even driver—of change. A further possibility is that other frequent grammatical words are also contributing to the change. This question also remains for future work. Crucially, however, eliminating grammatical words a priori from analysis within a lexical set-based framework would preclude consideration of such factors. Even if we are unable to determine its precise role, our data show without question that it is participating in the change from /ɪ/ to [ə].

7. Conclusions

In sum, our study supports the findings reported by Cox and Palethorpe (2018) and Butcher and Stoakes (2024) that the realisation of unstressed vowel nuclei by speakers of English in Australia is a somewhat more nuanced process than is suggested by conventional accounts describing a loss of contrast between /ɪ/ and schwa. Further analysis is clearly warranted in order to paint a more complete picture of this process of potential contrast loss, in particular, focusing on a wider range of grammatical items in phrase-internal and phrase-final positions (and on the different functions of those grammatical items), on the extent of cross-speaker differentiation and on factors relating to the wide distribution of realisations across all of the conditions.

Our findings underscore the need for caution in the application of Wells’ lexical set notation to the analysis of variation and change in vowel realisations and for greater transparency in the decisions to exclude or include particular words and contexts in any study. The lexical set designator for the /ɪ/ vowel category (in Australian and other varieties of English) is in reality only applicable to accented vowel tokens. Indeed, this was built into Wells’ definition of the lexical set heuristic at its outset, but it has been passed over to an extent as lexical sets have come to take on the role of proxies for the vowel categories of a particular variety. Our findings, however, beg the question of how speakers and listeners represent the vowels in the different conditions that we analysed. Are the unaccented vowels in it, is, this, massive, rabbit identified in phonological representation as “the same” as the vowels in delicious, Christmas, gifts? Do they therefore have to be factored into our understanding, for example, of the factors impacting the status of /ɪ/ within putative ongoing short front vowel shifts, not least because of the high frequency of some of the grammatical items? If so, how valid is the widely adopted methodological approach of simply removing grammatical items from the analysis (e.g., the “stop words” in the transcriptional alignment tool FAVE—Rosenfelder et al. 2014)? Our analysis of it in the Perth corpus certainly suggests that this grammatical word is participating in this particular weakening process and might conceivably be acting as one driving force behind its spread.

Author Contributions

Conceptualization, G.D. and P.F.; methodology, G.D. and S.G.; software, S.G. and G.D.; validation, G.D., P.F. and S.G.; formal analysis, G.D. and P.F.; investigation, G.D., P.F., and S.G.; resources, G.D. and P.F.; data curation, G.D. and S.G.; writing—original draft preparation, G.D., P.F. and S.G.; writing—review and editing, G.D. and P.F.; visualization, G.D. and S.G.; project administration, G.D.; funding acquisition, G.D. and P.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Australian Research Council, grant number DP130104275, “The social dynamics of language: A study of phonological variation and change in West Australian English”, awarded to the first two authors.

Institutional Review Board Statement

Ethical approval for the project from which this study arose was granted by Griffith University: #LAL/08/13/HREC.

Informed Consent Statement

In line with the ethics protocol for this study, informed consent was obtained from all participants whose recordings are included within the speech corpus upon which this study is based.

Data Availability Statement

Data are available upon request.

Acknowledgments

We are grateful to three anonymous peer-reviewers, to Nathaniel Mitchell, who contributed to the creation of the corpus from which the data reported here are extracted, to Ben Gibb-Reid for comments on a draft, and to participants at presentations of earlier versions of this work at a smoky Australian Linguistic Society meeting in December 2019 and in 2023 at the International Congress of Phonetic Sciences in Prague and at the Australian National University.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Model parameters arising from mixed-effects modelling of the overall distribution of F2deriv across the full dataset with MONO as the reference predictor, calculated using the lmer function as part of the lme4 package (Bates et al. 2015) in R (2020)—see main text.

lmer(f2deriv ~ condition + (1|speaker) + (1|word))

FEMALES

MALES

Appendix B

Model parameters arising from mixed-effects modelling of the overall distribution of F2deriv across the full dataset with SCHWA as the reference predictor, calculated using the lmer function as part of the lme4 package (Bates et al. 2015) in R (2020)—see main text.

lmer(f2deriv ~ condition + (1|speaker) + (1|word))

FEMALES:

MALES:

Notes

1	A preliminary account of this study was published as Docherty et al. (2023). An acoustic study of the realisation of KIT in the conversational speech of young English speakers in Australia. Proceedings of the 20th International Congress of Phonetic Sciences, pp. 3061–3065. The dataset has subsequently been expanded to include tokens of schwa as an additional condition, thereby facilitating a new quantitative analysis and a substantial re-framing of the theoretical dimension of the study.
2	We are grateful to an anonymous reviewer who suggested that “massive” should not be included in the UNACC condition, as its unstressed syllable is an exception to the more general process of unstressed vowel weakening. We note that a following labio-dental context is not listed as one of the exception contexts identified by Cox and Fletcher (2017, p. 118), but we are also conscious that the pronunciation guide provided by the Macquarie Dictionary (2024) appears to list all lexical items with an unstressed -ive suffix as having an [ɪ] realisation. In the apparent absence of any empirical evidence that would resolve this issue, we fall back on Wells’ (1982, p. 602) observation that while there is some cross-speaker differentiation with respect to the phonetic realisation of unstressed nuclei in items with the -ive suffix, a schwa realisation is “the usual Australian form”. Of course, Wells made this point over 40 years ago, so the situation may well have shifted, but, as mentioned, we are unaware of any contemporary empirical data that would clarify this. In order to further address the reviewer’s point, we repeated the statistical analysis of our data with all -ive items recoded as PREVEL_UNACC (20 tokens for female speakers, 11 for males). No significant differences were observed in the outcomes of the mixed-effects modelling that we report, and so we opted to retain the classification of our tokens as per our original analysis but remain alert to the fact that further research is needed to enhance our understanding of the factors that restrain vowel weakening for contemporary speakers of English in Australia.
3	Our finding of no correlation arose from an analysis of vowel duration across all of the conditions bar FLEECE and HAPPY_T (the former is clearly not comparable with the other conditions as a phonologically long vowel, and while the status of the latter, as described elsewhere in the paper, is ambiguous, it was notable that this condition generated the majority of vowel tokens with a duration of >200 ms). It would have been inappropriate to simply add duration into our existing statistical modelling as the different predictors that we model in respect of F2deriv include both of the above-mentioned conditions (in other words, it would be surprising if duration did not turn out to be significant but for reasons that are orthogonal to the focus of this study).

References

Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48.
Bates, Sally A. R. 1995. Towards a Definition of Schwa: An Acoustic Investigation of Vowel Reduction in English. Ph.D. thesis, University of Edinburgh, Edinburgh, UK.
Bell, Alan, Daniel Jurafsky, Eric Fosler-Lussier, Cynthia Girand, Michelle Gregory, and Daniel Gildea. 2003. Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. Journal of the Acoustical Society of America 113: 1001–24.
Billington, Rosey. 2011. Location, location, location: Regional characteristics and national patterns of change in the vowels of Melbourne adolescents. Australian Journal of Linguistics 31: 275–303.
Boersma, Paul, and David J. M. Weenink. 2018. Praat [Software] (version 6.0.40).
Butcher, Andrew, and Hywel Stoakes. 2024. Sheila’s roses (are in the paddick): Reduced vowels in Australian English. In Speech Dynamics: Synchronic Variation and Diachronic Change. Edited by Felicitas Kleber and Tamara Rathcke. Berlin and Boston: De Gruyter Mouton, pp. 207–44.
Bybee, Joan. 2002. Word frequency and context of use in the lexical diffusion of phonetically-conditioned sound change. Language Variation and Change 14: 261–90.
Bybee, Joan. 2017. Grammatical and lexical factors in sound change: A usage-based approach. Language Variation and Change 29: 273–300.
Cohen Priva, Uriel. 2017. Informativity and the actuation of lenition. Language 93: 569–97.
Cohen Priva, Uriel, and Emily Strand. 2023. Schwa’s duration and acoustic position in American English. Journal of Phonetics 96: 101198.
Cox, Felicity. 2019. Phonetics and phonology of Australian English. In Australian English Reimagined. Structure, Features and Developments. Edited by Louisa Willoughby and Howard Manns. London: Routledge, pp. 15–33.
Cox, Felicity, and Gerard Docherty. 2023. Sociophonetics and vowels. In The Handbook of Sociophonetics. Edited by Christopher Strelluf. London: Routledge, pp. 114–42.
Cox, Felicity, and Janet Fletcher. 2017. Australian English Pronunciation and Transcription. Cambridge: Cambridge University Press.
Cox, Felicity, and Sallyanne Palethorpe. 2007. Australian English (Illustrations of the IPA). Journal of the International Phonetic Association 37: 341–50.
Cox, Felicity, and Sallyanne Palethorpe. 2008. Reversal of short front vowel raising in Australian English. In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), Brisbane, Australia, September 22–26; pp. 342–45.
Cox, Felicity, and Sallyanne Palethorpe. 2018. Rosa’s roses—Unstressed vowel merger in Australian English. In Proceedings of the 17th Australasian Conference on Speech Science and Technology (SST17), Sydney, Australia, December 5–7; Canberra: Australasian Speech Science and Technology Association (ASSTA), pp. 89–92.
Cox, Felicity, and Sallyanne Palethorpe. 2019. Vowel variation in a standard context across four major Australian cities. In Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia, August 5–9; Edited by Sasha Calhoun, Paola Escudero, Marija Tabain and Paul Warren. Canberra: Australasian Speech Science and Technology Association Inc., pp. 577–81.
Cox, Felicity, Sallyanne Palethorpe, and Samantha Bentink. 2014. Phonetic Archaeology and 50 Years of Change to Australian English /iː/. Australian Journal of Linguistics 34: 50–75.
D’Onofrio, Annette, Teresa Pratt, and Janneke Van Hofwegen. 2019. Compression in the California Vowel Shift: Tracking generational sound change in California’s Central Valley. Language Variation and Change 31: 193–217.
Dinkin, Aaron. 2008. The real effect of word frequency on phonological variation. University of Pennsylvania Working Papers in Linguistics 14: 97–106.
Docherty, Gerard, Paul Foulkes, Simon Gonzalez, and Nathaniel Mitchell. 2018. Missed connections at the junction of sociolinguistics and speech processing. Topics in Cognitive Science 10: 759–74.
Docherty, Gerard, Simon Gonzalez, Nathaniel Mitchell, and Paul Foulkes. 2015. Static vs dynamic perspectives on the realization of vowel nucleii in West Australian English. In Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK, August 10–14; Edited by the Scottish Consortium for ICPhS 2015. Glasgow: The University of Glasgow, Paper number 956.1-5. [Google Scholar]
Docherty, Gerard, Simon Gonzalez, Nathaniel Mitchell, and Paul Foulkes. 2019. An acoustic analysis of short front vowel realizations in the conversational style of young English speakers from Western Australia. In Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia, August 5–9; Edited by Sasha Calhoun, Paola Escudero, Marija Tabain and Paul Warren. Canberra: Australasian Speech Science and Technology Association Inc., pp. 1759–63. [Google Scholar]
Docherty, Gerard, Simon Gonzalez, and Paul Foulkes. 2023. An acoustic study of the realisation of KIT in the conversational speech of young English speakers in Australia. In Proceedings of the 20th International Congress of Phonetic Sciences, Prague, Czech Republic, August 7–11; Edited by Radek Skarnitzl and Jan Volín. London: International Phonetic Association, pp. 3061–5. [Google Scholar]
ELAN. 2022. ELAN. (version 6.4). [Computer Software]. Nijmegen: Max Planck Institute for Psycholinguistics, The Language Archive. [Google Scholar]
Elvin, Jaydene, Daniel Williams, and Paola Escudero. 2016. Dynamic acoustic properties of monophthongs and diphthongs in Western Sydney Australian English. Journal of the Acoustical Society of America 140: 576–81. [Google Scholar] [CrossRef] [PubMed]
Fabricius, Anne. 2002. Weak vowels in modern RP: An acoustic study of happy-tensing and kit/schwa shift. Language Variation and Change 14: 211–37. [Google Scholar] [CrossRef]
Flemming, Edward. 2009. The phonetics of schwa vowels. In Phonological Weakness in English: From Old to Present-Day. Edited by Donka Minkova. Basingstoke and New York: Palgrave Macmillan, pp. 78–95. [Google Scholar]
Flemming, Edward, and Stephanie Johnson. 2004. Rosa’s roses: Reduced vowels in American English. Journal of the International Phonetic Association 31: 83–96. [Google Scholar]
Foulkes, Paul, and Gerard Docherty. 2006. The social life of phonetics and phonology. Journal of Phonetics 34: 409–38. [Google Scholar] [CrossRef]
Foulkes, Paul, and Jennifer Hay. 2015. The emergence of sociophonetic structure. In The Handbook of Language Emergence. Edited by Brian MacWhinney and William O’Grady. Hoboken: Blackwell, pp. 292–313. [Google Scholar]
Fromont, Robert, and Jennifer Hay. 2012. LaBB-CAT: An annotation store. In Proceedings of the Australasian Language Technology Association Workshop. Edited by Paul Cook and Scott Nowsom. Dunedin: ACL Anthology, pp. 113–17. [Google Scholar]
Gordon, Elizabeth, Lyle Campbell, Jennifer Hay, Margaret Maclagan, Andrea Sudbury, and Peter Trudgill. 2004. New Zealand English: Its Origins and Evolution. Cambridge: Cambridge University Press. [Google Scholar]
Grama, James, Catherine Travis, and Simón Gonzalez. 2019. Initiation, progression, and conditioning of the short-front vowel shift in Australia. In Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia, August 5–9; Edited by Sasha Calhoun, Paola Escudero, Marija Tabain and Paul Warren. Canberra: Australasian Speech Science and Technology Association Inc., pp. 1769–73. [Google Scholar]
Grama, James, Catherine Travis, and Simon Gonzalez. 2020. Ethnolectal and community change ov(er) time: Word-final (er) in Australian English. Australian Journal of Linguistics 40: 346–68. [Google Scholar] [CrossRef]
Grama, James, Catherine Travis, and Simon Gonzalez. 2021. Ethnic variation in real time change in Australian English diphthongs. In Language Variation—European Perspectives VIII: Papers from the Tenth International Conference on Language Variation in Europe (ICLaVE 10), Leeuwarden, The Netherlands, June 2019. Edited by Hans Van de Velde, Nanna Haug Hilton and Remco Knooihuizen. Amsterdam and Philadelphia: John Benjamins, pp. 291–314. [Google Scholar]
Grama, James, Chloé Diskin-Holdaway, Ksenia Gnevsheva, James Brand, Jennifer Hay, Katie Drager, Paul Foulkes, and Gerard Docherty. Submitted. It’s not just a sound change: Linking phonetic and pragmatic change in a discourse-pragmatic marker.
Harrington, Jonathan, Felicity Cox, and Zoe Evans. 1997. An acoustic phonetic study of broad, general and cultivated Australian English vowels. Australian Journal of Linguistics 17: 155–84. [Google Scholar] [CrossRef]
Hay, Jennifer, and Paul Foulkes. 2016. The evolution of medial (-t-) over real and remembered time. Language 92: 298–330. [Google Scholar] [CrossRef]
Hay, Jennifer, Janet Pierrehumbert, Abby Walker, and Pat LaShell. 2015. Tracking word frequency effects through 130 years of sound change. Cognition 139: 83–91. [Google Scholar] [CrossRef]
Kerswill, Paul, Eivind N. Torgersen, and Sue Fox. 2008. Reversing “drift”: Innovation and diffusion in the London diphthong system. Language Variation and Change 20: 451–91. [Google Scholar] [CrossRef]
Kuznetsova, Alexandra, Per Brockhoff, and Rune Christensen. 2017. lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software 82: 1–26. [Google Scholar] [CrossRef]
Labov, William. 1994. Principles of Linguistic Change, Vol. 1: Internal Factors. Oxford: Blackwell. [Google Scholar]
Labov, William, Ingrid Rosenfelder, and Joseph Fruehwald. 2013. One hundred years of sound change in Philadelphia: Linear incrementation, reversal, and reanalysis. Language 89: 30–65. [Google Scholar] [CrossRef]
Loakes, Debbie, John Hajek, and Janet Fletcher. 2017. Can you t[ae]ll I’m from M[ae]lbourne? An overview of the DRESS and TRAP vowels before /l/ as a regional accent marker in Australian English. English World-Wide: A Journal of Varieties of English 38: 29–49. [Google Scholar] [CrossRef]
Lüdecke, Deniel. 2018. sjPlot: Data Visualization for Statistics in Social Science. R Package Version 2.8.15. Vienna: R Foundation for Statistical Computing. [Google Scholar]
Macquarie Dictionary. 2024. Macquarie Dictionary Online. Sydney: Macquarie Dictionary Publishers. [Google Scholar]
Penney, Joshua, Felicity Cox, and Andy Gibson. 2023. Variation in FACE and FLEECE trajectories in Australian English adolescents according to community language diversity. In Proceedings of the 20th International Congress of Phonetic Sciences, Prague, Czech Republic, August 7–11; Edited by Radek Skarnitzl and Jan Volín. London: International Phonetic Association, pp. 3522–26. [Google Scholar]
Penney, Joshua, Felicity Cox, and Anita Szakay. 2021. Glottalisation of word-final stops in Australian English unstressed syllables. Journal of the International Phonetic Association 51: 229–60. [Google Scholar] [CrossRef]
Phillips, Betty S. 1983. Lexical diffusion and function words. Linguistics 21: 487–99. [Google Scholar] [CrossRef]
Phillips, Betty S. 2006. Word Frequency and Lexical Diffusion. Basingstoke: Palgrave Macmillan. [Google Scholar]
Pierrehumbert, Janet. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Frequency Effects and the Emergence of Lexical Structure. Edited by Joan Bybee and Paul Hopper. Amsterdam: John Benjamins, pp. 137–57. [Google Scholar]
Purser, Benjamin, James Grama, and Catherine Travis. 2020. Australian English over time: Using sociolinguistic analysis to inform dialect coaching. Voice and Speech Review 14: 269–91. [Google Scholar] [CrossRef]
R. 2020. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. [Google Scholar]
Rosenfelder, Ingrid, Josef Fruehwald, Keelan Evanini, Scott Seyfarth, Kyle Gorman, Hilary Prichard, and Jiahong Yuan. 2014. FAVE (Forced Alignment and Vowel Extraction) Suite, (version 1.1.3). Software.
Shaw, Jason, and Shigeto Kawahara. 2018. Predictability and phonology: Past, present and future. Linguistics Vanguard 4: 20180042. [Google Scholar] [CrossRef]
Shi, Rushen, Bryan Gick, Dara Kanwischer, and Ian Wilson. 2005. Frequency and category factors in the reduction and assimilation of function words: EPG and acoustic measures. Journal of Psycholinguistic Research 34: 341–64. [Google Scholar] [CrossRef]
Sóskuthy, Márton, Paul Foulkes, Bill Haddican, Jennifer Hay, and Vincent Hughes. 2015. Word-level distributions and structural factors co-determine GOOSE fronting. In Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK, August 10–14; Edited by the Scottish Consortium for ICPhS 2015. Glasgow: The University of Glasgow, Paper number 1001.1-5. [Google Scholar]
Tasker, Sarah. 2020. Patterns of Variation and Change in English Schwa. Ph.D. thesis, University of York, York, UK. [Google Scholar]
Trudgill, Peter. 2006. New-Dialect Formation: The Inevitability of Colonial Englishes. Edinburgh: Edinburgh University Press. [Google Scholar]
Van Bergem, Dick R. 1994. A model of coarticulatory effects on the schwa. Speech Communication 14: 143–62. [Google Scholar] [CrossRef]
Wells, John. 1982. Accents of English. Cambridge: Cambridge University Press. [Google Scholar]
Wells, John. 2010. Lexical Sets. Blog Post. February 1. Available online: https://phonetic-blog.blogspot.com/2010/02/lexical-sets.html (accessed on 1 October 2024).
Young, Steve, Gunnar Evermann, Dan Kershaw, Gareth Moore, Julian Odell, Dave Ollason, Dan Povey, Valtcho Valtchev, and Phil Woodland. 2006. The HTK Book (for HTK Version 3.4). Cambridge: Cambridge University Engineering Department. [Google Scholar]

Figure 1. Distribution of realisations in F1/F2 space (Hz) for tokens across all nine conditions: females left panel; males right panel; labels centred on the mean F1/F2 value with ellipses at ±1 s.d.

Figure 2. Violin plots of the distributional density of F2deriv (Hz) for each of the nine conditions contained within the dataset (with associated median and inter-quartile range): females top panel; males lower panel. Conditions are numbered for ease of reference. Higher values of F2deriv equate to a relatively closer and fronter vowel quality (see text for details).

Figure 3. Results of the mixed-effects analysis of F2deriv (Hz) across each of the conditions making up the dataset: females top panel; males lower panel. MONO—represented by 0 on the x-axis—is used as the reference point in the analysis. The quantities shown for each condition are the estimates for F2deriv (with 95% CI), indicating the difference between each condition and the reference condition MONO, with an indication of whether that difference is significant or not.

Table 1. Number and percentage of tokens of each of the six /ɪ/ conditions and the three reference conditions included in the acoustic analysis (see main text for further explanation).

Category	Example	Females (N)	Males (N)	Females (%)	Males (%)
MONO	bid	173	243	8.7	12.9
POLY_ACC	issue	220	209	11.0	11.0
PREVEL_UNACC	panic	141	115	7.1	6.1
UNACC	rabbit	328	270	16.5	14.3
PHRINT_IT	it takes	285	239	14.3	12.7
PREPAUS_IT	take it #	69	67	3.5	3.6
FLEECE	peach	146	179	7.3	9.5
HAPPY_T	movie	208	177	10.4	9.4
SCHWA	Asia	423	388	21.2	20.6
Total		1993	1887
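
The percentage columns in Table 1 express each condition's token count as a share of the per-sex total. The short sketch below recomputes those shares from the counts in the table. The names COUNTS and percentages are illustrative only and are not taken from the authors' analysis scripts; values are rounded to one decimal place, so a recomputed figure may differ from the printed column in the final digit.

```python
# Token counts per condition, copied from Table 1 as (females, males).
COUNTS = {
    "MONO": (173, 243),
    "POLY_ACC": (220, 209),
    "PREVEL_UNACC": (141, 115),
    "UNACC": (328, 270),
    "PHRINT_IT": (285, 239),
    "PREPAUS_IT": (69, 67),
    "FLEECE": (146, 179),
    "HAPPY_T": (208, 177),
    "SCHWA": (423, 388),
}


def percentages(counts):
    """Return per-sex totals and each condition's share of its total (%, 1 d.p.)."""
    total_f = sum(f for f, _ in counts.values())
    total_m = sum(m for _, m in counts.values())
    shares = {
        name: (round(100 * f / total_f, 1), round(100 * m / total_m, 1))
        for name, (f, m) in counts.items()
    }
    return (total_f, total_m), shares


if __name__ == "__main__":
    totals, shares = percentages(COUNTS)
    print("Totals (F, M):", totals)
    for name, (pct_f, pct_m) in shares.items():
        print(f"{name:<13}{pct_f:>6.1f}{pct_m:>6.1f}")
```

Keeping the counts as the single source and deriving the percentages from them means the two sets of columns cannot drift apart if a count is later corrected.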

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
