1. Introduction
Since the 1980s, code-switching, “an activity which may be observed in the speech (or writing) of bilinguals who go back and forth between their two languages in the same conversation” [1], has been the focus of intensive study and debate. This linguistic phenomenon is not uncommon and can be found in various bilingual contexts [2]. Previous data have shown that individual utterances can combine elements from more than one language [3,4]. To date, the Spanish/English language pair is one of the most frequently examined, possibly because of the large number of speakers of both languages and the availability of collected data, such as can be found at the BangorTalk website [5]. We shall use the Spanish/English language pair to illustrate the range of possible combinations involving English and Spanish determiners and nouns. Examples (1a) and (1b) show Determiner Phrases (DPs) where the determiner and noun come from the same language, while examples (2a) and (2b) illustrate mixed DPs where the determiner and noun are in different languages. Spanish words are shown in italics below, and determiners in both languages are shown in bold font.
1. | English unmixed DP | |
| a. | The | house |
| | DET.DEF | N |
| | | |
| Spanish unmixed DP | |
| b. | La | casa |
| | DET.DEF.F.S | N.F.S |
2. | Mixed DP | |
| a. | La | house |
| | DET.DEF.F.S | N |
| | | |
| b. | The | casa |
| | DET.DEF | N.F.S |
It has been reported previously that among mixed DPs, type (2a) occurs more frequently than type (2b), or in other words, Spanish determiners occur more frequently in mixed DPs than English determiners. For example, Liceras et al. reported, from their review of research on mixed Spanish–English DPs in spontaneous adult speech and their own study of child speech, that mixed DPs with Spanish determiners are far more frequent than with English determiners [6]. In their own study of child speech, only about 5% of the mixed DPs had English determiners; in adult speech, Jake et al. found 161 instances of Spanish determiners followed by English nouns, but no examples of English determiners followed by Spanish nouns [7]. However, Liceras et al. [6] do not provide information about the morphosyntactic frame in which the mixed DPs appeared, which Herring et al. [8] found to be relevant, as will be described below. Liceras et al. also do not consider the proportion of mixed vs. unmixed DPs with a given determiner, in case unmixed Spanish DPs should be more common than unmixed English DPs [6]. Instead, they explain the apparently greater frequency of Spanish determiners in mixed DPs in terms of the “intrinsic Gender feature of the Spanish Noun and the intrinsic Gender Agreement feature of the Spanish Determiner” [6] (p. 828), both of which features are absent in English. Moro Quintanilla also reports that Spanish determiners in mixed DPs are far more frequent than English determiners (only 2/243) in the Gibraltar data collected by Moyer, and, like Liceras et al. [6], explains the distribution in terms of the “presence of an uninterpretable gender feature on the Spanish determiner, as opposed to its absence on the English determiner” [9] (p. 222). However, Moro Quintanilla also does not consider the morphosyntactic frame of the mixed DPs or compare them with unmixed DPs [9]. Myers-Scotton and Jake also appear to concur with Liceras et al. [6] and Moro Quintanilla [9] on the assumption that the gender feature on Spanish determiners requires them to be ‘elected’ earlier in the language production process and that early election is related to greater frequency [10]. However, their earlier work had drawn attention to the importance of the morphosyntactic frame of the clause, or ‘matrix language’, in influencing the language of the determiner.
The matrix language framework (MLF) was developed by Myers-Scotton [11] in order to account for common patterns found in intraclausal code-switching. Its main contribution is to capture a common asymmetry between the two languages involved, such that one provides the morphosyntactic frame or matrix language, and the other (the “embedded language”) contributes the material inserted into that frame. The matrix language can be identified by the word order of the clause (the Morpheme Order Principle) and by the language source of particular “system morphemes” (the System Morpheme Principle). System morphemes are categorized as either “early” or “late”. Early system morphemes are “conceptually activated to express a part of speakers’ meanings that they wish to communicate” [10] (p. 344) and include plural marking on nouns as well as determiners. Early system morphemes in a clause with code-switching can come from either the matrix language or the embedded language, but they are more likely to come from the matrix language. Late system morphemes have less semantic content than early system morphemes, and a particular subcategory of late system morphemes, “outsider late system morphemes”, can only come from the matrix language and are thus important in determining the matrix language of a given clause. Examples of outsider late system morphemes are case markers or verb inflections which encode subject–verb agreement.
We can illustrate the identification of the matrix language in examples (3) and (4) below:
3. | my mom got | the manguera, | | |
| | hosepipe | | |
| ‘My mom got the hosepipe.’ | | [herring9: CLA] |
4. | eso fue | en el front desk | en el reception | |
| that was | at the | at the | |
| ‘That was at the front desk, at the reception.’ | [zeledon1: CAR] |
Example (3) has an English matrix language or morphosyntactic frame on the basis of the finite verb got being English, whereas example (4) has a Spanish matrix language because the finite verb fue ‘was’ is Spanish (word order is not relevant here to distinguish between an English and a Spanish matrix language).
Returning to the issue of whether or not Spanish determiners occur more frequently in mixed DP constructions, Myers-Scotton and Jake argue for the influence of the matrix language (ML) [10] (p. 356) even though they had appeared to support the viewpoints of Liceras et al. [6] and Moro Quintanilla [9]. They state that “If Spanish is the ML in any CS corpus, then it is likely Spanish determiners will dominate for this reason alone under an analysis based on the MLF model” [10] (p. 356). This prediction had already been captured in the ‘Bilingual NP Hypothesis’ proposed by Jake et al. [7] and was motivated by the Uniform Structure Principle according to which the “structures of the matrix language are always preferred” [11] (p. 8).
Herring et al. attempted a preliminary evaluation of the influence of the matrix language on the determiner by using Welsh–English and Spanish–English data to assess the extent to which the matrix language matched the source language of the determiner in mixed DP constructions [8]. If we look again at examples (3) and (4) above, we can see that the language of the determiner in (3) is English, and thus matches the English matrix language of (3), while the language of the determiner in (4) is Spanish and thus matches the Spanish matrix language of (4). So, in both of these examples, the language of the determiner and that of the finite verb match.
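The matching criterion applied to examples (3) and (4) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the matrix language is read off the language of the finite verb alone (as in the two examples; word order and other system morphemes are ignored), and the `DP` record, its field names, the language codes and the `classify` function are hypothetical labels introduced here, not part of any corpus tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DP:
    """One determiner phrase plus the clause information needed to classify it.

    All field names and language codes are hypothetical labels for this sketch.
    """
    det_lang: str    # language of the determiner, e.g. "eng" or "spa"
    noun_lang: str   # language of the noun
    verb_lang: str   # language of the finite verb of the containing clause


def classify(dp: DP) -> tuple[str, str]:
    """Return (mixed/unmixed, match/mismatch) for a DP.

    Assumption: the matrix language is identified by the finite verb only.
    """
    matrix_lang = dp.verb_lang
    mixing = "mixed" if dp.det_lang != dp.noun_lang else "unmixed"
    matching = "match" if dp.det_lang == matrix_lang else "mismatch"
    return mixing, matching


# Example (3): "the manguera" in a clause whose finite verb is English "got".
example_3 = DP(det_lang="eng", noun_lang="spa", verb_lang="eng")
# Example (4): "el front desk" in a clause whose finite verb is Spanish "fue".
example_4 = DP(det_lang="spa", noun_lang="eng", verb_lang="spa")

print(classify(example_3))  # ('mixed', 'match')
print(classify(example_4))  # ('mixed', 'match')
```

Both examples come out as mixed DPs whose determiner matches the matrix language, which is the pattern described above.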
In the small amount of data analysed by Herring et al. [8], there was only one example out of 89 of a determiner (Spanish) and matrix language (English) mismatching. The matrix language of the clauses was Spanish in 90% of the cases, and the proportion of mixed DPs with a Spanish determiner found in those clauses was 91%, supporting the idea of a close relation between the language of the determiner and the matrix language of the clause. The distribution of the data also provides a possible explanation for the quantitative results reported by Liceras et al. [6] and Moro Quintanilla [9], i.e., the reason why the majority of mixed DPs had Spanish determiners was that Spanish was the matrix language in the majority of cases. In other words, Spanish determiners could have been preferred to English determiners in mixed nominal constructions simply because speakers selected a Spanish morphosyntactic frame, or matrix language, in which they inserted their mixed DPs.
Recent experimental evidence from two types of acceptability judgment supports Herring et al.’s conclusion. To test experimentally the two sets of predictions regarding the language of the determiner in nominal constructions (those of the Minimalist Program and those of the MLF), Parafita Couto and Stadthagen-González tested two separate groups of 40 early Spanish–English bilinguals [12]. Their task was to evaluate the acceptability of sentences with code-switches between the determiner and the noun that reflected the predictions of the Minimalist Program, the MLF, both, or neither. The first group rated them on a Likert scale, while the second group performed a two-alternative forced-choice acceptability task (2AFC). Both experiments yielded converging evidence supporting Herring et al.’s [8] suggested preference for a match between the language of the determiner and the matrix language.
In the present study, we attempted to build on Herring et al.’s [8] work by investigating the link between the language of the determiner and the matrix language in a larger dataset than used previously. We focus on both mixed and unmixed nominal constructions in order to try to come closer to an empirically supported account of the regularities involved. Controlling for the matrix language, we measure the proportion of mixed DPs with each determiner as a proportion of the total number of DPs with the same determiner. Thus, we take into account the possibility that Spanish determiners might precede nouns more frequently than English determiners for internal linguistic reasons [13,14,15,16,17,18,19].
Our data will come from two language pairs: Spanish–English from Miami, USA, and Spanish–English creole from the south Atlantic coast of Nicaragua. Although the language pairs in the two communities are similar, the differing distribution of matrix languages and determiners will allow us to consider the relative influence of linguistic and social factors on the code-switching patterns found.
4. Results
The results of the Miami data analysis can be found in Table 3. The rows show mixed and unmixed DPs and the total number of DPs, while the middle columns indicate the frequency of the determiners matching vs. not matching the matrix language, with the results for Spanish and English as matrix languages given separately. As the Table shows, there is a match of 98.1% between the language of the determiner and the matrix language. Thus, the overwhelming majority of both unmixed and mixed DPs have a determiner with the same language as the finite verb of the clause.
On the other hand, 1.9% of all DPs have a determiner that does not match the matrix language. Still, of this group, 95.15% (157/165) are embedded language islands. These are all of the unmixed DPs which do not match the ML, as shown in the above table. An example of such an island is given in example (8). The clause in example (8) has an English matrix language, yet the determiner phrase una pareja ‘a couple’ has both the determiner and the noun in Spanish.
8. | I hope mom doesn’t think they’re | una pareja | you know | | |
| | a couple | | | [sastre12: MAD] |
In embedded language islands, the grammar of the embedded language temporarily prevails and so we expect its internal constituents to appear unaffected by the matrix language [10] (p. 139).
Of the mixed constructions, only 2.9% (8/276 DPs) did not match the matrix language. This is a very small number but we can note some similarities between those eight cases, of which three examples are given below.
9. | pero aquí [en el north side]AdvP we don’t ever get direct sun. | |
| but here on the north side | | | | [María1: MAR] |
| | | | | |
10. | [en los dorms]AdvP they have a laundry room | |
| in the dorms | | | | [Herring14: CON] |
| | | | | |
11. | they did a sonogram blah blah blah | tumor en | [en el spleen]AdvP. | |
| | tumor in | in the spleen | [Zeledon8: MAR] |
Examples (9)–(11) contain mixed DPs that appear in Spanish adverbial phrases introduced by the Spanish preposition en, ‘on’ in example (9) and ‘in’ in examples (10) and (11). In the case of (9) and (10), the switch from a Spanish determiner to an English noun may have been anticipating the change of matrix language to English which occurs in the following clause (we don’t ever get direct sun in (9) and they have a laundry room in (10)). All three of these examples could be characterised as what Muysken has called “alternational” switching [29], in which the switch occurs at a peripheral place in the clause. Adverbial phrases can be considered peripheral since they are not involved in the argument structure of the verb.
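The percentages reported above for the mismatching DPs in the Miami data can be re-derived from the counts given in the text. The short check below is only a worked calculation; the variable names are ours, and the counts are those quoted in the preceding paragraphs (157/165 and 8/276).

```python
# Counts reported in the text for the Miami data.
mismatching_dps = 165        # DPs whose determiner does not match the ML
island_dps = 157             # unmixed mismatching DPs (embedded language islands)
mixed_dps = 276              # all mixed DPs
mixed_mismatching_dps = 8    # mixed DPs whose determiner does not match the ML

island_share = 100 * island_dps / mismatching_dps
mixed_mismatch_share = 100 * mixed_mismatching_dps / mixed_dps

print(f"{island_share:.2f}%")          # 95.15%
print(f"{mixed_mismatch_share:.1f}%")  # 2.9%

# Islands and mixed mismatches together account for every mismatching DP.
assert island_dps + mixed_mismatching_dps == mismatching_dps
```

The final assertion makes explicit that the 165 mismatching DPs divide into the 157 islands and the eight mixed cases discussed with examples (9)–(11).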
In addition to investigating the link between the language of the determiner and the matrix language, a second aim of our study was to measure the proportion of mixed DPs with each determiner as a fraction of the total number of DPs with the same determiner. We conducted this analysis on a subset of the data represented in Table 3, in particular the data shown in the column headed “Matching ML”, where the determiner matched the ML. This was the case for 98.1% of the data as shown above. The results of this second analysis are shown in Table 4. As the Table shows, there is indeed a higher proportion (6.3%) of Spanish determiners followed by an English noun than English determiners followed by a Spanish noun (0.6%). Given the tendency of determiners to match the matrix language, this means that bilingual speakers are more likely to switch language after Spanish determiners than after English determiners.
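The measure used in this second analysis (mixed DPs with a given determiner language as a fraction of all DPs with that determiner language) can be sketched as follows. The function name `mixing_rates`, the language codes and the toy records are hypothetical and are not drawn from Table 4; the input is assumed to contain only DPs whose determiner matches the matrix language.

```python
from collections import Counter


def mixing_rates(dps):
    """Proportion of mixed DPs per determiner language.

    `dps` is an iterable of (det_lang, noun_lang) pairs, assumed to be
    restricted to DPs whose determiner matches the matrix language.
    """
    totals = Counter()
    mixed = Counter()
    for det_lang, noun_lang in dps:
        totals[det_lang] += 1
        if det_lang != noun_lang:   # determiner and noun differ: mixed DP
            mixed[det_lang] += 1
    return {lang: mixed[lang] / totals[lang] for lang in totals}


# Invented placeholder records, NOT the Miami counts.
toy_dps = [
    ("spa", "eng"), ("spa", "spa"), ("spa", "spa"), ("spa", "spa"),
    ("eng", "eng"), ("eng", "eng"),
]
print(mixing_rates(toy_dps))  # {'spa': 0.25, 'eng': 0.0}
```

Because each rate is normalised by the number of DPs with the same determiner language, a language that simply supplies more determiners overall does not inflate its own rate. The reported rates of 6.3% and 0.6% differ by roughly a factor of ten.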
Nicaragua Data
The results of the analysis of the Nicaragua data can be found in Table 5. As in Table 3, the rows show mixed and unmixed DPs and the total number of DPs, while the middle columns indicate the frequency of the determiners matching vs. not matching the matrix language. Next to each figure, we provide the percentage out of the total number of DPs. Table 5 shows that there is a match of 99.7% between the language of the determiner and the matrix language.
The results of the Nicaragua data support the predictions of the MLF: only 0.3% of the DPs do not have a match between the language of the determiner and the matrix language of the clause. As in the case of the Miami data, the mismatched cases involve embedded language islands. An example of such an island is given in example (12). The clause in example (12) has an English creole matrix language, yet the DP la escuela ‘the school’ is entirely in Spanish. All the islands found were Spanish determiner phrases in an NCE matrix language clause.
12. | di refreshment, | hav di celebración | de la escuela | |
| the | have the celebration | of the school | |
| ‘the refreshment, have the celebration in the school.’ | [F-BLU-1-06] |
All mixed constructions matched the matrix language. Table 6 shows the numbers of unmixed and mixed DPs for each determiner and matrix language. As is clear, use of a Spanish matrix language is very rare in Nicaragua. However, a Fisher test (p = 0.63) suggests no significant difference between the proportion of mixed DPs with a Spanish determiner and the proportion with an NCE determiner.
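For readers unfamiliar with the Fisher test mentioned above, the following is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2×2 table, laid out here with determiner language as rows and mixed vs. unmixed DPs as columns. It is an assumption that a two-sided test of this form was used; the function name is ours, and the counts in the usage line are placeholders rather than the Table 6 values, so the printed p-value is not the reported p = 0.63.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability of a table whose top-left cell equals x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Placeholder counts (NOT Table 6): rows = determiner language,
# columns = (mixed, unmixed).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```

An exact test of this kind is a natural choice when one cell count is very small, as is the case for clauses with a Spanish matrix language in the Nicaragua data.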
5. Discussion
Our results suggest that speakers do not have much choice regarding the language of the determiner: instead, this is influenced by the language of the morphosyntactic frame or matrix language, and it is in selecting the matrix language that speakers do appear to have some choice. Once they have done this and have selected a matching determiner, the next option is whether or not to switch to a different language when selecting the noun following the determiner. We have noted that this happens more often where the matrix language (and determiner) is Spanish in the Miami data. In the Nicaragua data, however, we have only a small number of clauses with Spanish matrix language, and no statistical indication of a difference in the proportion of switched nouns following Spanish as opposed to NCE determiners. In trying to account for the asymmetry that we find in the Miami data, we may note that previous work by Bhatt on Indian data has suggested that the directionality of switches tends to be towards the language of power, or the language with superior social status [30]. Our findings seem consistent with this suggestion in that English has been the official language of Florida, the state where Miami is located, since 1988 [31]. So the switches from Spanish determiners to English nouns, which are more numerous than the reverse, are in the direction of the official language. In Nicaragua, we can see that even though there is no significant difference between the proportion of mixed DPs with a Spanish determiner and with an NCE determiner, all the switches observed are from creole to Spanish. If this trend is confirmed in further studies, it would once again indicate switching in the direction of the language of higher prestige [28,30]. Koskinen reports that although the regional languages of the Caribbean coast including English creole were made official in 1993, creole was not used officially in education until 2007 [28]. Koskinen also reports that although the other regional languages have gained in status, creole “continues to be considered a form of ‘broken English’ or ‘bad English’” [28] (p. 143). Spanish, on the other hand, is described as the “national language” [28] (p. 153) and is clearly superior in prestige.
Other explanations for the asymmetrical pattern of switching following determiners in the Miami data would require more exploration, but Fricke and Kootstra’s work on the Miami data has established the importance of priming by material in the previous discourse, and this could be investigated in our data [32]. This account would be supported by the exposure-driven account posited by Valdés-Kroff [33], whereby bilingual speakers converge upon conventional production patterns. Such an emergent approach would offer an alternative way of accounting for asymmetrical structural distributions such as the ones we observed in our Miami and Nicaragua data. Another avenue to pursue would be the idea that code-switching tends to mark high information content, as proposed by Myslin and Levy [34]. They consider words with high information content to be less predictable than those of lower information content, and to signal to the listener that special attention is needed. In relation to our data, we would need to examine whether there is evidence of the switches to nouns in the minority language having higher information content than those in the official language. Another variable that could be considered would be the language proficiency or dominance of the speaker. For example, Liceras et al. argued that it is possible to gain insights from the code-switching patterns and preferences which differentiate child and adult native speakers, simultaneous bilingual speakers and L2 speakers [35]. This, they say, could account for the conflicting evidence observed in the spontaneous switches produced in different communities of code-switchers.
One question that remains to be addressed is what determines the selection of the matrix language, since we have argued that the language of the determiner follows from this choice. We expect extralinguistic factors such as age of acquisition, language proficiency and the language of social networks all to be relevant, and hope to explore this question in the future.