1. Introduction
Language education is a fundamental part of society. Learning a language requires large amounts of both time and effort, but there is a myriad of reasons for pursuing such an undertaking. Some reasons are more casual, such as taking a short vacation overseas, whereas others are more serious, such as working or studying in a foreign country. Therefore, people of any age and background may be interested in learning or might be required to learn an additional language. With the world becoming more accessible on several fronts, English has emerged as a lingua franca for many to communicate with friends in different countries via the Internet, to increase tourism, or to conduct international business. Additionally, highly ranked universities in western countries attract many international students due to their diverse range of subject matters created and presented by renowned professors. Although this can incentivize or coerce the learning of a new language, many language students do not have enough language exposure or practice to establish a new lexicon. Furthermore, there are cases where language is purely studied for testing rather than its communicative function or cultural richness. Learning a language for its instrumental value rather than enjoyment can create barriers such as a lack of engagement and motivation, leading to students not wanting to practice.
One issue that arises from lack of practice is that language knowledge can quickly deteriorate when used infrequently [
1]; this is especially true for learners who have no exposure to the target language in their daily lives. Additionally, learners who find the language challenging to learn, or do not feel that the target language is beneficial, tend to become demotivated. Since thorough practice and immersive, meaningful experiences are critical factors for acquiring language, the unique nature of language education and increasing demand have created a need for effective language teaching methods. Pedagogical approaches such as the communicative method (which focuses on interaction) have been shown to help with language learning, but greater learner motivation and engagement have been found when gameful approaches are incorporated [
2]. Consequently, there have been considerable advances in approaches to presenting language lessons, focusing on using interdisciplinary knowledge to create stimulating environments.
Given the issues surrounding language acquisition, gameful approaches have effectively overcome several learning barriers. Three distinct gameful approaches have emerged in language education. First and foremost, gamification, or the use of game design elements to encourage engagement, increased in popularity around 2011 when it entered Gartner’s Hype Cycle (
https://www.gartner.com/en/research/methodologies/gartner-hype-cycle (accessed on 12 April 2021)). Gartner’s Hype Cycle is a research methodology accompanied by a visual representation of how new and emerging technologies mature and evolve over time, especially when considering their application and success. Gamification’s popularity in language learning is evidenced by the rise of mobile language-learning applications such as Duolingo [
3]. Another approach is using serious games. Serious games are games or game technology used for purposes beyond entertainment such as training, health, or education [
4]. Finally, although not a new concept, game-based learning is a topic that has received a fair amount of attention over the past few years. Game-based language learning has various forms, such as role-playing games for communication practice (e.g., Werewolf [
5]) or board games for solidifying grammar (e.g., Apples to Apples [
6]). With recent implementations moving to digital platforms (e.g., practicing target language by debating in Second Life [
7,
8]), research has found a plethora of benefits in addition to improvements in learning performance, such as increased motivation to learn through meaningful learning experiences [
9] and the creation of autotelic-conducive environments [
10].
It may seem counterproductive to use digital games designed purely for fun as a means of learning. However, there is increasing evidence to suggest that users who play games in their non-native language can actively acquire a target language to progress in the game [
11] or talk to other players [
12]. Additionally, many games have external resources such as wiki sites that contain information valuable to players. Exposure to digital games and the external environments surrounding them has been found to improve target language acquisition [
13,
14] to the extent that gamers, on average, tend to have more target language vocabulary than non-gamers [
15]. Gaming is no longer a niche market. The popularity of games is evident, with 75% of American homes having at least one gamer according to the Entertainment Software Association’s 2020 statistics (https://www.theesa.com/resource/2020-essential-facts/ (accessed on 30 March 2021)). Given the potential impact of digital game-based learning on language acquisition, it is essential to collate the data regarding successful designs to advance language education implementation strategies. Furthermore, given the essential nature of communication, deciphering the mechanisms that contribute to effective language acquisition will be crucial to overcoming future challenges as the need for language education is ever-present.
This paper addresses the dearth of knowledge regarding how to design digital game-based language learning (DGBLL) applications. It aims to break down the games into their design features to find commonalities amongst DGBLL applications. The data presented here will be instrumental in suggesting design features for future applications employing this learning method. This knowledge was obtained by systematically searching the literature for design features that have been used frequently in DGBLL applications and categorizing these features using an existing framework. The following questions were used to guide the study:
RQ1: What are the most frequently used game design elements in DGBLL?
RQ2: What are the differences in game elements between bespoke games, off-the-shelf games, games for entertainment, and games for education?
RQ3: Are there differences between bespoke and off-the-shelf games regarding their impact on observed outcomes within a specific age group?
RQ4: Are there differences between the minor design elements regarding language skills?
RQ5: What are some lesser-used game design elements in DGBLL applications that have shown potential?
This paper is structured as follows:
Section 2 provides background information and describes research related to this study;
Section 3 explains the methods used to collect, filter, and code the papers used in this review;
Section 4 presents the findings from the papers that were analyzed and discusses the patterns found in DGBLL applications according to the research questions;
Section 5 ends the paper with a conclusion and topics that require further research.
4. Results and Discussion
Games can be valuable tools for educators to create interactive content for language lessons. The search results are summarized in this section to provide some information regarding implementation strategies that have been used. Additionally, the discussion includes game design patterns found in the literature to suggest effective design practices. Surprisingly, explicit identification of design features and of the benefits of using specific elements was found primarily in bespoke game papers, especially studies that detailed the artefact’s creation. For example, Ongoro described the steps taken, from literature review to prototyping [40]. Additionally, Ongoro justified the use of points and feedback by their success in other studies. Similarly, Yang designed a bespoke game and provided a detailed literature review before creating an artefact that evaluated the badge element [
127]. Yang’s literature review justified the reasons for and necessity of testing this element’s influence. In contrast with the design details found in bespoke papers, games for education were generally promoted for encouraging motivation based on previous works. For example, Azman [
92] justified using a COTS game for entertainment because the genre of the game provides evidence of positive outcomes, and it has attractive features that are known for motivating players. The elements that appealed to the researcher were not stated directly; instead, the choice was justified by the game’s popularity and track record. Bespoke game papers tended to more closely examine the individual components, whereas COTS game papers focused on the prior results of the games.
4.1. What Are the Most Frequently Used Game Design Elements in DGBLL?
The first step in identifying patterns in game design is to examine an overview of the elements used.
Table 2 shows the summarized frequency information of the top five (eight for entertainment games due to duplicate frequencies) game design elements according to Marczewski’s framework. The table is divided into the four categories mentioned in
Section 3.2.2. The data in this table provide an idea of how games for specific purposes compare and offer a foundation for further analysis. The list of games can be found in
Appendix A Table A2 (bespoke game list) and
Table A3 (COTS game list).
From the table, games for entertainment were fewer and resulted in duplicate frequencies. Consequently, entertainment games contained more varied elements than the other categories. Overall, the most common game element was a feedback system, which was featured in 96% of the games. Giving language students feedback is an essential part of language acquisition and has shown positive results in computer-assisted learning [
152]. Furthermore, even mobile language-learning applications designed for autonomous learning place significant importance on having a feedback element [3] to simulate a teacher providing correct responses or fixing errors. Feedback is generally divided into explicit (e.g., explaining what is wrong and showing the correction) and implicit (e.g., indicating that an utterance is incorrect by eliciting a new answer) types. Cornillie [
37] observed that explicit corrective feedback has more impact on language learning than implicit feedback. The test involved a digital game created to compare learners’ perceptions when exposed to different feedback types. In contrast with feedback in games for entertainment, when creating simulations, implicit feedback can help to encourage the natural flow of a conversation without stopping to explain language rules, such as interacting with Hazel’s virtual agents [
89]. A variety of feedback systems were found among the papers. For example, implementations of feedback ranged from point systems, such as a visible score for answering questions in Smith’s [
76] eBook game, to health systems, such as Economou’s [
105] action game for British Sign Language, where the player loses hitpoints for incorrect responses.
Theme and narrative, as elements, can be found together, such as in Hamari [
153], but the theme is often ignored as an individual element. However, there is a distinction between these two elements, and it is essential to note the difference since a theme appears in more educational applications than a story does. The theme includes settings or characters that are consistent within a game. Therefore, a theme can be included for user experience purposes. To illustrate, Duolingo does not contain a story, but the iconic owl is a consistent fantasy character that encourages the user. The setting of Spaceteam ESL [
71] is space-themed, but there is no narrative mentioned as a feature. Although a theme can stand alone, none of the papers with a narrative or story lacked a theme. Since storytelling is a prominent feature in game design [
154], an interesting finding was that nearly 60% of the bespoke and overall chosen games lacked an identifiable story.
The third most frequent element was points or experience points. Points are related to scores, and experience points are related to user progression. The user gains experience by completing a task, battle, or challenge and being rewarded or validated for the outcome, usually through an avatar. A scoring system was commonly found amongst the educational games, and experience point systems were more frequently found in games for entertainment.
Levels and progression are related to increasing difficulty within the learning content. This adaptive system can be displayed as the user increasing in skill (thus needing additional challenge), the user completing stages and progressing through content of increasing difficulty, or a combination of the two. This element was found frequently in games for entertainment (75%) and regularly in COTS games (50%), but in only 29.55% of educational games and 28.57% of bespoke games. An example of this system is found in My-Pet-Store [
115] where the participants can progress through six levels of lesson material.
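The frequency figures discussed in this subsection are simple tallies over coded games. As a minimal sketch of how such a tally could be computed, assume each game has been coded as a set of element names from Marczewski’s framework. The function name, the three games, and their element sets below are hypothetical placeholders for illustration, not data from the reviewed papers.

```python
from collections import Counter


def element_frequencies(games):
    """Return {element: percentage of games containing it}, most frequent first."""
    counts = Counter()
    for elements in games.values():
        # Count each element at most once per game.
        counts.update(set(elements))
    total = len(games)
    return {
        element: round(100 * n / total, 2)
        for element, n in counts.most_common()
    }


# Hypothetical coding of three games (illustrative only).
coded_games = {
    "game_a": {"feedback", "theme", "points"},
    "game_b": {"feedback", "theme"},
    "game_c": {"feedback", "levels"},
}

frequencies = element_frequencies(coded_games)
for element, pct in frequencies.items():
    print(f"{element}: {pct}%")
```

Restricting `coded_games` to one category (for example, only bespoke games) before calling the function would give a per-category breakdown of the kind summarized in Table 2.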
4.2. What Are the Differences in Game Elements between Bespoke Games, Off-the-Shelf Games, Games for Education, and Games for Entertainment?
The answer to RQ1 provided surface-level information regarding the differences between the elements in the four game categories. However, these games were the primary educational tools considered in this study. Therefore, further analysis was necessary to form a clear picture of the characteristics of each group and to identify differences and similarities.
Figure 2 details the number of game elements found in the analyzed papers, specifically displaying the median, interquartile range, and outliers among the different game categories.
The figure shows the large gap in game element quantity between games constructed for education versus entertainment. These findings are similar to those of Xu [
27] in that the DGBLL applications are inconsistent in their quantity of game characteristics. Interestingly, the bespoke and games-for-education categories were similar in element quantity. Although it is difficult to see on the box plot due to the minute difference, COTS educational games used fewer game elements, on average, than the bespoke games. The mean value of game elements in bespoke games was 5.56, and the mean value of COTS educational games was 5.11.
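For readers less familiar with box plots, the quantities involved (median, interquartile range, outliers, and the mean reported alongside them) can be computed as in the sketch below. It rests on two assumptions: outliers are flagged with the common 1.5 × IQR rule, and quartiles use the "inclusive" method. The paper does not state which conventions were used for Figure 2. The list of element counts is hypothetical and is not the study's data.

```python
import statistics


def box_plot_summary(values):
    """Summarize a sample the way a conventional box plot does."""
    # Quartiles via the "inclusive" method (an assumption, see lead-in).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed outlier rule: beyond 1.5 * IQR from the quartiles.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "mean": statistics.mean(values),
        "outliers": outliers,
    }


# Hypothetical number of game elements per game in one category.
element_counts = [3, 4, 4, 5, 5, 6, 6, 7, 15]

summary = box_plot_summary(element_counts)
print(summary)
```

In this illustrative sample, one game with many elements raises the mean above the median, which is one reason to read means and box plots together when comparing categories.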
4.3. Are There Differences between Bespoke and Off-the-Shelf Games Regarding Their Impact on Observed Outcomes within a Specific Age Group?
Understandably, games intended for more mature audiences would not be used for younger users. Maturity, in this case, can refer to properties such as physical ability (motor skills or endurance) or cognitive ability (concrete or abstract thinking). Information regarding design patterns for specific ages can be beneficial in forming frameworks for compelling games.
Table 3 summarizes information regarding age groups and learning impact. In this table, the age groups are listed according to the coding in
Section 3. This column is followed by frequent elements found in the gameful approaches used in the papers. Frequent elements in this table are defined as elements occurring in 50% or more of the DGBLL applications. Next, bespoke and COTS games are listed as percentages with the number of games for entertainment found in brackets. The studies’ outcomes are then listed, showing what type of outcome was tested, followed by the findings categorized as positive (Pos), negative (Neg), inconclusive or no significant findings (In/NS), and mixed (Mix). Finally, the number of papers per age group is presented in the last column. One paper [99] was excluded because the age group was not indicated, which is why the final column total is 113 instead of 114. Additionally, since this table uses data extracted from each paper, game duplicates are present.
Unsurprisingly, feedback and theme are heavily featured throughout the age groups. Three differences were found in the frequent elements present among the age groups. First, the points element was not featured as a frequent element in preschool DGBLL. Second, the theme element was not commonly found in mixed-age-group studies; however, levels were more frequent. Finally, secondary school papers contained the highest concentration of game elements with an average of 11 game elements.
Primary school was the only age group that used more bespoke than COTS applications. Secondary school studies featured the highest percentage of games for entertainment (37.5%), while preschool studies contained no games for entertainment. In terms of outcome, the papers reported generally positive results for DGBLL (79.65%). The most positive ratio was found in secondary school studies. These studies also included the most games for entertainment such as L.A. Noire [
79,
155] and The Sims [
102,
156]. These games were aimed at vocabulary and had a high concentration of game elements but were implemented for different reasons. The Sims was chosen because it met all the criteria for being a good game, and L.A. Noire was chosen for its interactive story. The games operate and are experienced in entirely different manners but are used for the same purpose: to provide a meaningful environment for vocabulary acquisition. This comparison is only one example of the diversity of implementations within the same age group and targeted language skill. Overall, the number of papers targeting secondary school was relatively low, with only 8 of the total 113 papers that mentioned participant age groups.
Table 3 also shows a significant difference in the number of papers among targeted age groups; more than 50% of the papers contained university and adult participants (
n = 61) (e.g., [
70,
76,
103]). Hung [
26] reported a similar finding with around 44% of the papers in the study involving university students. The next largest age group was primary school participants (
n = 29), who featured in around 25% of the papers. The sizeable disproportion in age groups complicates the identification of prediction patterns for game elements. Therefore, more research on this gameful approach is needed in preschool and secondary school settings. The disproportion may be due to less strict ethical procedures with adults, adults having more technical skills, or university faculty having the ability to manipulate their curriculum [
25].
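The age-group shares quoted in this subsection follow directly from the reported paper counts. The short sketch below recomputes them from the figures given in the text (61, 29, and 8 papers out of the 113 that reported an age group). The helper name `share` and the dictionary layout are illustrative choices only.

```python
def share(count, total):
    """Percentage of `total` represented by `count`, to two decimals."""
    return round(100 * count / total, 2)


# Counts as reported in the text of this subsection.
PAPERS_WITH_AGE_GROUP = 113
reported_counts = {
    "university and adult": 61,
    "primary school": 29,
    "secondary school": 8,
}

shares = {
    group: share(count, PAPERS_WITH_AGE_GROUP)
    for group, count in reported_counts.items()
}
for group, pct in shares.items():
    print(f"{group}: {pct}%")
```

The output makes the imbalance concrete: university and adult studies account for just over half of the papers, while secondary school studies account for roughly 7%.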
4.4. Are There Differences between the Minor Design Elements Regarding Language Skills?
Language skills are the core of DGBLL and need to be thoroughly investigated. Since highly frequent (major) elements found during the analyses tend to be similar, it is important to look at the nuances within games aimed at specific language skills. These minor details can help us understand more about the role of game elements in DGBLL by shedding light on how they impact specific areas of acquisition.
Table 4 summarizes language skill implementation in the analyzed papers. For this summary, the number of games was analyzed instead of the number of papers. Considering that some authors such as Zhonggen [
82] and Karaaslan [
67] used multiple games, the total number of games was higher than the number of papers (
n = 119). Additionally, three papers were omitted because they did not explicitly test or observe language skills or performed tests that were not fully explained.
The summary echoes the result of Xu [
27] regarding the dominance of vocabulary studies. Of the 119 games in 111 papers, 51 games (42.86%) were aimed at vocabulary or used vocabulary as a means of performance testing. Additionally, vocabulary skills were targeted amongst the mixed language skill papers, making vocabulary the most popular language skill in DGBLL by an overwhelming margin.
Major elements were excluded from the language skills table to reduce redundant results. However, the communication analysis needs further explanation because of this exclusion. Papers that targeted communication were the most likely to use a massively multiplayer online (MMO) game. The most popular MMO was Second Life, with six instances, followed by World of Warcraft (WoW) [
157], and Ragnarok Online [
158], which were used in three and two papers, respectively. The frequent use of MMOs (which often contain many game elements) creates an issue with unique major elements that need to be mentioned. When targeting communication skills, the games chosen or built for these studies contained four unique major elements: social networking, teams, exploration, and virtual economy. This combination of major and minor elements allowed for more freedom for the participants and encouraged interaction in the target language [
88].
Another issue was that input skills were rarely the focus of studies, and the listening skill featured in only one paper, by Müller [138]. Consequently, the low number of papers (n = 4) meant that one game with unique elements would ultimately occupy the minor elements cell. In Müller’s study, the only listening skill paper, the participants who used the bespoke DGBLL application Medicina were better able to identify words aurally. The game had only five elements (feedback, theme, points, leaderboard, and time pressure) and produced significantly positive results. Overall, the input skills papers were very low on game elements, with 18 elements in total.
Output skills included some interesting DGBLL implementations such as Her Story [
159] for writing, and Khatoony’s bespoke virtual reality (VR) game [
121] for speaking. Lee [
103] conducted a study using Her Story, which is a unique COTS game for entertainment where players follow a police case through interview tapes. The participants in this study were successful in using the interactive game to inspire their creative writing. The VR game, in contrast, had very few game elements with the focus on the theme of each room and the interaction between the participant and the non-player character. This study, in particular, isolated the theme element in game-based learning, but the use of VR technology could have influenced the outcome due to the novelty and device-driven immersion factors.
Grammar was also under-represented in studies and was the only language skill that regularly implemented the learning element. Both of the games that contained learning, English Extras® In Business with A, An, and The and Phrasal Nerds [
160], were COTS games for education. These systems also contained narratives and taught grammar points before the participants went through the story. This combination worked effectively, according to Kao’s [
83] findings. Fallah [
96] compared Phrasal Nerds with Kahoot! [
161] and found that the narrative influenced participants’ choice of Phrasal Nerds over Kahoot!.
Studies that included multiple language skills contained the highest percentage of COTS games. Targeting multiple skills allowed for greater creativity within the DGBLL implementation. For example, Liang [
49] used Second Life as a multimodal DGBLL to aid in the elocution of words through communication activities and storytelling exercises. Liang’s strategy was especially effective by employing the narrative element. Another example is the bespoke application called ImALeG [
94]. ImALeG was created to teach Amazigh after its re-introduction into the school system in Morocco by having students explore the environment to collect letters of the alphabet. The exploration and narrative elements were a major part of the design, which included a companion and dialogue with non-player characters.
4.5. What Are Some Lesser-Used Game Design Elements in DGBLL Applications That Have Shown Potential?
Marczewski’s framework includes specific features of games, which helps to identify uniquely implemented strategies for language learning. A few low-frequency elements are discussed in this section.
4.5.1. Boss Battles for Assessment
Some applications had self-contained assessment tools. Assessment using the quiz COTS game Kahoot! had mixed reviews [
104,
134]. One bespoke DGBLL application created by Ansteeg [
90] that had positive responses from the participants featured an assessment using a boss battle. Although boss battles were also regularly found in the games for entertainment used for DGBLL (41.67%), the feature was used more as a device or affordance for stimulating communication in the target language [
38]. Therefore, using boss battles for assessing knowledge makes Ansteeg’s prototype unique. Boss battles require more work to implement since other elements are necessary to support the feature, such as narrative/story and theme.
4.5.2. Collecting and Trading
Collecting items was featured in 5.77% of the bespoke games and 7% overall. Collection methods varied widely amongst the games. Economou’s [
105] British Sign Language application had users explore the environment to find and collect target language to fight enemies (this is similar to [
90]). In contrast, Nori School [
32] required users to defeat monsters to collect items. Trading involves being able to give and receive items from other users or game characters.
4.5.3. Anonymity and Anxiety
The use of technology to mediate communication between language learners was shown to reduce anxiety toward using the language [
162]. The game element anonymity, which allows a user to mask their identity or assume the identity of an avatar, fulfils this purpose. Language learners have found it easier to chat in a target language using a messaging system or find that using such a system increases their confidence [
163,
164]. Therefore, it is understandable that using a massively multiplayer online game with chat functionality can facilitate a similar type of environment. Collaboration during social gaming can contribute to language skills and practice that is outside of the scope of traditional classroom teaching. A good example is Zheng’s [
38] study on WoW, which observed how coaction creates a meaningful experience during language practice.
4.5.4. Networking, Questing, and Exploring Together
Social networking is the ability to communicate with other users. Questing involves guiding a user through content by assigning tasks. Both social networking and questing are uncommon elements in DGBLL (occurring in 16% and 19% of games, respectively). Although they rarely occur together overall (7%), they are more regularly found together if one is present (around 40%). Interestingly, exploration, an uncommon element found in 22% of the total game population, occurred 100% of the time when questing and social networking were present. This type of design allows users to acquire language incidentally and practice language through communicating and adventuring together. WoW has been used in a few studies that focused on communication and found mostly positive observations [
12,
38,
44,
92]. Another game called MeetMe used by Yamazaki [
68] lacked questing because the game was mainly for socializing. However, Yamazaki provided the participants quests outside of the application. Yamazaki found that the participants were encouraged to use the target language when communicating with native speakers to aid them in completing the quests.
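The "around 40%" co-occurrence figure in this subsection can be recovered from the three reported percentages by dividing the joint frequency by each individual frequency. The sketch below does this, assuming all three percentages refer to the same population of games. The function and variable names are illustrative.

```python
def conditional_share(joint_pct, marginal_pct):
    """Share (in %) of games having one element that also have the other."""
    return round(100 * joint_pct / marginal_pct, 2)


# Percentages as reported in the text.
social_networking_pct = 16
questing_pct = 19
both_pct = 7

# Of the games with social networking, how many also have questing?
questing_given_social = conditional_share(both_pct, social_networking_pct)
# Of the games with questing, how many also have social networking?
social_given_questing = conditional_share(both_pct, questing_pct)

print(questing_given_social, social_given_questing)
```

The two conditional shares come out at 43.75% and 36.84%, which bracket the "around 40%" stated above and show that the pairing is far more common than the 7% overall figure alone suggests.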
4.5.5. Points, Badges, and Leaderboards
The gameful approach review by Dichev [
25] found that 27% of the gamification studies used a combination of points, badges, and leaderboards. However, in DGBLL, this combination is found far less frequently. Points were commonly present in games (52%). In contrast, leaderboards (17%) and badges (14%) were uncommon. The combination of these three elements was only found in 5% of the analyzed games. Therefore, this combination of elements is rarely found and distinguishes DGBLL from other gameful approaches.
4.6. Limitations
The papers reviewed during this study were limited to those written in English, so relevant studies in other languages were not included. Another limitation, which is common in literature reviews, was that the keywords were specific (in this case, focused on serious and digital games). During the literature search, some papers with similar concepts and experiments were labelled with alternative words such as simulations, computer-aided language learning, or technology-enhanced language learning. Therefore, there may be papers related to the topic presented in this study that were missed due to the wording.
Another important limitation is related to access to the applications mentioned in the analyzed papers. It was possible to find some of the games, but many of the game elements were extracted from screenshots and descriptions instead of hands-on experience with the applications.
5. Conclusions
DGBLL is still a growing field and requires more data to provide implementation guidelines for the education community. This review of the literature provides a valuable summary regarding game design patterns found in language education and how these features can be used. The most frequent features included providing feedback, having a theme, telling a story, progressing through stages of increasing difficulty and, finally, giving points. However, only a few studies showed causal evidence of specific elements or combinations of elements creating the positive outcome, which creates issues with generalizing the effectiveness of their implementations.
Unlike Hung’s review [
26], the analysis in this paper showed similar usage of bespoke games versus COTS games in DGBLL, where 52% of the games were bespoke and 48% of the games were off-the-shelf. However, games for educational purposes (
n = 88) far exceeded the number of games for entertainment (
n = 12). Bespoke and COTS games also share the same three most frequent game elements: feedback, theme, and points. The studies using this gameful approach were found to produce predominantly positive results.
Tertiary education and adult participants outnumbered other age groups by a wide margin. Additionally, research was severely lacking in the preschool and secondary school age groups. Input, output, and grammar language skills were under-researched compared with vocabulary and communication. Input, especially, suffered from a dearth of data, with the listening skill found in only 1 paper out of 111. Therefore, future research will benefit from addressing these less-targeted categories to better understand the impact of DGBLL.
Finally, this literature review covers the number of game elements and their frequency patterns but does not cover the quality of their implementation. The quality of a game element refers to factors such as the creativity or impact of a given element. For example, the points element is commonly found in games, but its implementation can manifest in various ways, such as a leaderboard entry or avatar levelling. Within these manifestations, which is the most effective and why? These are the types of questions that need to be addressed in further reviews.