1. Introduction
The adoption of immersive reality technologies across domains and application themes such as architecture, medical practice, engineering, and tourism has increased recently [1,2,3,4,5,6,7,8]. For instance, Alizadehsalehi, Hadavi [1] review the existing literature, case studies, and applications of immersive reality technology in the Architecture, Engineering, and Construction (AEC) industry and outline a roadmap that promotes the integration of immersive reality technology, cloud computing, digital twins, emerging IoT technologies, and cognitive computing to solve a variety of construction and management issues in the industry. Similarly, Alizadehsalehi and Yitmen [2] present a framework that integrates digital twins, building information modelling (BIM), and immersive reality technology to monitor construction progress.
The tourism industry, digital cultural heritage, and architectural heritage have also benefited from immersive reality. In recent years, studies in these domains have demonstrated how the integration of cultural computing, 3D modelling, and immersive reality improves awareness of cultural heritage [4]. Furthermore, studies also show how immersive reality plays a role in helping the tourism industry recover from the economic challenges induced by the COVID-19 pandemic [5].
Recent studies in the virtual heritage (VH) domain have recognised the importance of collaboration, social interaction, and engagement in the exhibition technologies that museums provide to visitors [9,10]. In this regard, immersive reality technologies are becoming a popular choice to enhance visitors’ experience.
Museums are shared spaces, and it is crucial that immersive reality technologies embrace this characteristic. However, not all forms of immersive reality technology naturally enable collaboration between visitors. For instance, virtual reality (VR) creates an artificial barrier between visitors and between the real and virtual worlds. In contrast, Augmented Reality (AR) and Mixed Reality (MR) do not create an artificial barrier between visitors because virtual objects are overlaid on visitors’ views of the real world. Hence, social interaction between visitors and a contextual relationship between visitors and the real world can be maintained.
In this paper, we evaluate a mixed reality application designed and implemented to enhance cultural learning in museums. The mixed reality application (Clouds-based Collaborative and Multi-modal Mixed Reality) attempts to enable collaboration, engagement, and a contextual relationship in mixed reality applications that specifically aim at virtual heritage themes in the context of enhancing cultural learning.
This paper is a continuation of previously published works produced as part of the first author’s PhD research project. The publications are summarised in Table 1 to make the reading smoother and to establish a connection between the papers. The published works are categorised into four research phases.
Phase one: Exploring the state-of-the-art
A Survey of Augmented, Virtual, and Mixed Reality for Cultural Heritage [11].
A Comparison of Immersive Realities and Interaction Methods: Cultural Learning in Virtual Heritage [12].
Phase two: Establishing the conceptual base
Redefining Mixed Reality: User-Reality-Virtuality and Virtual Heritage Perspectives [13].
Mixed Reality: A Bridge or a Fusion Between Two Worlds? [14].
Phase three: Design and implementation
From Photo to 3D to Mixed Reality: A Complete Workflow for Cultural Heritage Visualisation and Experience [15].
Walkable Mixed Reality Map as Interaction Interface for Virtual Heritage [16].
Clouds-Based Collaborative and Multi-Modal Mixed Reality for Virtual Heritage [17].
Phase four: Evaluation
The remainder of this paper is structured as follows. Section 2 discusses existing studies to provide the theoretical background for the study. Section 3 provides a detailed discussion of the research model and explores various assumptions. Following that, Section 4 explains the research methodology adopted. Section 5 presents a detailed discussion of the data analysis and results. Finally, Section 6 and Section 7 offer discussions and conclusions, including the theoretical contribution, practical benefits to the virtual heritage domain, and future work.
3. Conceptual Model
In this section, we discuss our conceptual model. Based on the theoretical background presented in the previous section, we discuss the research model (framework) that led to the design, implementation, and evaluation of clouds-based collaborative and multi-modal mixed reality [17]. Figure 1 shows our conceptual model, which presents the characteristics of collaborative and multi-modal mixed reality affecting users’ cultural learning experience via collaboration, engagement, a contextual relationship, and their associated enablers. Establishing enhanced cultural learning as our objective, we also outline how the major characteristics of collaborative and multi-modal mixed reality influence cultural learning in virtual heritage applications at museums and heritage sites. We further explore how these characteristics are connected to and influence each other to attain the primary objective.
3.1. Collaborative Interaction, Collaboration (Social Interaction), Contextual Relationship, and Engagement
Collaborative interaction refers to the ability of interaction methods to enable effective and meaningful collaboration between users. As discussed in the previous section, collaboration, engagement, and contextual relationships influence cultural learning in virtual heritage. When viewed as characteristics of cultural learning in virtual heritage, collaborative interaction, therefore, can enable social interaction, a contextual relationship, and engagement. Hence, we hypothesise that collaborative interaction in mixed reality will have a positive effect on engagement, collaboration (social interaction), and a contextual relationship between users and the virtual environment.
3.2. Multi-Modal Interaction, Collaboration (Social Interaction), Contextual Relationship, and Engagement
Multi-modal interaction methods in mixed reality enable users to manipulate the virtual environment and interact with the application via multiple modes, such as gesture, speech, movement, and gaze. These characteristics lead to a more natural way of interacting that requires less effort from users. As a result, users are not distracted by the complexity of the interaction methods, which in turn enhances engagement and helps the real-virtual environment establish a contextual relationship between users and the environment itself. Hence, we hypothesise that multi-modal interaction in mixed reality will have a positive effect on engagement and the contextual relationship between users and the virtual environment.
3.3. Collaboration (Social Interaction) and Cultural Learning
Collaboration (social interaction) in virtual environments, as discussed in the previous section, is one of the characteristics of collaborative and multi-modal mixed reality. We have discussed in the introductory section that museums are shared spaces. As such, social interaction is often implicit in the visiting experience. However, contextual cultural interaction with artefacts, displays, and related media is seldom effectively leveraged. Interaction methods in virtual heritage applications need to embrace this potential. We hypothesise that collaboration (social interaction) in mixed reality will have a positive effect on cultural learning.
3.4. Contextual Relationship and Cultural Learning
Contextual relationship is a three-way relationship between users, the real world, and the virtual environment [13]. The relationship between the virtual environment and the real world is as crucial as the social interaction between users. We hypothesise that a contextual relationship in mixed reality will have a positive effect on cultural learning.
3.5. Engagement and Cultural Learning
Engagement in virtual environments, as discussed in the previous section, is one of the characteristics of collaborative and multi-modal mixed reality. We hypothesise that engagement in mixed reality will have a positive effect on cultural learning.
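The hypotheses stated in Sections 3.1 to 3.5 can be read as a small directed graph of hypothesised positive effects. The following is a minimal sketch of that reading, not part of the evaluated application: the names HYPOTHESISED_EFFECTS and paths_to are illustrative assumptions, and the graph encodes only the relationships stated in the text above.

```python
# Hypothesised positive effects from Sections 3.1-3.5, written as cause -> effects.
# All identifiers are illustrative assumptions; they do not come from the application.
HYPOTHESISED_EFFECTS = {
    "collaborative interaction": ["engagement", "collaboration", "contextual relationship"],
    "multi-modal interaction": ["engagement", "contextual relationship"],
    "collaboration": ["cultural learning"],
    "contextual relationship": ["cultural learning"],
    "engagement": ["cultural learning"],
}


def paths_to(target, graph=HYPOTHESISED_EFFECTS):
    """Return every hypothesised chain that starts at a root construct and ends at target."""

    def walk(node, trail):
        trail = trail + [node]
        if node == target:
            return [trail]
        found = []
        for nxt in graph.get(node, []):
            found.extend(walk(nxt, trail))
        return found

    # Roots are constructs that are never the effect of another construct.
    effects = {e for targets in graph.values() for e in targets}
    roots = [n for n in graph if n not in effects]

    chains = []
    for root in roots:
        chains.extend(walk(root, []))
    return chains


if __name__ == "__main__":
    for chain in paths_to("cultural learning"):
        print(" -> ".join(chain))
```

Under this reading, every chain that reaches cultural learning passes through collaboration, engagement, or a contextual relationship, which is why the evaluation in Section 5 is organised around those three characteristics.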
4. Method
4.1. Study Context
Figure 2 shows SS Xantho, launched in 1848, which is one of the world’s first iron ships and Western Australia’s first coastal steamer. Xantho was selected as the cultural context for the collaborative and multi-modal mixed reality application we evaluate in this paper [17,39] because of its significance to the maritime archaeology of Western Australia (it has also been depicted in Aboriginal rock art). It was used as a “tramp steamer”, pearler, and convict ship before sinking in 1872. Besides a permanent section in the Western Australia Shipwreck Museum featuring the ship and related artefacts, the museum has made available 3D models of the ship and its engine, “…the only known example of the first high pressure, high revolution engines ever made.” As part of this study, two mixed reality applications, Walkable Mixed Reality Map [16] and Clouds-based Collaborative and Multi-modal Mixed Reality [17], were designed and implemented. Both applications use the story of Xantho as their cultural context.
The evaluation in this study focuses on the Clouds-Based Collaborative and Multi-Modal Mixed Reality application. By using this application at the Western Australia Shipwreck Museum, visitors can collaboratively interact with 3D models, videos, audio, and textual information related to Xantho. The experience is delivered to users via a Microsoft HoloLens device. Two users can collaborate and interact with the mixed reality experience at the same time. Users can choose between speech, gaze, gesture, and movement to interact with the mixed reality environment. They can interact with 3D models, read text, and play the audio and video content presented. The two users experiencing the mixed reality environment can collaborate and communicate while navigating through the story of Xantho.
The experience begins with the application asking users to provide a stage ID, which identifies the shared location onto which the mixed reality environment will be loaded. Figure 3 shows users interacting with the application (see the Supplemental Materials for a video of the mixed reality experience). The stage ID can be set and passed to users either by a curator or by one of the participants who plays the role of a guide; we invite readers to refer to the published article for details [17]. Once users supply the stage ID, the HoloLens devices load the mixed reality environment at the shared location and users start interacting with the environment. The experience takes approximately 15–20 min and has four segments. The first segment introduces users to Microsoft HoloLens and the interaction methods they can utilise. The introduction is delivered by a male virtual guide. After this segment, users select to begin the story of Xantho, and the Walkable Mixed Reality Map (second segment) is projected on the floor. At this stage, users start to explore the content collaboratively. This segment focuses on the early life of Xantho. After this stage, users can freely navigate through segment three (which focuses on the wreck of Xantho) and segment four (which focuses on the discovery of the wreck of Xantho). Interaction with content and the environment is achieved via a multi-modal interaction method that combines speech, gaze, gesture, and movement. This provides users with the flexibility to switch between different modes.
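The joining behaviour described above (a stage ID that identifies a shared location, and at most two simultaneous users) can be sketched as follows. This is a hedged illustration only: the class SharedStage, its join method, and the example stage ID are hypothetical names, and the sketch makes no claim about how the HoloLens application or its cloud services actually implement sharing.

```python
from dataclasses import dataclass, field

# The text states that two users can share the experience at the same time.
MAX_USERS = 2

# The four segments of the experience, as described in the text.
SEGMENTS = (
    "introduction to HoloLens and the interaction methods",
    "early life of Xantho (Walkable Mixed Reality Map)",
    "wreck of Xantho",
    "discovery of the wreck of Xantho",
)


@dataclass
class SharedStage:
    """Hypothetical model of a shared location identified by a stage ID."""

    stage_id: str
    users: list = field(default_factory=list)

    def join(self, user, supplied_id):
        """Admit a user only if the supplied ID matches and a place is free."""
        if supplied_id != self.stage_id:
            return False  # wrong stage ID: the user is not placed at this location
        if user in self.users:
            return True  # already joined
        if len(self.users) >= MAX_USERS:
            return False  # the stage already hosts two users
        self.users.append(user)
        return True


if __name__ == "__main__":
    stage = SharedStage(stage_id="example-stage")  # hypothetical ID set by a curator or guide
    print(stage.join("visitor A", "example-stage"))
    print(stage.join("visitor B", "example-stage"))
    print(stage.join("visitor C", "example-stage"))
```

In this sketch a third visitor is refused, which mirrors the two-user limit stated above; how the real application handles additional users is not specified in this paper.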
4.2. Measures
The instruments used for the evaluation were a questionnaire and semi-structured interviews. The questionnaire had a total of nine measurement items scored on a 5-point Likert scale, one open question, and six demographic questions (see Table 2 and Table 3). The measurement methods were adopted from the Technology Acceptance Model (TAM) and Bae, Jung [40]. The semi-structured interview had five predetermined questions.
4.3. Data Collection
The survey and interviews were conducted over two evaluation sessions that took place at the Western Australian Shipwreck Museum on 7 and 14 October 2021. The evaluation was conducted by the primary author as part of his PhD research. Experts, archaeologists, curators, and researchers from the museum participated in the evaluation.
After completing the mixed reality experience, participants were given a tablet computer to respond to the questionnaire. Once responses were gathered, participants were asked five semi-structured questions. The interview was recorded on a recording device (smartphone) and transcribed for further analysis.
A total of 11 experts from different departments of the Western Australian Shipwreck Museum participated in the evaluation. Table 2 shows demographic details of the participants. According to the data gathered from the two evaluation sessions, the majority of participants were female (6 female, 4 male, and 1 preferred not to identify gender). The majority of participants were aged between 40 and 49. With regard to participants’ previous experience with immersive reality technology in general, the responses show that 7 participants were novice users and 3 participants had never used the technology (one participant did not respond to this survey item). However, participants’ responses to a survey item that asked about their previous experience with Microsoft HoloLens showed that the majority of participants were new to the technology (8 had never used the technology, and 3 were novice users).
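As a quick consistency check, the counts reported above can be tallied against the 11 participants and expressed as percentages. The sketch below uses only the figures stated in this section; the names tallies and shares are illustrative assumptions, and the age distribution is omitted because the text reports only the modal age group.

```python
TOTAL_PARTICIPANTS = 11

# Counts exactly as reported in the text above.
tallies = {
    "gender": {"female": 6, "male": 4, "preferred not to identify": 1},
    "immersive reality experience": {"novice": 7, "never used": 3, "no response": 1},
    "HoloLens experience": {"never used": 8, "novice": 3},
}


def shares(counts, total=TOTAL_PARTICIPANTS):
    """Return each category as a percentage of all participants, to one decimal place."""
    if sum(counts.values()) != total:
        raise ValueError("counts do not add up to the number of participants")
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


if __name__ == "__main__":
    for item, counts in tallies.items():
        print(item, shares(counts))
```

Each item adds up to 11, and the HoloLens figures show that roughly three quarters of the participants had never used the device, a point taken up again in Section 6.2.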
5. Results
In this section, we present the results obtained from analysing the data gathered from the survey items, the open question, and the semi-structured interviews. Table 3 and Figure 4 summarise the questionnaire items scored on a 5-point Likert scale. The results are grouped into three categories based on the three characteristics of collaborative and multi-modal mixed reality we identified in Section 2 and Section 3.
5.1. Collaboration (Social Interaction)
In Section 3.1, we hypothesised that collaborative interaction in mixed reality will have a positive effect on engagement, collaboration (social interaction), and a contextual relationship between users and the virtual environment. Participants’ responses to the survey items “It was easy for me to collaborate with the person I shared the mixed reality experience with”, “It was easy for me to share and explain what I was seeing”, “It was easy for me to use speech command to interact with the system”, and “It was easy for me to use gesture command to interact with the system” were used to validate whether collaborative and multi-modal interaction methods enable collaboration (social interaction) in mixed reality. The results (see Table 3 and Figure 4) indicate that the collaborative and multi-modal aspects of the mixed reality experience enable collaboration (social interaction) between users. Furthermore, participants’ responses to the question “were the two of you able to communicate while exploring the shared mixed reality experience?” validate the importance of collaboration (social interaction).
For instance, one participant responded to the question saying “…I think it’s good when people get along. They communicate in all forms from experience, communication is an important thing for the experience…”
Similarly, the following responses from the participants underline the role collaboration plays in terms of enhancing visiting experience and cultural learning.
Participant 2. “…I would agree with what (name omitted) said in her interview, that it’s good to have that level of communication. So, if this was like a partner experience that two people were able to do…”
Participant 3. “…I think If you knew each other, the interaction is easier. I’m not saying that strangers couldn’t do it. But I’m just saying it’s easier if you knew that…”
Participant 4. “…wasn’t actually aware we were supposed to. Yeah, I was just sort of acting by myself…”
Participant 5. “…It was very easy. And that was also because we work together? If it was two strangers that were working together on it, it might not be quite as, as easily as intuitive…”
Participant 6. “…always…”
Participant 7. “…we didn’t have any collaborative experiences… that’s because the application or the experience was already loaded…”
Participant 8. “…Yeah, look, it was because you, you know that you can see them there. And you can ask, well, how did you get there?”
Participant 9. “…I tried to communicate with (name omitted) and he was in his own little world …”
Participant 10. “…I think for me, because it was kind of challenging anyway, because I tried to make it work. I was focused more on what I was experiencing. I noticed that the first two ladies seem to interact quite well…”
Hence, based on the results obtained from the survey items and interviews, we can validate that collaborative and multi-modal interaction methods in mixed reality have positive effects on social interaction and engagement.
5.2. Engagement
Participants’ responses to the survey items “It was easy for me to collaborate with the person I shared the mixed reality experience with”, “It was easy for me to share and explain what I was seeing”, “I enjoyed this shared mixed reality experience”, “It was easy for me to relate the virtual experience with physical items in the gallery”, “It was easy for me to use speech command to interact with the system”, and “It was easy for me to use gesture command to interact with the system” were used to validate whether collaborative and multi-modal interaction methods enable engagement in mixed reality. The results (see Table 3 and Figure 4) indicate that collaborative and multi-modal interaction methods in mixed reality enhance users’ engagement. In addition, participants’ responses to the questions “were the two of you able to communicate while exploring the shared mixed reality experience?” and “were you able to interact with the system using all modes of interaction, such as gaze, speech, and gesture?” indicate that collaborative and multi-modal interaction methods enhance users’ engagement in a mixed reality environment.
For instance, one participant stated that “I think having that combination (gesture and speech) is good, especially for people with disabilities”. This statement shows the role that the multi-modal interaction method plays in terms of disseminating a mixed reality experience to people with different abilities and backgrounds. The following responses from participants support our assumption that a multi-modal interaction method enhances users’ engagement with a mixed reality environment.
Participant 2. “…Yes. I think it’s good to have the two options (gesture and speech), not just one…”
Participant 3. “… it was fairly user friendly. For me, at least, the speech commands didn’t work all the time. But people do always have that backup gesture…”
Participant 4. “…is quite easy to use gestures. I can see where the voice can be easier to use, but a lot of people use the gestures…”
Participant 5. “…I found the hand gestures difficult until I’ve got used to them. But I think having that combination is good, especially for people with disabilities. So, they can choose either the gaze or the spoken word…”
Participant 6. “…both, but I found the gesture was better than the speech. I had to say the keywords a couple of times…”
Participant 7. “…gestures were good…”
Participant 8. “…I was able to use speech freely…didn’t experience any difficulty with that at …”
Participant 9. “…gestures, it’s quite easy to use gestures. The voice command, have tried a few times …”
Participant 10. “…gestures…in the beginning I was a little confused…”
Participant 11. “…the gestures, once I learned them …”
Based on the results obtained from the survey items and interviews, we can validate that collaborative and multi-modal interaction methods in mixed reality have positive effects on engagement.
5.3. Contextual Relationship
Contextual relationship refers to establishing a specific relationship between users, cultural context, and the immersive reality systems. In Section 3, we hypothesised that collaborative interaction in mixed reality will have a positive effect on a contextual relationship. Participants’ responses to the survey items “It was easy for me to collaborate with the person I shared the mixed reality experience with” and “It was easy for me to relate the virtual experience with physical items in the gallery” were used to validate whether the collaborative interaction method enables a contextual relationship in mixed reality. The results (see Table 3 and Figure 4) indicate that collaborative interaction in mixed reality enables a contextual relationship. The results from Section 5.1 and Section 5.2 also support this view because a contextual relationship is the result of collaborative and multi-modal interaction.
5.4. Enhanced Cultural Learning
In this paper, we have argued that collaboration (social interaction), engagement, and a contextual relationship in mixed reality enhance cultural learning in virtual heritage. The results presented above show that collaborative and multi-modal interaction methods enable these characteristics. Therefore, we can conclude that collaborative and multi-modal mixed reality has a positive effect on cultural learning in virtual heritage. Furthermore, this assumption is validated by participants’ responses to the survey items “I would like to see more items from the gallery presented in the system” and “I think the experience can enhance visitors’ interest to explore more collections in the museum” and their responses to the question “do you think this technology can be used to enhance visitors’ interest in the museums’ collections”.
The following responses from the participants validate that the collaborative and multi-modal mixed reality enhances visitors’ interest in learning about the museums’ collection.
Participant 1. “…Yeah, I can see how it’s quite useful. You know, everybody likes a more interactive experience the museum…”
Participant 2. “…Yeah, I think so. It would engage the younger generations, including teenagers, I think we missed that demography from the museum, you know, the late teens, early 20s…”
Participant 3. “…I think this captures the interest of the 16- to 24-year-old. I think this is a great option to do that…”
Participant 4. “…because you have so many ways of learning. Some people are happy to read. Other people want to touch and interact with some people or technology as well. So, I think the experience adds another good layer…”
Participant 5. “…I think this would really appeal to young people. This gallery is underused and undervalued, this technology can attract young people to come to the gallery…”
Participant 6. “…yes, very much so…”
Participant 7. “…yes it can…”
Participant 8. “…Absolutely, this is a great new way of seeing gallery interviews and stuff. I really liked that…”
Participant 9. “…yes, the gallery, yes …”
Participant 10. “…I think it should. I mean, why wouldn’t it? Because it’s supposed to be enhancing your experience? Yes. So therefore, it must be beneficial…”
Participant 11. “...yes, absolutely”
6. Discussion
The objective of this study was to validate whether collaborative and multi-modal mixed reality can facilitate enhanced cultural learning in virtual heritage. Overall, the findings support our proposed hypotheses that collaboration (social interaction), engagement, and a contextual relationship in mixed reality influence cultural learning in virtual heritage. However, the findings also identify some limitations that hinder the learning experience. These limitations are categorised into two groups: multi-media content (cultural context) and usability.
6.1. Multi-Media Content
Participants were asked to share any thoughts or comments about their experience (only five participants responded to this open question). Their responses suggest that the experience needs improvement in terms of the multi-media content and 3D models it includes. The following suggestions were made by the participants. We believe that addressing this feedback will improve the overall cultural learning in the mixed reality experience.
Participant 1. “…I think subtitles during the video would be great. Also, a visual representation of what the whole ship looks like…”
Participant 2. “…reduce amount of text and video content. Currently it is quite long and may not hold visitors’ attention…”
Participant 3. “…The video of the discovery of Xantho would be good if a time display was shown so people know how long to expect it to go for and perhaps a volume control, as with other people in the gallery it was hard to hear…”
Participant 4. “…Perhaps include a 3D version of entire Xantho when it was complete…”
Participant 5. “… Perhaps an animation of how the engine worked, and some further interpretation of why the sideways mounting of it was so remarkable…”
6.2. Usability
Feedback received from participants suggests that visitors might find interacting with the system difficult. This is supported by the results of the evaluation: the survey item “I think visitors will find the system easy to use and follow” received the lowest score of all the items. This is to some extent influenced by a lack of previous experience with immersive reality, and with Microsoft HoloLens in particular. Table 2 shows that a total of 8 out of 11 participants had never used Microsoft HoloLens prior to the evaluation session. The following remarks were made by the participants.
Participant 1. “…Interesting and worthwhile experience, easier I would think for younger people…”
Participant 2. “…Instructions embedded to explain ability to enlarge the 3D engine…”
Participant 3. “…Number the steps that participants should follow. If map could be mounted horizontally rather than on the floor (which I liked because it tracks the journey in the correct orientation, but was hard on the neck as I had to look down quite sharply) …”
Based on the evaluation results and the remarks from participants, the interaction design needs improvement to address the suggestions. Visitors need to be presented with easy-to-understand instructions prior to engaging with the experience. The mixed reality application had a segment that provides instructions to users. However, the instructions were part of the experience. They need to be presented to users before the experience begins. To this effect, printed material or a video that demonstrates interaction methods of HoloLens can be used to introduce users to the overall experience.
7. Conclusions
In this paper, we have presented the results of the evaluation of a clouds-based collaborative and multi-modal mixed reality application that took place at the Western Australia Shipwreck Museum. The application was designed and implemented to enhance cultural learning in virtual heritage via a combination of collaborative interaction, multi-modal interaction, and mixed reality. SS Xantho, one of the world’s first iron ships and Western Australia’s first coastal steamer, was used as the cultural context for the evaluation. Surveys and interviews were conducted to gather data from 11 participants. The collected data were analysed to validate whether collaboration, engagement, and a contextual relationship in mixed reality enhance cultural learning in virtual heritage. The results indicate that these characteristics facilitate enhanced cultural learning in virtual heritage. Furthermore, the results were interpreted to identify limitations, suggestions, and directions for future research in the domain.
Future Directions
Immersive reality display technologies, more specifically the Microsoft HoloLens, are expensive to install in museums as permanent exhibits. Although the mixed reality application in this article is a native Microsoft HoloLens application, it can be customised and deployed to other AR/MR headsets. Alternatively, the application can be customised for cloud-native deployment. For instance, Amazon Web Services has released a cloud-based AR/VR platform called Amazon Sumerian. This platform enables museums to create, deploy, and run browser-based 3D, AR, and VR applications. Museums can exploit this platform to disseminate their AR and VR experiences to a wider global audience. Hence, our future research will focus on customising the mixed reality applications for multi-device and cloud deployment.
One of the findings of the evaluation was the difficulty of interacting with Microsoft HoloLens for first-time users. Participants in the evaluation (experts, curators, and museum professionals) suggested that the general audience of museums (visitors of various backgrounds) would find the interaction mechanisms of HoloLens (gesture, gaze, and speech) difficult to operate without prior knowledge and practice. They also suggested that the younger generation would find the interaction mechanisms relatively easy to learn. Hence, our future research will also address the design of an interaction mechanism that is easy to learn and that accommodates different visitor demographics.