1. Introduction
In the context of medical education and training, simulation is defined as “an artificial, yet faithful, representation of clinical situations through the use of analog and digital apparatuses” [
1]. Simulation relies on a number of assets, including manikins, cadavers, standardized patients, animals, devices, and computer-based simulation (CBS), among other methods of imitating real-world systems [
2]. CBS has been gaining momentum due to the availability of low-, mid-, and high-end immersive technologies used as complementary training and educational tools where learners are able to develop procedural and declarative knowledge under a controlled and safe environment [
3]. The use of CBS in medical education allows for the safe exposure to hazardous and life-threatening situations otherwise impossible in the real world [
The level of immersion and presence possible with CBS, particularly with virtual reality (VR) (e.g., fully virtual experiences typically requiring a head-mounted display (HMD)) and augmented reality (AR) (e.g., experiences where computer-generated content is overlaid on the real world and visualized through a handheld device or an HMD), in addition to cross-sensory cues from spatial audio and haptic feedback, can provide a highly immersive and engaging learning environment.
The eye fundus examination is a standard medical examination [
5], which allows for the early identification of conditions associated with high blood pressure and diabetes mellitus, among others [
6]. Within the eye fundus examination, the direct ophthalmoscopy (DO) examination requires the health professional to use a direct ophthalmoscope to search for anomalies within the eye. Relative to other eye examination methods, including those employing slit lamps for eyelid examination or the tonometer for gauging intraocular pressure, direct ophthalmoscopy remains the most cost-effective method and is widely available in urban and rural healthcare facilities [
7]. Teaching and evaluating DO examination competency is particularly challenging; DO training requires one ophthalmoscope per student and a set of eye fundus samples to evaluate. Due to the limitations associated with the use of live patients, images and digital renderings are often used. Instructors provide verbal descriptions during training, walking trainees through the anatomical landmarks and possible eye fundus-related conditions [
8]. Images, however, lack a proper representation of the volumetric shape of the eye fundus, thus limiting the development of the spatial awareness and patient interaction skills needed for live examinations [
9].
Typically, DO simulator-related studies focus on validating the effectiveness of the simulation with respect to other simulators or traditional practice [
10]. However, this is not the only question regarding VR- and AR-based simulation technologies. VR- and AR-based DO simulators require input devices that are representative of the direct ophthalmoscope, and the choice of input device can have a significant impact on the ease of use of the entire simulation. The potential impact of interaction techniques on VR- and AR-based DO simulators has led us to explore usability issues associated with mobile AR used in combination with a Styrofoam head and a 3D-printed direct ophthalmoscope replica as alternative tangible user interfaces [
11], tabletop displays for multiuser visualization, interaction and augmentation [
12], and early prototyping of VR DO eye examination [
4]. Despite recent advances in custom-made user interfaces employing 3D printing and open electronics, virtual simulators continue to employ off-the-shelf VR controllers and gestural inputs that may impact usability.
In this paper, we build upon our previous preliminary work comparing the usability of current VR controllers and hand-tracking interaction methods employed in VR- and AR-based DO simulators. Our aim is to develop a greater understanding of the effects of controller and hand-tracking interaction methods, including the widely used HTC Vive controller, the Valve Index controller, Microsoft HoloLens 1 hand gesticulation, the Oculus Quest controller, and Oculus Quest hand tracking, on usability when performing a virtual eye examination. To address these combinations, we conducted a within-subjects study in two stages. The first stage saw five participants randomly exposed to the HTC Vive, Valve Index, and Microsoft HoloLens hand gesticulation 3D user inputs. The second stage saw 13 participants randomly exposed to the Oculus Quest controller and the Oculus Quest hand-tracking system.
3. Materials and Methods
The virtual examination scenario used here was developed following current eye fundus examination practices conducted at the Clinical Simulation Center in Universidad Militar Nueva Granada, Bogota, Colombia. The scenario also builds on previous research conducting preliminary usability and cognitive load testing [
25].
3.1. The Virtual DO Procedure
The virtual DO examination begins with the trainee holding the ophthalmoscope and aligning the aperture of the device with the patient’s eye. Trainees are expected to examine the patient’s right eye with their own right eye, and the patient’s left eye with their left eye, to avoid nose-to-nose contact with the patient [
26]. This process requires approaching the patient while maintaining the red reflex in focus until the ophthalmoscope is as close to the patient as possible. The red reflex is caused by light reflecting off the retina and back through the pupil; by approaching it, a small portion of the retina becomes visible through the ophthalmoscope [
26]. The first anatomic landmark to be located is the optic disc, or optic nerve head, which presents a yellow-orange hue and is located approximately 15 degrees from the patient’s nose. Once the retina is in focus, blood vessels are localized and traced back along the branching pattern to the optic disc and the macula next to it. To further explore the eye fundus, the examiner orients the ophthalmoscope and asks the patient to look in different directions. In addition to the optic disc, the “optic cup” is the second-most notable anatomical landmark, characterized as a pale depression at the center of the optic disc. In summary, the virtual DO examination procedure requires locating and holding the direct ophthalmoscope, approaching the patient, locating the red reflex, and observing the retina, the optic disc, and the macula for abnormalities.
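In the simulator, this procedure translates naturally into an ordered sequence of steps. The following is a minimal Unity C# sketch, under the assumption that each step is completed by an event from the simulation (a grab, a proximity check, or a landmark sighting); the class and member names are illustrative, not the simulator's actual code.

```csharp
using UnityEngine;

// Hypothetical ordering of the DO examination steps described above.
public enum ExamStep
{
    GraspOphthalmoscope,
    ApproachPatient,
    LocateRedReflex,
    ObserveRetina,
    ObserveOpticDisc,
    ObserveMacula,
    Completed
}

public class ExamProgress : MonoBehaviour
{
    public ExamStep Current { get; private set; } = ExamStep.GraspOphthalmoscope;

    // Called by the simulation whenever a step's completion condition fires
    // (e.g., a grab event or a landmark collider trigger).
    public void CompleteStep(ExamStep step)
    {
        // Steps must be completed in order, mirroring the clinical procedure.
        if (step != Current || Current == ExamStep.Completed)
            return;
        Current = Current + 1;
        Debug.Log($"Examination advanced to: {Current}");
    }
}
```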
3.2. DO Virtual Examination Scenario
Building on our previous work [
25], the virtual examination scenario was updated after conducting a within-subjects usability testing session with five participants (
Section 5). The updated scenario includes a virtual static ophthalmoscope that allows for the examination of the eye fundus when tracking issues cause the virtual ophthalmoscope to jitter and thus increase the difficulty of the task. The previous and current scenarios are presented in
Figure 1.
3.2.1. The Virtual Eye Model
The virtual eye was modeled after the human eye and included eyelid animations and the subtle eye movements found during the eye fundus examination. These animations were implemented to increase realism and to avoid an unrealistic and uncanny virtual patient. Additionally, the eye model comprises a hollow interior depicting the anatomical landmarks, with invisible colliders that trigger their detection during the virtual examination.
Figure 2 presents the virtual examination scene with a closeup of the floating standalone virtual eye model and its integration into two virtual avatars, ’Lyette’ and ’Jimothy’. To facilitate locating the anatomical landmarks, as shown in
Figure 2, the following conventions were used: (i) a transparent white circle indicates where the Optic Disc is located, (ii) a blue transparent circle indicates where the Optic Cup is located, and (iii) a green transparent circle indicates where the Macula is located.
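To illustrate the invisible-collider mechanism, the following Unity C# sketch shows how a trigger collider attached to a landmark could report its detection. The "VDOViewProbe" tag, which we assume is carried by a small probe object projected from the VDO's aperture, and the class name are hypothetical and introduced only for this example.

```csharp
using UnityEngine;

// A minimal sketch of a landmark trigger inside the hollow eye model.
[RequireComponent(typeof(Collider))]
public class LandmarkTrigger : MonoBehaviour
{
    [Tooltip("e.g., \"Optic Disc\", \"Optic Cup\", or \"Macula\"")]
    public string landmarkName;

    void Reset()
    {
        // The collider is invisible and non-blocking: it only reports
        // that the landmark entered the ophthalmoscope's view.
        GetComponent<Collider>().isTrigger = true;
    }

    void OnTriggerEnter(Collider other)
    {
        // "VDOViewProbe" is an assumed tag on a probe object projected
        // from the virtual ophthalmoscope's aperture.
        if (other.CompareTag("VDOViewProbe"))
            Debug.Log($"Landmark detected: {landmarkName}");
    }
}
```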
3.2.2. Virtual Avatars
The examination scenario was designed to introduce users to the virtual eye examination employing three virtual avatars positioned at examination stations containing information about the tasks to be performed (
Figure 1). The first station (
Figure 2a) presents a floating eye and the actions focus on closeup interactions to help the trainee understand the use of the digital ophthalmoscope and facilitate identifying the anatomical landmarks. The second station (
Figure 2b) presents a humanoid character with enlarged eyes and focuses on locating the red reflex and the fundus visualization. Finally, the third station (
Figure 2c) presents a realistic human avatar where the examination takes place without any aids or directions.
3.3. 3D User Interactions
Our previous work focused on the HTC Vive and the Valve Index controllers for the VR interactions and the Microsoft HoloLens 1 gesticulation system for the AR experience. The Unity game engine was used to create the VR and AR experiences. The VR version of the environment was developed in Unity 2019.3.0f5 with the SteamVR plug-in, while the AR version was developed using Unity 2017.2.5f1 for compatibility with the Microsoft Mixed Reality Toolkit. To enable compatibility with current consumer-level VR and AR hardware such as the Oculus Quest series of HMDs, we used Air Light VR (ALVR), an open-source remote VR display application that streams content from a computer running the VR simulation to a standalone VR headset [
27]. ALVR allows for the addition of the Oculus Quest’s hand tracking as an alternative to the controller inputs.
In the VR simulation, the user is required to wear a headset and navigate the virtual environment by physically walking within a pre-established room-scale boundary. When using the HTC Vive, users are required to set their tracking space to room-scale with a minimum size of 2.0 × 1.5 m, while when using the Oculus Quest, the minimum tracking space is 2 × 2 m. Depending on the size of the VR interactive area, the simulation places the examination scene within the proximity of the user. It is important to note that this version does not support stationary VR modes, in which users move employing the controllers, hand gestures, or teleportation, in order to reduce possible simulator sickness effects caused by such artificial locomotion.
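As a sketch of how the scene could be positioned relative to the tracked space, the snippet below clamps the examination scene to within walking reach of the user's starting pose. It deliberately takes the play-area dimensions as a parameter, since the actual simulator may obtain them from the SteamVR chaperone or Oculus guardian systems; all names here are illustrative.

```csharp
using UnityEngine;

// Illustrative placement of the examination scene inside the play area.
public class ScenePlacer : MonoBehaviour
{
    public Transform examinationScene;                        // root of the stations
    public Vector2 minimumPlayArea = new Vector2(2.0f, 1.5f); // metres

    public void Place(Vector2 playAreaSize, Transform userStart)
    {
        if (playAreaSize.x < minimumPlayArea.x || playAreaSize.y < minimumPlayArea.y)
            Debug.LogWarning("Play area below the recommended minimum; " +
                             "stations may be out of physical reach.");

        // Keep the scene within walking proximity of the user's start pose,
        // clamped to half the shorter side of the tracked space.
        float reach = Mathf.Min(playAreaSize.x, playAreaSize.y) * 0.5f;
        examinationScene.position = userStart.position + userStart.forward * reach;
    }
}
```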
During the virtual eye fundus examination, the virtual direct ophthalmoscope (VDO) is operated with the HTC Vive or Valve Index controllers when using the HTC Vive headset, or with the Oculus Quest controller or hand tracking when using the Oculus Quest 1 or 2 in the VR simulation. Pinching, supported by Microsoft HoloLens 1 hand gesticulation, is employed in the AR simulation (see
Figure 3). By operating the VDO, the learner can identify the anatomical landmarks on an eye fundus. To perform the virtual eye fundus examination, the VDO must first be located, then reached and secured. The VDO is placed on a table to the left of the virtual environment, alongside an introduction and instructions. In the VR simulation, when using the HTC Vive controller, the VDO can be secured and gripped by reaching out and pressing the grip buttons on each side of the controller. When the Valve Index controllers are used, the VDO can be secured by closing the hand around the controller, simulating the feeling of grasping the VDO thanks to the finger proximity sensors and haptic feedback. When using the Oculus Quest controllers, the VDO can be secured by pressing and holding the grip button, while, when using hand tracking, the VDO can be grasped and held by closing the hand around it. In the AR simulation, the VDO is secured by pointing the Microsoft HoloLens 1 reticle (i.e., a circular marking projected on the display that enables interactions with the pinch gesture during AR experiences) at it and selecting it by performing a pinch gesture with the index finger and thumb. The same pinch gesture is used to hold and move the VDO during the virtual examination.
Figure 3 presents the virtual examination with the 3D input methods in addition to an ophthalmoscope view of the eye fundus.
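Since securing the VDO maps onto a different signal on each device (a grip button, finger proximity, a closed hand, or a recognized pinch), one way to structure the code is to reduce all of them to a single grab-intent boolean. The following Unity C# sketch is an assumption about structure, not the simulator's actual implementation; interface names and thresholds are illustrative.

```csharp
using UnityEngine;

// One grab-intent signal per input method, all reduced to a boolean.
public interface IGrabSignal
{
    bool IsGrabbing(); // true while the user intends to hold the VDO
}

public class GripButtonSignal : IGrabSignal          // HTC Vive / Quest controller
{
    public bool gripPressed;                         // fed by the input system
    public bool IsGrabbing() => gripPressed;
}

public class FingerCurlSignal : IGrabSignal          // Valve Index / Quest hand tracking
{
    public float averageCurl;                        // 0 = open hand, 1 = fist
    public bool IsGrabbing() => averageCurl > 0.6f;  // assumed threshold
}

public class PinchSignal : IGrabSignal               // HoloLens 1 gesticulation
{
    public bool pinchRecognized;                     // from the gesture recognizer
    public bool IsGrabbing() => pinchRecognized;
}

public class VDOGrabber : MonoBehaviour
{
    public IGrabSignal signal;                       // set for the active device
    public Transform vdo, attachPoint;

    void Update()
    {
        if (signal == null || vdo == null) return;

        if (signal.IsGrabbing())
            vdo.SetPositionAndRotation(attachPoint.position, attachPoint.rotation);
        // When the signal drops (button released, hand opened, pinch lost,
        // or tracking lost), the VDO is simply left in place, which is the
        // behaviour participants later described as "dropping" the device.
    }
}
```

A snap-to-hand behaviour, as suggested by the participants in Stage 2, would amount to latching the grab state rather than releasing it the moment the signal drops.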
3.4. Study Design
The study was conducted in two stages. The first stage (interrupted by government-mandated COVID-19 lockdowns) was a within-subjects study focused on analyzing usability, workload, and task difficulty, after the addition of the static ophthalmoscope, when employing the HTC Vive controller, the Valve Index controller, and the Microsoft HoloLens 1 hand gesticulation for conducting the virtual eye examination. The interventions were randomized so that participants were exposed to the three user inputs in different orders to minimize carry-over effects. This study was reviewed by the Ontario Tech University Research Ethics Board (REB# 15526) and approved on 7 November 2019. The second stage was an online within-subjects study (during COVID-19) focused on usability and engagement. This study was reviewed by the Ontario Tech University Research Ethics Board (REB# 15128) and approved on 22 January 2021.
The virtual examination procedure was the same for both stages, requiring the participants to grasp the VDO and approach the three virtual patients in sequential order starting with the floating eye, then ’Lyette’, and finally ’Jimothy’. Each virtual patient was accompanied by a set of floating instructions in text format indicating what the participant was required to do when examining the virtual eye fundus.
3.5. Participants
Five participants from Ontario Tech University in Oshawa, Ontario, Canada, were recruited for Stage 1. It is important to highlight that additional participants could not be recruited for Stage 1, given the restrictions imposed by the COVID-19 pandemic. In total, 13 participants, 9 from Ontario Tech University and 4 from undisclosed affiliations, were recruited for Stage 2. Stage 2 was conducted remotely; therefore, all participants were required to have access to an Oculus Quest headset (Quest 1 or 2), a VR-ready computer, and 5 GHz local area network connectivity for completing the VR tasks by streaming content from the computer to the headset using ALVR. These inclusion criteria and the COVID-19 pandemic made it difficult to recruit a larger number of participants. All participants completed the study, and all reported being familiar with VR and AR during the introduction to the study and not having any condition that would prevent them from performing the virtual eye fundus examination. Participant background with eye fundus examination was not considered an exclusion criterion, since the information presented in the procedure was sufficient for novice trainees.
3.6. Evaluation Criteria
3.6.1. Usability—Stage 1 and Stage 2
The system usability scale (SUS) questionnaire [
28] is regarded as a quick method for measuring system usability. The questionnaire asks users to rate levels of agreement through a 5-point Likert scale with statements that cover a variety of usability characteristics such as the system’s complexity, ease of use, and need for assistance amongst others. After calculating the SUS score according to [
28], a score above 68/100 indicates that the system is usable.
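For reference, SUS scoring follows a fixed rule: odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5 to yield a 0–100 score. A small C# helper illustrating this standard calculation:

```csharp
// Standard SUS scoring (a 0-100 score from ten 1-5 Likert ratings).
public static class Sus
{
    public static float Score(int[] ratings)
    {
        if (ratings == null || ratings.Length != 10)
            throw new System.ArgumentException("SUS requires 10 item ratings.");

        int sum = 0;
        for (int i = 0; i < 10; i++)
            sum += (i % 2 == 0) ? ratings[i] - 1   // items 1,3,5,7,9
                                : 5 - ratings[i];  // items 2,4,6,8,10
        return sum * 2.5f;                         // e.g., all-neutral (3s) -> 50
    }
}
```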
3.6.2. Task Workload—Stage 1
The NASA Task Load Index (TLX) [
29] provides a method of measuring the mental workload experienced by each user as they complete the tasks. The Raw NASA TLX (RTLX) was chosen to derive an overall workload score based on an unweighted average of the ratings associated with mental demand, physical demand, temporal demand, performance, effort, and frustration [
30].
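Because the raw variant omits the pairwise weighting of the full TLX, the overall RTLX score reduces to a simple unweighted mean of the six subscale ratings, as the following small helper illustrates:

```csharp
// RTLX: unweighted mean of the six NASA TLX subscale ratings.
public static class Rtlx
{
    // Ratings are on the usual 0-100 TLX scale; for "performance",
    // higher conventionally means worse (perfect = 0, failure = 100).
    public static float Score(float mental, float physical, float temporal,
                              float performance, float effort, float frustration)
    {
        return (mental + physical + temporal +
                performance + effort + frustration) / 6f;
    }
}
```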
3.6.3. Task Difficulty—Stage 1
User-perceived task difficulty was captured employing a five-point Likert scale question, where “Strongly Disagree” is “1”, “Disagree” is “2”, “Neutral” is “3”, “Agree” is “4”, and “Strongly Agree” is “5”. This question focuses on how difficult each interaction method is with respect to locating the eye fundus landmarks when examining the virtual patients.
4. Details of the Experiment
4.1. Stage 1
After agreeing to be part of the study and coordinating a participation date, the participants met with a facilitator in the GAMER Laboratory at Ontario Tech University in Oshawa, Ontario, Canada. Upon completion of the informed consent form, the participants were introduced to the HTC Vive controller, the Valve Index controller, and Microsoft HoloLens 1 hand gesticulation. The participants were then exposed to the three input devices in randomized order and required to perform an eye fundus virtual examination with each. After completing each examination with the designated user input device, the participants completed the SUS questionnaire, the NASA RTLX questionnaire, and the task difficulty question. After completing the study, participants received a verbal thank you from the facilitator.
4.2. Stage 2
After agreeing to be part of the study and coordinating a date for the study, the participants met with a facilitator over Discord and Google Meet. Upon completion of the informed consent form, the participants were introduced to the virtual scenario and were asked if they had any issues running ALVR with their Oculus Quest headsets. Three of the participants had technical issues, and the facilitator helped them troubleshoot until the simulation was running. The participants were then randomly assigned to use either the Oculus Quest controller or the Oculus Quest hand tracking first; after completing the virtual examination with each input method, the participants completed the SUS questionnaire. Once finalized, the participants received a verbal thank you from the facilitator.
6. Discussion
The COVID-19 pandemic had a negative impact on this study. For example, we were only able to recruit five participants for Stage 1 due to mandatory lockdowns and 13 online participants for Stage 2. While the study was initially designed for in-person testing, adjustments had to be implemented to make it suitable for online testing. Most significantly, we adopted the Oculus Quest 1 and 2 headsets, as both headsets support hand tracking and are capable of wireless desktop VR compatible with SteamVR. However, it is important to highlight that the scope of the results is limited by the small sample size and that these initial findings allow us to highlight some considerations that require further investigation.
6.1. Usability
The physical VR controllers resulted in higher SUS scores than the hand gesticulation required for the AR simulation during Stage 1, consistent with the findings presented in [
25]. The HTC Vive and Valve Index controllers had SUS scores above 68/100, while the Microsoft HoloLens 1 hand gesticulation did not. Participants who struggled with the hand gesticulation required several attempts before they were able to interact with the funduscope using the pinching gesture. In comparison, the HTC Vive and Valve Index controllers were easier to use. Interestingly, we anticipated that the finger tracking available in the Valve Index controller would lead to higher usability than the HTC Vive controller, as it allows grasping objects by closing the hand around the controller. However, the participants found using a single button for gripping the VDO more convenient than using finger tracking. From the usability testing results, we identified a problem associated with jittering when holding the VR controller in front of the virtual patient, since shaking due to tracking issues made it difficult to observe the eye fundus. To address this issue, we implemented a floating static VDO for the participants to use if they experienced any form of jittering.
The addition of the Oculus Quest controller and hand tracking in Stage 2, in combination with a larger sample size (n = 13), allowed us to obtain a statistical power of 51%, i.e., a 51% probability of detecting a true difference between the Oculus Quest controller and the Oculus Quest hand tracking. Stage 2 results are consistent with the findings presented in [
25], where the SUS scores were higher for the Oculus Quest controller than the Oculus Quest hand tracking. Interestingly, the Oculus Quest hand tracking resulted in a higher usability score than the Microsoft HoloLens 1 hand gesticulation. From the study observations in Stage 1, the participants moved within the virtual eye fundus examination room naturally and operated the VDO with ease, as they were able to move their arms and position the VR controllers at the correct height and distance from the virtual patient. Furthermore, when securing and holding the VDO, Stage 1 participants expressed a preference for the Valve Index controller, although this was not reflected in their SUS scores. For Stage 2, due to the online nature of the study, no direct observations were made. However, at the end of the study, the participants shared the difficulties they experienced. For example, when using the Oculus Quest controller and hand tracking, all participants reported dropping the VDO during the simulation, in some cases due to releasing the grip button or, when using hand tracking, due to the hands moving out of the camera sensors’ field of view. Dropping the VDO when using the controllers occurred because the participants expected the VDO to stay snapped to the hand after grabbing it, a relevant feature to consider for future work. While the Oculus Quest controller and hand tracking were well received, the hand tracking was not found usable when compared to the physical controller, with a SUS score below the accepted minimum of 68.00/100. The main issues reported by participants include concerns with respect to the stability and accuracy of the tracking, as well as the limited working space due to the Oculus Quest’s 100-degree horizontal and 75-degree vertical sensor field of view.
6.2. Task Workload
The RTLX indices show that utilizing physical controllers required lower effort than hand gesticulation when conducting the virtual eye examination. We believe that prior experience with VR and AR may have influenced the amount of perceived workload, thus requiring additional investigation in future work. While the participants struggled with hand gesticulation for the virtual examination from a usability perspective, this did not increase the workload. It is worth noting that task workload may be higher with novice users who are not familiar with VR or AR, in which case practice over time facilitates employing the controllers and hand gestures.
6.3. Task Difficulty
The virtual DO eye examination task difficulty ratings indicate a preference toward using the HTC Vive and Valve Index controllers. The Microsoft HoloLens 1 hand gesticulation was perceived to be more challenging due to the hand-tracking difficulties experienced by the participants. Participants also expressed difficulties associated with the field of view when employing the Microsoft HoloLens 1, as the clipping planes required them to maintain a certain distance from the patient, while in VR they were able to properly approach the patient for the virtual examination.
Finally, it is worth mentioning that, as input devices, the VR controllers and the Microsoft HoloLens 1 hand gesticulation differed considerably. While the Microsoft HoloLens 1 hand-tracking system performed poorly with respect to usability, the device was released at a stage when head-mounted hand tracking was novel. Although preliminary, lessons from this study indicate that a physical device is better suited for manipulating a tool such as the VDO, given the challenges experienced with hand tracking. While hand gestures are more natural, current tracking technologies require further development to properly capture the complexity and dexterity of the human hand at a consumer level before they can serve as effective user input methods.
While task difficulty ratings were not requested in Stage 2, the participants reported experiencing difficulties when examining some of the eyes and, more specifically, while trying to visualize the eye fundi. Additionally, on some rare occasions when the VDO was dropped, the participants were unable to recover it, requiring them to restart the application, in conjunction with ALVR, on both the computer and the Oculus Quest headset. Some participants also reported hand-tracking issues with either the Oculus Quest 1 or the Oculus Quest 2. Due to the online nature of the test, it was difficult to ensure the same conditions across all participants, particularly with respect to lighting conditions, which can affect the Oculus Quest’s tracking performance.
Every participant had a different opinion regarding which avatar was the most difficult to examine. One participant expressed that the floating eye was the most difficult because it lacked the patient’s face, others pointed to ’Lyette’s’ blinking and head movements, and those who found ’Jimothy’ the most difficult attributed this to the size of his eyes. Overall, all participants expressed that the floating eye did not provide relevant information for the practice and suggested redesigning the static VDO used with hand tracking so that the virtual hand can snap to it, better facilitating the examination when tracking issues occur.
7. Conclusions
This paper presented a comparative study of VR and AR interaction methods for a direct ophthalmoscopy training tool employing five user inputs. Our work, which is part of a larger research project focused on developing a greater understanding of immersive technologies in virtual eye examination, builds upon and expands previous research. The results of the current study have expanded our understanding of user input interaction techniques within a virtual (VR and AR) DO eye examination simulation. We compared several user-interaction techniques, including a traditional VR controller (the HTC Vive), a finger-tracking VR controller (the Valve Index), the more recent Oculus Quest controller, and the hand tracking available in the Microsoft HoloLens 1 and Oculus Quest headsets, with respect to usability, task workload, and difficulty.
The usability results indicate that physical VR controllers are regarded as a more practical and functional choice for virtual interactions. However, due to the small number of participants, no statistical difference between the Oculus Quest controller and the Oculus Quest hand tracking was observed. A larger sample size is needed to increase the 51% statistical power achieved with the current sample before more concrete conclusions can be drawn. With respect to the results obtained in Stage 1, the Microsoft HoloLens 1 hand tracking presented the participants with difficulties when utilizing hand gesticulation, where inaccurate gesture recognition and registration induced frustration, leading to higher SUS scores in favor of the HTC Vive and Valve Index controllers. These results are consistent with the findings presented in [
25]. Interestingly, while the Microsoft HoloLens hand gesticulation and the Oculus Quest hand tracking presented a simpler interaction, gesture-tracking issues, such as jittering and unreliable hand detection, prevented the users from examining the eye without problems.
The difficulty associated with locating the anatomical landmarks inside the virtual eye was consistent with the usability and workload results. Here, the input interactions provided by the HTC Vive and Valve Index were perceived as less difficult than the Microsoft HoloLens 1 hand gesticulation. However, in some cases, the hand tracking of the Oculus Quest 2 introduced jittering into the examination, leading participants to use the static ophthalmoscope in the scene.
As a result of our findings, we suggest that physical user input devices be employed when possible for DO eye examination since they can provide ease of use when performing 3D user interactions.
Future work will focus on the refinement of the virtual eye examination simulation to realistically simulate patient interactions with diverse eye-fundus-related conditions. In this case, adjustments will be made to the functionality of the virtual ophthalmoscope to allow for a more detailed inspection of the fundus, as well as providing additional means to interact with the eye model for compatibility across diverse platforms to enable remote participation if COVID-19 restrictions remain. Furthermore, development toward supplementary 3D-printed peripherals and mixed interactions employing physical controllers and hand tracking will be explored to enhance user experience and increase the realism of the virtual procedure. Finally, future work will also see a larger study that includes a greater number of participants and also examines retention, to further develop our understanding regarding the effects of 3D input interaction techniques within a virtual DO eye examination.