1. Introduction
With a changing climate and rising sea levels, coastal and riverine flooding is a growing concern across the world. With projected increases in the magnitude and frequency of flooding, understanding the risks and developing policies to address them is an integral part of urban planning. Visualizations play a crucial role in understanding and disseminating information from flood simulations and scenario modeling for planners, as well as in negotiating adaptation pathways among exposed stakeholders [1,2,3]. Given the institutional nature of flood risk management (FRM), most developed visualizations attempt to fit into the existing planning/risk management infrastructure. This integration makes the flood visualization domain particularly interesting, as the developed tools can be analyzed within the applied context of spatial analysis of risk and its communication to stakeholders.
Over the last decade, 3D visualizations of flood impacts have become increasingly prominent in scholarly literature [1,4,5]. These are mostly produced for risk communication purposes, often with an assumption that perspective 3D views of the landscape are easier for non-experts to interpret [6]. Although many developed tools are compelling, we still lack empirical studies to turn novelty and claims of improved understanding of data into demonstrable value for users. This trend has certainly been influenced by the increased generation and use of 3D data (e.g., LiDAR, structure-from-motion (SfM), building information modeling (BIM)), where the vertical characterization of space is more complex [7,8]. This has, in turn, increased both the need and the demand for software that can adequately represent topology in three dimensions and provide interactive viewing and querying capabilities. However, most viewing of and interaction with 3D content is currently mediated through 2D displays and windows, icons, mouse, pointer (WIMP) interfaces. This is significant because it eliminates binocular depth cues, forgoes the potentially invaluable opportunity to view, manipulate, and experience inherently 3D data in three dimensions, and restricts interaction to keyboard and mouse inputs.
Concurrently, researchers are investigating ways to leverage emerging 3D interfaces to improve interaction with, and perceptual experiences of, 3D data. Within the FRM domain in particular, mobile augmented reality tools have been developed to visualize flood impacts in situ (e.g., [9,10,11,12]) and ex situ (e.g., [13]), and immersive virtual environments have been created to visualize potential futures for coastal adaptation [11,14,15]. This growth in research interest has mirrored the development of a new generation of mixed reality (MR) interfaces that have the potential to alter, and possibly improve, our interaction with and understanding of complex 3D data. When discussing mixed reality tools, we largely follow the definition of an augmented reality (AR) system offered by Azuma [16], where the system combines virtual and real content, is registered in three-dimensional space, and is interactive in real time. The visualization solution described in this paper is distinct from other types of tangible interfaces, such as AR sandboxes, where physical objects are augmented with digital overlays of data (e.g., [17,18]). We use the term MR instead of AR for two reasons: first, AR systems often focus on augmenting real landscapes with virtual content, while our application is more environmentally agnostic (and thus sits further along the virtuality continuum of Milgram and Kishino [19]); second, the term MR has been used widely by researchers and developers to describe head-mounted display-based systems for AR/MR (e.g., [20,21]).
MR geographic visualization has been developing as a distinct domain over the last three decades, but we are now at a pivotal point where such tools are becoming usable enough to be introduced into routine work [22,23,24,25]. In this medium, real-world views can be augmented with spatially registered three-dimensional content [16]. With advances in display technology, processing power, cloud computing, computer vision, hand and eye tracking, registration, and occlusion management, these tools provide numerous opportunities for the development of alternative data interfaces. This moment presents a unique opportunity for researchers and practitioners to evaluate their application in spatial data practice.
While this trend is sparked by the availability of new devices, the interest in emerging interfaces is not about the specific hardware or software. Mixtures of content and narrative are mediated through user interfaces, displays, and input/output channels to deliver unique perceptual experiences of spatial data for the user. Each of the components making up an interface between the underlying data and the user has the ability to influence understanding of the phenomena, whether in terms of the topology of risk (e.g., flood extents and depths) or the associated narrative (e.g., risk perception, willingness to act) [26]. The preceding introduction and commentary reveal that interfaces are far from being just novel display devices or interaction systems. They are multifaceted perceptual and experiential relationships between humans and phenomena, mediated by the data that represent them, the visualizations that attempt to convey them, and the interfaces that mediate this exploration. Mixed reality interfaces, in particular, are a promising tool to improve user interaction with three-dimensional spatial data due to the combination of visual, sensory-motor, and proprioceptive feedback in the interaction with virtual objects in real spaces [27,28]. Proprioception refers to a person’s awareness of their own body in space/environment, which is preserved when using MR tools [28]. This multi-sensory nature of the interface may improve comprehension of, and interaction with, complex 3D data. With hand tracking, we can develop interfaces that leverage a user’s knowledge of interacting with real objects to manipulate virtual content, potentially simplifying interaction with complex 3D content (compared to a WIMP interface).
Furthermore, mixed reality interfaces offer distinct and significant features for applied spatial data practice, especially when it comes to collaborative tasks in a shared environment. In particular, fully virtual environments (VR) require dedicated open spaces that the legacy spaces in which most routine work happens rarely provide, whereas MR systems may deliver most of the benefits of virtual environments (e.g., immersion, binocular 3D, natural user interfaces) with more flexible integration into workspaces. MR tools preserve the ability to see and interact with other people and the surrounding environment, and to interact with non-MR tools (e.g., paper maps, sketches), without the need to exit the interface. Numerous researchers have recognized this potential over the years [22,28,29,30,31]. Much past research has focused on overcoming technical hurdles in implementing MR systems. While current MR devices are not yet ubiquitous, much of the development infrastructure needed to create usable visualizations exists. This presents exciting opportunities for researchers to develop and evaluate emerging platforms for their ability to deliver meaningful and useful interaction with rigorous spatial data. Furthermore, much conceptual work is needed to understand the role of the various components of the MR interface (data, display, interaction, visualization approaches) in mediating the user’s understanding of data and associated phenomena.
This paper sits at the intersection of evolving modes of flood risk analysis and communication, and emerging interface technology. Its objective is to report on an applied mixed reality FRM visualization system and then unpack the interplay between interface capabilities, informational experiences grounded in FRM practice, and contemporary workspaces. The sections that follow describe the workflows through which we explored the feasibility of developing MR flood risk visualization tools; the resulting visualization interfaces; critical reflection on and review of these systems from the perspectives of their performance, usability, and potential as operational tools; and their potential to integrate with current and future spaces of FRM practice. In the first of these, we report on the design and development of a set of prototypes built to demonstrate the possibilities of single-user and collaborative MR flood visualizations. Using the case study of flood risk management along the shore of the Fraser River in Vancouver, we develop 3D visualizations of the area, associated impacts, and potential mitigation infrastructure. By integrating this visualization into the state-of-the-art mixed reality system HoloLens 2, we aim to understand the usability of such tools and highlight how distinct aspects of the interface alter the perceptual outcomes of 3D visualization. Informed by this experience, we present a discussion of the potential concerns for integration of MR visualizations into practice. Ultimately, this effort seeks to assess MR tools for their potential to improve interaction with, and the understanding and communication of, flood risks through visualization by planners, decision-makers, and stakeholders.
3. Results
Based on the workflow presented above, we developed two visualization prototypes for single-user and collaborative MR visualization of flooding and associated adaptation scenarios. In the sections that follow, we present the user experience and reflections on hardware performance and usability. By unpacking the developed prototypes through multiple lenses, we highlight the state-of-the-art capabilities of MR as realized within our prototypes.
3.1. Developed Applications
As mentioned above, we developed a single-user version to demonstrate the capabilities of current MR devices, while also developing a collaborative prototype with minor changes. Two versions were created because some functionality of the single-user application could not be realized (given our development resources) in a shared version. Specifically, in the shared version, content scaling was disabled and the hand-bound content menu was moved into the environment. The latter can also be considered advantageous to the user experience, as the state of the menu is displayed next to the 3D model for both users. Below, we discuss the user experience and capabilities of the developed visualizations.
When a user launches the application, the digital content (3D visualization, text panel, conceptual drawings) is presented in front of them. The text panel provides a brief description of the visualization and the interaction, along with the legend for the floodplain depth layer. Within the single-user application, gesture guidance is provided at startup, describing how to move content (pinch and hold), bring up the menu (raise an open palm), and toggle the guidance off with a switch on the text panel (pinch/air tap). The 3D visualization itself can be moved and scaled freely in space and persists in a specific real-world location. The text and conceptual drawing panels provide contextual information related to the presented flood visualization, and serve relevant information when a user chooses a specific adaptation scenario. When a user selects one of the four adaptation scenarios, relevant changes to the 3D content appear (e.g., a shore dike is displayed), the textual information switches to describe the pros and cons of that adaptation approach, and the conceptual drawings show artistic sketches of the future shore layout. This ability to dynamically explore the spatial and policy implications of a particular adaptation approach mirrors the role of maps and other 3D visualization tools designed to understand and communicate risks and relevant mitigation policies [1,3,26]. Another aspect of FRM in the City of Vancouver is the development of setback policies to preserve shore areas for potential adaptation infrastructure. The 3D splines representing setbacks from the shore are available to the user in the menu, and the conflicts between the proposed policy and existing buildings can be seen.
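To illustrate how this kind of scenario switching can be wired together in Unity, the following C# sketch toggles scenario-specific 3D geometry and updates the text and drawing panels in one call. It is a minimal illustration rather than our actual implementation; the component and field names (e.g., ScenarioController, AdaptationScenario) are hypothetical.

```csharp
using UnityEngine;
using UnityEngine.UI;

// Hypothetical controller sketch: a scenario choice drives synchronized
// changes to the 3D content, the text panel, and the conceptual drawing panel.
public class ScenarioController : MonoBehaviour
{
    [System.Serializable]
    public class AdaptationScenario
    {
        public string label;                   // e.g., "Shore dike"
        public GameObject scenarioGeometry;    // dike geometry, setback splines, etc.
        [TextArea] public string description;  // pros/cons text for the panel
        public Sprite conceptualDrawing;       // artistic cross-section sketch
    }

    public AdaptationScenario[] scenarios;     // the four adaptation options
    public Text descriptionPanel;              // text slate aligned to a wall
    public Image drawingPanel;                 // conceptual drawing slate

    // Called by the menu buttons (e.g., wired to MRTK Interactable OnClick events).
    public void SelectScenario(int index)
    {
        for (int i = 0; i < scenarios.Length; i++)
        {
            scenarios[i].scenarioGeometry.SetActive(i == index);
        }
        descriptionPanel.text = scenarios[index].description;
        drawingPanel.sprite = scenarios[index].conceptualDrawing;
    }
}
```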
Since the environment is mapped by the device using its array of sensors, occlusion management is performed on the fly in a given environment. This map of the environment is also used to align the virtual slates containing text and drawings to real-world walls. If the alignment to walls does not make sense in a given environment (or the space is poorly mapped), it can be overridden by rotating the slate with a two-handed manipulation. This flexibility of content scaling, movement, and alignment enables integration of the visualization across a range of environments, from a single user’s desk to a room-scale setting.
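The wall alignment described above can be approximated with a raycast against the spatial mapping mesh. The sketch below is a simplified stand-in for our implementation and assumes the spatial mesh is placed on the “Spatial Awareness” physics layer (the MRTK default).

```csharp
using UnityEngine;

// Minimal sketch: orient a slate (text/drawing panel) flush with the nearest
// wall by raycasting from the user's head against the spatial mapping mesh.
public class AlignSlateToWall : MonoBehaviour
{
    public Transform head;            // typically the main camera transform
    public float maxDistance = 3f;    // how far to search for a wall
    public float wallOffset = 0.02f;  // keep the slate slightly off the surface

    int spatialMeshMask;

    void Start()
    {
        // Assumes the spatial mesh lives on the "Spatial Awareness" layer (MRTK default).
        spatialMeshMask = LayerMask.GetMask("Spatial Awareness");
    }

    public void SnapToWall()
    {
        RaycastHit hit;
        if (Physics.Raycast(head.position, head.forward, out hit, maxDistance, spatialMeshMask))
        {
            // Only snap to roughly vertical surfaces (walls, not floors or tables).
            if (Mathf.Abs(Vector3.Dot(hit.normal, Vector3.up)) < 0.3f)
            {
                transform.position = hit.point + hit.normal * wallOffset;
                // Forward points out of the wall, into the room; flip the sign
                // if the panel's readable face is oriented the other way.
                transform.rotation = Quaternion.LookRotation(hit.normal, Vector3.up);
            }
        }
    }
}
```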
Two movable clipping planes placed orthogonally to the content give the user a simple tool to define extents and query the 3D geometry of the visualization along the clipping plane axis (i.e., a transect). The resulting “slice” of the landscape is similar to the cross-section of the shore displayed in the conceptual drawings panel. We designed this capability to provide a simple way to query the 3D geometry of the shore, while also providing visual correspondence to the “slices” of the shore in the conceptual drawings.
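MRTK 2.x ships clipping primitives that make such a transect tool straightforward to assemble. The sketch below shows one plausible setup, assuming the landscape materials use the MRTK Standard shader (which supports clipping); the manipulation handles and exact configuration of our prototype are omitted.

```csharp
using UnityEngine;
using Microsoft.MixedReality.Toolkit.Utilities;

// Sketch: register the landscape renderers with an MRTK ClippingPlane so that
// dragging the plane "slices" the 3D shore model along a transect.
// Assumes MRTK 2.x and materials based on the MRTK Standard shader.
public class TransectClipping : MonoBehaviour
{
    public ClippingPlane clippingPlane;    // movable plane (e.g., via an ObjectManipulator)
    public Renderer[] landscapeRenderers;  // terrain, floodplain, buildings

    void Start()
    {
        foreach (Renderer r in landscapeRenderers)
        {
            clippingPlane.AddRenderer(r);
        }
    }
}
```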
The shared application provides the capabilities of mixed reality visualization in a co-located, synchronous, interactive collaborative setting. In terms of actual user experience, the only difference from the single-user application is the need to move the anchor (a virtual cube to which the content is attached) to a position with sufficiently complex real-world geometry (i.e., not just mid-air; e.g., on a table corner). After placing the cube, the user can use virtual buttons to start the Azure session, create the anchor, and share it to the network. At this point, the anchor cube is locked in space and cannot be moved. The second user then starts an Azure session on their device and retrieves the shared network anchor. The position of the anchor cube is then identical for both users, and the virtual coordinate systems of the co-located users are synchronized, meaning the virtual content appears at the same real-world location. Once both users establish a common coordinate system, the 3D content position, rotation, scale (fixed), and scenario state are all synchronized in real time across users, allowing them to see and share visual information from their own perspective and position in a shared mixed reality environment. This ability to experience and interact with data collaboratively can help build shared mental models of environmental risks, risk reduction options, and spatial policy based on a common experience of 3D visualizations. Furthermore, this MR application setup preserves most of the rich context available to co-located collaborators: the ability to see and interact with the surrounding workspace, to talk, and to see a peer’s body language and gestures [22]. This setup was tested with two users, but it is scalable to more users.
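Conceptually, the synchronization reduces to exchanging a small, anchor-relative state payload once both devices resolve the shared anchor. The sketch below illustrates that payload and how it could be captured and applied; it deliberately omits the Azure Spatial Anchors and networking calls used in the prototype, and all names are illustrative.

```csharp
using UnityEngine;

// Illustrative sketch of the minimal state exchanged between co-located devices
// once a shared spatial anchor has been established. Poses are expressed
// relative to the anchor so each device can reconstruct them in its own
// coordinate system.
[System.Serializable]
public struct SharedVisualizationState
{
    public int scenarioIndex;        // currently selected adaptation scenario
    public Vector3 localPosition;    // content position relative to the anchor
    public Quaternion localRotation; // content rotation relative to the anchor
}

public class SharedStateSync : MonoBehaviour
{
    public Transform sharedAnchor;   // anchor cube resolved on both devices
    public Transform content;        // root of the 3D flood visualization

    // Capture the current state on the sending device.
    public SharedVisualizationState Capture(int scenarioIndex)
    {
        return new SharedVisualizationState
        {
            scenarioIndex = scenarioIndex,
            localPosition = sharedAnchor.InverseTransformPoint(content.position),
            localRotation = Quaternion.Inverse(sharedAnchor.rotation) * content.rotation
        };
    }

    // Apply a received state on the other device; the scenario index would then
    // be forwarded to the scenario menu logic.
    public void Apply(SharedVisualizationState state)
    {
        content.position = sharedAnchor.TransformPoint(state.localPosition);
        content.rotation = sharedAnchor.rotation * state.localRotation;
    }
}
```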
3.2. Hardware Performance
In this section, we reflect on device performance in terms of processing, the robustness of spatial mapping, and hand and eye tracking.
Processing demands across the single-user and shared versions were practically identical, given that much of the processing power is spent on loading the 3D visualization. Notably, Figure 5 illustrates that the application utilizes almost 100% of the single-core GPU capacity of the device, with framerates being fairly stable in the range of 50–60 frames per second. Since we attempted to optimize content to utilize as much of the local processing as possible, this demonstrates the limits of current state-of-the-art devices. We are slightly above the recommended limit of 100 thousand polygons for the device, with the final model being at ~106 k polygons. It is important to note that local device limitations should not restrict applications to simple 3D content, low resolution, or a small areal footprint. With remote rendering on a machine within a local network (Remote Rendering) or with cloud rendering (Azure Remote Rendering), HoloLens-based tools can fit tens of millions of polygons, which is especially relevant for large/complex spatial datasets.
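When optimizing content toward the ~100 k polygon budget mentioned above, a small helper for auditing triangle counts is convenient. The component below is a hypothetical convenience sketch, not part of the published workflow.

```csharp
using UnityEngine;

// Convenience sketch: sum the triangles of all meshes under the visualization
// root and compare the total against a local-rendering budget
// (HoloLens 2 guidance is on the order of 100k polygons).
public class PolygonBudgetCheck : MonoBehaviour
{
    public Transform visualizationRoot;
    public int budget = 100000;

    [ContextMenu("Count Triangles")]
    void CountTriangles()
    {
        int triangles = 0;
        foreach (MeshFilter mf in visualizationRoot.GetComponentsInChildren<MeshFilter>())
        {
            if (mf.sharedMesh != null)
            {
                triangles += mf.sharedMesh.triangles.Length / 3;
            }
        }
        Debug.Log($"Total triangles: {triangles} (budget: {budget})");
    }
}
```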
The mixed reality display on HoloLens 2 has a fairly limited field of view, a limitation inherent to all current head-mounted mixed/augmented reality devices. As a result, much of the peripheral view is not augmented, which affects immersion and limits the “virtual real estate” that can be used without requiring the user to move their head. Another notable limitation of this device is the brightness of current displays: the device becomes practically unusable in bright environments (e.g., those lit by direct sunlight).
Spatial mapping was satisfactory for our goals of occlusion management, digital content persistence, and alignment of virtual slates to walls. The default update rate of the spatial mesh of the environment in MRTK is once every 3.5 s. We increased the update rate to once per second, which resulted in better performance of the above-mentioned features without an apparent performance penalty. There is still room for improvement, especially in environments with complex geometry/shadows. Nevertheless, the spatial mapping of the environment and the stability of digital content in real space are robust in a well-lit environment, which is especially impressive given the lack of any external sensors or fiducial markers.
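In MRTK 2.x the update interval is normally set in the spatial awareness mesh observer profile; the runtime sketch below shows one way the same change could be made in code, assuming the standard MRTK data provider access pattern. Treat it as an assumption-laden illustration rather than the configuration used in our prototype.

```csharp
using Microsoft.MixedReality.Toolkit;
using Microsoft.MixedReality.Toolkit.SpatialAwareness;
using UnityEngine;

// Sketch: tighten the spatial mesh update interval from the MRTK default
// (3.5 s) to once per second at runtime.
public class SpatialMeshUpdateRate : MonoBehaviour
{
    void Start()
    {
        var access = CoreServices.SpatialAwarenessSystem as IMixedRealityDataProviderAccess;
        if (access == null)
        {
            return;
        }

        foreach (var observer in access.GetDataProviders<IMixedRealitySpatialAwarenessMeshObserver>())
        {
            observer.UpdateInterval = 1.0f; // seconds between spatial mesh updates
        }
    }
}
```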
Hand tracking performance on HoloLens 2 is difficult to characterize without reference to other tracking setups. In our experience, the tracking is not at an “appliance level” of usability. After initial adaptation to the idiosyncrasies of hand tracking (e.g., the hand needs to be a certain distance away) and interaction (i.e., gestures and buttons need to be pressed much further than one would expect based on visual feedback), the accuracy of tracking is satisfactory/usable, but still has substantial room for improvement.
Despite our limited use of eye-tracking, we need to acknowledge the almost uncanny accuracy of this capability of the HoloLens 2. The tracking is practically flawless, which is especially exciting for potential approaches to evaluating user interfaces in mixed reality based on rich, articulated eye-tracking data, beyond a simple gaze ray from the center of the user’s camera/head.
The performance of the shared application in synchronizing coordinate systems and content state across two devices was satisfactory, with little (<100 ms) lag. Establishing the anchor to share the coordinate system requires a sufficiently complex scanned real environment. If the anchor is placed on a fairly uniform surface (e.g., an empty table) or in mid-air, the resulting coordinate synchronization is inaccurate and can be off by 50+ cm. Since both content and coordinate synchronization rely on networked services, local Wi-Fi overload, poor signal, or low speed may increase the delay between the two users.
3.3. Usability
To understand and unpack the usability of the developed MR visualization tools, we used Vi et al.’s [34] framework of 11 MR user interface heuristics. This set of design guidelines was developed with the capabilities of head-mounted systems in mind and provides a useful framework to discuss the user interface design decisions made.
- 1. Organize the spatial environment to maximize efficiency
The ability of MR interfaces to map the physical environment of a user enables integration of virtual content and physical space. By placing virtual objects on real surfaces (the truest form of AR, according to Azuma [16]), we leverage the human capacity for spatial reasoning and a sense of one’s own body in space, through strong proprioceptive cues, to interpret virtual content. This is accomplished by occluding virtual content with real surfaces, as well as by aligning information panels to physical walls. This set of capabilities makes the application adaptable to complex office environments. We actively tried the MR applications in several spaces to see how they performed visually, spatially, and cogently in (and with) different environments. We tested both the shared and single-user versions in office, formal conference, and informal co-working spaces (Figure 6). The on-the-fly spatial sensing/mapping of the device supported impressive agility and flexibility in adapting to different environments. Furthermore, the robustness of the spatial awareness capabilities allowed movement of content from one space (a meeting room) to another (an open office area) without a loss of tracking or synchronicity of content placement in the shared version.
For instance, we can place the Fraser shore visualization on a table and the information panels on a wall (e.g., Figure 6, bottom-left). By leveraging the real environment of the user, we provide a set of visual and proprioceptive cues that help to convey the scale of virtual objects and their relative positions [28]. From a user experience perspective, it might be easier to automatically “snap” content to detected surfaces, which is possible through the device’s semantic understanding of the environment (“scene understanding”), but this was not realized here due to technical complexity.
- 2. Create flexible interactions and environments
We sought to leverage the hand tracking capabilities of the HoloLens 2 to provide intuitive/natural interaction with virtual objects that mimics interaction with real objects. Beyond the ability to manipulate content directly with their hands, users can use a virtual ray to grab distant objects. The ability to move, scale, and rotate content as desired makes the visualization adaptable to a given environment.
- 3. Prioritize user’s comfort and 5. Design around hardware capabilities and limitations
Content placement was guided by the desire to make interaction and viewing comfortable for the user, without intruding into personal space or requiring excessive movement, which is realized through the ability to interact with content at a distance. Furthermore, the default content placement at approximately 1.25 m in front of the user keeps hand interactions within the view of the device cameras used for hand tracking. To accommodate the limited field of view of the MR displays on HoloLens, content was placed compactly so as to minimize the user’s need for neck movement during use. Processing limitations of the device were addressed by optimizing the spatial extent of the Fraser shore visualization.
- 4. Keep it simple: do not overwhelm the user
To keep the user focused on the flood impacts, adaptation, and associated policy implications, the UI design is minimal and includes only features directly relevant to the displayed content. There is also a clear correspondence in the results of interaction, where a user’s choice of scenario is reflected in a simultaneous change in the relevant conceptual drawing, text, and 3D content.
- 6. Use cues to help users throughout their experience and 8. Build upon real world knowledge
Once the user launches the application, the first thing appearing in the field of view is the text panel describing the visualization contents and interaction. Within the single-user version, users are also presented with gesture guide animations for opening the content menu, moving content, and distant clicking (air tap) to disable the guidance. The subtle use of eye-tracking to show text prompts and highlights at the user’s gaze position also seeks to guide the user through the interaction.
- 7. Create a compelling XR experience
This set of MR visualization prototypes seeks to leverage the existing information related to flood visualization to provide a complete understanding of the flooding phenomena. We used most of the available information related to shore adaptation of the area within the visualization, and drew on the capacities of the MR interface (as discussed throughout) to provide an engaging, simple-to-use tool for interacting with spatial data. While the prototypes we developed are certainly compelling to experience and use, we anticipate that spatial data users will expect to be able to use much larger geographic datasets, based on their GIS experience. This can be accomplished with off-device rendering. There are other aspects of the MR interface that could especially highlight the potential of interactive MR environments for data exploration, particularly data with a more complex characterization of 3D space and dynamic content (e.g., the animated output of a flood simulation).
- 9. Provide feedback and consistency
When users interact with content, they receive visual, audio, and proprioceptive feedback based on their interactions. For instance, when a user chooses a particular scenario in the menu, the associated radial button changes color, a clicking sound is played, and the content is changed. We sought to provide users with feedback on how the device sees their hands and gestures, so we kept the visualization of the hand mesh observed by the device switched on, so that the user sees what the device sees (Figure 7). The interaction across different content is consistent, with single-handed interaction moving content, and two-handed manipulation used for scaling and rotating (as well as moving) virtual objects.
- 10. Allow users to feel in control of the experience
The displayed content is inert when a user launches the app (apart from the hand guidance, which is animated but fixed in space). This means that content changes state or moves only in response to explicit interaction by the user. While good in theory, in practice some general hand movements were recognized as gestures by the device, leading to unexpected movement of content. This is not a persistent feature of hand tracking, but rather a noticeable “accidental” limitation when using the application for a prolonged time.
- 11. Allow for trial and error
The only error that critically affected the experience and required a restart of the application was the accidental movement of content behind a physical object/surface/wall. Due to the nature of MR interfaces and the management of occlusion, content can sometimes be practically “lost” in physical space, such as behind a wall (i.e., users cannot see or interact with it). This movement of content behind walls is likely fixable through the addition of colliders to walls and virtual objects, but in our case this resulted in unexpected behavior (virtual content bouncing off surfaces and flying around the room). We wanted to implement the ability to reset the visualization to its default position, but restarting a scene with MRTK components in Unity is not straightforward (see [48]) and this was not implemented due to practical time constraints (a lighter-weight alternative is sketched below).
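A lighter-weight alternative to restarting the scene is to cache each object's default pose at startup and restore it on demand (e.g., via a menu button or voice command) when content is lost behind a wall. The sketch below illustrates this idea; it is not the approach implemented in the prototype.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch: instead of reloading the Unity scene, cache the default pose of each
// movable piece of content at startup and restore it on demand.
public class ResetContentPose : MonoBehaviour
{
    public Transform[] movableContent;   // visualization root, slates, menu

    readonly List<Pose> defaults = new List<Pose>();

    void Start()
    {
        foreach (Transform t in movableContent)
        {
            defaults.Add(new Pose(t.position, t.rotation));
        }
    }

    // Hook this up to a menu button or voice command.
    public void ResetAll()
    {
        for (int i = 0; i < movableContent.Length; i++)
        {
            movableContent[i].SetPositionAndRotation(defaults[i].position, defaults[i].rotation);
        }
    }
}
```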
4. Discussion
This section offers critical reflection on, and review of, these systems from the perspectives of their performance and potential as operational tools and their potential to integrate with current and future FRM and planning practice, and, finally, theorizes their significance as data interfaces.
Devices that can deliver usable 3D visualizations with natural user interfaces robust enough to support everyday information science work have appeared only recently and, while there is much room for improvement, they provide distinct and compelling experiences of interacting with 3D data [25]. The growth of dedicated development frameworks and communities significantly reduces the complexity of developing MR experiences. While contemporary systems have their limitations, we are at a critical juncture where MR systems are becoming usable enough to focus on applied problems. With decreasing barriers and streamlined integration of geospatial data into MR interfaces, these tools can become a meaningful addition to the planner’s toolkit to investigate topologies of impacts, explore datasets across scales, and understand the interplay between inundation scenarios and proposed adaptation policies.
With the capabilities of HoloLens 2, we can develop flexible collaborative flood visualizations that can be used within real offices without the need for dedicated spaces (as required for VR) or specialized knowledge for interaction. This work demonstrates a practical workflow and seeks to highlight the substantial infrastructure available to build powerful MR tools without extensive development experience. The developed prototypes demonstrate only a particular case of ex situ and, in the case of the shared version, co-located synchronous MR. Many researchers are also investigating in situ visualizations of flood impacts using MR/AR [11,15]. This range of applications highlights the significant potential these tools can have for analyzing and responding to flooding risks, as well as for providing compelling environments that deliver on-site information to a broader set of stakeholders (e.g., decision-makers, businesses, residents). At the same time, we see massive potential in how MR visualization can transform ex situ flood scenario visualization used to understand impacts and adaptation based on the available data. Although this work focuses on collaboration in a shared physical environment, the possibilities for remote collaboration using emerging interfaces could qualitatively change how risks are understood and managed, given the potential for remote collaborators to develop robust, shared mental models of risks and possible adaptation based on interactive 3D visualizations.
The visualization development process outlined here was guided by the datasets available for flood risk management in the local context. Within the developed tools (and underlying datasets), the third dimension is used only to display elevation information at a location (ground elevation, flood depth, building height), without much vertical complexity in the data. However, to realize the potential of 3D displays and natural user interfaces, we need an integration of data with more complex 3D characterizations of space. With the increasing use of truly 3D data, such as LiDAR, 3D models derived from structure-from-motion, and BIM, to characterize urban landscapes and structures, the added value of MR visualization and interaction over a WIMP interface will likely become more significant. This can result in a richer analytical experience, as well as a more accurate practical understanding of the potential impacts of flooding (e.g., [7,8]).
To integrate these tools meaningfully into planning generally, and flood risk management in particular, we need much more empirical work to understand which aspects of mixed reality interfaces provide value for the user. The current moment presents numerous research opportunities to investigate these tools for spatial data practice as they become widely available and used across many industries. However, it is not clear how to evaluate tools developed for complex tasks and goals, such as exploring and supporting policy discussion. The simple usability metrics and task completion times typically used to compare interfaces do not capture the perceptual outcomes, nor the potential of MR tools to engage a broader set of users in exploring geospatial data (i.e., without the complexities of a desktop GIS).