Before discussing the materials and methods, this section provides a general description of the considerations underpinning the user experience of the PARS platform developed in this proof-of-concept study. Specifically, the platform employs 360-degree panoramic images augmented with safety data that the trainee engages through active exploration. Trainees practice identifying hazards over three sessions (Training, Assessment, and Feedback), with the objective of learning safety concepts, testing that knowledge, and receiving feedback on their performance.
4.1. Platform Architecture and Data Management
The safety training platform contains three distinct layers: application, service, and hardware (Figure 1). The layers represent the basic elements required by the platform to function.
The trainee only has access to the application layer, where all the training interactions occur. In this layer, the trainee observes and identifies hazards, and then receives feedback about the hazard-recognition tasks. The application layer includes two functional blocks: the hazard identification panel (HIP) and the 360-degree scenes. The HIP serves as the interactive space through which the user engages with the platform. The 360-degree scenes comprise renderings of 360-degree panoramic images and layers of safety information in the form of augmentations. These augmentations include annotations such as data, objects, animations, or sounds.
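To make the composition of these elements concrete, the following is a minimal sketch of how a scene and its augmentation layers might be modeled. The platform itself is implemented in Unity3D, so the Python classes and field names below are purely illustrative assumptions rather than the platform's actual structure:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Augmentation:
    """A layer of safety information superimposed on a panorama (hypothetical fields)."""
    kind: str          # "data", "object", "animation", or "sound"
    yaw: float         # horizontal position on the sphere, in degrees
    pitch: float       # vertical position on the sphere, in degrees
    payload: str       # e.g., annotation text or a path to a media asset

@dataclass
class Scene:
    """A 360-degree panoramic image plus its safety augmentations."""
    panorama_path: str
    augmentations: List[Augmentation] = field(default_factory=list)
```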
The service layer consists of the digital tools employed by the application layer to enable and support the platform’s activities. Specifically, the platform utilizes the Unity3D® (Unity Technologies, San Francisco, CA, USA) game engine and a database that contains trainee information for each session (e.g., time spent reading information, interactions with the interface, hazard selections, etc.). Unity3D® is the middleware upon which this study developed the platform. The service layer employs the JavaScript Object Notation (JSON) data format to build the database of trainee information, as it is easy to interpret, requires minimal setup, and can be accessed on any type of device. The locally stored data in the database captures the trainee’s interactions and selections during the training and assessment sessions, and enables instantaneous feedback by automatically processing the collected data in the feedback session. Additionally, this data permits the research team to retrospectively analyze the trainee’s interactions and selections while using the platform, gaining insights into hazard identification activities and platform usage patterns. The last layer contains the hardware devices that allow the trainees to physically engage with the platform. Currently, the platform supports visualization on a monitor, tablet, or smartphone, and functions as standalone, locally executed software.
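As an illustration of the kind of record such a database might hold, consider the following hypothetical example; the actual field names and schema used by the platform are assumptions, not reported details:

```python
import json

# Hypothetical example of a locally stored trainee record; the actual
# schema used by the PARS platform is not reported in the paper.
record = {
    "trainee_id": "T-014",          # anonymized identifier from Demographics
    "session": "training",          # "training", "assessment", or "feedback"
    "scene": 3,                     # index of the current 360-degree image
    "time_reading_s": 42.7,         # time spent reading hazard information
    "interactions": ["hotspot_click", "card_open", "camera_drag"],
    "hazard_selections": ["unprotected_edge", "untied_worker"],
}

# JSON is easy to interpret, needs minimal setup, and is portable across devices.
print(json.dumps(record, indent=2))
```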
As illustrated in Figure 2, the data management of the platform proceeds according to an activity-based unified modeling language (UML) diagram that runs within the Unity3D application and is supported by the database.
To allow the user to learn and receive information about construction safety hazards, each successive interaction with the game triggers a data transfer between the Training, Assessment, or Feedback Session and the database. Each of these sessions drives the tasks required from the trainee at different stages of the software utilization. Upon software initialization, the trainee sees a welcome screen as part of the “Game Start” action. Next, the “Demographics” action asks the trainee to input anonymized identification information into the system; this step enables the software and the research team to track the trainee utilizing the platform at a given time. In the Training Session, the “Training Instructions” action presents the trainee with brief written instructions for the tasks to be achieved. Subsequently, two concurrent actions occur in the PARS platform: “User Training” and “Training Data Serialization”. The “User Training” action proceeds with the hazard-discovering tasks that the trainee is required to complete in each 360-degree image. As the trainee interacts with the platform, data is recorded by the “Training Data Serialization” action, which encodes the recorded training data into JSON format to facilitate its transfer to, storage in, and retrieval from the database.
In the Assessment Session, represented by the “User Assessment” and “Assessment Data Serialization” actions, trainees first utilize an interface to identify hazards in each 360-degree image (“User Assessment”); simultaneously, the data inputs are serialized into JSON in the “Assessment Data Serialization” action. The recorded hazard data is then automatically graded during the “Data Verification” action using a set of defined answer keys stored within the platform. The data is then transferred to the database for storage in the “Assessment Data” action for later processing. The trainee traverses a series of 360-degree images until a final scene is reached, where the trainee is prompted to move into the Feedback Session. In the Feedback Session, the “User Feedback” action retrieves the assessment data from the database along with the defined answer keys on the platform and compares them to populate a feedback interface with the correct and incorrect answers. Once the trainee reaches this screen, the game is complete. In the “Game Completion” action, the user receives a message to restart the “Game Start” action for the next user or to end the game, exiting the application.
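A minimal sketch of how the “Data Verification” grading step might work, assuming the answer keys and trainee selections are simple sets of hazard labels (the function name and hazard labels are hypothetical, not the platform's API):

```python
# Hypothetical sketch of the "Data Verification" grading step: trainee
# selections for one scene are compared against that scene's answer key.
# Note: the platform's grading also credits correct rejections of hazards
# that are not present; this sketch tracks identifications only.
def verify_scene(selections: set[str], answer_key: set[str]) -> dict:
    return {
        "correct": sorted(selections & answer_key),    # selected and present
        "incorrect": sorted(selections - answer_key),  # selected but not present
        "missed": sorted(answer_key - selections),     # present but not selected
    }

# Example: two correct selections, one incorrect selection, one missed hazard.
result = verify_scene(
    selections={"unprotected_edge", "untied_worker", "exposed_wiring"},
    answer_key={"unprotected_edge", "untied_worker", "floor_hole"},
)
print(result)
```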
By utilizing the architecture described in this section and following the UML data management procedure, trainees can use the Unity3D application to learn about, be assessed on, and receive feedback regarding the safety hazards hosted in the 360-degree augmented panoramas of reality and the database. Although trainees experience a pre-assembled set of content that they cannot modify directly, potential future training creators can access the Unity3D application to add, manipulate, or replace the 360-degree images and the layers of information augmented in each scene (e.g., text, objects, animations, etc.), developing their own customizable experience. This provides great flexibility for the proposed proof-of-concept, allowing content designers to potentially explore other educational and training alternatives beyond safety-related topics by simply changing the scope and materials loaded in the platform.
4.2. 360-Degree Panoramas: Capture, Visualization, and Augmentation
Assembling 360-degree panoramas requires capturing images of the real environment to populate the virtual environment, as illustrated in Figure 3. 360-degree image capturing entails the creation of an equirectangular projection. To obtain this 360-degree capture as a 2D projection, a panoramic camera with multiple fish-eye lenses is used (e.g., Ricoh Theta V, Ricoh Company, Ltd., Tokyo, Japan; Insta360 One, Shenzhen Arashi Vision Co., Ltd., Shenzhen, China; Samsung Gear 360, Samsung Group, Seoul, South Korea). Alternatively, multiple shots from a traditional camera (DSLR or mirrorless) can be stitched to create an equivalent equirectangular image. In both approaches, the equirectangular projection requires the use of computer software to stitch each individual image into a single picture; the software resolves the distortions introduced during the capturing process and maps the 360-degree spherical coordinates onto planar coordinates. Subsequently, the game engine (such as Unity3D®) remaps the equirectangular image onto spherical coordinates to render the 360-degree image.
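The following sketch illustrates, in Python rather than the game engine's own code, the coordinate remapping described above: normalized equirectangular image coordinates are converted to spherical angles and then to a 3D view direction. The function names and conventions are illustrative assumptions:

```python
import math

# Illustrative sketch (not the platform's code): the mapping a game engine
# effectively performs when it wraps an equirectangular image onto a sphere.
def equirectangular_to_spherical(u: float, v: float) -> tuple[float, float]:
    """Map normalized image coordinates (u, v) in [0, 1] to spherical angles.

    Returns (longitude, latitude) in radians: longitude in [-pi, pi],
    latitude in [-pi/2, pi/2].
    """
    lon = (u - 0.5) * 2.0 * math.pi   # horizontal axis spans 360 degrees
    lat = (0.5 - v) * math.pi         # vertical axis spans 180 degrees
    return lon, lat

def spherical_to_direction(lon: float, lat: float) -> tuple[float, float, float]:
    """Convert spherical angles to a unit view direction (x, y, z)."""
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return x, y, z
```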
In the produced 3D virtual environment, trainees can explore the images to observe focus areas in detail. The augmentation process is performed by the training creator using the Unity3D game engine application software, in which data, objects, animations, or sounds can be superimposed onto the 360-degree panoramas, augmenting the information displayed by importing these graphical or auditory assets into the scenes. In this study, the purpose of these augmentations is to communicate safety concepts using supplementary features that enhance users’ understanding of a written description from OSHA’s manuals. The resulting augmented 360-degree panoramic scenes can be transferred to different devices for visualization. This process provides trainees with access to the 360-degree panoramic imaging on a variety of devices such as PCs, laptops, handheld devices, and head-mounted displays (HMDs). For this research, PCs were targeted as the primary device of analysis, as these are easily accessible and do not require any special setup. A mouse-and-keyboard setup was utilized to enable trainees to explore using drag-and-drop gestures in the 360-degree image interface and point-and-click gestures in the HIP interface. The 360-degree panoramas are also accessible using online cloud technologies, which enable real-time feeds and big data analysis [46].
4.4. Hazard Recognition Evaluation: Hazard Identification Index and Grading
The evaluation of the trainees’ hazard-recognition skills is performed using the hazard identification index ($HII$) developed by Carter and Smith [20]. The $HII$ offers a method to score hazard identification quantitatively in the context of both the identification and the assessment of hazards. The $HII$ is calculated for each trainee as the ratio:

$$HII_j = \frac{H_{I_j}}{H_{T_j}} \qquad (1)$$

where $H_{I_j}$ is the number of identified hazards, and $H_{T_j}$ is the total number of hazards present in each 360-degree image ($j$). The number of hazards identified by the trainee ($H_{I_j}$) will be impacted by the level of conceptual comprehension the trainee gained during the Training Session. To successfully reflect the understanding of the trainee, a grading system assigns a value to each potential response. Trainee hazard identification can correspond to three cases:
Correct identification or rejection (CIR): The trainee correctly identifies a hazard as present or as not present in the image.
Incorrect identification (II): The trainee identifies a hazard as present in the image, but the hazard is not actually contained in the image. Incorrectly identified hazards are analogous to a false positive or Type I error.
Missed identification (MI): The trainee identifies a hazard as not present in the image, but the hazard is in fact contained in the image. Missed hazards are analogous to a false negative or Type II error.
Calculating the number of identified hazards is accomplished by combining the concepts associated with the training and the three previously defined cases, assigning a positive point for each CIR and penalizing a proportion of the II’s and MI’s with negative points. As no literature was found regarding the appropriate percentage of penalization for II’s and MI’s, the research team assumed a value of 50 percent for each of these categories, thereby weighting both II and MI errors as equally detrimental to the assessment score. The proposed equation for the calculation of $H_{I_j}$ is defined as:

$$H_{I_j} = CIR_j - 0.5\,II_j - 0.5\,MI_j \qquad (2)$$
To compute the overall hazard identification index ($\overline{HII}$) across the scenes for each trainee, the mean is calculated using each count of identified hazards previously computed ($H_{I_j}$) divided by the total number of hazards present in each scene ($H_{T_j}$):

$$\overline{HII} = \frac{1}{n}\sum_{j=1}^{n}\frac{H_{I_j}}{H_{T_j}} \qquad (3)$$

where $n$ is the total number of 360-degree scenes assessed.
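A short sketch of how Equations (1) to (3) combine, assuming per-scene counts of CIR, II, and MI responses are already available (the variable names are illustrative):

```python
# Hedged sketch of the grading scheme in Equations (1)-(3); variable names
# are illustrative. Each scene contributes CIR, II, and MI counts plus the
# total number of hazards present (H_T).
def scene_hii(cir: int, ii: int, mi: int, h_total: int) -> float:
    """Per-scene hazard identification index, Eqs. (1) and (2)."""
    h_identified = cir - 0.5 * ii - 0.5 * mi   # 50% penalty assumed for II and MI
    return h_identified / h_total

def overall_hii(scenes: list[tuple[int, int, int, int]]) -> float:
    """Mean HII across all scenes, Eq. (3)."""
    indices = [scene_hii(*s) for s in scenes]
    return sum(indices) / len(indices)

# Example: three scenes given as (CIR, II, MI, H_T) tuples.
print(overall_hii([(4, 1, 1, 5), (3, 0, 2, 5), (5, 0, 0, 5)]))  # ~0.667
```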
4.5. Graphical User Interface
Trainees must constantly interact with the PARS platform in each of the sessions to learn about, be evaluated on, and obtain feedback regarding the hazards present in the 360-degree images. These interactions are driven by the platform’s graphical user interface, which enables data input and output.
Figure 5 illustrates the most important user interfaces the trainees encounter while performing the hazard-recognition tasks throughout the sessions. As discussed above, within the application, trainees have access to two different areas within the scene screen: the 360-degree image renderer and the hazard identification panel (HIP). The 360-degree image renderer allows the trainee to actively explore the scene by using drag-and-drop gestures with different pointing devices or finger movements. In this area, graphical representations of the hazardous conditions are displayed using augmentations (data, objects, animations, or sounds). A special type of object augmentation in the PARS platform is the Hotspot: a safety-data-rich location annotated with graphics in different colors to direct the attention of the trainee to a hazardous situation. The content and position of these augmentations, including the marker, enhance the trainee’s contextual understanding of the safety-related topics (e.g., activity, objects, or persons) in the location.
The HIP facilitates trainee interaction with the descriptive information that accompanies the hazards displayed in the 360-degree image renderer. The HIP employs three different interfaces depending on the type of session (Training, Assessment, or Feedback).
Figure 6 displays the HIP’s types of information, interaction, and layout for each of the different sessions. For the Training Session, the HIP utilizes the learning card (Figure 6a) to contain descriptive safety information. The learning cards directly link the graphical representation of a hazard to the descriptive information in a hotspot. When a trainee uses a point-and-click gesture on a hotspot or on the learning card, the game camera is automatically directed to the augmentation and shows the contained information. The learning card information has three layout levels: hazard category, hazard name, and hazard summary.
The hazard category indicates the type of hazard the card contains according to a hazard classification scheme (e.g., fall hazard, struck-by hazard, electrical hazard, etc.). The hazard name defines the specific source of the hazard by assigning a distinctive term that outlines the content scope (e.g., a “fall hazard” will include an untied worker, unprotected edges, holes, etc.). Finally, the hazard summary elaborates on the exact context presented in the 360-degree image and provides descriptive information for the trainee to fully understand the hazardous condition.
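As an illustration, the three layout levels could be represented as a simple record; the structure below is a hypothetical sketch in Python, not the platform's internal Unity3D format:

```python
from dataclasses import dataclass

# Illustrative sketch of a learning card's three layout levels; the actual
# structure inside the Unity3D application is not reported.
@dataclass
class LearningCard:
    hazard_category: str   # classification scheme level, e.g., "Fall hazard"
    hazard_name: str       # specific source, e.g., "Unprotected edge"
    hazard_summary: str    # context-specific description of the condition

card = LearningCard(
    hazard_category="Fall hazard",
    hazard_name="Unprotected edge",
    hazard_summary=(
        "A worker is operating near a slab edge without guardrails or "
        "personal fall arrest, exposing them to a fall to a lower level."
    ),
)
```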
For the Assessment Session, the HIP uses the evaluation card (Figure 6b) to contain all the possible answers for the hazard-recognition tasks. The evaluation cards use a checkbox interface to collect the trainee responses for each scene. Each card layout contains the hazard category as its title and the hazard names covered within the category as options to be selected by the users. The user responses collected from these cards are linked to the score cards (Figure 6c) in the Feedback Session. In the Feedback Session, the HIP displays for each assessment scene: the correct answers, the user responses graded and color coded (green as correct, red as incorrect), the hazard identification index, and additional notes. An overall hazard identification score is displayed below the score cards to convey the user’s understanding across the different evaluated scenes. In general, the HIP also contains a timer that specifies the time used in the session, a next button to advance to the subsequent scene, an indicator of the type of session currently in use, and a counter that shows the current scene number.
Figure 6c) in the Feedback Session. In the Feedback Session, the HIP displays for each assessment scene: the correct answers, the user responses graded and color coded (green as correct, and red as incorrect), the hazard identification index, and additional notes. An overall hazard identification score is displayed below the score cards to deliver a notion of the user understanding across the different evaluated scenes. In general, the HIP also contains a timer that specifies the time used for the session, a next button to advance to the subsequent scene, an indicator of the type of session currently in use, and a counter that shows the current number of the scene.