Article

Preliminary Validation of a Low-Cost Motion Analysis System Based on RGB Cameras to Support the Evaluation of Postural Risk Assessment

by Thomas Agostinelli 1, Andrea Generosi 1, Silvia Ceccacci 1,*, Riccardo Karim Khamaisi 2, Margherita Peruzzini 2 and Maura Mengoni 1

1 Department of Industrial Engineering and Mathematical Science (DIISM), Università Politecnica delle Marche, 60131 Ancona, Italy
2 Department of Engineering “Enzo Ferrari” (DIEF), Università degli Studi di Modena e Reggio Emilia, 41125 Modena, Italy
* Author to whom correspondence should be addressed.
Appl. Sci. 2021, 11(22), 10645; https://doi.org/10.3390/app112210645
Submission received: 4 October 2021 / Revised: 5 November 2021 / Accepted: 6 November 2021 / Published: 11 November 2021
(This article belongs to the Special Issue Novel Approaches and Applications in Ergonomic Design II)

Featured Application

We introduce a motion capture tool that uses at least one RGB camera and exploits an open-source deep learning model with low computational requirements, already used to implement mobile apps for mobility analysis. Experimental results suggest the suitability of this tool to perform posture analysis aimed at assessing the RULA score in a more efficient way.

Abstract

This paper introduces a low-cost, low-computational, marker-less motion capture system based on the acquisition of frame images through standard RGB cameras. It exploits the open-source deep learning model CMU, from the tf-pose-estimation project. Its numerical accuracy and its usefulness for ergonomic assessment are evaluated through an experiment designed and performed to: (1) compare the data it provides with those collected from a gold standard motion capture system; (2) compare the RULA scores obtained with its data with those obtained with data provided by the Vicon Nexus system and those estimated through video analysis by a team of three expert ergonomists. Tests have been conducted in standardized laboratory conditions and involved a total of six subjects. Results suggest that the proposed system can predict angles with good consistency and provide evidence of the tool’s usefulness for ergonomists.

1. Introduction

Nowadays, reducing the risk of musculoskeletal diseases (MSDs) for workers in manufacturing industries is of paramount importance to reduce absenteeism from work due to illnesses related to poor working conditions and to improve process efficiency in assembly lines. One of the main goals of Industry 4.0 is to find solutions that put workers in suitable working conditions, improving the efficiency and productivity of the factory [1]. However, we are still far from this goal, as shown in the 2019 European Risk Observatory Report [2]: reported work-related MSDs are decreasing but remain too high (58% in 2015 against 60% in 2010), and considering the aging of the working population, these figures can only get worse. According to the European Commission’s Ageing Report 2015 [3], the employment rate of people over 55 will reach 59.8% in 2023 and 66.7% by 2060, because many people born during the “baby boom” are getting older, life expectancy and retirement age are increasing, while the birth rate is decreasing. Solving this problem is of extreme importance. Work demand usually does not change with age, while the same cannot be said for work capacity: with aging, physiological changes in perception, information processing and motor control reduce work capacity. The physical work capacity of a 65-year-old worker is about half that of a 25-year-old worker [4]. On the other hand, age-related changes in physiological function can be dampened by various factors, including physical activity [5], so work capability is a highly individual variable.
In this context, industries will increasingly need to take human variability into account and to predict workers’ behaviors, going beyond the concept of “the worker” as a homogeneous group and monitoring specific work-related risks more accurately, in order to implement more effective health and safety management systems and to increase factory efficiency and safety.
To reduce ergonomic risks and promote the workers’ well-being, considering the characteristics and performance of every single person, we need cost-effective, robust tools able to provide a direct monitoring of the working postures and a continuous ergonomic risk assessment along with the work activities. Moreover, we need to improve the workers’ awareness about ergonomic risks and define the best strategies to prevent them. Several studies in ergonomics suggested that providing workers with ergonomic feedback can positively influence the motion of workers and decrease hazardous risk score values [6,7,8]. However, this goal still seems far from being achieved due to the lack of low-cost, continuous monitoring systems to be easily applied on the shop floor.
Currently, ergonomic risk assessment is mainly based on postural observation methods [9,10], such as the National Institute for Occupational Safety and Health (NIOSH) lifting equation [11], Rapid Upper Limb Assessment (RULA) [12], Rapid Entire Body Assessment (REBA) [13], the Strain Index [14], and Occupational Repetitive Action (OCRA) [15]. They require the intervention of an experienced ergonomist who observes workers’ actions, directly or by means of video recordings. The data required to compute the risk index are generally obtained through subjective observation or simple estimation of projected joint angles (e.g., elbow, shoulder, knee, trunk, and neck) by analyzing videos or pictures. Such ergonomic assessment is costly and time-consuming [9], is highly affected by intra- and inter-observer variability [16], and may lead to low accuracy of the evaluations [17].
Several tools are available to automate the postural analysis process by calculating various risk indices, to make ergonomic assessment more efficient. They are currently embedded in the most widely used computer-aided design (CAD) packages (e.g., CATIA-DELMIA by Dassault Systèmes, Pro/ENGINEER by PTC Manikin or Tecnomatix/Jack by Siemens) and allow detailed human modeling based on digital human manikins, according to an analytical ergonomic perspective [18]. However, to perform realistic and reliable simulations, they require accurate information related to the kinematic of the worker’s body (posture) [19].
Motion capture systems can be used to collect such data accurately and quantitatively. However, the most reliable systems commercially available, i.e., sensor-based motion capture (e.g., Xsense [20], Vicon Blue Trident [21]) and marker-based optical systems (e.g., Vicon Nexus [22], OptiTrack [23]), have important drawbacks, so that their use in real work environments is still scarce [9]. Indeed, they are expensive in terms of cost and setup time and have limited applicability in a factory environment due to several constraints, ranging from lighting conditions to electromagnetic interference [24]. Therefore, their use is currently limited to laboratory experimental setups [25], while they are not easy to manage on the factory shop floor. In addition, these systems can also be intrusive, as they frequently require wearable devices (i.e., sensors or markers) positioned on the workers’ bodies according to proper specifications [25] and following specific calibration procedures. These activities require the involvement of specialized professionals and are time consuming, so it is not easy to carry them out in real working conditions on a daily basis. Moreover, marker-based optical systems greatly suffer from occlusion problems and need the installation of multiple cameras, which is rarely feasible in a working environment where space is limited, so the cameras cannot be optimally placed.
In the last few years, to overcome these issues, several systems based on computer vision and machine learning techniques have been proposed.
The introduction on the market of low-cost body-tracking technologies based on RGB-D cameras, such as the Microsoft Kinect®, has aroused great interest in many application fields, such as gaming and virtual reality [26], healthcare [27,28], natural user interfaces [29], education [30] and ergonomics [31,32,33]. Being an integrated device, it does not require calibration. Several studies assessed its accuracy [34,35,36] and evaluated it in working environments and industrial contexts [37,38]. Their results suggest that Kinect may successfully be used for assessing the risk of operational activities where very high precision is not required, despite errors depending on the performed posture [39]. Since the acquisition is performed from a single point of view, the system suffers from occlusion problems, which can induce large errors, especially in complex motions with self-occlusion or when the sensor is not placed in front of the subject, as recommended in [36]. Using multiple Kinects can only partially solve these problems, as the quality of the depth images degrades with the number of Kinects concurrently running, due to IR emitter interference [40]. Moreover, RGB-D cameras are not as widely available and affordable as RGB cameras, and their installation and calibration in the workspace is not a trivial task, because ferromagnetic interference can cause significant noise in the output [41].
In an industrial working environment, motion capture based on standard RGB sensors (such as those embedded in smartphones or webcams) can represent a more viable solution. Several systems have been introduced in the last few years to enable real-time human pose estimation from video streams provided by RGB cameras. Among them, OpenPose, developed by researchers at Carnegie Mellon University [42,43], represents the first real-time multi-person system to jointly detect the human body, hands, face, and feet (137 keypoints per person: 70 face keypoints, 25 body/foot keypoints and 2 × 21 hand keypoints) on a single image. It is open-source software, based on Convolutional Neural Networks (CNNs) available in the OpenCV library [44], initially written in C++ and Caffe, and freely available for non-commercial use. Such a system does not seem to be significantly affected by occlusion problems, as it ensures body tracking even when several body joints and segments are temporarily occluded, so that only a portion of the body is framed in the video [18]. Several studies validated its accuracy by comparing single-person skeleton tracking results with those obtained from a Vicon system; all of them found a negligible relative limb-positioning error [45,46]. Many studies exploited OpenPose for several research purposes, including ergonomics. In particular, several studies, carried out both in the laboratory and in real-life manufacturing environments, suggest that OpenPose is a helpful tool to support worker posture ergonomic assessment based on RULA, REBA, and OCRA [18,46,47,48,49].
However, deep learning algorithms used to enable people tracking using RGB images usually require hardware with high computational requirements, so it is essential to have a good CPU and GPU performance [50].
Recently, a newer open-source machine learning pose-estimation algorithm inspired by OpenPose, namely tf-pose-estimation [51], has been released. It has been implemented using TensorFlow and introduces several variants with changes to the CNN structure, enabling real-time processing of multi-person skeletons even on the CPU or on low-power embedded devices. It provides several models, including a body model variant characterized by 18 key points that runs on mobile devices.
Given its low computational requirements, it has been used to implement mobile apps for mobility analysis (e.g., Lindera [52]) and to implement edge computing solutions for human behavior estimation [53], or for human posture recognition (e.g., yoga pose [54]). However, as far as we know, the suitability of this tool to support ergonomic risk assessment in the industrial context has not been assessed yet.
In this context, this paper introduces a new low-cost and low-computational marker-less motion capture system, based on frame images acquired from RGB cameras and on their processing through the multi-person keypoint detection algorithm of tf-pose-estimation. To assess the accuracy of this tool, a comparison is performed between the data it provides, those collected from a Vicon Nexus system, and those measured through manual video analysis by a panel of three expert ergonomists. Moreover, to preliminarily validate the proposed system for ergonomic assessment, RULA scores obtained with the data it provides have been compared to (1) those measured by the expert ergonomists and (2) those obtained with data provided by the Vicon Nexus system.

2. Materials and Methods

2.1. The Proposed Motion Analysis System

The motion analysis system based on RGB cameras (RGB-motion analysis system, RGB-MAS), here proposed, is conceptually based on that described in [18] and improves its features and functionalities as follows:
- New system based on the CMU model from the tf-pose-estimation project, computationally lighter than the models provided by OpenPose and therefore able to provide real-time processing with lower CPU and GPU requirements.
- Addition of the estimation of torso rotation relative to the pelvis and of head rotation relative to the shoulders.
- Distinction between abduction, adduction, extension, and flexion categories in the calculation of the angles between body segments.
- Person tracking no longer based on K-Means clustering, which was computationally heavy and not very accurate.
- Modification of the system architecture to ensure greater modularity and the ability to work even with a single camera shot.
The main objective, using these tools, is to measure the angles between the main body segments that characterize the postures of one or more people framed by the cameras. The measurement is carried out starting from the recognition of the skeletal joints and following with the estimation of their position in a digital space. It is based on a modular software architecture (Figure 1), using algorithms and models of deep learning and computer vision, to analyze human subjects by processing videos acquired through RGB cameras.
The proposed system needs one or more video recordings, obtained by pointing a camera parallel to the anatomical planes (i.e., sagittal, frontal, and transverse planes), to track subjects during everyday work activities. In most cases, it is necessary to use at least two points of view, taken from pelvis height, to achieve good accuracy: this trade-off guarantees a good compromise between prediction accuracy and system portability. For the calculation of the angles, the prediction is optimal when the camera directions are perpendicular to the person’s sagittal and coronal planes. However, the system tolerates a deviation of the camera direction from the perpendicular to these planes in the range between −45° and +45°. In this case, empirical tests showed that the system performs angle estimation with a maximum error of ±10%. The accuracy of skeletal landmark recognition tends to worsen the closer the orientation angle of the subject is to ±90° with respect to the reference camera, due to obvious perspective issues in a two-dimensional reference plane.
The system does not necessarily require that video recordings taken from different angles be simultaneously collected. The necessary frames can be recorded in succession, using only one camera. An important requirement is that each video must capture the subject during the entire working cycle.
Any camera with at least the following minimum requirements can be used:
  • Resolution: 720 p.
  • Distortion-free lenses: wide angle lenses should be avoided.
The PC(s) collects the images from the camera(s) and processes them through the motion analysis software, which consists of two main modules (i.e., the “Data Collection” and “Parameter Calculation” modules), described in detail below.

2.1.1. Data Collection

This module enables the analysis of the frames from the camera(s) to detect and track the people present in the frame. It uses models and techniques of deep learning and computer vision.
The deep learning model used to track the skeleton joints (i.e., key points) is based on the open-source project tf-pose-estimation. This algorithm has been implemented using the C++ language and the TensorFlow framework. tf-pose-estimation provides several models trained on many samples: CMU, mobilenet_thin, mobilenet_v2_thin, and mobilenet_v2_small. Several tests were performed, and the CMU model was chosen as the optimal model for this project, as a compromise between accuracy and image processing time. It allows the identification of a total of 18 points of the body (Figure 2). When a video is analyzed, for any person detected in each processed frame, the system associates a skeleton and returns the following three values for each landmark:
  • x: horizontal coordinate.
  • y: vertical coordinate.
  • c: confidence index in the range [0; 1].
For the subsequent parameter calculation, the system considers only the landmarks with a confidence index higher than 0.6.
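As an illustration, this confidence filtering step could be sketched in Python as follows; the list-of-dicts data structure used here is an assumption made for clarity, not the module's actual internal format:

```python
# Illustrative sketch of the confidence filtering step described above.
# The landmark data structure is an assumption, not the module's actual format.
CONFIDENCE_THRESHOLD = 0.6

def filter_landmarks(landmarks):
    """Keep only landmarks whose confidence index c exceeds the threshold.

    landmarks: list of dicts such as {'id': 4, 'x': 312, 'y': 188, 'c': 0.83}.
    """
    return [lm for lm in landmarks if lm['c'] > CONFIDENCE_THRESHOLD]
```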
To ensure univocal identification of the subjects, as long as they remain and move within the visual field of the camera(s), the system also associates a proper index to each person. A dedicated algorithm has been implemented to distinguish the key points belonging to different subjects that may overlap each other when passing in front of the camera. It considers a spatial neighborhood for each subject detected through the keypoint detection model: if the subject maintains its position within that neighborhood in the subsequent frame, the identifier associated with it at the first recognition remains the same. Collected data are then saved in a high-performance in-memory datastore (Redis), acting as a communication queue between the Data Collection and Parameter Calculation modules.
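A rough sketch of this nearest-neighborhood identifier assignment and of the hand-off to Redis is given below, under stated assumptions: the neighborhood radius, the Redis key name, and the JSON payload layout are illustrative choices, not values taken from the paper.

```python
import json
import math

import redis

NEIGHBORHOOD_RADIUS = 80  # pixels; illustrative value, not specified in the paper


def assign_ids(prev_positions, detections, next_id):
    """Re-use a subject's identifier if its new position falls inside the spatial
    neighborhood of its previous position; otherwise assign a fresh identifier.

    prev_positions: {subject_id: (x, y)} from the previous frame.
    detections: list of (x, y) reference points for the subjects in the current frame.
    """
    current = {}
    for x, y in detections:
        best_id, best_dist = None, NEIGHBORHOOD_RADIUS
        for sid, (px, py) in prev_positions.items():
            dist = math.hypot(x - px, y - py)
            if dist < best_dist and sid not in current:
                best_id, best_dist = sid, dist
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        current[best_id] = (x, y)
    return current, next_id


# Collected keypoints are then pushed to Redis, which acts as the queue between
# the Data Collection and Parameter Calculation modules (key name is hypothetical).
r = redis.Redis()
r.rpush("frames_queue", json.dumps({"frame": 0, "subjects": {}}))
```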

2.1.2. Parameter Calculation

This module, developed in Python, uses the C++ output from the Data Collection module to determine the person orientation with respect to the camera(s) and to calculate the 2D angles between the respective body segments.
The angles between the body segments are evaluated according to the predicted orientation. For each frame, they are computed from the coordinates (x, y) of the respective keypoints, by using proper algorithms. To estimate the angle between the segments i-j and j-k (considering the coordinates of the keypoints i, j and k) the following formulas are applied:
$$\theta = \arccos\left(\frac{\gamma}{\delta}\right) \times \frac{180}{\pi}$$
where γ is the scalar product between the vector formed by the first segment (i-j) and the one formed by the second segment (j-k), and δ is the product of the norms of the aforementioned vectors:
$$\gamma = (x_j - x_i)(x_k - x_j) + (y_j - y_i)(y_k - y_j)$$
$$\delta = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2} \times \sqrt{(x_k - x_j)^2 + (y_k - y_j)^2}$$
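A direct Python transcription of these formulas might look like the sketch below (the function name and input handling are illustrative; the actual module may treat degenerate or low-confidence keypoints differently):

```python
import math

def segment_angle(i, j, k):
    """Angle in degrees between segments i-j and j-k, with each keypoint given
    as an (x, y) tuple, following the dot-product formula above."""
    gamma = (j[0] - i[0]) * (k[0] - j[0]) + (j[1] - i[1]) * (k[1] - j[1])
    delta = math.hypot(j[0] - i[0], j[1] - i[1]) * math.hypot(k[0] - j[0], k[1] - j[1])
    return math.acos(gamma / delta) * 180.0 / math.pi

# Example: keypoints forming a right angle between the two segments
print(segment_angle((0, 0), (1, 0), (1, 1)))  # 90.0
```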
In the case of two cameras, it is necessary to consider that one of the cameras will have the best view for predicting certain angles correctly: to this end, a dedicated algorithm has been developed. Considering that the cameras are positioned at about pelvis height, it compares the distances between specific key points (corresponding to specific body segments) with the expected average anthropometric proportions of the human body reported in [55], which were calculated on a large dataset of individuals. In particular, it analyzes the ratio between the width of the shoulders (i.e., the Euclidean distance between key points 2 and 5) and the length of the spine. Since the CMU model does not include a pelvis key point, the spine length is estimated as the Euclidean distance between key point 1 and the midpoint (m) between key points 8 and 11. Based on the human body proportions reported in [55], this ratio is estimated to be 80% when the subject faces the camera and 0% when the person turns by 90°. Consequently, the angle between the sagittal plane of the person and the direction of the camera (α) can be estimated through the following equation, considering the length of the shoulder segment (x) and of the spine (l) measured in each frame:
$$\alpha = \arcsin\left(\frac{x}{0.8\,l}\right)$$
To determine whether the camera is framing the person’s right side or left side, the x coordinate of key point 0, $X_0$, has been considered in relation to the average x coordinate of keypoints 1, 2, and 5, $X_a$. The person’s side framed by the cameras is then determined according to the following rule:
$$\text{side} = \begin{cases} \text{left}, & X_0 - X_a < 0 \\ \text{right}, & X_0 - X_a > 0 \end{cases}$$
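The orientation and side rules just described can be sketched as follows; the clipping of the ratio to 1.0 is an added assumption to keep the arcsine well defined when noise makes the measured ratio slightly exceed the expected 80%:

```python
import math

EXPECTED_FRONTAL_RATIO = 0.8  # shoulder width / spine length when facing the camera [55]

def orientation_angle(shoulder_width, spine_length):
    """Estimate the angle between the subject's sagittal plane and the camera
    direction from the projected shoulder-width / spine-length ratio."""
    ratio = min(shoulder_width / (EXPECTED_FRONTAL_RATIO * spine_length), 1.0)
    return math.degrees(math.asin(ratio))

def framed_side(x_keypoint_0, x_avg_1_2_5):
    """Decide which side the camera is framing from the sign of X0 - Xa."""
    return "left" if x_keypoint_0 - x_avg_1_2_5 < 0 else "right"
```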
If two video shots are available, recorded from angles set at 90° to each other (e.g., respectively parallel to the frontal and the sagittal planes), it is then possible to refer the orientation estimation to a single reference system, using the direction of the first camera (camera index 0) as origin.
Each angle is then computed considering the frame coming from the camera view that better estimates it. For example, the elbow flexion/extension angle is calculated using the frame coming from the camera that has a lateral view of the person, while in case of abduction, the frame coming from the frontal camera is considered, as reported in Table 1.
To evaluate which camera view is more convenient to measure the elbow flexion/extension angle, the extent of the shoulder abduction angle is considered: if it is less than 45°, the elbow flexion/extension angle is estimated considering the frames from the lateral camera. Otherwise, data provided by the frontal camera are used.
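This selection rule reduces to a simple threshold check, sketched below for clarity (function name is illustrative):

```python
def elbow_angle_view(shoulder_abduction_deg):
    """Choose the camera view used to estimate the elbow flexion/extension angle,
    based on the shoulder abduction rule described above."""
    return "lateral" if shoulder_abduction_deg < 45.0 else "frontal"
```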
Finally, the system can estimate whether the person is rotating his/her head with respect to the shoulders, considering the proportion between the distance from the ear to the eye (CMU keypoints 16-14 for the right side and 17-15 for the left side) and the length of the shoulders (Euclidean distance between CMU keypoints 2-5). A reference threshold for this ratio has been estimated empirically, due to the lack of literature. Currently, this solution is applied only when a person has an orientation between −30 and +30 degrees with respect to the camera (i.e., in a frontal position). A similar approach has been considered to detect torso rotation, calculating the proportion between the segment from the left to the right shoulder (CMU key points 2-5) and the pelvis segment (CMU key points 8-11).
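These rotation checks can be summarized as ratio computations, as in the sketch below. Only the ratios come from the text: the thresholds are empirical in the paper and not reported, so the comparison against them is left out here.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two (x, y) keypoints."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def head_rotation_ratio(ear, eye, shoulder_l, shoulder_r):
    """Ear-to-eye distance (keypoints 16-14 or 17-15) over shoulder width (2-5);
    compared in the system against an empirically estimated threshold."""
    return euclidean(ear, eye) / euclidean(shoulder_l, shoulder_r)

def torso_rotation_ratio(shoulder_l, shoulder_r, hip_l, hip_r):
    """Shoulder segment (keypoints 2-5) over pelvis segment (keypoints 8-11)."""
    return euclidean(shoulder_l, shoulder_r) / euclidean(hip_l, hip_r)
```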

2.2. Experimental Case Study

2.2.1. Experimental Procedure

Tests were carried out in the “X-in the Loop Simulation Lab” (XiLab) at the University of Modena and Reggio Emilia. Participants provided informed consent prior to accessing the lab to take part in the test.
A total of 6 subjects, i.e., 2 females (age: median 31 years, IQR 5.0 years; height: median 1.68 m, IQR 0.005 m; mass: median 69 kg, IQR 13.5 kg; BMI: median 24.74 kg/m2, IQR 4.78 kg/m2) and 4 males (age: median 30 years, IQR 4.0 years; height: median 1.86 m, IQR 0.0035 m; mass: median 78 kg, IQR 9.0 kg; BMI: median 22.31 kg/m2, IQR 1.90 kg/m2), were involved and asked to pose for five seconds, while being recorded by the two systems, in the following five postures (chosen because they are very frequent in standard working procedures and in ergonomic assessment research works), in the following order:
  • T-pose: the subjects have to stand straight up, with their feet placed symmetrically and slightly apart and with their arms fully extended.
  • Seated: the subjects have to sit on a stool 70 cm in height, with their back straight, hands resting on the knees, and feet on the floor.
  • Standing Relaxed: the subjects have to stand comfortably, facing straight ahead, with their feet placed symmetrically and slightly apart.
  • Reach: the subjects must stand straight, feet well apart and with the arms stretched forward, simulating the act of grasping an object placed above their head.
  • Pick up: the subjects have to pick up a box (dimensions 30.5 cm × 21 cm × 10.5 cm, weight 5 kg) from the floor, and raise it in front of them, keeping it at pelvic level.
An example of the analyzed postures can be found in Figure 3.
Participants’ postures were tracked using a Vicon Nexus system powered by 9 Vicon Bonita 10 optical cameras. Cameras were distributed in the space in the most symmetrical configuration possible (Figure 4) to cover the entire working volume. Participants had to stay in the middle of the system acquisition space. They were equipped with a total of 35 reflective markers, positioned on the whole body according to the PlugInGait Full Body model specification defined in the Vicon PlugInGait documentation (Figure 5). The Vicon Nexus session was performed on a Dell Precision T1650 workstation with an Intel(R) Xeon(R) CPU E3-1290 V2 at 3.70 GHz, 4 Core(s), with 32 GB RAM, and an NVIDIA Quadro 2000 GPU, running Windows 10 Pro.
System calibration was carried out at the beginning of the experiment. Before starting the experiment, PluginGait (PiG) biomechanical models have been defined for the subjects (i.e., one for each) according to their anthropometric parameters.
The video capture for the RGB-MAS was carried out by two Logitech BRIO 4K Ultra HD USB cameras, set to stream/record at 1080p and 30 fps, with a 52° vertical and 82° horizontal field of view. They were placed 1.2 m above the ground (the pelvis height of the subjects) and angled 90 degrees to each other. Both cameras were mounted on tripods to ensure stability.
The system works regardless of the subject’s position in the camera’s field of view, provided that the cropped image of the subject consists of a sufficient number of pixels. This depends on the characteristics of the camera used (e.g., resolution, focal length). For example, with the cameras chosen in this case, the system works properly when the user is positioned no more than 7 m from the camera. Therefore, the first camera was placed in front of the subjects, at a distance of 2 m. The second one was placed to their right, at a distance of 3.5 m, to ensure that the cameras correctly captured the subjects’ entire body. Figure 4 shows the overall layout.
Stream videos were processed through a PC workstation with an Intel(R) Core (TM) i7-7700K CPU at 4.20 GHz and 32 GB RAM, and a GTX 1080 Ti GPU, running Windows 10 Pro.
Posture recording was carried out simultaneously with the two different systems to ensure numerical accuracy and to reduce inconsistencies. The camera frame rate proved to be consistent and constant throughout the experiment for both systems.

2.2.2. Data Analysis

Angles extracted by the Vicon PiG biomechanical model are compared with those predicted by the proposed RGB-MAS. To provide a better understanding, the corresponding angles measured by the two systems are reported in Table 2.
The resulting angles measured by these two systems are also compared with those manually extracted by the expert ergonomists.
The Shapiro–Wilk test is used to check the normality of the error distributions in all these analyses. Results show that the distributions follow a normal law for this experiment. The root mean square error (RMSE) is computed for the following conditions:
$$\mathrm{RMSE}_1 = \sqrt{\frac{\sum_{i=1}^{N}\left(\mathrm{RGB}_i - \mathrm{MAN}_i\right)^2}{N}}$$
$$\mathrm{RMSE}_2 = \sqrt{\frac{\sum_{i=1}^{N}\left(\mathrm{VIC}_i - \mathrm{MAN}_i\right)^2}{N}}$$
$$\mathrm{RMSE}_3 = \sqrt{\frac{\sum_{i=1}^{N}\left(\mathrm{RGB}_i - \mathrm{VIC}_i\right)^2}{N}}$$
where $\mathrm{RGB}_i$ is the ith angle measured by the RGB-MAS system, $\mathrm{VIC}_i$ the one measured by the Vicon system and, finally, $\mathrm{MAN}_i$ the angle measured manually.
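The normality check and the three RMSE conditions can be reproduced with a few lines of standard scientific Python; the angle arrays below are dummy placeholders used only to make the sketch self-contained, not data from the study.

```python
import numpy as np
from scipy.stats import shapiro

def rmse(a, b):
    """Root mean square error between two equal-length series of angles."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Dummy placeholder values, for illustration only (not data from the study).
rgb_angles = [12.0, 44.5, 88.1, 30.2]
vicon_angles = [10.5, 46.0, 90.0, 29.0]
manual_angles = [11.0, 45.0, 89.0, 30.0]

# Shapiro-Wilk normality test on the error distribution
stat, p_value = shapiro(np.subtract(rgb_angles, manual_angles))

rmse_1 = rmse(rgb_angles, manual_angles)    # RGB-MAS vs. manual
rmse_2 = rmse(vicon_angles, manual_angles)  # Vicon vs. manual
rmse_3 = rmse(rgb_angles, vicon_angles)     # RGB-MAS vs. Vicon
```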
Based on the collected angles, a Rapid Upper Limb Assessment (RULA) is performed manually, according to the procedure described in [12]. Then, the RMSE is computed to compare the RULA scores estimated from the angles respectively predicted by the RGB-MAS and the Vicon with the RULA performed considering the angles estimated from the video analysis by the experts themselves.
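As an illustration of how the extracted angles feed the manual worksheet, the upper-arm step of RULA [12] reduces to a banding rule over the shoulder flexion/extension angle. The sketch below covers only this base score and omits the adjustments (raised shoulder, abduction, arm support) and the subsequent lookup tables of the full procedure.

```python
def rula_upper_arm_base_score(flexion_deg):
    """RULA upper-arm base score [12]; flexion positive, extension negative.

    Adjustments for raised shoulder, abduction, or arm support are not applied here.
    """
    if -20.0 <= flexion_deg <= 20.0:
        return 1
    if flexion_deg < -20.0 or flexion_deg <= 45.0:
        return 2
    if flexion_deg <= 90.0:
        return 3
    return 4

# Example: 60 degrees of shoulder flexion falls in the 45-90 degree band -> score 3
print(rula_upper_arm_base_score(60.0))
```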

3. Results

Table 3 shows the RMSE comparison between the angles extracted from the RGB-MAS and the ones extracted by the Vicon system for a pure system-to-system analysis.
As can be observed, the angle predictions provided by the proposed system show generally lower accuracy in the case of shoulder abduction and of shoulder and elbow flexion/extension. The accuracy is particularly low for the reach posture, probably because of perspective distortions.
Note that the pickup posture could not be tracked with the Vicon system, due to occlusion problems caused by the box, which hides some of the markers the system needs.
Table 4 allows easy comparison of the RMSE between the angles respectively predicted by the RGB-MAS and the Vicon system and those manually measured. These results suggest that the proposed system can reasonably support ergonomists in carrying out their analysis.
Figure 6 highlights the similarities and discrepancies between the angle predictions provided by the RGB-MAS and the Vicon system with respect to the manual analysis. It can be observed that the prediction provided by the RGB-MAS suffers from a wider variability compared to the reference system. As for the neck flexion/extension angle, the RGB-MAS system slightly overestimates the results compared to the Vicon system. This is even more pronounced for shoulder abduction and flexion/extension, especially when abduction and flexion occur simultaneously. In Figure 7, the keypoint locations and the skeleton are shown superimposed on the posture pictures. In particular, the picture of the Pick Up posture shows that the small occlusion that caused problems for the Vicon system had no consequences for the RGB-MAS.
Moreover, high variability is also found for all the angles of the left-hand side of the body. Nevertheless, the RGB-MAS accurately predicts the trunk flexion/extension and the right-hand side angles. This left-right inconsistency may be due to the lack of a left camera, so the left-hand side angles are predicted with less confidence than their right-hand side counterparts.
Table 5 shows the median RULA values obtained using each angle extraction method considered (i.e., manual measurement, RGB-MAS prediction, Vicon tracking). Table 6 shows the RMSE between the RULAs determined through manual angle measurement and those calculated from the angles predicted by the RGB-MAS and the Vicon system, respectively. The maximum RMSE between RGB-MAS and manual analysis is 1.35, while the maximum Vicon vs. manual RMSE is 1.78. Since the closer an RMSE value is to zero, the better the prediction accuracy, it can be observed that the RGB-MAS generally provides results closer to the widely used manual analysis than the Vicon system does. However, this should not be interpreted as the RGB-MAS performing better than the Vicon: rather, the values provided by the Vicon are, in some cases, very different from both those of the RGB-MAS system and those estimated manually. This is because the Vicon output is not affected by the estimation errors due to perspective distortion that instead affect the other two methods. Ultimately, the RGB-MAS system can provide estimates very similar to those obtained from a manual extraction, although its accuracy is poor compared to the Vicon.
As can be seen from Table 5, the risk indices calculated from the angles provided by the RGB-MAS system, those evaluated from manually measured angles, and those provided by the Vicon belong to the same ranges. The only exception is the Reach posture, where the Vicon underestimated the scores by a whole point. Despite the overestimations in angle prediction, the RULA score evaluation, which considers whether angles fall within particular risk ranges, filters out noise and slight measurement inaccuracies. This leads to RULA scores that differ slightly in number but can be considered almost the same in terms of risk ranges.

4. Discussion

This paper aims to introduce a novel tool to help ergonomists in ergonomic risk assessment by automatically extracting angles from video acquisitions, more quickly than the traditional approach. Its overall reliability and numerical accuracy are assessed by comparing the tool’s performance in ergonomic evaluation with that obtained by standard procedures, which represent the gold standard in this context.
Results suggest that it generally provides good consistency in predicting the angles from the front camera and slightly lower accuracy with the lateral one, with a broader variability than the Vicon. However, in most cases, the average and median values are relatively close to the reference ones. This apparent limitation should be analyzed in light of the setup needed to obtain these data: by using only two cameras (instead of the nine the Vicon needs), we obtained angles reliable enough to compute a RULA score.
Although it has greater accuracy than the proposed tool, the Vicon requires installing a large number of cameras (at least six), precisely positioned in the space, to completely cover the work area and ensure the absence of occlusions. In addition, such a system requires calibration and forces workers to wear markers in precise positions. However, when performing a manual ergonomic risk assessment in a real working environment, given the constraints typically present, an ergonomist can usually collect videos from one or two cameras at most: the proposed RGB-MAS copes with this constraint by providing predicted angles even for the blind side of the subject (as a human analyst could do, but more quickly) or when the subject is partially occluded.
As proof of this, it is worth noticing that the pickup posture, initially included for its tendency to introduce occlusion, was then discarded from the comparison precisely because the occlusion led to a lack of data from the Vicon system, while no problems arose with the RGB-MAS.
In addition, the RMSE values obtained by comparing the RGB-MAS RULA scores with the manual ones showed tighter variability than those resulting from the comparison between the RULA scores estimated through the Vicon and the manual analysis. This suggests that the RGB-MAS can fruitfully support ergonomists in estimating the RULA score in a first exploratory evaluation. The proposed system can extract angles with a numerical accuracy comparable to that of the reference system, at least in a controlled environment such as a laboratory. The next step will be to test its reliability and feasibility in a real working environment, where a Vicon-like system cannot be introduced due to its limitations (e.g., installation complexity, calibration requirements, occlusion sensitivity).

Study Limitations

This study provides the results of a first assessment of the proposed system, with the aim of measuring its accuracy and preliminarily determining its utility for ergonomic assessment. Further studies should be carried out to fully understand its practical suitability for ergonomic assessment in real working environments. The experiment was conducted only in the laboratory and not in a real working environment, which limits the study results; in particular, it did not allow the researchers to evaluate the instrument’s sensitivity to changes in lighting or unexpected illumination conditions (e.g., glares or reflections). Further studies are needed to fully evaluate the implementation constraints of the proposed system in a real working environment.
In addition, the study is limited to evaluating the RULA risk index related to static postures only. Further studies will be needed to evaluate the possibility of using the proposed system for the acquisition of data necessary for other risk indexes (e.g., REBA, OCRA), also considering dynamic postures.
Another limitation is that the experiment did not fully evaluate the proposed system’s functionalities under severe occlusion (e.g., as could happen when a workbench partially covers the subject). Although the results showed that the proposed system, unlike the Vicon, does not suffer from minor occlusions (i.e., due to the presence of a box during a picking operation), further studies are needed to accurately assess the sensitivity of the proposed system to different levels of occlusion.
Another limitation is the small number of subjects involved in the study. A small group of subjects was involved, with limited anthropometric variation, assuming that the tf-pose-estimation model was already trained on a large dataset. Further studies will need to confirm whether anthropometric variations affect the results (e.g., whether and how the BMI factor may affect the estimated angle accuracy).

5. Conclusions

This work proposes a valuable tool, namely the RGB motion analysis system (RGB-MAS), to make ergonomic risk assessment more efficient and affordable. Our aim was to help ergonomists save time while maintaining highly reliable results. Analyzing how ergonomists carry out a RULA assessment shows that the lengthiest part of their job is manually extracting body angles from video captures. In this context, the paper proposed a system able to speed up angle extraction and RULA calculation.
The validation in the laboratory shows the promising performance of the system, suggesting its possible suitability also in real working conditions (e.g., picking activities in the warehouse or manual tasks in the assembly lines), to enable the implementation of more effective health and safety management systems in the future, so as to improve the awareness of MSDs and to increase the efficiency and safety of the factory.
Overall, the experimental results suggest that the RGB-MAS can usefully support ergonomists in estimating the RULA score, providing results comparable to those estimated by ergonomic experts. The proposed system allows ergonomists and companies to reduce the cost of ergonomic analysis by decreasing the time needed for risk assessment. This competitive advantage makes it appealing not only to large enterprises, but also to small and medium-sized enterprises wishing to improve the working conditions of their workers. The main advantages of the proposed tool are: ease of use, the wide range of scenarios where it can be installed, full compatibility with every commercially available RGB camera, no need for calibration, low CPU and GPU performance requirements (i.e., it can process video recordings in a matter of seconds using a common laptop), and low cost.
However, according to the experimental results, the increase in efficiency that the system allows comes at the expense of small errors in angle estimation and ergonomic evaluation: since the proposed system is not based on any calibration procedure and is still affected by perspective distortion problems, it obviously does not reach the accuracy of the Vicon. Nonetheless, if it is true that the Vicon system is to be considered the absolute reference as far as accuracy is concerned, it is also true that using it in a real working environment is practically impossible, since it greatly suffers from occlusion problems (even the presence of an object such as a small box can cause the loss of body tracking) and requires:
  • A large number of highly expensive cameras, placed in the space in a way that is impracticable in a real work environment.
  • A preliminary calibration procedure.
  • The use of wearable markers, which may invalidate the quality of the measurement as they are invasive.
Future studies should aim to improve the current functionalities of the proposed system. Currently, the system cannot automatically compute RULA scores: a spreadsheet based on the extracted angles is filled in to obtain them. However, it should not be difficult to implement such functionality. In particular, future studies should focus on implementing a direct stream of the angles extracted by the RGB-MAS system to a structured ergonomic risk assessment software (e.g., Siemens Jack) to animate a virtual manikin, again automatically obtaining RULA scores.
Moreover, the proposed system cannot predict hand- and wrist-related angles: further research might cope with this issue and try to fill the gap. For example, possible solutions can be those proposed in [56,57].
For a broader application of the proposed RGB-MAS system, further efforts should be made to improve the angle prediction accuracy.
Moreover, the main current issue is that it is not always possible to correctly predict shoulder abduction and flexion angles with non-calibrated cameras, e.g., when the arms simultaneously show flexion in the lateral plane and abduction in the frontal plane. This stems from the fact that, at the moment, there is no spatial correlation between the two cameras: the reference system is not the same for both, so it is not possible to determine 3D angles. Thus, another topic for future work may be the development of a dedicated algorithm to correlate the spatial positions of the cameras with each other. In addition, such an algorithm should provide a (real-time) correction to effectively manage the inevitable perspective distortion introduced by the lenses, to improve the system accuracy. However, all of this would require the introduction of a calibration procedure that would slow down the deployment of the system in real workplaces.

Author Contributions

Writing—original draft preparation, T.A.; system design, software development, A.G.; experimental design, data interpretation, writing review, S.C.; testing and data analysis, R.K.K.; supervision and validation, M.P.; project coordination, writing—review and editing, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This project has been funded by Marche Region in implementation of the financial program POR MARCHE FESR 2014-2020, project “Miracle” (Marche Innovation and Research Facilities for Connected and sustainable Living Environments), CUP B28I19000330007.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of Università Politecnica delle Marche (Prot.n. 0100472 of 22 September 2021).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

This research has been funded and supported by the EMOJ srl startup within the program “HEGO: a novel enabling framework to link health, safety and ergonomics for the future human-centric factory toward an enhanced social sustainability”, POR MARCHE FESR 2014-2020-ASSE 1-OS 1-AZIONE 1.1. INT 1.1.1.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Badri, A.; Boudreau-Trudel, B.; Souissi, A.S. Occupational health and safety in the industry 4.0 era: A cause for major concern? Saf. Sci. 2018, 109, 403–411. [Google Scholar] [CrossRef]
  2. European Agency for Safety and Health at Work. Work-Related Musculoskeletal Disorders: Prevalence, Costs and Demographics in the EU. EU-OSHA. Available online: https://osha.europa.eu/en/publications/msds-facts-and-figures-overview-prevalence-costs-and-demographics-msds-europe/view (accessed on 5 July 2021).
  3. European Commission. The 2015 Ageing Report: Economic and Budgetary Projections for the 28 EU Member State. Available online: https://ec.europa.eu/economy_finance/publications/european_economy/2015/pdf/ee3_en.pdf (accessed on 5 July 2021).
  4. Ilmarinen, J. Physical requirements associated with the work of aging workers in the European Union. Exp. Aging Res. 2002, 28, 7–23. [Google Scholar] [CrossRef] [PubMed]
  5. Kenny, G.P.; Groeller, H.; McGinn, R.; Flouris, A.D. Age, human performance, and physical employment standards. Appl. Physiol. Nutr. Metab. 2016, 41, S92–S107. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Battini, D.; Persona, A.; Sgarbossa, F. Innovative real-time system to integrate ergonomic evaluations into warehouse design and management. Comput. Ind. Eng. 2014, 77, 1–10. [Google Scholar] [CrossRef]
  7. Mengoni, M.; Ceccacci, S.; Generosi, A.; Leopardi, A. Spatial Augmented Reality: An application for human work in smart manufacturing environment. Procedia Manuf. 2018, 17, 476–483. [Google Scholar] [CrossRef]
  8. Vignais, N.; Miezal, M.; Bleser, G.; Mura, K.; Gorecky, D.; Marin, F. Innovative system for real-time ergonomic feedback in industrial manufacturing. Appl. Ergon. 2013, 44, 566–574. [Google Scholar] [CrossRef]
  9. Lowe, B.D.; Dempsey, P.G.; Jones, E.M. Ergonomics assessment methods used by ergonomics professionals. Appl. Ergon. 2019, 81, 10. [Google Scholar] [CrossRef]
  10. Ceccacci, S.; Matteucci, M.; Peruzzini, M.; Mengoni, M. A multipath methodology to promote ergonomics, safety and efficiency in agile factories. Int. J. Agil. Syst. Manag. 2019, 12, 407–436. [Google Scholar] [CrossRef]
  11. Snook, S.H.; Ciriello, V.M. The design of manual handling tasks: Revised tables of maximum acceptable weights and forces. Ergonomics 1991, 34, 1197–1213. [Google Scholar] [CrossRef]
  12. McAtamney, L.; Corlett, E.N. RULA: A survey method for the investigation of work-related upper limb disorders. Appl. Ergon. 1993, 24, 91–99. [Google Scholar] [CrossRef]
  13. Hignett, S.; McAtamney, L. Rapid entire body assessment (REBA). Appl. Ergon. 2000, 31, 201–205. [Google Scholar] [CrossRef]
  14. Moore, J.S.; Garg, A. The strain index: A proposed method to analyze jobs for risk of distal upper extremity disorders. Am. Ind. Hyg. Assoc. J. 1995, 56, 443. [Google Scholar] [CrossRef]
  15. Occhipinti, E. OCRA: A concise index for the assessment of exposure to repetitive movements of the upper limbs. Ergonomics 1998, 41, 1290–1311. [Google Scholar] [CrossRef] [PubMed]
  16. Burdorf, A.; Derksen, J.; Naaktgeboren, B.; Van Riel, M. Measurement of trunk bending during work by direct observation and continuous measurement. Appl. Ergon. 1992, 23, 263–267. [Google Scholar] [CrossRef]
  17. Fagarasanu, M.; Kumar, S. Measurement instruments and data collection: A consideration of constructs and biases in ergonomics research. Int. J. Ind. Ergon. 2002, 30, 355–369. [Google Scholar] [CrossRef]
  18. Altieri, A.; Ceccacci, S.; Talipu, A.; Mengoni, M. A Low Cost Motion Analysis System Based on RGB Cameras to Support Ergonomic Risk Assessment in Real Workplaces. In Proceedings of the ASME 2020 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, American Society of Mechanical Engineers Digital Collection, St. Louis, MO, USA, 17–19 August 2020. [Google Scholar] [CrossRef]
  19. De Magistris, G.; Micaelli, A.; Evrard, P.; Andriot, C.; Savin, J.; Gaudez, C.; Marsot, J. Dynamic control of DHM for ergonomic assessments. Int. J. Ind. Ergon. 2013, 43, 170–180. [Google Scholar] [CrossRef] [Green Version]
  20. Xsense. Available online: https://www.xsens.com/motion-capture (accessed on 5 July 2021).
  21. Vicon Blue Trident. Available online: https://www.vicon.com/hardware/blue-trident/ (accessed on 5 July 2021).
  22. Vicon Nexus. Available online: https://www.vicon.com/software/nexus/ (accessed on 5 July 2021).
  23. Optitrack. Available online: https://optitrack.com/ (accessed on 5 July 2021).
  24. Manghisi, V.M.; Uva, A.E.; Fiorentino, M.; Gattullo, M.; Boccaccio, A.; Evangelista, A. Automatic Ergonomic Postural Risk Monitoring on the Factory Shopfloor‒The Ergosentinel Tool. Procedia Manuf. 2020, 42, 97–103. [Google Scholar] [CrossRef]
  25. Schall, M.C., Jr.; Sesek, R.F.; Cavuoto, L.A. Barriers to the Adoption of Wearable Sensors in the Workplace: A Survey of Occupational Safety and Health Professionals. Hum. Factors. 2018, 60, 351–362. [Google Scholar] [CrossRef] [PubMed]
  26. Aitpayev, K.; Gaber, J. Collision Avatar (CA): Adding collision objects for human body in augmented reality using Kinect. In Proceedings of the 2012 6th International Conference on Application of Information and Communication Technologies (AICT), Tbilisi, GA, USA, 17–19 October 2012; pp. 1–4. [Google Scholar] [CrossRef]
  27. Bian, Z.P.; Chau, L.P.; Magnenat-Thalmann, N. Fall detection based on skeleton extraction. In Proceedings of the 11th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry, New York, NY, USA, 2–4 December 2012; pp. 91–94. [Google Scholar] [CrossRef]
  28. Chang, C.Y.; Lange, B.; Zhang, M.; Koenig, S.; Requejo, P.; Somboon, N.; Rizzo, A.A. Towards pervasive physical rehabilitation using Microsoft Kinect. In Proceedings of the 2012 6th International Conference on Pervasive Computing Technologies for Healthcare (Pervasive Health) and Workshops, San Diego, CA, USA, 21–24 May 2012; pp. 159–162. [Google Scholar]
  29. Farhadi-Niaki, F.; GhasemAghaei, R.; Arya, A. Empirical study of a vision-based depth-sensitive human-computer interaction system. In Proceedings of the 10th Asia Pacific Conference on Computer Human Interaction, New York, NY, USA, 28–31 August 2012; pp. 101–108. [Google Scholar] [CrossRef]
  30. Villaroman, N.; Rowe, D.; Swan, B. Teaching natural user interaction using openni and the microsoft kinect sensor. In Proceedings of the 2011 Conference on Information Technology Education, New York, NY, USA, 20–22 December 2011; pp. 227–232. [Google Scholar] [CrossRef]
  31. Diego-Mas, J.A.; Alcaide-Marzal, J. Using Kinect sensor in observational methods for assessing postures at work. Appl. Ergon. 2014, 45, 976–985. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  32. Manghisi, V.M.; Uva, A.E.; Fiorentino, M.; Bevilacqua, V.; Trotta, G.F.; Monno, G. Real time RULA assessment using Kinect v2 sensor. Appl. Ergon. 2017, 65, 481–491. [Google Scholar] [CrossRef]
  33. Marinello, F.; Pezzuolo, A.; Simonetti, A.; Grigolato, S.; Boscaro, B.; Mologni, O.; Gasparini, F.; Cavalli, R.; Sartori, L. Tractor cabin ergonomics analyses by means of Kinect motion capture technology. Contemp. Eng. Sci. 2015, 8, 1339–1349. [Google Scholar] [CrossRef]
  34. Clark, R.A.; Pua, Y.H.; Fortin, K.; Ritchie, C.; Webster, K.E.; Denehy, L.; Bryant, A.L. Validity of the Microsoft Kinect for assessment of postural control. Gait Posture 2012, 36, 372–377. [Google Scholar] [CrossRef]
  35. Bonnechere, B.; Jansen, B.; Salvia, P.; Bouzahouene, H.; Omelina, L.; Moiseev, F.; Sholukha, C.J.; Rooze, M.; Van Sint Jan, S. Validity and reliability of the kinect within functional assessment activities: Comparison with standardstereo-photogrammetry. Gait Posture 2014, 39, 593–598. [Google Scholar] [CrossRef] [PubMed]
  36. Plantard, P.; Auvinet, E.; Le Pierres, A.S.; Multon, F. Pose Estimation with a Kinect for Ergonomic Studies: Evaluation of the Accuracy Using a Virtual Mannequin. Sensors 2015, 15, 1785–1803. [Google Scholar] [CrossRef]
  37. Patrizi, A.; Pennestrì, E.; Valentini, P.P. Comparison between low-cost marker-less and high-end marker-based motion capture systems for the computer-aided assessment of working ergonomics. Ergonomics 2015, 59, 155–162. [Google Scholar] [CrossRef] [PubMed]
  38. Plantard, P.; Hubert PH, S.; Le Pierres, A.; Multon, F. Validation of an ergonomic assessment method using Kinect data in real workplace conditions. Appl. Ergon. 2017, 65, 562–569. [Google Scholar] [CrossRef] [PubMed]
  39. Xu, X.; McGorry, R.W. The validity of the first and second generation Microsoft Kinect™ for identifying joint center locations during static postures. Appl. Ergon. 2015, 49, 47–54. [Google Scholar] [CrossRef] [PubMed]
  40. Schroder, Y.; Scholz, A.; Berger, K.; Ruhl, K.; Guthe, S.; Magnor, M. Multiple kinect studies. Comput. Graph. 2011, 2, 6. [Google Scholar]
  41. Zhang, H.; Yan, X.; Li, H. Ergonomic posture recognition using 3D view-invariant features from single ordinary camera. Autom. Constr. 2018, 94, 1–10. [Google Scholar] [CrossRef]
  42. Cao, Z.; Hidalgo, G.; Simon, T.; Wei, S.E.; Sheikh, Y. OpenPose: Realtime multi-person 2D pose estimation using Part Affinity Fields. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 172–186. [Google Scholar] [CrossRef] [Green Version]
  43. Cao, Z.; Simon, T.; Wei, S.E.; Sheikh, Y. Realtime multi-person 2d pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7291–7299. [Google Scholar] [CrossRef] [Green Version]
  44. Bradski, G. The OpenCV Library. Dr. Dobb’s J. Softw. Tools 2000, 25, 120–123. [Google Scholar]
  45. Ota, M.; Tateuchi, H.; Hashiguchi, T.; Kato, T.; Ogino, Y.; Yamagata, M.; Ichihashi, N. Verification of reliability and validity of motion analysis systems during bilateral squat using human pose tracking algorithm. Gait Posture 2020, 80, 62–67. [Google Scholar] [CrossRef] [PubMed]
  46. He Ling, W.A.N.G.; Yun-Ju, L.E.E. Occupational evaluation with Rapid Entire Body Assessment (REBA) via imaging processing in field. In Proceedings of the Human Factors Society Conference, Elsinore, Denmark, 25–28 July 2019. [Google Scholar]
  47. Li, L.; Martin, T.; Xu, X. A novel vision-based real-time method for evaluating postural risk factors associated with muskoskeletal disorders. Appl. Ergon. 2020, 87, 103138. [Google Scholar] [CrossRef] [PubMed]
  48. MassirisFernández, M.; Fernández, J.Á.; Bajo, J.M.; Delrieux, C.A. Ergonomic risk assessment based on computer vision and machine learning. Comput. Ind. Eng. 2020, 149, 10. [Google Scholar] [CrossRef]
  49. Ojelaide, A.; Paige, F. Construction worker posture estimation using OpenPose. Constr. Res. Congr. 2020. [Google Scholar] [CrossRef]
  50. Da Silva Neto, J.G.; Teixeira, J.M.X.N.; Teichrieb, V. Analyzing embedded pose estimation solutions for human behaviour understanding. In Anais Estendidos do XXII Simpósio de Realidade Virtual e Aumentada; SBC: Porto Alegre, Brasil, 2020; pp. 30–34. [Google Scholar]
  51. TF-Pose. Available online: https://github.com/tryagainconcepts/tf-pose-estimation (accessed on 5 July 2021).
  52. Lindera. Available online: https://www.lindera.de/technologie/ (accessed on 5 July 2021).
  53. Obuchi, M.; Hoshino, Y.; Motegi, K.; Shiraishi, Y. Human Behavior Estimation by using Likelihood Field. In Proceedings of the International Conference on Mechanical, Electrical and Medical Intelligent System, Gunma, Japan, 4–6 December 2021. [Google Scholar]
  54. Agrawal, Y.; Shah, Y.; Sharma, A. Implementation of Machine Learning Technique for Identification of Yoga Poses. In Proceedings of the 2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT), Gwalior, India, 10–12 April 2020; pp. 40–43. [Google Scholar] [CrossRef]
  55. Contini, R. Body Segment Parameters, Part II. Artif. Limbs 1972, 16, 1–19. [Google Scholar]
  56. Romero, J.; Kjellström, H.; Ek, C.H.; Kragic, D. Non-parametric hand pose estimation with object context. Image Vis. Comput. 2013, 31, 555–564. [Google Scholar] [CrossRef]
  57. Wu, Z.; Hoang, D.; Lin, S.Y.; Xie, Y.; Chen, L.; Lin, Y.Y.; Fan, W. Mm-hand: 3d-aware multi-modal guided hand generative network for 3d hand pose synthesis. arXiv 2020, arXiv:2010.01158. [Google Scholar]
Figure 1. High-level software architecture and an example of possible hardware configuration.
Figure 2. tf-pose-estimation CMU model: acquired joints of body.
Figure 3. Postures assessed in this research paper.
Figure 4. Experimental camera layout for RGB (those labeled in yellow) and Vicon system.
Figure 5. Markers’ layout according to the PlugInGait full body model.
Figure 6. Graph comparing the RMSE values “RGB-MAS vs. manual” and “Vicon vs. manual”.
Figure 7. Body skeleton predicted by the RGB-MAS.
Table 1. Body key points and camera views considered in the computation of the angles.

| Angle | Keypoints (Left) | Keypoints (Right) | Frontal View | Lateral View |
|---|---|---|---|---|
| Neck flexion/extension | 17-1-11 | 16-1-8 | | X |
| Shoulder abduction | 11-5-6 | 8-2-3 | X | |
| Shoulder flexion/extension | 11-5-6 | 8-2-3 | | X |
| Elbow flexion/extension angle | 5-6-7 | 2-3-4 | X | X |
| Trunk flexion/extension | 1-11-12 | 1-8-9 | | X |
| Knee bending angle | 11-12-13 | 8-9-10 | | X |
Table 2. Pairs of angles that have been compared between the Vicon and RGB systems.

| Vicon | RGB-MAS |
|---|---|
| Average between L and R neck flexion/extension | Neck flexion/extension |
| L/R shoulder abduction/adduction Y component | L/R shoulder abduction |
| L/R shoulder abduction/adduction X component | L/R shoulder flexion/extension |
| L/R elbow flexion/extension | L/R elbow flexion/extension |
| Average between L and R spine flexion/extension | Trunk flexion/extension |
| L/R knee flexion/extension | L/R knee bending angle |
Table 3. RMSE values obtained comparing RGB-MAS angles with the Vicon ones.

RMSE RGB-MAS vs. Vicon [°]

| Angle | T-Pose | Seated | Standing Relaxed | Reach |
|---|---|---|---|---|
| Neck flexion/extension | 6.83 | 16.47 | 19.21 | 9.58 |
| Left shoulder abduction | 12.66 | 45.16 | 13.07 | 45.46 |
| Right shoulder abduction | 11.64 | 50.66 | 7.93 | 43.23 |
| Left shoulder flexion/extension | 27.86 | 57.19 | 21.53 | 71.29 |
| Right shoulder flexion/extension | 33.73 | 52.90 | 57.88 | 82.93 |
| Left elbow flexion/extension | 7.13 | 16.90 | 21.15 | 27.05 |
| Right elbow flexion/extension | 5.46 | 13.19 | 22.11 | 53.30 |
| Trunk flexion/extension | 0.35 | 8.61 | 0.91 | 2.95 |
| Left knee flexion/extension | 2.39 | 46.25 | 7.38 | 24.76 |
| Right knee flexion/extension | 0.21 | 4.79 | 0.07 | 0.12 |
Table 4. RMSE values obtained when comparing RGB-MAS angles with manually extracted ones.

RMSE RGB-MAS vs. Manual [°]

| Angle | T-Pose | Seated | Standing Relaxed | Reach | Pick Up |
|---|---|---|---|---|---|
| Neck flexion/extension | 9.19 | 13.10 | 19.63 | 6.87 | 28.05 |
| Left shoulder abduction | 6.90 | 44.13 | 7.54 | 49.68 | 8.03 |
| Right shoulder abduction | 6.67 | 47.00 | 7.82 | 51.65 | 7.55 |
| Left shoulder flexion/extension | 32.62 | 53.84 | 12.03 | 82.10 | 30.43 |
| Right shoulder flexion/extension | 50.91 | 50.70 | 70.02 | 71.31 | 28.01 |
| Left elbow flexion/extension | 3.60 | 15.03 | 14.21 | 36.16 | 15.83 |
| Right elbow flexion/extension | 2.20 | 9.67 | 18.63 | 25.45 | 10.26 |
| Trunk flexion/extension | 3.48 | 30.06 | 5.46 | 20.71 | 33.46 |
| Left knee flexion/extension | 3.81 | 53.12 | 6.82 | 22.73 | 30.10 |
| Right knee flexion/extension | 3.95 | 20.69 | 1.94 | 22.63 | 29.19 |

RMSE Vicon vs. Manual [°]

| Angle | T-Pose | Seated | Standing Relaxed | Reach |
|---|---|---|---|---|
| Neck flexion/extension | 8.26 | 15.47 | 8.64 | 8.32 |
| Left shoulder abduction | 7.29 | 7.52 | 13.00 | 38.11 |
| Right shoulder abduction | 6.35 | 6.54 | 7.09 | 38.08 |
| Left shoulder flexion/extension | 21.08 | 16.19 | 19.55 | 101.52 |
| Right shoulder flexion/extension | 23.14 | 15.66 | 21.43 | 110.20 |
| Left elbow flexion/extension | 6.21 | 7.08 | 15.56 | 16.35 |
| Right elbow flexion/extension | 5.60 | 17.97 | 10.35 | 39.00 |
| Trunk flexion/extension | 3.28 | 33.60 | 4.59 | 17.93 |
| Left knee flexion/extension | 1.98 | 18.85 | 2.68 | 8.39 |
| Right knee flexion/extension | 3.75 | 21.24 | 1.90 | 22.52 |
Table 5. RULA median scores for the three angle extraction methods and corresponding level of MSD risk (i.e., green = negligible risk; yellow = low risk; orange = medium risk).

| Posture | Manual (Left) | Manual (Right) | RGB-MAS (Left) | RGB-MAS (Right) | Vicon (Left) | Vicon (Right) |
|---|---|---|---|---|---|---|
| T-Pose | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 | 3.00 |
| Relaxed | 2.50 | 2.50 | 3.00 | 3.00 | 3.00 | 3.00 |
| Sit | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 |
| Reach | 4.50 | 4.50 | 4.00 | 4.00 | 3.00 | 3.00 |
| Pickup | 5.50 | 5.50 | 6.00 | 6.00 | - | - |
Table 6. RMSE (+SD) values obtained when respectively comparing the RGB-MAS and the Vicon RULA with the manual one.

| Posture | RGB-MAS vs. Manual (Left) | RGB-MAS vs. Manual (Right) | Vicon vs. Manual (Left) | Vicon vs. Manual (Right) |
|---|---|---|---|---|
| T-Pose | 0.00 (0.58) | 1.00 (0.75) | 0.41 (0.37) | 0.41 (0.37) |
| Relaxed | 0.71 (0.00) | 2.45 (0.37) | 0.82 (0.37) | 0.82 (0.37) |
| Sit | 0.58 (0.75) | 1.41 (0.76) | 0.71 (0.69) | 0.82 (0.58) |
| Reach | 1.35 (1.07) | 1.35 (0.82) | 1.78 (0.76) | 1.78 (0.76) |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
