1. Introduction
Human locomotion is an extraordinarily complex translational motion that can be described on the basis of the kinematics and muscular activity of the extremities in all their various movements [
1]. Gait analysis identifies walking and posture problems, load anomalies and muscle failure that would otherwise not be measurable with standard clinical exams. Gait can be defined as a pattern of locomotion characteristic of a limited range of speeds, and serves as an important diagnostic tool in many fields, such as health, sports, daily life activities and prosthetics; it can thus have a significant impact on a person’s Quality of Life (QOL) [
2].
The analysis of human gait’s cyclic motion usually rests on the identification and characterization of an individual’s walking pattern and kinematics. The gait cycle can be observed through various sensing approaches (visual, weight, inertial, etc.) and modeled with different data-driven methods. The features (phases) extracted during gait analysis can be used for many tasks, including, among others, addressing health-related issues, such as the recognition of unpredictable gait disorders, which, in the worst cases, can lead to injuries [
3,
4]. However, two things are essential for reliable gait analysis: an appropriate wearable gait motion data acquisition system (with sensors), and robust methods for gait monitoring, analysis and recognition. A wearable system with sensors, a processing unit and communication, packed in a small, lightweight housing, is usually desired for long-term daily use. Important criteria for a wearable system also include low power consumption, sufficient external memory storage, and a user-friendly human interface for on-line data visualization and monitoring. Many Machine Learning (ML) methods now enable feature extraction, and a combination of different learning methods can be used to recognize and extract human walking features adequately. As far as implementation is concerned, a tradeoff between code complexity and real-time operation needs to be considered. Human gait is a cyclical process in which bones, muscles and the nervous system operate jointly to maintain the static and dynamic balance of an upright body in motion. The cycle can be defined as the time between two consecutive contacts of the same foot striking the ground while walking, and has two states, namely stance, when the foot is firmly on the ground, and swing, when the foot is lifted from the ground, as seen in
Figure 1.
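As a minimal illustration of the two-state cycle just described, the following sketch segments a binary foot-contact signal (1 = foot on the ground, 0 = foot lifted) into stance and swing runs. The signal and its layout are illustrative assumptions, not the paper’s actual acquisition format.

```python
# Sketch: split a binary foot-contact signal into stance/swing runs.
# 1 = foot on ground (stance), 0 = foot lifted (swing).
def segment_phases(contact):
    """Return a list of (phase, start_index, length) runs."""
    phases = []
    start = 0
    for i in range(1, len(contact) + 1):
        # close the current run at the end of the signal or on a state change
        if i == len(contact) or contact[i] != contact[start]:
            label = "stance" if contact[start] == 1 else "swing"
            phases.append((label, start, i - start))
            start = i
    return phases

# One hypothetical gait cycle: heel strike -> stance -> toe off -> swing
contact = [1, 1, 1, 1, 1, 0, 0, 0, 1]
print(segment_phases(contact))
# -> [('stance', 0, 5), ('swing', 5, 3), ('stance', 8, 1)]
```

In practice the contact signal would come from force or strain-gauge thresholds rather than being given directly.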
Furthermore, gait phases can be divided into durations of certain events (left and right foot up/down, limb stance/swing, and limb turn) [
5], and classified according to the area of interest (sport, health, prosthetics, and daily life) [
6,
7,
8], type of sensors used (floor sensors, visual, inertial or other sensors) [
9,
10], and last but not least, its placement (shoe, ankle, limb, and hip/back) [
11,
12]. Gait analysis in sports can improve walking/running techniques, prevent long-term hip, knee and ankle injuries, improve sporting results, and aid the development of custom-made athletes’ textiles or sport shoes [
6]. In medical applications, a gait analysis can result in successful recognition of various gait disorders, like Freezing Of Gait (FOG) in Parkinson’s Disease (PD) or Cerebral Palsy (CP) patients [
3]. In prosthetics applications, the gait analysis can help disabled patients after limb amputation to improve feedback in walking with a robotic prosthesis [
8,
13].
Collecting human activity data has become very accessible with technological development and hardware miniaturization. Older approaches using visual-based systems (e.g., cameras) [
10], and environmental sensors for data acquisition, are well established in applications with remote CPU processing, but are limited to mostly indoor solutions. Approaches based on force (weight) measurements are conducted in a motion analysis laboratory with force platforms (floor tiles) and optical motion systems [
14]. Such motion capture systems are not easily portable, and usually only operate in controlled environments. Force-based systems, such as foot switches or force-sensitive resistors, are generally considered the gold standard for detecting gait events, yet they are prone to mechanical failure, unreliable due to weight shifting, and provide no details about the swing phase.
Small and lightweight unobtrusive devices have become indispensable in human daily life activities. Many research groups and authors have presented their achievements in the field using numerous hardware systems and methods intended for indoor and outdoor use [
15]. In the past decade, Inertial Measurement Unit (IMU) sensors with gyroscopes and accelerometers have become very affordable and suitable, due to their small size, low power consumption, fast processing, and near-instantaneous acceleration measurements during motion [
16,
17,
18]. Many related research studies on gait analysis are oriented towards using wearable IMU-based sensor devices [
7,
11,
19]. Today, IMU sensors can be found in many commercially available gadgets (i.e., fitness bands, smart watches, phones) with user-friendly User Interfaces (UI) [
20]. Usually, embedded device boards are used for dedicated tasks. Another important aspect is sensor placement on the human limb/body; accelerometer locations in gait research studies are almost evenly distributed among the shank/ankle [
11,
13,
17,
21], foot [
5,
22,
23], and the waist/lower back/pelvis [
12,
24].
When reviewing the related work for gait analysis, we found that some authors used a foot displacement sensor in combination with IMUs to classify heel strike, heel off, toe off and mid-swing gait phases under different walking speeds, as described in [
25]. Most novel methods are based on ML, a popular collection of data-driven methods that take advantage of large amounts of data and reduce the need to hand-craft meaningful features for the classification task. Various ML approaches, such as K-Nearest Neighbors (KNN) [
26], Decision Trees (DT) [
27], Naïve Bayesian classifier (NB) [
28], Linear Discriminant Analysis (LDA) [
10] and Long Short-Term Memory (LSTM) Neural Networks (NN) [
29], have been used for gait disorder recognition and to distinguish between gait phases. Some authors propose the Hidden Markov Model (HMM) method [
8,
17,
30], or Artificial Neural Networks (ANN), which are largely applied in feature extraction and image recognition and have also been used successfully to identify human motion and activity [
13,
22,
31]. The Support Vector Machines (SVM) method was used for analysis and classification of activities such as running, jumping and walking [
32]. Newer studies also involve using electromyography data combined with inertial signals [
3], approaches based on higher-order statistics of IMU signals, transforming acceleration signals into features by employing higher-order cumulants [
33], and advanced NN models [
13,
22]. Recent studies outperformed other state-of-the-art methods by utilizing ML with an added attention mechanism. The original article on attention [
34] describes the attention mechanism as a function which can learn the relative importance of input features for a given task. The attention mechanism is parameterized by a set of weights, which are learned during training. The hybrid Convolutional Neural Network (CNN) [
13,
35,
36,
37] and Recurrent Neural Network (RNN) [
38,
39] architecture with an added attention mechanism [
40,
41] can be applied to boost the classification performance of the algorithm further.
The motivation of the paper is to develop a robust and lightweight system capable of human gait activity monitoring, analysis and recognition, with the aim to improve a subject’s QOL if applied in the fields of medicine, sports, or prosthetics. The proposed system (
Figure 2) consists of the following components: A gait motion data acquisition system, time-series data collection, data pre-processing and data processing (classification).
The gait motion data acquisition system allows for semi-automatic labeling of recorded gait data, needed later during the learning phase of the ML methods. The purpose of data collection is to gather the data acquired from the gait motion data acquisition system into a database saved on a personal computer. Data pre-processing comprises three methods: Feature normalization, feature transformation, and feature reduction. Feature normalization means scaling the parameter values into specific intervals. Feature transformation refers to frequency transformation techniques, where the Continuous Wavelet Transform (CWT) is typically used [
42,
43]. This method increases the data dimensionality. The data can be processed further using feature reduction methods such as an Auto Encoder (AE), which compresses the data and decreases their dimensionality [
44].
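The feature-normalization step can be illustrated with min-max scaling of each sensor channel into [0, 1]. The channel values below are hypothetical; the CWT and AE stages would follow this step in the pipeline.

```python
# Sketch of the feature-normalization step: min-max scaling of a single
# sensor channel into the interval [0, 1]. Values are illustrative.
def minmax_scale(channel):
    lo, hi = min(channel), max(channel)
    if hi == lo:                      # constant channel: map everything to 0
        return [0.0 for _ in channel]
    return [(x - lo) / (hi - lo) for x in channel]

# A hypothetical accelerometer channel (raw readings in m/s^2)
acc_x = [-9.8, -9.6, -10.1, -9.9]
print(minmax_scale(acc_x))
```

In a real pipeline the scaling bounds would be computed on the training data only and reused for the test data, to avoid information leakage.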
Gait activity was recognized by various ML methods, among which the combination of CNN and RNN algorithms with added attention (CNNA + RNN) stood out with the best classification results according to standard classification measures on both the RAW and CWT datasets. The two types of datasets were used to show the effects of pre-processing. The robustness of the proposed system using the CNNA + RNN classification method was demonstrated in an experiment that included five subjects of different ages and genders. Notably, each individual was tested with their own personalized system. Finally, the usability of the proposed system for recognizing gait events was confirmed by applying it to detect FOG in medicine. However, direct comparison of the presented results with other related work is difficult because of the unavailability of datasets alongside papers. Therefore, we offer our datasets, provided in
Supplementary Materials of this paper, for potential future comparison tests (see
S1: Gait data set).
The main novelties of the proposed system for recognizing gait events can be summarized as follows:
- developing a robust and lightweight gait motion data acquisition system with semi-automatic data labeling,
- identifying the attention-based CNNA + RNN supervised ML classification algorithm as the most reliable method for detecting specific human gait events, justified by comparative analysis with other ML methods,
- comparing commonly used ML algorithms with the CNNA + RNN method according to five evaluation metrics, and
- confirming the general usability of the system by detecting FOG in medicine.
The rest of the paper is organized as follows.
Section 2 describes the hardware materials used, the gait motion data acquisition system and its development, and gives an overview of common ML methods and evaluation metrics, as well as a detailed description of the dataset recording protocol. The results for each tested method, with classification, visualization and evaluation results, are presented in
Section 3. Reliable gait analysis results were obtained for healthy subjects and, furthermore, tested on a patient with a gait disorder. The proposed approach for gait analysis enabled successful FOG detection in PD patients, which is covered in
Section 4, along with the discussion and future work comments.
3. Results
Our experimental work focused first on identifying the best performing ML algorithm for gait activity classification of a single 25-year-old male subject. Then, the robustness of the four best performing algorithms was evaluated with a larger dataset of five subjects. Finally, a follow-up proof-of-concept FOG detection was performed on the gait data of one male PD patient who commonly experiences FOG episodes.
To perform a fair comparison, tuning of parameters was performed until the best classification performance was found for each algorithm (independently on the CWT and RAW datasets). The best parameter settings are presented briefly in
Table 3, while the full parameter settings and architectures are supplied in the
supplementary material.
Three experiments were conducted as follows:
recognizing gait events for one subject,
recognizing gait events for five subjects, and
PD patient FOG episode’s detection.
The first two experiments dealt with a multi-class classification problem with five outputs, while the third presented a binary classification problem with two outputs. The same ML algorithm parameter settings were used to obtain results for all three aforementioned experiments.
3.1. Recognizing Gait Events for One Subject
The purpose of the test was to find the ML algorithm that produced the best classification results for Subject 1 (
Table 2), according to standard classification metrics. This algorithm represents the most reliable solution, which is then applied in the experiments that follow. We hypothesized that the identified algorithm could accurately detect any kind of desired gait event from input data, given that the event itself has some kind of specific time-domain or frequency-domain signature.
The data label is utilized as the ground truth in classification evaluation, where the algorithm’s output and the data label are compared, and the evaluation metrics are calculated. All the ML algorithms were first trained with only one 3 min gait recording (train dataset), and then tested on the remaining 27 min of the gait recording (test dataset) in order to evaluate the trained algorithm. The performance of each ML algorithm was evaluated using the leave-one-out cross-validation method, where we repeated the process 10 times (each time training the algorithm with a different 3 min recording). The experiment was first conducted using the RAW dataset, which consists of 20 features, and then repeated for the CWT dataset, which consists of 540 features.
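The evaluation protocol above can be sketched as follows, assuming the 30 min recording is already cut into ten 3 min segments (the segment names are placeholders): in each fold, one segment trains the model and the remaining nine (27 min) form the test set.

```python
# Sketch of the leave-one-out protocol over ten 3 min segments:
# each fold trains on one segment and tests on the other nine.
def loo_folds(segments):
    folds = []
    for i in range(len(segments)):
        train = segments[i]                       # single 3 min recording
        test = segments[:i] + segments[i + 1:]    # remaining 27 min
        folds.append((train, test))
    return folds

segments = [f"rec_{k}" for k in range(10)]        # ten 3 min recordings
folds = loo_folds(segments)
print(len(folds), folds[0][0], len(folds[0][1]))
```

Averaging the metrics over the ten folds gives the cross-validated scores reported in the tables.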
In the remainder of this subsection, the results of the experiments are explained and presented in detail (
Table 4,
Table 5,
Table 6 and
Table 7), while their further discussion is summarized in the next section. The performance of the nine ML algorithms, compared according to the five evaluation metrics, is presented in
Table 4. The results were calculated for the CWT and RAW datasets separately. Furthermore, using the RAW dataset for classification is preferable, since the CWT does not have to be calculated at all. The best results according to each classification measure are shown in bold in the tables.
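The per-class evaluation metrics can be computed from confusion counts (TP, FP, TN, FN) as sketched below; the exact set of five metrics is assumed here to be accuracy, precision, sensitivity, specificity and F1-score, and the counts are hypothetical.

```python
# Sketch: standard classification metrics from per-class confusion counts.
# The counts used below are illustrative, not results from the paper.
def metrics(tp, fp, tn, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # a.k.a. recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "sensitivity": sensitivity, "specificity": specificity, "f1": f1}

print(metrics(tp=95, fp=5, tn=890, fn=10))
```

For the multi-class gait-event problem these counts would be computed per class and then averaged.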
Table 4 identifies the CNNA + RNN method as the best performing on the RAW dataset, while the CNN + RNN algorithm outperformed CNNA + RNN for the CWT dataset. The best precision score on the CWT dataset was achieved by the AE + biLSTM method.
Table 5 presents the calculated Standard Deviation for the cross-validated evaluation metrics. The results were calculated for the CWT and RAW datasets separately.
Independently of which 3 min recording was used for training during the leave-one-out cross-validation, the CNN + RNN algorithm performs consistently, with minimal variations in performance. This can be observed in the low Standard Deviation values in
Table 5.
Table 6 presents the execution times of the tested ML algorithms, divided into two parts: The algorithm’s learning time (the time needed to train the algorithm), and the algorithm’s classification time (the time needed for the pre-trained algorithm to classify new instances).
As shown in
Table 6, the CNNA + RNN is not the fastest computing algorithm but offers the best classification capabilities with the RAW dataset and a still manageable execution time.
Table 7 presents the results from evaluating the CNN + RNN algorithm after training with different combinations of Complementary Filter (CF), Accelerometer (ACC), Gyroscope (GYRO) and Strain Gauge (SG) RAW sensor information.
Inspecting the results from
Table 7, we can observe that some sensor combinations work better than others. The combination of all four sensors showed superiority, but it is worth noting that using an accelerometer or gyroscope alone also yields good classification results.
Similar to the plot in
Figure 5e, it is now possible to visualize the output of the trained CNNA + RNN ML algorithm, as illustrated in
Figure 10, where
Figure 10b represents the algorithm’s output on the RAW test dataset, where different gait activities are recognized within the recordings.
Closely inspecting the algorithm’s output reveals some misclassified instances and some detection time asymmetries after comparing it to the ground truth in
Figure 10c. The average detection time delay was measured at 100 ms (i.e., 4 samples). Out of 10,000 samples in this test dataset, the algorithm misclassified a total of 22 samples (0.22%).
3.2. Recognizing Gait Events for Five Subjects
The purpose of this experiment was to show that the developed system for recognizing different gait events possesses two additional characteristics: Robustness and personalization. In line with this, Subjects 1–5 (
Table 2) were tested individually by the four best performing algorithms, chosen by the highest F1-score (CNNA + RNN RAW, CNN + RNN RAW, CNNA + RNN CWT and CNN + RNN CWT). The cross-validated results for each subject are presented in
Table 8 and
Table 9.
The CNNA + RNN and CNN + RNN algorithms trained with the RAW dataset produced the results as presented in
Table 8, while
Table 9 corresponds to the results obtained by the same ML algorithms, but for the CWT dataset.
Comparing
Table 8 and
Table 9, the CNNA + RNN approach with the RAW dataset delivered the highest performance among all five subjects.
As can be seen from
Table 9, the CNNA + RNN algorithm, trained with the CWT dataset, performed well, but was inferior compared to the CNNA + RNN’s RAW dataset results (
Table 8). The classification performance for the first subject (Male, 25) was better than for the other subjects, due to a more favorable data label distribution in the dataset, as presented in
Table 2.
For the RAW dataset, the attention mechanism only increased the performance (F1-score) slightly (by 0.1%) for the first subject, while the performance for the remaining four subjects improved by 0.45% on average. These findings indicate that the contribution of the attention mechanism is greater when dealing with datasets with a sparser data label distribution.
3.3. PD Patient FOG Episode’s Detection
The purpose of the experiment was to show the usefulness of the proposed system for detecting special gait events like FOG episodes in medicine as well.
In the later stages of the disease, PD patients experience FOG episodes, during which the patient is unable to move the lower limbs despite a clear intention to do so, and trembling of the lower limbs is often observed. FOG episodes are defined as rare, brief episodes (1–20 s), and typically occur when the patient’s focus is shifted away from the gait itself [
57]. FOG manifests in one of two scenarios: when initiating gait (a sudden intention to move the lower limbs after standing still), or during walking (through doors and while avoiding obstacles). The latter is easier to detect, because it involves a larger amount of gait motion, which manifests in larger amplitudes gathered from the IMU sensor. On the other hand, the strain gauge sensor information is valuable for detecting FOG during gait initiation, since generally low IMU sensor amplitudes are present, but the leg muscles of PD patients still tremble and subsequently produce large strain gauge sensor amplitudes. Our gait recordings include both of the above-mentioned FOG scenarios, and both were detected correctly.
The PD patient’s 24 min gait recording was split into a train dataset (60%) and test dataset (40%). The results of the latter are visualized in
Figure 11a, alongside the algorithm’s output in
Figure 11b.
Figure 11c compares the algorithm’s output with the data label.
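The chronological 60/40 split described above can be sketched as follows; the one-sample-per-second resolution is an illustrative assumption, not the actual sampling rate.

```python
# Sketch of the 60/40 chronological split of the PD patient's 24 min
# recording: the first 60% of samples train the model, the last 40% test it.
def chrono_split(samples, train_frac=0.6):
    cut = int(len(samples) * train_frac)
    return samples[:cut], samples[cut:]

samples = list(range(24 * 60))          # one sample per second, 24 min (assumed)
train, test = chrono_split(samples)
print(len(train), len(test))
```

A chronological (rather than shuffled) split is the natural choice here, since it mimics deploying the trained detector on future, unseen gait data.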
Each PD patient experiences unique FOG events, which make it difficult to develop a universal ML algorithm for FOG detection (for multiple subjects). This is where our approach with real-time data label creation excelled, as it allowed for complete algorithm personalization.
The FOG detection results for the four best performing classification algorithms are presented in
Table 10.
Training the CNNA + RNN and CNN + RNN algorithms with the CWT frequency matrix produced superior results when compared to training with the RAW dataset, as is shown in
Table 10.
Sensor importance for FOG detection was evaluated and presented in
Table 11.
As is evident in
Table 11, the CNN + RNN algorithm trained with the combined CWT data obtained from all four sensors showed superiority, although it also performed relatively well for some of the other tested sensor combinations; e.g., training the ML algorithm using the ACC, GYRO and ACC, CF combinations produced the highest precision and sensitivity scores, respectively.
4. Discussion
A feasible system capable of gait activity recognition was developed, tested and assessed in this study. After our experimental work, the following characteristics of the system can be highlighted:
reliability,
personalization,
usability, and
robustness.
With the proposed ML algorithms, we demonstrated that the system is able to detect specific gait events reliably. Both hardware (the gait motion data acquisition system) and software (various ML algorithms) solutions were presented. The optimal sensor placement location was determined by evaluating the classification metrics. The CNNA + RNN algorithm’s superiority over the other ML algorithms can be identified by observing the classification metrics (
Table 4 and
Table 5). Moreover, proposed algorithm is still relatively fast to compute (
Table 6), and is therefore feasible for running online on the microcontroller. We further demonstrated the robustness of the four best performing algorithms by performing gait classification on five subjects. The CNNA + RNN algorithm trained with the RAW dataset again showed superiority according to the evaluation metrics.
Gait motion profiles can differ substantially between different humans, especially if movement disorders are present. Therefore, researchers commonly try to develop general ML algorithms which can be applied to multiple subjects. In contrast, the ML algorithms in our study are trained and tested for each individual subject separately. This way, the algorithms can learn subject specific features and achieve complete personalization with only 3 min of recorded gait. The proposed method of automatic online data labeling turned out to be very useful, as it can be used to label practically any desired gait activity in real time, allowing for algorithm personalization to specific subjects (since we have data labels for the whole gait dataset).
Moreover, the usability of the best performing algorithms was confirmed in a complex practical application (detection of FOG in PD patients), as can be seen in
Section 3.3. We discovered that using the CWT dataset yields a more accurate FOG detection algorithm. The trembling of limbs during FOG seems to be more easily distinguishable (from other activities) in the CWT frequency spectrum than in the raw signal itself; the data transformation ability of CWT analysis reveals broader data insight in this case. In medical practice, it is desirable to detect FOG episodes among all other recorded activities, as this would enable assessing the effect of pharmacological and non-pharmacological interventions on the occurrence and characteristics of FOG episodes.
Figure 11 indicates the robustness of our approach in accurately detecting even special events that result from neurological decline, which points to a possible unobtrusive ‘wearable assistant device’. All the best performing algorithms tested for FOG detection produced some false-positive instances. Interestingly, all the algorithms’ outputs detected roughly the same false-positive instances, hinting at the possibility that an onset of FOG was actually present at those specific instances but not labeled correctly by the data labeling system’s operator. There is no drawback to providing cues to the patient during false-positive instances, as the patient can only benefit from them. Ultimately, the algorithms’ outputs suggest that there is some similarity between the false-positive instances and FOG instances. The robustness of our system for recognizing gait events is shown by the fact that the system performed exceptionally for all five subjects without the need to change any ML parameters. This holds true for the developed hardware as well.
The proposed system could be upgraded to enable real-time ML algorithm deployment on the microcontroller. Upon adding a cue feedback system, it would be possible to deliver cues ‘on demand’, when specific events are detected. Furthermore, the existing RF communication can be used to activate cue feedback systems wirelessly in real time. This feature has great potential in sports, robotics, virtual reality, medicine (FOG, rehabilitation), etc.