1. Introduction
Human activity recognition (HAR), which identifies human actions such as walking, sitting, and lying, has attracted considerable interest in mobile computing. Consequently, a significant amount of work analyzes the information gathered from the sensors of cell phones [1,2]. HAR is classified into two types: video-based HAR and sensor-based HAR. Video-based HAR examines videos or images from a camera that contain human motions [3,4,5], whereas sensor-based HAR examines motion data from smart sensors such as accelerometers, gyroscopes, and sound sensors [6,7,8]. In recent years, sensor technologies have seen significant advances, particularly in low power consumption, low cost, high capacity, compact size, wired and wireless communication networks, and data processing methods [9,10]. Human activities are classified using various sensors; accelerometers and gyroscopes are the two most commonly used for HAR. An accelerometer is an electrical sensor that measures the acceleration forces acting on an object to calculate its position in space and track its movement. A gyroscope is a device that determines and maintains the orientation and angular velocity of an object. These two sensors, commonly found in smartphones, were used to gather data for HAR. Depending on the methodology, one or more sensors can be positioned on various body parts to collect the data; the placement of wearable sensors directly impacts the monitoring of physiological movements. Sensors are often positioned on the chest, lower back, and waist. Machine learning algorithms then analyze these sensor data to recognize human activities.
In the past decades, several researchers have surveyed HAR [11,12]. The activities may include walking, running, exercising, etc., and may be categorized into three main classes based on their duration and complexity: short activities, basic activities, and complicated activities. Activities of very short duration, such as standing up from a sitting position and human gestures, fall into the category of short activities. Basic activities include walking, running, and walking upstairs. Complicated activities are combinations of basic activity sequences and interactions with other objects and persons. It is important to note that these three classes produce different patterns. However, some studies have achieved better results for short and basic activities using a single triaxial accelerometer and gyroscope, depending on the method used and the number of sensors employed. In this paper, we focus on recognizing short activities based on data collected from a curved piezoelectric sensor. The huge volume of sensor data cannot be analyzed easily without a machine learning algorithm. Machine learning is the most effective approach for ingesting a large amount of data and classifying it: it can readily identify the patterns present in the data and classify samples based on those patterns. Meanwhile, with the rapid development of wearable sensor technologies, the demand for high-accuracy data classification is growing exponentially. Machine learning algorithms are also used in wearable sensors with an activity recognition element that captures various activities for health monitoring. Several machine learning algorithms were trained on the collected data and their results compared.
Machine learning algorithms such as SVM [13], RF [14], Markov models [15], and k-NN [16] have long been used to tackle the HAR problem. SVM chooses the extreme points/vectors that help create the hyperplane; these extreme cases are called support vectors, hence the name SVM. Technically, the primary objective of the SVM algorithm is to find a hyperplane that discriminates data points of different classes: depending on which side of the hyperplane a data point falls, it is assigned to a different class. The number of input features in the dataset determines the hyperplane’s dimension. A random forest is composed of a set of decision trees, each built from a data sample drawn from the training set with replacement, called a bootstrap sample. The term “bootstrap” refers to re-sampling the original data to build new datasets in a proper and repeatable manner. The idea of the RF method is to merge many “weak learners” into a more robust model, i.e., a “strong learner”. k-NN is used for both regression and classification; by computing the distance between a test sample and all of the training samples, k-NN identifies the correct class for the test data. k-NN is a lazy learning algorithm: it only stores the training data, and the saved data are later used to evaluate new query data.
K-mers are often used in bioinformatics to control the quality of generated sequences [17], classify metagenomics data [18], and estimate genome size [19]. Haodong Yan et al. [20] developed a method called DeepTE that uses CNNs to classify unknown transposable elements (TEs). DeepTE uses K-mer counts to transform sequences into input vectors and can identify domains inside TEs to correct misclassifications. F.P. Breitwieser et al. [18] presented an approach that integrates fast K-mer-based classification with an efficient algorithm for counting the distinct K-mers present in each species in metagenomics samples. They demonstrated that using distinct K-mer counts improves species identification accuracy and that K-mer counts are very effective for detecting false positives. Machine learning approaches combined with K-mers have recently been found to perform better on pattern recognition problems [21,22,23,24]. These methods extract the frequencies of fixed-length-k substrings from DNA sequences. This research focuses on combining K-mer frequencies with supervised machine learning algorithms to enhance efficiency. We believe that integrating K-mers with machine learning algorithms will increase the classification accuracy of the sensor data.
In this paper, we introduce machine learning algorithms enabled with K-mers. For human gesture recognition (HGR) of gestures such as elbow movement, wrist turning, wrist bending, coughing, and neck bending, we used a curved piezoelectric sensor to collect the data. Each human gesture exhibits a different pattern in the sensor signals. The collected sensor data were then analyzed to investigate the recognition of these patterns. Our aims were
- (i). To determine whether our sensor could be used to classify human gestures.
- (ii). To develop various machine learning algorithms with K-mer frequencies, modify the input parameters, and compare the performance of each classifier.
- (iii). To choose, based on classification performance, the parameter combination that yields the highest accuracy for sensor data classification.
The structure of the paper is as follows. Section 2 provides information about related studies. The procedure for data collection and processing is described in Section 3, while Section 4 shows the results of various machine learning approaches with varying parameters for gesture detection and classification. Section 5 presents the performance and accuracy of the models. Section 6 concludes the article.
2. Related Work
Serkan Ball et al. [25] developed a method that combines principal component analysis (PCA) with machine learning algorithms (RF, SVM, C4.5, and k-NN) and compared their performances. The combination of PCA and RF provides an effective clustering feature extraction approach as well as better classification accuracy. The PCA approach was used to reduce the dimensionality of the features, improving the RF classifier’s classification performance and reducing the variance of features in the datasets.
Nadeem Ahmed et al. [26] developed a hybrid feature selection approach that employs a filter and a wrapper method. The process uses sequential floating forward search (SFFS) to extract desirable features for enhanced activity recognition. The features are then fed to a multiclass support vector machine (SVM) to create nonlinear classifiers using the kernel method for training and testing.
Enda Wista Sinuraya et al. [27] used ensemble empirical mode decomposition (EEMD) and the Hilbert–Huang transform (HHT) to enhance the feature extraction approach. This technique applies a nonlinear approach in the frequency domain. The data were classified using SVM, naive Bayes, and RF; the RF classifier produced the best accuracy.
B Vidhya et al. [28] presented a wearable multi-sensor HAR system using discrete wavelet transform (DWT) and empirical mode decomposition (EMD) for feature vector extraction. Four machine learning algorithms, namely SVM, k-NN, an ensemble classifier (EC), and a decision tree (DT), were trained to identify a variety of human activities using the discriminative statistical features from DWT along with the entropy features from EMD.
Ashhim et al. [1] created prediction models using sample data provided by smartphone sensors and supervised machine learning techniques. The logistic regression technique achieved the best classification rate in their experiment, at 95.995%.
Ahmad Jalal et al. [29] proposed a method for recognizing human activities in video using skeletal joints. A depth camera was used to collect data and train hidden Markov models for each activity. Their experimental results showed good performance, with a mean recognition rate of 93.58%.
Akram Bayat et al. [30] proposed SVM, MLP, and random forest models, as well as a combination of classifiers, achieving a best accuracy of 91.15%.
Lu Xu et al. [31] employed the RF model for recognizing human activity, in which accelerometer data gathered from a wearable device are used as input. The RF model for human activity recognition was constructed, and the algorithm was designed and analyzed; they achieved an overall accuracy of 90%.
Song-Mi Lee et al. [32] provided a 1D CNN-based technique for identifying human activity using triaxial accelerometer data collected from a smartphone. The acceleration values for the three axes are converted into vector magnitude data and used as input for learning the 1D CNN. The accuracy of their 1D CNN-based technique was 92.71%.
Zong Liu et al. [33] developed an effective approach called reduced kernel k-nearest neighbors. The k-NN model is modified in this work to improve the classification accuracy: the input data are transformed into high-dimensional features using the kernel approach, which significantly improves the classification performance. The accuracy is 91.60% on the human activities and postural transitions (HAPT) dataset and 92.67% on the smartphone dataset.
Agus Eko Minatno et al. [34] used a public HAR dataset with static and dynamic activities containing 563 features, 561 of which were selected; the data were retrieved from the accelerometer and gyroscope sensors embedded in a smartphone. They used various machine learning algorithms for HAR, such as logistic regression, decision tree, RF, SVM, and k-NN. An accuracy of 98.96% was achieved by SVM with an RBF kernel.
Michele Alessandrini et al. [35] used accelerometer and PPG sensor data from a publicly accessible data collection and constructed an RNN for the detection of human activity. The RNN was subsequently transferred to an embedded system based on an STM32 microcontroller, using a dedicated toolkit for porting the network model to the specific architecture. The results indicate that the test data are classified with an accuracy of more than 95%.
Lin-Tao Duan et al. [36] built a motion tracker using two motion sensors to capture five types of limb activity. The fast Fourier transform was employed to extract motion features from the frequency domain of the sensor data, and a subset of the features was chosen as the feature vector. Three supervised algorithms, namely naive Bayes (NB), k-nearest neighbor (k-NN), and artificial neural networks (ANNs), were used to classify the human lower limb movements, with recognition rates of 97.01%, 96.12%, and 98.21%, respectively.
Yu-Ta Yao et al. [22] developed K-mer-based pattern recognition (KPR) for keyboard inspection. Their paper analyzed image patterns and classified them based on K-mer frequency; KPR was applied to encrypt the image patterns of the light-emitting letters on the keyboard.
Yu-Ta Yao et al. [24] proposed a novel way to enhance the efficiency of finding KPR parameters through two-stage multi-fidelity design optimization. The proposed strategy was more efficient than searching for the best design parameters over the entire range of input images.
4. Results
K-mers are substrings of length k contained within a biological sequence in bioinformatics. Herein, we developed machine learning models embedded with K-mers for signal classification based on pattern recognition. For example, if the length of a sequence is L and we choose a K-mer subsequence length k, sliding a window from left to right and shifting one character at a time, we obtain L − k + 1 K-mers. Binarization is a method to reduce unnecessary data, and we adopted it to convert the signal data into 0s and 1s. K-mer frequencies were then extracted from the binarized data, and the accuracy of the results was examined.
Figure 3 shows the schematic diagram of K-mer-based signal encoding process.
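The encoding process above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the mean-value binarization threshold and the function names are assumptions, and the toy signal stands in for the real piezoelectric data.

```python
from collections import Counter

def binarize(signal, threshold=None):
    """Convert a raw signal into a 0/1 string; threshold defaults to the mean (an assumption)."""
    if threshold is None:
        threshold = sum(signal) / len(signal)
    return "".join("1" if v >= threshold else "0" for v in signal)

def kmer_frequencies(sequence, k):
    """Count every overlapping length-k substring: L - k + 1 windows in total."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# Toy signal standing in for one sensor trace.
signal = [0.1, 0.9, 0.8, 0.2, 0.7, 0.1, 0.9, 0.3]
bits = binarize(signal)
freqs = kmer_frequencies(bits, 4)
assert sum(freqs.values()) == len(bits) - 4 + 1  # L - k + 1 K-mers
```

The resulting `freqs` counter is the kind of fixed-length feature vector (one count per observed K-mer) that the classifiers below consume.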
We started the data processing with SVM, RF, and k-NN through K-mer frequency analysis. Herein, we cut the data into several parts, such as 20, 40, 60, 80, and 100, to increase the number of input samples. Different subsequence lengths, such as 2, 4, 6, 8, and 10, were employed to analyze the data. We then ran the proposed algorithms with different combinations of input parameter values and analyzed their classification accuracy. Machine learning algorithms must be trained to update each model parameter; consequently, it is essential to define the test set and the training set. In this work, the k-fold cross-validation method was used to evaluate the classification accuracy of the proposed models. In this approach, k determines how many times the dataset is split; in most cases, k = 10 is used. In detail, the dataset is divided into k equal samples, of which one sample is used as the test set and the remaining (k − 1) as training sets (overall, 20% of the data was used for testing and 80% for training, i.e., a 20–80 test–train split).
Figure 4 depicts the schematic diagram of 10-fold cross-validation.
Table 1 shows the accuracy of the 10-fold cross-validation for each fold.
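The 10-fold evaluation described above can be sketched with scikit-learn. This is a hedged illustration on synthetic stand-in data: the real inputs are the K-mer frequency vectors, and the array sizes and class count here are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X = np.random.RandomState(0).normal(size=(100, 16))  # stand-in for K-mer frequency vectors
y = np.arange(100) % 5                               # five gesture classes, 20 samples each

# cv=10: each fold is held out once as the test set while the remaining
# nine folds are used for training; the result is one accuracy per fold.
scores = cross_val_score(SVC(C=10), X, y, cv=10)
print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```

Averaging the ten per-fold accuracies (and taking their standard deviation) yields values in the form reported in Table 1.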
SVM is a supervised machine learning technique that builds a hyperplane with the maximum possible margin to classify the data [37]. A kernel function can be introduced to project the data into a higher dimension and make them linearly separable. The penalty coefficient (C) is a parameter used to tune the SVM accuracy. There is no rule for selecting a C value; it depends entirely on the data. We tested different C values and selected the one that gave the best classification accuracy. Initially, C = 0.01 yielded very low accuracy; as C increased from 0.01 to 0.1, 1, and 10, the accuracy also increased. Finally, an accuracy of 94.11 ± 0.3% was achieved at C = 10. The average precision, recall, and F-score for the K-mer-based SVM were each 0.941 ± 0.003.
Figure 5 shows the accuracy results for the K-mer-based SVM model with the various numbers of cuts, different subsequence lengths, and different C values.
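The C-value sweep described above can be expressed as a simple loop over cross-validated scores. The code is a sketch on synthetic features, not the paper's pipeline; the RBF kernel and data shapes are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X = np.random.RandomState(42).normal(size=(100, 16))  # stand-in feature matrix
y = np.arange(100) % 5                                # five gesture classes

best_C, best_acc = None, -1.0
for C in [0.01, 0.1, 1, 10]:  # the penalty values tested in the paper
    # Mean 10-fold accuracy for this penalty value.
    acc = cross_val_score(SVC(C=C, kernel="rbf"), X, y, cv=10).mean()
    if acc > best_acc:
        best_C, best_acc = C, acc
print(f"best C = {best_C}, mean accuracy = {best_acc:.3f}")
```

On the real K-mer features this sweep selected C = 10; on synthetic data the winner is arbitrary.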
The RF method is made up of a group of trees, each of which is built using randomly sampled data with the same statistical distribution as the other trees in the forest [38]. Voting is typically used to decide class labels in the forest for classification. We tested different n_estimators values (20, 40, 60, 80, 100) and max_depth values (10, 20, 30, 40, 50) to determine which combination gives the best classification accuracy. Finally, an accuracy of 97.18 ± 0.4% was achieved at n_estimators = 100 and max_depth = 50. The average precision, recall, and F-score for the K-mer-based RF model were each 0.9718 ± 0.004.
Figure 6 shows the accuracy results for the K-mer-based RF model with various numbers of cuts, different subsequence lengths, various n_estimators, and max_depth values.
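The grid over n_estimators and max_depth can be sketched as a nested loop, again on synthetic stand-in data rather than the actual K-mer features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X = np.random.RandomState(0).normal(size=(100, 16))  # stand-in feature matrix
y = np.arange(100) % 5                               # five gesture classes

best = {"n_estimators": None, "max_depth": None, "acc": -1.0}
for n in [20, 40, 60, 80, 100]:          # grid values from the paper
    for depth in [10, 20, 30, 40, 50]:
        clf = RandomForestClassifier(n_estimators=n, max_depth=depth,
                                     random_state=0)
        acc = cross_val_score(clf, X, y, cv=10).mean()
        if acc > best["acc"]:
            best = {"n_estimators": n, "max_depth": depth, "acc": acc}
print(best)
```

On the real sensor data this grid favored the largest setting (n_estimators = 100, max_depth = 50); scikit-learn's `GridSearchCV` would express the same search more compactly.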
k-NN classifies a new sample into the class it most closely resembles by comparing it with the existing dataset [39]. The k-NN algorithm uses the concept of similarity between the training set and new data to classify the data: it stores all existing data and labels new data points based on their similarity. We used K values of 1, 3, 5, 7, 9, and 10 to determine the best classification accuracy. Finally, an accuracy of 96.90 ± 0.5% was achieved at K = 1. The average precision, recall, and F-score for the K-mer-based k-NN model were each 0.969 ± 0.005.
Figure 7 shows the accuracy results for the K-mer-based k-NN model with various numbers of cuts, different subsequence lengths, and different k values. We also used a public-domain dataset [40] including running and walking activities and analyzed its classification performance with our algorithms, which achieved accuracies of 91.62 ± 0.4%, 91.85 ± 0.4%, and 92.45 ± 0.5% for SVM, RF, and k-NN, respectively. Moreover, to test robustness, we used ECG detector data [41] from the public domain, combined them with our sensor dataset, and analyzed the classification performance. We achieved promising accuracies of 95.41 ± 0.5%, 98.01 ± 0.3%, and 98.12 ± 0.4% for SVM, RF, and k-NN, respectively.
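The K-value sweep for the K-mer-based k-NN model follows the same pattern as the other two classifiers. The sketch below uses synthetic features; K = 1 was the best value on the real sensor data, but on arbitrary data the winner will differ.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X = np.random.RandomState(1).normal(size=(100, 16))  # stand-in feature matrix
y = np.arange(100) % 5                               # five gesture classes

# Mean 10-fold accuracy for each neighbor count tested in the paper.
results = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                              X, y, cv=10).mean()
           for k in [1, 3, 5, 7, 9, 10]}
best_k = max(results, key=results.get)
print("accuracy per k:", results)
print("best k:", best_k)
```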
Table 2 provides the best classification parameters used for the final models and an accuracy analysis of the machine learning algorithms with and without the subsequence length. Furthermore, Table 3 shows a comparative analysis of HAR using various machine learning algorithms.
6. Conclusions
The primary goal of this work was to develop a trustworthy approach for HGR using wearable sensors. We developed a novel piezoelectric sensor for HGR covering elbow movement, wrist turning, wrist bending, coughing, and neck bending. The dataset collected from the piezoelectric sensor was used as input for the classifiers. We classified these human gestures using three different machine learning models enabled with K-mers. Fine-tuned machine learning models were run on the data with different parameters to obtain the best model. The proposed method achieved remarkable accuracies of 94.11 ± 0.3%, 97.18 ± 0.4%, and 96.90 ± 0.5% for SVM, RF, and k-NN, respectively. Confusion matrices were computed for the HGR classification. Finally, human gestures can be correctly recognized using our proposed models. In the future, we plan to develop a dataset with even more complicated activities and analyze its pattern recognition, and we would like to apply these algorithms to a much larger dataset, preferably with different sensor data. Furthermore, future research can focus on real-time data classification in different environments and on robust human activity recognition systems.