1. Introduction
Technology for estimating the location of workers in indoor environments has been studied for accident prevention and convenience at construction and industrial sites. Furthermore, indoor localization systems are being developed for the realization of the fourth industrial revolution [1, 2, 3]. For example, if a worker attempts to enter a dangerous area, an indoor localization system can prevent accidents by estimating the location of the worker and warning them. Moreover, human–robot collaboration, in which human workers and robots pool their skills for flexible manufacturing, has recently been noted as a future manufacturing trend [4]. An indoor localization system is therefore considered necessary for the realization of safe human–robot collaboration.
Because global navigation satellite systems are not available indoors, various indoor localization systems using radio signals, such as Wi-Fi, Zigbee, RFID, Bluetooth, ultra-wideband (UWB) radar, and frequency-modulated continuous wave (FMCW) radar, have been introduced [5, 6, 7, 8, 9, 10, 11]. The multilateration method and the fingerprint method are well-known localization techniques that use radio signals such as Wi-Fi, Zigbee, RFID, and Bluetooth. Because the multilateration method is based on range estimation, its localization performance depends on the accuracy of the estimated distance to the target [12]. In the fingerprint method, meanwhile, the received signal strength of radio signals is collected at all points of interest, and the real-time data at a specific position are correlated with the precollected data to estimate the location of that position [13]. However, the range and the localization accuracy of both the multilateration and the fingerprint methods with Wi-Fi, Zigbee, and Bluetooth are known to be inferior to those of localization schemes that use radars, which provide high time resolution for estimating distances and locations [11, 14, 15, 16]. It is also known that the UWB radar has the disadvantage of limited coverage relative to the FMCW radar, which measures the difference between the transmission and reception frequencies caused by the time delay and then calculates the distance to the target from that difference [14, 15, 16]. Indoor localization using FMCW radars has been introduced in various frequency bands (e.g., 24 GHz and 77 GHz), and the performance varies depending on the frequency band. In most cases, with high bandwidth, the hardware is calibrated to increase target detection accuracy.
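The FMCW ranging principle mentioned above can be illustrated with a short sketch. For a linear chirp that sweeps a bandwidth B over a sweep time T, a target at distance R delays the echo by 2R/c and therefore produces a beat frequency of 2RB/(cT). The function names and the parameter values below (a 250 MHz sweep over 1 ms) are illustrative assumptions and are not taken from the radar used in this study.

```python
C = 299_792_458.0  # speed of light in m/s


def beat_frequency(distance_m: float, bandwidth_hz: float, sweep_time_s: float) -> float:
    """Beat frequency (Hz) of a linear chirp reflected by a target at distance_m."""
    round_trip_delay_s = 2.0 * distance_m / C
    chirp_slope = bandwidth_hz / sweep_time_s  # frequency change per second
    return chirp_slope * round_trip_delay_s


def distance_from_beat(beat_hz: float, bandwidth_hz: float, sweep_time_s: float) -> float:
    """Invert the relation above: distance (m) from a measured beat frequency."""
    return C * beat_hz * sweep_time_s / (2.0 * bandwidth_hz)


def range_resolution(bandwidth_hz: float) -> float:
    """Smallest range difference (m) that a sweep of this bandwidth can resolve."""
    return C / (2.0 * bandwidth_hz)


if __name__ == "__main__":
    # Hypothetical chirp parameters, chosen only for illustration.
    bandwidth_hz, sweep_time_s = 250e6, 1e-3
    f_beat = beat_frequency(4.0, bandwidth_hz, sweep_time_s)
    print(f"beat frequency for a 4 m target: {f_beat:.1f} Hz")
    print(f"recovered distance: {distance_from_beat(f_beat, bandwidth_hz, sweep_time_s):.3f} m")
    print(f"range resolution: {range_resolution(bandwidth_hz):.3f} m")
```

Because the range resolution depends only on the sweep bandwidth, a wider sweep separates nearby reflectors more finely.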
In conventional two-dimensional (2D) localization schemes that use FMCW radar, multilateration methods that use time of flight (TOF) and a joint TOF and direction of arrival (DOA) scheme have been introduced [11, 17]. State-of-the-art conventional schemes can provide relatively low errors in location estimation if the distance to the target is accurately estimated. However, the distance estimation tends to be somewhat inaccurate due to random occurrences in indoor environments [15, 16]. To overcome this limitation, a distance estimation scheme that exploits the deep learning technology of artificial neural networks was introduced in [18] to improve the accuracy of distance estimation. By applying deep learning technology to the data received by the FMCW radar, the data can be classified in terms of different distances to the target even in the presence of noise and clutter, and thus an accurate distance can be estimated.
We propose a deep learning-based indoor 2D localization scheme using 24 GHz FMCW radar. In the proposed scheme, the deep learning technology of artificial neural networks is employed to overcome the limitations of the conventional 2D localization scheme based on multilateration methods. We also consider two different models, which are the deep neural network (DNN) and the convolutional neural network (CNN), and two different numbers of FMCW radars to analyze the performance of the proposed scheme. The performance of the proposed scheme is evaluated experimentally and compared with the conventional 2D localization scheme that is based on multilateration under the same conditions.
The remainder of this paper is organized as follows: In Section 2, we briefly describe the 2D localization system using FMCW radar and review the conventional scheme. In Section 3, the proposed scheme is presented in detail. In Section 4, the performance of the proposed scheme is evaluated experimentally and then compared with the conventional scheme under the same conditions, and the results are discussed. Finally, the conclusions of the study are summarized in Section 5.
3. Proposed Scheme
In this section, we propose a deep learning-based indoor 2D localization scheme using a 24 GHz FMCW radar. In the proposed scheme, the deep learning technology of artificial neural networks is employed to overcome the limitations of the conventional 2D localization scheme based on multilateration methods. To achieve enhanced localization performance, two different models—the DNN and the CNN model—are proposed using different numbers of FMCW radars.
Firstly, we propose the DNN model, which includes two cases. In the first case, the data collected with two FMCW radars are used, and the input layer is set to 128 units because it combines the two pieces of collected data for each point. Note that the FFT data from a single radar are collected in an array of size 1 × 64. In the second case, the data collected with only one radar are used, and the input layer is set to 64 units; here, we attempt to estimate the 2D location of the target from the pattern of a single radar. The two DNN model cases consist of the same layers, except for the number of units in the input layer.
Figure 7 shows a network configuration diagram of the proposed DNN model where (1) represents the case with only one piece of radar data in the input layer, and (2) represents the case with two pieces of radar data. Note that the hyperparameter configurations of both cases are the same, as summarized in Table 1. Based on the experience accumulated from previous studies, we selected the best-performing hyperparameters [18].
Secondly, we propose a 1D CNN model, which consists of two cases. The CNN is known to be a model that can be trained while retaining the spatial information of an image [21, 22]. Because CNNs exhibit excellent performance by extracting features from raw data during image classification, the 1D CNN was recently developed to reduce the computational complexity of processing 1D signals [23]. Similar to the proposed DNN model, the proposed 1D CNN model has two cases, in which the number of CNN channels is set to one and two, respectively, and the data are treated as 1D or 2D input accordingly. The input shape for the CNN model using one channel is set to (64,1) in 1D form, while the input shape for the CNN model using two channels is set to the 2D form of (64,2). These two cases for the CNN model consist of the same layers except for the input shape, as shown in Figure 8. The hyperparameters of the CNN model are the same as those of the proposed DNN model, as summarized in Table 1.
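The four input configurations described above can be sketched as follows. This is a minimal illustration in plain Python rather than the code used in the experiments: the helper names are hypothetical, and the order in which the two radar arrays are joined (radar 1 first) is an assumption, since only the resulting sizes of 64, 128, (64,1), and (64,2) are stated.

```python
from typing import List, Optional, Sequence

FFT_BINS = 64  # the FFT data from a single radar form a 1 x 64 array


def _check(fft: Sequence[float]) -> List[float]:
    """Validate one radar's FFT array and return it as a new list."""
    if len(fft) != FFT_BINS:
        raise ValueError(f"expected {FFT_BINS} FFT bins, got {len(fft)}")
    return list(fft)


def dnn_input(radar_1: Sequence[float],
              radar_2: Optional[Sequence[float]] = None) -> List[float]:
    """Flat input vector: 64 values for one radar, 128 for two radars.

    Assumption: the two arrays are joined end to end, radar 1 first.
    """
    vector = _check(radar_1)
    if radar_2 is not None:
        vector += _check(radar_2)
    return vector


def cnn_input(radar_1: Sequence[float],
              radar_2: Optional[Sequence[float]] = None) -> List[List[float]]:
    """Channel-last input of shape (64, 1) for one radar or (64, 2) for two."""
    channels = [_check(radar_1)]
    if radar_2 is not None:
        channels.append(_check(radar_2))
    # Row i holds FFT bin i of every channel.
    return [[channel[i] for channel in channels] for i in range(FFT_BINS)]


if __name__ == "__main__":
    fft_a = [0.1 * i for i in range(FFT_BINS)]  # placeholder data, not measurements
    fft_b = [0.2 * i for i in range(FFT_BINS)]
    print(len(dnn_input(fft_a)), len(dnn_input(fft_a, fft_b)))
    two_channel = cnn_input(fft_a, fft_b)
    print(len(two_channel), len(two_channel[0]))
```

In the two-channel layout, the values of both radars at the same FFT bin sit side by side, so a 1D convolution slides over the bins while seeing both radars at once.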
The same dataset is used for both the DNN and the CNN models. As mentioned in Section 2.1, we collected a dataset of 5000 pieces of FFT data for the 25 points and split it into training data, validation data, and test data. The training data and the validation data were used for algorithmic learning, while the test data were used to evaluate the performance of the proposed model and were not involved in learning. The 5000 pieces of data collected through experiments were first divided into learning and test data at a ratio of 8:2. Subsequently, the learning data were again divided into training and validation data at a ratio of 8:2. In other words, the model used 3200 pieces of training data and 800 pieces of validation data, while 1000 pieces of test data were used for the performance evaluation of the proposed model.
Figure 9 shows a diagram of the dataset split.
Furthermore, the 5000 pieces of data were divided randomly on a per-class basis across the 25 classes, because a split that ignores the class labels can leave one of the 25 classes empty in a subset. The data were therefore first separated into the 25 classes and then shuffled using a random function in TensorFlow.
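The split described above can be sketched as a per-class procedure. This is an illustration under stated assumptions, not the original code: it uses Python's standard `random` module in place of the TensorFlow random function, the function name is hypothetical, and it assumes that the 5000 samples are spread evenly over the 25 classes (200 per class), which is not stated explicitly.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(labels: Sequence[int],
                     test_ratio: float = 0.2,
                     val_ratio: float = 0.2,
                     seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Return (train, validation, test) sample indices, split class by class.

    The data are first divided 8:2 into learning and test data, and the
    learning data are divided 8:2 again into training and validation data.
    """
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train: List[int] = []
    val: List[int] = []
    test: List[int] = []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # randomize within the class
        n_test = round(len(indices) * test_ratio)
        n_val = round((len(indices) - n_test) * val_ratio)
        test += indices[:n_test]
        val += indices[n_test:n_test + n_val]
        train += indices[n_test + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Assumed layout: 25 classes with 200 samples each (5000 in total).
    labels = [cls for cls in range(1, 26) for _ in range(200)]
    train, val, test = stratified_split(labels)
    print(len(train), len(val), len(test))
```

Splitting inside each class guarantees that every one of the 25 points is represented in the training, validation, and test subsets.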
4. Performance Evaluation
After the DNN and the CNN models were trained using the experimental data, validation and testing were carried out. For the DNN model, there were two cases, which used data from only one radar (DNN_radar_1) or from both radars (DNN_radar_2). For the CNN model, the two cases had either one channel (CNN_channel_1) or two channels (CNN_channel_2). A graphical representation of the validation accuracy of each model is shown in Figure 10. As shown in the figure, the DNN_radar_1 model achieved a validation accuracy of approximately 56%, whereas the DNN_radar_2 model achieved a validation accuracy of approximately 80%. Similarly, the validation accuracy of the CNN_channel_1 model was approximately 65%, whereas that of the CNN_channel_2 model was approximately 90%.
Table 2 shows a comparison of the validation accuracy and the average localization error using test data for both the conventional scheme and the proposed schemes. As shown in the table, the average localization error of the DNN_radar_1 model is approximately 1.30 m, while that of the DNN_radar_2 model is approximately 0.89 m. Both values exceed the average localization error of 0.53 m obtained with the conventional scheme, so the proposed DNN scheme does not improve on the conventional scheme. However, it is worth noting that the 2D location of the target can be estimated using a single radar in the proposed scheme, while two radars are required in the conventional scheme.
As for the proposed CNN schemes, the average localization error of the CNN_channel_1 model was approximately 0.77 m, while that of the CNN_channel_2 model was approximately 0.23 m. According to the results, the proposed CNN scheme with two FMCW radars can provide enhanced localization performance compared with the conventional scheme as well as the other proposed schemes. Therefore, even when using the same data set, the average localization error can be reduced from 0.53 m to 0.23 m by using the proposed CNN scheme with two FMCW radars. Although the proposed CNN scheme improves 2D localization performance with a single radar compared with the DNN scheme, it is worth noting that, for both models, the schemes with two radars achieve better localization performance than the schemes with one radar.
Meanwhile, the validation accuracy graphs in Figure 10 show that the CNN model exhibits less variance as the epochs increase, while the DNN model exhibits greater variance. This is because the CNN model, unlike the DNN model, maintains the geometry of the input/output data in each layer, and thus it can effectively recognize features with neighboring values while retaining spatial information in the data [24]. Therefore, we conclude that the proposed CNN models provide more efficient learning than the DNN models, and this results in higher accuracy due to effective feature extraction. By comparing the performance of the four models, we also conclude that the CNN model with two channels was the most accurate and had the lowest average error.
Figure 11 shows the estimated location of the target by the CNN model with two channels for the test data. As shown in the figure, the estimated location was very close to the ground truth point for all 25 points.
Figure 12 shows the average localization error for the proposed CNN model with two channels. In the figure, class 1 indicates the point (1,1), and class 2 indicates the point (1,2), while class 25 represents the point (5,5). Contrary to the results of the conventional scheme in Figure 6, the average localization error does not increase as the distance between the target and the radars increases, and the total average estimation error was approximately 0.23 m.
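The class-to-point mapping above suggests how a localization error can be obtained from a classifier output, as sketched below. Several details are assumptions made for illustration: the intermediate classes are taken to follow the same row-by-row order as classes 1, 2, and 25, the error is taken to be the Euclidean distance between the point of the predicted class and the ground truth point, and the grid spacing `spacing_m` is a hypothetical parameter because the physical spacing is not given in this section.

```python
import math
from typing import Sequence, Tuple

GRID_SIZE = 5  # 25 classes arranged as a 5 x 5 grid of points


def class_to_point(class_id: int) -> Tuple[int, int]:
    """Map class 1..25 to a grid point: 1 -> (1, 1), 2 -> (1, 2), 25 -> (5, 5)."""
    if not 1 <= class_id <= GRID_SIZE * GRID_SIZE:
        raise ValueError(f"class_id must be in 1..{GRID_SIZE * GRID_SIZE}")
    first, second = divmod(class_id - 1, GRID_SIZE)
    return first + 1, second + 1


def average_localization_error(true_classes: Sequence[int],
                               predicted_classes: Sequence[int],
                               spacing_m: float = 1.0) -> float:
    """Mean distance (m) between predicted and ground truth points.

    spacing_m is an assumed uniform distance between neighboring grid points.
    """
    if len(true_classes) != len(predicted_classes) or len(true_classes) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for true_id, predicted_id in zip(true_classes, predicted_classes):
        tx, ty = class_to_point(true_id)
        px, py = class_to_point(predicted_id)
        total += spacing_m * math.hypot(px - tx, py - ty)
    return total / len(true_classes)


if __name__ == "__main__":
    # Toy labels, not experimental results: one of three predictions is off by one point.
    print(average_localization_error([1, 2, 25], [1, 3, 25]))
```

With this definition, a correct classification contributes zero error, so the average error falls as the classification accuracy rises and as misclassifications land on points closer to the ground truth.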
Figure 13 shows a comparison of the average localization error for the conventional scheme and the proposed CNN scheme using two channels. In the figure, the localization error is compared point by point. As shown in the figure, the average localization error of the proposed scheme is generally less than that of the conventional scheme, and the difference is remarkable in higher classes. Therefore, we can expect reliable localization performance from the proposed scheme, regardless of the real location of the target. Moreover, enhanced localization performance for remote points can be achieved using the proposed scheme compared with the conventional scheme.
5. Conclusions
In this paper, we proposed a deep learning-based indoor 2D localization scheme using a 24 GHz FMCW radar to achieve better localization accuracy than the conventional 2D localization scheme based on multilateration. In the proposed scheme, the DNN and CNN models with either one or two FMCW radars were employed to overcome the limitations of the conventional 2D localization scheme.
Experiments were conducted in the corridor of the general office building at Kwangwoon University, and the received data were collected to estimate the location of a human target, which was positioned at one of 25 different points within a monitoring area of 5 m × 6 m. According to the results, the 2D location of the target could be estimated with a single radar using the proposed scheme, while two FMCW radars were required for the conventional scheme. Furthermore, the proposed CNN scheme using two FMCW radars produced an average localization error of 0.23 m, whereas the conventional scheme using two FMCW radars produced an average localization error of 0.53 m. Even for the same data set, therefore, it was shown that the average localization error could be reduced from 0.53 m to 0.23 m by applying the proposed CNN scheme and using two FMCW radars. In addition, the localization error was compared point by point, and it was shown that the average localization error of the proposed scheme was generally lower than that of the conventional scheme, and the difference was more remarkable in higher classes. Therefore, we can expect reliable localization performance from the proposed scheme, regardless of the real location of the target. Moreover, enhanced localization performance was achieved for remote points using the proposed scheme relative to the conventional scheme.
In future research, we will develop a regression model with substantial training data for more accurate localization performance, and we will also conduct research for estimating the location of any targets not included in the training data.