1. Introduction
In recent years, location-based services (LBSs), both indoors and outdoors, have attracted growing attention and massive demand in industry and academia [1]. The successful application of Satellite Navigation Positioning Systems (SNPSs), such as the Global Positioning System (GPS) and the Galileo Navigation System, provides great convenience for travelers. However, in indoor or complex outdoor environments, GPS cannot provide accurate LBS [2]. The multiple sensors built into smartphones have brought new advances for indoor LBS. By using received signal measurements, localization with Wi-Fi or magnetic signals becomes possible [3].
Traditional localization methods rely on signal Time of Arrival (TOA), Time Difference of Arrival (TDOA), and Angle of Arrival (AOA) to determine the position of the User Equipment (UE). However, special equipment is needed to determine the signal round-trip time or angle, which makes these methods inconvenient and impractical in many applications. In contrast, most fingerprint-based positioning methods do not require any dedicated equipment or infrastructure and can be implemented with a single ubiquitous smartphone. In addition, the low-power sensors built into a smartphone consume little energy, even when continuously active [4].
As illustrated in Figure 1, the proposed fingerprint localization system consists of two phases: the offline phase and the online phase. During the offline phase, the UE collects a series of Wi-Fi Received Signal Strength Indications (RSSIs) from all access points (APs), or magnetic signal magnitudes, at known locations called Reference Points (RPs) to build a fingerprint database. Each RP therefore has its own fingerprint, containing the known location and the received RSSI or magnetic signal magnitude. Then, the proposed deep learning model is trained on the pre-constructed fingerprint database. In the online phase, the trained deep learning model matches the currently received signals against the fingerprint database, and the location of the UE is determined by the best-fitting RP [3,5].
The earliest fingerprint-based localization approaches relied on K-Nearest Neighbor (KNN) to find the RPs in the fingerprint database that best match the measured signals. Later, the Bayesian algorithm, Weighted K-Nearest Neighbors (WKNN), and Support Vector Machine (SVM) were proposed to improve the robustness of the positioning system [6,7,8]. In [9], a magnetic-based indoor subarea localization approach was proposed using an unsupervised learning algorithm. A multi-hop approach was leveraged to address localization inaccuracy [10].
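To make the KNN and WKNN matching step concrete, the following minimal sketch estimates a position as the inverse-distance-weighted mean of the k closest RPs. It illustrates the classical baseline only, not the method proposed in this paper; the function name `wknn_locate`, the small `eps` term, and the toy fingerprints are assumptions made for illustration.

```python
import math

def wknn_locate(database, query, k=3, eps=1e-6):
    """Estimate a position as the inverse-distance-weighted mean of the k
    reference points whose stored RSSI vectors are closest to `query`."""
    # database: list of ((x, y), rssi_vector) pairs, one per reference point
    ranked = sorted(database, key=lambda rp: math.dist(rp[1], query))[:k]
    # eps (an assumption) avoids division by zero on an exact match
    weights = [1.0 / (math.dist(rssi, query) + eps) for _, rssi in ranked]
    total = sum(weights)
    x = sum(w * pos[0] for w, (pos, _) in zip(weights, ranked)) / total
    y = sum(w * pos[1] for w, (pos, _) in zip(weights, ranked)) / total
    return x, y

# Hypothetical three-AP fingerprints (dBm) at four reference points.
db = [((0.0, 0.0), [-40, -70, -80]),
      ((5.0, 0.0), [-60, -50, -75]),
      ((0.0, 5.0), [-65, -72, -45]),
      ((5.0, 5.0), [-70, -55, -50])]
estimate = wknn_locate(db, [-42, -69, -79], k=2)
```

With `k=1` the function reduces to plain nearest-neighbor matching, which is why such methods are sensitive to RSSI noise: a single perturbed reading can change the selected RP.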
However, the main problem in achieving accurate fingerprint localization lies in signal fluctuation, such as the adverse impact of multipath fading and signal attenuation by furniture, walls, and people. In addition, accurate positioning requires collecting more RPs; therefore, the workload of constructing a fingerprint database tends to be tremendous. Consequently, the main challenge in fingerprint-based localization is how to develop a model that can extract reliable features and accurately map massive numbers of RPs with widely fluctuating signals [11]. The aforementioned localization approaches have shallow learning architectures, leading to limited representational ability, especially when dealing with massive and noisy data. Positioning with magnetic field strength (MFS) is also problematic. The discernibility of MFS decreases dramatically over a large area, which makes it impossible to use MFS directly for positioning.
In recent years, deep learning has made great progress in both academia and industry. Deep learning with multiple layers has outperformed other techniques in speech recognition, image classification, and so on [11,12]. Therefore, in this work, a deep residual network (Resnet) and transfer learning are introduced to develop a highly accurate localization system. Using MFS alone for localization is insufficient because of its low discernibility in a large area. Therefore, considering the outstanding performance of the density peak clustering (DPC) algorithm in feature selection, we propose a novel density peak clustering algorithm based on the comparison distance (CDPC) to select several center points of magnetic field strength (MFS), and then combine them with the Wi-Fi signal to improve the robustness of the proposed localization system. Owing to the state-of-the-art performance of deep learning in image classification, the Wi-Fi RSSI and the center points of MFS are converted into images to build the fingerprint image database.
In order to deal with signal fluctuation, a model with a strong learning ability should be designed. In this work, a two-level hierarchical training approach, containing a pre-training step and a fine-tuning step, is adopted to obtain the final deep learning model. After the fingerprint image dataset is constructed, the proposed Resnet is first trained on the dataset and returns a pre-trained model called the coarse localizer. Then, by leveraging the prior knowledge of the pre-trained model, multilayer perceptron (MLP)-based transfer learning is used to further train on the dataset and return a fine-tuned model called the fine localizer.
During the training phase, multiple data enhancement (augmentation) approaches are leveraged to improve the localization accuracy. The fingerprint dataset images are standardized to 224 × 224 pixels, so the model can more easily learn image features. In addition, some of the images are enlarged by a factor of 1.25 or randomly rotated by 15°. In batch normalization, a momentum term is added to reduce oscillation and accelerate convergence of the model. In addition, the learning rate (LR) is dynamically adjusted to further optimize the model. For the matching phase, a probabilistic method is leveraged to indicate the accuracy of the localization system.
The main contributions of this paper can be summarized as follows: (1) The unsupervised learning CDPC algorithm is used for the first time to pick center points of MFS, which can represent the distribution of MFS at each RP. Positioning accuracy can be improved by combining Wi-Fi signals and the selected MFS. (2) Different from ordinary datasets, the selected MFS and Wi-Fi RSSI are transformed into images to form the fingerprint image dataset for localization. In order to develop a model with strong learning ability, a two-level hierarchical training architecture based on Resnet and MLP-based transfer learning is proposed for localization. (3) Considering the numerous classification points, we dynamically adjust the LR and adopt several data enhancement approaches to enhance the generalization ability of the deep neural network (DNN) model. (4) To verify the effectiveness of the proposed positioning system, experiments were conducted in both real indoor and outdoor environments. The experiments show that the proposed positioning system can achieve high-precision localization in both indoor and outdoor environments.
The rest of this paper is organized as follows: Section 2 describes related work. The proposed positioning system is presented in Section 3. The experiments are described in Section 4. Finally, Section 5 presents the conclusions and future work.
2. Related Work
The great demand for LBS has stimulated the development of localization techniques. Wi-Fi and magnetic signals are available in almost all indoor environments and can therefore be used for localization, which has aroused great interest among researchers [13].
Traditional measurement-based localization systems, such as TOA and TDOA, can determine the UE location. However, these approaches require line-of-sight (LOS) signal propagation, because they depend on trilateration. The localization accuracy deteriorates greatly in indoor environments, because the signal is often blocked by objects and refracted [14]. Fingerprint-based localization can overcome these drawbacks, and it has been proven to have a satisfactory localization performance [12]. Therefore, the fingerprint-based localization technique has attracted widespread attention. Basically, there are three kinds of fingerprints: visual fingerprints, motion fingerprints, and signal fingerprints [3]. Improved image and video processing abilities enable smartphones to handle massive visual searches over large visual fingerprint databases [15]. Applications such as Google Goggles and Vuforia Object Scanner have also been successful. With the support of motion sensors, such as accelerometers and electronic compasses, smartphones can identify the real-time dynamics of the UE. The basic idea of motion fingerprint localization is to combine accelerometer and compass measurements and match them with the pre-constructed motion fingerprint database to determine the UE location [16]. Signal fingerprint-based localization captures signals and matches them with the geotagged fingerprint database to determine the UE location [17].
The most commonly used signals are Wi-Fi signals and geomagnetic signals. Each Wi-Fi AP has a unique media access control (MAC) address, and its limited signal coverage (around 100 m) enables Wi-Fi signals to be widely used in localization [5]. However, as is shown in Figure 2, Wi-Fi signals can fluctuate over a wide range because of surrounding signal noise, multipath fading and so on, which may cause nearby locations to be confused in Wi-Fi-based positioning systems. Therefore, collecting more Wi-Fi signals with different MAC addresses can produce a higher positioning accuracy. Wi-Fi-based indoor localization systems typically achieve a localization accuracy of 5–10 m. In addition, for signals with low strength, the Wi-Fi scanning process may take several seconds to obtain all the Wi-Fi signals.
The magnetic field is rather stable over long periods, and it has outstanding spatial discernibility in a small area [18]. The sensors built into a smartphone can collect around 100 MFS data points per second. Researchers have found that MFS in indoor environments varies from 20 to 80 μT. However, the MFS at a given location varies similarly to that at nearby locations, so discernibility decreases dramatically over a large area, and MFS cannot be used directly for positioning. This paper investigates whether the CDPC algorithm can be used to pick out MFS center points to enhance the positioning accuracy.
In [19], KNN was leveraged to find the best match in the constructed fingerprint database. However, the experiments showed that the performance was not very satisfactory, because the system was sensitive to signal noise. In order to enhance the stability of the localization system, Bayesian-based filtering localization approaches were proposed in [20]. However, the traceability of the localization system was influenced by the filter. An SVM-based localization system that converts the localization problem into a classification problem was proposed in [21]. With the development of neural networks (NNs), researchers have leveraged shallow NN models for localization. However, these models have shallow structures and limited learning ability; therefore, they cannot handle large sets of widely fluctuating signals, and their localization performance is not very good [11]. The increase in computing power and the successful application of deep learning give researchers a new way to improve localization performance. One study [22] investigated the application of convolutional neural networks for localization. Another [11] used a stacked denoising autoencoder and a four-layer DNN to learn reliable features. In order to further increase the localization accuracy, [23] leveraged channel state information (CSI) and deep learning for localization. SVM and DNN were used for indoor and outdoor localization [24]. Using a convolutional neural network, a hybrid wireless fingerprint localization method was proposed for indoor localization [25]. However, additional expensive hardware is needed to acquire CSI, and the workload of data preprocessing is tremendous. Therefore, this approach is inconvenient and impractical [26].
Compared to other works, this work has three differences. First, the collected signal measurements are converted into grayscale fingerprint images for localization. Second, the unsupervised learning CDPC algorithm is used for the first time to find the center points of MFS, and the selected MFS values are leveraged to improve the localization performance. Third, a two-level hierarchical deep learning structure is leveraged to extract key features from massive, widely fluctuating Wi-Fi and magnetic signals. Additionally, MLP-based transfer learning is introduced to fine-tune the trained Resnet coarse localizer and obtain the fine localizer. In addition, our localization system requires no orientation information; therefore, there are no orientation requirements for the phone when localizing. Different from the aforementioned localization methods, our proposed method does not rely on additional expensive hardware, and the localization task can be realized with only a smartphone. Therefore, our proposed localization system is universal and cost-effective.
3. Proposed Solution
In this paper, we consider a typical localization environment in which a smartphone receives RSSI and MFS measurements from surrounding Wi-Fi APs and magnetic fields. As is shown in Figure 3, the purpose of localization is to find the location of the smartphone from the collected signal measurements. The localization system consists of six functional modules: data collection, data selection, data pre-processing, fingerprint image construction, DNN training and DNN localization. The multiple sensors built into smartphones make it possible to read Wi-Fi and MFS signals. The purpose of data selection is to use the CDPC algorithm to find the center points of MFS; by combining the selected MFS with Wi-Fi RSSI, the localization accuracy can be improved. The purpose of data pre-processing is to find signals with high strength and make them suitable for forming fingerprint images. The signal measurements are then converted into images to form the fingerprint image dataset, in which each entry contains a fingerprint image and its location. After the fingerprint image database is constructed, the proposed DNN is trained on it. Then, the DNN parameter database stores the trained localization model for online localization. In the online phase, the trained DNN model matches a newly constructed fingerprint image against the fingerprint image dataset to estimate the location. The DNN used in this paper includes Resnet and MLP-based transfer learning. In the following sections, we detail the implementation steps and corresponding algorithms of the proposed localization system.
3.1. The Proposed Data Selection Algorithm
For the magnetic field measurements, the unsupervised learning CDPC algorithm is used to select several center points to better reflect the distribution of MFS in each RP. Combining the selected MFS and Wi-Fi RSSI can improve the accuracy of the localization system.
Clustering by fast search and find of density peaks (DPC) is a representative density-based clustering algorithm. The basic idea of the DPC algorithm rests on two assumptions: (1) a cluster center is surrounded by points with a lower density; and (2) cluster centers are at a relatively large distance from any points of higher density [27].
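The two assumptions give the criteria for cluster centers and the test for potential cluster centers. Two important parameters, the density $\rho_i$ and the relative distance $\delta_i$, can be calculated.
A clustering dataset is $X = \{x_1, x_2, \ldots, x_N\}$, where each $x_i$, $i = 1, \ldots, N$, is a vector with $m$ attributes. $x_i$ can be expressed as $x_i = (x_{i1}, x_{i2}, \ldots, x_{im})$, and the Euclidean distance $d_{ij}$ for $x_i$ and $x_j$ can be represented as follows:
$$d_{ij} = \sqrt{\sum_{k=1}^{m} \left(x_{ik} - x_{jk}\right)^2}$$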
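As a minimal sketch of this first step, the pairwise Euclidean distances between all samples can be collected in a symmetric matrix. The function name `pairwise_distances` and the three-axis MFS samples are hypothetical and serve only as an illustration.

```python
import math

def pairwise_distances(points):
    """Return the symmetric matrix d[i][j] of Euclidean distances."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Square root of the summed squared attribute differences
            d[i][j] = d[j][i] = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
    return d

# Hypothetical three-axis MFS samples in microtesla.
samples = [(30.0, 10.0, 40.0), (33.0, 14.0, 40.0), (30.0, 10.0, 52.0)]
d = pairwise_distances(samples)
```

Only the upper triangle is computed and then mirrored, because the distance is symmetric and the diagonal is zero.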
After calculating the Euclidean distance, the DPC algorithm can be conducted by the following procedure.
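Define the local density $\rho_i$ of data point $x_i$:
$$\rho_i = \sum_{j \neq i} \chi\left(d_{ij} - d_c\right), \qquad \chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$
where $d_c$ is the cut-off distance and is usually a manually entered parameter, based on experience.
Suppose there are $N$ data points, and the distance between each pair of points is $d_{ij}$. These distances are sorted in ascending order. $d_c$ is the distance at position $\lceil pM \rceil$ in this order, where $M$ is the number of sorted distances, $p$ is the manual input percentage parameter and $\lceil \cdot \rceil$ is the ceiling function.
The idea of $\rho_i$ is to count the number of points in the data space that are less than $d_c$ from data point $x_i$.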
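The cut-off selection and the density count can be sketched as follows. This is an illustrative reading rather than the authors' implementation: the function names are hypothetical, the percentage is assumed to be applied to the number of distinct pairwise distances, and the 1-D magnitudes are made up.

```python
import math

def cutoff_distance(dist, p):
    """Pick d_c as the ceil(p * M)-th smallest of the M pairwise distances."""
    n = len(dist)
    ordered = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    position = max(1, math.ceil(p * len(ordered)))  # 1-based rank
    return ordered[position - 1]

def local_density(dist, d_c):
    """Count, for each point, the other points strictly closer than d_c."""
    n = len(dist)
    return [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
            for i in range(n)]

# Hypothetical 1-D MFS magnitudes (microtesla): two tight groups, one outlier.
values = [40.0, 40.5, 41.0, 55.0, 55.4, 70.0]
dist = [[abs(a - b) for b in values] for a in values]
d_c = cutoff_distance(dist, p=0.25)
rho = local_density(dist, d_c)
```

A small percentage keeps the cut-off within a single group, so isolated samples such as the outlier receive a density of zero.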
Traditional relative distance $\delta_i$: for each node $i$, a node $j$ with a higher density than $i$ can be found. Calculate the distance $d_{ij}$ between nodes $i$ and $j$, and define the smallest $d_{ij}$ as $\delta_i$. If node $i$ has the largest density, then $\delta_i$ is the maximum distance from that point to other points.
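The traditional relative distance can be sketched directly from this description. The function name is hypothetical, and the densities are assumed to have been computed beforehand for the same toy magnitudes.

```python
def relative_distance(dist, rho):
    """delta[i]: distance to the nearest point of strictly higher density;
    for a point of maximal density, the largest distance to any other point."""
    n = len(dist)
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            # No denser point exists, so fall back to the maximum distance
            delta.append(max(dist[i][j] for j in range(n) if j != i))
    return delta

# Hypothetical 1-D MFS magnitudes and their precomputed local densities.
values = [40.0, 40.5, 41.0, 55.0, 55.4, 70.0]
dist = [[abs(a - b) for b in values] for a in values]
rho = [1, 2, 1, 1, 1, 0]
delta = relative_distance(dist, rho)
```

In this toy case only the densest sample receives a large relative distance, which is what marks it as a candidate center.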
In this paper, we propose a comparable distance to improve on DPC’s second hypothesis. The DPC algorithm does not quantitatively compare $\delta_i$. Therefore, a new variable is chosen to replace $\delta_i$ and reflect the relative size in the algorithm. Based on the above conditions, an amount $\tau_i$, which is similar to $\delta_i$, is defined as follows:
$$\tau_i = \min_{j:\,\rho_j < \rho_i} d_{ij}$$
where $\tau_i$ represents the distance from point $i$ to the low-density area, which is a very suitable amount to compare with $\delta_i$.
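One possible reading of the distance to the low-density area is the distance from a point to its nearest neighbor of strictly lower density. The sketch below follows that reading; the name `lower_density_distance` and the fallback for points that have no lower-density neighbor are assumptions, not details taken from the paper.

```python
def lower_density_distance(dist, rho):
    """tau[i]: distance to the nearest point of strictly lower density."""
    n = len(dist)
    tau = []
    for i in range(n):
        lower = [dist[i][j] for j in range(n) if rho[j] < rho[i]]
        # Assumption: a point with no lower-density neighbour falls back to
        # its largest distance, mirroring the rule for the relative distance.
        tau.append(min(lower) if lower
                   else max(dist[i][j] for j in range(n) if j != i))
    return tau

# Hypothetical 1-D MFS magnitudes and their precomputed local densities.
values = [40.0, 40.5, 41.0, 55.0, 55.4, 70.0]
dist = [[abs(a - b) for b in values] for a in values]
rho = [1, 2, 1, 1, 1, 0]
tau = lower_density_distance(dist, rho)
```

For the densest sample this quantity is small, because sparser points lie close by, whereas its relative distance is large; comparing the two quantities is what separates centers from ordinary points.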
By hypothesis, a point with both a larger density and a larger relative distance is a cluster center. Hence, the local density $\rho_i$ and the comparative distance are calculated for each point. Figure 4 shows the decision graph for our experiments, from which several maxima are found. These maxima are used as the center points and reflect the overall distribution of the magnetic measurements.
3.2. Data Pre-Processing
The purpose of data pre-processing is to select signals with high strength and adapt them to an RGB image. To eliminate the adverse effect of weak Wi-Fi signals on localization, we selected the eight strongest Wi-Fi signals at each RP. In our proposed localization system, the fingerprint database is constructed from images, so the signal measurements must be adapted to image form. Generally, an ordinary RGB image contains three channel matrices, and the values in each matrix are between 0 and 255. Wi-Fi RSSI measurements are between −30 and −120 dBm. Thus, the Wi-Fi measurements are rescaled to fit this range.
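The two pre-processing steps can be sketched as below. The exact mapping from RSSI to pixel values is not recoverable from the text, so a linear min-max mapping of [−120, −30] dBm onto [0, 255] is assumed here; the function names and the scan values are hypothetical.

```python
def strongest_aps(scan, k=8):
    """Keep the k access points with the highest RSSI (closest to 0 dBm)."""
    ranked = sorted(scan.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])

def rssi_to_pixel(rssi_dbm, lo=-120.0, hi=-30.0):
    """Linearly map an RSSI in [lo, hi] dBm to an integer in [0, 255].
    Readings outside the range are clipped first (an assumption)."""
    clipped = min(max(rssi_dbm, lo), hi)
    return round((clipped - lo) / (hi - lo) * 255)

# Hypothetical scan: ten APs keyed by placeholder MAC labels.
scan = {f"ap{i:02d}": -40.0 - 7 * i for i in range(10)}
pixels = {mac: rssi_to_pixel(v) for mac, v in strongest_aps(scan).items()}
```

Under this mapping a stronger signal produces a brighter pixel, and the two weakest APs in the scan are discarded before the image is built.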
3.3. Fingerprint Image Construction
Different from other works that use raw signal data to construct the fingerprint database [13,16], this paper proposes a novel method to construct a fingerprint image dataset. Considering the impact of different data lengths and AP sets on localization accuracy, the fingerprint image construction module normalizes all the fingerprint images in each grid to the same size and AP set. This module is used in both the training and matching phases. The difference is that the fingerprint images are labeled in the training phase, whereas the label must be predicted in the matching phase.
Different from the traditional way of processing sequence data, we converted the collected data into fingerprint images for feature extraction. The collected sensor data contain a series of MFS and RSSI measurements from multiple APs. Generally, an ordinary image is a three-channel matrix with red, green, and blue channels. Therefore, to construct the fingerprint image, we need to rearrange the collected data.
In the proposed localization system, the constructed fingerprint images should be standardized to the same size. The fingerprint image $F$ is composed of a magnetic part $F_{\mathrm{M}}$ and a Wi-Fi RSSI part $F_{\mathrm{W}}$. The fingerprint image can be constructed as follows:
$$F = \left[\, F_{\mathrm{M}} \;\; F_{\mathrm{W}} \,\right], \qquad F_{\mathrm{M}} \in \mathbb{R}^{K \times 1}, \quad F_{\mathrm{W}} \in \mathbb{R}^{K \times L}$$
where $K$ is the number of center points selected by the CDPC algorithm, and it is equal to the number of RSSI measurements collected at each RP. $L$ is the number of APs detected in the localization area. Therefore, the MFS part $F_{\mathrm{M}}$ is stored as a $K \times 1$ vector. The Wi-Fi RSSI part $F_{\mathrm{W}}$ is stored as a $K \times L$ matrix. In this paper, $F$ is used to form the red, green, and blue channel matrices; therefore, the fingerprint image can be constructed. Then, the same method is used to form the fingerprint image dataset.
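The construction can be sketched as joining the MFS column with the RSSI matrix and repeating the resulting plane for the three color channels. The orientation of the two parts, the replication of one plane across all channels, and the function name are assumptions; the values are assumed to be already scaled to the 0 to 255 range.

```python
def build_fingerprint_image(mfs, rssi):
    """Join an MFS column and an RSSI matrix with the same number of rows
    into one plane, then repeat the plane for the R, G and B channels."""
    if len(mfs) != len(rssi):
        raise ValueError("MFS and RSSI parts need the same number of rows")
    plane = [[m_value] + list(row) for m_value, row in zip(mfs, rssi)]
    # Copy the rows so that the three channels do not share list objects
    return [[row[:] for row in plane] for _ in range(3)]

# Hypothetical RP with two MFS centers and three APs, already in pixel units.
mfs = [120, 131]
rssi = [[200, 90, 15],
        [198, 95, 10]]
image = build_fingerprint_image(mfs, rssi)
```

The resulting channel-first layout has shape 3 × 2 × 4 in this toy case; resizing to the network input size would follow as a separate step.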
3.4. The Proposed DNN Introduction
In this paper, the proposed DNN contains a Resnet-based coarse localizer and a transfer learning-based fine localizer. The DNN used in our localization system can automatically learn signal features and distinguish between the fingerprint features of different classification points. However, the collected dataset is rather small, which lowers the localization accuracy. Therefore, inspired by the idea of transfer learning, a two-level hierarchical training strategy is adopted. First, Resnet is trained on the fingerprint image database, and the resulting localization model is retained. Then, an MLP is added after the Resnet, and the new model is used for transfer learning.
3.4.1. Deep Residual Network Introduction
A DNN algorithm is proposed to predict the user equipment (UE) locations. Because we converted the locations into labels, the predicted results are the IDs of these labels. In addition, the proposed localization system consists of a Resnet-based coarse localizer and a transfer learning-based fine localizer.
With the development of deep learning, researchers have found that as the number of layers of a neural network increases, its learning ability increases. However, owing to the degradation problem, accuracy saturates and then decreases as the network goes deeper. This problem troubled researchers for a long time. To address it, [28] proposed the deep residual model, which successfully improved the learning ability of the network. As is shown in Figure 5, the residual model is constructed by adding a skip connection. The learning of the target map $\mathcal{H}(x)$ is transformed into learning the residual $\mathcal{F}(x) = \mathcal{H}(x) - x$, and learning $\mathcal{F}(x)$ is easier than learning $\mathcal{H}(x)$. By stacking multiple residual modules, the degradation problem of DNNs can be effectively alleviated and performance improved.
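The effect of the skip connection can be illustrated with a minimal residual module. In this sketch, two small dense layers stand in for the convolutional layers of a real residual block, and ReLU is used only as a placeholder activation; all names and values are hypothetical.

```python
def relu(v):
    """Element-wise rectifier, used here only as a placeholder activation."""
    return [max(0.0, x) for x in v]

def dense(weights, bias, v):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def residual_block(x, w1, b1, w2, b2, activation=relu):
    """Return activation(F(x) + x), where F consists of two dense layers."""
    f = dense(w2, b2, activation(dense(w1, b1, x)))       # residual branch
    return activation([fi + xi for fi, xi in zip(f, x)])  # skip connection

x = [0.5, 2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]
y_zero = residual_block(x, zeros, bias, zeros, bias)
y_identity = residual_block(x, identity, bias, identity, bias)
```

When the residual branch outputs zero, the module reduces to the identity mapping for non-negative inputs, which is why adding such modules should not make a deeper network harder to fit than a shallower one.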
Figure 6 shows the proposed Resnet model, which consists of one basic block 2, four basic blocks 2, three basic blocks 3, an average pooling layer, and one MLP layer. Each basic block is a residual module, and when overfitting occurs, the DNN skips some residual blocks and continues training. In this paper, SELU is used as the activation function, and cross-entropy loss is used as the loss function of the classifier. The detailed calculation process of the different layers can be seen in [29].
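For reference, the SELU activation and the cross-entropy loss can be written out as below. The constants are the standard SELU values; applying the loss on top of a softmax output is an assumption, since the classifier type is not recoverable from the text, and the function names are hypothetical.

```python
import math

SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit for a single value."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))

def softmax_cross_entropy(logits, label):
    """Cross-entropy of a softmax over `logits` against the true class index."""
    peak = max(logits)  # subtracting the maximum keeps exp() from overflowing
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]

# Four RP classes with equal scores give a loss of log(4).
uniform_loss = softmax_cross_entropy([0.0, 0.0, 0.0, 0.0], label=2)
```

The loss shrinks towards zero as the logit of the true RP label grows relative to the others, so minimizing it pushes the network to rank the correct RP first.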
3.4.2. Transfer Learning Introduction
Transfer learning has several merits. As shown in Figure 7, transfer learning has a higher start, a higher slope, and a higher asymptote. Therefore, to obtain the best localization model in this paper, a Resnet-based coarse localizer model and a transfer learning-based fine localizer model are used to maximize the localization accuracy. These two localizer models need to be trained separately. Specifically, Resnet is first trained on the fingerprint image dataset. After the training process is completed, the trained Resnet model is retained and an MLP is added after it for transfer learning. The MLP-based transfer learning model leverages prior information from the trained Resnet to maximize localization accuracy.
As is shown in Figure 8, MLP-based transfer learning is leveraged to fine-tune the Resnet and further increase the localization accuracy. First, the Resnet is trained on the fingerprint image database. After the training process is finished, we obtain a pre-trained model called the coarse localizer. Then, we retain the trained Resnet model and add an MLP after it. Finally, this newly constructed model is further trained on the fingerprint image database. This transfer learning-based model is used as the final localization model, called the fine localizer.
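The two-level training idea can be illustrated with a toy sketch in which a fixed feature function stands in for the pre-trained coarse localizer and a single softmax layer stands in for the MLP head. Whether the Resnet weights are frozen during the second stage is not stated, so freezing is an assumption here, as are all names and data values.

```python
import math

def coarse_features(pixels):
    """Hypothetical stand-in for the frozen, pre-trained coarse localizer."""
    return [p / 255.0 for p in pixels]

def train_head(features, labels, n_classes, epochs=200, lr=0.5):
    """Fit a softmax layer on fixed features by stochastic gradient descent."""
    dim = len(features[0])
    w = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            logits = [sum(wk * xk for wk, xk in zip(w[c], x)) + b[c]
                      for c in range(n_classes)]
            peak = max(logits)
            exps = [math.exp(z - peak) for z in logits]
            total = sum(exps)
            for c in range(n_classes):
                # Gradient of cross-entropy w.r.t. logit c: softmax - one-hot
                grad = exps[c] / total - (1.0 if c == y else 0.0)
                w[c] = [wk - lr * grad * xk for wk, xk in zip(w[c], x)]
                b[c] -= lr * grad
    return w, b

def predict(w, b, x):
    """Return the index of the class with the highest score."""
    scores = [sum(wk * xk for wk, xk in zip(row, x)) + bc
              for row, bc in zip(w, b)]
    return scores.index(max(scores))

# Hypothetical flattened fingerprint rows for three RP labels.
images = [[250, 10, 5], [240, 30, 20], [15, 245, 10],
          [25, 230, 30], [5, 20, 250], [30, 10, 235]]
labels = [0, 0, 1, 1, 2, 2]
feats = [coarse_features(img) for img in images]  # stage 1 output, kept fixed
w, b = train_head(feats, labels, n_classes=3)     # stage 2 trains the head
predictions = [predict(w, b, f) for f in feats]
```

Because only the small head is updated in the second stage, the knowledge captured in the first stage is reused rather than relearned, which is the motivation for transfer learning on a small fingerprint dataset.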