1. Introduction
In the era of 5G IoT [
1], real-time positioning is increasingly required by context-aware applications and location-based services. Typical scenarios include locating doctors and patients inside a hospital, advertising commercial products to mall visitors, monitoring the status of gas and oil plants, pinpointing dead crops in vertical farms and identifying victims' locations in Public Protection and Disaster Recovery (PPDR). Moreover, several advanced applications can further provide cellular phone fraud detection, location-sensitive billing, as well as navigation from and to almost everywhere, through the utilization of heterogeneous wireless technologies and the fusion of sensor and IoT data [
2,
3,
4,
5]. A recent report published by IEEE has estimated 50 billion [
6] mobile devices will be connected to the cloud by the end of 2020. These devices will need constant access to data anywhere. Cisco has predicted that 26 billion [
7] of these devices will be IoT or Wireless Sensor Network (WSN) devices. In this respect, technologies like Cloud Radio Access Network (C-RAN), Millimeter Wave (mm-Wave) communication, ultra dense communication [
8], device-to-device (D2D) communication and Vehicle-to-everything (V2X) [
9,
10] and protocols like IEEE 802.11be (Extremely High Throughput WLAN) [
11], IEEE 802.11az (Next Generation Positioning) [
12] are introduced not only to increase communication bandwidth but also to offer the possibility of co-operative and precise localization. Additionally, with 5G paving the way for seamless collaboration among heterogeneous wireless systems (cellular, WiFi, WSN, IoT, etc.), a great opportunity has arisen in the area of indoor localization in urban areas under the framework of smart cities. Such highly dense networks could be utilized to solve multi-agent positioning and offer agility and scalability for accurate positioning as a service. In this direction, we propose a DEep Learning-based co-operaTive Architecture (DELTA) algorithm to enhance 3D indoor localization. The contributions of this paper can be summarized as follows:
A realistic 3D indoor localization scenario for 5G IoT networks has been designed using an emulated 5G C-RAN and Zolertia IoT nodes.
We present a novel approach to Received Signal Strength (RSS)-based fingerprinting using a 3D multi-layered radiomap to enhance the learning of network signal behaviour.
A deep learning cooperative algorithm is implemented on the constructed multi-layered radiomap for improved 3D indoor localization. The proposed method targets improving vertical and horizontal localization for use-case scenarios such as indoor navigation or people tracking in multi-floor smart buildings or large complex buildings. Based on the results of the emulated realistic radio-planning, we show that DELTA outperforms KNN and SVM.
The remainder of this paper is organized as follows:
Section 2 reviews research related to this paper.
Section 3 describes the problem of indoor positioning in a 3D environment.
Section 4 gives a detailed description of the underlying architecture of the DELTA model.
Section 5 discusses and analyses the performance results produced by our proposed approach compared with other traditional models.
Section 6 concludes the paper and outlines possible future work.
2. Related Work
Indoor positioning techniques can be divided into two main categories: fingerprint and multilateration. In the latter, given a known propagation speed, the distance between a receiver and a group of transmitters is measured using techniques such as Direction of Arrival (DoA), Time of Arrival (TOA)/Time of Flight (TOF), Angle of Arrival (AoA), Time Difference of Arrival (TDOA) and Return Time of Flight (RTOF). These techniques are commonly used in Global Navigation Satellite Systems (GNSS) [
13], such as Global Positioning System (GPS) and Galileo, but surprisingly they are also found in IoT indoor navigation solutions [
14]. However, multilateration relies mainly on the travelling time or the direction of the signal rays. This makes indoor localization a complex task, especially with issues arising such as synchronization errors and multi-path fading [
15,
16,
17].
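To illustrate why synchronization errors are so damaging for time-based ranging, the following minimal sketch converts a time of flight into a distance, assuming free-space propagation at the speed of light. The function name and the example timing values are illustrative assumptions and are not taken from the cited works.

```python
# Propagation speed assumed for radio signals (speed of light in vacuum).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def toa_distance(time_of_flight_s, speed=SPEED_OF_LIGHT_M_PER_S):
    """Return the range in metres implied by a one-way time of flight."""
    return speed * time_of_flight_s


# Hypothetical transmitter located exactly 10 m from the receiver.
true_distance_m = 10.0
true_tof_s = true_distance_m / SPEED_OF_LIGHT_M_PER_S

# Hypothetical clock offset of 10 ns between transmitter and receiver.
clock_offset_s = 10e-9

measured_distance_m = toa_distance(true_tof_s + clock_offset_s)
ranging_error_m = measured_distance_m - true_distance_m
print(f"Ranging error caused by a 10 ns offset: {ranging_error_m:.2f} m")
```

Under these assumptions, a clock offset of only 10 ns already shifts the range estimate by roughly 3 m, which is comparable to the size of a small room.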
In the fingerprint-based technique, a set of RSS measurements is taken and linked to specific Reference Points (RP) (also known as fingerprints or signatures). Localization using this approach works in two phases: offline and online. During the offline phase, a site survey is conducted with the purpose of linking the measured signal strength values to predefined RPs. The outcome of this measurement campaign is then stored in a radiomap database. During the online phase, a user equipment receives real-time signals and tries to match them with existing records stored in the radiomap database using a matching algorithm. In the context of IoT localization, the RSS signal is collected from wireless technologies such as Zigbee, LoRa, WiFi, Raspberry Pi, BLE and RFID. Since it does not require any specialised equipment or time synchronization to obtain the RSS signal, this technique is usually preferred to multilateration. For instance, authors in [
18] have studied how robust localization for robots and IoT can be achieved using RSS fingerprint. Additionally, another interesting approach has been introduced in [
19] where the authors have focused on the use of IoT and Wifi-enabled devices to improve fingerprinting in an indoor environment. Recently, a new concept has been developed by Ali et al. [
20] using raster maps instead of traditional offline scene analysis. Furthermore, a hybrid solution implemented on LoRa devices, which combines RSS fingerprinting with AoA methods is discussed in [
14]. The proposed idea is very promising but it has inherited synchronization issues from multilateration. These examples show that the RSS-based fingerprint method is widely used in the research community. This is due to improved localization and reduced computational complexity, as concluded by Amr et al. [
19]. A detailed comparison of technologies and algorithms implementing the fingerprint technique for IoT indoor positioning has been carried out by [
15,
21,
22,
23].
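The two-phase fingerprinting procedure described above can be sketched as follows. This is a minimal illustration that assumes a nearest-neighbour matching rule in signal space; the radiomap entries, RSS values and function names are hypothetical and do not come from any of the cited systems.

```python
import math

# Offline phase: a hypothetical radiomap. Each entry links a Reference Point
# (x, y) in metres to the RSS values (dBm) measured there from 3 transmitters.
radiomap = [
    ((0.0, 0.0), [-40.0, -70.0, -80.0]),
    ((5.0, 0.0), [-60.0, -50.0, -75.0]),
    ((0.0, 5.0), [-65.0, -78.0, -45.0]),
    ((5.0, 5.0), [-72.0, -60.0, -55.0]),
]


def rss_distance(a, b):
    """Euclidean distance between two RSS vectors (signal space, not metres)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def locate(rss, radiomap, k=1):
    """Online phase: average the positions of the k closest fingerprints."""
    nearest = sorted(radiomap, key=lambda entry: rss_distance(rss, entry[1]))[:k]
    x = sum(rp[0] for rp, _ in nearest) / len(nearest)
    y = sum(rp[1] for rp, _ in nearest) / len(nearest)
    return (x, y)


# A hypothetical online measurement taken close to the RP at (5, 0).
online_rss = [-61.0, -52.0, -74.0]
print(locate(online_rss, radiomap))        # single best match
print(locate(online_rss, radiomap, k=2))   # average of the two best matches
```

Matching in signal space needs no timing information, which is why no synchronization between devices is required.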
In the fingerprint-based approach, deep learning techniques have been widely used to extract common patterns from a sparse radiomap database and to improve localization. In recent years, deep learning has gained considerable popularity among indoor localization researchers, in particular due to its robustness and high accuracy [
24]. Supervised and unsupervised deep learning algorithms have been recently implemented in 2D localization [
25] and multi-floor localization [
26]. Recently, Wafa et al. [
27] studied the use of Convolutional Neural Networks (CNN) on an IoT sensor system to determine the node location. In this simulation, the authors converted the 2D localization problem into a 3D image tensor identification problem. The 3D tensor has been constructed using a 2D matrix of RSS signals and 1D kurtosis. This concept achieved an average error of 2 m. A similar system was also implemented in [
28]; however, such a system usually requires a large number of access points deployed in a small space to achieve this result. In [
29], the authors implemented a Deep Belief Network (DBN) on an active RFID tag system for accurate location estimation. Their solution consisted of a set of stacked Restricted Boltzmann Machine (RBM) layers called autoencoders trained using Contrastive Divergence with one-step iteration (CD-1). This algorithm improved 2D positioning. To achieve this, however, the authors deployed a large number of RFID tags in a 12 m × 12 m indoor environment, which does not take into account the power consumption of the devices. Finally, Wang et al. [
30] have suggested a hybrid deep learning solution combining a regression Deep Neural Network (DNN) with a Convolutional AutoEncoder (CAE) using Visible Light Communication (VLC). To overcome the issue of fluctuating signal readings in the RSS-based fingerprint method, the authors have proposed an algorithm that takes a set of consecutive signal readings and converts them to an RSS Temporal Image (RTI), instead of implementing the traditional RSS measurement processing technique. However, despite having been used in several works [
31,
32], VLC suffers from issues such as interference with other ambient lights, signal shadowing and usually requires the receiver to be in Line-Of-Sight (LOS), which can affect the accuracy of the location estimation. A detailed comparison of deep learning and other machine learning algorithms used in localization for IoT environment is covered in [
33,
34].
Until now, most of the existing IoT-based indoor localization solutions have mainly focused on either 2D localization or floor detection. However, in some special use-case scenarios such as indoor navigation for an Unmanned Aerial Vehicle (UAV) or Automated Guided Vehicle (AGV) in a smart factory or large supermarket, precise 3D positioning is indispensable for daily operations. To address this issue, we propose DELTA to maximize the localization accuracy and minimize the distance error in a 3D indoor environment.
3. System Model and 3D Localization Problem
In this section, we introduce our proposed system model, which uses Deep Neural Networks (DNN) and a multi-layered radiomap to perform 3D indoor localization. To the best of our knowledge, implementing deep learning on a multi-layered radiomap for localization purposes is a novel approach. The main benefits of the proposed method are improved localization accuracy and reduced computational complexity during online fingerprinting through the adoption of deep learning techniques, while at the same time utilizing the widespread WSN and/or IoT infrastructure, making it an economical solution. To formalize the model, we consider N to be the number of transmitters in the environment and x, y and z the corresponding coordinates of each fingerprint entry on the constructed radiomap. The 3D multi-layered fingerprint database has been constructed by linking the RSS values received from the transmitters to a 3D location on the radiomap [
35]. This can be mathematically expressed as:
$$M = \{(S_i, L_i)\}_{i=1}^{K}$$
where $M$ is the radiomap database, $S_i = (s_{i,1}, \dots, s_{i,N})$ is a vector of RSS signal values, $L_i$ is a vector of three values, $L_i = (x_i, y_i, z_i)$, and $K$ represents the total number of sample locations $x$, $y$ and $z$ associated with each signal vector sample $S_i$ collected during the offline phase.
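As a minimal sketch of how such a multi-layered radiomap could be held in memory, the snippet below pairs every RSS vector with its 3D location, one survey per z layer. The layer heights reuse the three z values reported in the evaluation (0.25, 1.25 and 1.75 m); the RSS values, the number of transmitters (two) and all names are hypothetical assumptions made for illustration only.

```python
def build_radiomap(surveys_by_layer):
    """Flatten per-layer site surveys into parallel lists of signals and locations.

    surveys_by_layer maps a layer height z (metres) to a list of
    ((x, y), rss_vector) pairs collected during the offline phase.
    """
    signals, locations = [], []
    for z, survey in sorted(surveys_by_layer.items()):
        for (x, y), rss in survey:
            signals.append(tuple(rss))     # one RSS value per transmitter
            locations.append((x, y, z))    # 3D location of the fingerprint
    return signals, locations


# Hypothetical offline measurements from N = 2 transmitters (values in dBm).
surveys = {
    0.25: [((1.0, 2.0), [-48.0, -63.0]), ((3.0, 2.0), [-55.0, -58.0])],
    1.25: [((1.0, 2.0), [-50.0, -61.0])],
    1.75: [((1.0, 2.0), [-53.0, -60.0])],
}

signals, locations = build_radiomap(surveys)
for rss, loc in zip(signals, locations):
    print(loc, rss)
```

Keeping signals and locations as parallel sequences matches the usual features/labels layout expected by learning libraries, so the same structure can feed either a regressor or a classifier.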
In this respect, the estimation problem is defined as solving the 3D localization problem using a matrix of historical location points and their corresponding signal values. However, the challenge is to model the non-arbitrary relationships between the N transmitters that make up the signal matrix S in order to accurately predict the 3D location L using a deep learning algorithm. To achieve this, the 3D localization problem has been split into two sub-problems:
Problem 1. Given a matrix S of signals sent from N transmitters, predict the x and y coordinates of the 2D mobile station location. This can be written as:
$$(\hat{x}, \hat{y}) = f(S)$$
where $(\hat{x}, \hat{y})$ represents the x and y 2D location, which we would like to estimate, and $f$ represents the function that utilizes the RSS values received from the transmitters to predict the location of the mobile station.
Problem 2. Given a matrix S of signals sent from N transmitters to the mobile station and $\hat{x}$, $\hat{y}$ known from Problem 1, estimate the z coordinate. This can be mathematically expressed as:
$$\hat{z} = g(\hat{x}, \hat{y}, S)$$
where $\hat{z}$ is the z location, $(\hat{x}, \hat{y})$ is the output of the Problem 1 solution and $S$ represents the matrix of signal values as previously stated in Problem 1.
5. Performance Evaluation Results
In this section, we evaluate and critically analyse the simulation results against well-known methods such as SVM and KNN. Before going through the results analysis, it is worth mentioning that the KNN and SVM modelling tasks have been carried out using Scikit-learn [
50], a widely used Python library toolset for machine learning and statistics. More specifically, the SVM models have been developed using an SVM class from the Scikit-learn library and the KNN models have been built using a classifier class called KNeighborsClassifier [
51]. The DELTA models have been constructed using Keras API [
52], a deep learning library also available in Python. During the evaluation phase, the three algorithms were implemented in Python on the same machine with an Intel
[email protected] CPU and 16 GB of RAM. In terms of execution time, KNN finished after 230 ms while SVM took 450 ms. The proposed DNN took 160 ms to execute, making it faster than both KNN and SVM.
5.1. Results Analysis
5.1.1. DNN vs. KNN and SVM
Using 180 random samples [
39], we have benchmarked the DNN model
against KNN and Support Vector Regression (SVR) models. The samples have been obtained for each z layer, making a total of 540 RPs. The SVR has been trained using a linear kernel, a degree of one and an epsilon value of one, using 80% training and 20% validation data sets. Similarly, a KNN model has been trained with a K value set to three. The results in
Figure 14 show the error distribution in meters for all three models. SVR has the worst error distribution, with its peak ranging between 4 and 6 m of error. KNN has done slightly better than SVR. However, a large proportion of its error distribution falls between 3 and 5 m, which makes it the second worst performing model after SVR. The DNN
has performed best. The peak of its error distribution falls between zero and two meters, with a mean error of 1.6 m. A detailed result is provided in
Table 4.
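The error distribution used in this comparison can be computed as sketched below, assuming the error of a sample is the Euclidean distance between the true and the predicted position, grouped into 1 m bins. The coordinates are made-up illustrative values, not results from this study, and the helper names are hypothetical.

```python
import math
from collections import Counter


def distance_errors(true_points, predicted_points):
    """Euclidean distance (metres) between each true and predicted position."""
    return [
        math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)))
        for true, pred in zip(true_points, predicted_points)
    ]


def error_histogram(errors, bin_width=1.0):
    """Count errors per bin; bin b covers [b * bin_width, (b + 1) * bin_width)."""
    return Counter(int(e // bin_width) for e in errors)


# Hypothetical ground-truth and predicted positions (metres).
true_points = [(0.0, 0.0), (2.0, 2.0), (4.0, 1.0)]
predicted_points = [(0.0, 1.0), (2.0, 2.0), (1.0, 5.0)]

errors = distance_errors(true_points, predicted_points)
mean_error = sum(errors) / len(errors)
histogram = error_histogram(errors)

print("errors (m):", errors)
print("mean error (m):", mean_error)
print("samples per 1 m bin:", dict(sorted(histogram.items())))
```

The same helpers work unchanged for 3D positions, since the distance is computed over however many coordinates each point carries.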
5.1.2. DNN vs. KNN and SVM
Using the aforementioned samples, the z layer (z coordinate) has been estimated. The results are depicted in
Figure 15, which compares the classifiers in a bar chart using the misclassification count as a measure. Each model has been given the same three classes: 0.25, 1.25 and 1.75 m. At first glance,
Figure 15 shows that the Support Vector Classifier (SVC) has performed very poorly in terms of classifying observations. The model has failed to classify accurately during the online phase: more than 66% of the samples (circa 120) have been wrongly classified. With a total of 40 misclassified samples, K-Nearest Neighbor (KNN) has performed better than SVC but still does not differentiate properly between certain classes. Our proposed
DNN model has classified considerably better than both of the latter models. In effect, 100% of the 0.25 m layer has been accurately classified, while more than 95% of the other two classes, 1.25 and 1.75 m, have also been properly predicted. The total number of misclassified samples is 20, bringing the classification accuracy rate to 89%. This shows how the proposed 3D multi-layered model has outperformed the traditional models.
Table 5 gives a detailed misclassification count for each model. The worst performing model is highlighted in red and the best performing model is highlighted in blue.
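The relation between the misclassification counts and the accuracy rates quoted above can be checked with a short calculation. The sketch assumes that the counts refer to the 180 random samples mentioned earlier and uses the approximate counts given in the text (the SVC figure is stated only as "circa 120"); the function and variable names are hypothetical.

```python
def accuracy_from_misclassified(n_samples, n_misclassified):
    """Classification accuracy as the fraction of correctly labelled samples."""
    return 1.0 - n_misclassified / n_samples


# Assumption: all counts refer to the same 180 evaluation samples.
N_TEST_SAMPLES = 180

# Approximate misclassification counts as quoted in the text.
misclassified = {"SVC": 120, "KNN": 40, "DNN": 20}

accuracy = {
    model: accuracy_from_misclassified(N_TEST_SAMPLES, count)
    for model, count in misclassified.items()
}

for model, value in accuracy.items():
    print(f"{model}: {value:.0%} correctly classified")
```

Under that assumption, 20 misclassified samples out of 180 corresponds to the 89% accuracy reported for the DNN, while 40 and 120 misclassified samples correspond to roughly 78% and 33%.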