1. Introduction
Over the past ten years, location-based services have rapidly developed. In these services, positioning is one of the most indispensable approaches. For outdoor environments, the global navigation satellite system (GNSS) can provide very reasonable positioning results, with an average error of 3–5 m [
1]. However, this system is limited outdoors since it cannot guarantee the same accuracy indoors, where the satellite signals are reflected or blocked by the walls of buildings. Thus, there is an extreme demand to develop indoor positioning systems (IPSs).
Currently, there are various technologies used to track the position of a mobile user indoors. These technologies can be classified into two groups: building independent and building dependent. The first group includes the IPSs that do not rely on the infrastructure of a building to determine the user’s position. Some examples for this group are dead reckoning [
2] and image-based [
3] technologies. On the other hand, the IPSs that relate to the building where they are operated belong to the second group. For this group, we can divide it into two subgroups: the dedicated infrastructure required and the building’s infrastructure utilized. The former subgroup is defined as the IPSs that need the unpopular infrastructure to be set up in a building. Radio frequency identification [
4], acoustic signal [
5], or visible light communication [
6] are some examples belonging to this group. The latter subgroup utilizes the common infrastructure in the building to create the IPSs such as Bluetooth [
7,
8] and WiFi [
9,
10]. Among these technologies, WiFi-based positioning [
11,
12,
13] has drawn tremendous attention due to the widespread of WiFi infrastructure in indoor environments such as office buildings, hospitals, malls, etc. Moreover, the high popularization of smart devices such as smartphones or tablets and the compatible implementation of these devices with WiFi systems also leads to the large deployment of this technology for IPS. To enhance the performance of the WiFi-based positioning system, there are hybrid systems that combine the result of the WiFi with other technologies, such as pedestrian dead reckoning [
2,
9], camera [
14], or magnetic field [
15].
There are two main approaches for WiFi-based positioning: geometric and fingerprinting [
16]. For the former approach, some common conventional positioning methods are the time of arrival [
17], time difference of arrival [
18], angle of arrival [
19], etc. The positioning accuracy of these methods depends on the line-of-sight condition, which is hard to keep indoors, where there are many physical obstacles, such as walls or doors. For the latter one, the WiFi fingerprinting method is another popular method due to its low cost and ease of implementation on smart devices. However, the reflections, the multipath interference, or the changes in the environmental conditions could greatly degrade the performance of this method. Therefore, achieving a reliable and highly accurate WiFi fingerprinting-based positioning method is a challenging problem.
The traditional WiFi fingerprinting method has two phases: the offline and the online phases. In the offline phase, the radio map is built by collecting the RSS values from the available APs at different reference points (RPs) in a target area. The unique set of RSS values at one RP is defined as a fingerprint of that position. In the online phase, by comparing the measured RSS values at one unknown position with the RSS values from the radio map using different matching methods, the user’s position can be estimated.
The conventional rough set theory was first proposed by Pawlak [
20] as an extension of classical set theory to deal with vague information. This theory conceives that information and knowledge in information systems are expressed via indiscernibility relations (equivalence relations) between objects in the object set. The rough set theory is used for big data mining and knowledge reduction [
21,
22] and can be applied in various fields such as information systems [
23] or real estate market analysis [
24,
25]. To give a more flexible way to the rough set theory to handle the indiscernibility relation, Stefanowski and Tsoukias [
26] introduced the valued tolerance relation and proposed the valued tolerance rough set and decision rules method (VTRS–DR). This method is often applied as a support tool in decision-making systems. Rough set theory based on valued tolerance relation calculates the valued tolerance relation for all objects in the decision table, resulting in a valued tolerance relation matrix where the values belong to [0, 1]. This is the extension of the conventional rough set theory proposed by Pawlak.
When applied to WiFi fingerprinting-based indoor positioning, the VTRS–DR works not only in the offline phase but also in the online phase. In the offline phase, the conventional RSS fingerprinting database is converted into a decision table where the RSS values are classified into different decision classes which correspond to the predefined RPs. Each tolerance granule corresponding to each object (i.e., one row in the decision table) is calculated based on the valued tolerance relation matrix. After calculating the values of the membership degrees of objects belonging to lower (upper) approximations of each decision class, the credibility degrees of decision rules (i.e., objects in the decision table) are also calculated. Besides that, the support object set values of decision rules are defined by the tolerance granule of objects. A new fingerprinting database with decision rules is constructed from the decision table that includes the credibility degrees and the support object set values for all decision rules. In the online phase, the valued tolerance relation is calculated between the RSS values in the new database and the measured RSS values at an unknown position. The best decision class (i.e., one RP) is chosen among a set of RPs via the proposed method by comparing various components such as the valued tolerance relations, the credibility degrees, the support object set values, and the Euclidean distances; thus, the user’s position can be estimated. In our work, it means estimating the unknown position as the RP closest to that position.
To deal with the instability in RSS values from APs, in this paper, a novel VTRS–DR-based WiFi fingerprinting for IPS is proposed. The VTRS–DR method provides an efficient computational structure and helps to solve the indoor positioning problem with high accuracy. The main contributions are figured out as follows:
The VTRS–DR method is applied in the offline phase of the WiFi fingerprinting method by creating a new fingerprinting database with decision rules that includes the credibility degrees and the support object set values for all rules.
The VTRS–DR method is also applied in the online phase of the WiFi fingerprinting method by estimating the user’s position based on the fingerprinting decision rules database and the measured RSS value to determine the user’s position.
For performance evaluation, the proposed method is compared with the nearest neighbor-based and the random statistical methods to prove its superior positioning accuracy and robustness.
The rest of this paper is organized as follows.
Section 2 discusses the related works.
Section 3 describes the basis of the VTRS–DR method.
Section 4 introduces the proposed positioning method. Experimental results are displayed in
Section 5, and the conclusion is given in
Section 6.
2. Related Works
Generally, there are two categories of WiFi fingerprint-based matching algorithms: deterministic and probabilistic. The deterministic algorithm is very common due to its ease of implementation and its possibility to work well in real time. The nearest neighbor (NN), KNN, and weighted KNN (WKNN) [
27,
28] are some of the popular deterministic algorithms. To handle the fluctuation of WiFi RSS values, as well as to normalize them, Ninh et al. [
10] introduced the random statistical method in the offline phase. The method helped to create a standardized fingerprinting database. In the online phase, to figure out the unknown position of a user, the authors applied the Mahalanobis distance as the matching algorithm to improve the positioning accuracy. The experiments were conducted in different setup conditions in an office room. As a result, the maximum positioning error was only 0.75 m. In Reference [
29], instead of using the common Euclidean distance in NN-based algorithms, Duong-Bao et al. implemented five distance measures and compared the positioning results in different settings such as changing the number of setup APs or changing the grid spacing between two RPs. The experimental results showed that the Chi-Squared distance is the best measure with a mean error of 1.17 m in two test cases. Even though these deterministic algorithms could reduce computational complexity, they used only a single reference fingerprint for each RP (i.e., a set of mean RSS values from available AP) for finding the best match between the measured RSS values at an unknown position and the fingerprints in the database, and this could lead to big positioning errors due to the instability of the RSS values.
On the other hand, the probabilistic algorithm needs to know the probability distribution of the RSS values from available APs at every RPs. Even though this algorithm can provide good accuracy, it is more complex than the deterministic algorithm, and the positioning performance relies on the computation of likelihood functions. Some popular probabilistic algorithms are the Kalman filter [
9], particle filter [
30], and hidden Markov models [
31]. Using two Kalman filters, Zhuang et al. [
32] fused the positional information from the micro-electromechanical system (MEMS) sensors and WiFi fingerprinting to enhance the tracking accuracy. The MEMS sensors helped to reduce the searching space of WiFi fingerprinting by using an extended Kalman filter, and another Kalman filter was applied to smooth the positioning result of WiFi fingerprinting. From two test scenarios, the results showed that the proposed method improved the accuracy by 47% and 28% with mean errors of 2.2 m and 2.6 m, respectively. In Reference [
33], Deng et al. also used the extended Kalman filter in the integrated system, which consisted of pedestrian dead reckoning, WiFi fingerprinting, and special positions in a building (i.e., doors, elevators, escalators, etc.) to reduce the positioning error. From the experiments, it was shown that the mean positioning error was only 1.22 m in a test area of 487.2 m
2. The two above methods obtained good positioning results; however, they could not completely handle the variation of the RSS values, even though they used different sources (i.e., smartphone sensors and special positions in a building), which would make the systems become more complicated, to calibrate for estimated user’s position.
Currently, many latest works use the WiFi fingerprinting technique as the main tool to determine the user’s position with the help of machine learning techniques [
34,
35,
36,
37,
38]. In Reference [
37], based on the conventional WiFi fingerprinting, Song et al. proposed the CNNLoc which combined the stacked auto-encoder and convolutional neural network (CNN) to track the users’ positions in multi-buildings with multi-floors. The framework required a lot of data preprocessing in the training phase, such as sub-datasets division, rectangle areas creation, cell grids division, etc. As the result, the proposed method obtained a 96.03% accuracy of floor classification with a positioning error of 11.78 m. Moreover, using the deep learning approach, Qin et al. [
38] combined the convolutional denoising autoencoder (CDAE) and CNN for the IPS. In the offline phase, the K-means algorithm was used to extract the set of features. During the online phase, the RSS values were put into the CDAE to extract the main features, and then the CNN was used to find out the user’s position. The experimental results showed that the mean positioning errors were 1.05 m and 12.4 m for two different datasets. These methods were vulnerable to the changes of RSS values, and they were required to be adjusted to work well with different buildings.
The RSS values collected even from the same AP can fluctuate much due to the changes of environmental conditions, such as the number of working people, the number of electrical devices, the user’s body, the period in a day, etc. The aforementioned works have not controlled the instability of the RSS values perfectly. Therefore, we propose a novel WiFi fingerprinting-based method, which utilizes the valued tolerance rough set and decision rules method to classify the user’s position among predefined reference points. To the best of our knowledge, applying the VTRS–DR method to solve the indoor positioning challenge has never been recorded before.
3. VTRS–DR Method
Say we have a set of objects,
(
is the total number of the fingerprints of
RPs), that can be characterized by a set of conditional attributes,
(a set of
RSS values collected from
APs at one
RP,
). If we denote a subset of objects in
by a decision attribute
(with
), which is one
RP in a set of
RPs, then we define the decision table
. The decision attribute
partitions set
into
decision classes (
) as
,
, with each class being one
RP in the fingerprinting database. Each decision class is a tolerance class. At the beginning of the VTRS–DR method, the relations between the objects and the attributes can be represented through the decision table,
, which is given in
Table 1. In this table,
conditional attributes (i.e.,
RSS values from
APs)
are given and each attribute
,
, has its
RSS values changing in the interval [–100, 0] (dBm). Here, each object (i.e., one row in
Table 1) represents one fingerprint (i.e., a unique set of RSS values) of the given
RP, which is a decision class. It is worth noting that, in the decision table,
, we do not need to sort the
RPs by their orders, and the number of appearing times of each decision class (i.e., each
RP) can be different which depends on the number of RSS scanning times at each
RP.
To explain the VTRS–DR method clearly, we are able to give some basic definitions.
Definition 1. Valued tolerance relation among the objects of setbuilt on a set of conditional attributesis denoted by (i.e., ). satisfies two properties: (a) reflexive, ; and (b) symmetric, .
Definition 1 is used to determine whether the objects of set satisfy the valued tolerance relation built on a set of conditional attributes, . If a pair of objects has the valued tolerance relation, then these objects may belong to the same tolerance class.
Definition 2. The valued tolerance relation of each attribute , in any two objects,, is denoted by and calculated as follows:
where
,
is the discrimination threshold value of the attribute
,
. The discrimination threshold,
, is defined as the threshold of the similarity measure. The value
is used to measure the similarity between two values of the attribute
. From Equation (1),
,
, and
. The value
determines whether the two objects are indiscernible (equivalent) with the respect to each attribute in rough set theory.
Definition 2 is used to calculate the valued tolerance relation between a pair of objects in built on a conditional attribute, . This definition supports the calculation of the valued tolerance relation between two objects in Definition 3.
Definition 3. The valued tolerance relation between two objects, , on a set of conditional attributes, , is denoted by and calculated as follows:
From Equation (2), , and satisfies two properties, (a) and (b), of the valued tolerance relation.
Definition 3 used is to determine the valued tolerance relation between a pair of objects in built on a set of conditional attributes, . If , and then two objects, , are not related. If , then two objects have a valued tolerance relation. If is closer to 1, then two objects are more related. Then the valued tolerance relation matrix is built, , where this matrix is symmetric with the main diagonal is equal to 1. This definition supports Definition 4 to calculate the tolerance granule for each object in .
Definition 4. A tolerance granule of object for relation is denoted by and calculated as follows:
Definition 4 is used to determine whether the objects in have the valued tolerance relation with an object, , built on a set of conditional attributes, . Here, is a tolerance granule of u, where it carries a set of objects that has the valued tolerance relation with (i.e., a set of supporting objects of ). This definition supports Definition 5 to determine the lower and upper approximations of the decision classes (i.e., the tolerance classes) built on a set of conditional attributes, .
Definition 5. With a set of decision classes, , a subset of conditional attributes, , and an object, , the lower approximations and upper approximations of for are denoted by and and defined as follows:
Definition 5 used is to determine a lower approximation, , of decision class , which has tolerance granules, , involved in and an upper approximation, , of that has tolerance granules, , intersected with . In this way, any objects that belong to are certainly involved in ; meanwhile, any objects that belong to can be involved in or not. Therefore, the determination of the lower and upper approximations plays an important role in the performance of the VTRS–DR method since it strongly affects the decision results, as well as the calculation space. The closer the and are to the , the more accurate the results and the lower the computational cost.
Definition 6. With a setof decision classes, , a subset of conditional attributes, , and an object, , the membership degrees of object belonging to and are denoted by and and calculated as follows:
where
is the membership degree of object
in the set
(i.e.,
or
).
Definition 6 is used to calculate the membership degree of an object, , to find out whether it belongs to the lower or upper approximation. The and values are in the interval between 0 and 1; thus, the object, , will belong to the approximation that is closer to 1. This is an important criterion to determine the decision class, , that the object, , belongs to.
Definition 7. With a decision table and a subset of attributes, , a decision rule that describes an object, , , is denoted by , and defined by the following:
where
is a value of
and
is a decision class based on decision attribute
of the object
.
Definition 7 is used to describe a decision rule, , corresponding to an object, , . In Equation (8), the left side presents a set of conditional attributes, , and the right side presents a decision attribute, , corresponding to a decision class that involves the object, .
Definition 8. With a subset of attributes, , the credibility degree of a decision rule , , is denoted by and calculated as follows:
Definition 8 is to calculate the credibility degree of a decision rule, , . The value is in the interval between 0 and 1. If this value is closer to 1, then it is reliable that the object is involved in the decision class . This is an important criterion for classifying a new object into a decision class.
Definition 9. With a subset of attributes, , a support object set of a decision rule , , is denoted by and calculated as follows:
In Equation (10), is the tolerance granule of rule for relation . It means, on binary relation , and with a set of conditional attributes, , we build a valued tolerance relation, , such that , where is the support degree of rule for object . Object is similar to some extent to the conditional part of the rule on the set of conditional attributes, . Moreover, is a valued tolerance relation defined exactly as the relation . Thus, is calculated as in Equation (3), i.e., and , , where is an object of the rule .
Definition 9 is to determine the , a support object set of a decision rule, , corresponding to an object, , . The of is equivalent to a tolerance granule, , of the object . If many decision classes have the same credibility degrees, then it is difficult to decide which class the object belongs to; thus, the support object set of each class will help to solve the problem of classification. This value plays an important role in classifying a new object into a decision class.