1. Introduction
Indoor spaces are one of the main places for people to engage in various activities. With the acceleration of the urbanization process and rapid development of modern architectural technology, the indoor space pattern of large public buildings is more complicated, and the functional zoning is more diverse. Especially in large public places, such as shopping malls, airports, and exhibition centers, the situation of getting lost indoors is particularly prominent. Seamless coverage of indoor and outdoor navigation and location services has become a basic requirement, and indoor positioning and navigation has become a hot spot in the current navigation field [
1,
2,
3]. Indoor real-time navigation involves many links, such as indoor positioning, indoor space information expression, path planning, etc., among which indoor space information expression provides information support for navigation planning and navigation map display. Indoor navigation technology focuses more on the research and experimentation of indoor positioning technology, while research on the indoor spatial information model (indoor information organization) necessary for a complete indoor navigation system is relatively weak. In fact, the indoor spatial information model plays an important role in the realization of various functions of the indoor navigation system. For example, the optimal path planning cannot be separated from the indoor topological network, the three-dimensional expression of the path cannot be separated from the indoor geometric information, and the interaction process in the path query process cannot be separated from the indoor semantic information. The effective organization and storage of these indoor information is the foundation for ensuring the implementation of system navigation functions.
The data sources currently used to extract indoor space information mainly include indoor laser point cloud, building information model (BIM), architectural design plan, CityGML, etc. From a technical perspective, compared with extracting indoor space information by using unstructured and semantic-free point cloud data, it is relatively easy to extract indoor navigation information from BIM or building plan containing rich building element information [
4,
5]. However, although the BIM model develops rapidly at present, many architectural designs are still based on building plans, which lack complete BIM data [
6,
7,
8,
9]. Therefore, constructing an indoor space topology model for indoor navigation based on building plans is still a very necessary and effective way. To meet the various application requirements of indoor navigation, such as real-time localization, path planning, and navigation visualization, the indoor space information model needs precise geometric expression (geometric layer), rich semantic information (semantic layer), and complete spatial topology description (topology level), as shown in
Figure 1.
In the process of indoor navigation, indoor space information often abstracts doors and windows, walls lines, and rooms into three geometric elements of points, lines, and faces. Building plans provide architectural design information such as walls, doors, windows, annotations, and dimension annotations, but lack description of the topological adjacency and reachability between polygonal entities such as rooms and corridors [
10]. Therefore, the construction of the topological relationship between walls and rooms is the key link to extract indoor navigation information from building plans, which essentially involves the reconstruction and expression of the topological relationship between lines and polygons.
2. Related Work
Existing research has mainly use three ways to extract indoor space polygons and topology relations from building plans: rasterization, calculating azimuth angles, and using graph theory. The method of rasterization processing requires transforming building plans into images first, and then performing subsequent processing in the form of images. For example, the literature [
11] uses a raster-based seed point filling algorithm to extract polygons and establish topology relations among them. However, due to the complex spatial structure and irregular shape of complex indoor environments, it is difficult for raster models to accurately represent indoor spatial boundaries and obstacle boundaries. At the same time, polygon boundaries extracted based on the raster form also require the process of converting from raster to vector, which increases the complexity of algorithm processing.
Extracting polygons by calculating azimuth angles is a method based on geometric calculation. The steps of implementing automatic construction of the indoor space polygon and topology with azimuth angle calculation include arc segment adjacent relation determination, polygon search, and topology relation determination. Most of the polygon generation algorithms are left-turn algorithms based on azimuth angle comparison [
12,
13,
14,
15], and the calculation and comparison of azimuth angles are very time-consuming in the process of polygon search. Although many strategies are used for optimization [
16,
17,
18,
19], the improvement effect is not obvious. In addition, these algorithms are not fully applicable to the polygon extraction in complex building plans containing multiple islands.
Some scholars employed graph structures to extract indoor spatial polygons, avoiding the calculation and comparison of inner angles of polygons and enhancing algorithm efficiency. In addition, graph structure also has natural structural advantages in expressing adjacent and connectivity relationships. With the help of mathematical concepts, such as nodes, edges, weights, etc., spatial entities are abstracted and topological relationships are constructed. Its basic idea is to abstract indoor space into a graph composed of nodes and edges, which is convenient for the construction of topological relationships of polygons. The literature [
20] dynamically forms same-level nodes and sub-level nodes in the process of polygon search, and successively searches same-level nodes, sub-level nodes, and superior nodes to generate the topological relationship of the polygon automatically. The literature [
21] analyzes the topology structure of the building plan based on the principle of “indoor space is an independent space enclosed by various building components” based on the way of loop. The literature [
22] proposes a preprocessing method of wall lines based on odd and even methods and adopts the search-loop method to obtain indoor space. However, in modern large buildings (
Figure 2a), there are often complex situations, such as multiple isolated islands, columns supporting the hall (
Figure 2c), and isolated walls used to divide functional areas (
Figure 2b), which lead to error loops and degenerated loops when recognizing loops in building plans. The literature [
23] calculates a convex hull from the union set of boundary components in the interior space of enclosed buildings, and uses Boolean difference operation between the convex hull and the union set of boundary components in the interior space of enclosed buildings to obtain corresponding polygons. However, for the indoor navigation, some special cases are not paid attention to, such as the hanging edge treatment and the extraction of complex isolated island boundaries, and lacks of a complete topological description of indoor spatial relations.
To solve the problems of the existing methods in the process of extracting indoor space information by building plans, a disjoint-set based automatic construction algorithm of complex indoor spatial topology model is proposed in this paper. After the extraction of indoor polygons, the algorithm can also effectively construct the topological connection relationships between different indoor spatial entities, such as room–wall relation, door–room relation, and room–room relation. Meanwhile, it can well deal with the special cases, such as hanging edge, indoor island polygon, and multi-layer nesting between them. In addition, introducing disjoint set greatly improves the processing efficiency of the algorithm and simplifies programming complexity. The proposed indoor topological model stores all the indoor spatial entities, such as doors, windows, walls, and rooms, and their relationships, providing a basis for building indoor navigation information models.
4. Algorithm
To satisfy the demand of indoor navigation service, a new algorithm is proposed to extract indoor space into a room polygon, boundary polygon (outer contour of floor), and isolated islands in the room polygon. The algorithm mainly includes hanging edge recognition, indoor space polygon extraction, and island and boundary polygon determination. Firstly, the isolated walls of complex buildings are identified and transformed into non-navigable elements for storage. Based on the union-find operation, the topological reconstruction of indoor space is efficiently implemented, and then the isolated polygons and boundary polygons in the complex indoor space are extracted to ensure the integrity of the topological information of indoor space entities. The flowchart of the proposed algorithm is shown in
Figure 4.
4.1. Hanging Edge Detection
In the process of digitizing the polygon graph, if the line is out of the intersection point or does not reach the intersection point, after intersection processing, hanging edges [
25] will be formed, which are expressed in complex buildings as isolated walls for dividing different functional areas. To obtain accurate indoor spatial topology information, it is necessary to accurately identify the hanging edges existing in the building plans.
The Tarjan algorithm is an algorithm proposed by Robert Endre Tarjan based on depth-first search to find the strong connected components in a directed graph [
26].
DNF[
] is defined as the visit order (timestamp) of node u in the depth-first search, and
LOW[
] is the earliest DFS timestamp to which u or its subtree can backtrack through reverse edges. Initially,
. In this paper, a backtracking array for undirected graphs is introduced, and Tarjan’s DFS (depth-first search) backtracking method is used to extract all hanging edges in the room polygon in linear time.
Firstly, the simplified building plans (
Figure 5a) are traversed by DFS to get the depth first search tree (
Figure 5b), and the DFN and LOW values of all nodes are obtained. After DFS processing to the end, backtracking forward occurs, while dynamically maintaining
(
Figure 5c). If the DFN value and LOW value of a point are equal, the edge from
to its father
in the depth first search tree is a hanging edge (the root node has no parent node), i.e., if there exists
(n is the number of child nodes of
), then
is a hanging edge. At the end of DFS backtracking, all hanging edges are output as non-navigable elements, which facilitates reasonable indoor navigation planning in complex environments and better meets the navigation requirements in buildings.
4.2. Disjoint Set Operation
Disjoint set is a set of sets used to represent mutually exclusive dynamic sets [
27]; its mathematical model is
, where
is a set of certain elements, and
. Each set
has a representative element
, which is a member of this set, the root of each element, and the name of the set. Its basic principle is: first, each element is considered as a set. If there are direct or indirect connections between elements in different sets, these two sets are merged until there are no connections between elements in any two sets.
The core idea of the disjoint set is to view the set as a tree, with each node in the tree representing a unit element in the set, and the root of the tree corresponding to the representative element of the set, which is also its own parent node. The array type disjoint set stores its tree nodes in an array form. We only need a one-dimensional array to represent the tree data structure of disjoint sets, which can greatly reduce the time consumption of accessing nodes. The initial state of the set is shown in
Table 4.
Three main operations of the disjointed set are as follows:
- (1)
: Create a new set containing a single element x, update x’s parent to x in the array, that is, the root (name of the set) of x is x;
- (2)
: Merge two dynamic sets containing x and y (represented as
and
) into a new set, i.e., the union of these two sets (see
Figure 6). Call two
operations:
and
. If they return the same root, that is, x and y belong to the same set, no merging is required. If not, return
and
, choose the representative of
and
as the new representative, and in the array update the parent of x to y to complete the set merging;
- (3)
: Find the set containing the element x and return the name of the set containing the element x.
To get the name that the set x belongs to, we recursively search until we reach the root of the tree. When the depth of the tree increases, it will affect the efficiency of the find operation. Here, we introduce a very simple and efficient strategy, path compression, that is, during each search process, the parent node of each node on the search path is directly pointed to the root node. For example, in the
operation, we move from x to the root node of the tree. The effect of path compression is that the parent node of each node on the path from x to the root is changed to the root node, as shown in
Figure 7. The
operation using path compression is as follows:
4.3. Polygon Extraction of Indoor Spaces
From the analysis of
Figure 8, it can be seen that the polygons in the building plan have the following three properties:
- (1)
The polygon is a closed shape composed of several edges;
- (2)
The plane outside the map of the building plan is regarded as a polygon, each edge is shared by two and only two polygons, the left and right sides of the edge of the polygon belong to two different polygons, that is, the wall of the object can be regarded as two side walls, one wall corresponding to one room;
- (3)
If multiple outgoing edges of a point are put in a ring, the adjacent two edges on the ring must be contained in the same polygon.
Based on the above properties, the algorithm proposed in this paper for extracting indoor space polygons is as follows: If an edge is split into two parts, left and right, and each polygon occupies its “inner side” half, then each “half edge” can only contain and contain in one polygon; traverse each node one by one, and merge the two adjacent half edges on the ring formed by the outgoing edge of the node into the same polygon. We use some sets to represent these polygons, and by continuously merging, the half edges belonging to the same polygon are put into the same set. Finally, by traversing the sets, we can get the wall lines contained in each indoor space polygon (room and boundary polygon), that is, the indoor space contours and the external boundary contours. As shown in
Figure 8, taking R
2 as an example, a ring is established at the C
1 node. E
1, E
2 belong to R
1, E
3, E
1′ belong to R
2, and E
2′, E
3′ belong to R
3. The disjoint set initializes each half edge as a set, as shown in
Table 5. Processing the ring corresponds to the node in R
2, and the extracted results are shown in
Table 6. The first row is all elements in the set, and the second row is the set that each element belongs to.
4.3.1. Ring Establishment
The operation of enclosing each node’s outgoing edges in the building floor plan into a loop can be implemented based on computational geometric thinking.
Combining the chain forward star (essentially an adjacency list simulated by array), the edges construct a five-tuple, , where a is the starting point, b is the ending point, type is the edge type (original edge 0 (forward edge), derived edge 1 (reverse edge)), id is the edge number, and next is the number of the next outgoing edge from a:
- (1)
For each node, use the chain forward star (essentially using an array to simulate the adjacency list of the link list) to quickly find all outgoing edges of the current node, and record the coordinates of the current point (
x0,
y0) and the coordinates of the other node (
xi, yi) on the i-th edge where the current point is located. Set the node being processed as the temporary origin, then (xi, yi) is transformed into the relative coordinates of (
x0, y0), i.e., (
x = xi − x0,
y = yi − y0). Based on the distribution method of quadrant, all wall lines corresponding edges are divided into eight sets according to their coordinates. The parameters
Ki and
Ti are given to describe the orientation of the space where the Ith wall line is located, and its value setting is as follows:
- (2)
For the same set, all the edge vectors are sorted by slope, and if the slope is the same, order according to the length of the line connecting to the origin as the second keyword in a positive order;
- (3)
Sort all the outgoing edges of the current point according to the set label from small to large, and if the set labels are the same, sort them according to their order in the set.
Construct a ring to the outgoing edges of each node in the building plan according to the above rules, use a circular queue to simulate a ring in the implementation, get all adjacent edges, as shown in
Figure 9.
4.3.2. Indoor Space Reconstruction
First, process each node in the ring one by one, traverse each node in the ring, then find the edge (a) connected with the current node and the edge (b) connected with the next node in the ring. Merge the left half of a relative to the current node and the right half of b relative to the current node. To ensure that the same half side is found when switching the reference system (current node), the edges need to be directed (that is, randomly determine a positive direction for each edge, and then the left and right of the edge are determined according to its left and right relative to the starting point). Finally, all adjacent half sides are merged to form some sets, and each set corresponds to an indoor space polygon.
The steps of topological reconstruction of indoor space based on disjoint set are as follows:
- (1)
Split the m edges in the building plan into 2×m half-edges and number them sequentially, , where . _edge is the storage location of the edge, the id=_edge>>1, when reading all the edges, binary right shift one bit, restore to decimal, and set as the id of the half edge;
- (2)
Create a union-find with a capacity of 2×m, initialize the union-find array, and put the 2×m half-edges into 2×m sets sequentially. In the tree representation, each tree node represents an element in the set, that is, a half-edge in the building planar map, and the element value is the corresponding number of the half-edge. In the one-dimensional array representation, the array subscript is the corresponding number of the half-edge, and the array element value is the name of the corresponding set. Initialize each set’s root to the element itself;
- (3)
Call , set the root of one set to the root of the adjacent edge that needs to be merged, and call to query the set to which these two edges belong. If the two semi-edges originally do not belong to the same set, then merge the two sets, and dynamically update the corresponding relationship between elements and their parents, until all adjacent half-edges on the ring in the building plan are merged.
For simple polygons (such as R
5, R
6 in
Figure 2), the inner and outer edges of the above process will be merged twice, thus generating two duplicate polygons. Therefore, if the number and number of edges of two sets are the same, one of them is deleted to ensure the accuracy of extracting the number of indoor space polygons.
- (4)
Establish a mapping from the set number to the double-ended queue to store the edges contained in each polygon. After merging, the elements contained in each set are the boundary sets of each indoor space polygon (room and boundary polygon). Look up the set of each edge, put all the edges into the corresponding double-ended queue of the set, and traverse them one by one to get all the indoor space contours and external boundary contours.
4.4. Determination of Island and Boundary Polygons
Island polygons can be understood in the graph theory as the outer boundary contour of a “connected graph”, and a boundary of a connected graph corresponds to an island polygon. Island polygons can also represent the internal boundaries of those areas that contain “islands”. Island polygons are separated from each other and have no common edges. The building plan may contain zero, one, or more island polygons. All island polygons are divided into two categories, one in different outer polygons, the other in polygons containing infinity points. In this paper, we consider the second type of island polygon as the outer outline of the floor, that is, the boundary polygon, which reflects the spatial pattern of the whole floor. After extracting the indoor space, 12 polygons R
1~R
12 can be obtained, as shown in
Figure 10. Intuitively, R
1 is a boundary polygon, R
7 and R
8 are nested multi-island polygons, R
9 is an independent island polygon, and R
10, R
11, and R
12 are continuous multi-island polygons.
Converting a subgraph that is connected between any two points in a building plan into blocks is called the connected blocks of the building plan. A face’s outer boundary can only correspond to one polygon, so the maximum polygon in the connected block can be intuitively described as the “outer contour”, so we can identify the islands polygons and boundary polygons by determining the connected blocks and finding the maximum polygons in each connected block. Connected block determination can be implemented through the deep first search algorithm (DFS), specifically as follows:
(1) Select an unvisited point to do a depth-first search, find all the nodes that can be directly or indirectly accessed, mark each visited node with the same label, marking which connected block it belongs to, and “coloring” all vertices of the block as visited; (2) continue to find the next unvisited point until there are no unvisited points, and the algorithm ends. During this execution process, only unvisited nodes will do DFS, so each connected block has only one DFS entrance. By calculating the DFS entrance, the number of connected blocks can be judged; the same serial number label is used to identify the vertices belonging to each connected block, so that the vertices contained in each connected block can be identified.
In complex indoor environments, such as large shopping centers and office buildings, there exist complex island polygons, nested multiple islands, and continuous multiple islands. The literature [
23] uses Boolean differences to subtract the space occupied by isolated columns and walls in the indoor space, and the area occupied by isolated components in the indoor space exists in the form of isolated island polygons without consideration of how to extract complex islands boundaries (such as R
7 in
Figure 10). The literature [
10] can only extract single islands, and the efficiency of island attribution determination needs to be improved. At the same time, the literature [
21] also needs further processing of building plans with columns and beams. In view of the above situation, an algorithm is proposed to identify all isolated island polygons and floor external contours in the building plan by using the number and area of connected blocks, which is of high universality and its time complexity is
. Detailed algorithm steps are as follows:
- Step 1:
Determine whether there is an island by the number of connected blocks. Identify all the connected blocks in the building plan. If the number of connected blocks is one, then the building plan does not contain any islands. Otherwise, in addition to the outermost connected block, the outer boundary of each connected block is an isolated polygon;
- Step 2:
Compare the area size of the same connected block to find the outer contour of the connected block. For each connected block, traverse all the polygons and label the polygon with the largest directed area as the outer contour of the connected block, and record the area of the polygon as the area of the connected block;
- Step 3:
Traverse each connected block to find the connected block with the largest area and mark its outer contour as the outer contour of the floor, i.e., the boundary polygon of the floor (such as R
1 in
Figure 10), representing the entire floor area;
- Step 4:
Determine whether the connected block belongs to the boundary polygon, find the connected block without the floor outline label, and mark its outer boundary as an isolated polygon (such as R
7, R
8, R
9, and R
10—12 in
Figure 10).
4.5. Time Efficiency Analysis
Assuming the number of nodes is “n” and the number of edges is “e”, the entire algorithm is divided into three parts: hanging edge recognition, indoor space polygon extraction, and island and boundary polygon determination. In hanging edge recognition, each vertex is visited once and only enters and exits the stack once, and each edge is also only visited once, so the time complexity of this process is
. In indoor space, most of wall lines are perpendicular to each other and there are relatively few wall lines around the same corner point. Therefore, the number of outgoing edges contained in each point is
, so the complexity of computing ring time is
. The time complexity of initializing the disjoint set is
, and the time complexity of
times of merging
queries after path compression is
, where α is the inverse of Ackerman function [
28], which can be seen as less than 4, so the running time of the disjoint set is linear to the total number of operations. Strictly speaking, it is superlinear. In the process of extracting island polygons, the connectivity block determination process only needs to traverse each node and edge once, so the time complexity is
. For each connected block, process one by one to find the largest polygon, visit the polygon in the connected block a constant number of times, and the number of polygons and the number of edges e are on the same level, so the time complexity of this process is
. Therefore, the total time cost of the algorithm can be seen as approximately linear.
In LBS applications, the data required by users is downloaded from the data server in real time. Sometimes, the architectural plan data of large buildings after downloading needs to be established in a real time topological relationship. In the traditional topological relationship determination algorithm, it is necessary to determine the inclusion relationship between the inner point of each polygon and other polygons to combine into complex polygons, whose time complexity is
. The algorithm proposed in [
10] for automatically constructing the indoor space topology model, but in the extraction of room polygons, all edges must be traversed again each time to track new rooms, thus increasing the computational complexity. The algorithm in [
17] will affect the generation efficiency when the targeted polygon region is a narrow area, and the search loop algorithm in [
22] has a square relationship with the number of elements. The proposed algorithm improves the efficiency of indoor spatial information extraction, reduces the computational complexity, and can provide a more effective data organization basis for large indoor navigation systems.
6. Conclusions
The topology model of indoor space is the basis of indoor navigation and positioning services. The effective organization and expression of indoor spatial features directly affects the level of navigation and positioning services. Building plans cannot be applied directly to indoor navigation maps because of the lack of expression of room elements and their topological relationships in them. Focusing on indoor navigation requirements, using building plans as data sources, a fast algorithm for building indoor spatial topological model based on the graph and disjoint set is proposed in this paper. The proposed algorithm can automatically extract the indoor polygonal space of a building using building plan data, and fully describe and express the topological relationships between these spatial objects. In addition, the algorithm can well identify and properly handle isolated walls, continuous multi-island polygons, and nested polygons used to divide different functional areas in complex buildings. The indoor spatial topology model provides necessary data support for indoor location services, such as indoor navigation, path planning, emergency evacuation, and building operation management.
Currently, this algorithm lacks consideration for extracting the semantic information of indoor spatial polygons, which can be extracted from the annotation of building plans. If there is no corresponding annotation, it may be considered to use deep learning methods for learning and recognition. Of course, to obtain more detailed indoor spatial element information, in the future, we will use deep learning methods to extract various indoor elements of the building plan structure, thereby constructing a more complete indoor spatial topology model.