1. Introduction
Owing to recent improvements, mobile laser scanners (MLS) have become an effective means of data collection in urban and indoor scenes. Indoor mobile laser scanners (IMLS) are capable of quick data collection at a lower cost than terrestrial laser scanners (TLS). Three common types of IMLS devices can be distinguished: handheld devices (e.g., Zeb-Revo), push-cart systems (e.g., NavVis Trolley) and backpack systems (e.g., Leica Pegasus). Thanks to their mobility, these devices can achieve a more complete coverage of cluttered scenes in a shorter time.
In addition to generating point clouds, IMLS systems generate a trajectory of the sensor positions, which is a valuable source for scene understanding. The trajectory can be linked to the point clouds through the time stamp. In robotics, some researchers have exploited the robot’s trajectory to classify indoor places from both the trajectory and point clouds [
1,
2]. However, the trajectory can be exploited further for understanding indoor scenes. In our research, the trajectory is used for the detection of openings, the separation of building floors and the detection of stairs. For example, the trajectory, as a set of scanner positions, is used for occlusion reasoning to discriminate between openings and occlusions. Furthermore, wall planes that are intersected by the trajectory can be used to detect doors. Points that belong to stairs can be extracted using the part of the trajectory that traverses the stairs. Obviously, detecting stairs by trajectory analysis is only applicable for laser scanners that are operable on stairs, i.e., for backpack and handheld systems.
In addition to using the trajectories, our research introduces a method for detecting the permanent structure, such as walls, floors, ceilings, and stairs from point clouds. Most current indoor reconstruction methods are limited by assuming vertical walls and a Manhattan World [
3,
4,
5] to reduce the complexity of 3D space. Few works deal with arbitrary wall layouts [
6,
7,
8], but they are restricted to vertical walls and horizontal ceilings. Our method detects slanted walls and sloped ceilings by exploiting the adjacency of permanent structures, based on the assumption that there is less clutter near the ceiling in indoor environments. Additionally, arbitrary arrangements of walls (non-Manhattan-World) are handled in this work. Our pipeline for the semantic labeling of permanent structures detects planar primitives, labels them as wall, floor or ceiling, and exploits their topological relations.
Room segmentation is another research problem in large-scale indoor modeling. In the literature, different approaches, such as Voronoi graphs, cell decomposition, binary space partitioning and morphology operators [
9] are suggested for 2D and 3D room segmentation. Some of these methods have limitations, such as Manhattan-World constraints and vertical walls. Most of the room segmentation methods rely on the viewpoint [
8,
10] and require scanning with a TLS in each room [
7]. However, as opposed to one scanning location per room, mobile laser scanning systems produce a continuous trajectory, so assigning points to rooms based on the scan location is not possible. Similar to our method for trajectory analysis, refs. [
11,
12] exploit the trajectory for space subdivision. Although their focus is only on space subdivision and simple structures, their results support our motivation of using the trajectory for the interpretation of point clouds.
In our pipeline, a novel method is suggested for partitioning interior spaces based on voxels and exploiting unoccupied space. Besides knowing the room layout, information about the doors, walkable space and stairs supports navigation planning. Therefore, voxels are used to identify the walkable space and the trajectory to identify the stairs and doors.
Reflective surfaces, such as glass, complicate the analysis of indoor point clouds. Such surfaces cause the appearance of “ghost walls” in the data that do not exist in the real building. Ghost walls may incorrectly be detected as part of the room layout and sometimes result in an incorrect room segmentation. The problem of transparent and specular surfaces is addressed in robotics applications [
13,
14]. We tackle this problem by comparing the time stamps of points with the time stamps of the nearest trajectory parts before starting the wall detection process. With our method, some of the noise caused by reflective surfaces can be corrected.
The contribution of this work is introducing methods that use the sensor trajectory as a valuable source for the semantic labeling of IMLS point clouds. The result is not a watertight model, but a coarse 3D model extracted from heavily cluttered data in the presence of noise. Some of the methods presented in this work (e.g., door detection) are limited to mobile laser scanner data because they rely on the trajectory. Most of our methods, however, are applicable to TLS point clouds as well. For example, the methods for wall, floor, and ceiling detection can be applied to both RGBD data and TLS point clouds. The proposed methods are tested on three types of mobile laser scanner data: backpack systems, trolley (push-cart) systems, and handheld devices. The rest of the paper explains the related work and data collection, followed by the methodology for permanent structure detection, space partitioning and door detection in
Section 4,
Section 5 and
Section 6, respectively. The results, evaluation and conclusion are described in
Section 7 and
Section 8.
2. Related Work
In this work, several known problems are addressed in the domain of indoor modeling, such as detection of permanent structures, room segmentation, opening detection and dealing with noise and reflective surfaces. For each of the cases, the state of the art is reviewed in the following subsections.
Data acquisition: The first step in any indoor modeling pipeline from real data is data collection and preprocessing to clean up the data. The main sources of data for large-scale indoor modeling are point clouds from LiDAR systems or RGBD systems. LiDAR systems can be TLS devices, such as the RIEGL VZ [
15], FARO FOCUS [
16], or MLS devices, such as the Google Cartographer backpack [
17], Leica Pegasus backpack [
18], NavVis M3 Trolley [
19], VIAMETRIS iMS3D [
20] and Zeb-Revo and Zeb-1 [
21]. RGBD cameras, such as Matterport [
22] and Google Tango [
23], are another source of data for indoor modeling. However, RGBD cameras have less accuracy in comparison with TLS or MLS. Lehtola et al. [
24] present a thorough review of various indoor mobile laser scanners based on Simultaneous Localization And Mapping (SLAM). According to their study, TLS systems have the highest accuracy, but less flexibility than MLS for indoor data acquisition. Backpack and handheld systems have the most mobility, but at the cost of a lower accuracy than trolley and TLS devices. Trolley devices are constrained to near-flat surfaces; they cannot be used on staircases and steep slopes. RGBD cameras are accurate enough for indoor 3D modeling and scene understanding purposes, but not for surveying. In our research, we only use point clouds from laser scanner systems, namely data from the NavVis M3 Trolley, the handheld Zeb-1 and Zeb-Revo, and a prototype backpack system (ITC Backpack) based on the proof of concept of 6DOF SLAM [
25].
Reflective Surfaces: The first step after data acquisition is dealing with noise and artefacts. Often these artefacts come from transparent and specular surfaces. Koch et al. [
14] investigate this problem to identify specular and transparent surfaces during scanning with a SLAM robot. Their goal is to identify and purge the corrupted points from the data, either on the fly or in post-processing. The intensity of the reflected laser pulse and the material of the surface (e.g., aluminum, glass, and mirror) often have a distinctive distribution that allows discriminating transparent and reflective surfaces. However, the detection of transparent surfaces is more challenging because of the characteristics of the material. In another study, Foster et al. [
13] employ both the geometry and the angle of incidence between the laser and the surface during scanning. They suggest that only at particular angles of incidence are specular and glass surfaces visible to LiDAR, and thus glass can be detected.
Approaches to indoor reconstruction, either from LiDAR point clouds or RGBD images, can be grouped into the three following categories:
Indoor Volumetric Reconstruction: These approaches involve volumetric primitive detection (e.g., cuboid) and are often computationally more expensive than grammar-based and Binary Space Partitioning (BSP) methods. However, volumetric methods have a better representation of non-Manhattan-World structures, slanted and rounded walls and sloped ceilings. Xiao et al. [
26] employ inverse constructive solid geometry (Inverse CSG) to build the 3D model. A 3D CSG is generated by iteratively stacking 2D CSG models. Each 2D CSG model is produced with many line segments that form various rectangle primitives. Their approach cannot model rounded walls because their hypothesis is based on extracting rectangles. Mura et al. [
10] apply the piecewise-planar detection and encode the adjacency of planar segments into a graph that represents the scene.
Indoor grammar-based Reconstruction: One popular modeling approach, especially in regular environments, is adopting a (shape) grammar [
27,
28,
29], Lindenmayer Systems (L-systems) [
30] or (inverse) procedural modeling [
31,
32,
33,
34] approaches for interiors. Becker et al. [
5] use a combination of a split grammar and an L-system to reconstruct a 3D model for as-built BIM (Building Information Model). Their approach has a different view of the indoor space, since it divides the building into two main partitions: corridors and rooms. In another innovative approach, Ikehata et al. [
3] introduce an indoor structure grammar consisting of eight rules. Their approach is limited to Manhattan-World structures and 2.5D space. In [
29,
35,
36], the authors apply simple shape grammars to reconstruct indoor models that are clutter free.
Binary Space Partitioning (BSP) or cell decomposition: In the domain of indoor reconstruction, many researchers use BSP to tackle the problem of room segmentation. In indoor space partitioning, BSP is a piecewise-planar approach that subdivides the space into 2D cells and generates a 2.5D model as output [
4,
7,
37,
38]. 2D-BSP approaches assume both vertical walls and horizontal ceilings, which is a shortcoming of 2D-BSP. If BSP is implemented in 3D, it results in a 3D reconstructed model [
10,
39,
40], where the limitations of vertical walls and horizontal ceilings can be lifted. Additionally, BSP methods are able to assign the 2D or 3D cells of the space partitioning to rooms based on the viewpoint and ray casting. However, this requires scan positions per room with enough overlap to make the room labeling process possible. The main problems of BSP approaches are the restriction of viewpoints, the emergence of ghost primitives and the computational cost of labeling the cells as inside or outside.
Opening Detection: Among the works on the indoor reconstruction of point clouds, some [
3,
7,
41,
42,
43,
44,
45,
46] consider the problem of opening detection (doors and windows) and reconstruct the openings in their final model. Doors are essential elements for route planning and space subdivision. In our definition, openings are not limited to doors, but include any opening in a wall that can be passed by individuals and connects two spaces. However, in cluttered environments, because of the presence of furniture and obstacles, many walls can have data gaps that may be falsely considered as openings. Adan and Huber [
41] propose an occlusion test to detect windows in the walls. Ikehata et al. [
3] use a grammar rule to add a door in the wall between two separate rooms such that the walls are connected through a doorway. Therefore, in their pipeline, the addition of the doors is after reconstruction of the room. In a recent work Diaz-Vilarino et al. [
44] use the trajectory for door detection, followed by an energy minimization to separate rooms given the known locations of the doors. However, their example is a simple and clutter-free dataset. Another approach for door detection, especially in the robotics domain, is using images in addition to point clouds for the detection of semi-open and closed doors. Quintana et al. [
45] and Diaz-Vilarino [
46] present such techniques for detecting closed doors from images and point clouds.
Similar to our approach, authors of [
11,
47] use the trajectory for the semantic enrichment of indoor spaces. The authors exploit the fact that doors are the connecting elements between two spaces. By detecting the doors using the trajectory, it is possible to partition the trajectory and the space. This approach is only suitable for interiors with a low number of transparent surfaces. Similarly, Zheng et al. [
12] analyze the scanlines to find local geometric regularities and to detect openings. Using the extracted information, such as doors from scan lines, it is possible to segment the trajectory into associated spaces and subdivide the space. Both approaches may give poor results in environments with a large number of transparent surfaces or when the operator of the laser scanner behaves inconsistently.
There is a large body of literature regarding scene understanding in small-scale indoor spaces, such as the detection of objects in a kitchen [
48,
49] for robot operation or in a bedroom [
50,
51]. In large-scale there are works by Armeni et al. [
52] for scene parsing, Mattausch et al. [
53] using a similarity matrix in cluttered environment and Qi et al. [
54] using deep learning for object classification. Some other works in the domain of indoor 3D reconstruction from point clouds use semi-automatic approaches to generate BIM models [
55,
56,
57] or stochastic methods to make a hypothesis on generating floor plans [
58].
Our work is innovative in dealing with glass reflection problems using mobile laser scanners and in exploiting the potential of trajectories as supplementary data produced by MLS systems. This work can be further extended to reconstruct complete 3D indoor models of complex structures. Furthermore, the generated navigable space can be used for route planning in 2D (e.g., for pedestrians, wheelchairs and robots) and in 3D space (drones).
4. Permanent Structure Detection
For the detection of walls, floors and ceilings, the surface patches generated in the previous step are further processed. An adjacency graph is constructed from the patches and analyzed to infer the correct class of each patch (
Section 4.2). For the detection of openings, an occlusion reasoning method is applied to discriminate between real openings and gaps that are caused by occlusion (
Section 4.3). The occlusion test is also used to remove points that are outside the building layout and could disturb the reconstruction process. To start detecting the permanent structure, the building levels are separated, and each level is then processed separately (
Section 4.1).
4.1. Separation of Building Levels and Stairs
The typical solution in the literature [
10,
37,
63] for separating building levels in indoor point clouds is using a height histogram of points. A level in a building is a horizontal section that extends over the floor space. Using the histogram is straightforward and gives an initial separation of the building levels. However, it is not applicable to buildings where a building level extends vertically into other levels (see
Figure 4a) or to a building with sub-levels. To overcome this problem in complex architectures, the trajectory is first separated into several levels and staircases. If the trajectory belongs to a handheld or a backpack system, the separation is done where the operator enters the stairs; thus, the flat trajectory can be split from the sloped trajectory on the staircase. If the trajectory belongs to a push-cart scanner, the trajectories of the levels are already separated, because the device cannot move up or down the stairs.
To separate the levels, the process starts with the segmentation of the trajectory into horizontal and sloped segments. A surface growing segmentation is used, and points on the same horizontal or sloped plane are segmented together.
Figure 4b shows that the trajectory points in the upper level (blue segment) belong to the same level and that points on the staircases are segmented together. However, this segmentation needs a modification to make sure that staircases are separated correctly. For example, if in the same level of the trajectory there are several segments with a height difference of less than two meters (see
Figure 4c, the orange and purple segments on the first floor), they will be merged. This is done because trajectories belonging to different levels typically have a height difference larger than the ceiling height (at least two meters). After separating the trajectory into meaningful building levels, for each segment of the trajectory, the associated points from the point cloud are selected using the timestamp.
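The level-merging step above can be sketched as follows. This is a minimal illustration with a hypothetical trajectory representation: segment ids are assumed to come from the surface growing segmentation, and each segment is summarized by its median height before segments closer than two meters are grouped into one level.

```python
# Sketch (not the authors' implementation): merge horizontal trajectory
# segments that belong to the same building level (height gap < 2 m).
from collections import defaultdict

def merge_level_segments(traj_points, max_height_gap=2.0):
    """traj_points: iterable of (segment_id, z). Returns {segment_id: level_id}."""
    # collect heights per segment and take the median as the segment height
    heights = defaultdict(list)
    for seg_id, z in traj_points:
        heights[seg_id].append(z)
    med = {s: sorted(zs)[len(zs) // 2] for s, zs in heights.items()}

    # walk through segments sorted by height; a jump of >= 2 m (roughly the
    # minimum ceiling height) starts a new building level
    level_of, level_id, prev_h = {}, -1, None
    for seg_id in sorted(med, key=med.get):
        if prev_h is None or med[seg_id] - prev_h >= max_height_gap:
            level_id += 1
        level_of[seg_id] = level_id
        prev_h = med[seg_id]
    return level_of
```

After this grouping, the point cloud of each level is selected via the timestamps of the trajectory segments assigned to that level.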
Near the staircases, the laser scanner measures points from other levels. To assign these points to their correct level, the two dominant horizontal planes are detected as the floor and ceiling of the current level, and the labels of such points are changed to the corresponding levels.
Figure 4d shows the first and third level of the building. After separation of levels, each level will be processed individually for detection of walls, floors, and ceilings.
The point clouds of the stairs are extracted using the trajectory segments of stairs and the associated timestamp.
Figure 4e shows four different stairs datasets, colored based on four segments of the trajectory. Because a large portion of other levels may be visible from the stairs, an overlap between the point clouds of the stairs and the floors is sometimes inevitable. For example, in
Figure 4e part of the floors are also scanned from the stairs.
4.2. Wall Detection
The wall detection process includes detecting the permanent structures, such as walls, floors and ceilings. This process starts by building an adjacency graph G from the surface patches S. The adjacency graph is represented as G = (V, E), where the nodes V are surface patches and the edges E connect two adjacent nodes. Each node is associated with the point cloud of a surface patch S. When a label l is assigned to a surface patch, all the associated points obtain that label. The label denotes the class of the surface, such as wall, floor, ceiling, door, or window.
Two nodes V are adjacent if their corresponding surface patches are within a specific distance from each other. This distance is set to dadj = 0.1 m in all of our experiments. Note that coplanar and parallel segments have already been merged. Therefore, two adjacent surface patches can meet at any arbitrary angle, which means our method is not limited to the Manhattan World. To deal with slanted walls and non-horizontal ceilings, an angle threshold α is specified to separate the candidate walls and ceilings before proceeding with the analysis of the graph. Each node in the graph is labeled as almost-vertical or almost-horizontal based on the threshold α. By default, this threshold is set to α = 45 degrees to make a primary separation between candidate ceilings and walls. Considering this threshold, the nodes V of the graph G are categorized into Vh and Vv for almost-horizontal and almost-vertical nodes. By comparing a pair of surface patches (v1, v2), one of three principal labels is assigned to each edge e ∈ E connecting adjacent nodes v1, v2:
E obtains the label wall-wall iff v1 and v2 are both almost-vertical and adjacent.
E obtains the label wall-ceiling iff v1 and v2 are almost-vertical and almost-horizontal respectively and the center of v2 is higher than the center of v1.
E obtains the label wall-floor iff v1 and v2 are almost-vertical and almost-horizontal respectively and the center of v2 is lower than the center of v1.
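The edge labeling above can be sketched as follows. This is a minimal illustration, not the authors' implementation: representing each patch by a normal vector and a center point is our assumption, and the verticality test uses the tilt of the normal against the α threshold.

```python
# Sketch: classify patches as almost-vertical/almost-horizontal and label
# the graph edge between two adjacent patches (wall-wall, wall-ceiling,
# wall-floor), following the three principal labels described in the text.
import math

def node_class(normal, alpha_deg=45.0):
    """Tilt of the patch normal against the vertical axis decides the class:
    a small tilt means a horizontal surface (normal points up)."""
    nx, ny, nz = normal
    tilt = math.degrees(math.acos(abs(nz) / math.sqrt(nx*nx + ny*ny + nz*nz)))
    return 'almost-horizontal' if tilt < alpha_deg else 'almost-vertical'

def edge_label(v1, v2, alpha_deg=45.0):
    """v1, v2: dicts with 'normal' and 'center' = (x, y, z) (hypothetical layout)."""
    c1 = node_class(v1['normal'], alpha_deg)
    c2 = node_class(v2['normal'], alpha_deg)
    if c1 == c2 == 'almost-vertical':
        return 'wall-wall'
    if c1 == 'almost-vertical' and c2 == 'almost-horizontal':
        # the relative height of the horizontal patch decides ceiling vs floor
        return 'wall-ceiling' if v2['center'][2] > v1['center'][2] else 'wall-floor'
    if c1 == 'almost-horizontal' and c2 == 'almost-vertical':
        return edge_label(v2, v1, alpha_deg)   # symmetric case
    return None   # two horizontal patches: no principal label
```

Because only the relative orientation and relative height of the two patches are used, the labeling works for arbitrary (non-Manhattan) wall arrangements.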
After labeling the edges, each node in the graph will be analyzed based on the connected edges and the respective labels. Three main rules are applied to each node v ϵ V to decide for the label:
- Rule 1.
V obtains the label wall iff the count of wall-ceiling edges is equal to or greater than one and V is almost-vertical. This means every wall should be connected to the ceiling at least once.
- Rule 2.
V obtains the label ceiling iff the count of wall-ceiling edges is more than two and the count of wall-wall edges is equal to zero. This means an almost-horizontal surface with wall-ceiling edges should be connected to walls more than twice to get the ceiling label.
- Rule 3.
V obtains the label floor iff the count of wall-floor edges is more than two and the count of wall-wall edges is equal to zero. This means an almost-horizontal surface with wall-floor edges should be connected to walls more than twice to get the floor label.
Note that in Rule 1, the connection of the wall candidates to the floor is not checked because of possibly heavy occlusions near the floor.
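Rules 1-3 can be sketched as a simple decision over the labels of the edges incident to a node. The function interface is hypothetical; the counts and conditions follow the rules stated above.

```python
# Sketch of Rules 1-3: decide the node label from its verticality class
# and the labels of its incident edges.
def label_node(node_class_, edge_labels):
    """node_class_: 'almost-vertical' or 'almost-horizontal';
    edge_labels: list of labels of all edges incident to this node."""
    n_wc = edge_labels.count('wall-ceiling')
    n_wf = edge_labels.count('wall-floor')
    n_ww = edge_labels.count('wall-wall')
    if node_class_ == 'almost-vertical' and n_wc >= 1:
        return 'wall'       # Rule 1: connected to the ceiling at least once
    if node_class_ == 'almost-horizontal' and n_wc > 2 and n_ww == 0:
        return 'ceiling'    # Rule 2: more than two wall connections from above
    if node_class_ == 'almost-horizontal' and n_wf > 2 and n_ww == 0:
        return 'floor'      # Rule 3: more than two wall connections from below
    return 'unknown'
```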
During the processing of the rules, further considerations are applied as soft rules. For example, when applying the second and third rules, not every almost-horizontal surface can be a floor or ceiling candidate. This concerns especially the horizontal surfaces of shelves and tables. Therefore, the average z-value of a horizontal patch is compared with an estimate of the floor and ceiling height to decide whether it is near the floor or the ceiling. In this way, horizontal surfaces of objects, such as tables and boxes, can be discarded. However, some horizontal surfaces that are near the floor or ceiling still disturb the correct semantic labeling. For example, the tops of shelves and cabinets that are near the ceiling could be labeled as ceiling (see
Figure 5b). As a consequence, vertical surfaces attached to them may also be mislabeled as walls. To avoid this problem, the overlap of the projections of almost-horizontal surfaces onto the xy-plane is checked before applying the rules. If the 2D projections of two horizontal surfaces overlap (considering a small buffer), only the upper surface is preserved as a ceiling candidate, and then the processing of the rules follows. Since the topological relations of the surfaces are exploited in our method, it is not limited to regular manmade structures or the Manhattan World.
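The xy-projection overlap check can be sketched with axis-aligned bounding boxes. This is a simplification of what the text describes: the actual patches are arbitrary planar segments, and reducing them to bounding boxes with a small buffer is our assumption for illustration.

```python
# Sketch: discard horizontal patches (e.g., cabinet tops) that lie below
# another overlapping horizontal patch, keeping only the upper one as a
# ceiling candidate.
def xy_overlap(bbox_a, bbox_b, buffer=0.05):
    """bbox: (xmin, ymin, xmax, ymax) of a patch projected onto the xy-plane."""
    ax0, ay0, ax1, ay1 = bbox_a
    bx0, by0, bx1, by1 = bbox_b
    return (ax0 - buffer < bx1 and bx0 - buffer < ax1 and
            ay0 - buffer < by1 and by0 - buffer < ay1)

def ceiling_candidates(patches, buffer=0.05):
    """patches: list of (bbox_xy, mean_z). Returns indices of patches that
    are not shadowed from above by an overlapping patch."""
    keep = []
    for i, (bb_i, z_i) in enumerate(patches):
        shadowed = any(j != i and xy_overlap(bb_i, bb_j, buffer) and z_j > z_i
                       for j, (bb_j, z_j) in enumerate(patches))
        if not shadowed:
            keep.append(i)
    return keep
```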
In the permanent structure detection method, a ceiling or floor is distinguished from a wall by the angle threshold, which is by default α = 45 degrees. By applying Rules 1, 2, and 3, a slanted surface can be labeled as a wall or as a ceiling (floor), depending on its normal angle. In our method, a slanted surface is distinguished by this user-defined angle threshold.
Figure 6 shows two different cases when
α is set to 40 and 50 degrees. However, there is a special case in which a slanted surface is classified as a wall and is supported by another vertical wall that is connected to the floor (see
Figure 6b). Such a case happens when a slanted wall and a vertical wall are not segmented into the same surface patch, since they have different normal angles during the generalization. Therefore, an extra check is required to verify whether an almost-vertical surface that is not connected to the ceiling is a wall or not. This check can be done by means of the support and adjacency relation between a slanted surface and a vertical surface. Let
v1 and
v2 represent the two almost-vertical surfaces and one of them is not connected to the ceiling, then the lower one (with a lower center) is called
supporter (
v1) and the upper one is called the
supported (v2). Furthermore, the condition
max-z(v1) <
min-z(v2), including a buffer, should be satisfied. Notice that checking the support relation is necessary; otherwise, objects attached to the wall could be labeled as a slanted wall. Following this reasoning, the corresponding edge (
E) of two adjacent wall candidates (
v1,
v2) could obtain the following label:
E obtains the label
wall-slantedwall iff v1 and
v2 are both almost-vertical
and the intersection line is almost-horizontal
and one surface is supporting the other one.
The following rule is applied to define the label of a node V: Rule 4. V obtains the label slantedwall iff the count of wall-slantedwall edges is more than zero, the count of wall-wall edges is more than zero, and V is almost-vertical.
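The support relation and Rule 4 can be sketched as follows. The field names (`max_z`, `min_z`) and the buffer value are our illustrative assumptions; the conditions follow the text above.

```python
# Sketch: support relation between two adjacent almost-vertical patches,
# and Rule 4 for labeling a slanted wall.
def is_supported(v_lower, v_upper, buffer=0.05):
    """The supporter's max-z must lie below the supported patch's min-z,
    within a small buffer (e.g., a vertical wall under a slanted wall)."""
    return v_lower['max_z'] < v_upper['min_z'] + buffer

def rule4_slanted_wall(is_almost_vertical, edge_labels):
    """Rule 4: at least one wall-slantedwall edge, at least one wall-wall
    edge, and the node itself is almost-vertical."""
    return (is_almost_vertical and
            edge_labels.count('wall-slantedwall') > 0 and
            edge_labels.count('wall-wall') > 0)
```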
Since a real dataset with slanted walls from an MLS system was not available, our algorithm is tested on a part of the penthouse dataset from Mura et al. [
10]. We treated the slanted surfaces once as a non-horizontal ceiling (α = 40) and once as slanted walls (α = 50).
Figure 6 demonstrates the results on a part of the penthouse building. This experiment shows the robustness of the algorithm in the case of non-horizontal ceilings or slanted walls. In the next section, a method is presented for detecting openings by using the trajectory and applying an occlusion test.
4.3. Opening Detection Using the MLS Trajectory
After detecting the walls, floors and ceilings, the point clouds are enriched with more semantics, such as openings (doors and windows). Reasonably, it is expected that doors and windows are located in the walls. Furthermore, openings are represented as holes or gaps in the data, because where there is an open door or a window, the laser rays go through the wall surface. The same gaps occur in the data if part of the scene is not captured by the laser scanner, e.g., because of occlusion. Therefore, one problem of opening detection is to discriminate between data gaps and real openings. We exploit the fact that a laser beam crossing a wall surface through an opening hits the objects behind the surface. Hence, from each location on the trajectory, a ray is reconstructed to the measured laser point. Note that here the time attribute of the points plays an important role, because from every point on the trajectory, only the points measured at that specific time are evaluated for the ray casting. This process is named the occlusion test and is implemented as follows (see
Figure 7): First, each surface patch Si with the wall label is enveloped by a 3D voxel grid (grid size of 10 cm). Second, a ray is constructed from
t1 on the trajectory to the corresponding point
p1 in the point cloud. If the ray intersects a surface
s1 ∈
Si, the intersection point of the ray and the surface corresponds to one of the voxels of the
s1. The incident voxel obtains one of four labels: occupied, occluded, open or unknown. The incident voxel is occupied if the measured point
p1 belongs to the
s1, occluded if
p1 is in front of the
s1, open if
p1 is behind the
s1, and unknown otherwise. If the ray does not intersect the surface, the labels remain unchanged.
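A single ray of the occlusion test can be sketched as a ray-plane intersection. This is a minimal sketch, not the authors' implementation: the voxel bookkeeping is omitted, and representing the wall patch by an infinite plane (point plus normal) is our simplification.

```python
# Sketch: label the wall location hit by the ray from the scanner position
# to the measured point, relative to the wall plane.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def classify_ray(traj_pos, point, plane_point, plane_normal, eps=0.02):
    """Returns 'occupied', 'occluded', 'open' or 'unknown' for the voxel
    where the ray (traj_pos -> point) crosses the wall plane."""
    d = tuple(p - o for o, p in zip(traj_pos, point))   # ray; s = 1 at the point
    denom = dot(plane_normal, d)
    if abs(denom) < 1e-12:
        return 'unknown'                                # ray parallel to the wall
    s = dot(plane_normal,
            tuple(q - o for o, q in zip(traj_pos, plane_point))) / denom
    if s < 0:
        return 'unknown'                                # wall behind the scanner
    if abs(s - 1.0) < eps:
        return 'occupied'                               # point lies on the wall
    if s > 1.0:
        return 'occluded'                               # point in front of the wall
    return 'open'                                       # point behind the wall
```

In the full method, the intersection position would additionally be snapped to the 10 cm voxel grid of the wall patch to accumulate these labels per voxel.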
After the occlusion-test process, the results need to be further inspected to identify false openings. False openings occur where clutter is connected to the ceiling and extends to the neighboring walls; during the occlusion test, such clutter is considered a surface with an opening (
Figure 8b). Such false openings are identified and removed if more than a percentage (e.g., 80%) of voxels in the wall surface are labeled as openings (
Figure 8c). With this simple check most of the false openings and erroneous walls are removed.
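The check above can be sketched in a few lines. The 80% threshold follows the text; the list-of-labels representation of a wall's voxels is our assumption.

```python
# Sketch: a detected wall surface is discarded (together with its false
# openings) if more than open_ratio of its voxels are labeled 'open'.
def is_false_wall(voxel_labels, open_ratio=0.8):
    n_open = sum(1 for lab in voxel_labels if lab == 'open')
    return n_open > open_ratio * len(voxel_labels)
```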
Furthermore, it is possible to separate the openings into those that intersect the floor (doors) and those that are above the floor (windows). However, the exact frame of an opening cannot be inferred because of noise and occlusion.
The occlusion test provides additional information about the points behind the wall surface. During the occlusion test, points that are behind each surface are flagged for further inspection. Each point
p1 that is behind the surface
s1 and is measured from
t1 on the trajectory, can be a reflected point or a point that is sensed through a transparent surface. In
Section 3.2, it was explained how to identify points that are caused by the reflection. Otherwise, the point is labeled as a
point-behind-surface artefact and is removed from the collection. Here, the assumption is that the objects behind an opening are scanned properly from the space they belong to. A point behind a surface is less reliable because it is possibly measured through a glass surface. For example, in one of the datasets (Fire Brigade building, level 2), because of the many glass surfaces in the façade, some of the rooms are partially mirrored to the outside of the building. Consequently, during the detection of the permanent structures, these reflections are mislabeled as walls, floors and ceilings. By removing points behind a surface, artefacts that are outside the building layout and could not be identified as reflections are removed.
6. Door Detection Using the Trajectory
In this stage, doors that are intersected by the trajectory during the data acquisition can be detected. Note that in
Section 4.3, some doors were already detected as openings by the occlusion test, while here it is possible to detect closed doors as well. For detecting doors using the trajectory, the voxels and the trajectory are the input data of the process. Voxels are used for this step because the algorithm tries to find the center of each door that is crossed by the trajectory (see
Figure 10).
The door center is represented by an empty voxel in the case of an open door and an occupied voxel in the case of a closed door. Each voxel in the voxel space is checked as to whether it can be the center of a door candidate (a door center). A voxel is a door center candidate if: (i) There is a trajectory near the voxel; (ii) occupied voxels exist above the voxel that represent the top part of a door frame; (iii) the neighborhood of the voxel is empty in the case of an open door. These three criteria enforce three main search radius parameters: (1) A search range to look for a nearby trajectory (rtraj < n * voxel-size); (2) a search radius to look for voxels at the top of the door frame relative to the floor (1.80 m < rtop < 2.10 m); and (3) a neighborhood search radius (rvoid < n * voxel-size) to make sure the space around the candidate voxel is empty, where the search radius is a multiple of the voxel size. The rvoid threshold should always be smaller than the door width to exclude the door frame from the search for an empty neighborhood. Empirically, if the percentage of empty voxels around a door center within the search radius (rvoid) exceeds 70% of the total neighboring voxels, then the third criterion for an open door is fulfilled. Furthermore, to speed up the calculation, only voxels located at a height between 0.80 and 1.10 m relative to the floor are explored as door centers, as the door center is expected to be at this height.
Closed Doors: Closed doors appear in the point cloud as part of the wall (
Figure 10, the middle door). When the trajectory crosses the door and the door is closed before or after the scanning, it appears in the data as if the trajectory went through the wall. To detect closed doors crossed by the trajectory, the same three criteria as for open doors are applied, with the difference that for the third criterion, the neighborhood of the candidate door-center voxel should be occupied instead of empty. Notice that simply intersecting wall planes with the trajectory is not sufficient to detect closed doors, because in cluttered rooms the trajectory passes between congested furniture or falsely detected walls, which can result in false doors. Therefore, checking the three criteria is also required for closed doors. The door detection algorithm using the trajectory can only be used for spotting the location of a door (including double doors); further inspection is required for identifying the door frame or the door leaf.
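The three door-center criteria can be sketched on an integer voxel grid. This is a simplified illustration: the set-based occupancy representation, the column-wise trajectory search, and the parameter values (r_void = 3 voxels, 0.1 m voxel size) are our assumptions, not the system's actual values.

```python
# Sketch: test whether a voxel v = (x, y, z) is a door-center candidate.
def is_door_center(v, occupied, traj_voxels, floor_z, voxel_size=0.1,
                   open_door=True, r_void=3, empty_ratio=0.7):
    """occupied: set of occupied voxels; traj_voxels: set of voxels crossed
    by the trajectory; floor_z: voxel z-index of the floor below v."""
    x, y, z = v
    # speed-up: door centers are expected 0.8-1.10 m above the floor
    if not 0.8 <= (z - floor_z) * voxel_size <= 1.10:
        return False
    # (i) a trajectory must pass nearby (here: through the voxel column)
    if not any((x, y, zt) in traj_voxels
               for zt in range(z - r_void, z + r_void + 1)):
        return False
    # (ii) occupied voxels above the candidate: top of the door frame,
    # 1.80-2.10 m above the floor
    frame = range(round(1.80 / voxel_size), round(2.10 / voxel_size) + 1)
    if not any((x, y, floor_z + dz) in occupied for dz in frame):
        return False
    # (iii) neighborhood mostly empty (open door) or occupied (closed door)
    nbh = [(x + dx, y + dy, z + dz)
           for dx in range(-r_void, r_void + 1)
           for dy in range(-r_void, r_void + 1)
           for dz in range(-r_void, r_void + 1) if (dx, dy, dz) != (0, 0, 0)]
    empty = sum(1 for q in nbh if q not in occupied) / len(nbh)
    return empty >= empty_ratio if open_door else empty < (1 - empty_ratio)
```

For closed doors, the same function would be called with `open_door=False`, inverting the neighborhood-emptiness requirement as described above.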
7. Results and Evaluation
Our approach is tested on four datasets collected with four different mobile laser scanners. The details of the datasets and the scanners are given in
Table 1. The results of all datasets and the ground truth are shown in
Figure 11. The results show that 80% of the doors and more than 85% of the rooms are correctly detected. Our methods are tested on buildings that violate the 2.5D and Manhattan World assumptions. The space partitioning results (
Figure 11, 3rd column) demonstrate our constraint-free approach on arbitrary room layouts with different ceiling heights. Both the Fire Brigade building level 1 dataset and the TU Delft Architecture building have large halls with high ceilings and different ceiling heights in other rooms.
Figure 11 (3rd column, first and last row) illustrates the extracted spaces for these two datasets. The permanent structures in
Figure 11 (2nd column) indicate that our pipeline is capable of detecting most of the walls and openings in heavily cluttered environments with many reflective surfaces.
Each dataset is subsampled to ease the visualization and to decrease the processing time. For subsampling, every k-th point of a kNN neighborhood is used, with a reduction factor between 4 and 6 to decrease the size of the original dataset while keeping the structure of the point clouds. The subsampling keeps the average point spacing below 0.05 m.
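The subsampling step can be illustrated with a simplified sketch. This uses plain uniform decimation and a brute-force spacing check rather than the kNN-based selection described above; both helper functions are hypothetical.

```python
import numpy as np

def decimate(points, k):
    """Uniform decimation: keep every k-th point (reduction factor k)."""
    return points[::k]

def mean_point_spacing(points):
    """Brute-force mean nearest-neighbor distance (only viable for small clouds)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude each point's distance to itself
    return d.min(axis=1).mean()
```

A reduction factor in the 4-6 range would be chosen so that the resulting mean spacing stays below the 0.05 m target.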
The other important influential factors are noise, the level of clutter and the number of glass surfaces in the data. The level of noise depends on the sensor's precision and the SLAM algorithm. For details of the accuracy and noise of each MLS device, readers are referred to the specifications of each product and the review by Lehtola et al. [
24]. In terms of high clutter and a large number of glass surfaces, the Fire Brigade dataset poses a significant challenge because of its very large glass walls. Such glass walls, as well as heavy clutter, are present on the first floor (
Figure 12), where the fire trucks are located.
Implementation: All algorithms are written in C++ and tested on a Lenovo ThinkPad workstation with an Intel Core i7 (2.5 GHz) and 16 GB RAM. The main computational cost lies in the occlusion test, because of the ray-casting algorithm; there the peak RAM usage is 16 GB, and for large datasets it takes up to an hour. Another expensive process is space partitioning, because of the 3D morphological operations on a large number of voxels. For an area with almost 15 million voxels, it takes 10 min with a voxel size of 10 cm and 3 min with a voxel size of 20 cm. Other algorithms, including permanent structure detection, door detection, reflection removal, level partitioning and surface growing, take seconds up to 5 min depending on the size of the dataset. The computation time for space partitioning depends more on the volume of the building and the height of the ceiling than on the size of the point clouds. For example, for the TU Delft dataset the number of voxels exceeds 100 million, since the orange hall has high ceilings (almost 13 m). Therefore, space partitioning is run with a 20 cm voxel size for this dataset to speed up the process.
Parameter Selection: Our pipeline starts with the surface growing segmentation, followed by a surface patch generalization algorithm. For the surface growing segmentation, the most important parameter is the smoothness threshold. The optimal value depends on the amount of sensor noise and the noise caused by the SLAM algorithm. For the MLS devices in this article, the sensor noise is less than 5 cm. However, there is more noise in the data introduced by the SLAM algorithm and by artefacts of glass reflections. Therefore, we found a value between 10 and 15 cm to be a good threshold for planar segmentation of the Zeb1 datasets, and between 5 and 10 cm for the other datasets (ZebRevo, ITC Backpack and NavVis Trolley). For the surface patch generalization, nearby surfaces are considered parallel if the angle between their normal vectors is less than θ = 10° and their proximity (d) along the plane is less than 60 cm. The time lag Δt is the important parameter for detecting and pruning ghost walls. Empirically, we choose a time lag of 150 s for a point to be considered a reflected point, and if more than 70% of the points in a segment have this time difference with their neighboring trajectory, the segment is classified as a ghost wall.
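The ghost-wall pruning rule can be sketched as follows; `point_times` and `nearest_traj_times` are hypothetical per-point arrays (the acquisition time of each point and the time stamp of its nearest trajectory position), and the two thresholds are the values given above.

```python
import numpy as np

DT_GHOST = 150.0   # time lag (s) above which a point is considered a reflection
FRACTION = 0.70    # fraction of lagged points that marks a segment as a ghost wall

def is_ghost_wall(point_times, nearest_traj_times):
    """Flag a planar segment as a ghost wall (reflection artefact) when most of
    its points were recorded long after or before the scanner passed nearby."""
    lag = np.abs(np.asarray(point_times) - np.asarray(nearest_traj_times))
    return float((lag > DT_GHOST).mean()) > FRACTION
```

A real wall scanned directly has small lags for nearly all of its points, so it falls well below the 70% fraction.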
For reconstructing the adjacency graph, two surface patches are considered adjacent if their distance is below dadj = 10 cm, and the minimum length of an intersection line is 20 cm. We found a minimum length of 20 cm to be reasonable for most datasets. There is one special case in which the threshold is increased to 25 cm: when door frames extend to the ceiling (e.g., the glass rooms in
Figure 13). To avoid door leaves being misclassified as walls, a minimum intersection-line length of 25 cm is used there.
A default threshold of 45° is used to separate surfaces into almost-vertical and almost-horizontal. In the case of differently sloped ceilings, this angle threshold can be changed to distinguish ceilings from slanted walls. The minimum number of supporting points for a surface to be included in the adjacency graph is 500. A voxel size of 10 cm is preferred for the algorithms operating on voxels, such as the occlusion test, space partitioning and door detection. For a point spacing of 2 to 5 cm, a voxel size of 10 cm offers a good balance between computation time and the amount of preserved detail of the permanent structures. The window size of the morphological operator for space partitioning should be larger than a doorway to ensure the separation of spaces at the locations of open doors. Therefore, a window size between 1.0 and 1.3 m is suggested. Other soft parameters, such as those for the kNN search, proximity search and connected components, do not have a significant influence on the overall pipeline.
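The 45° separation can be written as a small helper; the function name and the convention that z is the up axis are illustrative assumptions.

```python
import numpy as np

def classify_surface(normal, angle_deg=45.0):
    """Separate planar segments into almost-horizontal (floor/ceiling candidates)
    and almost-vertical (wall candidates) by the tilt of their normal vector."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    tilt = np.degrees(np.arccos(abs(n[2])))  # angle between normal and the z axis
    return "almost-horizontal" if tilt < angle_deg else "almost-vertical"
```

Lowering `angle_deg` would move steeply sloped ceilings into the almost-vertical class, which is why the threshold may need adjustment for buildings with slanted walls and sloped ceilings.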
Robustness: The robustness of our algorithms is evaluated by testing on various datasets collected by four different mobile laser scanning devices. Additionally, our pipeline is tested on a multistory building (Fire Brigade dataset), a building with slanted walls (
Figure 6) and different ceiling heights (Fire Brigade building, level 1, and the TU Delft Architecture building), and a building with a large amount of clutter and glass surfaces (Cadastre building and Fire Brigade building, level 2). Among these, buildings with large glass walls pose the largest challenge to our pipeline (for wall detection and space partitioning), because the connection of glass surfaces near the ceiling is not guaranteed in the segmented data, and in some cases these glass surfaces are missing entirely from the data (the hall with the orange stairs in the TU Delft Architecture building,
Figure 14). However, the wall detection is capable of detecting glass walls even with loose connections to the ceiling.
For reconstructing the adjacency graph, all datasets are processed with dadj = 10 cm. We found that, in most cases, increasing this threshold to 20 cm or higher results in losing some of the walls, while decreasing it below 10 cm results in misclassifying some clutter surfaces as wall surfaces. The dadj parameter depends on the noise in the data; for datasets with a higher level of noise, the threshold could be increased to 20 cm.
Limitations: The permanent structure detection using the adjacency graph is susceptible to errors when there is clutter at the ceiling close to the walls (
Figure 15). Such clutter could be misclassified as a wall if it is large enough; during the occlusion test it may then be misclassified as a glass surface, and consequently the space would be partitioned incorrectly. The reason for this limitation is that the rules in the adjacency graph deliberately do not check whether a wall candidate is connected to the floor, because in most cases walls are occluded near the floor. Hence, a structure at the ceiling connected to the neighboring walls can cause this error.
The door detection algorithm fails in spaces with low ceilings, such as basements, or with tall doors reaching the ceiling. This is because the algorithm searches for points above the door center, and when the ceiling is low it can be mistaken for the top of a door, resulting in false positive doors (see
Figure 16). Detection of doors may be difficult if they are semi-open, because the condition that checks whether a door center lies in an empty neighborhood for an open door cannot hold if a door leaf occupies part of the doorway. Double doors can be spotted with our algorithm, but the exact door frame cannot be extracted.
Using the trajectory to separate the levels can be error-prone in buildings with many glass surfaces, because objects can be seen from other levels, especially in staircases. Nevertheless, using the trajectory is the reasonable option for the Fire Brigade building, where a huge space on the first level extends into the level above.
In the cadastre building dataset (
Figure 11, 3rd row), the surface growing segmentation results in flawed segments because of slanted glass surfaces and artefacts outside the façade (
Figure 17). Consequently, the supporting walls that are connected to the slanted glass surfaces could not be extracted. Opening detection was not performed for the cadastre dataset, since the time stamps of the point clouds were not available. All the point clouds, including the furniture, are used for the space partitioning of the cadastre dataset; otherwise, the interior space would be connected to the outside through the missing walls.