1. Introduction
Autonomous robot exploration technology requires robots to collect data within a given region and construct corresponding environmental maps. As a critical technology that reveals robotic autonomy, relevant research in robotics has garnered significant attention, driving widespread applications in geological exploration, 3D reconstruction, post-disaster rescue, and other fields.
Numerous autonomous exploration methods have been proposed in recent years and are divided into sampling-based and frontier-based categories. Sampling-based methods originated from the NBV (Next Best View) algorithm in the field of 3D reconstruction. RH-NBV (recurrent hybrid neural-based visual) first introduced the NBV algorithm into the autonomous exploration field [
1], which consisted of the robot randomly sampling viewpoints in explored free space, constructing a rapidly exploring random tree (RRT), and evaluating the utility of each branch on the RRT. Finally, the robot focuses on the branch with the highest information reward and selects the first node of this branch as the local target. After that, numerous researchers have extended and improved the RH-NBV to meet the requirements of various application scenarios [
2,
3,
4,
5,
6]. However, sampling-based autonomous exploration methods have lower exploration efficiency and lead to the robot being trapped. Ref. [
7] first introduced the frontier-based exploration method, which groups free voxels adjacent to unknown voxels as frontier clusters and then drives the robot towards these frontier clusters to move to explore unknown areas. Since then, many frontier-based exploration methods have been proposed to meet various application requirements [
8,
9,
10]. Ref. [
11] proposed to select a viewpoint with minimal speed changes as the next goal within the Field Of View (FOV) of sensors, aiming to maintain the high movement speed of unmanned aerial vehicles (UAVs) and achieve efficient exploration. Fast UAV expLoration (FUEL) proposes the incremental frontier information structure (FIS) to address the problem of high computation of frontier extraction and low decision frequency of the path planner [
12]. Based on FIS, UAVs can quickly and incrementally extract environmental information that the planner needs and promptly plan the exploration path.
However, most autonomous exploration methods tend to greedily guide the robot to exploration scenes with immediate rewards and neglect some targets with long-term rewards, resulting in lower efficiency in global exploration [
12]. Although some methods plan paths from the global exploration standpoint, the robot inevitably overlooks some scenarios during exploration because of the limited perception range of sensors and the unpredictability of unknown environments [
12,
13]. To thoroughly explore a given region, the robot must revisit areas containing those missed scenarios, resulting in a waste of resources. Furthermore, when exploring large-scale scenes, the more information the robot collects with exploration, the more the path planner computes, which poses a significant challenge to onboard computers.
In order to solve the above problems, the work [
14] proposed that supplying robots with prior information about a given region can aid them in making decisions that align with long-term benefits. Ref. [
15] proposed a probabilistic information gain map as the prior knowledge to guide exploration. Ref. [
16] introduced a general information theory framework to control multiple robots to search and rescue, wherein the prior knowledge of people is modeled to capture target positions and dynamics. Ref. [
17] employs hand-drawn sketches as prior information, enabling the robot to explore even when the metric description of the environment is incomplete.
As the concision of a topological map, many researchers use them as prior information to guide robots in autonomous exploration. Ref. [
18] proposed a novel autonomous exploration method based on a prior topometric graph, which verifies that prior information could aid the robot in swiftly completing the exploration of unknown environments. Ref. [
19] proposed a path planning method based on topology information for 3D reconstruction, in which the multi-view stereo path planning is decomposed into a collection of overlapped viewing optimization problems that can be processed in parallel. In [
14], the prior topometric map is employed to improve exploration efficiency and guide the robot to trigger a loop close, improving the localization accuracy of the Simultaneous Localization And Mapping (SLAM) system. Finally, the environmental information collected will be used to refine prior information.
Furthermore, some researchers focus on the generation of topological maps. Ref. [
20] proposed a framework called “topomap” to provide robots with customized maps to simplify robot navigation tasks, transforming the sparse feature-based map from visual SLAM into a three-dimensional topological map. Ref. [
21] proposed an efficient and flexible algorithm that generates a trajectory-independent 3D sparse topological skeleton graph captured from the spatial structure of free space.
Inspired by the abovementioned research, we select the topological map as the prior information to guide in robot exploration and employ the frontier-based exploration method suitable for exploring large-scale scenes to plan exploration paths. As a form of map representation, the topological map briefly provides relative position and connectivity between critical places in complex scenes, which could guide the robot in planning paths that follow long-term benefits [
20]. In practical cases, many methods can easily acquire the skeleton structure of the environment as the topological map [
20,
21,
22,
23].
To fully take advantage of the guiding function of prior topological maps, we propose an autonomous exploration method based on topological maps. The proposed method employs a hierarchical path planning framework, integrates frontier information with prior topological maps, and plans the exploration path with long-term benefits. It first plans a global exploration path by solving the constructed Priority Constrained Asymmetric Traveling Salesman Problem (PCATSP). The global exploration path would follow optimal or customized global exploration strategies to guide the robot to cover frontiers but prioritizes exploring scenes outside the prior information, thus preventing the robot from revisiting previously visited areas. Then, the exploration path is refined from the global exploration path by quickly evaluating the rewards and costs of the candidate viewpoint for each frontier.
Because of the one-pass exploration process, our method will maintain the frontier at a small number, preventing excessive computational burden on the solver during exploration. The above properties make our method more suitable for autonomous exploration in large-scale scenes, and the contributions of this paper are as follows:
(1) An autonomous exploration method based on prior topological maps. The robot follows an optimal or customized strategy to explore a given region autonomously but prioritizes exploring scenes outside prior information, preventing the robot from revisiting the explored areas.
(2) A path planning method integrates information between frontiers and prior topological maps, which makes the topological map deeply involved in the path planning of robot exploration.
(3) A local path planning method, which quickly evaluates the rewards and costs of each candidate viewpoint to optimize the global exploration path, enhances the stability of exploration efficiency.
2. Design Objectives
Give the robot a region to autonomously explore, and provide it with a topological map of the region to be explored. The topological map should reflect the fundamental layout of the region but may not represent all of its spaces. The objectives we address are as follows:
Objective 1: The robot completes a comprehensive exploration of the given region. When there are no frontier clusters extractable within the given region, it indicates the completion of an information gathering task.
Objective 2: The robot utilizes real-time collected scene information and prior topological maps to plan exploration paths. When the robot encounters scenes that are not included in the priori information, the exploration path will guide the robot to prioritize exploring scenes beyond a priori information.
Objective 3: The exploration path enables the robot to complete an exploration of the visited area in a one-pass manner, preventing the robot from repeatedly visiting the areas that have been explored.
3. Methods
We define the topological map as follows.
consist of the global targets set
and the undirected edges set
.
s denotes a global target corresponding to a corner or intersection in the environment.
is an undirected edge, connecting
and
, representing a straight-line scene, such as a road or corridor. With the support of a prior topological map, we can obtain a global exploration strategy
for the given region by customizing or solving the Chinese Postman Problem (CPP) [
24], which is a priority queue.
denotes a directed line segment from
to
, corresponding to the exploration guidance. Define
f as the frontier cluster, and
as a set of remaining frontier clusters in a scene. We adopt the incrementally frontier information structure (FIS) proposed by FUEL to update frontier clusters efficiently [
12]. Viewpoint sequential queue
of frontier cluster
is extracted by random sampling, where
is a viewpoint with the largest reward of
and will replace
in constructing PCATSP.
Figure 1 shows an overview of the proposed method, which operates upon a voxel grid map. We employ a hierarchical architecture to plan the exploration path, which consists of global path planning (
Section 3.1) and local path planning (
Section 3.4). The global path planning module takes a prior topological map, global exploration strategy, and frontier clusters as input to plan the global exploration path based on the Priority Constrained Asymmetric Traveling Salesman Problem (PCATSP). Nodes with priority in PCATSP will be extracted (
Section 3.2), and the movement cost of some frontier clusters will also be updated (
Section 3.3). Then, the global exploration path is given to the local path planning module, which refines the input path based on rewards and costs of each viewpoint candidate to improve the stability of exploration efficiency. Finally, the exploration path will output to the trajectory generation module. The exploration task will be completed when no frontier clusters can be extracted from environment.
3.1. PCATSP-Based Global Path Planning
PCATSP is a variation of the classic Traveling Salesman Problem (TSP), which aims to find a minimum-cost Hamiltonian circuit, with the constraint that some nodes must be visited before others. If splitting the start and end points of PCATSP into two nodes, PCATSP is equivalent to finding a path between the start and end points that satisfy priority constraints, which is also considered a Sequential Ordering Problem (SOP) [
25]. To address Objective 3, our basic idea of global path planning is to solve PCATSP with frontiers and priority-constrained global targets. It is equivalent to inserting frontiers into a sequential queue of global targets, utilizing the global targets to influence the covered sequence of some frontier clusters.
However, the construction of PCATSP faces the following challenges: First, constrained by the perception range of sensor and obstacle obstruction, the robot cannot accurately calculate the movement distance between global targets in unknown space and viewpoints inside the free space. Second, for TSP, the farther the metric distance between frontier cluster and global target, the less influence the global target can exert. So, we need to enable global targets to influence specific frontier clusters.
To address the above challenges, we define the following: if a global target
is inside the free space, and the other global target
connected to
is in unknown space, then the shortest path
between any node
in free space and global target
in unknown space is given by
where
a is an intersection point of frontier and undirected edge
that connects
and
, as shown in
Figure 2a.
is a portion of
in unknown space, and
is a search path from
to
a.
Based on the above definition and by incorporating the solving property of TSP, if all elements in a certain row or column of cost matrix
D of TSP are subtracted by the same value, a new cost matrix
for the TSP will be obtained. However,
D and
correspond to the same TSP solution [
26]. Thus, we can subtract
from all
, and leave the TSP solution unchanged. It is equivalent to setting an agent point for the global target on the frontier. For constructing PCATSP, the intersection point
could be extracted on a frontier cluster as an agent for global target
and the priority can be set to
based on global exploration strategy
O.
Hence, based on the above theory, we define agent
as the intersection point of a frontier cluster
and an undirected edge
, which is associated with a global target
in unknown space and possesses access priority. Then, the path between any point inside free space and an agent on frontier could be found by a path-searching algorithm, and then path length could also be accurately calculated. The method of agent extraction and priority assignment will be elaborated in
Section 3.2.
Finally, we can naturally combine the prior topological map with real-time updated scene information in path planning based on PCATSP. We can then solve the PCATSP with the extracted agent set and frontier cluster set F, where agents with priority will influence the visited order of nearby frontier clusters.
In this section, we suppose that the movement cost
between any two nodes
and
is calculated as follows:
where
is the shortest search path between nodes
and
;
denotes the length of search path; and
is the maximum velocity of the robot.
PCATSP solver provides a path
composed of input nodes. By removing
A from
, we obtain a global exploration path
that satisfies global exploration strategy
O, as shown in
Figure 2b [
27]. However,
cannot guarantee that the robot with the priority will explore areas outside prior information, i.e., Objective 2. To address this objective, frontier clusters that guide the robot towards global targets will be recognized, and movement cost between these frontier clusters and the robot will be increased. Details are discussed in
Section 3.3.
Figure 3 is a schematic diagram of the robot exploration process. The blue arrows represent the global exploration strategy, and green arrows are basic exert programs for global exploration paths. Based on our method, the robot explores the given region according to the global exploration strategy but prioritizes exploring scenarios outside prior information. Finally, following the global exploration strategy, the robot actively loops close and completes the exploration.
3.2. Agent Extraction and Priority Assignment
When the position of a global target is explored by the robot, other global targets connected to will be searched on a topological map. Then, the directed line segment is constructed based on , indicating from to . During implementation, we set up some satellite points for each global target to prevent the robot from missing them. The satellite points are evenly distributed at a set distance around the global target point.
The Oriented Bounding Box (OBB) of each new frontier cluster is extracted by principal component analysis (PCA), which is used to extract agents. As shown in
Figure 4a, if
is crossed by
, OBB of
must be crossed, and the following condition is met:
where
are vertices of OBB. For the new frontier clusters that meet the condition, proceed with the following secondary evaluation to identify which global target the forthcoming agent will belong to:
where
R is the position of the robot, and
is the average position of
, as shown in
Figure 4b.
If the above conditions are met with the newly extracted frontier cluster , the cell closest to in will be defined as an agent for global target . An agent of the global target is independent of the frontier cluster but is associated with it; if a frontier cluster is deleted or updated, the corresponding agent will be deleted.
The movement cost between an agent
and other nodes of PCATSP are calculated as follows:
where
C is a large value, ensuring that the robot prioritizes exploring scenes outside prior information.
The priority
of agent
is assigned as follows:
which is determined by the sequential position of its
in
O.
is a directed line segment in
O, where
u represents its ordinal position in
O. The higher the sequential position of
, the lower the priority of the agent extracted from
. If
is not found in
O, the priority of the agent equals
, and these agents with the lowest priority will not be actively explored, as shown in
Figure 3. Additionally, during exploration, not only will one agent be generated by a directed line segment, but all agents will point towards the same global target. In such cases, we define the agent that is further away from the global target to have a higher priority, ensuring the robot does not miss the scene during exploration.
3.3. Update of Movement Cost of Frontier Clusters
As mentioned earlier, to ensure that the robot prioritizes exploring scenes outside prior information, the system needs to recognize frontier clusters that would guide the robot towards a global target and set a higher cost between them and the robot. To achieve this, we classify frontier clusters into three categories:
The first class includes frontier clusters with agents and other frontier clusters that are adjacent to these frontier clusters. They will guide the robot towards global targets.
The second class includes frontier clusters adjacent to the first class frontier clusters. These frontier clusters may guide the robot towards global targets. We utilize density-based spatial clustering of applications with noise (DBSCAN) algorithm to recognize these frontier clusters and rely on the following methods to identify whether they could guide the robot to global targets [
28]:
as shown in
Figure 5a,
is the angle between vector
and
;
is the angle between vector
and
; and
r is the Euclidean distance between two points.
However, as the robot explores, the relative position between frontier clusters and robot changes, and frontier clusters cannot always satisfy Equation (
7), as shown in
Figure 5b,c. Thus, to keep the consistency of determination for these frontier clusters during exploration, we employ the Dynamic Time Warping (DTW) algorithm to evaluate similarities between path
and all paths of
, as shown in
Figure 5d [
29]. If the similarity ranking of
satisfies the following condition,
is still believed to guide the robot towards global target
:
The third class consists of the remaining frontier clusters. They will guide the robot to explore scenes outside prior information.
For the frontier clusters
that guide the robot towards global targets, the movement cost from the robot to them is set as follows:
The movement cost from robot to other frontier clusters
is computed as follows:
which considers the path length, yaw change, and motion consistency, where
and
are coordinates, and the yaw angle of viewpoint
,
is the maximum angular change rate;
is the yaw angle of the robot; and
is the current velocity.
As
and
do not impact the solution results of open-loop path planning, we set them to 0. The movement cost between all frontier clusters are calculated as follows:
3.4. Local Path Planning
The global path planning module aims to assist the robot in making decisions at a global standpoint for efficient exploration. The local path planning module aims to find the best viewpoints to make the robot to follow. Many previous works refine a path by evaluating the cost and reward for efficient exploration, but they consume significant computational resources in information evaluation [
1,
30]. Thus, we define the potential reward of a candidate viewpoint as a volume of unknown space within its a Field of View (FoV) and propose a simple and fast reward evaluation method based on incremental frontier information structure (FIS). Finally, the local path is refined by synthesizing the reward and movement cost of each viewpoint candidate.
Cells of a frontier cluster are stored in FIS [
12]. We use them to evaluate the volume of unknown space in FOV. As shown in
Figure 6, each truncated pyramid is constructed based on cells that the candidate viewpoint could cover, and its volume is calculated as follows:
where
is width of cell,
is distance between cell and candidate viewpoint.
is effective distance to calculate reward and computed by:
is maximum range of FOV, and
is used to control the depth of truncated pyramid to balance movement cost and expected rewards.
The expected reward
of a candidate viewpoint
is evaluated by accumulating
V:
where
is taken from current frontier cluster
, and
is taken from next adjacent frontier cluster
to be visited.
m and
n, respectively, represent the number of cells that
could cover.
is a weight coefficient.
We formulate local path planning as a graph search problem and refine an optimal path from the global exploration path by balancing expected reward and movement cost, where viewpoints of each frontier cluster serve as candidate points. Suppose that the optimal exploration path
provided by the Dijkstra algorithm will minimize the cost/reward ratio:
where
is the size of the frontier clusters to be optimized, and
is the volume of FOV.
Finally, the exploration path is output to the trajectory generation module.