1. Introduction
In autonomous driving and robotics, the most commonly used sensors are LiDAR, cameras, and GPS/IMU. The LiDAR market has broad prospects: it is predicted that by 2025, China's LiDAR market will approach USD 2.175 billion (about CNY 15 billion) and the global market will approach USD 4.35 billion (about CNY 30 billion); by 2030, China's LiDAR market will approach USD 5.075 billion (about CNY 35 billion) and the global market will approach USD 9.425 billion (about CNY 65 billion), with the global market growing at an annual rate of 48.3% (
http://www.evinchina.com/newsshow-2094.html, accessed on 2 January 2023). The annual growth rate of vehicle camera modules from 2018 to 2023 is about 15.8%, and annual sales in 2023 are estimated to reach USD 5.8 billion (about CNY 39.44 billion), of which robots and cars will account for 75% (
https://www.yoojia.com/article/10096726019901689110.html, accessed on 2 January 2023). GPS/IMU is a combined system of an inertial measurement unit (IMU) and a global positioning system (GPS) that can provide centimeter-level absolute localization accuracy. A single IMU can only offer relatively accurate motion estimation over a short time, depending on its hardware performance; the localization error grows rapidly with time. LiDAR (light detection and ranging) is the most critical ranging sensor and can estimate the relative motion of a vehicle in real time through point cloud registration. LiDAR ranging computation is complex and often needs to be accelerated with parallel computation [
1], and, due to the high cost, an FPGA is used as part of the data processing module to reduce costs [
2]. LiDAR-based SLAM degenerates where geometric feature information is sparse. The camera projects the three-dimensional scene onto a two-dimensional image, from which visual odometry can be obtained. However, visual odometry suffers from scale ambiguity, and the images are affected by exposure and lighting. In order to combine the advantages of these sensors and achieve better robustness, many tasks require more than one kind of sensor, and multi-sensor fusion has become a trend [
3,
4,
5,
6,
7,
8]. The study in [
3] proposes a pipeline inspection and retrofitting method based on LiDAR and camera fusion for an AR system. In [
4], to address the loss of visual feature tracking on mobile robots, a feature matching method based on a camera and an IMU is designed to improve robustness; efficiently fusing the different attributes of the environment captured by the two sensors facilitates a reliable and consistent perception of the environment. In [
5], a method is proposed for estimating the distance between an autonomous vehicle and the other vehicles, objects, and signboards on its path using the accurate fusion of LiDAR and camera data. Many tasks thus tend to rely on multi-sensor fusion. However, since each sensor collects data in its own coordinate system, the coordinate systems of the different sensors must be unified. This requires calibrating the intrinsic parameters of each individual sensor and the extrinsic parameters between sensors using the different types of data they collect. For example, Refs. [
9,
10,
11,
12] are calibration works about multi-LiDAR extrinsic parameters. Refs. [
13,
14,
15] are calibration works about LiDAR-camera extrinsic parameters. Refs. [
16,
17,
18] are calibration works about LiDAR-IMU extrinsic parameters. There are few calibration works about LiDAR-GPS/IMU extrinsic parameters. LiDAR-GPS/IMU calibration consists of calibrating the extrinsic rotation between the LiDAR and the IMU and the extrinsic translation between the LiDAR and the GPS.
Since the intrinsic parameters of the IMU and LiDAR are usually provided by the manufacturer, we mainly address the calibration of the extrinsic parameters between the LiDAR and the GPS/IMU. This calibration still faces many challenges. For example, unlike the motion of a robot arm, the motion of a vehicle in autonomous driving is mainly planar motion with three degrees of freedom, which provides few constraints on the remaining three degrees of freedom, so their estimation error is relatively large. In addition, because the measurement principles of LiDAR and GPS/IMU differ greatly, detecting the same object with both sensors for extrinsic calibration is difficult; many calibration methods therefore require specific vehicle maneuvers and manually marked calibration scenes, resulting in high cost and a low degree of automation.
The motion-based method is the primary method for calibration between LiDAR and GPS/IMU. Geiger et al. [
19] proposed using a hand–eye calibration method to constrain the GPS/IMU odometry, obtaining the LiDAR odometry through point-to-plane registration. However, it does not compute an initial estimate of the extrinsic parameters, so its accuracy depends on high-precision registration in the LiDAR odometry. Hand–eye calibration utilizes the hand–eye model proposed in the field of robotics. It was originally used to solve for the rigid transformation between a manipulator and a camera rigidly mounted on it, and the problem can be extended to solving for the rigid transformation between any two rigidly mounted sensors [
20,
21]. Therefore, we can also use it to solve for the rigid transformation between the LiDAR and the GPS/IMU. Hand–eye calibration solves the equation AX = XB, where A and B are the odometry measurements of the two sensors and X is the extrinsic transformation between them. In [
22], the ground is used to solve for the Z-axis translation, roll, and pitch, converting the three-dimensional problem into a two-dimensional one; the remaining three degrees of freedom are then estimated by hand–eye calibration. This can be regarded as taking the three degrees of freedom optimized on the ground as the initial value and then performing hand–eye calibration. However, the initial value calculation is overly simple, manual operation introduces errors, and the resulting accuracy is relatively low. Baidu's open-source Apollo project divides the calibration process into two steps: first, the initial calibration parameters are calculated from the trajectory, and then they are refined with a feature-based method. Our calibration method shares the same coarse-to-fine idea, but the implementation details differ: we adopt additional strategies, such as loop closure detection and tight coupling, so the accuracy and robustness are relatively better. In [
23], in order to solve the problem of missing constraints caused by the planar motion of the vehicle, large-range and small-range trajectories are used to solve for the extrinsic rotation and translation parameters, respectively. However, recording large-range trajectories requires considerable labor and is demanding; therefore, the method is not universal. The advantages and disadvantages of the related work are shown in
Table 1.
This paper proposes a new two-step self-calibration method, which includes extrinsic parameter initialization and refinement. The initialization part decouples the extrinsic parameters into rotation and translation: it first calculates a reliable initial rotation through the rotation constraints, then calculates the initial translation once a reliable initial rotation has been obtained, and eliminates the accumulated drift of the LiDAR odometry by loop closure to complete the map construction. In the refinement part, the LiDAR odometry is obtained through scan-to-map registration and is tightly coupled with the IMU, and the constraints of the absolute poses in the map refine the extrinsic parameters. The main contributions of our work can be summarized as follows:
We propose a new LiDAR-GPS/IMU calibration system that can be calibrated in everyday open natural environments and has good universality.
As with a common localization problem, we divide the calibration problem into two steps and adopt a coarse-to-fine approach so that the extrinsic parameters have better robustness and accuracy, and the influence of accumulated drift on the extrinsic parameters is effectively eliminated.
The proposed calibration system is verified in both a simulation environment and a real environment.
The rest of the paper is organized as follows:
Section 2 gives an overview of the system and describes the core calibration algorithm of the LiDAR-GPS/IMU calibration system, introducing the two parts of the proposed algorithm in detail.
Section 3 presents the experimental results in the simulation and real environments.
Section 4 summarizes this paper and discusses future work.
Table 2 lists all the abbreviations used in this article.
2. Methods
This section introduces the architecture of the proposed LiDAR-GPS/IMU calibration system, as shown in
Figure 1. The system is mainly divided into two parts: parameter initialization and parameter refinement. In the parameter initialization part, the feature-based LiDAR odometry and the interpolated GPS/IMU relative poses are used to construct the hand–eye calibration problem to solve for the initial extrinsic parameters and to construct the map. In the parameter refinement part, the LiDAR and IMU are tightly coupled through the initial extrinsic parameters, and the extrinsic parameters are refined through the constraints of the absolute poses in the map.
The core of the extrinsic parameter initialization is a hand–eye calibration algorithm. When the vehicle is in motion,
Figure 2 shows the relationship between the relative and absolute poses during hand–eye calibration of the LiDAR and GPS/IMU, where $I_0$ is the first frame of the GPS/IMU, and $L_k$ and $I_k$ represent the $k$-th frame of the LiDAR and the GPS/IMU frame obtained by interpolation at this time, respectively. $\mathbf{T}^{I_0}_{I_k}$ is the absolute pose of $I_k$ relative to $I_0$, $\mathbf{T}^{I_k}_{I_{k+1}}$ is the relative pose from $I_k$ to $I_{k+1}$, and $\mathbf{T}^{L_k}_{L_{k+1}}$ is the relative pose from $L_k$ to $L_{k+1}$. It is easy to see from
Figure 2 that there are two ways to obtain the relative pose from $L_{k+1}$ to $I_k$, so we can obtain the following formula:

$$\mathbf{T}^{I_k}_{I_{k+1}}\,\mathbf{T}^{I}_{L} = \mathbf{T}^{I}_{L}\,\mathbf{T}^{L_k}_{L_{k+1}} \quad (1)$$

where $\mathbf{T}^{I}_{L}$ denotes the extrinsic transformation from the LiDAR frame to the GPS/IMU frame. Equation (1) constitutes the equation AX = XB for hand–eye calibration. The principle of hand–eye calibration is introduced in detail below.
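To make the AX = XB relation of Equation (1) concrete, the following sketch builds a synthetic extrinsic transform and a set of LiDAR relative motions, derives the corresponding GPS/IMU relative motions, and checks that both sides of Equation (1) agree. This is only an illustration under assumed synthetic data; the names (`make_pose`, `T_IL`) are not part of the paper's implementation.

```python
# Minimal numeric check of the hand-eye relation T_I * X = X * T_L (Equation (1)),
# using synthetic data. Assumes scipy is available; all names are illustrative.
import numpy as np
from scipy.spatial.transform import Rotation


def make_pose(rotvec, translation):
    """Build a 4x4 homogeneous transform from an axis-angle vector and a translation."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(rotvec).as_matrix()
    T[:3, 3] = translation
    return T


rng = np.random.default_rng(0)

# Ground-truth extrinsic transform X = T^I_L (LiDAR frame -> GPS/IMU frame).
T_IL = make_pose(rng.normal(size=3) * 0.2, rng.normal(size=3))

for _ in range(5):
    # A random LiDAR relative motion T^{L_k}_{L_{k+1}}.
    T_L = make_pose(rng.normal(size=3) * 0.1, rng.normal(size=3) * 0.5)
    # The GPS/IMU relative motion implied by the extrinsic: T_I = X * T_L * X^{-1}.
    T_I = T_IL @ T_L @ np.linalg.inv(T_IL)
    # Both sides of Equation (1) must match: T_I * X == X * T_L.
    assert np.allclose(T_I @ T_IL, T_IL @ T_L, atol=1e-9)

print("Equation (1) holds for all synthetic pose pairs.")
```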
2.1. Hand–Eye Calibration
Equation (1) is usually decoupled into two parts, rotation and translation, according to [24]. As the vehicle moves, the following equations hold for any $k$:

$$\mathbf{R}^{I_k}_{I_{k+1}}\,\mathbf{R}^{I}_{L} = \mathbf{R}^{I}_{L}\,\mathbf{R}^{L_k}_{L_{k+1}} \quad (2)$$

$$\left(\mathbf{R}^{I_k}_{I_{k+1}} - \mathbf{I}_3\right)\mathbf{t}^{I}_{L} = \mathbf{R}^{I}_{L}\,\mathbf{t}^{L_k}_{L_{k+1}} - \mathbf{t}^{I_k}_{I_{k+1}} \quad (3)$$

where $\mathbf{R}$ and $\mathbf{t}$ denote the rotation and translation components of the corresponding transformations, and $\mathbf{I}_3$ is the 3 × 3 identity matrix.
2.1.1. Extrinsic Rotation
Equation (2) can be expressed with quaternions as follows:

$$\mathbf{q}^{I_k}_{I_{k+1}} \otimes \mathbf{q}^{I}_{L} = \mathbf{q}^{I}_{L} \otimes \mathbf{q}^{L_k}_{L_{k+1}} \;\Rightarrow\; \left(\left[\mathbf{q}^{I_k}_{I_{k+1}}\right]_L - \left[\mathbf{q}^{L_k}_{L_{k+1}}\right]_R\right)\mathbf{q}^{I}_{L} = \mathbf{0} \quad (4)$$

where ⊗ is the quaternion multiplication operator, and $[\cdot]_L$ and $[\cdot]_R$ are the matrix representations of left and right quaternion multiplication, respectively [25]. After stacking measurements at different times, we obtain an over-determined equation as follows:

$$\begin{bmatrix} w_1\left(\left[\mathbf{q}^{I_1}_{I_2}\right]_L - \left[\mathbf{q}^{L_1}_{L_2}\right]_R\right) \\ \vdots \\ w_K\left(\left[\mathbf{q}^{I_K}_{I_{K+1}}\right]_L - \left[\mathbf{q}^{L_K}_{L_{K+1}}\right]_R\right) \end{bmatrix}\mathbf{q}^{I}_{L} = \mathbf{Q}_K\,\mathbf{q}^{I}_{L} = \mathbf{0} \quad (5)$$

where K is the number of rotation pairs in the over-determined equation and $w_k$ is a robust weight used to better handle outliers. The angle of the difference of the current rotation pair in the angle-axis representation is calculated from Equation (4) and taken as the parameter of a Huber loss, whose derivative gives the weight:

$$w_k = \rho'\!\left(\theta_k\right), \qquad \theta_k = 2\arccos\!\left(\mathrm{Re}\!\left(\left(\mathbf{q}^{L_k}_{L_{k+1}}\right)^{-1} \otimes \left(\mathbf{q}^{I}_{L}\right)^{-1} \otimes \mathbf{q}^{I_k}_{I_{k+1}} \otimes \mathbf{q}^{I}_{L}\right)\right) \quad (6)$$

where $\rho(\cdot)$ is the Huber loss, $\mathrm{Re}(\mathbf{q})$ is the real part of the quaternion $\mathbf{q}$, and $\mathbf{q}^{-1}$ means taking the inverse of the quaternion. We solve the over-determined Equation (5) with SVD, whose closed-form solution is the right unit singular vector corresponding to the smallest singular value. Meanwhile, according to [26], to ensure sufficient rotation constraints, the second smallest singular value must be large enough, i.e., larger than an artificially set threshold. As K increases rapidly, a priority queue is used to remove the constraints with the smallest rotations until the second smallest singular value exceeds the threshold, after which we can obtain a reliable initial extrinsic rotation.
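The SVD step described above might look like the following sketch, which stacks the quaternion blocks of Equation (5) and takes the right singular vector of the smallest singular value. Quaternions are assumed to be unit and in (w, x, y, z) order; the Huber re-weighting and the priority-queue pruning are reduced to an optional weight vector, so this is only an illustrative simplification, not the paper's code.

```python
# Sketch of the rotation initialization: stack ([q_I]_L - [q_L]_R) blocks and take the
# right singular vector of the smallest singular value (cf. Equation (5)).
# Quaternions are (w, x, y, z); names are illustrative.
import numpy as np


def quat_left(q):
    """Matrix [q]_L such that quat_left(q1) @ q2 == q1 (x) q2."""
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])


def quat_right(q):
    """Matrix [q]_R such that quat_right(q2) @ q1 == q1 (x) q2."""
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w,  z, -y],
                     [y, -z,  w,  x],
                     [z,  y, -x,  w]])


def estimate_extrinsic_rotation(q_imu_rel, q_lidar_rel, weights=None):
    """Solve the stacked homogeneous system Q * q_IL = 0 for the extrinsic rotation.

    q_imu_rel, q_lidar_rel: lists of unit quaternions for the K relative rotation pairs.
    Returns (q_IL, second smallest singular value); the latter indicates whether the
    rotation constraints are sufficiently excited, as discussed in the text.
    """
    K = len(q_imu_rel)
    weights = np.ones(K) if weights is None else np.asarray(weights)
    blocks = [w * (quat_left(qi) - quat_right(ql))
              for w, qi, ql in zip(weights, q_imu_rel, q_lidar_rel)]
    Q = np.vstack(blocks)                       # (4K, 4) over-determined system
    _, s, vt = np.linalg.svd(Q)
    q_il = vt[-1]                               # right singular vector of smallest sigma
    return q_il / np.linalg.norm(q_il), s[-2]
```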
2.1.2. Extrinsic Translation
When the extrinsic rotation $\mathbf{R}^{I}_{L}$ has been obtained, the translation can be solved with a least-squares approach according to Equation (3):

$$\begin{bmatrix}\mathbf{R}^{I_1}_{I_2} - \mathbf{I}_3 \\ \vdots \\ \mathbf{R}^{I_K}_{I_{K+1}} - \mathbf{I}_3\end{bmatrix}\mathbf{t}^{I}_{L} = \begin{bmatrix}\mathbf{R}^{I}_{L}\,\mathbf{t}^{L_1}_{L_2} - \mathbf{t}^{I_1}_{I_2}\\ \vdots \\ \mathbf{R}^{I}_{L}\,\mathbf{t}^{L_K}_{L_{K+1}} - \mathbf{t}^{I_K}_{I_{K+1}}\end{bmatrix} \quad (7)$$

However, the vehicle motion is usually planar motion with three degrees of freedom, so the resulting translation along the z axis, $t_z$, is not reliable. At the same time, since the acceleration measured by the IMU is coupled with gravity, and hence with rotation, calculating the initial translation from the initial rotation is also unreliable. We can make the planar-motion assumption as in [10] and rewrite Equation (7) as follows:

$$\left(\left(\mathbf{R}^{I_k}_{I_{k+1}}\right)_{2\times 2} - \mathbf{I}_2\right)\begin{bmatrix}t_x\\ t_y\end{bmatrix} = \mathbf{R}(\gamma)\,\hat{\mathbf{t}}^{L_k}_{L_{k+1}} - \hat{\mathbf{t}}^{I_k}_{I_{k+1}} \quad (8)$$

Equation (8) is the plane-motion constraint generated by the k-th relative pose pair, where $\mathbf{R}(\gamma)$ is the 2 × 2 yaw rotation, $t_x$ and $t_y$ are the translations along the x and y axes, respectively, $\left(\mathbf{R}^{I_k}_{I_{k+1}}\right)_{2\times 2}$ is the 2 × 2 block matrix in the upper left corner of $\mathbf{R}^{I_k}_{I_{k+1}}$, and $\hat{\mathbf{t}}^{L_k}_{L_{k+1}}$ and $\hat{\mathbf{t}}^{I_k}_{I_{k+1}}$ are the first two elements of the column vectors $\mathbf{t}^{L_k}_{L_{k+1}}$ and $\mathbf{t}^{I_k}_{I_{k+1}}$, respectively.
Equation (8) can be rewritten in the form of the matrix equation Ax = b:

$$\begin{bmatrix}\left(\mathbf{R}^{I_k}_{I_{k+1}}\right)_{2\times 2} - \mathbf{I}_2 & -\mathbf{J}\!\left(\hat{\mathbf{t}}^{L_k}_{L_{k+1}}\right)\end{bmatrix}\mathbf{x} = -\hat{\mathbf{t}}^{I_k}_{I_{k+1}}, \qquad \mathbf{J}\!\left(\hat{\mathbf{t}}\right) = \begin{bmatrix}\hat{t}_1 & -\hat{t}_2\\ \hat{t}_2 & \hat{t}_1\end{bmatrix} \quad (9)$$

where $\mathbf{x} = \left[t_x,\, t_y,\, \cos\gamma,\, \sin\gamma\right]^{\top}$ and $\hat{t}_j$ denotes the j-th element of the corresponding column vector. As with Equation (5), by stacking measurements at different times according to Equation (9), we obtain the final matrix equation Ax = b, which can be solved by the least-squares approach:

$$\mathbf{x}^{*} = \arg\min_{\mathbf{x}}\ \left\|\mathbf{A}\mathbf{x} - \mathbf{b}\right\|^{2} = \left(\mathbf{A}^{\top}\mathbf{A}\right)^{-1}\mathbf{A}^{\top}\mathbf{b} \quad (10)$$

The yaw rotation obtained according to Equation (10) is fused with the previously calculated extrinsic rotation $\mathbf{R}^{I}_{L}$ to obtain the new initial rotation. The value of the extrinsic translation along the z axis can be measured manually. If the calculated z-axis translation differs too much from the measured value, we can set the z-axis translation to the measured value and refine it through the absolute pose constraints in the refinement part; see Section 2.2.2.
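The planar solve can be sketched as follows: one 2 × 4 row block per relative-pose pair is stacked into Ax = b and solved for (t_x, t_y, cos γ, sin γ), following the matrix layout of the reconstructed Equations (8)-(10) above. The layout and function names are assumptions for illustration, not the paper's implementation.

```python
# Sketch of the planar translation/yaw solve (cf. Equations (8)-(10)): stack one 2x4
# row block per relative-pose pair and solve Ax = b in the least-squares sense.
import numpy as np


def solve_planar_translation(R_imu_rel, t_lidar_rel, t_imu_rel):
    """R_imu_rel: list of 3x3 GPS/IMU relative rotations.
    t_lidar_rel, t_imu_rel: lists of 3-vectors of LiDAR / GPS/IMU relative translations.
    Returns (t_x, t_y, yaw) of the extrinsic under the planar-motion assumption."""
    A_rows, b_rows = [], []
    for R_i, t_l, t_i in zip(R_imu_rel, t_lidar_rel, t_imu_rel):
        tl, ti = t_l[:2], t_i[:2]              # first two elements of each translation
        J = np.array([[tl[0], -tl[1]],
                      [tl[1],  tl[0]]])         # R(yaw) @ tl == J @ [cos, sin]
        A_rows.append(np.hstack([R_i[:2, :2] - np.eye(2), -J]))
        b_rows.append(-ti)
    A = np.vstack(A_rows)                       # (2K, 4)
    b = np.concatenate(b_rows)                  # (2K,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)   # x = [t_x, t_y, cos(yaw), sin(yaw)]
    yaw = np.arctan2(x[3], x[2])                # renormalize the (cos, sin) pair
    return x[0], x[1], yaw
```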
2.2. Methods
This section details the procedure for calculating the initial values of the extrinsic parameters from the GPS/IMU and LiDAR measurements. First, the GPS/IMU measurements are interpolated at the LiDAR timestamps; then, the LiDAR odometry is estimated as in [
27]. The extrinsic parameters are initialized by hand–eye calibration; see
Section 2.1. We integrate loop closure into the system using a factor graph and complete the map construction. Then, we tightly couple the LiDAR and IMU through the initially obtained extrinsic parameters, obtain the LiDAR odometry through scan-to-map feature-based registration, and refine the extrinsic parameters by constructing constraints on them from the obtained absolute poses.
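The interpolation step can be sketched as below, assuming GPS/IMU poses are given as timestamps, positions, and quaternions: rotations are interpolated with SLERP and positions linearly. The data layout and function name are illustrative assumptions.

```python
# Sketch of interpolating GPS/IMU poses at LiDAR timestamps: SLERP for rotation,
# linear interpolation for position. Names and data layout are illustrative.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp


def interpolate_gps_imu(imu_times, imu_positions, imu_quats_xyzw, lidar_times):
    """imu_times: (N,) sorted timestamps; imu_positions: (N, 3);
    imu_quats_xyzw: (N, 4) quaternions in scipy's (x, y, z, w) order;
    lidar_times: (M,) query timestamps inside [imu_times[0], imu_times[-1]].
    Returns (M, 3) interpolated positions and an (M,)-length Rotation object."""
    slerp = Slerp(imu_times, Rotation.from_quat(imu_quats_xyzw))
    rot_interp = slerp(lidar_times)
    pos_interp = np.column_stack([
        np.interp(lidar_times, imu_times, imu_positions[:, k]) for k in range(3)
    ])
    return pos_interp, rot_interp


# Example usage with synthetic data: a straight line with a slow yaw motion.
if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 11)
    pos = np.column_stack([t, np.zeros_like(t), np.zeros_like(t)])
    quats = Rotation.from_euler("z", 90 * t, degrees=True).as_quat()
    p, r = interpolate_gps_imu(t, pos, quats, np.array([0.05, 0.55]))
    print(p, r.as_euler("zyx", degrees=True))
```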
2.2.1. Extrinsic Parameters Initialization
As for the LiDAR odometry, there are two kinds of methods to calculate the relative transformation between two consecutive frames: 1. direct matching methods, such as ICP [28] and GICP [29]; 2. feature-based methods, such as LOAM [27], which process only representative points rather than the entire point cloud, improving accuracy while reducing the computational cost. We use the feature-based method in [30] to obtain the LiDAR odometry.
However, since there are no extrinsic parameters between the LiDAR and IMU, we cannot use the IMU pre-integration as the guess pose of the LiDAR odometry as in [
30]. Because offline calibration does not require real-time performance like SLAM, we can add more constraints to make the estimated relative motion more accurate. We first need to compute a predicted pose to prevent the optimization from converging to a local optimum. As time increases, the registration between consecutive frames tends to drift more along the Z axis than along the other axes. Therefore, we first use a constant velocity model as the predicted pose, then extract the ground points and use the ground-point centroid to optimize the roll angle, pitch angle, and z-translation. Since the LiDAR is basically installed horizontally, we first extract ground points using the geometric features in [31] and then filter outliers with RANSAC, so that the extracted ground points are highly accurate and the guess pose is relatively accurate.
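The ground-point step might be sketched as the numpy-only RANSAC below: fit a plane, keep inliers, and derive the roll, pitch, and z corrections that align the fitted plane with z = 0. The thresholds and the sign/alignment conventions are assumptions for illustration and are not taken from the paper.

```python
# Sketch of ground-plane extraction with RANSAC and the resulting roll/pitch/z
# correction used to stabilize the guess pose. Thresholds are illustrative.
import numpy as np


def ransac_ground_plane(points, n_iters=100, inlier_thresh=0.1, seed=0):
    """Fit a plane n.x + d = 0 to candidate ground points (N, 3) with RANSAC.
    Returns (unit normal pointing up, d, inlier mask)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = (np.array([0.0, 0.0, 1.0]), 0.0)
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-6:
            continue                                  # degenerate sample
        n = n / np.linalg.norm(n)
        d = -n.dot(p0)
        inliers = np.abs(points @ n + d) < inlier_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    n, d = best_model
    if n[2] < 0:                                      # make the normal point upward
        n, d = -n, -d
    return n, d, best_inliers


def ground_alignment_correction(n, d):
    """Approximate roll/pitch/z that align the fitted plane with the plane z = 0."""
    roll = np.arctan2(n[1], n[2])                     # rotation about the x axis
    pitch = -np.arctan2(n[0], np.hypot(n[1], n[2]))   # rotation about the y axis
    z_offset = -d / n[2]                              # plane height below the sensor origin
    return roll, pitch, z_offset
```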
When the latest scan of the raw LiDAR point cloud is received, we first de-skew the point cloud to the moment of its first LiDAR point using the guess pose and project the de-skewed point cloud into a range image according to the sensor resolution. We extract two types of feature points, edge features and planar features, based on curvature; we distribute the feature points uniformly and remove unstable feature points as in [27]. We denote by $\mathcal{F}_{k+1}$ all feature points of the (k+1)-th LiDAR point cloud, and by $\mathcal{E}_{k+1}$ and $\mathcal{H}_{k+1}$ its edge feature points and planar feature points, respectively. We can obtain the point-to-line and point-to-plane distances through the following:

$$d_{\mathcal{E}} = \frac{\left|\left(\tilde{\mathbf{p}}_{i} - \bar{\mathbf{p}}_{a}\right)\times\left(\tilde{\mathbf{p}}_{i} - \bar{\mathbf{p}}_{b}\right)\right|}{\left|\bar{\mathbf{p}}_{a} - \bar{\mathbf{p}}_{b}\right|} \quad (11)$$

$$\tilde{\mathbf{p}}_{i} = \mathbf{R}^{L_k}_{L_{k+1}}\,\mathbf{p}_{i} + \mathbf{t}^{L_k}_{L_{k+1}} \quad (12)$$

$$d_{\mathcal{H}} = \frac{\left|\left(\tilde{\mathbf{p}}_{j} - \bar{\mathbf{p}}_{u}\right)\cdot\left(\left(\bar{\mathbf{p}}_{u} - \bar{\mathbf{p}}_{v}\right)\times\left(\bar{\mathbf{p}}_{u} - \bar{\mathbf{p}}_{w}\right)\right)\right|}{\left|\left(\bar{\mathbf{p}}_{u} - \bar{\mathbf{p}}_{v}\right)\times\left(\bar{\mathbf{p}}_{u} - \bar{\mathbf{p}}_{w}\right)\right|} \quad (13)$$

where $\mathbf{p}_{i}\in\mathcal{E}_{k+1}$ and $\mathbf{p}_{j}\in\mathcal{H}_{k+1}$. $\tilde{\mathbf{p}}$ denotes a feature point de-skewed to the moment of the first point of the current point cloud. $\bar{\mathbf{p}}_{a}$ and $\bar{\mathbf{p}}_{b}$ are the two edge feature points of $\mathcal{E}_{k}$ corresponding to $\mathbf{p}_{i}$, and $\bar{\mathbf{p}}_{u}$, $\bar{\mathbf{p}}_{v}$, and $\bar{\mathbf{p}}_{w}$ are the three planar feature points of $\mathcal{H}_{k}$ corresponding to $\mathbf{p}_{j}$. $\bar{\mathbf{p}}$ denotes a feature point de-skewed to the moment of the last point of the previous point cloud. Since the moment of the last point of the previous point cloud equals the moment of the first point of the current point cloud, corresponding edge feature points at the same moment lie on the same line and corresponding planar points lie on the same plane. Thus, we obtain the constraints of Equations (11)–(13) and use the point-to-line and point-to-plane distances as the cost function. The relative pose can be calculated by minimizing this cost function with the Levenberg–Marquardt algorithm. The LiDAR odometry algorithm is shown in Algorithm 1.
Algorithm 1 LiDAR Odometry.
Input: $\bar{\mathcal{P}}_{k}$, $\mathcal{P}_{k+1}$, and the relative pose $\mathbf{T}^{L_{k-1}}_{L_{k}}$ from the last recursion
Output: $\mathbf{T}^{L_k}_{L_{k+1}}$, $\bar{\mathcal{P}}_{k+1}$
1: Through the constant velocity model, calculate the guess pose from $\mathbf{T}^{L_{k-1}}_{L_{k}}$
2: for each point in $\mathcal{P}_{k+1}$ do
3:   Extract ground points through geometric features and RANSAC
4: end for
5: Calculate the three degrees of freedom z-translation, roll, and pitch through ground-point optimization to obtain a new guess pose
6: De-skew the point cloud to the moment of the first LiDAR point by the guess pose; detect edge points $\mathcal{E}_{k+1}$ and planar points $\mathcal{H}_{k+1}$ in $\mathcal{P}_{k+1}$
7: for a number of iterations do
8:   for each edge point in $\mathcal{E}_{k+1}$ do
9:     Find an edge line as the correspondence, then compute the point-to-line distance based on (11)
10:  end for
11:  for each planar point in $\mathcal{H}_{k+1}$ do
12:    Find a planar patch as the correspondence, then compute the point-to-plane distance based on (13)
13:  end for
14:  Using the distances as a cost function for nonlinear optimization, update $\mathbf{T}^{L_k}_{L_{k+1}}$
15:  if the nonlinear optimization converges then
16:    break
17:  end if
18: end for
19: Reproject $\mathcal{P}_{k+1}$ to the moment of the last LiDAR point according to $\mathbf{T}^{L_k}_{L_{k+1}}$ to get $\bar{\mathcal{P}}_{k+1}$
20: return $\mathbf{T}^{L_k}_{L_{k+1}}$, $\bar{\mathcal{P}}_{k+1}$
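The point-to-line and point-to-plane distances of Equations (11) and (13) can be written directly with cross and dot products, as in the sketch below; the points are assumed to be already de-skewed into a common frame, and the function names simply mirror the reconstructed notation.

```python
# Sketch of the point-to-line (Equation (11)) and point-to-plane (Equation (13))
# distances used as residuals in the LiDAR odometry. Points are 3-vectors already
# de-skewed into a common frame; names mirror the notation in the text.
import numpy as np


def point_to_line_distance(p, pa, pb):
    """Distance from point p to the line through the two edge points pa and pb."""
    cross = np.cross(p - pa, p - pb)
    return np.linalg.norm(cross) / np.linalg.norm(pa - pb)


def point_to_plane_distance(p, pu, pv, pw):
    """Distance from point p to the plane spanned by the three planar points pu, pv, pw."""
    normal = np.cross(pu - pv, pu - pw)
    return np.abs(np.dot(p - pu, normal)) / np.linalg.norm(normal)


# Example: a point one unit above the x-y plane and one unit off the x axis.
p = np.array([0.5, 1.0, 1.0])
print(point_to_line_distance(p, np.array([0.0, 0, 0]), np.array([1.0, 0, 0])))  # ~1.414
print(point_to_plane_distance(p, np.array([0.0, 0, 0]),
                              np.array([1.0, 0, 0]), np.array([0.0, 1, 0])))    # 1.0
```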
After completing the registration between two consecutive frames, we optimize the estimated pose again through scan-to-map point cloud registration. A new keyframe is created and saved whenever the relative pose change exceeds a preset threshold. The local point cloud used to register the current point cloud is obtained in two ways: 1. by superimposing several keyframes adjacent in time; 2. by superimposing several keyframes adjacent in position. The search for adjacent positions uses a KD-tree: the position of each keyframe is added as a point to the KD-tree, and the keyframes with adjacent positions are obtained through the nearest-neighbor search of the KD-tree. In order to reduce the drift caused by accumulated errors, we combine the obtained relative pose with the z-translation, roll, and pitch angles obtained by ground-point optimization to make the relative pose more accurate and robust.
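The keyframe selection and KD-tree position search described above might look like the following sketch using scipy's cKDTree; the distance threshold, radius, and class name are illustrative parameters rather than values from the paper.

```python
# Sketch of keyframe selection and the KD-tree search for spatially adjacent keyframes
# used to build the local map. Thresholds and names are illustrative.
import numpy as np
from scipy.spatial import cKDTree


class KeyframeDatabase:
    def __init__(self, translation_thresh=1.0):
        self.translation_thresh = translation_thresh
        self.positions = []          # keyframe positions (used to build the KD-tree)
        self.clouds = []             # the feature cloud attached to each keyframe

    def maybe_add(self, position, cloud):
        """Add a keyframe when the motion since the last keyframe exceeds the threshold."""
        position = np.asarray(position, dtype=float)
        if (not self.positions or
                np.linalg.norm(position - self.positions[-1]) > self.translation_thresh):
            self.positions.append(position)
            self.clouds.append(cloud)

    def local_map(self, position, n_recent=10, radius=20.0):
        """Union of the most recent keyframes and the spatially adjacent ones."""
        recent = set(range(max(0, len(self.positions) - n_recent), len(self.positions)))
        tree = cKDTree(np.asarray(self.positions))
        nearby = set(tree.query_ball_point(np.asarray(position), r=radius))
        indices = sorted(recent | nearby)
        return np.vstack([self.clouds[i] for i in indices]) if indices else np.empty((0, 3))
```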
After obtaining the relative pose of LiDAR, the initial calibration parameters can be obtained by hand–eye calibration through the relative pose constraints of the LiDAR odometry and the relative pose constraints of the interpolated GPS/IMU data; see
Section 2.1. We denote $\left\{\mathbf{q}^{L_n}_{L_{n+1}}, \mathbf{t}^{L_n}_{L_{n+1}}\right\}$ and $\left\{\mathbf{q}^{I_n}_{I_{n+1}}, \mathbf{t}^{I_n}_{I_{n+1}}\right\}$ as the n-th relative pose constraint pair of the LiDAR data and the GPS/IMU data, respectively. We can construct the following cost function on the extrinsic parameters:

$$\min_{\mathbf{q}^{I}_{L},\,\mathbf{t}^{I}_{L}}\ \sum_{n=1}^{N}\left\|\mathrm{Im}\!\left(\left(\mathbf{q}^{L_n}_{L_{n+1}}\right)^{-1}\otimes\left(\mathbf{q}^{I}_{L}\right)^{-1}\otimes\mathbf{q}^{I_n}_{I_{n+1}}\otimes\mathbf{q}^{I}_{L}\right)\right\|^{2} + \left\|\left(\mathbf{R}^{I_n}_{I_{n+1}} - \mathbf{I}_3\right)\mathbf{t}^{I}_{L} - \mathbf{R}^{I}_{L}\,\mathbf{t}^{L_n}_{L_{n+1}} + \mathbf{t}^{I_n}_{I_{n+1}}\right\|^{2} \quad (14)$$

where $\mathrm{Im}(\cdot)$ represents the imaginary part of a quaternion, and there are a total of N relative pose constraint pairs. If we make the planar assumption, the residuals of the translation part can instead be constructed according to Equation (10). After multiple optimizations, the initial values of the calibration parameters are obtained, and the globally consistent map is constructed with loop closure.
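A way to minimize such a joint rotation/translation cost over the N relative-pose pairs is sketched below with scipy's least_squares; the extrinsic rotation is parameterized as a rotation vector for simplicity, which is an assumption that differs from the quaternion form in the text, and all names are illustrative.

```python
# Sketch of optimizing the initial extrinsic parameters over N relative-pose pairs
# (cf. Equation (14)). Rotation is parameterized as a rotation vector, an
# illustrative simplification; names are not the paper's code.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def residuals(params, R_imu_rel, t_imu_rel, R_lidar_rel, t_lidar_rel):
    """R_*_rel: lists of scipy Rotation objects; t_*_rel: lists of 3-vectors."""
    R_x = Rotation.from_rotvec(params[:3])      # extrinsic rotation R^I_L
    t_x = params[3:]                            # extrinsic translation t^I_L
    res = []
    for R_i, t_i, R_l, t_l in zip(R_imu_rel, t_imu_rel, R_lidar_rel, t_lidar_rel):
        # Rotation residual: (R^I_L)^-1 * R_I * R^I_L * R_L^-1 should be identity.
        r_err = (R_x.inv() * R_i * R_x * R_l.inv()).as_rotvec()
        # Translation residual from (R_I - I) t_x = R_x t_L - t_I.
        t_err = (R_i.as_matrix() - np.eye(3)) @ t_x - R_x.apply(t_l) + t_i
        res.extend([r_err, t_err])
    return np.concatenate(res)


def calibrate(R_imu_rel, t_imu_rel, R_lidar_rel, t_lidar_rel, x0=np.zeros(6)):
    sol = least_squares(residuals, x0, loss="huber",
                        args=(R_imu_rel, t_imu_rel, R_lidar_rel, t_lidar_rel))
    return Rotation.from_rotvec(sol.x[:3]), sol.x[3:]
```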
2.2.2. Extrinsic Parameters Refinement
In this section, the process of extrinsic parameter refinement is introduced in detail. First, the absolute poses are obtained through registration against the global map obtained in
Section 2.2.1. The LiDAR odometry and the IMU are tightly coupled through the initial extrinsic parameters, and the cost function on the calibration parameters is constructed from the absolute poses. After nonlinear optimization, the refined extrinsic parameters are obtained.
From
Section 2.2.1, we can obtain the initial extrinsic parameters, the global map, and the world frame; the global map is built in the world frame. Since we have the initial extrinsic parameters, we can tightly couple the IMU during the construction of the LiDAR odometry. First, the IMU pre-integration is used to de-skew the LiDAR point cloud and serves as the guess pose of the LiDAR odometry. The IMU bias is corrected by the optimized LiDAR odometry based on a factor graph.
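A minimal sketch of using integrated IMU measurements between two scans as the guess pose is given below. It uses simple Euler integration with a fixed gravity vector and ignores bias, noise, and covariance handling, so it only illustrates the idea and is not the paper's full pre-integration; names and conventions are assumptions.

```python
# Minimal Euler-integration sketch of predicting the next LiDAR pose from IMU samples,
# used only as a guess pose. Bias, noise, and covariance handling of a full
# pre-integration are omitted; names and conventions are illustrative.
import numpy as np
from scipy.spatial.transform import Rotation


GRAVITY = np.array([0.0, 0.0, -9.81])  # gravity expressed in the world frame


def predict_pose(R_w, p_w, v_w, gyro, accel, dts):
    """Integrate gyro/accel samples (body frame) from the previous LiDAR timestamp.

    R_w: scipy Rotation of the body in the world frame; p_w, v_w: position/velocity.
    gyro, accel: (N, 3) angular rate and specific force samples; dts: (N,) intervals.
    Returns the predicted (R_w, p_w, v_w) at the next LiDAR timestamp."""
    for w, a, dt in zip(gyro, accel, dts):
        a_world = R_w.apply(a) + GRAVITY          # remove gravity in the world frame
        p_w = p_w + v_w * dt + 0.5 * a_world * dt * dt
        v_w = v_w + a_world * dt
        R_w = R_w * Rotation.from_rotvec(w * dt)  # right-multiply the body-frame increment
        # (a midpoint or full pre-integration scheme would be more accurate here)
    return R_w, p_w, v_w
```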
First, we obtain the local map by filtering the global map and obtain the LiDAR odometry $\mathbf{T}^{W}_{L_k}$ by registration with the local map. As shown in
Figure 3, we can construct constraints on the extrinsic parameters through the following formula:

$$\mathbf{T}^{W}_{I_k}\,\mathbf{T}^{I}_{L} = \mathbf{T}^{W}_{L_k} \quad (15)$$

where $\mathbf{T}^{W}_{I_k}$ is the GPS/IMU pose corresponding to the current LiDAR timestamp and W denotes the world frame.
As per
Section 2, we can also decouple Equation (15) into a rotation part and a translation part to obtain the following two constraints [24]:

$$\mathbf{R}^{W}_{I_k}\,\mathbf{R}^{I}_{L} = \mathbf{R}^{W}_{L_k}, \qquad \mathbf{R}^{W}_{I_k}\,\mathbf{t}^{I}_{L} + \mathbf{t}^{W}_{I_k} = \mathbf{t}^{W}_{L_k} \quad (16)$$

The construction of the over-determined equations for rotation and translation is the same as in
Section 2.2.1, but the planar-motion assumption is not required here; the extrinsic translation parameters can be optimized directly from the absolute poses.
We can refine the extrinsic parameters by constructing the cost function of the absolute poses and the extrinsic parameters according to Equation (16):

$$\min_{\mathbf{q}^{I}_{L},\,\mathbf{t}^{I}_{L}}\ \sum_{k}\left\|\mathrm{Im}\!\left(\left(\mathbf{q}^{W}_{L_k}\right)^{-1}\otimes\mathbf{q}^{W}_{I_k}\otimes\mathbf{q}^{I}_{L}\right)\right\|^{2} + \left\|\mathbf{R}^{W}_{I_k}\,\mathbf{t}^{I}_{L} + \mathbf{t}^{W}_{I_k} - \mathbf{t}^{W}_{L_k}\right\|^{2} \quad (17)$$

When this cost function is optimized with the Levenberg–Marquardt algorithm, the final precise extrinsic parameters are obtained once the iteration converges.
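As a complement to the L-M refinement, the absolute-pose constraint of Equation (15) also admits a simple per-frame closed-form check: each pair of absolute poses yields one extrinsic estimate, and averaging these estimates (quaternion averaging for rotation, arithmetic mean for translation) gives a quick consistency check of the refined result. The sketch below is a hedged illustration of that check, not the optimization actually used in the paper; all names are illustrative.

```python
# Per-frame closed-form extraction of the extrinsic from absolute pose pairs
# (Equation (15): T^W_I * T^I_L = T^W_L), averaged over all frames. Intended as a
# consistency check of the L-M refinement, not as the paper's optimizer.
import numpy as np
from scipy.spatial.transform import Rotation


def extrinsic_from_absolute_poses(R_w_imu, t_w_imu, R_w_lidar, t_w_lidar):
    """All inputs are lists: scipy Rotations and 3-vector translations per LiDAR frame.
    Returns the averaged extrinsic (Rotation, translation) from LiDAR to GPS/IMU."""
    quats, translations = [], []
    for R_i, t_i, R_l, t_l in zip(R_w_imu, t_w_imu, R_w_lidar, t_w_lidar):
        R_x = R_i.inv() * R_l                     # R^I_L = (R^W_I)^-1 R^W_L
        t_x = R_i.inv().apply(t_l - t_i)          # t^I_L = (R^W_I)^-1 (t^W_L - t^W_I)
        quats.append(R_x.as_quat())
        translations.append(t_x)
    # Quaternion averaging: dominant eigenvector of the sum of outer products,
    # after flipping signs onto a common hemisphere.
    quats = np.array([q if np.dot(q, quats[0]) >= 0 else -q for q in quats])
    M = sum(np.outer(q, q) for q in quats)
    eigvals, eigvecs = np.linalg.eigh(M)
    q_mean = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
    return Rotation.from_quat(q_mean), np.mean(translations, axis=0)
```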