1. Introduction
Autonomous driving vehicles with various levels of automation, from semi-autonomous driving technologies such as adaptive cruise control (ACC) and lane-keeping assist systems (LKAS) to fully-autonomous driving vehicles, are now commercially available. While conventional digital maps for car navigation are made for human drivers and have road-level resolution, map providers are now focusing on generating digital maps with lane-level resolution. Lane-level digital maps reduce the sensing and computing burden on each autonomous driving vehicle and, eventually, its cost.
Using conventional digital maps with road-level resolution, an individual autonomous driving vehicle carries an excessive burden to fully understand its surroundings before making a decision. For example, if the vehicle has to turn right at the next junction, it needs to figure out the total number of lanes on the road and the lane it currently occupies so that it can move to the rightmost lane safely before reaching the junction. However, if a lane-level map were provided to the vehicle, the path planning process would become considerably simpler and safer, so that the individual vehicle would be less obligated to be equipped with very expensive sensors and processors.
Unlike the road-level digital map generation process, a large part of which is automated, the process of generating a lane-level digital map usually requires manual work at many stages. Lane detection on an urban road is more difficult because of multiple lanes, diverse road signs on the ground and complex lane variations at the junctions.
In the case of road-level map generation, it is important to reflect the direction and the curvature of the road accurately, as well as the existence of ramps and overpasses. Lane-level map generation requires more details; therefore, the individual lanes must be correctly acquired via accurate road line detection followed by appropriate parameterization. If the road is simple and has only a couple of lanes with no road signs on the ground other than the road lines, the generation of a road-level map and a lane-level map will not be very different. However, a complex boulevard in an urban area has multiple lanes and various road signs mixed with road lines. In this case, fast and accurate road line detection becomes an important issue in lane-level map generation.
Many of the conventional road line detection methods using LiDAR data about a decade ago focused on detecting only two lines on each side of the vehicle, in order to reduce the unintended lane departures of an autonomous driving vehicle [1,2]. Kammel and Pitzer [3] proposed a LiDAR-based lane marker detection method, but the purpose of the algorithm was the robust estimation and correction of an offset between the provided map and the real lane on which the autonomous vehicle was traveling. As the road was rather simple, lane markers were assumed to be either the painted road lines or the curbs with some height. Jordan et al. [4] compared LiDAR-based and camera-based lane detection methods and summarized that the LiDAR-based methods detect an increase in the reflectivity of the lane markings when compared to the road surface reflectivity.
More recently, Hata and Wolf [5] detected road markings and curbs by using a 32-channel LiDAR, but the method was tested on a simple two-lane ring track, while road markings other than lines were not distinguished because the purpose of the algorithm was the localization of the vehicle. Yan et al. [6] proposed a mobile mapping system using LiDAR. All road markings including the road lines were precisely extracted to generate a high-definition point-cloud map; however, the markings were not categorized, and the lines were not parameterized for use as a lane-level digital map.
In this paper, we present a practical real-time working prototype for road lane detection in an urban area using LiDAR data. The proposed method is a fully-automated process to detect road lanes on complex urban roads with multiple lanes and diverse road signs on the ground such as arrows with different directions and stop signs. The following are the main contributions of this paper:
Multiple lanes are detected simultaneously, including the lane in which the vehicle is currently traveling.
Road lines are distinguished from other road markings on complex urban roads.
Road lines are represented as uniformly-distributed 3D points to be easily used for further applications, such as road curvature calculation and lane-level map generation.
Lane-level digital map generation by accumulating the detected road lanes is presented as an application of the proposed method.
The rest of this paper is organized as follows: Related works are introduced in Section 2. The proposed method is detailed in Section 3. The experimental results are shown in Section 4, and Section 5 discusses the limitations and the future work of the proposed method and concludes the paper.
2. Related Work
Several methods for road map generation have been developed thus far. One of the traditional methods is to use satellite images or aerial images [7,8,9]. One satellite/aerial image can cover a large area, and the road map can be generated automatically by using adequate image processing. However, up-to-date satellite images are not always available to the general public, and aerial images are expensive to update. Even though the camera resolution is rapidly increasing, it is still too limited to facilitate the generation of a lane-level map from satellite/aerial images. Moreover, as the images are two-dimensional, it is not possible to build a 3D map out of satellite/aerial images.
To build a road map without access to satellite data or an airplane, researchers in many studies have installed a GPS sensor on a probe vehicle and recorded its positioning information as the vehicle moves on the road [10,11,12,13,14]. This is an intuitive method of building a road map, and the accuracy of the resulting road map depends on the accuracy of the GPS sensor. However, as the GPS sensor records only the position of the vehicle, the vehicle has to pass through the lanes one by one to generate a lane-level map. The generation of a lane-level map for a route therefore takes a considerable amount of time and effort if the route contains many lanes. Moreover, it is difficult for a human driver to keep the vehicle on the exact center of the lane while the GPS sensor records its positions.
To resolve these shortcomings, map providers nowadays install various perception sensors on the map-making vehicles. The most popular sensor is a camera [15,16]. The use of cameras can provide the information of multiple lanes from a single frontal or an all-around view of the vehicle. The vehicle is no longer forced to be driven on the center of the lane if the GPS sensor and the cameras are calibrated. A 2D or a 3D LiDAR sensor provides the 3D information of its surroundings in real time. Although 3D information can be obtained using multiple cameras, the use of a 3D LiDAR sensor is increasing with a significant decrease in the cost of the sensor [16,17,18,19,20].
After processing the acquired data, various methods can be used to represent the generated lane-level road maps. Several researchers have studied how to represent road maps efficiently while maintaining their high usability in practical applications such as autonomous driving. Several studies have been conducted on the representation of a lane-level road map using polygons [21,22,23,24], clothoids [14,25], splines [26,27,28,29] and piecewise polynomials [30].
Instead of adopting a particular representation, we represent the detected road lines by densely- and uniformly-distributed 3D points along each line. As lane-level maps used for autonomous driving do not have a global standard to represent the road lines, each of the representations mentioned above has pros and cons, and converting from one representation to another is often unavoidable when calculating road characteristics such as the curvature or the tangent of the road. A dense, uniform point representation is computationally cheap, adds little storage burden and can easily be converted to any other representation for further applications.
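As an illustration of why a dense point representation is convenient, the following minimal sketch estimates the unit tangent and the curvature of a road line directly from its sampled 3D points by finite differences. It is not part of the proposed system; the function name `tangent_and_curvature` and the use of NumPy are assumptions made for this example only.

```python
import numpy as np

def tangent_and_curvature(points):
    """Estimate unit tangents and curvature along a sampled 3D polyline.

    points: (N, 3) array of road-line samples, ordered along the line.
    Returns (tangent, curvature) with shapes (N, 3) and (N,).
    """
    points = np.asarray(points, dtype=float)
    # Cumulative chord length serves as an approximate arc-length parameter.
    segment = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate(([0.0], np.cumsum(segment)))
    d1 = np.gradient(points, s, axis=0)  # first derivative dp/ds
    d2 = np.gradient(d1, s, axis=0)      # second derivative d2p/ds2
    speed = np.linalg.norm(d1, axis=1)
    tangent = d1 / speed[:, None]
    # Curvature of a space curve: |p' x p''| / |p'|^3.
    curvature = np.linalg.norm(np.cross(d1, d2), axis=1) / speed ** 3
    return tangent, curvature
```

The estimates at the first and last few samples rely on one-sided differences and are therefore less reliable than those in the interior of the line.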
Finally, it is worth mentioning a few more studies on road detection using LiDAR in recent years. Clode et al. [31] presented a method for automatic road detection from airborne laser scanner (ALS) data. The test data were collected from Fairfield in Sydney, Australia, with an approximate point density of one point per
in the area of 2 km × 2 km. The method classified points as road or non-road using the height and intensity information of the ALS data. Later, Clode et al. [32] proposed a more mature method for road/non-road classification using the same dataset, including road vectorization with appropriate parameters such as centerline, orientation and width.
Zhang [33] presented a road and road-edge (curb) detection method using a 2D LiDAR sensor, which was demonstrated in a prototype vehicle in the DARPA Urban Challenge 2007. A forward down-looking 2D LiDAR sensor was mounted on the front-top of the vehicle, and the elevation information of the range data was used to detect the lowest smooth surface as the drivable region and curbs as road edges. Han et al. [34] presented a road boundary and obstacle detection method for the 2010 Autonomous Vehicle Competition in Korea using a downward-looking 2D LiDAR. The method detected the drivable region and obstacles on the road boundary such as curbs, bushes, traffic cones and vehicles. Both methods [33,34] were successfully demonstrated through competition participation; however, the road condition in the test fields was limited to two lanes, and the road lanes were not detected. Yuan et al. [35] presented a road detection and corner extraction method using a 3D LiDAR for robot navigation on the pavement. The test fields were narrow sidewalks surrounded by bushes or buildings, which were distinguished by corner extraction from range data.
Fernandes et al. [36] proposed a road detection method using 3D LiDAR data. The authors projected 3D LiDAR points on a 2D reference image plane and upsampled the points to generate dense height maps. The method detected the road surface, which is the drivable region, from the dense height maps surrounded by road boundaries with elevation such as curbs or parked vehicles. Caltagirone et al. [37] proposed a 3D LiDAR-based road detection method using fully-convolutional neural networks (FCN). The FCN was designed for a pixel-wise semantic segmentation task in the point cloud top-view images. The proposed system carried out road segmentation in real time. While both methods [36,37] were successfully tested using the KITTI road benchmark [38], only the drivable region was detected, and road lanes were not distinguished.
3. Proposed System
The proposed method detects the drivable region and road lines from LiDAR data in stages. The LiDAR data are assumed to be scanned from a spinning multi-channel LiDAR sensor installed on top of a vehicle while driving on the urban road.
3.1. Overview
A spinning multi-channel LiDAR sensor rotates 5–20 times per second to densely scan its surroundings with as many scanlines as it has channels. Usually, it scans up to 100 m, which is sufficiently far for a moving vehicle to detect any object in advance and take appropriate action in time. However, a sensor with fewer than 16 channels may have too limited a vertical resolution to recognize an object located more than 30 m away.
As we aimed to detect road lines on the ground rather than vertical obstacles by using LiDAR data, we were less obligated to process LiDAR points far away from the sensor. Therefore, it was reasonable to exclude the LiDAR points scanned far away from the sensor with limited resolution. The proposed strategy processed the LiDAR points inside a certain perimeter with an appropriate vertical resolution to be accurately categorized. The detected road lines were later accumulated with the help of a GPS/INS sensor for digital map generation.
We processed the three stages of LiDAR point categorization for road lane detection as illustrated in Figure 1. First, we differentiated the LiDAR points of the drivable region on the ground from the points scanned inside a certain perimeter from the sensor. We assumed that the vehicle was currently on the road and examined the vertical slope between the LiDAR points of the adjacent channels in the radial direction. If the slope was smaller than a certain threshold, the point was categorized as the drivable region until an obstacle was encountered. Then, among the points categorized as the drivable region, we distinguished the points of road marks by their intensity. In addition to its 3D location, each LiDAR point carried an intensity value, which depended on the reflectivity of the scanned surface. The paint used for the road marks usually had higher reflectivity than the asphalt or cement on the road. Therefore, the points of the road marks could easily be separated from the other points of the drivable region by using the intensity.
The final stage, distinguishing road lines from the points categorized as road marks, was a difficult problem. Previous works circumvented the issue by detecting the road lines of simple roads having at most two lanes with few road signs on the ground. In this case, road line detection became as simple as selecting the LiDAR points with a high intensity from the points of the drivable region. However, diverse road marks such as arrows, stop signs and the names of different destinations of individual lanes frequently appeared on the urban roads. It was difficult to distinguish them from the road lines because road marks and road lines usually had the same reflectivity under the LiDAR scan.
3.2. LiDAR Point Categorization
A multi-channel LiDAR sensor carries multiple laser scanners, as many as the number of its channels. They are usually installed vertically inside the sensor. Each laser scanner emits and receives light to measure the distance for each channel. In order to scan the entire surroundings of the sensor, a mirror inside the sensor spins rapidly to scan 360°.
We categorized the LiDAR points inside a certain perimeter from the sensor as the drivable region by calculating the slope between the points in the radial direction from the sensor. When a spinning multi-channel LiDAR horizontally scanned 360° with multiple laser emitters vertically installed on the sensor, the points of multiple channels scanned at the same instant formed a ray from the sensor. For example, the Velodyne HDL-32E horizontally scanned 360° in 0.1 s with 32 lasers. Each channel scanned approximately 2100 points per spin; therefore, approximately six points were positioned in each degree of horizontal angle. Furthermore, the 32 points scanned at the same time formed a ray from the sensor. If we considered the point nearest to the sensor as Channel Number 0 and the point furthest from the sensor as Channel Number 31, we could calculate the slope between two points of the adjacent channels from Channels 0–31 as follows:

$$ s = \frac{z_b - z_a}{\sqrt{(x_b - x_a)^2 + (y_b - y_a)^2}} $$

where a and b indicate the adjacent channels and x, y and z indicate the 3D coordinates of each LiDAR point in the right, forward and up directions from the sensor, respectively. When the slope s of the ground became greater than a certain threshold (empirically set to 0.15), we assumed that there was a vertical obstacle on the ground and categorized the points on the same ray beyond this channel as not drivable.
This method effectively detected sudden vertical slope changes on the road such as the curb between the road and the sidewalk or vertical obstacles such as the median strip, beacons or other vehicles. The method successfully categorized speed bumps or small potholes as the drivable regions. An example of drivable region detection is shown in Figure 2. Gray points indicate the given 3D LiDAR points, whereas blue points indicate the points categorized as the drivable region.
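The slope test along a single ray can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name `drivable_mask` is hypothetical, the nearest point of each ray is assumed to lie on the road, and the magnitude of the slope is used so that sudden drops are treated like rises, which is an assumption of this sketch.

```python
import numpy as np

def drivable_mask(ray, slope_threshold=0.15):
    """Mark the drivable points along one LiDAR ray.

    ray: (C, 3) array of x (right), y (forward), z (up) coordinates,
         ordered from the channel nearest to the sensor to the farthest.
    Returns a boolean array of length C.
    """
    ray = np.asarray(ray, dtype=float)
    mask = np.zeros(len(ray), dtype=bool)
    if len(ray) == 0:
        return mask
    mask[0] = True  # assumption: the vehicle, and hence the nearest point, is on the road
    for b in range(1, len(ray)):
        a = b - 1
        horizontal = np.hypot(ray[b, 0] - ray[a, 0], ray[b, 1] - ray[a, 1])
        rise = ray[b, 2] - ray[a, 2]
        # Assumption: use the slope magnitude so that drops also stop the search.
        slope = abs(rise) / horizontal if horizontal > 0 else np.inf
        if slope > slope_threshold:
            break  # obstacle found; this point and everything beyond it is not drivable
        mask[b] = True
    return mask
```

Applying the function to every ray of a spin yields the drivable points of that spin; gentle height changes such as speed bumps stay below the threshold, whereas a curb does not.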
Once the points of the drivable region were discriminated, we distinguished the points of road marks by using the LiDAR point intensity. The signs and lines on the road were usually painted with conspicuous colors such as white or yellow. The intensity of the points scanned at the road marks was easily distinguishable from the intensity of the points scanned at the asphalt or cement on the ground. The LiDAR point intensity of the road marks was usually considerably higher. In Figure 2, the yellow points indicate the detected road marks on the ground.
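The intensity-based selection of road marks amounts to a simple threshold applied to the drivable points, as in the short sketch below. The function name `road_mark_mask` is hypothetical, and the threshold is left as a parameter because no specific value is given in the text; a suitable value depends on the sensor and on the road surface.

```python
import numpy as np

def road_mark_mask(intensity, drivable, intensity_threshold):
    """Select drivable points whose intensity exceeds a threshold.

    intensity: (N,) LiDAR intensity values.
    drivable:  (N,) boolean mask from the drivable-region stage.
    intensity_threshold: scalar chosen for the sensor at hand (assumed input).
    """
    intensity = np.asarray(intensity, dtype=float)
    drivable = np.asarray(drivable, dtype=bool)
    # A point is a road-mark candidate only if it is both drivable and bright.
    return drivable & (intensity > intensity_threshold)
```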
3.3. Road Line Detection
To detect the road lines out of various road marks having similar LiDAR intensities, we searched for a set of parallel lines separated by the interval of the lane width. The lane width was initially set as 3.4 m, but the value was updated during the detection process.
Given a set of LiDAR points categorized as road marks, we aimed to estimate L sets of line parameters. We modeled each road line in 3D as:

$$ \mathbf{l}(t) = \mathbf{p} + t\,\mathbf{v} $$

where $\mathbf{v}$ is a vector parallel to the line, $\mathbf{p}$ is a point on the line and t is a real number.
For initialization, we created K sets of lines with the interval of half of the initial lane width to separate the road marks from the road lines effectively, where K is the number of candidates to detect L lines. Assuming that the location of the sensor was the origin and the sensor was facing front, all the line parameters were initialized as follows:

$$ \mathbf{v}_k = (0, 1, 0)^\top, \qquad \mathbf{p}_k = (x_k, 0, -2)^\top, \qquad k = 1, \dots, K $$

This initialization indicates that all the lines are parallel to the y-axis and lie on the ground two meters below the sensor. Note that the x, y and z axes point in the right, forward and up directions, respectively. The $x_k$ values for the lines were initialized to be separated with intervals of $w/2$, where w is the initial lane width set as 3.4 m.
After initialization, we updated the x-offset $x_k$ of each candidate line k with the assumption that the LiDAR points for a road line were stably found as the vehicle moved forward, whereas the points for the other road marks were discontinuous. We designed an iterative expectation-maximization (EM) algorithm to determine reliable sets of line parameters. For each iteration, we assigned each LiDAR point to its nearest line by calculating the point-line distance. After the assignment, we updated $x_k$ for each line to be the average of the x-coordinate values of the assigned points. This process was very simple and fast. We repeated the process 100 times for each spin of the LiDAR.
Finally, we selected L reliable sets of line parameters out of the updated sets after the EM algorithm. For each line k, we calculated the number of assigned points $n_k$ and the standard deviation $\sigma_k$ of the y-coordinate values of the assigned points. $\sigma_k$ showed how widely the points were spread in the direction of the y-axis; the $\sigma_k$ of the cluster of a road line tended to be larger than that of the cluster of other road marks. Starting from the cluster having the largest $n_k$, we selected the line parameters as a reliable set if $\sigma_k$ was larger than a certain threshold. As the line was expected to be continued from the last spin, we also checked if the current line was continued from one of the detected lines from the last spin. That is, we calculated the minimum difference between the x-offset $x_k$ of the current line and the x-offsets of the detected lines from the last spin and considered it as the same line when the minimum difference was below a certain threshold. After selecting L sets of reliable line parameters, we ended the line selection process.
As a result, the detected L sets of line parameters were updated at every spin of the LiDAR. The lines were stably detected, while the vehicle changed lanes or went through curved roads. Algorithm 1 summarizes the proposed road line detection process.
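The assignment, update and selection steps described in this section can be sketched for a single spin as follows. This is a simplified illustration under stated assumptions, not the implementation evaluated in this paper: all candidate lines are kept parallel to the y-axis at a common height, so that the nearest line is found from the x-offset alone; candidates are ranked by their number of assigned points and accepted when the spread of their y-coordinates exceeds a threshold; and the continuity check against the previous spin is omitted. The names `assign_to_lines`, `update_line_offsets`, `select_lines` and `min_spread` are hypothetical.

```python
import numpy as np

def assign_to_lines(points, x):
    # All candidate lines are parallel to the y-axis at the same height, so the
    # nearest line in point-line distance is the one with the closest x-offset.
    return np.argmin(np.abs(points[:, 0:1] - x[None, :]), axis=1)

def update_line_offsets(points, x_init, iterations=100):
    """Alternate point assignment and x-offset averaging.

    points: (N, 3) road-mark points in the sensor frame.
    x_init: (K,) initial x-offsets of the candidate lines.
    Returns the updated offsets and the final point-to-line labels.
    """
    points = np.asarray(points, dtype=float)
    x = np.asarray(x_init, dtype=float).copy()
    for _ in range(iterations):
        labels = assign_to_lines(points, x)
        for k in range(len(x)):
            assigned_x = points[labels == k, 0]
            if assigned_x.size > 0:  # candidates without points keep their offset
                x[k] = assigned_x.mean()
    return x, assign_to_lines(points, x)

def select_lines(points, x, labels, num_lines, min_spread):
    """Pick up to num_lines offsets whose points are widely spread along y."""
    points = np.asarray(points, dtype=float)
    counts = np.bincount(labels, minlength=len(x))
    spreads = np.array([points[labels == k, 1].std() if counts[k] > 0 else 0.0
                        for k in range(len(x))])
    selected = []
    for k in np.argsort(-counts):  # most populated candidates first
        if spreads[k] > min_spread:
            selected.append(float(x[k]))
        if len(selected) == num_lines:
            break
    return sorted(selected)
```

With candidates initialized at half-lane-width intervals, a compact mark such as an arrow between two road lines is absorbed by its own candidate and then rejected because its points are not spread far along the driving direction.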
3.4. Lane-Level Digital Map Generation
Nowadays, most of the map-making cars are equipped with a GPS/INS sensor, which provides the current GPS location of the vehicle, as well as the rotation matrix from the sensor coordinate frame to the world coordinate frame. The world coordinate frame generally refers to the north-east-down (NED) frame. Provided that the GPS/INS sensor was calibrated with the LiDAR sensor, we transferred the estimated parameter of the detected road lines to the world coordinates for each spin of the LiDAR.
We used the current position in terms of the latitude and longitude along with the altitude provided by the GPS/INS sensor, as well as the rotation matrix that transformed the sensor coordinate frame to the local NED frame in real time. No additional method was used to post-process the positioning data. The latitude and longitude values were converted to UTM coordinates to obtain the position values in meters. As the estimated parameter was the x-coordinate value in the sensor coordinate frame, it was transformed to a position in the world coordinates by multiplying by the rotation matrix and then adding the current sensor position in UTM.
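The rigid transformation from the sensor frame to the world frame can be written compactly as below. This is a minimal sketch: the function name `sensor_to_world` is hypothetical, and the rotation matrix and the sensor position are assumed to be already expressed in one consistent metric world frame, so the conversion from latitude and longitude to UTM is not shown.

```python
import numpy as np

def sensor_to_world(points_sensor, rotation, position):
    """Apply p_world = R @ p_sensor + t to every point.

    points_sensor: (N, 3) points in the sensor frame.
    rotation:      (3, 3) rotation matrix from the sensor frame to the world frame.
    position:      (3,) sensor position in the world frame, in meters.
    """
    points_sensor = np.atleast_2d(np.asarray(points_sensor, dtype=float))
    rotation = np.asarray(rotation, dtype=float)
    position = np.asarray(position, dtype=float)
    # Row-vector form of R @ p for all points at once.
    return points_sensor @ rotation.T + position
```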
Algorithm 1: Road line detection.
We filtered out the duplicate points, which were estimated while the vehicle stopped or moved very slowly. Then, we simply connected the remaining points to build the lane-level digital map. For qualitative evaluation, we overlaid the final lane-level digital map on the satellite images.
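One simple way to remove the duplicates accumulated while the vehicle is stationary is to enforce a minimum spacing between consecutive points of a line, as sketched below. The criterion, the function name `drop_near_duplicates` and the default spacing of 0.1 m are assumptions made for illustration; the text does not specify how duplicates were identified.

```python
import numpy as np

def drop_near_duplicates(points, min_spacing=0.1):
    """Keep a point only if it is at least min_spacing from the last kept point.

    points: (N, 3) world-frame points of one road line, in acquisition order.
    min_spacing: minimum distance in meters between kept points (assumed value).
    """
    points = np.asarray(points, dtype=float)
    if len(points) == 0:
        return points
    kept = [points[0]]
    for p in points[1:]:
        if np.linalg.norm(p - kept[-1]) >= min_spacing:
            kept.append(p)
    return np.array(kept)
```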
5. Discussion and Conclusions
5.1. Limitations and Future Work
There are several limitations to our system that need to be addressed before a full solution can be developed. The proposed method targets lane-level map generation from the detected road lines rather than immediate lane keeping during autonomous driving. Therefore, we did not consider severe weather conditions, which might degrade the 3D LiDAR sensor input. When the purpose is mapping, we can simply choose a day with good weather to scan the road, whereas weather can be a critical limitation when the method is applied to road lane detection for a lane-keeping system. When used for lane keeping in bad weather, the proposed method might be simplified to detect only two road lines on each side of the vehicle, and the input data could be complemented by other perception sensors, such as cameras.
While it is one of the strengths of the proposed method that it was tested on complex urban roads, this also means that the method may be difficult to apply to unpaved roads. Roads with steep slopes and severe curvatures in mountainous areas could be challenging and might require different empirical threshold values.
Finally, while vehicles crossing road lines as they changed lanes were never a critical problem for the proposed algorithm, it was difficult to detect road lines that were completely obscured by cars parked on the side of the road. As the undetected road line was usually the rightmost line in this case, this issue could be alleviated by knowing the total number of lines to be detected beforehand. If we have prior knowledge that there should be another line to be detected, we can adjust the algorithm to detect the rightmost line using the sparser input data observed between the parked cars.
Potential future work would depend on a further application of the proposed road lane detection method. For example, if we aim to extend the proposed method for use in a lane-keeping assist system for autonomous driving vehicles, we should focus on improving the detection accuracy of the two lines on each side of the vehicle under various weather and lighting conditions. To extend the proposed method toward fully-automated lane-level map generation, we should test the method under various road conditions, such as unpaved roads, mountainous roads with severe slope and curvature, crowded urban roads and roads with noisy GPS signals.
5.2. Conclusions
In this paper, we presented a simple and practical real-time working prototype for road lane detection in an urban area by using 3D LiDAR points. The overall system consisted of two subsystems: point categorization and road line detection. Given the 3D LiDAR point cloud, we categorized the points of the drivable region and distinguished the points of the road signs on the ground. Then, we presented an expectation-maximization process to detect and update the 3D line parameters in real time. The detected road lines were represented as densely- and uniformly-distributed 3D points on the lines with the help of a GPS/INS sensor and integrated to generate a lane-level digital map. The proposed system was tested by generating the lane-level maps of two complex urban routes in the cities of Seongnam and Incheon, South Korea. The accuracy of the resulting lane-level map was evaluated by comparing it with the corresponding satellite images.