This section introduces how to calculate the turning radius of the vehicle and how to match pixel distances in the aerial view with real distances in the world coordinate system. We also describe how the traditional sliding window algorithm detects lane lines.
2.1. Calculation of Vehicle Turning Radius
The vehicle motion model is shown in Figure 1a. In the inertial coordinate system OXY, φ is the yaw angle (heading angle) of the vehicle body, δ is the deflection angle of the front wheel, v_r is the center speed of the rear axle of the vehicle, v_f is the center speed of the vehicle's front axle, and l is the wheelbase.
Figure 1b shows a schematic diagram of the vehicle's steering process, where R represents the rear wheel turning radius, P represents the instantaneous center of rotation of the vehicle, M represents the rear axle axis of the vehicle, and N represents the front axle axis. Assuming that the sideslip angle of the vehicle's center of mass remains constant during the turning process, the instantaneous turning radius of the vehicle is the same as the curvature radius of the road. The calculation formula for the vehicle's yaw rate is shown in Formula (1).
In Formula (1), v_r is the speed at the rear axle axis of the vehicle. The turning radius of the vehicle is shown in Formula (2). Substituting the vehicle's yaw rate into Formula (2) yields Formula (3). The front wheel angle δ in Formula (3) can be obtained from the steering wheel angle and the steering ratio.
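In the notation above, and assuming the standard kinematic bicycle model of Figure 1, Formulae (1)–(3) can be written as:

$$\omega = \frac{v_r \tan\delta}{l} \quad (1)$$

$$R = \frac{v_r}{\omega} \quad (2)$$

$$R = \frac{l}{\tan\delta} \quad (3)$$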
The calculation of the vehicle turning radius given in Formula (3) is only applicable to vehicles at lower speeds (v ≤ 10 km/h). When the vehicle speed is high, the turning radius of the vehicle will be affected by the tire sideslip angle. At this time, the turning radius of the vehicle is shown in Formula (4).
In the above formula, K is the stability factor and R is the turning radius of the vehicle at low speeds; the calculation of R is shown in Formula (3). The calculation of the stability factor K is shown in Formula (5). In Formula (5), a is the distance from the front axle of the vehicle to the center of mass, b is the distance from the rear axle of the vehicle to the center of mass, k_1 is the total cornering stiffness of the front wheels of the vehicle, k_2 is the total cornering stiffness of the rear wheels of the vehicle, m is the body mass, and l is the wheelbase. By using Formulae (3)–(5), the turning radius of a vehicle at high speeds can be determined.
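In the conventional steady-state cornering form, and using the notation defined above, Formulae (4) and (5) can be written as:

$$R' = (1 + K v^2)\,R \quad (4)$$

$$K = \frac{m}{l^2}\left(\frac{a}{k_2} - \frac{b}{k_1}\right) \quad (5)$$

where v is the vehicle speed, R is the low-speed turning radius from Formula (3), and R′ is the high-speed turning radius. A minimal code sketch of the complete computation, assuming these reconstructed forms and illustrative (not paper-specified) vehicle parameters:

```python
import math

def turning_radius(steering_wheel_angle_rad, steering_ratio, speed_mps,
                   wheelbase, a, b, mass, k_front, k_rear):
    """Estimate the turning radius via Formulae (3)-(5).

    a, b:            distance from front/rear axle to the center of mass [m]
    k_front, k_rear: total cornering stiffness of front/rear wheels [N/rad],
                     negative under the usual sign convention
    """
    # Front wheel angle from steering wheel angle and steering ratio
    delta = steering_wheel_angle_rad / steering_ratio
    # Formula (3): low-speed turning radius
    r_low = wheelbase / math.tan(delta)
    # Formula (5): stability factor K
    K = mass / wheelbase**2 * (a / k_rear - b / k_front)
    # Formula (4): high-speed turning radius corrected for tire sideslip
    return (1.0 + K * speed_mps**2) * r_low

# Illustrative parameters only
R_high = turning_radius(math.radians(45.0), steering_ratio=16.0,
                        speed_mps=20.0, wheelbase=2.7, a=1.2, b=1.5,
                        mass=1500.0, k_front=-110000.0, k_rear=-95000.0)
```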
2.2. Sliding Window Recognition of Lane Lines
In order to better obtain the curvature of the lane line, we need to transform the image taken by the camera into an aerial view through perspective transformation. This makes the lane lines easier to detect and improves the detection accuracy.
The principle of perspective transformation [18,19] is shown in Formula (6):

$$\begin{bmatrix} x' & y' & w' \end{bmatrix} = \begin{bmatrix} u & v & w \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \quad (6)$$

In Formula (6), (u, v) are the original image pixel coordinates and (x, y) are the transformed image pixel coordinates, with x = x′/w′ and y = y′/w′. Because we are dealing with two-dimensional images, we can set w = 1 and a_33 = 1. Eight unknown coefficients then remain, so the pixel coordinate values of four pairs of corresponding points are required to complete the perspective transformation.
In order to better detect the lane lines in the lane where the vehicle is located, it is also necessary to select an ROI region in the image. Unlike general ROI selection, in this article, the selection of the lane ROI region not only needs to extract a single lane, but also needs to make the boundaries of the ROI region as parallel as possible to the lane lines. The advantage of doing so is that the lane in the bird's eye view is closer to the lane situation in world coordinates, keeping the error as small as possible during distance matching. In order to obtain a more accurate top view of the lane, we conducted a simulation in Prescan.
Figure 2a shows the normal perspective road image taken by the camera before the perspective transformation. The width of the image is 640 pixels, the height is 360 pixels, and the red box represents the selected ROI area. The pixel coordinates of points A, B, C, and D are shown in Table 1. After the perspective transformation, points A, B, C, and D are mapped to a(0,0), b(0,360), c(240,360), and d(240,0). Figure 2b shows the road surface after the perspective transformation; the width and height of the transformed image are 240 and 360 pixels, respectively.
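A minimal OpenCV sketch of this transformation step: the destination points a, b, c, and d are those given above, while the source coordinates of A, B, C, and D are illustrative placeholders for the actual values listed in Table 1:

```python
import cv2
import numpy as np

# Source points A, B, C, D in the 640x360 camera image; these values are
# illustrative placeholders for the actual coordinates given in Table 1
src = np.float32([[220, 200], [80, 360], [560, 360], [420, 200]])
# Destination points a, b, c, d in the 240x360 aerial view, as given above
dst = np.float32([[0, 0], [0, 360], [240, 360], [240, 0]])

M = cv2.getPerspectiveTransform(src, dst)      # 3x3 transformation matrix
frame = cv2.imread("road.png")                 # a 640x360 camera frame
birdseye = cv2.warpPerspective(frame, M, (240, 360))
```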
As can be seen from Figure 2b, when the car is traveling in a straight line, the two lane lines in the aerial view are basically parallel, close to the top view in the world coordinate system. This indicates the rationality of the selected points of the ROI area and provides a guarantee for the subsequent calculation of the curvature of the lane line. After obtaining the aerial view, the image should be binarized. The original image obtained by the camera is a color image, which has a large amount of data and contains a lot of interference information. Converting it into a binary image not only effectively removes this interference information but also improves the operation speed. Moreover, through threshold segmentation, the lane lines can be better distinguished from the road surface and recognized better and faster. The binary image of the aerial view of the pavement is shown in Figure 3.
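A minimal sketch of the binarization step, assuming a fixed global threshold on a grayscale version of the aerial view (the specific thresholding method and value are illustrative):

```python
import cv2

# Convert the color aerial view to grayscale, then apply a fixed threshold;
# bright lane markings become 255 (white) and the road surface becomes 0
gray = cv2.cvtColor(birdseye, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY)
```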
After preprocessing the image, the sliding window method can be used to detect lane lines [20]. The flow chart of traditional sliding window detection of lane lines is shown in Figure 4.
It is very important to determine the coordinate value of the search center of the first window in the sliding window algorithm. This article uses pixel statistics to determine the search center position of the first window, and a binocular camera is used to assist in locating the position of the first sliding window when necessary. We define the first window at the bottom of the image, with a width of w and a height of h. As noted above, the width and height of the converted aerial view are 240 and 360 pixels, respectively. Therefore, the y-coordinate value of the first window center point is 360 − h/2, and what remains to be determined is the x-coordinate value of the window center. By counting the number of white pixels in each column and finding the two columns with the highest counts, the x-coordinate values of the search centers of the first sliding windows of the left and right lane lines can be determined. This is the pixel statistics method. When the vehicle is in a curve, the lane line is also curved; if white pixel statistics were computed over the whole aerial view image, the accuracy of the statistical results would be affected. Therefore, we take the bottom 1/5 of the aerial view image to count the number of white pixels. Figure 5 shows a pixel statistics map at a certain moment.
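A minimal sketch of the pixel statistics step on the binary aerial view from the previous sketch; splitting the column histogram at the image midline to separate the left and right lane lines is an illustrative assumption:

```python
import numpy as np

h_img, w_img = binary.shape                 # 360, 240
bottom = binary[4 * h_img // 5:, :]         # bottom 1/5 of the aerial view
hist = np.sum(bottom == 255, axis=0)        # white pixel count per column

mid = w_img // 2
left_x = int(np.argmax(hist[:mid]))             # left lane line base x
right_x = int(np.argmax(hist[mid:])) + mid      # right lane line base x
```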
In most cases, the pixel statistics method can be used to determine the positions of the first sliding windows of the left and right lane lines. However, for some dashed lane lines, as shown in Figure 6, the right lane line disappears after taking the bottom 1/5 of the image. The pixel statistics method may then fail because no white pixels can be detected, so a binocular camera is needed to assist in determining the position of the first sliding window. How the binocular camera is used to assist in locating the position of the first sliding window is described later.
From Figure 5, it can be seen that there are two obvious peaks in the statistics of the number of white pixels. The x-coordinate values of these two peaks can be used as the search center x-coordinates for the first sliding windows of the left and right lane lines. After determining the search center, one can draw a virtual rectangle with a width of w and a height of h. The values of w and h can be determined based on the image size; in this article, both w and h are 40 pixels. The sum of the x-coordinate values of all white pixels within the virtual rectangle is divided by the number of white pixels, and this average value is the x-coordinate of the center of the actual sliding window that is drawn. The center of the previous sliding window then serves as the search center for the next window: the sum of the x-coordinate values of the white pixels in the new window is again divided by their count, the mean is used as the x-coordinate of the new sliding window center, and a new sliding window is drawn. This operation is repeated until the number of sliding windows exceeds the threshold. A curve is fitted through the centers of the sliding windows, and the aerial view is then inverse perspective transformed to obtain the final lane line fitting result. The traditional sliding window effect is shown in Figure 7a, and the fitting of the lane lines is shown in Figure 7b.
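A minimal sketch of this loop for a single lane line, starting from the base x-coordinate found by pixel statistics; the window-count threshold of 9 (which tiles the 360-pixel image height with 40-pixel windows) and the second-order curve fit are illustrative choices:

```python
import numpy as np

def slide_windows(binary, base_x, w=40, h=40, max_windows=9):
    """Track one lane line upward from base_x, returning window centers."""
    h_img = binary.shape[0]
    ys, xs = np.nonzero(binary)            # coordinates of all white pixels
    centers, x_c = [], base_x
    for i in range(max_windows):
        y_top, y_bot = h_img - (i + 1) * h, h_img - i * h
        # White pixels inside the current virtual rectangle
        in_win = ((ys >= y_top) & (ys < y_bot) &
                  (xs >= x_c - w // 2) & (xs < x_c + w // 2))
        if np.any(in_win):
            # Mean x of the white pixels becomes the new window center;
            # if no white pixels are found, the window keeps the previous x
            x_c = int(np.mean(xs[in_win]))
        centers.append((x_c, (y_top + y_bot) // 2))
    return centers

centers = slide_windows(binary, left_x)
# Fit a curve through the window centers (x as a function of y)
ys_c = [c[1] for c in centers]
xs_c = [c[0] for c in centers]
fit = np.polyfit(ys_c, xs_c, 2)
```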
From Figure 7, it can be seen that the lane lines detected and fitted using sliding windows achieve a good effect. Compared with the Hough transform method of detecting lane lines, the sliding window method has better detection performance and better robustness in curves with small curvature.
Figure 8 shows the lane detection results of the sliding window for small-curvature curves; the curvature radius of the curve is 120 m. From Figure 8, we can also see that when the vehicle first enters a curve (at which point the curvature of the curve is still small), the traditional sliding window algorithm can detect the lane line, thereby providing a steering wheel angle that can be used as input for subsequent algorithms.
Although the traditional sliding window algorithm has good detection performance for small-curvature curves, in curves with larger curvature or curves with dashed lane lines, some sliding windows may miss the lane line or fail to detect white pixels, as shown in Figure 9. The radius of the bend in Figure 9 is 40 m. At this point, the lane lines detected by the traditional sliding window algorithm differ significantly from the actual situation, which is disadvantageous for an automated driving system or a driver-assistance system.
2.3. Distance Matching
The turning radius of a vehicle can be obtained from Formulae (3)–(5) and can also be approximated as the radius of the curve where the vehicle is located. Using this radius, we can calculate the lateral and longitudinal distance differences between two points on the lane. However, these differences are true values in the world coordinate system and are not directly applicable in pixel coordinates. Therefore, we still need to calculate the corresponding pixel distance differences in the aerial view (both the horizontal and the vertical distance). It is worth noting that the transverse and longitudinal directions mentioned here are defined with respect to the vehicle body: the distance parallel to the longitudinal axis of the vehicle is called the longitudinal distance, while the distance perpendicular to the longitudinal axis of the vehicle is called the transverse distance. As mentioned earlier, when constructing the aerial view, we make the lane lines during straight-line driving as parallel as possible, which brings the aerial view obtained through perspective transformation closer to the real-world top view.

This article proposes a ratio method to calculate the ratio of the pixel distance in the image coordinate system to the distance in the world coordinate system. The ratio method can be seen as a calibration process. First, we need to determine the installation position of the camera. Second, we need to determine the transformation matrix of the aerial view, which is related to the four points selected for the perspective transformation. If either of these parameters changes, the scale relationship must be obtained again. Generally speaking, the position of the camera does not move after installation and the selected perspective transformation points do not change, so this method is feasible. The reference origin and reference coordinate system of the camera are shown as the blue coordinate system in Figure 10. In the simulation software, the x, y, and z coordinates of the camera are 1.85, 0, and 1.4, in meters. The red coordinate system is the camera coordinate system, with its origin at the camera. We take pictures of lanes with different widths and calculate the pixel distances of these lanes in the aerial view to obtain the scale relationship.

There are two main reasons why we use the ratio calculation instead of a coordinate transformation. First, the ratio method simplifies the calculation steps and shortens the computation time, which helps ensure real-time performance. Second, the coordinate transformation method first requires determining the world coordinate system and its origin, and the world coordinate system changes as the vehicle moves, which increases the computational difficulty. Considering these two points, we adopt the ratio method. The distance ratio results obtained in the simulation software are shown in Table 2.
The average value of the ratio calculated from Table 2 is 44.7, which means that a distance of one meter in the lateral direction in the world coordinate system corresponds to 44.7 pixels in the aerial view image. The longitudinal distance ratios are shown in Table 3.
The average value of the ratio calculated from Table 3 is 30.8, which means that a distance of one meter in the longitudinal direction in the world coordinate system corresponds to 30.8 pixels in the aerial view image.
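A minimal sketch of how the two averaged ratios can be applied during distance matching; the helper function names are illustrative:

```python
# Averaged calibration ratios from Tables 2 and 3 (pixels per meter)
LATERAL_PX_PER_M = 44.7       # perpendicular to the vehicle's longitudinal axis
LONGITUDINAL_PX_PER_M = 30.8  # parallel to the vehicle's longitudinal axis

def world_to_pixel(lat_m, lon_m):
    """Convert lateral/longitudinal world distances [m] to aerial view pixels."""
    return lat_m * LATERAL_PX_PER_M, lon_m * LONGITUDINAL_PX_PER_M

def pixel_to_world(lat_px, lon_px):
    """Convert aerial view pixel distances back to world distances [m]."""
    return lat_px / LATERAL_PX_PER_M, lon_px / LONGITUDINAL_PX_PER_M
```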