1. Introduction
The rapid advancement of autonomous vehicle technology has significantly reshaped the landscape of modern transportation. At the heart of this technological evolution is the critical need for precise position tracking, which ensures that autonomous systems can navigate safely and efficiently. This need is particularly pronounced in the realm of autonomous motorsport, where the demands for precision and speed are at their highest.
Several techniques can be used to determine the vehicle's location, and these can be grouped into two main categories. The first comprises self-contained methods that rely on a known starting location and measurements of pose change; due to this property, these dead reckoning techniques provide only relative localization. In contrast, the other group contains the absolute or position-fixed methods, which rely on external fixed objects and exploit measurements of these to calculate the location.
The most widely known technique from the former group is IMU-based (Inertial Measurement Unit) navigation, where the signals of the accelerometer and gyroscope are integrated and transformed to determine the pose. In the case of wheeled robots or cars, so-called wheel odometry can be used as a dead reckoning method [1]. This estimation uses the wheel rotation and steering encoders as inputs to calculate the velocities and yaw rate of the vehicle and applies one-step integration. More generally, this technique can be interpreted as a vehicle model-based estimation, and if more signals from the vehicle sensors are integrated, e.g., torque and suspension displacement, the full 3D dynamic quantities can be calculated and integrated for localization purposes [2]. Some special sensors, e.g., optical velocity sensors or fiber-optic gyroscopes, measure these quantities directly and can be used to eliminate the disadvantages induced by vehicle model integration. However, these sensors have extremely high costs relative to the IMU and encoders and are thus utilized for validation purposes only.
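To make the one-step integration concrete, the following is a minimal sketch of a wheel odometry step built on a kinematic bicycle model; all function and signal names are illustrative rather than taken from any specific system.

```python
import numpy as np

def wheel_odometry_step(pose, v_wheel, steering_angle, wheelbase, dt):
    """One dead reckoning step with a kinematic bicycle model.

    pose:           (x, y, yaw) in the map frame
    v_wheel:        longitudinal speed from the wheel encoders [m/s]
    steering_angle: front wheel steering angle [rad]
    wheelbase:      distance between the axles [m]
    dt:             integration step [s]
    """
    x, y, yaw = pose
    yaw_rate = v_wheel * np.tan(steering_angle) / wheelbase
    x += v_wheel * np.cos(yaw) * dt   # one-step (Euler) integration
    y += v_wheel * np.sin(yaw) * dt
    yaw += yaw_rate * dt
    return np.array([x, y, yaw])
```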
The most famous method of the latter, absolute localization group is GNSS (Global Navigation Satellite System), where the position of the module mounted on the chassis is computed from the pseudoranges of the satellite radio frequency signals. Furthermore, modern GNSS modules integrate the carrier phases of the signals and use an inner estimation algorithm to compute both the position and velocity of the vehicle. Nevertheless, the GNSS technique requires a clear sky and many satellites without multipath to provide a position with high accuracy [3]. Similarly to the position, the orientation can be measured absolutely as well, with a compass. In the field of autonomous robots, magnetometers sense the Earth's magnetic field, but several distortions can cause false measurements. Furthermore, the most relevant yaw angle of a vehicle can be properly measured with a dual-antenna GNSS setup, and the orientation in the remaining directions can be estimated with a third module as well.
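For illustration, with two antennas mounted along the vehicle's longitudinal axis, the yaw angle follows directly from the baseline vector between the two measured antenna positions; the sketch below assumes a local ENU frame, and the names are illustrative.

```python
import numpy as np

def dual_antenna_yaw(p_primary, p_secondary):
    """Yaw angle from a dual-antenna GNSS baseline.

    p_primary, p_secondary: antenna positions in a local ENU frame [m];
    the baseline is assumed to be aligned with the longitudinal axis.
    """
    baseline = np.asarray(p_secondary) - np.asarray(p_primary)
    return np.arctan2(baseline[1], baseline[0])  # yaw about the up axis
```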
Based on a similar methodology, it is also possible to develop absolute vehicle localization on a visual principle, whether using a camera, LiDAR (Light Detection and Ranging), or radar [4]. The fixed points for triangulation can be real objects, such as cones or lanes, abstract image features, e.g., from ORB (Oriented FAST and Rotated BRIEF) feature detectors, or simply the full grid map. However, the technique requires a complex pipeline, prior map building, and robust online feature matching. Furthermore, the distortion induced by dynamic objects must be handled online as well, which often leads to the development of a full SLAM (Simultaneous Localization and Mapping) algorithm.
Regardless of the group, sensor, or method, it is almost impossible to tackle the localization of a racecar in every circumstance with one method alone. Thus, the fusion of several sensors or techniques is needed, not only in terms of parallelism but also of complementarity, to ensure that all of the requirements can be fulfilled.
The rest of the paper is organized as follows. Section 2 summarizes improvements achievable with sensor fusion methods, while Section 3 contains experiences from a real racecar competition and an analysis of GNSS-based localization. Finally, the paper is concluded in Section 4.
2. Sensor Fusion Techniques for Improved Localization
Every sensor mentioned in Section 1 has a high-accuracy version, e.g., a mechanical accelerometer. Since this paper focuses on vehicles, where the size of the equipment is limited, this section examines consumer/industrial-grade sensor types. Thus, e.g., the attributes of a MEMS-based IMU are analyzed rather than those of its mechanical counterpart, the GNSS module is assumed to be automotive-grade, and high-cost optical sensors are excluded as well.
2.1. INS Solutions
There is no exact definition of the INS (Inertial Navigation System) technique, but the term is widely used when the signals of a 3-axis accelerometer and gyroscope are fused to calculate the vehicle's velocity and orientation changes. In most cases, angles from a magnetometer are also utilized as inputs, which improves the orientation calculation mainly when the vehicle is tilting and accelerating at the same time. In some cases, a barometric pressure sensor is also included as an altimeter. The main disadvantage of this method is induced by the bias appearing in the accelerometer's signals. Thus, although the INS estimates the pose and all of the required dynamic quantities (assuming known initial values), it can be used only for short-term navigation due to the double integration of the measured acceleration and of its errors.
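The severity of this drift is easy to quantify: a constant accelerometer bias grows linearly in velocity and quadratically in position after double integration. A minimal numerical illustration with an assumed bias value:

```python
import numpy as np

# Illustration: a constant accelerometer bias is double-integrated,
# so the position error grows quadratically in time.
dt = 0.01                                  # 100 Hz IMU
t = np.arange(0.0, 60.0, dt)               # one minute at standstill
bias = 0.05                                # assumed bias [m/s^2]

velocity_error = np.cumsum(bias * np.ones_like(t)) * dt  # 1st integration
position_error = np.cumsum(velocity_error) * dt          # 2nd integration

# ~0.5 * 0.05 * 60^2 = 90 m, which is why pure INS is short-term only
print(f"position error after 60 s: {position_error[-1]:.1f} m")
```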
2.2. Vehicle Model-Based Estimation
The vehicle models utilize the vehicle CAN (Controller Area Network) signals, such as wheel rotations, steering angle, motor torque, and brake pressure. The technique integrates these quantities into Newton's second law mechanical equations and includes other models, e.g., the Pacejka tire and aerodynamic drag models. With two-wheel odometry and a bicycle model, the velocity components and two independent estimates of the yaw rate become available. However, the wheel rotations are corrupted by slipping in the case of a racecar, which can be compensated by fusion with an INS method. Furthermore, the accuracy of every model-based estimation is limited by the knowledge of its parameters, which is highly relevant for a racecar [5].
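As an illustration of this double estimation, the sketch below computes the yaw rate in two independent ways from typical CAN signals: from the kinematic bicycle model and from differential rear-wheel odometry. Tire slip is neglected, which, as noted above, is precisely what corrupts these signals on a racecar; all names are illustrative.

```python
import numpy as np

def yaw_rate_estimates(v_x, steering_angle, omega_rl, omega_rr,
                       wheel_radius, track_width, wheelbase):
    """Two independent yaw rate estimates from CAN signals.

    v_x:                 longitudinal speed [m/s]
    steering_angle:      front steering angle [rad]
    omega_rl, omega_rr:  rear-left/right wheel angular speeds [rad/s]
    """
    # (1) kinematic bicycle model from the steering angle and speed
    yaw_rate_bicycle = v_x * np.tan(steering_angle) / wheelbase
    # (2) differential wheel odometry from the rear axle speeds
    yaw_rate_odometry = (omega_rr - omega_rl) * wheel_radius / track_width
    return yaw_rate_bicycle, yaw_rate_odometry
```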
2.3. SLAM-Based Methods
The clear advantage of the SLAM-based methods compared to the previously mentioned ones is the availability of absolute pose estimation. It comes at the cost of a map-building procedure, but this can be carried out in a separate low-speed run without other cars on the track [6]. However, the accuracy of localization mainly depends on the online scan matching, which is a difficult task in this racing environment due to high speeds of up to 300 km/h and the continuous masking of the environment by the opponents. This disadvantage is handled by a motion model in which external velocity and yaw rate signals are used to predict the likely location of the actual scan from the previous location. For this, INS- or vehicle model-based estimation can be included; thus, the SLAM-based method is already a sensor fusion technique by default.
2.4. GNSS-Aided Estimation
The GNSS is the only sensor that gives absolute pose measurements directly. However, the accuracy of the location is limited by signal multipath, the blocking of the sky by overpasses, and the other factors mentioned in the previous section. Nevertheless, the GNSS is a core element in a sensor fusion method because this independent absolute measurement can compensate for the disadvantages of the other methods. For example, in a GNSS/INS system, the drift of the inertial estimation is corrected via a bias estimation of the accelerations [7], the parameters of the vehicle model can be identified from the measured GNSS path [8], and the GNSS position can serve as an a priori measurement for the location of the actual scan matching.
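As a minimal illustration of this drift correction, consider a one-dimensional linear Kalman filter whose state is [position, velocity, accelerometer bias]: the measured acceleration drives the prediction, and the GNSS position update makes the bias observable. All matrices and noise values below are illustrative, not any product's internal algorithm.

```python
import numpy as np

dt = 0.01
# State x = [position, velocity, accelerometer bias]; the bias is
# subtracted from the measured acceleration inside the model.
F = np.array([[1.0, dt, -0.5 * dt**2],
              [0.0, 1.0, -dt],
              [0.0, 0.0, 1.0]])
B = np.array([0.5 * dt**2, dt, 0.0])     # input matrix for measured accel
H = np.array([[1.0, 0.0, 0.0]])          # GNSS observes the position only
Q = np.diag([1e-6, 1e-4, 1e-8])          # process noise (illustrative)
R = np.array([[0.05**2]])                # RTK-grade position noise

def predict(x, P, a_meas):
    x = F @ x + B * a_meas
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z_gnss):                # z_gnss: np.array([position])
    y = z_gnss - H @ x                   # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    return x + K @ y, (np.eye(3) - K @ H) @ P
```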
2.5. Sensor Fusion with SLAM Integration
As presented above, none of the methods can perform the state estimation task alone. In robotics and automotive environments, a GNSS-aided INS method with model integration is often handled separately from SLAM-based methods, since it estimates all of the required quantities (location from the GNSS, velocities from the model, accelerations from the INS), and its frequency can easily reach 100 Hz. However, in vehicle racing, the exact location relative to the track's border is also crucial. Thus, its output should be fused with vision-based methods, but this should occur in parallel due to the lower frequency and differing timestamps of the vision sensors. Additionally, the delay caused by image processing in the vision-based methods introduces latency, complicating synchronization with the higher-frequency INS/GNSS outputs.
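One common way to handle this latency, sketched below under assumed interfaces, is to buffer the high-rate states and inputs; when a delayed vision pose stamped at time t arrives, the filter rolls back to the buffered state at t, fuses the measurement there, and replays the stored predictions up to the present.

```python
from collections import deque

class DelayedMeasurementBuffer:
    """Sketch of out-of-sequence fusion; update_fn and predict_fn stand
    in for the actual filter's correction and prediction steps."""

    def __init__(self, horizon=200):
        # each entry: (timestamp, filter state, high-rate inputs)
        self.history = deque(maxlen=horizon)

    def store(self, t, state, inputs):
        self.history.append((t, state, inputs))

    def apply_delayed(self, t_meas, update_fn, predict_fn):
        # roll back to the first buffered state at or after the stamp
        # (assumes the buffer horizon covers the vision delay)
        entries = list(self.history)
        idx = next(i for i, (t, _, _) in enumerate(entries) if t >= t_meas)
        _, state, _ = entries[idx]
        state = update_fn(state)                   # fuse the delayed pose
        for _, _, inputs in entries[idx + 1:]:
            state = predict_fn(state, inputs)      # replay predictions
        return state
```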
2.6. Complementary Fusion
In some cases, the fusion of sensors or methods is not only an option to mitigate the estimation noise but is also a must to achieve the required accuracy for every quantity. The basic example of this is the INS fusion, where the pose change can be calculated only by utilizing both the accelerometer and the gyroscope. However, there are cases where a quantity can be estimated, but only with a significant error.
For example, the accuracy of the lateral and longitudinal locations differs greatly between the LiDAR- and camera-based SLAM methods [6]. The estimation with the LiDAR is difficult in the longitudinal direction due to the spacious surroundings on the straights, but the estimation is simple laterally, since there is a constant barrier around the track. In contrast, camera-based tracking is limited laterally, since the environment is featureless in the side view, but thanks to the higher range and resolution, it is advantageous for longitudinal motion estimation. Thus, the special racetrack environment motivates a complementary fusion for vision-based localization, where the driven distance comes from the camera and the lateral location relative to the track sides comes from the LiDAR.
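A minimal sketch of this complementary weighting, assuming both pipelines report a pose in a track-aligned frame (s = driven distance, d = lateral offset) together with per-axis variances; the numbers below are purely illustrative:

```python
import numpy as np

def fuse_track_frame(z_cam, z_lidar, var_cam, var_lidar):
    """Per-axis inverse-variance fusion in a track-aligned (s, d) frame.

    z_cam, z_lidar:     (s, d) estimates from the two SLAM pipelines
    var_cam, var_lidar: per-axis variances, e.g., a camera trusted
                        longitudinally and a LiDAR trusted laterally:
                        var_cam = (0.2**2, 1.0**2),
                        var_lidar = (2.0**2, 0.05**2)
    """
    w_cam = 1.0 / np.asarray(var_cam)
    w_lidar = 1.0 / np.asarray(var_lidar)
    return (w_cam * np.asarray(z_cam) + w_lidar * np.asarray(z_lidar)) \
        / (w_cam + w_lidar)
```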
3. Localization Experience from A2RL
3.1. Inertial Navigation System Experience
The Humda Lab-SzE team competed in the A2RL (Abu Dhabi Autonomous Racing League), and the following summarizes the experience and insight gained during the event. The GNSS-aided Inertial Navigation System (GNSS/INS) performed spatial navigation with its internal estimation algorithm [9]. The sensor used (VectorNav VN-310, Dallas, TX, USA) includes a compass, an IMU, and a dual-antenna GNSS module. From these sensor signals, we obtained the orientation, speed, and X-Y position, which were used to locate the vehicle around the Yas Marina Circuit. During the pre-competition testing phase, most teams initially adopted this solution due to its simple implementation, which allowed them to spend more time on other core subsystems.
At low speeds, the system worked satisfactorily, allowing the car to navigate the track without any problems. However, as testing progressed, a number of major issues began to surface:
Certain engine speed ranges affected the measured acceleration of the Z-axis because the vibrations from the engine matched or were harmonics of the GNSS/INS natural frequency. These vibrations compromised the accuracy of the acceleration data, making it noisy;
The secondary GNSS antenna showed a notable difference compared to the accuracy of the primary antenna, which was below 0.1 m in RTK (real-time kinematic) fixed mode. Despite several attempts to solve this problem (swapping antennas, changing their position, and shielding the cables), the secondary antenna continued to produce significant errors;
The fusion algorithm within the GNSS/INS sensor was completely opaque and could not be accessed or tuned, limiting the ability to optimize system performance.
These challenges revealed severe limitations in the current system's ability to provide reliable and accurate positioning under competitive conditions: the engine vibration corrupting the Z-axis acceleration, the large difference between the GNSS antennas, and the lack of tunability of the sensor fusion algorithm.
In order to solve the identified problems, we tried several remedies. The rear GNSS antenna was located close to the VR camera and its associated antenna and cable; to reduce this interference, the GNSS cable was shielded. When this approach did not solve the problem, the front and rear GNSS antennas were swapped, and the primary antenna was placed in the less noisy front environment. Despite these efforts, the standard deviation calculated by the GNSS/INS remained substantial, eventually leading to a safety shutdown of the car.
The system's positioning accuracy declined over time due to unforeseen noise factors, causing significant instability in the GNSS/INS-based positioning. The initial precision deteriorated due to sensor drift, environmental interference, and mechanical vibrations. This degradation forced the abandonment of exclusive reliance on the GNSS/INS, leading to the integration of alternative methods and supplementary data sources to restore accuracy and stability.
3.2. Kalman Filter Algorithm
Based on insights gained from the earlier testing phases, the Humda Lab-SzE team decided to bypass the GNSS/INS fusion process and instead focus solely on processing the raw sensor signals. These raw sensor signals were processed with an Extended Kalman Filter (EKF) [10]. In addition to the integration of signals from the GNSS/INS, the data of a ground speed sensor (Kistler Correvit SF-Motion/2059A, Winterthur, Switzerland), which performs longitudinal (X) and lateral (Y) speed measurements based on an optical sensor, were incorporated to improve the state estimator's capabilities.
The 6-degree-of-freedom (DOF) IMU in the ground speed sensor was examined and compared with the one in the GNSS/INS. During the analysis of the data from the two IMUs, the yaw rate of a measured lap was integrated. While the integrated value of the GNSS/INS data was closer to zero, the acceleration values from the Kistler sensor arrived in sync with the velocity values, which facilitated the operation of the algorithm and sped up processing. Only the primary GNSS antenna was actively used; the values from the secondary GNSS antenna were ignored.
Using the raw sensor signals mentioned above, our EKF efficiently calculated the exact X and Y position of the car on the track, as well as its orientation. This method leveraged the strengths of each sensor component, consequently improving localization accuracy under competitive conditions compared to the traditional GNSS/INS fusion approach.
The position data were calculated at 200 Hz based on a rigid body model and were supplemented with a 20 Hz GNSS position. Because the Yas Marina Circuit is mostly horizontal, the calculations were reduced to two dimensions.
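For illustration only, the following is a minimal sketch of such a planar estimator, not the team's actual filter: a three-state EKF predicted at 200 Hz from the body-frame velocities and yaw rate, and corrected at 20 Hz with the primary GNSS position (all noise values are assumptions).

```python
import numpy as np

class PlanarEKF:
    """Minimal 2D EKF sketch with state x = [x, y, yaw].

    Prediction at 200 Hz from body-frame velocities (optical ground
    speed sensor) and gyro yaw rate; correction at 20 Hz from the
    primary GNSS antenna position. Noise values are illustrative."""

    def __init__(self):
        self.x = np.zeros(3)                    # [x, y, yaw]
        self.P = np.eye(3)                      # state covariance
        self.Q = np.diag([1e-4, 1e-4, 1e-6])    # process noise per step
        self.R = np.diag([0.05**2, 0.05**2])    # RTK-grade position noise

    def predict(self, v_x, v_y, yaw_rate, dt=1.0 / 200.0):
        c, s = np.cos(self.x[2]), np.sin(self.x[2])
        # rigid body motion: rotate body velocities into the map frame
        self.x += np.array([(v_x * c - v_y * s) * dt,
                            (v_x * s + v_y * c) * dt,
                            yaw_rate * dt])
        # Jacobian of the motion model with respect to the state
        F = np.array([[1.0, 0.0, (-v_x * s - v_y * c) * dt],
                      [0.0, 1.0, (v_x * c - v_y * s) * dt],
                      [0.0, 0.0, 1.0]])
        self.P = F @ self.P @ F.T + self.Q

    def update_gnss(self, z_xy):
        H = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])          # x and y are observed
        y = np.asarray(z_xy) - H @ self.x        # innovation
        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(3) - K @ H) @ self.P
```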
In conclusion, the approach of bypassing the GNSS/INS fusion process and focusing on raw sensor signal integration yielded significant improvements in position estimation accuracy and processing efficiency. The Humda Lab-SzE team's methodology harnesses the strengths of each sensor component, providing a robust solution for autonomous vehicle localization in competitive scenarios. Further research will explore the integration of additional sensor data to refine the Kalman filter algorithm and enhance localization accuracy even further. The potential application of machine learning techniques to predict and mitigate sensor anomalies will also be investigated.
4. Summary and Conclusions
Based on the experiences from previous competitions, it can be concluded that despite the rapid development of sensors, they have not yet reached the level where we can rely solely on their raw data to achieve localization with the accuracy required for controlling autonomous racecars. Research and development in the field of autonomous technology is continuously ongoing, but data processing and real-time positioning still pose important challenges for experts. It is particularly important to note that racecars travel at much higher speeds than street cars, which demands even greater precision and reliability from the system. Consequently, the selection of sensors is crucial and requires great care, as the improper use of sensors can lead to significant errors.
Despite the technological advancements in sensors and the continuous emergence of new solutions on the market, it is often necessary to supplement and refine sensor data. In addition, even with more expensive sensors, it is worthwhile to enhance the localization accuracy of factory-installed sensor fusion with a vehicle dynamic model built for the specific application. Such a model helps to filter outliers from the noisy raw data received from the sensors. The use of dynamic models increases the system's reliability and accuracy, which is essential for the safe and efficient operation of autonomous racecars.
In summary, although sensor technology is continuously evolving, achieving precise and reliable localization still requires supplementary solutions and models. Due to the specific demands of racecars, the proper selection of sensors and the processing of raw data play a crucial role in successful control and navigation.