1. Introduction
In the fields of intelligent transportation and autonomous driving, particularly in complex urban environments, there is a pressing demand for high-precision and highly reliable navigation and localization systems. Currently, various vehicle localization methods are available, such as the Global Navigation Satellite System (GNSS), Inertial Navigation System (INS), Light Detection and Ranging (LiDAR)-based algorithms, and vision-based algorithms. However, relying on a single localization algorithm cannot ensure continuous and reliable positioning.
For instance, GNSS fails to provide accurate and reliable location information in GNSS-denied environments like urban canyons and tunnels [1,2]. INS can achieve autonomous positioning through gyroscopes and accelerometers over short periods but suffers from position drift over extended durations [3]. Vision-based localization methods are unable to achieve positioning under challenging lighting conditions. LiDAR-based algorithms work by emitting multiple laser beams to measure distances and create environmental point cloud maps for positioning [4,5]. While LiDAR is unaffected by lighting conditions, it encounters difficulties in degraded environments, such as tunnels, where it also fails to achieve reliable localization [6].
The above analysis shows that no single localization algorithm can provide continuous, reliable, and accurate positioning. Ground-Penetrating Radar (GPR), which senses subsurface environmental information, offers a natural advantage over traditional algorithms in adverse weather and low-visibility scenarios. When applied to localization, GPR serves as a valuable complement to existing localization algorithms.
GPR is a non-invasive technology that utilizes electromagnetic waves to detect underground targets. By analyzing variations in reflection time and intensity, the structure of underground targets and the properties of the medium can be determined [7]. The detection process is illustrated in Figure 1. As an efficient subsurface exploration technique, GPR can rapidly and accurately detect underground targets without the need for surface contact. This technology has found widespread application in areas such as road maintenance, geological surveys, and battlefield demining [8,9,10]. Recently, researchers have initiated studies on using GPR to assist with vehicle localization [11].
Research on GPR-based localization algorithms predominantly utilizes a matching mechanism, requiring the construction of a reference database in which GPR images are compared to determine vehicle location. Selecting stable and unique features is essential for the performance of such systems. Cornick et al. from MIT pioneered the study of vehicle localization using GPR image matching [11]. They used the normalized cross-correlation (NCC) as the similarity evaluation strategy for GPR images. By fusing a Localizing Ground-Penetrating Radar (LGPR) system with GPS/INS systems, they achieved decimeter-level or even centimeter-level localization accuracy. Subsequently, MIT’s Ort et al. expanded upon the LGPR system to achieve the localization of autonomous vehicles under various conditions, such as clear weather, rain, snow, and nighttime, demonstrating the GPR system’s strong environmental adaptability [12]. However, full image matching methods require substantial storage and computational resources, limiting their practical applicability. Consequently, researchers have shifted focus to extracting features from GPR images and using these features to perform similarity matching instead of utilizing the entire image. Baikovitz et al. employed a ResNet-18 autoencoder to capture high-dimensional deep learning features from GPR images [13]. Zhang et al. from the National University of Defense Technology (NUDT) used the real and imaginary parts of the Log-Gabor filter to convolve with GPR images and extract phase symmetry features, which demonstrated localization accuracy within 5 pixels across asphalt, cement, and brick roads. However, the operation speed is slow [14]. Ni et al. from the Aerospace Information Research Institute, Chinese Academy of Sciences, applied Faster-RCNN to extract hyperbolic features from GPR images, creating a feature map for matching-based localization. However, due to the sparse distribution and indistinct nature of underground hyperbolic features, there was insufficient localization continuity [15]. Zhang et al. from NUDT further developed TSVR-Net based on SVR-Net, using both B-Scan and D-Scan as network inputs to enhance localization stability. The results showed localization accuracy within 0.1 m. However, compared to the original data, the map storage and computational requirements for network training increased [16].
In summary, most existing GPR image matching algorithms are direct adaptations of visual image matching algorithms. Due to the low resolution of GPR images and the absence of well-defined geometric features, conventional image feature extraction methods are inadequate for effectively describing GPR image characteristics, leading to unsatisfactory matching results. GPR images commonly exhibit strip-like structures, which are abstract representations of actual underground formations. These features are not only largely invariant over time but also distinct across different locations, making them ideal for matching. Shape context, a descriptor used to capture the outlines of shapes, is not constrained by the specific form of the target and can accurately reflect the distribution of sampled points along a contour. This method has been widely applied in object recognition and image matching [17,18,19].
This paper draws on the concepts of shape context and graphical statistics to propose a novel method for extracting Central Dense Structure Context (CDSC) features. By applying threshold segmentation and extremum point extraction to GPR images, stripe structures and pseudo-corner points are obtained. These pseudo-corner points serve as centers for contextual descriptions of the surrounding stripe structures, forming GPR feature descriptors. Based on these descriptors, the feature sets of the target image and the reference image are calculated. Furthermore, a GPR image matching method based on these features is designed for vehicle positioning. Experimental results demonstrate that the proposed algorithm performs well in both positioning accuracy and computational efficiency.
The remainder of this paper is organized as follows:
Section 2 analyzes the characteristics of underground structural features.
Section 3 presents the algorithm for CDSC feature extraction and the feature matching positioning method.
Section 4 provides experimental analysis and accuracy assessment.
Section 5 offers conclusions and a discussion.
2. Analysis of the Characteristics of Underground Structures
The construction materials of roads primarily consist of concrete, asphalt, gravel, and steel reinforcement, providing a rich variety of information about the subsurface. Additionally, cables and pipelines are often buried beneath roads, which can serve as distinctive underground features. Moreover, road defects such as voids and insufficient compaction are pervasive and can also be regarded as unique characteristics of the roadway. A single radar pulse from ground-penetrating radar (GPR) generates an A-scan, as shown in Figure 2a. A continuous sequence of A-scans along the driving direction forms a two-dimensional underground profile, referred to as a B-scan, as shown in Figure 2b. Combining multiple B-scans perpendicular to the driving direction results in a three-dimensional C-scan, as shown in Figure 2c. Compared to B-scans, C-scans not only provide underground information along the driving direction but also capture underground information perpendicular to the driving direction. Therefore, C-scans have an advantage when dealing with shifts in the vehicle’s trajectory. The richness of information increases with the scanning dimensions, but this also introduces significant challenges for data processing. To balance the trade-off between the amount of information and processing time, when vehicles travel within the same lane, B-scans are sufficient to meet the localization requirements. B-scan data were selected as the source for matching and positioning in this paper.
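The relationship between the three scan types can be sketched with nested lists. This is only an illustration: the index order `c_scan[channel][trace][sample]`, the array sizes, and the helper names are assumptions, since the paper does not specify a data layout.

```python
# Hypothetical layout (an assumption): c_scan[channel][trace][sample]
#   channel -> position perpendicular to the driving direction
#   trace   -> position along the driving direction
#   sample  -> two-way travel time (depth) within one radar pulse

def make_c_scan(n_channels, n_traces, n_samples):
    """Build a synthetic C-scan whose amplitudes encode their own indices."""
    return [[[ch * 10000 + tr * 100 + s for s in range(n_samples)]
             for tr in range(n_traces)]
            for ch in range(n_channels)]

def b_scan(c_scan, channel):
    """A B-scan is the 2D profile of one channel along the driving direction."""
    return c_scan[channel]

def a_scan(c_scan, channel, trace):
    """An A-scan is the 1D depth trace produced by a single radar pulse."""
    return c_scan[channel][trace]

c = make_c_scan(n_channels=3, n_traces=5, n_samples=8)
b = b_scan(c, channel=1)
a = a_scan(c, channel=1, trace=2)
print(len(c), len(b), len(a))  # 3 5 8
```

Working on a single B-scan keeps only one channel of this structure, which is the trade-off between information content and processing time described above.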
The underground structural features of the road are represented in GPR B-scan images as alternating bright and dark stripe textures, as shown in the red box in Figure 3. These stripes are horizontally distributed across the image. Additionally, higher-intensity spots appear on each stripe, as indicated by the yellow box in Figure 3. The effective extraction of these features for localization is the primary focus of this paper.
These spots are typically circular or elliptical in shape, with pixel intensities that are similar within the spot but distinct from the surrounding neighborhood. The spots generally correspond one-to-one with underground structures and exhibit a degree of stability over time and space. To verify the stability of GPR image features under varying environmental conditions, six datasets were collected along the same road on three different days: 25 June 2023, 14 July 2023, and 17 July 2023. The weather was sunny on 25 June 2023 and 17 July 2023 and rainy on 14 July 2023. As shown in Figure 4, two datasets were collected each day along the same trajectory, labeled Track 1 and Track 2. The yellow boxes in Figure 4 show that similar underground features are observed across all datasets. A comparison of datasets a, b, e, and f reveals that even after a 22-day interval (25 June to 17 July), the underground structural images remain largely unchanged and stable, thus meeting the requirements for positioning. Similar underground features can also be observed in the two GPR images from datasets c and d, although the similarity is lower. This is because rainfall altered the dielectric properties of the soil, which weakened the signal’s penetration ability. However, the underground features are still measurable.
In summary, using GPR images for positioning offers the following advantages:
- (1) The underground features are rich and highly correlated with spatial locations, providing coverage across most conventional roads, which can serve as natural geographic information for vehicle positioning.
- (2) These underground features are long lasting and stable, with minimal influence from external factors such as weather, resulting in strong adaptability to various positioning environments.
3. Matching Method Based on Central Dense Structure Context Features
This section first introduces the basic process of the proposed method. A feature reference database is built in advance; the GPR images collected during driving are then matched against this database to output the current geographical location, as shown in Figure 5.
GPR detects underground structures by transmitting and receiving electromagnetic waves. However, during vehicle movement, maintaining a constant speed is challenging, which inevitably leads to distance scaling issues in GPR images. To address this, spatial resampling and noise reduction filtering must be applied as part of the preprocessing operations. In the proposed algorithm, when constructing a reference database, the preprocessed images are first transformed from image coordinates to spatial position coordinates before extracting CDSC features. The extracted image features are then stored in a specific format to form the feature reference database.
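The spatial resampling step can be illustrated as follows. This is a minimal sketch that assumes each trace comes with an along-track distance (for example from an odometer) and that linear interpolation between neighbouring traces is acceptable; the paper does not state which interpolation it uses, and the function name and the step value are hypothetical.

```python
from bisect import bisect_right

def resample_b_scan(traces, positions, step):
    """Linearly interpolate traces onto a uniform distance grid.

    traces    -- list of A-scans (each a list of amplitudes)
    positions -- strictly increasing along-track distance of each trace (m)
    step      -- desired uniform trace spacing (m), an assumed parameter
    """
    out = []
    n_steps = int((positions[-1] - positions[0]) / step + 1e-9)
    for k in range(n_steps + 1):
        x = positions[0] + k * step
        # Last trace at or before x, clamped so that trace j + 1 exists.
        j = min(max(bisect_right(positions, x) - 1, 0), len(positions) - 2)
        w = (x - positions[j]) / (positions[j + 1] - positions[j])
        out.append([(1 - w) * a + w * b
                    for a, b in zip(traces[j], traces[j + 1])])
    return out

# Three traces recorded while the vehicle accelerates (uneven spacing).
traces = [[0.0, 10.0], [1.0, 20.0], [3.0, 40.0]]
positions = [0.0, 0.5, 1.5]
uniform = resample_b_scan(traces, positions, step=0.5)
print(uniform)  # [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [3.0, 40.0]]
```

After resampling, each column of the B-scan corresponds to a fixed travelled distance, so images recorded at different speeds become directly comparable.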
Each entry in the feature reference database needs to contain location information $P$, direction $\theta$, and descriptor $D$:
$$F = \{P, \theta, D\}$$
During the positioning stage, the vehicle-mounted GPR captures the GPR images to be matched. These images undergo the same processing steps, including spatial resampling, noise reduction filtering, and CDSC feature extraction, to obtain the feature sequence for matching.
The similarity measurement method and mismatch elimination technique are used to determine the optimal matching position of the feature sequence within the feature reference database, ultimately yielding the positioning results.
3.1. SURF-Based GPR Image Corner Detection
GPR images are affected by factors such as jitter and noise. However, since spots represent strong underground reflective targets, they can be reliably detected by GPR, making them ideal image features. Feature point extraction has been extensively studied in visual navigation, with classic methods including ORB [20], SIFT [21], and SURF [22]. ORB combines the efficiency of FAST and the reliability of the Harris algorithm to detect feature points utilizing the intensity differences between the center and surrounding pixels. However, its performance in detecting spot features in GPR images is suboptimal. SIFT detects local extrema in a DoG pyramid and uses the Hessian matrix of local intensity values as a filter condition for extracting feature points. While SIFT performs well in detecting spots, the construction of the DoG pyramid and the calculation of the Hessian matrix are computationally intensive, resulting in longer detection times. SURF, an improvement on SIFT, is a fast and robust feature point matching algorithm that excels in detecting GPR spot features compared to the other two methods. The algorithm leverages 2D Haar wavelet responses, integral images, and scale-space techniques, providing strong robustness to image rotation, translation, scaling, and noise, while also offering improved speed over SIFT.
The performance of the aforementioned feature point detection algorithms on GPR images is illustrated in Figure 6, with results from SIFT, ORB, and SURF shown from left to right. Among the three algorithms, ORB performs the worst, detecting only a few spots and generating a significant number of false feature points. SIFT and SURF detect a larger number of spot features. However, SURF identifies more feature points with a more uniform distribution, resulting in a more complete description of the image. Therefore, SURF was selected as the feature point detection algorithm in this study.
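Part of SURF's speed advantage comes from integral images, which allow the sum over any rectangular box filter to be evaluated with four lookups regardless of the box size. The following sketch shows only that building block, not a SURF detector; the function names are illustrative.

```python
def integral_image(img):
    """ii[y][x] holds the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, x0, y0, x1, y1):
    """Sum of the original image over columns x0..x1-1 and rows y0..y1-1."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 3)        # whole image
lower_right = box_sum(ii, 1, 1, 3, 3)  # 5 + 6 + 8 + 9
print(total, lower_right)  # 45 28
```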
3.2. Central Dense Structure Context Feature Extraction
The inconsistencies between roadbed layers and material differences result in alternating bright and dark stripes in GPR images. These structural variations within the road naturally provide descriptive information for pseudo-corner points. Therefore, this paper proposes the CDSC algorithm, based on the shape context algorithm, specifically targeting the stripe structures in GPR images.
3.2.1. Shape Context Algorithm
Shape context can describe arbitrary shapes and is widely used in contour matching, as shown in Figure 7. The contour line is uniformly sampled to obtain a set of contour points. For any given point in the contour point set, a polar coordinate system is established centered on that point. The number of contour points falling into small regions around that point is then counted. This process is repeated until each point in the contour point set has been accounted for, resulting in the SC feature.
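The main steps are as follows: first, the image contour is uniformly sampled. Suppose there are $N$ sample points. A point, $p_i = (x_i, y_i)$, is selected, and the relative spatial position, including direction $\theta_{ij}$ and distance $r_{ij}$, between the remaining $N - 1$ sample points $p_j = (x_j, y_j)$ and $p_i$ is calculated:
$$\theta_{ij} = \operatorname{atan2}(y_j - y_i,\ x_j - x_i), \qquad r_{ij} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}, \qquad j \neq i$$
A polar coordinate system is constructed around each sampling point, and the system is divided into $K = n_r \times n_\theta$ grids based on $n_r$ distance and $n_\theta$ directional intervals:
$$0 \le r_{ij} \le R, \qquad 0 \le \theta_{ij} < 2\pi$$
In this formula, $R$ represents the local neighborhood range of the sampling point. The number of sampling points falling into each grid is counted to obtain the shape context feature histogram. The statistical rules for constructing the histogram are as follows:
$$h_i(k) = \#\{\, p_j \neq p_i : (p_j - p_i) \in \operatorname{bin}(k) \,\}, \qquad k = 1, \dots, K$$
In the equation, $h_i(k)$ represents the $k$-th component of the histogram for the $i$-th contour point.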
The shape context algorithm requires uniform sampling of the contours in the original image. However, the stripe structures in GPR images are neither independent nor closed shapes. Therefore, building on the concept of shape context, this paper proposes the CDSC feature extraction algorithm to extract features from these stripe structures.
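For reference, the classical shape context histogram of a single contour point can be sketched as below. Uniform radial bins, the neighbourhood radius, and the bin counts are assumptions made for brevity (the original shape context formulation uses log-polar bins); the function name is illustrative.

```python
import math

def shape_context(points, i, radius, n_r, n_theta):
    """Histogram of the other contour points in a polar grid centred on points[i]."""
    hist = [0] * (n_r * n_theta)
    xi, yi = points[i]
    for j, (xj, yj) in enumerate(points):
        if j == i:
            continue
        r = math.hypot(xj - xi, yj - yi)
        if r >= radius:
            continue  # outside the local neighbourhood
        theta = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
        r_bin = int(r / radius * n_r)  # uniform radial bins (assumption)
        t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
        hist[r_bin * n_theta + t_bin] += 1
    return hist

# Four corners of a unit square; describe corner 0 by the other three.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
h = shape_context(square, 0, radius=2.0, n_r=2, n_theta=4)
print(h)  # [0, 0, 0, 0, 2, 1, 0, 0]
```

Every contour point receives such a histogram, so a shape sampled with N points yields N histograms, each built from the other N - 1 points.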
3.2.2. Central Dense Structure Context Feature Extraction
The CDSC is an improved version of the shape context algorithm, specifically designed for feature extraction from threshold-segmented GPR images centered around pseudo-corner points. The CDSC extraction process is illustrated in Figure 8. Compared to traditional shape context, this approach addresses the challenge of describing non-independent and non-closed images. Additionally, the dense stripe structures provide more neighborhood information for the descriptor.
The GPR image is segmented using the OTSU thresholding algorithm [23], which separates the stripe structures from the background, emphasizing large-scale structural features while ignoring local gradient details. The specific steps are as follows:
Assuming the GPR image $I$ has $L$ grayscale levels, with $n_i$ pixels at the $i$-th grayscale level, the total number of pixels $N$ in the image $I$ is given by the following:
$$N = \sum_{i=0}^{L-1} n_i$$
The probability $p_i$ of a pixel having an $i$-th grayscale level and the mean grayscale level $\mu$ of the entire image are given by the following:
$$p_i = \frac{n_i}{N}, \qquad \mu = \sum_{i=0}^{L-1} i\, p_i$$
By using a grayscale value $t$ as the threshold to segment the image into $C_0$ and $C_1$, where $C_0$ consists of pixels with grayscale levels in the range $[0, t]$ and $C_1$ consists of pixels with grayscale levels in the range $[t+1, L-1]$, the probabilities $\omega_0$ and $\omega_1$ of $C_0$ and $C_1$ occurring, respectively, are given by the following:
$$\omega_0 = \sum_{i=0}^{t} p_i, \qquad \omega_1 = \sum_{i=t+1}^{L-1} p_i = 1 - \omega_0$$
The mean grayscale values $\mu_0$ and $\mu_1$ for $C_0$ and $C_1$, respectively, are given by the following:
$$\mu_0 = \frac{1}{\omega_0} \sum_{i=0}^{t} i\, p_i, \qquad \mu_1 = \frac{1}{\omega_1} \sum_{i=t+1}^{L-1} i\, p_i$$
The mean grayscale value $\mu$ of the image $I$ is given by the following:
$$\mu = \omega_0 \mu_0 + \omega_1 \mu_1$$
The between-class variance for $C_0$ and $C_1$ is given by the following:
$$\sigma_B^2(t) = \omega_0 (\mu_0 - \mu)^2 + \omega_1 (\mu_1 - \mu)^2$$
The optimal threshold $t^*$ is the value of $t$ that maximizes $\sigma_B^2(t)$.
When the grayscale value of the original image $I(x, y)$ exceeds the threshold $t^*$, the pixel value is set to 1; otherwise, it is set to 0. This binarization process effectively reduces the complexity of the image and enhances the efficiency of subsequent operations.
The GPR image is thus divided into two classes: the portion greater than the threshold, which represents the stripe structures (non-zero points), forms the foreground, while the portion less than the threshold is considered the image background. The binarized GPR image emphasizes the existing stripe structures while ignoring minor local variations, because large-scale structural information is more stable than local gradient information.
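The threshold selection and binarization can be sketched in a few lines. This is a plain-Python illustration of the standard Otsu procedure on a tiny integer image, not the authors' implementation; it uses the equivalent form $\omega_0 \omega_1 (\mu_0 - \mu_1)^2$ of the between-class variance, and the function names are illustrative.

```python
def otsu_threshold(image, levels=256):
    """Return the threshold t that maximises the between-class variance."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of class C0 (levels 0..t)
    sum0 = 0  # intensity sum of class C0
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Equivalent form of the between-class variance.
        var = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def binarize(image, t):
    """Pixels above the threshold become 1 (stripe), the rest 0 (background)."""
    return [[1 if v > t else 0 for v in row] for row in image]

image = [[10, 12, 200],
         [11, 210, 205],
         [9, 10, 198]]
t = otsu_threshold(image)
mask = binarize(image, t)
print(t, mask)  # 12 [[0, 0, 1], [0, 1, 1], [0, 0, 1]]
```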
Using spots as the center, the stripe structures are employed to describe the spots. Traditional shape context describes each of the N contour points using the remaining N − 1 contour points. In contrast, this paper adopts a “1 + N” feature construction method, where a single center point is described by N stripe structure points. The specific steps are as follows:
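Let $q_j$ be the $j$-th non-zero point in the binarized image $B$, $j = 1, \dots, N$, with coordinates $(x_j, y_j)$. The set of pseudo-corner points extracted from the GPR image in Section 3.1 is denoted as $c_m$, $m = 1, \dots, M$, with coordinates $(u_m, v_m)$, serving as the center points for the feature descriptors.
Replace the contour points in Equation (5) with the SURF pseudo-corner points $c_m$ and the stripe structure points $q_j$, respectively:
$$h_m(k) = \#\{\, q_j : (q_j - c_m) \in \operatorname{bin}(k) \,\}, \qquad k = 1, \dots, K$$
Next, count the number of $q_j$ points that fall into each of the $K$ grids within the neighborhood range of $c_m$, and obtain the feature vector $\mathbf{h}_m = [h_m(1), \dots, h_m(K)]$.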
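The “1 + N” construction can be sketched as follows: one pseudo-corner point is described by all foreground (stripe) pixels of the binarized image that fall inside its neighbourhood. The uniform radial bins, the radius, the bin counts, and the function name are assumptions for illustration and are not taken from the paper.

```python
import math

def cdsc_descriptor(mask, center, radius, n_r, n_theta):
    """Count stripe pixels of a binary mask in a polar grid around one center."""
    cx, cy = center
    hist = [0] * (n_r * n_theta)
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if not v or (x, y) == (cx, cy):
                continue  # background pixel, or the center itself
            r = math.hypot(x - cx, y - cy)
            if r >= radius:
                continue  # outside the neighbourhood range
            theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
            r_bin = int(r / radius * n_r)  # uniform radial bins (assumption)
            t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
            hist[r_bin * n_theta + t_bin] += 1
    return hist

# Toy binarized B-scan with three horizontal stripes.
mask = [[1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1]]
desc = cdsc_descriptor(mask, center=(2, 2), radius=3.0, n_r=2, n_theta=4)
print(sum(desc))  # 14 stripe pixels lie within the neighbourhood
```

Unlike classical shape context, the described point does not have to lie on a sampled contour, so the descriptor also works for the open, non-independent stripes found in GPR images.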
3.3. Localization Algorithm Based on CDSC Features
As shown in Figure 9, nine pseudo-corner points are selected from each image to extract CDSC features. The extracted features from both images are observed to be very similar.
Therefore, CDSC descriptors are used as the feature representation for the images, and the similarity of CDSC features is used to describe the similarity between the two images. During the matching and positioning process, the features are extracted from the image to be matched. This region’s features are then compared with a larger range of image features in the feature reference database. This approach helps minimize both false matches and missed matches. The specific process is illustrated in Figure 10.
- (1) CDSC Feature Matching
Nearest Neighbor Distance Ratio (NNDR) is a metric used to evaluate whether two feature descriptors are similar. When a match is correct, the distance between the two feature descriptors will be smaller than the distance between mismatched descriptors. Therefore, the ratio of the distance to the nearest neighbor to the distance to the second nearest neighbor is calculated. If this ratio is below a certain threshold, the match is considered correct. The specific implementation of this method is as follows:
Let $\mathbf{a} = (a_1, \dots, a_K)$ be the feature vector from the reference database and $\mathbf{b} = (b_1, \dots, b_K)$ be the feature vector extracted from the image to be matched. The distance between the two descriptors is measured using the Euclidean distance:
$$d(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_{k=1}^{K} (a_k - b_k)^2}$$
The ratio is the following:
$$\rho = \frac{d_1}{d_2}$$
where $d_1$ and $d_2$ are the distances from $\mathbf{b}$ to its nearest and second nearest neighbors in the reference database. If $\rho$ is less than the specified threshold, it indicates a successful match between the two feature vectors.
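The ratio test can be sketched with a brute-force nearest-neighbour search. The threshold of 0.8 is an assumed value commonly used for ratio tests, not one reported in the paper, and the function names are illustrative.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nndr_match(query, reference, ratio=0.8):
    """Return (query_index, reference_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        dists = sorted((euclidean(q, r), ri) for ri, r in enumerate(reference))
        if len(dists) < 2:
            continue  # the ratio needs two neighbours
        (d1, best), (d2, _) = dists[0], dists[1]
        if d2 > 0 and d1 / d2 < ratio:
            matches.append((qi, best))
    return matches

reference = [[1, 0, 0, 5], [0, 4, 1, 0], [2, 2, 2, 2]]
query = [[1, 0, 0, 4],   # clearly closest to reference[0]
         [1, 3, 1, 1]]   # ambiguous between reference[1] and reference[2]
matches = nndr_match(query, reference)
print(matches)  # [(0, 0)]
```

The second query descriptor is rejected because its nearest and second nearest neighbours are almost equally distant (ratio about 0.87), which is exactly the ambiguity the ratio test is meant to filter out.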
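- (2) Elimination of False Matches
Each matching process yields a set of data points, but not all points in this set are correctly matched. Therefore, it is necessary to remove incorrect matches from the data set to improve matching accuracy. The Random Sample Consensus (RANSAC) algorithm can identify correct data from a set containing outliers by iteratively refining the dataset to exclude anomalous data [24]. Let the point set obtained from a single matching be denoted as follows:
$$S = \{(p_i, p_i')\}, \qquad i = 1, \dots, n$$
In this context, $p_i$ represents the points in the feature database with coordinates $x_i$ and $y_i$, while $p_i' = (x_i', y_i')$ denotes the points in the image to be matched. In the RANSAC algorithm, correct data points are classified as inliers ($S_{\mathrm{in}}$), and mismatched points are classified as outliers ($S_{\mathrm{out}}$). The first step is to find the optimal homography matrix $H$:
$$s \begin{bmatrix} x_i' \\ y_i' \\ 1 \end{bmatrix} = H \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}$$
Here, $s$ is the scale parameter. Typically, $h_{33}$ is set to 1 to normalize the matrix $H$. Thus, the matrix has eight unknowns, and solving it requires four pairs of matching points. The algorithm selects four pairs of sample points from the data set, ensuring that the points are not collinear, computes the homography matrix, and evaluates the cost function $J$:
$$J = \sum_{i=1}^{n} \left[ \left( x_i' - \frac{h_{11} x_i + h_{12} y_i + h_{13}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2 + \left( y_i' - \frac{h_{21} x_i + h_{22} y_i + h_{23}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2 \right]$$
When the cost function is minimized, the final output is the set of matched points:
$$S_{\mathrm{in}} = S \setminus S_{\mathrm{out}}$$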
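The outlier-removal step can be sketched with a small, dependency-free RANSAC loop. Note the assumptions: hypotheses are scored by inlier count under a reprojection tolerance (the common RANSAC criterion) rather than by the paper's cost function, degenerate samples are simply skipped when the linear system is singular, and the iteration count, tolerance, seed, and function names are all illustrative.

```python
import math
import random

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting; None if singular."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-9:
            return None
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * m for a, m in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def homography(pairs):
    """Estimate H (row-major, h33 = 1) from four pairs ((x, y), (u, v))."""
    A, b = [], []
    for (x, y), (u, v) in pairs:
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b)
    return None if h is None else h + [1.0]

def project(h, p):
    """Apply H to point p; None if the point maps to infinity."""
    x, y = p
    w = h[6] * x + h[7] * y + h[8]
    if abs(w) < 1e-12:
        return None
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)

def ransac(pairs, iters=200, tol=1.0, seed=0):
    """Return the largest set of pairs consistent with one homography."""
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        h = homography(rng.sample(pairs, 4))
        if h is None:
            continue  # degenerate (for example collinear) sample
        inliers = []
        for src, dst in pairs:
            q = project(h, src)
            if q is not None and math.hypot(q[0] - dst[0], q[1] - dst[1]) < tol:
                inliers.append((src, dst))
        if len(inliers) > len(best):
            best = inliers
    return best

# Eight correct matches (a pure shift) plus three gross mismatches.
true_pairs = [((x, y), (x + 5.0, y - 2.0)) for x, y in
              [(0, 0), (10, 0), (0, 10), (10, 10), (5, 3), (2, 7), (7, 6), (3, 1)]]
bad_pairs = [((4, 4), (50.0, 50.0)), ((6, 2), (-30.0, 7.0)), ((1, 9), (20.0, -40.0))]
inliers = ransac(true_pairs + bad_pairs)
print(len(inliers))
```

Only the matches that agree with a single geometric transformation survive, and their positions in the reference database then determine the vehicle location.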
5. Conclusions
This paper presented a novel method for extracting features from GPR images, namely the Central Dense Structure Context (CDSC) features, which takes full advantage of the pseudo-corner and strip-like elements of GPR images. A complete and reliable matching and positioning method was also designed based on CDSC, which reduces the storage size of the reference image data and improves the efficiency of the matching calculation. The proposed vehicle matching method primarily includes steps such as extracting pseudo-corner points using SURF, CDSC descriptor extraction, feature matching, and outlier removal.
The proposed algorithm was evaluated on datasets collected from urban roads and railway tracks. The results show that, compared to traditional methods, the proposed method demonstrates advantages in terms of localization accuracy, the number of matching point pairs, and precision. This paper provides new ideas for research on ground-penetrating radar navigation and positioning.
Achieving vehicle localization using GPR is a challenging task. Although the proposed algorithm improves localization accuracy and operational efficiency compared to traditional methods, it is still some way from achieving real-time localization output. Future research will focus on optimizing the threshold segmentation and graphical statistical strategies to further enhance the algorithm’s real-time performance. Additionally, the suppression of image interference and the assessment of matching availability are areas of interest for future research.