Road Intersection Extraction Based on Low-Frequency Vehicle Trajectory Data

Du, Jiusheng; Liu, Xingwang; Meng, Chengyang

doi:10.3390/su151914299

Open AccessArticle

Road Intersection Extraction Based on Low-Frequency Vehicle Trajectory Data

by

Jiusheng Du

^*

,

Xingwang Liu

and

Chengyang Meng

School of Surveying and Land Information Engineering, Henan Polytechnic University, Jiaozuo 454003, China

^*

Author to whom correspondence should be addressed.

Sustainability 2023, 15(19), 14299; https://doi.org/10.3390/su151914299

Submission received: 21 August 2023 / Revised: 24 September 2023 / Accepted: 26 September 2023 / Published: 27 September 2023

(This article belongs to the Special Issue Big Data Analytics in Sustainable Transport Planning and Management)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Global navigation satellite system (GNSS) vehicle trajectory data play an important role in obtaining timely urban road information. However, most models cannot effectively extract road information from low-frequency trajectory data. In this study, we aimed to accurately extract urban road network intersections and central locations from low-frequency GNSS trajectory data, and we developed a method for accurate road intersection identification based on filtered trajectory sequences and multiple clustering algorithms. Our approach was founded on the following principles. (1) We put in place a rigorous filtering rule to account for the offset characteristics of low-frequency trajectory data. (2) To overcome the low density and weak connection features of vehicle turning points, we adopted the CDC clustering algorithm. (3) By combining the projection features of orientation values in 2D coordinates, a mean solving method based on the DBSCAN algorithm was devised to obtain intersection center coordinates with greater accuracy. Our method could effectively identify urban road intersections and determine the center position and more effectively apply low-frequency trajectory data. Compared with remote sensing images, the intersection identification accuracy was 96.4%, the recall rate was 89.6%, and the F-value was 92.88% for our method; the intersection center position’s root mean square error (RMSE) was 10.39 m, which was 14.9% higher than that of the mean value method.

Keywords:

trajectory data; road intersections; vehicle turning points; heterogeneous density clustering; DBSCAN clustering; intersection center location

1. Introduction

Frequent road blockages and traffic accidents have posed new challenges and requirements for modern intelligent transportation systems [1]. Because electronic maps are the basis for building intelligent transportation systems, determining how to use technologies to obtain and update road network information automatically and in real time has become an important research topic that has garnered widespread concern from scholars across the world [2,3]. Many problems in urban construction demand the prompt renewal of the urban road network. Currently, most road intersections based on remote sensing images are generated for a single road type, and the lack of generality of the method leads to large differences in accuracy. Moreover, the update cycle of remote sensing images is long, making it impossible to obtain real-time road network information based on images [4,5]. The same problems of low update frequency and high cost exist with the method of measuring vehicles [6]. Thus, the traditional road network update method has not yet met the need to capture changes in the road network in a timely manner [7]. The rapid development and popularization of mobile internet and embedded global navigation satellite system (GNSS) smart devices have promoted location-based service (LBS) and made it possible to obtain large-scale, real-time trajectory data. A vehicle’s GNSS trajectory is a record of the vehicle’s path, containing rich road information (e.g., lanes, turns, speed limits, road widths, and road intersections) that directly reflects the road network’s geometric characteristics and provides a new database for road intersection extraction [8,9]. Therefore, an increasing number of scholars are utilizing vehicle GNSS trajectory data in tandem with machine learning to extract road geometry data and examine vehicle behavior, among other applications [10,11,12,13,14,15,16]

The traditional road intersection detection algorithm takes the vehicle trajectory’s unique turning information and speed information at the road intersection as the benchmark, extracts the turning points after ensuring the intersection’s accuracy, and then extracts the road intersection on the basis of the turning points’ clustering. Qixing developed a scale- and orientation-invariant traj-SIFT feature to localize and recognize junctions using a supervised learning framework [17]. However, when the accuracy of the trajectory data was low, good intersection results could not be obtained. Zhang et al.’s method enabled the rapid identification of intersections [18]. Nonetheless, their experimental findings revealed that certain intersection coordinates deviated from the center. Because empirical models are too precise and complex, such algorithms are only applicable to high-frequency trajectory data; efficiency and accuracy cannot be guaranteed for extracting turning points from low-frequency trajectory data [17,18,19,20,21,22]. Moreover, the higher accuracy of trajectory data ensures the extraction accuracy of turning points to a certain extent and reduces the clustering algorithm’s requirements, enabling road intersection extraction using only traditional clustering algorithms, such as density-based spatial clustering of applications with noise (DBSCAN). However, in the face of the more common low-frequency GNSS trajectory data, methods to accurately and effectively extract road information remain the key to achieving low-cost, real-time road updates [23,24,25].

Facing the above challenges, we aimed to overcome the disadvantages of low-frequency trajectory data, such as low localization accuracy and uneven data distribution; thus, we established a road intersection extraction method applicable to low-frequency trajectory data. The method considers the calculation of the intersection center location while accurately identifying intersections. The algorithm consists of three steps. First, it sets the trajectory sequence matching rules to filter the turning point sequence, and it fits the vehicle turning point as the suspected location of the intersection. Then, it uses the clustering algorithm based on the heterogeneous density to identify the intersection based on the vehicle turning point. Finally, it mines the turning direction information around the turning point sequence of the same intersection and uses DBSCAN clustering for the turning direction to determine the intersection center location. This paper proposes a road intersection extraction method from low-frequency trajectory data. The main contributions of the proposed method can be summarized in the following aspects.

(1): We propose a method for the detection of intersection trajectory sequences that is based on continuous directional changes of vehicle trajectories. Experiments showed that our method can effectively extract trajectory sequences at intersections from low-frequency trajectory data.
(2): We used the CDC clustering algorithm to cluster intersections, which effectively identified intersections with low density and adjacent intersections with weak connectivity. Experiments showed that our method achieved higher accuracy compared with some existing intersection extraction methods.
(3): We propose a direction-based approach for calculating intersection center coordinates, using the DBSCAN clustering algorithm to classify vehicle turning points at the same intersection in different directions according to their turn directions. Then, on the basis of the clusters after division, the intersection center coordinates were solved.

The remainder of this paper is organized as follows. Section 2 reviews related studies. Section 3 covers the details of the implementation of the model. Section 4 presents the experimental datasets, intersection detection results, and detailed analyses. Section 5 summarizes the work and describes key areas for future work.

2. Related Work

2.1. Extraction of Roads Based on Remote Sensing Data

In the early stages of research on intelligent transportation systems, because of optical remote sensing images’ advantages, such as their macroscopic view, realism, massiveness, and multiple sources, most roads were extracted on the basis of remote sensing images [26,27,28,29]. The primary research methods were mainly divided into four categories: (1) the matching set template method, which establishes a template based on certain rules for metric analysis to achieve road updates; (2) the knowledge-driven method, which generates a knowledge model related to roads and establishes a hypothesis-testing model between knowledge and image processing results to extract roads; (3) the object-oriented method, which extracts road information by performing image segmentation and image classification, enabling the road information to be extracted by image segmentation and image classification; and (4) using the deep learning method, the road information is extracted by decoding remote sensing images through deep learning. However, the limitations of bad weather conditions and ground target occlusion conditions limit the effectiveness of road extraction methods based on remote sensing images.

2.2. Extraction of Roads Based on LiDAR Data

The method of acquiring data sources through light detection and ranging (LiDAR) provides new possibilities for road information extraction. The 3D coordinates of a large number of target points around the road are acquired, and the formed point cloud is geometrically processed to extract the specific sign information and facility information of the road, such as streetlights, traffic signs, and street trees [30,31,32,33]. This type of method focuses on the completeness and accuracy of the extracted road information and is suitable for different environmental conditions; however, the high cost of data acquisition makes it unsuitable for overall urban road information extraction.

2.3. Extraction of Roads Based on Trajectory Data

The high accessibility of crowd-sourced trajectory data has made such data the new basis for road extraction. According to the type of spatial data transformed by trajectory data, existing road intersection identification methods based on crowd-sourced trajectory data can be divided into two categories. In the first category, vehicle travel information is extracted in a vector space, and a matching mechanism is established between it and traditional road information to extract information on road intersections. For example, Luliang et al. set up a trajectory point matching rule to extract steering point pairs and identify road intersections by clustering steering point pairs and local point-based connectivity [34]. Wan et al. constructed a decision tree classification model for trajectory segments and clustered the trajectory segments according to the Hausdorff distance, thus identifying intersections [35]. Liu et al. detected road intersection center coordinates using a classification model and a clustering algorithm and then calculated the intersection radius by combining the Delaunay triangle network and a circular structure [36]. Furthermore, Deng et al. analyzed intersection hot spots with the local General index and then used hierarchical clustering of trajectories to identify the turning patterns of road intersections [20]. Chen et al. used road intersection geometric features and hot-spot area features to extract a set of turning points and construct position compensation rules for the turning points to determine the intersection location [6]. Similarly, Wan et al. used the long short-term memory (LSTM) network to detect turning trajectory segments from GNSS trajectories and then clustered the segments to obtain the intersection [37]. In the second category, the trajectory data are rasterized, and a mathematical morphology-based approach is used to achieve road intersection extraction. For example, Davies et al. rasterized trajectory points and extracted road information using density estimation methods [38]. Hu et al. rasterized the trajectory data with different resolutions, filtered and refined by mathematical morphology, and fused the information on intersections extracted with different resolutions according to the rules to obtain the best road intersection results [39]. Moreover, Zhang and Xiang et al. explored road intersection information extraction by fusing multiple methods [20,22,40,41,42,43,44,45]: (1) taking the density peak clustering (CFDP) of trajectory data in the vector space and obtaining intersections on the basis of a mathematical morphology in a raster space and (2) establishing a fusion mechanism to finally obtain road intersection information. Multilevel fusion is also performed on the basis of two data sources, vehicle trajectory and remote sensing images, in the extraction of seed intersections, collaborative training, and integrated recognition to extract the vehicle trajectory intersection features and remote sensing images to identify road intersections.

3. Materials and Methods

The proposed method’s framework is shown in Figure 1. First, according to the differences in vehicle driving characteristics at the intersection, trajectory matching rules were set, vehicle turning point sequences were extracted to fit vehicle turning points, and most of the fitted vehicle turning points were clustered around the intersection. Second, the intersection was detected using the clustering algorithm using the local direction centrality (CDC) for the vehicle turning points. Finally, the vehicle turning points at the same intersection were solved by the hierarchical averaging of the DBSCAN clustering algorithm according to the turning direction characteristics, and the center coordinates of the intersection were calculated.

3.1. Trajectory Data Preprocessing

3.1.1. Redundant Data Handling

Vehicles stopping at the side of the road or waiting at red lights at intersections and traffic jams produce invalid GNSS trajectories, resulting in inefficient data processing. According to the frequency of taxi positioning equipment, the GNSS track points that stall for more than two consecutive points were considered redundant and were removed. At the same time, the vehicle’s normal driving speed was limited; thus, the trajectory points that deviated from the normal driving state were removed by setting an upper and a lower threshold for the vehicle speed. The positioning device does not directly obtain the driving speed of the taxi; instead, it calculates the driving speed of the taxi on the basis of the latitude and longitude coordinates and time information of the track point and the two points before and after it. According to the normal city driving speed limit, the minimum speed threshold (Vmin) was set to 8 m/s, and the maximum speed threshold (Vmax) was set to 17 m/s. The trajectory points whose speed exceeded the threshold value were rejected or disconnected according to the situation.

3.1.2. Noise Point Filtration

Inaccurate position information or an unstable equipment power supply can stem from the generally poor signal of the taxis’ positioning equipment, as well as high buildings, dense vegetation on both sides of the road, and tunnels interfering with the positioning equipment signal (producing a multi-path effect). All these reasons cause the coordinates recorded by GNSS to shift so that the trajectory of its positioning significantly deviates from the trajectory sequence, resulting in a large number of noise points in the collected GNSS trajectory. Considering that the data volume of trajectory points on a single day is in the tens of millions, this paper adopted the k-nearest neighbors (KNN) algorithm based on the kd tree, considering the processing efficiency while judging the distance of the n nearest points of the trajectory points. If the farthest distance (R_n) exceeded the distance threshold (R_min), it was judged as a noisy point and eliminated (as shown in Figure 2a). The red circle represents the farthest distance.

3.1.3. Calculation of the Vehicle Steering Angle

When a vehicle completes a turn at a road intersection, a significant change in the travel direction occurs, and this is quantified by measuring the difference in the steering angle before and after the trajectory. To extract the sequence of steering points using the trajectory data, we must judge the magnitude of the change in the vehicle’s steering angle in the original trajectory. As the heading data of the trajectory points did not exist in the original data, the steering angle of the vehicle was calculated after the vehicle’s heading was obtained from the calculation. According to the experiment’s accuracy requirements, the trajectory coordinates were converted from a GCJ-02 encrypted coordinate system to a WGS1984_UTMZONE48N (Mercator projection 48N index band of WGS 1984) coordinate system. The calculation of the heading angles θ₁ and θ₂ was then carried out, and the vehicle turning angle β was calculated from the change in the heading angle.

The heading angle θ₁ of a certain vehicle during a turn was solved from the original data of the two points P₁ and P₂, and the heading angle θ₂ of the next trajectory point P₂ was solved from the original data of the two points P₂ and P₃. β for trajectory point P₂ is the variation of θ₁ and θ₂ (as shown in Figure 2b), and the vehicle steering angle for P2 was solved from the raw data of P₁, P₂, and P₃.

3.1.4. Turning Direction of the Steering Point Sequence

The turning direction of the vehicle turning point near the intersection comprises the location information of the vehicle turning point. The turning direction of the trajectory sequence was calculated to provide the antecedent data for the subsequent solution of the center position of the road intersection. For the filtered trajectory sequence, the turning direction is calculated as follows:

θ_{c e n t r i} = \{\begin{matrix} \frac{\sum_{i}^{j} θ_{n}}{j - i} + \frac{\sum_{k}^{l} θ_{n}}{l - k} - 180, \frac{\sum_{i}^{j} θ_{n}}{j - i} \geq 180 \\ \frac{\sum_{i}^{j} θ_{n}}{j - i} + \frac{\sum_{k}^{l} θ_{n}}{l - k} + 180, \frac{\sum_{i}^{j} θ_{n}}{j - i} < 180 \end{matrix}\}

(1)

where i is the first trajectory point at the front of the vehicle steering sequence, j is the last trajectory point at the front of the vehicle steering sequence, k is the first trajectory point at the rear of the vehicle steering sequence, l is the last trajectory point at the rear of the vehicle steering sequence, and θ_n is the heading angle of the nth trajectory point.

3.2. Intersection Extraction Based on Low-Frequency, Low-Precision Trajectory Data

3.2.1. Steering Point Sequence Extraction and Vehicle Steering Point Solving-Based Trajectory Matching

The low-frequency, low-precision taxi trajectory data used in this paper had a positioning accuracy of 10–20 m and a sampling frequency of 10–60 s. According to current research, the duration of a vehicle passing through an intersection is only 8–20 s [34], and only 1–3 trajectory points exist in the vehicle passing through the intersection in this dataset; thus, effective steering features cannot be extracted from the trajectory points passing through the intersection alone. To extract steering features from low-frequency GNSS trajectory data, we propose the concept of the steering point sequence. A steering point sequence is a continuous sequence of GNSS trajectories that includes steering points passing through an intersection and matches the magnitude of the vehicle’s heading angle change at the intersection. A vehicle completing a turn at an intersection leaves at least one and at most three trajectory points. A data-extraction-based approach was used to establish matching rules for the characteristics of the trajectory point sequence at the intersection, with a selection range of 5–7 trajectory points.

After multiple experiments and comparing the extracted vehicle turning points, the turning threshold was ultimately determined. The first 2–3 trajectory points were the front end of the vehicle steering sequence, and the cumulative steering threshold was set to 5°; the middle 1–3 trajectory points were the middle end of the vehicle steering sequence, and the cumulative steering threshold was set to “30°,150°”; the back 2–3 trajectory points were the back end of the vehicle steering sequence, and the cumulative steering threshold was also set to 5°. The sequence of trajectory points shown in Figure 3 was obtained. We used the front end of the trajectory sequence to fit the vehicle’s straight-line driving trajectory before entering the intersection, and we used the back end of the trajectory sequence to fit the vehicle’s straight-line driving trajectory after entering the intersection. The intersection of the front and rear straight-line trajectories was then regarded as the vehicle turning point at the intersection of the vehicle’s trajectory sequence, which was closer to the center of the intersection than the original trajectory point. This method not only effectively extracts the location and steering information of the vehicle movement but also improves the vehicle steering points’ accuracy and reliability near the intersection. Owing to the constrained steering angle and filtering against the straight lines of the trajectory sequence, the vehicle trajectory points near the ramp were excluded, and most vehicle steering points were concentrated at the road intersection. It is worth mentioning that this filtering method, because of its strong constraints on the trajectory sequence, can effectively avoid entering the vehicle into places like gas stations and parking lots that also require turning.

3.2.2. Vehicle Steering Point Clustering Based on the CDC Clustering Algorithm

After preprocessing the intersection trajectory data and extracting the vehicle turning points, we observed that the extracted vehicle turning points were distributed near the road intersection. The vehicle turning points extracted from the low-frequency, low-precision trajectory data were not completely clustered at the intersection, and most of the clusters exhibited less density the further away they were from the intersection center, but there were still some vehicle turning points around the intersection with lower density. Density-based clustering algorithms can usually better accommodate different densities, but they cannot identify lower-density intersections. If the DBSCAN algorithm is used, the clustering parameters are more sensitive and difficult to tune, while the ordering points to identify the clustering structure (OPTICS) algorithm makes it easy to adjust the parameters, but the clustering results are still unsatisfactory for intersections with different coverage areas. Moreover, certain intersections have fewer vehicle turning points, and the density peak clustering algorithm is susceptible to localized high-density areas, resulting in clusters not being formed in low-density areas to detect intersections. Therefore, in this paper, the CDC algorithm was used to cluster vehicle turning points [46].

The CDC algorithm is based on the KNN distribution, which defines the boundary points and interior points of a cluster by first determining the top k nearest points. The interior points of a cluster are surrounded by their neighbors in all directions, while the boundary points are only surrounded by their neighbors in a certain direction, and the boundary points depict the shape of the cluster and form a “cage” around the interior points. The following concepts are introduced to describe the CDC algorithm.

The k neighboring points of each point form k angles in the direction α₁, α₂, …, α_k (Figure 4a). The uniformity of the directional distribution is judged by describing the difference in the directional distribution, and the angular variance formed in the 2D space is defined as the local direction centrality metric (DCM). The DCM is calculated as follows:

D C M = \frac{1}{k} \sum_{i = 1}^{k} {(α_{i} - \frac{2 π}{k})}^{2}

(2)

The value of the DCM reaches a minimum value of 0 when the k neighbors of a point are uniformly distributed in all directions, while the value of the DCM keeps increasing as the k neighbors are clustered along a certain direction, with a maximum value of

\frac{4 (k - 1) π^{2}}{k^{2}}

. The DCM is normalized as follows:

{D C M}_{m} = \frac{k}{4 (k - 1) π^{2}} \sum_{i = 1}^{k} {(α_{i} - \frac{2 π}{k})}^{2}

(3)

The boundary reachable distance (r_i) is the minimum distance between an interior point and all boundary points and is given by

r_{i} = \min_{j = 1}^{n - m} d (p_{i}, q_{j})

(4)

d(p_i, q_j) is the Euclidean distance between two interior points. If the Euclidean distance between two points is not greater than the sum of the reachable distances at the boundaries of the two points (Figure 4b), then the two interior points can be connected as the same clustering cluster. Furthermore, it is possible to avoid points from different clusters connecting to each other.

d (p_{i}, q_{j}) \leq r_{i} + r_{j}

(5)

Considering that the DCM of internal points in the same cluster was lower than the DCM of the boundary points, the threshold T_DCM after several experiments was used to classify points higher than T_DCM as internal points and lower than T_DCM as external points, and then, the clusters were divided by the comparison of the reachable distance of the boundary (Figure 4c). We presented the formulas derived from the original CDC article [46], which describe the entire algorithm procedure, including the calculation of DCM and the reachable distance, as well as the association rule.

3.2.3. DBSCAN-Based Intersection Center Solution

A sequence of vehicle trajectories passing through an intersection is depicted in Figure 5. The distribution of vehicle turning points in the same intersection in the actual trajectory is not uniform; thus, using only the average of the vehicle turning point coordinates as the center of the intersection produces deviations from the actual location. Vehicle turning points fitted to vehicle trajectory sequences in similar intra-turn directions typically have smaller Euclidean distances between them and are aggregated.

By clustering the turn directions (Figure 6a) of the vehicle turning points, we can classify the vehicle turning points according to their directions. Considering the continuity between 0° and 360° in terms of direction, clustering alone cannot determine this continuous rule. Therefore, the range of angles within a turn, (0°, 360°), is mapped onto a unit circle in a two-dimensional Cartesian coordinate system, represented by x² + y² = 1, to ensure the continuity of direction during clustering (Figure 6b). Direction values such as 355° and 5° near the coordinates (0, 1) in the two-dimensional Cartesian coordinate system correspond to (−0.087, 0.996) and (0.087, 0.996), respectively. As a result, they exhibit similar features in terms of Euclidean distance measurement. Among the many clustering methods, DBSCAN has the property of clustering nonlinear relational clusters on the basis of core distances; therefore, DBSCAN clustering was used to classify vehicle turning points in the same intersection according to direction (Figure 6c). The clustered plane coordinates are shown in Figure 6d. The mean value of the vehicle turning points for each cluster was calculated, and then, the mean value of the cluster coordinates was calculated to determine the center of the intersection (Figure 6e). Different colours represent different types of clusters. The results of the experiment in Section 4.3.2 validated the effectiveness of the proposed method validated the effectiveness of the proposed method.

3.2.4. Time Complexity Analysis

To assess the computational efficiency of our method, the time complexity was analyzed. The runtime of our method can be decomposed as:

{T = T}_{1} {+ T}_{2} {+ T}_{3}

(6)

where T₁, T₂, and T₃, denote the runtime of vehicle steering point extraction, vehicle steering point clustering, and intersection center solution. As described in the procedure, the step of vehicle steering point extraction includes the steering point sequence extraction and vehicle steering point solution:

O (T_{1}) = O ((1 + 1 + 1) q + c) = O (q)

(7)

where q refers to the total number of points, and c refers to the number of vehicle steering points. The vehicle steering point clustering includes the division of internal and boundary points, internal connection, and boundary assignment, respectively. Therefore, the time complexity of T₂ can be represented as:

O (T_{2}) = O (k n^{2} + m n + m n) = O (k n^{2})

(8)

where n refers to the total number of vehicle steering points, and m refers to the number of identified internal points. The intersection center solution includes the DBSCAN algorithm and the calculation of intersection center coordinates:

O (T_{3}) = O (s (n \log n + l m + l + 1)) = O (s n \log n)

(9)

where s refers to the number of clusters. The specified parameters s and k are commonly much <n; hence, the total time complexity of our method is approximately O(n²).

4. Experiments, Results, and Discussion

4.1. Experimental Dataset

In this study, the GNSS trajectory tracking dataset of about 14,000 taxis in Chengdu, Sichuan Province, in August 2014 was selected as the experimental dataset (see Figure 7a for details). The dataset included the trajectory data for 26 days in August, and the raw data recorded the taxi ID, passenger carrying status, time information, and latitude and longitude coordinate data under the GCJ-02 coordinate system, as shown in Table 1. The number of taxis in a single day was about 13,000, the positioning accuracy was 10–20 m, and the sampling frequency was 10–60 s. The frequency and accuracy were lower than those of the Wuhan dataset, Chicago dataset, and Cologne dataset used in the literature [20,22,35], and they were more in line with the low-quality GNSS space–time trajectories generated from daily residential trips. Considering the data volume requirements of the experiment, we selected 1 week of trajectory data for testing, aiming to verify the accuracy of the developed road intersection extraction method. Using the visual interpretation results of the contemporaneous remote sensing images (Figure 7b) as the benchmark, we compared the extraction results with the road intersection extracted by OpenStreetMap and other algorithms during the same period.

4.2. Experimental Parameter Settings and Experimental Contents

In the data pre-processing stage, first, the speed limit of the city was used to reject the trajectory that deviated from the speed interval of 8–17 m/s. Second, for the deviation of coordinate information caused by faults and disturbances, the KNN algorithm was used to calculate the distance and judge whether the farthest distance of the n = 5 points of the trajectory exceeded R_min = 6 m for noise point judgment and rejection. The density of the trajectory points on major traffic arteries was high and evenly distributed, while the density of the trajectory points on secondary road areas was low, and some noisy points existed in non-road areas, which were difficult to avoid. The application of the KNN-based noise removal algorithm could effectively eliminate most of the noisy points and improve the accuracy of the trajectory data.

In the intersection extraction stage, the steering point sequence was first extracted according to the set steering point sequence matching rules, after which the straight-line driving trajectories in and out of the intersection were fitted separately, with the intersection point being used as the vehicle steering point. The extracted vehicle steering points are shown in Figure 8. In Figure 8, type 1 is a common turning point cluster that can be identified using common clustering algorithms; type 2 is a cluster with low density that cannot be identified using density-based clustering algorithms; and type 3 comprises two clusters with weak connectivity that cannot be separated efficiently using the DBSCAN algorithm. The CDC clustering method was used for vehicle turning points to separate clusters to identify intersections, as shown in Figure 9. In the intersection center calculation stage, DBSCAN was used to classify clusters of vehicle turning points at the same intersection according to direction, and the intersection center coordinates were solved for by averaging the clusters. One of the intersection clusters based on direction is shown in Figure 10. In this paper, the test parameters were evaluated by adjusting the test parameters several times, resulting in the selection of the test parameters shown in Table 2.

4.3. Results and Analysis

4.3.1. Intersection Extraction

The proposed method could extract road intersections quickly and accurately, and it performed well in fitting the tianditu map road network data. In the test area, the actual number of intersections was 183, and the proposed method extracted 170 intersections, accurately identified 164 intersections, and could not identify 19 intersections. Thus, more than 96.4% of the detection results represent accurately identified real intersections, which is more accurate than the traditional method of extracting vehicle turning feature points. In addition, the accuracy of the method proposed in this paper was confirmed using the accuracy rate (P), recall rate (R), and F-value comparison evaluation. In this paper, we referred to several methods from the literature, comparing them not only with the intersections of the OSM road network but also with the extraction results of several previously published studies. The results of the test comparison are placed in Table 3.

According to the evaluation results, it can be concluded that the proposed intersection information extraction method has significant advantages in handling low-frequency, low-accuracy taxi trajectory data. Compared with other common methods, this method performed better in intersection information extraction and provided more reliable precursor data for the subsequent construction of navigable road networks.

In Figure 8, three types of clusters of vehicle turning points at intersections can be identified. Cluster type 1 comprises common clusters widely recognized by most clustering algorithms. Type 2 pertains to clusters with low density. The identification of such clusters may be accomplished through the utilization of the CDC clustering algorithm. This algorithm derives the DCM value (Equation (3)) by examining the homogeneity of adjacent points concerning direction. Type 3 consists of two clusters located in close proximity to each other with weak interconnectivity. The DCM value calculation marks the points that are located at the intersection between these clusters as boundary points, which permits the identification of clusters that exhibit weak connectivity through the computation of the boundary reachable distance (Equation (4)).

4.3.2. Intersection Center Coordinate Calculation

The correctly extracted intersections were compared with the visually interpreted center locations using the DBSCAN clustering-based intersection center calculation method and the mean value method, and the deviation from the actual intersection center location was quantified by calculating the root mean square error (RMSE) of the coordinate values. The RMSE of this method was 10.39 m, while that of the mean value method was 12.21 m. The intersection center accuracy was improved by 14.9% using this method. As shown in Figure 11, the analysis of the ranking of the deviation values showed that the method improved the accuracy in both large and small deviation ranges and had good stability.

5. Conclusions

In this study, a method was proposed to perform turn detection and identify road intersections using low-frequency, low-accuracy GNSS trajectory data. The driving characteristics of the vehicle were analyzed, and matching rules were established to screen the sequence of turning points that could fit the turning model. A compensation algorithm was then proposed for the position of the turning points at the intersection to make the vehicle turning points closer to the intersection. Subsequently, the CDC algorithm was employed to cluster the vehicle turning points and identify the intersection. Finally, the DBSCAN algorithm was used to cluster the turning direction and determine the center of the road intersection. According to the 2014 taxi trajectory dataset of Chengdu City, our experiments achieved an intersection identification accuracy of 96.4% and a recall rate of 89.6% in the intersection extraction step. Additionally, our proposed method improved the accuracy of the intersection center calculation by 14.9% compared with the mean value method. It also performed well at intersections with large deviations.

Future research will be conducted to extract information on complex 3D intersections and detailed structures using low-frequency, low-accuracy GNSS trajectory data and generate navigable road networks based on accurately extracted intersection information.

Author Contributions

J.D. was involved in the reproduction of existing algorithms and the analysis of experiments as well as the drafting of the paper and the writing of the thesis. X.L. was involved in the revision and drafting of the paper. C.M. was involved in the reproduction of the existing algorithm. All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by the National Natural Science Foundation of China (grant number 41801318) and the key scientific and technological project of Henan province (grant numbers 232102321004 and 212102310436).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used to perform this study are available and will be provided upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Lu, H.; Li, R. The Development Status and Trends of Urban Intelligent Transportation System. J. Eng. Stud. 2014, 6, 6–19. [Google Scholar] [CrossRef]
Li, D.; Li, Q.; Yang, B.; Yu, J. 3S technology and intelligent transportation. Geomat. Inf. Sci. Wuhan Univ. 2008, 4, 331–336. [Google Scholar]
Lu, F.; Zhang, H. Big Data and Generalized GIS. Geomat. Inf. Sci. Wuhan Univ. 2014, 39, 645–654. [Google Scholar] [CrossRef]
Dai, L.; Wang, Y.; Du, Y.; Zhu, T.; Xie, S.; Li, C.; Fang, X. A Review of Methods for Road Extraction from Optical Remote Sensing Images. Natl. Remote Sens. Bull. 2020, 24, 804–823. [Google Scholar] [CrossRef]
Jiang, Y. Research on Road Extraction of Remote Sensing Image Based on Convolutional Neural Network. EURASIP J. Image Video Process 2019, 2019, 31. [Google Scholar] [CrossRef]
Chen, B.; Ding, C.; Ren, W.; Xu, G. Extended Classification Course Improves Road Intersection Detection from Low-Frequency GPS Trajectory Data. ISPRS Int. Geo-Inf. 2020, 9, 181. [Google Scholar] [CrossRef]
Ou, Y. A Study on Road Network Extraction Based on Walking GPS Trajectories. Master. Thesis, Hunan University of Science & Technology, Xiangtan, China, 2014. [Google Scholar]
Lu, F.; Liu, K.; Chen, J. Study of Human Mobility in the Era of Big Data. J. Geo-Inf. Sci. 2014, 16, 665–672. [Google Scholar]
Lei, Y. HBase Based Vehicle Track Data Management and Analysis. Master’s Thesis, Southwest JiaoTong University, Chengdu, China, 2017. [Google Scholar]
Mou, N.; Zhang, H.; Chen, J.; Zhang, L.; Dai, H. A Review of Research on Urban Applications of Trajectory Data Mining. J. Geo-Inf. Sci. 2015, 17, 1136–1142. [Google Scholar]
Tang, L.; Zheng, W.; Wang, Z.; Xu, H.; Dong, K. A GPS trajectory spatio-temporal distribution detection method for urban cab pick-up and drop-off. J. Geo-Inf. Sci. 2015, 17, 1179–1186. [Google Scholar]
Zabihi, Z.; Eftekhari Moghadam, A.M.; Rezvani, M.H. Reinforcement Learning Methods for Computation Offloading: A Systematic Review. ACM Comput. Surv. 2023, 56, 1–41. [Google Scholar] [CrossRef]
Siemuri, A.; Selvan, K.; Kuusniemi, H.; Valisuo, P.; Elmusrati, M.S. A Systematic Review of Machine Learning Techniques for GNSS Use Cases. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 5043–5077. [Google Scholar] [CrossRef]
Dokuz, A.S. Weighted Spatio-Temporal Taxi Trajectory Big Data Mining for Regional Traffic Estimation. Phys. A Stat. Mech. Appl. 2022, 589, 126645. [Google Scholar] [CrossRef]
Keawthong, P.; Muangsin, V.; Gowanit, C. Location Selection of Charging Stations for Electric Taxis: A Bangkok Case. Sustainability 2022, 14, 11033. [Google Scholar] [CrossRef]
Krataithong, P.; Anutariya, C.; Buranarach, M. A Taxi Trajectory and Social Media Data Management Platform for Tourist Behavior Analysis. Sustainability 2022, 14, 4677. [Google Scholar] [CrossRef]
Chen, C.; Lu, C.; Huang, Q.; Yang, Q.; Gunopulos, D.; Guibas, L. City-Scale Map Creation and Updating Using GPS Collections. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1465–1474. [Google Scholar]
Zhang, C.; Xiang, L.; Li, S.; Wang, D. An Intersection-First Approach for Road Network Generation from Crowd-Sourced Vehicle Trajectories. ISPRS Int. Geo-Inf. 2019, 8, 473. [Google Scholar] [CrossRef]
Zheng, K.; Zhu, D. A Novel Clustering Algorithm of Extracting Road Network from Low-Frequency Floating Car Data. Clust. Comput. 2019, 22, 12659–12668. [Google Scholar] [CrossRef]
Huang, J.; Deng, M.; Zhang, Y.; Liu, H. Complex Road Intersection Modelling Based on Low-Frequency GPS Track Data. ISPRS Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2017, XLII-2/W7, 23–28. [Google Scholar] [CrossRef]
Deng, M.; Huang, J.; Zhang, Y.; Liu, H.; Tang, L.; Tang, J.; Yang, X. Generating urban road intersection models from low-frequency GPS trajectory data. Int. J. Geogr. Inf. Sci. 2018, 32, 2337–2361. [Google Scholar] [CrossRef]
Ma, C.; Sun, Q.; Chen, H.; Wen, B. Recognition of Road Junctions Based on Road Classification Method. Geomat. Inf. Wuhan Univ. 2016, 41, 1232–1237. [Google Scholar]
Yang, W.; Ai, T.; Lu, W. A Method for Extracting Road Boundary Information from Crowdsourcing Vehicle GPS Trajectories. Sensors 2018, 18, 1261. [Google Scholar] [CrossRef]
He, Y.; Hofer, B.; Sheng, Y.; Yin, Y.; Lin, H. Processes and Events in the Center: A Taxi Trajectory-Based Approach to Detecting Traffic Congestion and Analyzing Its Causes. Int. J. Digit. Earth 2023, 16, 509–531. [Google Scholar] [CrossRef]
Liu, X.; Zhu, Y.; Wang, Y.; Forman, G.; Ni, L.; Fang, Y.; Li, M. Road Recognition Using Coarse-Grained Vehicular Traces; HPL-2012-26; HP Laboratories: Palo Alto, CA, USA, 2012. [Google Scholar]
Li, Z. Road Extraction from Remote Sensing Images Using Parallel Softplus Networks. J. Indian Soc. Remote Sens. 2020, 48, 1645–1650. [Google Scholar] [CrossRef]
Dai, L.; Zhang, G.; Zhang, R. RADANet: Road Augmented Deformable Attention Network for Road Extraction from Complex High-Resolution Remote-Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5602213. [Google Scholar] [CrossRef]
Abdollahi, A.; Pradhan, B.; Shukla, N.; Chakraborty, S.; Alamri, A. Deep Learning Approaches Applied to Remote Sensing Datasets for Road Extraction: A State-Of-The-Art Review. Remote Sens. 2020, 12, 1444. [Google Scholar] [CrossRef]
Sun, K.; Zhang, J.; Zhang, Y. Roads and Intersections Extraction from High-Resolution Remote Sensing Imagery Based on Tensor Voting under Big Data Environment. Wirel. Commun. Mob. Comput. 2019, 2019, 6513418. [Google Scholar] [CrossRef]
Gargoum, S.A.; El Basyouny, K. A Literature Synthesis of LiDAR Applications in Transportation: Feature Extraction and Geometric Assessments of Highways. GIScience Remote Sens. 2019, 56, 864–893. [Google Scholar] [CrossRef]
Prochazka, D.; Prochazkova, J.; Landa, J. Automatic Lane Marking Extraction from Point Cloud into Polygon Map Layer. Eur. J. Remote Sens. 2019, 52, 26–39. [Google Scholar] [CrossRef]
Yadav, M.; Singh, A.K.; Lohani, B. Extraction of Road Surface from Mobile LiDAR Data of Complex Road Environment. Int. J. Remote Sens. 2017, 38, 4655–4682. [Google Scholar] [CrossRef]
Zhang, D.; Li, L.; Li, Y. An Urban Road Extraction Method Based on Vehicle-Based Laser Scanning. Bull. Surv. Mapp. 2016, 7, 30–34+47. [Google Scholar] [CrossRef]
Tang, L.; Niu, L.; Yang, X.; Zhang, X.; Li, Q.; Xiao, S. Urban Intersection Recognition and Construction Base on Big Trace Data. Acta Geodactica Cartogr. Sin. 2017, 46, 770–779. [Google Scholar]
Wan, Z.; Li, L.; Yang, M.; Zhou, X. Decision tree modelfor extracting road intersection feature from vehicle trajectory data. Acta Geodactica Cartogr. Sin. 2019, 48, 1391–1403. [Google Scholar]
Liu, Y.; Qing, R.; Zhao, Y.; Liao, Z. Road Intersection Recognition via Combining Classification Model and Clustering Algorithm Based on GPS Data. ISPRS Int. J. Geo-Inf. 2022, 11, 487. [Google Scholar] [CrossRef]
Wan, Z.; Li, L.; Yu, H.; Yang, M. A Long Short-Term Memory-Based Approach for Detecting Turns and Generating Road Intersections from Vehicle Trajectories. Sensors 2022, 22, 6997. [Google Scholar] [CrossRef] [PubMed]
Davies, J.J.; Beresford, A.R.; Hopper, A. Scalable, Distributed, Real-Time Map Generation. IEEE Pervasive Comput. 2006, 5, 47–54. [Google Scholar] [CrossRef]
Hu, H.; Xiang, L.; Wang, D. Road Extraction of Taxi Trajectory Data. Bull. Surv. Mapp. 2018, 7, 53–57. [Google Scholar] [CrossRef]
Li, Y.; Xiang, L.; Zhang, C.; Wu, H.; Gong, J. Multi-level fusion of vehicle trajectories and remote sensing images for road intersection identification. Acta Geodactica Cartogr. Sin. 2021, 50, 1546–1557. [Google Scholar]
Zhang, C.; Xiang, L.; Li, Y.; Wang, W. Construction of navigable road network based on cab trajectory. Acta Geodactica Cartogr. Sin. 2021, 50, 1650–1662. [Google Scholar]
Li, S.; Xiang, L.; Zhang, C.; Gong, J. A Study on Intersection Extraction of Urban Road Network Based on Low Frequency Taxi Trajectories. J. Geo-Inf. Sci. 2019, 21, 1845–1854. [Google Scholar]
Li, Y.; Xiang, L.; Zhang, C.; Jiao, F.; Wu, C. A Guided Deep Learning Approach for Joint Road Extraction and Intersection Detection from RS Images and Taxi Trajectories. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2021, 14, 8008–8018. [Google Scholar] [CrossRef]
Zhang, C.; Li, Y.; Xiang, L.; Jiao, F.; Wu, C.; Li, S. Generating Road Networks for Old Downtown Areas Based on Crowd-Sourced Vehicle Trajectories. Sensors 2021, 21, 235. [Google Scholar] [CrossRef]
Yin, Z.; Huang, K.; Ying, S.; Huang, W.; Kang, Z. Modeling of Time Geographical Kernel Density Function under Network Constraints. ISPRS Int. J. Geo-Inf. 2022, 11, 184. [Google Scholar] [CrossRef]
Peng, D.; Gui, Z.; Wang, D.; Ma, Y.; Huang, Z.; Zhou, Y.; Wu, H. Clustering by Measuring Local Direction Centrality for Data with Heterogeneous Density and Weak Connectivity. Nat. Commun. 2022, 13, 5455. [Google Scholar] [CrossRef] [PubMed]

Figure 1. The proposed method’s framework.

Figure 2. (a)Noise point filter and (b) Vehicle steering point diagram.

Figure 3. Track point sequence type: (a) one track point, (b) two track points, (c) three track points. The red circle represents the range of the intersection, the black dots represent the fitted steering points, the blue dots are the points outside the range of the intersection, and the red dots are the trajectory points within the range of the intersection.

Figure 4. CDC clustering principle explained: (a) orientation distribution of center points; (b) clustering cluster partitioning rules; (c) cluster schematic. Red dots represent interior points, black dots represent boundary points.

Figure 5. Trajectory schematic: (a) vehicle trajectory at the “T” intersection; (b) vehicle trajectory at the “X” intersection.

Figure 6. DBSCAN-based intersection solution: (a) turn direction; (b) angle mapping in 2D coordinate system; (c) DBSCAN clustering effect; (d) planar diagram of clustering results; (e) intersection center.

Figure 7. Raw data presentation: (a) 1-day raw track data presentation; (b) simultaneous remote sensing image.

Figure 8. Vehicle steering point extraction.

Figure 9. Intersection extraction.

Figure 10. Example of intersection direction clustering: (a) DBSCAN clustering presentation; (b) clustering result, graphical representation.

Figure 11. Intersection coordinate deviation.

Table 1. Raw data fields.

Data Field	Data Format	Sample of Data	Description of the Data
Vehicle Num	Int	8161	Vehicle number of the taxi
Lat (GCJ-02)	Float	30.652344	Latitude (GCJ-02 Mars coordinate system)
Lon (GCJ-02)	Float	104.063938	Longitude (GCJ-02 Martian coordinate system)
Open Status	Int	1	Passenger status (0 for empty, 1 for passenger)
Stime	Timestamp	24 August 2014 11:50:04	Timestamp of GNSS records

Table 2. Test parameter setting.

Algorithm	Parameters	Parameter Settings	Description of Parameters
Denoising algorithm	n	5	Nearest n point
Denoising algorithm	R_min	6	Maximum distance threshold
CDC algorithm	K	18	Number of neighborhood points
CDC algorithm	T_DCM	0.2	Threshold for delimiting boundary and interior points
DBSCAN algorithm	Eps	0.3	Core neighborhood radius
DBSCAN algorithm	MinPts	3	Minimum amount of data for the neighborhood radius

Table 3. Comparison of road intersection extraction test results.

Method	Precision	Recall	F-Measure
Proposed	96.40%	89.60%	92.88%
OpenStreetMap	96.45%	79.66%	87.25%
Deng	93.60%	65.67%	77.19%
Zhang	93.12%	88.16%	90.80%
Ahmed	78.00%	34.06%	49.43%
Davies	87.59%	57.03%	67.74%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Du, J.; Liu, X.; Meng, C. Road Intersection Extraction Based on Low-Frequency Vehicle Trajectory Data. Sustainability 2023, 15, 14299. https://doi.org/10.3390/su151914299

AMA Style

Du J, Liu X, Meng C. Road Intersection Extraction Based on Low-Frequency Vehicle Trajectory Data. Sustainability. 2023; 15(19):14299. https://doi.org/10.3390/su151914299

Chicago/Turabian Style

Du, Jiusheng, Xingwang Liu, and Chengyang Meng. 2023. "Road Intersection Extraction Based on Low-Frequency Vehicle Trajectory Data" Sustainability 15, no. 19: 14299. https://doi.org/10.3390/su151914299

APA Style

Du, J., Liu, X., & Meng, C. (2023). Road Intersection Extraction Based on Low-Frequency Vehicle Trajectory Data. Sustainability, 15(19), 14299. https://doi.org/10.3390/su151914299

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Road Intersection Extraction Based on Low-Frequency Vehicle Trajectory Data

Abstract

1. Introduction

2. Related Work

2.1. Extraction of Roads Based on Remote Sensing Data

2.2. Extraction of Roads Based on LiDAR Data

2.3. Extraction of Roads Based on Trajectory Data

3. Materials and Methods

3.1. Trajectory Data Preprocessing

3.1.1. Redundant Data Handling

3.1.2. Noise Point Filtration

3.1.3. Calculation of the Vehicle Steering Angle

3.1.4. Turning Direction of the Steering Point Sequence

3.2. Intersection Extraction Based on Low-Frequency, Low-Precision Trajectory Data

3.2.1. Steering Point Sequence Extraction and Vehicle Steering Point Solving-Based Trajectory Matching

3.2.2. Vehicle Steering Point Clustering Based on the CDC Clustering Algorithm

3.2.3. DBSCAN-Based Intersection Center Solution

3.2.4. Time Complexity Analysis

4. Experiments, Results, and Discussion

4.1. Experimental Dataset

4.2. Experimental Parameter Settings and Experimental Contents

4.3. Results and Analysis

4.3.1. Intersection Extraction

4.3.2. Intersection Center Coordinate Calculation

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI