1. Introduction
Traffic accidents are among the leading causes of death worldwide. The World Health Organization (WHO) estimated that approximately 1.19 million road traffic deaths occurred in 2021 [1]. In 2023, Brazil recorded 67,766 accidents on federal highways. Of these, 5639 (8.32%) involved lane-change maneuvers or motorcyclists riding between lanes. Furthermore, 1620 (2.39%) were due to improper overtaking, and 6325 (9.33%) occurred as vehicles entered the road without observing the presence of other vehicles. Together, these four types of incidents represented 13,584 cases (20% of the total), resulting in 1023 fatalities, 16,355 injuries, and 29,862 vehicles involved [2].
Previous studies have revealed numerous factors that influence the severity of traffic accidents, including human behavior, vehicle conditions, traffic characteristics, road infrastructure, and environmental conditions [3,4,5]. Among human-related causes, lack of attention, mobile phone use, excessive speed, improper overtaking, and alcohol consumption are the most frequently recorded contributing factors, underscoring their significant role in traffic-related deaths [6].
Illegal overtaking, along with other illegal driving behaviors, is strongly correlated with the frequency of accidents, fatalities, and injuries and is among the main factors associated with an increased risk of accidents [7]. Overtaking is one of the most significant driving behaviors on two-lane highways, affecting highway capacity, safety, and level of service [8]. It is also a complicated maneuver that requires a driver to make several decisions based on the prevailing passing conditions. For instance, a driver chooses an acceptable gap size in the opposing traffic, the following distance behind the impeding vehicle, and the distance to leave in front of the impeding vehicle when returning to the original lane after passing [9]. This study focuses exclusively on right-hand traffic; for left-hand traffic (as is common in countries such as England), minor adjustments would be necessary to account for the overtaking lane being on the right.
Figure 1 illustrates the overtaking scenario analyzed in our study.
The overtaking action depends on numerous factors, including the ego vehicle's current state and the surrounding vehicles' positions and speeds. Furthermore, psychological factors such as impulsivity, mindfulness, driving attitudes, and depression can influence overtaking decisions [10]. Establishing educational policies and promoting self-awareness can enhance traffic safety and encourage drivers to adopt safer driving practices [11]. By increasing awareness of risky behaviors and recognizing unsafe decisions, we can encourage drivers to adopt safer practices and ultimately reduce the number of road accidents.
Indeed, one of the dangers of the overtaking maneuver lies in the driver's ability to account for both the distance between the two vehicles and the speed of the one ahead [12]. Although overtaking is not the leading cause of road accidents, collisions during overtaking maneuvers are among the most severe [13]. For more than a decade, road traffic accidents have been the leading cause of death among young people [1].
Given those risks, autonomous driving is a widely studied research topic [14] that may improve traffic environments, making roads and highways safer. In traffic, overtaking is a very complex maneuver [15]. Detecting overtaking is not trivial due to the variety of driving scenarios and conditions and the quality of cameras and sensors, which presents a challenge in developing accurate detection systems.
In the last few years, advancements in deep learning and computer vision have revolutionized the analysis of traffic behavior [16,17,18,19,20]. These technologies are essential for the automatic interpretation of images and videos, enabling real-time object recognition, pattern detection, and scene segmentation. Despite these significant advancements, overtaking detection still poses challenges in low-cost applications [21].
This paper explores the application of computer vision techniques to develop a system that detects illegal overtaking maneuvers on the road, aiming to address the challenges mentioned above. The main goal of the proposed system is to improve road safety, assist vehicle auditing systems, and enhance driving practices.
Thus, the proposed method uses dashboard-mounted smartphone cameras and geolocation data to define analysis areas. You Only Look Once version 8 (YOLOv8) detects yellow road lines, while YOLO for Panoptic driving Perception version 2 (YOLOPv2), followed by post-processing, confirms potential illegal overtakes through detection overlaps. The system evaluates and stores these events throughout the video to identify and extract the moments of violation.
The main contributions of this paper are summarized as follows:
The use of low-cost cameras to make the technology more accessible and feasible for widespread implementation.
The application of recent deep learning and image processing techniques to accurately detect illegal overtaking maneuvers, improving traffic safety.
The development of a tool designed for integration into vehicle auditing systems to monitor and evaluate driving practices, which could improve road safety and help reduce accidents caused by illegal maneuvers.
The paper is structured as follows to provide a comprehensive understanding of this approach:
Section 2 discusses the previous state-of-the-art models designed.
Section 3 presents the dataset’s characteristics, methodology, and validation metrics.
Section 4 presents the results obtained in our study.
Section 5 discusses the results based on the data presented in the literature. Finally, Section 6 presents the conclusion of this study.
2. Related Work
Deep learning, a subset of artificial intelligence, has revolutionized numerous fields by enabling machines to learn and make decisions from vast amounts of data. Among the many architectures within deep learning, Convolutional Neural Networks (CNNs) have emerged as a cornerstone for image classification and computer vision tasks. These networks demonstrate remarkable performance by integrating feature extraction and classification within a single framework, eliminating the need for manual feature engineering. This seamless integration, powered by deep learning, transforms machine vision and drives significant breakthroughs in tasks such as object detection [20]. In traffic safety, deep learning supports tasks such as obstacle detection, lane recognition, and traffic sign and light recognition, and it is a key enabler of self-driving cars.
Lane recognition, in particular, plays a crucial role in deviation warnings, collision prevention, and effective path planning. Deep learning approaches explore different ways to improve the accuracy and efficiency of these systems, contributing to enhanced traffic safety [20]. Accurate lane detection is essential for identifying illegal overtaking maneuvers, as it allows for monitoring the vehicle's position relative to lane boundaries. Advanced models for lane detection, such as the SCNN-based hybrid model, have achieved accuracies exceeding 0.970 on benchmark datasets [22]. Additionally, semantic segmentation models like SegNet have performed well, reaching an F1-score of 0.934 and a recall of 0.926 [23]. By combining multiple detection frameworks, performance can be further enhanced, with models achieving F1-scores above 0.950 [24].
Lee and Park [25] proposed a method for rearview camera-based blind-spot detection and a lane change assistance system for autonomous vehicles using CNNs. They use YOLOv9 for vehicle detection, combined with Sobel edge detection and a Kalman filter to identify and track lanes, achieving a reported lane detection rate of 0.915. Lu and Chiu [26] employed a domain adaptation model and image segmentation to improve lane detection performance in challenging scenarios. They emphasize that factors like low light, shadows, rain, or snow hinder accurate detection, achieving an F1-score of 0.743 under normal conditions and 0.691 at night.
In vehicle overtaking detection, researchers [16,18,27,28] have developed several models to enhance detection accuracy. One such study [16] combines image pre-processing, segmentation, optical flow, and CNNs to identify overtaking maneuvers. The model removes repetitive patterns to eliminate false positives and employs behavior analysis to differentiate overtaking from regular lane changes or other movements. By analyzing the motion and speed of vehicles through optical flow, the approach improves overtaking predictions, thus reducing errors in detection algorithms.
Another approach [27] integrates image processing techniques with a CNN to enhance overtaking safety. The system extracts features from images and applies manual rules to assess critical parameters such as the distance, velocity, and acceleration of surrounding vehicles. The CNN, trained on sequential image datasets, predicts overtaking safety by evaluating risks based on the relative positions and speeds of surrounding vehicles, which improves decision-making during overtaking attempts.
In [29], a system focusing on illegal lane crossing detection achieved a precision of 0.920 and a recall of 0.890, ensuring effective detection while minimizing false positives. Similarly, Xia et al. [28] present a method for overtaking detection with an overall error rate below 25%, demonstrating strong performance across various traffic scenarios, including those with poor visibility or heavy traffic. RGB-D data from Kinect sensors capture color and depth information, improving detection accuracy in complex environments.
Panichpapiboon and Leakkaw [18] introduced a probabilistic method for detecting lane changes using smartphone sensors. This method achieved approximately 0.900 precision and 0.928 recall, demonstrating its effectiveness in correctly identifying lane change events, minimizing missed detections, and ensuring reliable performance in real-world scenarios.
The development of autonomous vehicles has become an increasingly prominent area of research and development. These systems demand extensive technological and computational resources, with real-time information processing crucial to ensuring the safety of passengers and others on the road. In response, several studies have explored using cameras and sensors to monitor dangerous behaviors on highways [15,16,27,29,30]. These studies employ monocular cameras [16] and other sensors, such as sonar, radar, and RGB-D data captured by Kinect devices [28]. Lin et al. [18] investigate the use of front-view cameras with gyro sensors, while smartphone sensors are used to detect lane changes.
Building on these advances, researchers have focused on creating robust panoptic driving perception systems [31], which integrate cameras and LiDAR to provide a comprehensive understanding of the environment. This approach supports route planning and enhances driving safety. For instance, Faizi and Al-sulaifanie [32] use a CNN to detect lane features from image blocks and apply K-means clustering to map the detected points to lane markings, improving lane detection and driving assistance.
Lin et al. [15] improve control strategies with Time to Lane Crossing estimation, which aids decision-making during overtaking maneuvers. YOLO, a deep learning model for object detection, excels in this application due to its efficiency and accuracy [33]. Additionally, Finite State Machines (FSMs) help manage the overtaking decision process, addressing various states such as free driving, following, overtaking, and aborting [34].
The literature presents recent and relevant studies on traffic safety, addressing aspects such as lane and vehicle detection, lane change detection, and maneuver identification. However, to the best of our knowledge, no studies specifically address the detection of illegal overtaking. Existing works primarily focus on lane detection without categorizing maneuvers or, in the case of autonomous vehicles, on assessing whether overtaking is safe based on the distances between detected objects.
Despite recent advancements in deep learning and computer vision, challenges persist, particularly in more complex scenarios. Adverse weather conditions, poor lighting, and the demand for efficient and accurate solutions amplify these challenges. At night, the loss of color information further hinders sensor performance and complicates overtaking detection [28]. Moreover, the effectiveness of models relies heavily on the quality and quantity of training data, making dataset collection a significant challenge [27]. A key research gap therefore remains: developing accessible, low-cost, and robust methodologies for accurately detecting illegal overtaking across various conditions.
In response to this challenge, this study proposes an innovative approach that integrates dashboard-mounted smartphone cameras, advanced deep learning models (YOLOv8 and YOLOPv2), and post-processing techniques. This solution aims to improve detection performance and simplify integration into vehicle auditing systems, enhancing road safety and reducing traffic violations.
3. Materials and Methods
This section presents the proposed methodology for identifying overtaking on continuous lanes.
Figure 2 illustrates the flow chart with the steps of this study.
The process starts by loading the data, which includes a file with the in-vehicle camera recording of the route and, when available, a file containing geolocation information. If geolocation data are available, we identify the segments where the driver was on a two-way road or highway and process only those relevant segments. If there are no geolocation data, we process the entire video. The processing involves detecting yellow lanes using YOLOv8. We also offer the option to filter YOLOv8 detections using YOLOPv2, followed by post-processing, to identify the overtaking lane and determine if the detections intersect. This filtering is activated only when YOLOv8 detects continuous lanes. We store the results of each frame in a Pandas DataFrame in Python, enabling further operations on the data to identify the start and end moments of overtaking on a continuous lane. Subsequently, we generate video clips corresponding to these segments.
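To make this flow concrete, the snippet below gives a minimal sketch of the frame-by-frame loop, assuming the ultralytics Python package and a hypothetical fine-tuned weights file (yolov8m_yellow_lanes.pt); the YOLOPv2 filtering, geolocation handling, and clip extraction are omitted here and detailed in the following subsections.

```python
import cv2
import pandas as pd
from ultralytics import YOLO

CONTINUOUS = {"SSL", "DdSL", "DdLDS"}          # lane classes that prohibit overtaking

model = YOLO("yolov8m_yellow_lanes.pt")        # hypothetical fine-tuned YOLOv8 weights

records = []
cap = cv2.VideoCapture("route.mp4")            # in-vehicle camera recording
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = model(frame, verbose=False)[0]    # YOLOv8 yellow-lane detections for this frame
    names = [result.names[int(c)] for c in result.boxes.cls] if result.boxes is not None else []
    records.append({
        "frame": frame_idx,
        "classes": names,
        "continuous": any(n in CONTINUOUS for n in names),
    })
    frame_idx += 1
cap.release()

# Frame-by-frame results stored for the temporal analysis described in Section 3.4
df = pd.DataFrame(records)
```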
Section 3 provides a structured overview of our approach. It begins with an outline of the dataset (Section 3.1) and the associated geolocation information (Section 3.2), followed by a discussion of the lane detection techniques employed (Section 3.3), including yellow line detection (Section 3.3.1) and overtaking lane detection (Section 3.3.2). The section continues with an analysis of illegal overtaking incidents on continuous lane markings (Section 3.4) and concludes with the evaluation metrics used to assess the model's performance (Section 3.5).
3.1. Dataset
The dataset used in this study comes from 1440 videos captured by dashboard-mounted smartphone cameras in vehicles. These videos, provided by the Energy Company of Minas Gerais (CEMIG) in Brazil, include urban, highway, and rural scenes, along with variations in weather conditions and times of day.
We selected a total of 4035 images, annotated into five categories: single dash lane (SDL), single solid lane (SSL), double solid lane (DdSL), double lane with solid/dash (DdLSD), and double lane with dash/solid (DdLDS). The dataset contains 4235 labels, as some images have multiple labels assigned. The distribution of these labels is as follows: 1079 for DdSL, 1032 for SDL, 401 for DdLDS, 382 for DdLSD, and 122 for SSL. Additionally, 1219 images consist only of background, which, although part of the same scenarios as the annotated images, lack the yellow lane markings relevant to the study. Including these background images was necessary to prevent the model from falsely identifying non-relevant markings, such as lateral lanes or other road markings unrelated to overtaking, improving the model’s ability to accurately distinguish between lane markings and other elements in the scene.
Figure 3 presents sample annotations from the dataset.
We divided our dataset into three subsets—training, validation, and testing—using a ratio of 80:10:10. This resulted in 3388 images for training, 323 for validation, and 324 for testing.
3.2. Geolocation Information
Considering that overtaking on continuous lane markings predominantly occurs on highways and roads, we decided to focus our analysis on these environments. Unlike urban areas, where the road infrastructure is more complex and diverse, with traffic lights, intersections, pedestrians, and cyclists, highways and roads offer more homogeneous conditions, allowing for more accurate detection of illegal overtaking. On these roads, continuous lane markings signify a prohibition on overtaking due to safety and visibility restrictions, making the investigation of such violations more pertinent. Furthermore, focusing on highways and roads reduces the influence of external variables.
To ensure that the system detects illegal overtaking only on highways and roads, we developed an application for the Android platform that leverages the system’s geolocation service to obtain periodic updates regarding the device’s geographic location. Each location data point comprises the latitude, longitude, timestamp, accuracy, heading, and velocity. The application records this data in a CSV file and saves a video of the entire route with the corresponding geolocation data.
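As an illustration, the sketch below reads such a log with pandas; the column names and the timestamp unit are assumptions, since the exact CSV layout produced by the app is not reproduced here.

```python
import pandas as pd

# Assumed column layout; the app records one fix per location update
cols = ["timestamp", "latitude", "longitude", "accuracy", "heading", "velocity"]
gps = pd.read_csv("route_geolocation.csv", names=cols, header=0)

# Timestamps (assumed to be in milliseconds) let us align each fix with the video later
gps["timestamp"] = pd.to_datetime(gps["timestamp"], unit="ms", errors="coerce")
print(gps.head())
```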
After obtaining the driver's geolocation data, we utilized Valhalla [35], an open-source routing engine that works with OpenStreetMap (OSM) data [36], to match Global Positioning System (GPS) coordinates. Valhalla performs map matching by aligning GPS measurements (represented as triplets of latitude, longitude, and time) with the corresponding road segments.
Every second, we have approximate information about the driver’s position. For each position, we check whether the road belongs to the classes of interest (primary, secondary, tertiary, highway, and motorway) and if it is a two-way road. Our analysis excludes roads with unknown classifications, residential and rural areas, and one-way roads.
We then extract the intervals of interest, focusing on the moments when the driver enters a highway, where we analyze illegal overtaking in a continuous lane. To avoid fragmented data and ensure consistent intervals, we discard isolated periods that last less than five seconds within the area of interest. If the gap between one period of interest and the next is less than five seconds, we merge the two into a single period.
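A minimal sketch of this interval cleaning is shown below, assuming one boolean flag per second marking whether the current road segment belongs to the classes of interest; the order in which the discard and merge rules are applied is an assumption.

```python
def clean_intervals(flags, min_len=5, max_gap=5):
    """flags: one boolean per second, True when the driver is on a road of interest."""
    # Build raw (start, end) intervals from consecutive True flags
    intervals, start = [], None
    for t, on in enumerate(flags):
        if on and start is None:
            start = t
        elif not on and start is not None:
            intervals.append((start, t - 1))
            start = None
    if start is not None:
        intervals.append((start, len(flags) - 1))

    # Discard isolated periods shorter than min_len seconds
    intervals = [(s, e) for s, e in intervals if e - s + 1 >= min_len]

    # Merge consecutive periods separated by gaps shorter than max_gap seconds
    merged = []
    for s, e in intervals:
        if merged and s - merged[-1][1] < max_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged

# Example: a 1 s dropout inside a highway stretch is bridged, a 3 s stray segment is dropped
flags = [True] * 10 + [False] + [True] * 10 + [False] * 20 + [True] * 3
print(clean_intervals(flags))   # [(0, 20)]
```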
3.3. Lane Detection
Object detection plays a critical role in computer vision, as accurate detection is essential for the effectiveness of applications such as autonomous vehicles, surveillance systems, company audits, and image analysis.
Another area of deep learning involves segmentation-based methods, which have achieved notable detection results. Image segmentation can include semantic segmentation as well as instance segmentation, each serving a different purpose. Both operate at the pixel level to understand images; however, their focus differs: semantic segmentation identifies amorphous regions of uncountable objects with similar characteristics (stuff classes), such as drivable areas, lane lines, or background [37], while instance segmentation not only classifies pixels but also distinguishes between different object instances, such as individual cars or pedestrians in an image. Segmentation refines the classification problem by assigning a predefined category to each pixel in the image. This approach provides more precise pixel-level boundaries than object detection, which identifies and localizes objects within bounding boxes without considering finer details at the pixel level.
Recently, the literature has proposed several methods in this field, among which the You Only Look Once (YOLO) network, introduced by [38], stands out. The YOLO algorithm is renowned for its exceptional speed compared to other methods while maintaining high accuracy. We selected it for its precision, fast response time—enabling real-time output—and capability to perform object detection and instance segmentation tasks.
3.3.1. Yellow Line Detection
A state-of-the-art model called YOLOv8 [39] was recently released, offering advanced features for object detection and instance segmentation tasks. The network comes in five different versions: YOLOv8n (nano), YOLOv8s (small), YOLOv8m (medium), YOLOv8l (large), and YOLOv8x (extra large).
We trained different variants of the YOLOv8 model (nano, small, and medium) to achieve instance segmentation of yellow overtaking lines, excluding those related to parking or other purposes. We used models pre-trained on the Common Objects in Context (COCO) dataset [40].
To ensure a quick and accurate response while minimizing computational costs, we performed a grid search by varying the model size, number of epochs, batch size, and learning rate to find the optimal parameters.
To determine the best parameters, we initially trained the model on a smaller subset of the data. Due to the multiple combinations generated by the grid search, running these on the full dataset would have been time-consuming. Therefore, we used the test set as the training set and kept the validation set unchanged.
For the reduced dataset grid search, YOLOv8m with a batch size of 8, a learning rate of , and 100 epochs achieved the best results in terms of bounding boxes and segmentation mAP on the validation set. Based on these findings, we trained YOLOv8m with these hyperparameters on the entire dataset, which included 3388 training images, 323 validation images, and 324 test images.
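The snippet below sketches this final training run with the ultralytics API, assuming a dataset configuration file (yellow_lanes.yaml) describing the five lane classes; the image size shown is an assumption, and the learning-rate value selected by the grid search is not reproduced here.

```python
from ultralytics import YOLO

# COCO-pretrained YOLOv8m instance-segmentation weights as the starting point
model = YOLO("yolov8m-seg.pt")

model.train(
    data="yellow_lanes.yaml",   # hypothetical dataset config listing the 5 lane classes
    epochs=100,                 # number of epochs selected by the grid search
    batch=8,                    # batch size selected by the grid search
    imgsz=640,                  # assumed input resolution
    # lr0=...,                  # learning rate from the grid search (value omitted here)
)

metrics = model.val()           # box and mask mAP on the validation split
```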
Data augmentation is crucial for improving the robustness and performance of YOLO models. During training, we used the standard augmentation techniques provided by YOLOv8, as detailed in Table 1. However, we needed to modify the horizontal flip operation because flipping the image affects the DdLSD and DdLDS labels: after flipping the image horizontally, a DdLSD label becomes DdLDS and vice versa. To handle this, we customized the RandomFlip class in YOLOv8, as sketched below.
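The core of that customization is the label swap itself; in the actual pipeline this logic runs inside a subclass of the RandomFlip augmentation whenever a horizontal flip is applied, and the class indices below are hypothetical.

```python
import numpy as np

DDLSD, DDLDS = 3, 4    # hypothetical class indices of the two orientation-dependent classes

def swap_asymmetric_classes(cls_array):
    """Swap DdLSD and DdLDS instance labels after a horizontal image flip."""
    cls_array = cls_array.copy()
    is_lsd = cls_array == DDLSD
    is_lds = cls_array == DDLDS
    cls_array[is_lsd] = DDLDS
    cls_array[is_lds] = DDLSD
    return cls_array

print(swap_asymmetric_classes(np.array([0, 3, 4, 3])))   # -> [0 4 3 4]
```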
3.3.2. Overtaking Lane Detection
YOLOPv2 [31] is a multitask deep learning network that has demonstrated effective and efficient results in vehicle detection, drivable area segmentation, and lane segmentation. The YOLOPv2 model was inspired by the architectures of YOLOP [41] and HybridNet [42]. The main difference lies in its backbone for feature extraction and its use of three separate decoder heads to perform specific tasks, rather than a single branch for both drivable area segmentation and lane detection. The authors indicate that this modification is due to the inherent complexity of segmentation tasks; the drivable area and lane segments pose distinct challenges, which require different characteristics at the feature level. Consequently, utilizing different network structures enhances detection performance.
YOLOPv2 provides three types of output data: (1) information regarding object detection, specifically for the vehicle class, including the class number, bounding boxes, and confidence scores for each detected object; (2) a binary image resulting from the drivable area segmentation; and (3) a binary image with lane segmentation.
Figure 4 shows examples of YOLOPv2’s output data plotted on a sample image.
In our study, we used only the lane segmentation output from YOLOPv2 (Figure 5b), as this is sufficient to determine the area in which the driver is located without needing the road mask. At this stage, we did not consider vehicle detection because the focus is on evaluating the driver's movement; for this purpose, we did not check the distance between vehicles or whether there is a vehicle to the right.
We then applied post-processing to filter the lanes and retain only the lane relevant for overtaking, i.e., the yellow line on the driver’s left side if the driver is not currently overtaking. The post-processing consists of the following steps:
- (a)
The removal of small objects;
- (b)
The identification of the driver’s lane;
- (c)
The detection and disconnection of lane intersections;
- (d)
The removal of disconnected secondary lanes;
- (e)
The re-identification of the driver’s lane;
- (f)
The identification of the overtaking lane;
- (g)
The extension of the detected lane.
First, we eliminated small objects (a) that are likely to be noise rather than actual lanes by removing any objects with an area smaller than 50 pixels from the binary lane segmentation image provided by YOLOPv2.
Figure 5c illustrates an example of this step.
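A minimal sketch of this step using OpenCV connected components is shown below; the mask encoding (0/255, uint8) is an assumption.

```python
import cv2
import numpy as np

def remove_small_objects(mask, min_area=50):
    """Remove connected components smaller than min_area pixels from a binary lane mask."""
    num, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    cleaned = np.zeros_like(mask)
    for i in range(1, num):                          # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            cleaned[labels == i] = 255
    return cleaned
```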
To identify the driver's lane (b), we first delineated the contours of each segmented object. We then applied the orthogonal distance regression model [43] to determine the slope of each detected boundary. Next, we identified which lanes with positive and negative slopes were closest to the center of the image, thus obtaining the most representative pair. Subsequently, we removed the remaining objects.
Figure 5d displays the result of this step.
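The slope estimation can be sketched with SciPy's orthogonal distance regression as below; the toy contour points are illustrative only.

```python
import numpy as np
from scipy import odr

def lane_slope(xs, ys):
    """Fit y = a*x + b by orthogonal distance regression and return the slope a."""
    def linear(params, x):
        return params[0] * x + params[1]

    fit = odr.ODR(odr.Data(xs, ys), odr.Model(linear), beta0=[1.0, 0.0]).run()
    return fit.beta[0]

# Toy boundary: y decreases as x increases (image y grows downward), giving a negative slope
xs = np.array([100.0, 150.0, 200.0, 250.0])
ys = np.array([400.0, 350.0, 300.0, 250.0])
print(lane_slope(xs, ys))   # ≈ -1.0
```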
However, we observed that in some cases there could be unwanted connections between the driver's lane and the road lanes, which prevents the previous step from achieving the desired effect. To detect and disconnect the lane intersections (c), we first identify each segmented object in the image and locate its intersection point. Next, we select another point at the highest part of the object. A line with a thickness of 3 pixels is drawn to connect these two points, effectively separating both lanes (Figure 5e).
After disconnecting the objects, we remove the disconnected road lane (d). In this step, we identify the objects closest to the center of the image and eliminate the pixels of those that are further away (Figure 5f).
After the previous step, some unwanted noise may persist. To eliminate these remnants, we re-identify the driver’s lane (e), repeating step (b) to ensure accurate detection. While this process may seem redundant, it is crucial at both stages to minimize the risk of erroneously removing the lane of interest.
Figure 5g illustrates the results of this step.
To determine the overtaking lane (f), we use the lane segmentation image from YOLOPv2 alongside the result of the previous step. The YOLOPv2 image contains lane detections along the road, while the post-processed image includes only the driver's lane. We calculate the average x-coordinate of the segmented boundaries for both images and compare these values. If they are identical, we treat the driver and the highway lanes as the same, which prevents us from determining the overtaking lane; in this case, we obtain an image showing both lanes. If the center of the driver's lane is greater than that of the highway lane, we remove the lane on the right; otherwise, we remove the lane on the left (Figure 5h).
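The sketch below illustrates one way to implement this comparison, assuming both masks are binary NumPy arrays and that the two boundaries of the driver's lane are separate connected components; it approximates the rule described above rather than reproducing the exact implementation.

```python
import cv2
import numpy as np

def keep_overtaking_lane(all_lanes_mask, driver_pair_mask):
    """Keep only the boundary of the driver's lane pair on the overtaking side."""
    center_all = np.where(all_lanes_mask > 0)[1].mean()
    center_driver = np.where(driver_pair_mask > 0)[1].mean()
    if center_driver == center_all:
        return driver_pair_mask                      # undecidable: keep both boundaries

    num, labels, _, centroids = cv2.connectedComponentsWithStats(driver_pair_mask)
    comps = sorted(range(1, num), key=lambda i: centroids[i, 0])   # left-to-right by mean x
    keep = comps[0] if center_driver > center_all else comps[-1]
    return np.where(labels == keep, 255, 0).astype(driver_pair_mask.dtype)
```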
Since some lanes might appear as short segments, we perform the extension of detected lanes (g). We start by identifying the contour of the lane from the previous step and apply the orthogonal distance regression model [43] to determine the line's slope. Based on this slope, we draw a line with a thickness of 7 pixels extending from the bottom edge of the image to the highest point of the contour, limited to half the height of the image (Figure 5i).
3.4. Analysis of Illegal Overtaking Incidents on Continuous Lane Markings
In the proposed approach, we used YOLOv8 to detect yellow lanes frame by frame from the video. In any frame where a continuous lane is detected, we applied YOLOPv2 followed by post-processing to identify the overtaking lane. If the segmentations intersect, we retain the detection; otherwise, we discard it.
We applied the orthogonal distance regression model [43] to calculate the slope of the lane. Positive slope values suggest that the driver is on the left side of the lane, which could indicate an irregularity. However, we cannot draw any conclusions from a single frame. To address this, we stored the data from each frame in a Pandas DataFrame, allowing us to perform validation based on a broader range of information.
After processing the entire video and generating the complete DataFrame, we apply the following operations to reduce potential noise and improve the accuracy in correctly identifying overtaking irregularities:
- (a)
Removal of short sequences: a positive slope of detected lanes must be present for a minimum sequence of 50 frames. If the sequence length is less than this value, it is not considered an overtaking maneuver;
- (b)
Connecting nearby sequences: If the gap between two sequences is less than 20 frames, we connect them. This decision is based on the understanding that short gaps may be due to noise or missed detections of certain lanes;
- (c)
Assignment of the most common class using a sliding window: To accurately identify the lane in which the driver is during an overtaking maneuver, we determine the predominant class within a sliding window of 20 frames. This approach smooths detections and mitigates temporary fluctuations, ensuring a more stable and precise representation of the lane where overtaking occurs.
We empirically determined the frame quantities used in the operations described above. It is important to note that the videos used in our experiments have a frame rate of 30 frames per second (FPS). After filtering the data, we can identify the initial and final moments of overtaking in continuous lanes, ensuring a more accurate analysis by relying on temporal information rather than individual frames alone.
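The sketch below illustrates the three operations on the per-frame DataFrame; the column names (positive_slope, lane_class) are hypothetical, and lane_class is assumed to be integer-encoded.

```python
import pandas as pd

MIN_RUN, MAX_GAP, WINDOW = 50, 20, 20            # frame thresholds used above (30 FPS video)

def runs(flags):
    """Yield (start, end) frame-index pairs of consecutive True values."""
    start = None
    for i, v in enumerate(flags):
        if v and start is None:
            start = i
        elif not v and start is not None:
            yield start, i - 1
            start = None
    if start is not None:
        yield start, len(flags) - 1

def temporal_filter(df):
    sequences = list(runs(df["positive_slope"].tolist()))

    # (a) remove sequences shorter than MIN_RUN frames
    sequences = [(s, e) for s, e in sequences if e - s + 1 >= MIN_RUN]

    # (b) connect sequences separated by fewer than MAX_GAP frames
    events = []
    for s, e in sequences:
        if events and s - events[-1][1] < MAX_GAP:
            events[-1] = (events[-1][0], e)
        else:
            events.append((s, e))

    # (c) most common lane class within a sliding window of WINDOW frames
    df = df.copy()
    df["lane_class_smooth"] = (
        df["lane_class"]
        .rolling(WINDOW, min_periods=1)
        .apply(lambda w: w.mode().iloc[0])
    )
    return events, df
```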
3.5. Evaluation Metrics
We used precision, recall, F1-score, and Mean Average Precision (mAP) to quantitatively assess the model’s performance. To clarify these metrics, the key parameters are defined as follows: True positive (TP) refers to instances where the model correctly identifies a positive case. False positive (FP) occurs when the model incorrectly identifies a case as positive. False negative (FN) represents instances where the model fails to detect a positive case, instead classifying it as negative.
Precision (P) represents the proportion of TPs among all cases predicted as positive. In simpler terms, it measures how many of the model's positive predictions are correct. Equation (1) defines the formula for calculating precision:
$P = \frac{TP}{TP + FP}$ (1)
Recall (R) measures the proportion of TPs among all actual positive instances. It evaluates how well the model identifies all relevant positive cases. Recall is calculated using Equation (2):
$R = \frac{TP}{TP + FN}$ (2)
The F1-score (F) combines precision (P) and recall (R) into a single metric and is calculated as shown in Equation (3):
$F = \frac{2 \cdot P \cdot R}{P + R}$ (3)
The Average Precision (AP), also applied in [44], measures a model's performance for a specific category by building a precision–recall curve. This curve is created by plotting precision values (y-axis) against recall values (x-axis) at various confidence thresholds. The AP is calculated as the area under this curve, summarizing how well the model identifies that category across all thresholds. Mathematically, Equation (4) expresses this relationship:
$AP = \sum_{k=1}^{K} \left[ R(k) - R(k-1) \right] P(k)$ (4)
where $P(k)$ is the precision at threshold $k$, $R(k)$ is the recall at threshold $k$, and $K$ is the number of thresholds.
Mean Average Precision (mAP) is the average of the AP values across multiple categories. It provides a comprehensive measure of the model's overall accuracy and effectiveness in object recognition by summarizing performance across various classes and thresholds. The mAP is computed as in Equation (5):
$mAP = \frac{1}{n} \sum_{k=1}^{n} AP_k$ (5)
where $n$ is the number of categories and $AP_k$ is the AP for the $k$-th category. For our case, $n = 5$, corresponding to the five types of lanes.
In the results, we use the symbols mAP50 and mAP50-95, where mAP50 computes mAP at an Intersection over Union (IoU) threshold of 50%, and mAP50-95 averages mAP over IoU thresholds ranging from 50% to 95%, giving detailed information on performance at varying levels of localization strictness.
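For completeness, the helpers below sketch how these quantities are computed from TP/FP/FN counts and per-class AP values; the numbers in the example are illustrative and are not results from this study.

```python
def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0

def f1_score(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

def mean_average_precision(per_class_ap):
    # Average of the per-class AP values (n = 5 lane classes in this study)
    return sum(per_class_ap) / len(per_class_ap)

p, r = precision(90, 10), recall(90, 5)
print(round(p, 3), round(r, 3), round(f1_score(p, r), 3))        # 0.9 0.947 0.923
print(mean_average_precision([0.91, 0.88, 0.95, 0.90, 0.76]))    # ≈ 0.88
```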
4. Results
This section presents the experimental results, which evaluate the effectiveness of different models and the detection of illegal overtaking with the proposed approach. We carried out the tests on an NVIDIA GeForce RTX 2070 SUPER with 8 GB of VRAM.
4.1. Yellow Line Detection
In this section, we assessed the performance of various instance segmentation models using pre-trained YOLOv8.
Table 2 presents the results achieved with the best hyperparameters, as detailed in Section 3.3.1.
The results in Table 2 show that the YOLOv8m model performs well overall in detecting and segmenting most types of yellow lanes, but there are notable differences in its effectiveness across classes.
The model demonstrates a high precision, recall, and Mean Average Precision (mAP) for the DdSL and DdLDS classes, indicating its effectiveness in detecting these lane types, which exhibit distinct visual patterns and are sufficiently represented in the dataset. The DdLDS class, in particular, achieves perfect precision in mask segmentation, reflecting a very low rate of FPs.
On the other hand, the SSL class exhibits the lowest performance, with lower precision, recall, and mAP values. The subtle appearance and underrepresentation of these lanes in the dataset suggest challenges in detecting and segmenting them. To improve the model’s performance for this class, increasing the number of SSL instances in the training data or adjusting hyperparameters could be effective solutions.
The model shows moderate results for the SDL and DdLSD classes, achieving reasonable precision and recall; however, it achieves slightly lower mAP scores at more challenging Intersection over Union (IoU) thresholds (mAP50-95).
The results demonstrate the model’s promising performance in detecting more common lane types, such as DdSL and DdLDS, with high precision and recall. However, the lower scores for SSL indicate potential for improvement. Further fine-tuning or incorporating more diverse training data could help the model better handle variations in lane appearance and challenging environmental conditions.
4.2. Experiments
We used YOLOv8 to segment yellow road lines (Experiment 1) and determine whether drivers can overtake. We trained YOLOv8 to recognize five types of lane markings: SDL, SSL, DdSL, DdLSD, and DdLDS. SDL and DdLSD lane markings allow overtaking, while SSL, DdSL, and DdLDS markings prohibit overtaking and indicate illegal maneuvers.
Figure 6 shows an example of detection for each possible yellow line on the road or highway. The images on the left display the original frames, and those on the right show the detection overlaid on the corresponding frames.
However, we noticed that YOLOv8 sometimes mistakenly identifies white markings and yellow boxes as single solid yellow lines. It also classifies white roadside markings as either double or single solid lanes. Additionally, the system often misinterprets parking markings as single solid lanes. As a result, in these situations—where the detected marking is to the driver's right—the system may incorrectly report an illegal overtaking maneuver.
Figure 7 illustrates some of these cases.
To minimize these errors, we evaluated a second approach: Experiment 2. In this approach, we applied a filter when YOLOv8 detects a road marking that could suggest an illegal overtaking opportunity, such as the SSL, DdSL, or DdLDS types. This filter uses YOLOPv2 for lane detection and incorporates additional post-processing, as detailed in Section 3.3.2. After identifying the lane involved in overtaking, we check for overlap with the segment detected by YOLOv8. If an overlap exists, we retain the segmentation; if not, we discard it, signaling a potential detection error.
Figure 8 presents an example of illegal overtaking. On the left, our approach (YOLOPv2 with post-processing) successfully identifies the overtaking lane, highlighting it in green over the original image, with the YOLOv8 detection shown in red. On the right, however, a bus occludes the lane marking that defines the highway boundary, preventing YOLOPv2 from detecting it. As a result, the post-processing algorithm fails to determine the correct overtaking lane, as the calculated distance between the edges of the lane and the highway boundary is the same in this case. Despite this limitation, the overall results remain largely unaffected, and lane filtering still significantly reduces the number of FPs.
The implementation of the overtaking lane detection method notably reduces the FPs produced by YOLOv8. However, some FP cases persist, especially in areas that lack continuous lane markings, such as residential zones with predominantly white street markings or rural areas without lane markings. As a result, we decided to focus our analysis exclusively on instances when the vehicle travels on highways or major roads.
Given this, we conducted Experiment 3, where we first checked the geolocation information. If the driver is on a two-way highway (under the conditions described in Section 3.2), we extracted the intervals of interest. We then applied the processing described in Experiment 2 exclusively to these intervals. This initial data filtering significantly optimizes the processing time and improves the accuracy of detecting violations related to illegal overtaking.
An important point is that the current research utilized real traffic videos provided by CEMIG. Our dataset comprised videos of actual routes recorded in various environments, including residential areas, single- and dual-direction highways, and rural areas, captured at different times of day and under varying weather conditions using in-vehicle cameras in cars and motorcycles. Since the CEMIG dataset contained only a limited number of infractions, we had to conduct additional simulations to assess our method’s effectiveness in scenarios with more violations. As a result, we simulated cases involving traffic violations, such as overtaking in continuous lanes, to assess our approach.
Table 3 presents the distribution of these videos between simulated infractions (conducted in real scenarios and intentionally executed) and routine videos from CEMIG, along with the number of infractions and the total hours analyzed for each case.
Table 4 presents the results from the three experiments conducted with this dataset. The results demonstrate progressive improvements in lane detection by addressing key challenges related to FPs and contextual accuracy.
The analysis of Experiment 1 reveals several challenges faced by YOLOv8 in detecting yellow lane markings across different datasets. The model struggled with various issues: the complexity and variability of lane markings in residential areas, false detections in rural settings, and misclassifications on single- and dual-lane roads. Factors such as motion blur due to vehicle movement, camera instability caused by road conditions, varying lighting, and the poor quality of lane markings further complicate detection.
For the CEMIG data, which represent a diverse range of real-world scenarios, the precision was notably low, at 0.312. Despite achieving a high recall of 1.000, the F1-score was relatively low, at 0.476, indicating that although the model successfully identifies nearly all true lane markings (high recall), it also includes many FPs (low precision), resulting in a lower F1-score. In contrast, the simulated dataset, which also includes real-world scenarios but is predominantly composed of highway scenarios, achieved a higher precision of 0.600 and a better F1-score of 0.750. This comparison highlights that YOLOv8 performs better in scenarios with less variability, such as highways, but may face challenges in more complex environments where the diversity of objects with characteristics similar to lane markings increases the likelihood of confusing the model.
Experiment 2 implemented a filtering mechanism that combined YOLOPv2 with post-processing to verify if the detected lane markings were correctly identified as overtaking lanes. This method notably enhanced the precision, achieving a score of 0.714 for the CEMIG dataset. However, the recall for the simulation dataset saw a slight reduction to 0.917. This decrease was due to an FN, where a lane detected by YOLOv8 was not matched with the overtaking lane identified by YOLOPv2 and was therefore excluded. Despite this, the overall F1-score improved to 0.889, indicating a more effective balance between accurately identifying true violations and reducing false detections. Lane type matching played a crucial role in refining the results, although challenges remained in accurately identifying lanes in more complex scenarios.
Experiment 3 improved the methodology by incorporating geolocation data and focusing detection on two-way highways, where overtaking lanes are relevant. This contextual filtering effectively reduced FPs from residential, rural, and single-direction roads, improving the precision to 1.000 across both datasets. The overall F1-score increased to 0.970, indicating that integrating the geographic context significantly enhanced the model’s accuracy in detecting relevant scenarios and reduced false detections.
Figure 9 presents a few examples of correct detections of illegal overtaking, illustrating the performance under favorable conditions and challenging environments, such as poor lighting or worn lane markings.
5. Discussion
Detecting illegal overtaking poses a significant challenge in computer vision, especially when using basic smartphone cameras. The difficulty increases under adverse weather conditions and low-light environments. In this study, we propose a method to identify illegal overtaking by analyzing videos recorded using smartphones and combining them with geolocation data. The approach employs two YOLO models: one for detecting lane types and another for identifying overtaking lanes. To enhance accuracy, we applied several post-processing techniques. By integrating detection areas, the system analyzes temporal patterns to determine whether the driver is in an illegal overtaking zone.
We conducted three distinct experiments using our method to analyze the impact of the proposed techniques on the achieved results. The progression through the experiments highlights how each step addressed specific limitations of the previous approach. Experiment 1 faced challenges due to diverse environments, which we addressed by introducing lane-type filtering in Experiment 2. We further enhanced the method by adding geolocation context in Experiment 3. These advancements demonstrate a clear improvement in detecting relevant lane markings and reducing false positives.
In the study proposed by [18], the authors achieved precision and recall values of 0.900 and 0.928, respectively. When comparing these with the results from our study, we observe that in Experiments 2 and 3, we achieved superior precision and recall values. In Experiment 3, we achieved a precision of 1.000 and a recall of 0.941 across the entire dataset.
Similarly, when examining the study by [29], which reported a precision of 0.920 and a recall of 0.890, Experiment 3 also demonstrates superiority in both metrics, culminating in an F1-score of 0.970 and thus exceeding the results obtained by [29].
Regarding the use of sensors and cameras, works utilizing monocular cameras and combined sensors, such as those by [15,16], highlight significant challenges in scenarios with considerable variability. Night-time conditions often lead to a loss of color information, while adverse weather can significantly impair sensor performance, complicating overtaking detection. We encounter similar challenges, particularly regarding lane changes in residential or rural areas. The results from Experiment 1, specifically within the CEMIG dataset, yielded a precision of 0.312, reflecting the difficulties experienced in these conditions. Additionally, the quality and quantity of training data heavily influence model effectiveness, making dataset collection a substantial challenge, as noted by [27].
While recent advancements have made strides in illegal overtaking detection and lane recognition, several challenges persist. Many models focus on optimizing performance for ideal conditions with good visibility and favorable weather, which limits their effectiveness in adverse situations, such as low light or poor weather, particularly with smartphone cameras [28,29]. Furthermore, limited training data reduces a model's ability to generalize across different environments, camera positions, and driver behaviors. This limitation can significantly affect performance in various traffic scenarios [27].
Another critical gap is the integration of multiple sensors to improve environmental perception. While some studies explore combining cameras with LiDAR [31], others suggest incorporating additional sensors like radar, sonar, and RGB-D data to enhance overtaking detection, especially in adverse conditions [28]. Despite promising advancements in the models of [16,27], the real-time analysis of vehicle behavior, such as speed and proximity, remains a challenge.
In addition to the challenges posed by varying environmental conditions, we also encountered difficulties in differentiating between the colors of the lines, which could be either yellow or white. Initially, we annotated both colors with all marking types (SDL, SSL, DdSL, DdLSD, and DdLDS) and trained the model. However, the model struggled to correctly differentiate these markings, likely due to factors like lighting conditions during capture and the influence of the surrounding terrain. For example, the reddish soil near the road, along with vehicle flow, could distort the perception of lane boundaries, even for human observers. In Brazil, white lanes typically mark the boundaries between the highway and the roadside, separate lanes in residential areas, or delineate lanes on one-way highways. To address this issue, we focused solely on yellow lanes for training, as they provided a more distinct differentiation given the specific conditions of our dataset.
Our method addresses these gaps by proposing a more accessible solution using smartphone cameras, eliminating the need for depth sensing or additional sensors. While more challenging for detection, this solution is more scalable and cost-effective, offering significant precision in real-world traffic environments. Despite the challenges posed by adverse conditions, our method maintains high accuracy in detecting illegal overtaking violations, demonstrating its robustness and potential for broader applicability.
To enhance the model's effectiveness across varied road conditions, countries, and traffic rules, future refinement should focus on improving accuracy without excessive reliance on filtering techniques. Furthermore, to ensure broader applicability, incorporating images with white lane markings, particularly from regions where white lanes indicate overtaking zones, would allow the model to generalize more effectively across different scenarios.
6. Conclusions
In conclusion, this study demonstrates significant advancements in lane change detection, offering practical implications for vehicle auditing and road safety. By identifying overtaking violations, our method provides valuable insights into driver behavior, supporting targeted interventions to improve adherence to traffic rules and enhance road safety.
The literature review revealed that previous methods often relied on complex and costly sensor setups, such as LiDAR, radar, and RGB-D data, or focused solely on lane detection without categorization. While effective in test environments, these approaches faced scalability challenges due to high costs and infrastructure requirements. In contrast, our method leverages low-cost smartphone cameras, eliminating the need for depth sensing or additional sensors. Despite the increased detection complexity, this approach remains scalable and cost-effective, achieving high accuracy in lane change detection.
The developed method effectively detects illegal overtaking; however, it is most applicable in countries where yellow markings indicate overtaking lanes. It also performs well on highways, whereas urban environments, with their complexity, present additional challenges and increase the likelihood of false positives. A further aspect to consider is the dataset's variability in weather conditions and times of day; however, accuracy on night-time videos could not be assessed due to the lack of data under such conditions, as our data source does not include this information.
For future work, we would like to highlight the following areas for improvement:
1. Incorporating night-time footage and challenging weather conditions, such as snow, into the training dataset to improve the model's robustness.
2. Expanding the training dataset to include global data and, specifically, images of highways worldwide, covering different overtaking lane marking colors (including both yellow and white markings).
3. Training the models by incorporating the data from items 1 and 2. Additionally, we may need to adjust the system to handle these new data, and we should collect more violation videos, particularly those depicting traffic violations under various climatic conditions, lighting scenarios, and at night.
4. Modifying the method to enable real-time processing, allowing integration with in-vehicle systems that provide immediate alerts to drivers about overtaking violations.
The results presented in this paper and the proposed directions for future work highlight the role of the developed system integrated with vehicle auditing systems in improving driving behavior and road safety. This system has the potential to help reduce traffic violations and, consequently, decrease the risk of accidents on the road.