Meta-YOLOv8: Meta-Learning-Enhanced YOLOv8 for Precise Traffic Light Color Detection in ADAS
Abstract
1. Introduction
- Safety: Accurate traffic light detection for ADAS is critical to ensuring the safety of passengers, pedestrians, and other vehicles. For an autonomous vehicle (AV), the detection of stop and proceed lights is essential to obey traffic laws and avoid accidents. This task is particularly challenging due to the varying weather and lighting conditions that require precise identification of traffic light colors.
- ADAS real-time decision making and navigation: The ability of ADAS to recognize and interpret traffic lights in real time is of paramount importance for the prompt decision-making processes that are necessary for adjusting speed and navigating around detours, thus ensuring the smooth and predictable operation of the vehicle.
- Traffic flow optimization: Data obtained from the detection of traffic lights can be seamlessly integrated into smart city infrastructure to optimize traffic flow. This data plays a crucial role in developing adaptive traffic light control systems, which are essential for reducing road congestion and enhancing traffic efficiency.
- Human–machine interface (HMI) improvement: The implementation of traffic light detection technology is key for enhancing the human–machine interface (HMI) in semi-autonomous vehicles. By providing drivers with accurate, timely information about road conditions, it promotes safer and more efficient driving. This decision-making support not only improves safety but also reduces cognitive load on drivers, ensuring a more comfortable and efficient driving experience.
- Autonomy under diverse conditions: The capacity to detect and interpret traffic lights in a variety of environmental contexts represents a key indicator of the level of autonomy that an ADAS can achieve [2].
Enhanced Adaptability with Meta-Learning in YOLOv8
2. Related Work
- Single Shot Multibox Detector (SSD): The SSD [14] method is distinguished by its high processing speed, which is achieved through a single-shot approach that obviates the necessity for a separate region proposal network. By employing a set of default bounding boxes and aspect ratios, SSD is able to predict the presence of objects at multiple scales, thereby facilitating the detection of objects of varying sizes within an image. However, it should be noted that SSD may encounter difficulties in detecting very small objects, and extensive data augmentation may be necessary to achieve the desired level of robustness. Moreover, SSD's performance is contingent upon the extent and diversity of the training data employed, which is necessary for the effective discernment of diverse object scales and aspect ratios [15].
- You Only Look Once (YOLOv8): The YOLO family, particularly the developments observed in YOLOv5 and YOLOv8 [16,17], has the capacity for real-time object detection with a high degree of accuracy. These models adopt a comprehensive approach to image processing, simultaneously predicting bounding boxes and class probabilities in a single evaluation. This approach markedly diminishes the requisite inference time, rendering it well suited to applications that necessitate real-time analysis. One of the principal advantages of the more recent iterations, such as YOLOv8, is the enhancement in the ability to recognize small objects and the improvement in generalization across different datasets. These advancements have been made possible by architectural innovations and rigorous training regimes. Nevertheless, it should be noted that YOLO models may still be susceptible to challenges posed by occluded or overlapping objects. Moreover, while these models have reduced their data requirements through enhanced architectures, they continue to benefit considerably from the availability of extensive annotated datasets to optimize their detection capabilities [18].
- Faster R-CNN: Faster R-CNN [19] is a pioneering model in the region-based convolutional neural network (CNN) family, and offers a distinctive combination of accuracy and comprehensiveness. A region proposal network (RPN) is employed to hypothesize object locations, with these predictions then refined by a Fast R-CNN detector. Although this two-stage process is more computationally intensive, it offers high precision and recall rates, which are particularly useful in scenarios where accuracy is critical. One limitation of Faster R-CNN is its relatively slow processing speed, which makes it less suitable for real-time detection tasks. Furthermore, the model requires substantial data inputs to effectively train both the RPN and the detector, making it a data-intensive model [20].
- Detection Transformers (DETR): DETR [13] introduced an end-to-end object detection framework that employs Transformers, an architectural approach that has demonstrated considerable success in the field of natural language processing. DETR circumvents the need for numerous manually designed components by learning to perform object detection as a direct set prediction problem. While it benefits from Transformers’ capacity to attend to global contexts within an image, DETR typically necessitates longer training periods and larger datasets to achieve optimal performance levels. Furthermore, DETR encounters difficulties in the detection of small objects due to the global nature of attention mechanisms. Nevertheless, it provides a promising avenue for adaptability due to its flexible architecture that is not constrained by preset anchor boxes or proposals [21].
- Tiny YOLOv4: Tiny YOLOv4 [22] is a streamlined version of the YOLO object detection model. It has been designed to be faster and more efficient, particularly on edge devices with limited computational resources. The model maintains an optimal balance between speed and accuracy by employing a reduced number of layers and parameters in comparison to the full YOLOv4 model. Tiny YOLOv4 is particularly effective for applications requiring real-time processing, such as traffic light color detection, where it can quickly identify and classify objects with relatively low latency. However, it may not consistently attain the same degree of accuracy as more sophisticated models that employ advanced architectures and learning strategies to enhance detection performance, particularly for smaller and densely packed objects.
3. Our Proposal
3.1. Meta-YOLOV8 Architecture
3.1.1. CBS (Convolutions, Batch Normalization, and SiLU)
3.1.2. Batch Normalization
3.1.3. SiLU (Sigmoid Linear Unit)
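As a reference for the activation named in this subsection, SiLU multiplies the input by its sigmoid. A generic definition (a standard formulation, not code from the paper) can be sketched in a few lines:

```python
import math

def silu(x: float) -> float:
    # SiLU (also called Swish): x * sigmoid(x) = x / (1 + e^(-x)).
    # Smooth everywhere and slightly negative for x < 0, unlike ReLU.
    return x / (1.0 + math.exp(-x))
```

Unlike ReLU, SiLU is differentiable at zero and lets small negative signals pass, which is one reason recent YOLO variants adopt it as the default activation.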
3.1.4. Spatial Pyramid Pooling Fast (SPPF)
3.1.5. Detection Block
3.2. Meta-Learner
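Optimization-based meta-learning of the kind discussed here typically alternates a task-specific inner update with a meta (outer) update over tasks, in the spirit of MAML [7]. The following first-order sketch on a toy one-dimensional quadratic is purely illustrative: the loss, the targets, and the learning rates `inner_lr`/`outer_lr` are assumptions for the example, not the authors' actual procedure:

```python
def grad(w: float, target: float) -> float:
    # Gradient of the toy per-task loss (w - target)^2.
    return 2.0 * (w - target)

def maml_step(w: float, tasks, inner_lr: float = 0.1, outer_lr: float = 0.05) -> float:
    # One meta-update: adapt to each task with a single inner gradient step,
    # then average the post-adaptation gradients (first-order approximation).
    meta_grad = 0.0
    for target in tasks:
        w_adapted = w - inner_lr * grad(w, target)   # inner-loop adaptation
        meta_grad += grad(w_adapted, target)          # evaluate after adaptation
    return w - outer_lr * meta_grad / len(tasks)

# The meta-parameter drifts toward an initialization that adapts
# quickly to both tasks (here, the midpoint of the two targets).
w = 0.0
for _ in range(100):
    w = maml_step(w, tasks=[1.0, 3.0])
```

The design point this illustrates is that the outer loop does not minimize any single task's loss; it optimizes the initialization from which one inner step works well across tasks.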
4. Methods
4.1. Data
4.2. Data Preprocessing
- Data cleaning: Corrupted and irrelevant images (such as those that were blurred, improperly exposed, or did not contain any traffic lights) were removed.
- Image resizing: To maintain consistency with the training model and to reduce computational load, images were resized to a standard dimension while preserving their aspect ratio. This uniformity is necessary for batch processing during model training.
- Normalization: Pixel values in the images were normalized to have a mean of zero and a standard deviation of one. This step is critical for helping the model’s convergence during training and improving its generalization abilities.
- Augmentation: Techniques such as random rotations, flipping, scaling, and cropping were applied to artificially expand the dataset (for some images only). This not only helps in preventing overfitting but also ensures the model is invariant to common variations in the real world.
- Color space conversion: Considering the importance of color in traffic light detection, images were converted into different color spaces such as HSV (hue, saturation, value) or LAB, which might be more effective in highlighting traffic lights under various lighting conditions.
- Contrast adjustment: Histogram equalization was used on the images to enhance contrast, ensuring that traffic lights were distinguishable even under sub-optimal lighting conditions.
- Noise reduction: To improve image quality, noise reduction techniques such as Gaussian blurring or median filtering were utilized to smooth out the images, reducing the impact of sensor noise or compression artifacts.
- Edge enhancement: Edge detection filters (e.g., Sobel, Canny) were applied to some images to accentuate the borders of traffic lights, which can aid the model in identifying these objects against complex backgrounds.
- Region of interest (ROI) extraction: In some cases, ROIs were defined to focus the model’s attention on specific areas where traffic lights are likely to be found, thereby reducing the computational complexity and improving detection performance.
- Data splitting: The dataset was randomly split into training, validation, and testing sets. This ensures that there is no data leakage, and the model’s performance can be accurately evaluated.
- Class balancing: To prevent model bias towards over-represented classes, techniques such as over-sampling the minority class or under-sampling the majority class were applied to balance the dataset.
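Two of the steps above, per-image normalization and RGB-to-HSV conversion, can be sketched with standard-library tools only. This is an illustration of the arithmetic on flat pixel lists; the actual pipeline presumably operates on full image arrays with a library such as OpenCV or NumPy:

```python
import colorsys

def normalize(pixels):
    # Zero-mean, unit-variance normalization of a flat list of pixel values,
    # matching the "mean of zero and standard deviation of one" step above.
    n = len(pixels)
    mean = sum(pixels) / n
    std = (sum((p - mean) ** 2 for p in pixels) / n) ** 0.5 or 1.0
    return [(p - mean) / std for p in pixels]

def to_hsv(r, g, b):
    # RGB in [0, 255] -> HSV with each channel in [0, 1]. Hue decouples the
    # traffic-light color from brightness, which helps under varying lighting.
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
```

For example, a two-pixel image `[0, 255]` normalizes to `[-1.0, 1.0]`, and a pure-red pixel maps to hue 0 with full saturation and value.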
4.3. Labeling Methods
- By considering a reduced set of features, the meta-model’s learning process becomes more efficient. This targeted approach helps the model better differentiate between objects and their unique attributes.
- Eliminating unnecessary features simplifies the meta-model, making it easier to interpret and maintain. Additionally, this simplification can lead to faster inference times and reduced computational resource usage, resulting in lower latency.
- The simplified and focused model can operate effectively in harsh weather conditions, which presents a significant challenge for traditional models trained with conventional labeling data.
4.4. Evaluation Metrics
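Detection metrics such as mAP build on the intersection-over-union (IoU) between predicted and ground-truth boxes. A minimal generic helper, assuming corner-format boxes `(x1, y1, x2, y2)` (an assumed convention for this illustration):

```python
def iou(a, b):
    # Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # zero when boxes are disjoint
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold (commonly 0.5); mAP then averages the precision over recall levels and classes.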
4.5. Experiment Setup
4.6. Training Process
5. Results and Discussion
5.1. Meta-YOLOv8 Comparison with Base Model (YOLOV8)
5.2. Model Adaptability
5.3. Meta-YOLOv8 vs. Other Existing Methods
5.3.1. FPS Comparison
5.3.2. Mean Average Precision (mAP)
5.3.3. Test Accuracy
5.3.4. FLOPS and Parameters
6. Conclusions
Limitations
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Mogelmose, A.; Trivedi, M.M.; Moeslund, T.B. Vision-based traffic sign detection and analysis for intelligent driver assistance systems: Perspectives and survey. IEEE Trans. Intell. Transp. Syst. 2012, 13, 1484–1497. [Google Scholar] [CrossRef]
- Zhai, C.; Li, K.; Zhang, R.; Peng, T.; Zong, C. Phase diagram in multi-phase heterogeneous traffic flow model integrating the perceptual range difference under human-driven and connected vehicles environment. Chaos Solitons Fractals 2024, 182, 114791. [Google Scholar] [CrossRef]
- Navarro Lafuente, A. Business Modelling of 5G-Based Drone-as-a-Service Solution. Master’s Thesis, Universitat Politècnica de Catalunya, Barcelona, Spain, 2024. [Google Scholar]
- Jain, A.; Mishra, A.; Shukla, A.; Tiwari, R. A novel genetically optimized convolutional neural network for traffic sign recognition: A new benchmark on Belgium and Chinese traffic sign datasets. Neural Process. Lett. 2019, 50, 3019–3043. [Google Scholar] [CrossRef]
- Gautam, S.; Kumar, A. Image-based automatic traffic lights detection system for autonomous cars: A review. Multimed. Tools Appl. 2023, 82, 26135–26182. [Google Scholar] [CrossRef]
- Varghese, R.; Sambath, M. YOLOv8: A Novel Object Detection Algorithm with Enhanced Performance and Robustness. In Proceedings of the 2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS), Chennai, India, 18–19 April 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–6. [Google Scholar]
- Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; PMLR: Proceedings of Machine Learning Research: New York, NY, USA, 2017; pp. 1126–1135. [Google Scholar]
- Beyaz, A.; Gerdan, D. Meta-learning-based prediction of different corn cultivars from color feature extraction. J. Agric. Sci. 2021, 27, 32–41. [Google Scholar]
- Binangkit, J.L.; Widyantoro, D.H. Increasing accuracy of traffic light color detection and recognition using machine learning. In Proceedings of the 2016 10th International Conference on Telecommunication Systems Services and Applications (TSSA), Denpasar, Indonesia, 6–7 October 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–5. [Google Scholar]
- Pandharkar, M.; Raoundale, P. A Systematic Study of Approaches used to Address the Long Tail Problem. In Proceedings of the 2023 10th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 15–17 March 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 430–437. [Google Scholar]
- Chen, Q.; Dai, Z.; Xu, Y.; Gao, Y. CTM-YOLOv8n: A Lightweight Pedestrian Traffic-Sign Detection and Recognition Model with Advanced Optimization. World Electr. Veh. J. 2024, 15, 285. [Google Scholar] [CrossRef]
- Müller, J.; Dietmayer, K. Detecting traffic lights by single shot detection. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 266–273. [Google Scholar]
- Almagambetov, A.; Velipasalar, S.; Baitassova, A. Mobile standards-based traffic light detection in assistive devices for individuals with color-vision deficiency. IEEE Trans. Intell. Transp. Syst. 2014, 16, 1305–1320. [Google Scholar] [CrossRef]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [Google Scholar]
- You, S.; Bi, Q.; Ji, Y.; Liu, S.; Feng, Y.; Wu, F. Traffic sign detection method based on improved SSD. Information 2020, 11, 475. [Google Scholar] [CrossRef]
- Tian, Y.; Yang, G.; Wang, Z.; Wang, H.; Li, E.; Liang, Z. Apple detection during different growth stages in orchards using the improved YOLO-V3 model. Comput. Electron. Agric. 2019, 157, 417–426. [Google Scholar] [CrossRef]
- Safaldin, M.; Zaghden, N.; Mejdoub, M. An Improved YOLOv8 to Detect Moving Objects. IEEE Access 2024, 12, 59782–59806. [Google Scholar] [CrossRef]
- Zaatouri, K.; Ezzedine, T. A self-adaptive traffic light control system based on YOLO. In Proceedings of the 2018 International Conference on Internet of Things, Embedded Systems and Communications (IINTEC), Hamammet, Tunisia, 20–21 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 16–19. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. In Proceedings of the 29th Advances in Neural Information Processing Systems (NeurIPS), Montreal, QC, Canada, 7–12 December 2015; Volume 2. [Google Scholar]
- Gavrilescu, R.; Zet, C.; Foșalău, C.; Skoczylas, M.; Cotovanu, D. Faster R-CNN: An approach to real-time object detection. In Proceedings of the 2018 International Conference and Exposition on Electrical And Power Engineering (EPE), Iasi, Romania, 18–19 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 165–168. [Google Scholar]
- Chuang, C.H.; Lee, C.C.; Lo, J.H.; Fan, K.C. Traffic Light Detection by Integrating Feature Fusion and Attention Mechanism. Electronics 2023, 12, 3727. [Google Scholar] [CrossRef]
- Jiang, Z.; Zhao, L.; Li, S.; Jia, Y. Real-time object detection method based on improved YOLOv4-tiny. arXiv 2020, arXiv:2011.04244. [Google Scholar]
- Arnold, S.M.; Mahajan, P.; Datta, D.; Bunner, I.; Zarkias, K.S. learn2learn: A library for meta-learning research. arXiv 2020, arXiv:2008.12284. [Google Scholar]
- Ren, X.; Zhang, W.; Wu, M.; Li, C.; Wang, X. Meta-yolo: Meta-learning for few-shot traffic sign detection via decoupling dependencies. Appl. Sci. 2022, 12, 5543. [Google Scholar] [CrossRef]
- Shmelkov, K.; Schmid, C.; Alahari, K. Incremental learning of object detectors without catastrophic forgetting. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 3400–3409. [Google Scholar]
- Flores-Calero, M.; Astudillo, C.A.; Guevara, D.; Maza, J.; Lita, B.S.; Defaz, B.; Ante, J.S.; Zabala-Blanco, D.; Armingol Moreno, J.M. Traffic sign detection and recognition using YOLO object detection algorithm: A systematic review. Mathematics 2024, 12, 297. [Google Scholar] [CrossRef]
- Karim, M.J.; Nahiduzzaman, M.; Ahsan, M.; Haider, J. Development of an Early Detection and Automatic Targeting System for Cotton Weeds using an Improved Lightweight YOLOv8 Architecture on an Edge Device. Knowl.-Based Syst. 2024, 300, 112204. [Google Scholar] [CrossRef]
- Finn, C.B. Learning to Learn with Gradients; University of California: Berkeley, CA, USA, 2018. [Google Scholar]
- Tammisetti, V.; Bierzynski, K.; Stettinger, G.; Morales-Santos, D.P.; Cuellar, M.P.; Molina-Solana, M. LaANIL: ANIL with Look-Ahead Meta-Optimization and Data Parallelism. Electronics 2024, 13, 1585. [Google Scholar] [CrossRef]
- Starck, J.L.; Murtagh, F.; Bijaoui, A. Image Processing and Data Analysis: The Multiscale Approach; Cambridge University Press: Cambridge, UK, 1998. [Google Scholar]
- Vandaele, R.; Nervo, G.A.; Gevaert, O. Topological image modification for object detection and topological image processing of skin lesions. Sci. Rep. 2020, 10, 21061. [Google Scholar] [CrossRef]
- Rädsch, T.; Reinke, A.; Weru, V.; Tizabi, M.D.; Schreck, N.; Kavur, A.E.; Pekdemir, B.; Roß, T.; Kopp-Schneider, A.; Maier-Hein, L. Labelling instructions matter in biomedical image analysis. Nat. Mach. Intell. 2023, 5, 273–283. [Google Scholar] [CrossRef]
- Li, R.; Cao, W.; Wu, S.; Wong, H.S. Generating target image-label pairs for unsupervised domain adaptation. IEEE Trans. Image Process. 2020, 29, 7997–8011. [Google Scholar] [CrossRef]
- Badrinarayanan, V.; Handa, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for robust semantic pixel-wise labelling. arXiv 2015, arXiv:1505.07293. [Google Scholar]
- Lee, Y.; Hwang, J.w.; Lee, S.; Bae, Y.; Park, J. An energy and GPU-computation efficient backbone network for real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–20 June 2019; IEEE: Piscataway, NJ, USA, 2019. [Google Scholar]
- Sutskever, I.; Martens, J.; Dahl, G.; Hinton, G. On the importance of initialization and momentum in deep learning. In Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013; pp. 1139–1147. [Google Scholar]
- Smith, L.N.; Topin, N. Super-convergence: Very fast training of neural networks using large learning rates. In Proceedings of the Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications, Baltimore, MD, USA, 10 May 2019; SPIE: Bellingham, WA, USA, 2019; Volume 11006, pp. 369–386. [Google Scholar]
- Sobrecueva, L. Automated Machine Learning with AutoKeras: Deep Learning Made Accessible for Everyone with Just Few Lines of Coding; Packt Publishing Ltd.: Birmingham, UK, 2021. [Google Scholar]
- Fu, K.; Zhang, T.; Zhang, Y.; Yan, M.; Chang, Z.; Zhang, Z.; Sun, X. Meta-SSD: Towards fast adaptation for few-shot object detection with meta-learning. IEEE Access 2019, 7, 77597–77606. [Google Scholar] [CrossRef]
- Wang, Y.X.; Ramanan, D.; Hebert, M. Meta-learning to detect rare objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 9925–9934. [Google Scholar]
- Yan, J.; Wang, H.; Yan, M.; Diao, W.; Sun, X.; Li, H. IoU-adaptive deformable R-CNN: Make full use of IoU for multi-class object detection in remote sensing imagery. Remote Sens. 2019, 11, 286. [Google Scholar] [CrossRef]
- Wang, S.; Zhang, Z.; Chao, Q.; Yu, T. AFE-YOLOv8: A Novel Object Detection Model for Unmanned Aerial Vehicle Scenes with Adaptive Feature Enhancement. Algorithms 2024, 17, 276. [Google Scholar] [CrossRef]
- Chabi Adjobo, E.; Sanda Mahama, A.T.; Gouton, P.; Tossa, J. Automatic localization of five relevant Dermoscopic structures based on YOLOv8 for diagnosis improvement. J. Imaging 2023, 9, 148. [Google Scholar] [CrossRef]
- Wang, G.; Luo, C.; Sun, X.; Xiong, Z.; Zeng, W. Tracking by instance detection: A meta-learning approach. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 6288–6297. [Google Scholar]
- Wang, N.; Gao, Y.; Chen, H.; Wang, P.; Tian, Z.; Shen, C.; Zhang, Y. NAS-FCOS: Fast neural architecture search for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 11943–11951. [Google Scholar]
| No. | Dataset | Important Features | Quality/Uncertainty in Data |
|---|---|---|---|
| 1 | KITTI | Long distance and edges of traffic lights | 90%/10% |
| 2 | Kaggle | Long distance and edges of traffic signal lights | 75%/30% |
| 3 | Carla Traffic Light Images | Colors of traffic signals in different weather conditions | 85%/20% |
| 4 | LISA Traffic Light Dataset | Long-distance view and edges of traffic signals | 80%/20% |
| 5 | Cityscapes | Traffic signals in different weather conditions | 85%/15% |
| 6 | EuroCity | Color and contrast of traffic signals | 90%/15% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tammisetti, V.; Stettinger, G.; Cuellar, M.P.; Molina-Solana, M. Meta-YOLOv8: Meta-Learning-Enhanced YOLOv8 for Precise Traffic Light Color Detection in ADAS. Electronics 2025, 14, 468. https://doi.org/10.3390/electronics14030468