Multi-Directional Long-Term Recurrent Convolutional Network for Road Situation Recognition
Abstract
1. Introduction
- Scenario detection and interpretation: Our model enhances the ability to accurately detect and interpret diverse road scenarios, despite challenges such as changing lighting conditions, perspective shifts, and environmental factors like shadows. This robustness is crucial for real-time road safety applications.
- Addressing specific road scenario challenges: Our research addresses the difficulty of recognizing road scenarios whose appearance varies in lighting, scale, blur, perspective angle, and contrast within the same class, while different classes can closely resemble one another. Advanced convolutional methods are employed to handle these challenges.
- Comprehensive model comparison: We provide a detailed comparative analysis of multiple models, including CNN2D, CNN3D, LSTM, LRCN, and our proposed model. By evaluating key performance metrics such as accuracy, precision, recall, and F1-score, our study offers valuable insights into the performance of each framework in the context of road scenario recognition.
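The four metrics named in the comparison above can all be derived from a confusion matrix. The following is a minimal sketch, assuming macro averaging over classes (the averaging scheme is an assumption here); the function name `macro_metrics` and the two-class toy matrix are hypothetical and are not data from this study.

```python
def macro_metrics(conf):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    conf[i][j] is the number of samples of true class i predicted as class j.
    Macro averaging is an assumption made for this sketch.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = conf[k][k]
        predicted = sum(conf[i][k] for i in range(n))  # column sum
        actual = sum(conf[k])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical two-class toy matrix, for illustration only.
toy = [[8, 2],
       [1, 9]]
print(macro_metrics(toy))
```

Because the macro F1-score averages the per-class F1 values, it need not equal the harmonic mean of the macro precision and macro recall.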
2. Related Studies
3. Methodology
3.1. Dataset
3.2. Preprocessing
Extraction of Videos
- Driving reverse: this class covers a vehicle being driven forward in the opposite lane, with its front end pointing against the intended direction of travel.
- Driving reverse (others): this class includes other vehicles misusing the lane, cars parked in the center of the road, people walking in the middle of the road, and people standing in the middle of the road.
- Object falling: this class includes throwing objects from vehicles, people throwing objects onto the road from the sidewalk area, and people in the middle of the road throwing objects toward the center of the road.
- Pedestrian: this class covers people walking in public areas on foot rather than using a vehicle or other mode of transportation.
- Stop vehicle: this class covers cars that the surveillance camera captures stopping at any location, where the driver steps on the brakes and the brake lights are on.
3.3. Feature Extraction
3.4. LSTM Layer
3.5. Multi-Directional Long-Term Recurrent Convolutional Network
3.6. Optimization and Loss Function
4. Experiments
4.1. Experimental Setup
4.2. Data Augmentation
4.3. Performance Results on Confusion Matrix
4.4. Evaluation Metrics
5. Conclusions
6. Limitation and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
CNN | Convolutional Neural Network |
LSTM | Long Short-Term Memory |
LRCN | Long-Term Recurrent Convolutional Network |
ANN | Artificial Neural Network |
RCNN | Region-based Convolutional Neural Networks |
BiLRCN | Bi-directional Long-Term Recurrent Convolutional Network |
YOLO | You Only Look Once |
Faster R-CNN | Faster Region Convolutional Neural Network |
RNN | Recurrent Neural Network |
TCN | Temporal Convolutional Network |
References
- Socha, R.; Kogut, B. Urban video surveillance as a tool to improve security in public spaces. Sustainability 2020, 12, 6210.
- Davidson, K.; Briggs, J.; Nolan, E.; Bush, J.; Håkansson, I.; Moloney, S. The making of a climate emergency response: Examining the attributes of climate emergency plans. Urban Clim. 2020, 33, 100666.
- Toriumi, A.; Abu-Lebdeh, G.; Alhajyaseen, W.; Christie, N.; Gehlert, T.; Mehran, B.; Mussone, L.; Shawky, M.; Tang, K.; Nakamura, H. A multi-country survey for collecting and analyzing facts related to road traffic safety: Legislation, enforcement, and education for safer drivers. IATSS Res. 2022, 46, 14–25.
- Aufrere, R.; Chapuis, R.; Chausse, F. A model-driven approach for real-time road recognition. Mach. Vis. Appl. 2001, 13, 95–107.
- Alrajhi, A.; Roy, K.; Qingge, L.; Kribs, J. Detection of road condition defects using multiple sensors and IoT technology: A review. IEEE Open J. Intell. Transp. Syst. 2023, 4, 372–392.
- Hasanujjaman, M.; Chowdhury, M.Z.; Jang, Y.M. Sensor fusion in autonomous vehicle with traffic surveillance camera system: Detection, localization, and AI networking. Sensors 2023, 23, 3335.
- Micko, K.; Papcun, P.; Zolotova, I. Review of IoT sensor systems used for monitoring the road infrastructure. Sensors 2023, 23, 4469.
- Sohail, A.; Cheema, M.A.; Ali, M.E.; Toosi, A.N.; Rakha, H.A. Data-driven approaches for road safety: A comprehensive systematic literature review. Saf. Sci. 2023, 158, 105949.
- Elharrouss, O.; Almaadeed, N.; Al-Maadeed, S. A review of video surveillance systems. J. Vis. Commun. Image Represent. 2021, 77, 103116.
- Wassouf, Y.; Korekov, E.M.; Serebrenny, V.V. Decision Making for Advanced Driver Assistance Systems for Public Transport. In Proceedings of the 2023 5th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE), Moscow, Russia, 16–18 March 2023; IEEE: Piscataway, NJ, USA, 2023; Volume 5, pp. 1–6.
- Estable, S.; Schick, J.; Stein, F.; Janssen, R.; Ott, R.; Ritter, W.; Zheng, Y.J. A real-time traffic sign recognition system. In Proceedings of the Intelligent Vehicles ’94 Symposium, Paris, France, 24–26 October 1994; IEEE: Piscataway, NJ, USA, 1994; pp. 213–218.
- González, D.M.; Morillas, J.M.B.; Rey-Gozalo, G. Effects of noise on pedestrians in urban environments where road traffic is the main source of sound. Sci. Total. Environ. 2023, 857, 159406.
- Fredj, H.B.; Chabbah, A.; Baili, J.; Faiedh, H.; Souani, C. An efficient implementation of traffic signs recognition system using CNN. Microprocess. Microsyst. 2023, 98, 104791.
- Sarfraz, M.S.; Shahzad, A.; Elahi, M.A.; Fraz, M.; Zafar, I.; Edirisinghe, E.A. Real-time automatic license plate recognition for CCTV forensic applications. J. Real-Time Image Process. 2013, 8, 285–295.
- Grabowski, D.; Czyżewski, A. System for monitoring road slippery based on CCTV cameras and convolutional neural networks. J. Intell. Inf. Syst. 2020, 55, 521–534.
- Sirirattanapol, C.; Nagai, M.; Witayangkurn, A.; Pravinvongvuth, S.; Ekpanyapong, M. Bangkok CCTV image through a road environment extraction system using multi-label convolutional neural network classification. ISPRS Int. J. Geo-Inf. 2019, 8, 128.
- Lin, C.Y.; Lian, F.L. System integration of sensor-fusion localization tasks using vision-based driving lane detection and road-marker recognition. IEEE Syst. J. 2020, 14, 4523–4534.
- Zhu, Q. Research on road traffic situation awareness system based on image big data. IEEE Intell. Syst. 2019, 35, 18–26.
- Paetzold, F.; Franke, U. Road recognition in urban environment. Image Vis. Comput. 2000, 18, 377–387.
- Ke, R.; Liu, C.; Yang, H.; Sun, W.; Wang, Y. Real-time traffic and road surveillance with parallel edge intelligence. IEEE J. Radio Freq. Identif. 2022, 6, 693–696.
- Fang, C.Y.; Fuh, C.S.; Yen, P.; Cherng, S.; Chen, S.W. An automatic road sign recognition system based on a computational model of human recognition processing. Comput. Vis. Image Underst. 2004, 96, 237–268.
- Cho, S.M.; Choi, B.J. CNN-based recognition algorithm for four classes of roads. Int. J. Fuzzy Log. Intell. Syst. 2020, 20, 114–118.
- Xiangxue, W.; Lunhui, X.; Kaixun, C. Data-driven short-term forecasting for urban road network traffic based on data processing and LSTM-RNN. Arab. J. Sci. Eng. 2019, 44, 3043–3060.
- Massa, L.; Barbosa, A.; Oliveira, K.; Vieira, T. LRCN-RetailNet: A recurrent neural network architecture for accurate people counting. Multimed. Tools Appl. 2021, 80, 5517–5537.
- Ma, Y.; Wei, Y.; Shi, Y.; Li, X.; Tian, Y.; Zhao, Z. Online learning engagement recognition using bidirectional Long-Term recurrent convolutional networks. Sustainability 2022, 15, 198.
- Yang, W.; Zhang, X.; Lei, Q.; Shen, D.; Xiao, P.; Huang, Y. Lane position detection based on long short-term memory (LSTM). Sensors 2020, 20, 3115.
- Sinulingga, H.R.; Munir, R. Road recognition system with heuristic method and machine learning. In Proceedings of the 2020 7th International Conference on Advance Informatics: Concepts, Theory and Applications (ICAICTA), Online, 8–9 September 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6.
- Arya, D.; Maeda, H.; Ghosh, S.K.; Toshniwal, D.; Mraz, A.; Kashiyama, T.; Sekimoto, Y. Deep learning-based road damage detection and classification for multiple countries. Autom. Constr. 2021, 132, 103935.
- Wu, M.; Kwon, T.J. An Automatic Architecture Designing Approach of Convolutional Neural Networks for Road Surface Conditions Image Recognition: Tradeoff between Accuracy and Efficiency. J. Sens. 2022, 2022, 3325282.
- Luo, H.; Li, C.; Wu, M.; Cai, L. An Enhanced Lightweight Network for Road Damage Detection Based on Deep Learning. Electronics 2023, 12, 2583.
- Park, J.; Wen, M.; Sung, Y.; Cho, K. Multiple event-based simulation scenario generation approach for autonomous vehicle smart sensors and devices. Sensors 2019, 19, 4456.
- Zyner, A.; Worrall, S.; Nebot, E. Naturalistic driver intention and path prediction using recurrent neural networks. IEEE Trans. Intell. Transp. Syst. 2019, 21, 1584–1594.
- Choi, J.G.; Kong, C.W.; Kim, G.; Lim, S. Car crash detection using ensemble deep learning and multimodal data from dashboard cameras. Expert Syst. Appl. 2021, 183, 115400.
- Djenouri, Y.; Belbachir, A.N.; Michalak, T.; Belhadi, A.; Srivastava, G. Enhancing smart road safety with federated learning for Near Crash Detection to advance the development of the Internet of Vehicles. Eng. Appl. Artif. Intell. 2024, 133, 108350.
- Mumuni, A.; Mumuni, F. Data augmentation: A comprehensive survey of modern approaches. Array 2022, 16, 100258.
- Hussain, Z.; Gimenez, F.; Yi, D.; Rubin, D. Differential data augmentation techniques for medical imaging classification tasks. In Proceedings of the AMIA Annual Symposium Proceedings, Washington, DC, USA, 4–8 November 2017; American Medical Informatics Association: Washington, DC, USA, 2017; Volume 2017, p. 979.
Reference | Year | Model | Dataset | Description |
---|---|---|---|---|
[22] | 2020 | CNN | Non-public dataset | Classifies four categories of walking environment (braille blocks, driveways, crosswalks, and sidewalks). |
[26] | 2020 | LSTM-RCNN | Caltech and KITTI traffic | The model proposed in that work determines the roadway when a lane is blocked or distorted. |
[27] | 2020 | ANN | Indonesian roads | The method recognizes roadblocks, including ambiguous lines, in static video. |
[28] | 2021 | YOLOv5 | Global Road Damage Detection | A random forest model is utilized to detect roadways, trained on features such as the primary color value and the block normalizing position. |
[29] | 2022 | CNN | Automated vehicle location system | Image recognition system for road surface conditions, which can support safety-related decision-making. |
[30] | 2023 | YOLOv3 | KITTI | Lightweight model reconstruction and pruning for high-precision detection that meets real-time requirements for deployment on mobile devices. |
Class No | Class Name | Videos per Class |
---|---|---|
0 | driving_reverse | 100 |
1 | driving_reverse(others) | 139 |
2 | object_falling | 34 |
3 | pedestrian | 111 |
4 | stop_vehicle | 114 |
Total | | 498 |
Class No | Class Name | No. of Videos per Class | Training | Test |
---|---|---|---|---|
0 | Driving_Reverse | 150 | 112 | 19 |
1 | Driving_Reverse(others) | 150 | 112 | 34 |
2 | Object_Falling | 150 | 112 | 15 |
3 | Pedestrian | 150 | 112 | 23 |
4 | Stop_Vehicle | 150 | 112 | 34 |
Total | | 750 | 560 | 125 |
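The two dataset tables imply how many additional clips each class needs to reach the balanced count of 150. The sketch below works through that arithmetic, assuming the balanced counts were obtained by adding augmented clips to the original videos; the function name `augmentation_plan` and the `per_video` estimate are hypothetical illustrations, not the authors' procedure.

```python
import math

# Original videos per class, taken from the first dataset table.
original_counts = {
    "driving_reverse": 100,
    "driving_reverse(others)": 139,
    "object_falling": 34,
    "pedestrian": 111,
    "stop_vehicle": 114,
}

# Balanced videos per class, taken from the second dataset table.
TARGET_PER_CLASS = 150


def augmentation_plan(counts, target):
    """Return, per class, the number of extra clips needed to reach `target`.

    `per_video` is a rough upper bound on how many augmented variants each
    original video would have to contribute (an assumption of this sketch).
    """
    plan = {}
    for name, n in counts.items():
        extra = max(target - n, 0)
        plan[name] = {"extra": extra, "per_video": math.ceil(extra / n)}
    return plan


plan = augmentation_plan(original_counts, TARGET_PER_CLASS)
for name, info in plan.items():
    print(f"{name}: +{info['extra']} clips (up to {info['per_video']} per original video)")

# The extras account for the difference between the two table totals.
total_extra = sum(info["extra"] for info in plan.values())
print("total extra clips:", total_extra)  # 750 - 498 = 252
```

The smallest class, object_falling, needs the most augmentation (116 extra clips from 34 originals), which is where heavier augmentation is most likely to affect per-class results.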
Method | Accuracy | Precision | Recall | F1-Score | Training Time (min:s) |
---|---|---|---|---|---|
CNN2D | 71% | 69% | 71% | 71% | 53:16 |
CNN3D | 74% | 73% | 75% | 74% | 41:56 |
LSTM | 89% | 88% | 90% | 89% | 47:36 |
LRCN | 90% | 90% | 91% | 90% | 6:51 |
Ours | 91% | 89% | 92% | 91% | 4:13 |
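The training times in the table above are easier to compare once converted to seconds. The following short sketch does that conversion and expresses each baseline as a multiple of the proposed model's training time; the helper name `to_seconds` is hypothetical, and the values are copied from the table.

```python
def to_seconds(mmss):
    """Convert a 'min:s' string such as '6:51' into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Training times as reported in the table (min:s).
training_time = {
    "CNN2D": "53:16",
    "CNN3D": "41:56",
    "LSTM": "47:36",
    "LRCN": "6:51",
    "Ours": "4:13",
}

ours = to_seconds(training_time["Ours"])
for method, t in training_time.items():
    secs = to_seconds(t)
    print(f"{method}: {secs} s, {secs / ours:.2f}x the proposed model's time")
```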
Models | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|
Faster R-CNN | 89% | 87% | 88% | 88% |
RNN | 84% | 83% | 82% | 84% |
RNN-TCN | 90% | 89% | 88% | 89% |
Inception | 85% | 84% | 84% | 85% |
RCNN | 86% | 85% | 84% | 85% |
Ours | 91% | 89% | 92% | 91% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Dofitas, C., Jr.; Gil, J.-M.; Byun, Y.-C. Multi-Directional Long-Term Recurrent Convolutional Network for Road Situation Recognition. Sensors 2024, 24, 4618. https://doi.org/10.3390/s24144618