A Real-Time Embedded System for Driver Drowsiness Detection Based on Visual Analysis of the Eyes and Mouth Using Convolutional Neural Network and Mouth Aspect Ratio
Abstract
1. Introduction
- An improved technique is proposed for extracting the region of interest (ROI) of the eyes using MediaPipe. It corrects the area derived from the detected facial landmarks so that the eye ROI remains valid across the different head poses adopted by the driver.
- A drowsiness detection approach that combines Convolutional Neural Networks (CNNs) with the Mouth Aspect Ratio (MAR) is proposed. Eye state classification is learned through transfer learning on the InceptionV3, VGG16, and ResNet50V2 networks, and a further CNN named “DD-AI” is proposed and trained from scratch. Yawning is detected using the MAR, computed from the facial landmarks of the mouth.
- An evaluation of the four networks using the Grad-CAM technique is presented. The resulting heat maps show which regions each CNN model weights most heavily when classifying eye state images.
- A comparison and discussion of experimental results reported in other studies that use the same approach is provided, covering overall system performance, precision, and accuracy.
- A portable, robust, efficient, fast, and low-cost system for detecting drowsiness in real time.
2. Related Work
3. Proposed Methodology
4. Materials and Methods
4.1. Step 1: Image Acquisition System
4.2. Step 2: Facial Landmark Detection
4.3. Step 3: ROI Selection
Algorithm 1. ROI Correction
4.4. Step 4: ROI Extraction
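As a hedged illustration of this step, the sketch below derives a rectangular eye ROI from a set of landmark pixel coordinates by padding their bounding box and clamping it to the frame. It is not the paper's Algorithm 1; the function name `eye_roi`, the `margin` parameter, and the sample coordinates are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


def eye_roi(points: Iterable[Point], frame_w: int, frame_h: int,
            margin: float = 0.25) -> Tuple[int, int, int, int]:
    """Return (x0, y0, x1, y1) of a padded landmark bounding box inside the frame."""
    pts = list(points)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Pad by a fraction of the box width so the crop keeps some context
    # when the eye is nearly closed and the landmark box becomes very flat.
    pad = int(round(margin * (x_max - x_min)))
    x0 = max(0, x_min - pad)
    y0 = max(0, y_min - pad)
    x1 = min(frame_w, x_max + pad)
    y1 = min(frame_h, y_max + pad)
    return x0, y0, x1, y1


# Hypothetical landmark coordinates for one eye near the right edge of a 640x480 frame.
landmarks = [(600, 200), (620, 190), (636, 200), (620, 208)]
print(eye_roi(landmarks, 640, 480))
```

Clamping matters most when the head is turned and the eye sits close to the frame border, where an unclamped box would index outside the image.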
4.5. Step 5: Method and Evaluation
4.5.1. Mouth Aspect Ratio (MAR) Method
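The paper's exact MAR definition is not reproduced in this section, so the sketch below uses one common formulation, analogous to the eye aspect ratio of Cech and Soukupova: the mean of several vertical lip openings divided by the horizontal mouth width. The point layout, the function name `mouth_aspect_ratio`, and the sample coordinates are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mouth_aspect_ratio(left: Point, right: Point,
                       vertical_pairs: Sequence[Tuple[Point, Point]]) -> float:
    """Mean vertical lip opening divided by the mouth width."""
    width = math.dist(left, right)
    if width == 0:
        raise ValueError("mouth corners coincide")
    openings = [math.dist(top, bottom) for top, bottom in vertical_pairs]
    return (sum(openings) / len(openings)) / width


# Hypothetical pixel coordinates: mouth corners plus three (upper, lower) lip pairs.
closed = mouth_aspect_ratio(
    (100, 200), (160, 200),
    [((115, 197), (115, 203)), ((130, 196), (130, 204)), ((145, 197), (145, 203))])
yawning = mouth_aspect_ratio(
    (105, 200), (155, 200),
    [((118, 182), (118, 218)), ((130, 178), (130, 222)), ((142, 182), (142, 218))])
print(round(closed, 3), round(yawning, 3))
```

Because the ratio is normalized by mouth width, it is largely insensitive to the driver's distance from the camera. A yawn would be flagged when the ratio stays above a threshold for several frames; that threshold has to be tuned on data and is not taken from the paper.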
4.5.2. Convolutional Neural Network (CNN) Method
- a. Dataset Creation
- b. CNN Training Experiments
- b.1. Proposed CNN “DD-AI” architecture
- c. Preliminary CNN training results
4.6. Step 6: Alarm Activation
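One simple way to avoid triggering on ordinary blinks is to raise the alarm only after the drowsy condition has persisted for a number of consecutive frames. The sketch below illustrates that idea; the class name `AlarmTrigger` and the frame threshold are assumptions, not values taken from the paper.

```python
class AlarmTrigger:
    """Fires once a drowsy state has lasted `min_frames` consecutive frames."""

    def __init__(self, min_frames: int) -> None:
        if min_frames < 1:
            raise ValueError("min_frames must be at least 1")
        self.min_frames = min_frames
        self.streak = 0

    def update(self, drowsy: bool) -> bool:
        # Any non-drowsy frame resets the streak, so short blinks never accumulate.
        self.streak = self.streak + 1 if drowsy else 0
        return self.streak >= self.min_frames


trigger = AlarmTrigger(min_frames=3)
frames = [True, True, False, True, True, True, True]
alarm_states = [trigger.update(f) for f in frames]
print(alarm_states)
```

In a real deployment the threshold would typically be expressed as a duration and converted to frames using the measured frame rate.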
5. Hardware Implementation
6. Experimental Results
6.1. CNN Testing Results
6.2. CNN Visual Result
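For orientation, the core Grad-CAM computation (Selvaraju et al.) can be sketched without a deep learning framework: each feature map of the chosen convolutional layer is weighted by the spatial mean of its gradient with respect to the class score, the weighted maps are summed, and a ReLU keeps only the positively contributing regions. The tiny activation and gradient arrays below are made-up values; in practice they come from the trained network.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine per-channel activations and gradients into a normalized heat map."""
    rows, cols = len(activations[0]), len(activations[0][0])
    # Channel weight: global average of the gradient over all spatial positions.
    weights = [sum(sum(row) for row in g) / (rows * cols) for g in gradients]
    heat = [[max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
             for j in range(cols)] for i in range(rows)]
    peak = max(max(row) for row in heat)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    return [[v / peak for v in row] for row in heat] if peak > 0 else heat


# Made-up 2x2 feature maps for two channels and their gradients.
acts = [[[1.0, 0.0], [0.0, 2.0]],
        [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[0.5, 0.5], [0.5, 0.5]],
         [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))
```

The second channel has a negative weight, so locations where only it is active are suppressed by the ReLU and do not appear in the heat map.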
6.3. Overall Results of the CNNs
6.4. System Performance Results
7. Comparison and Discussion
8. Conclusions and Future Works
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- World Health Organization (WHO). Road Traffic Injuries. Available online: https://www.who.int/news-room/fact-sheets/detail/road-traffic-injuries (accessed on 17 March 2023).
- Policía Nacional del Perú (PNP). Police Statistical Yearbook 2020 (Spanish). Available online: https://www.policia.gob.pe/estadisticopnp/documentos/anuario-2020/anuario-estadistico-policial-2020.pdf (accessed on 17 March 2023).
- Centro Nacional de Epidemiología, Prevención y Control de Enfermedades. CDC Peru Reported Close to 12,000 Road Traffic Injuries during the First Half of 2022 (Spanish). Available online: https://www.dge.gob.pe/portalnuevo/informativo/prensa/cdc-peru-reporto-cerca-de-12-mil-lesionados-por-accidentes-de-transito-durante-la-primera-mitad-del-2022/ (accessed on 17 March 2023).
- Observatorio Nacional de Seguridad Vial. Road Accident Report and Actions to Promote Road Safety (Spanish). Available online: https://www.onsv.gob.pe/post/informe-de-siniestralidad-vial-y-las-acciones-para-promover-la-seguridad-vial/ (accessed on 17 March 2023).
- Ministerio de Salud. Minsa: Drivers Should Sleep at Least Six Hours to Avoid Accidents (Spanish). Available online: https://www.gob.pe/institucion/minsa/noticias/14013-minsa-choferes-deben-dormir-seis-horas-por-lo-menos-para-evitar-accidentes (accessed on 17 March 2023).
- Albadawi, Y.; Takruri, M.; Awad, M. A Review of Recent Developments in Driver Drowsiness Detection Systems. Sensors 2022, 22, 2069. Available online: https://www.mdpi.com/1424-8220/22/5/2069 (accessed on 17 March 2023).
- Reddy, P.V.; D’Souza, J.; Rakshit, S.; Bavariya, S.; Badrinath, P. A Survey on Driver Safety Systems using Internet of Things. Int. J. Eng. Res. Technol. (IJERT) 2022, 11, 180–185.
- Anber, S.; Alsaggaf, W.; Shalash, W. A Hybrid Driver Fatigue and Distraction Detection Model Using AlexNet Based on Facial Features. Electronics 2022, 11, 285.
- Dua, M.; Shakshi; Singla, R.; Raj, S.; Jangra, A. Deep CNN Models-Based Ensemble Approach to Driver Drowsiness Detection. Neural Comput. Appl. 2021, 33, 3155–3168.
- Hashemi, M.; Mirrashid, A.; Beheshti Shirazi, A. Driver Safety Development: Real-Time Driver Drowsiness Detection System Based on Convolutional Neural Network. SN Comput. Sci. 2020, 1, 289.
- Reddy, B.; Kim, Y.-H.; Yun, S.; Seo, C.; Jang, J. Real-Time Driver Drowsiness Detection for Embedded System Using Model Compression of Deep Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 121–128.
- Jabbar, R.; Al-Khalifa, K.; Kharbeche, M.; Alhajyaseen, W.; Jafari, M.; Jiang, S. Real-Time Driver Drowsiness Detection for Android Application Using Deep Neural Networks Techniques. Procedia Comput. Sci. 2018, 130, 400–407.
- Jabbar, R.; Shinoy, M.; Kharbeche, M.; Al-Khalifa, K.; Krichen, M.; Barkaoui, K. Driver Drowsiness Detection Model Using Convolutional Neural Networks Techniques for Android Application. In Proceedings of the 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT), Dubai, United Arab Emirates, 15–18 June 2020; IEEE: New York, NY, USA, 2020; pp. 237–242.
- He, H.; Zhang, X.; Jiang, F.; Wang, C.; Yang, Y.; Liu, W.; Peng, J. A Real-Time Driver Fatigue Detection Method Based on Two-Stage Convolutional Neural Network. IFAC-PapersOnLine 2020, 53, 15374–15379.
- Çivik, E.; Yüzgeç, U. Deep Learning Based Continuous Real-Time Driver Fatigue Detection for Embedded System. In Proceedings of the 28th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, 27–29 August 2020; IEEE: New York, NY, USA, 2020; pp. 1–4.
- Li, X.; Xia, J.; Cao, L.; Zhang, G.; Feng, X. Driver Fatigue Detection Based on Convolutional Neural Network and Face Alignment for Edge Computing Device. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2021, 235, 2699–2711.
- Rahman, A.; Hriday, M.B.H.; Khan, R. Computer Vision-Based Approach to Detect Fatigue Driving and Face Mask for Edge Computing Device. Heliyon 2022, 8, e11204.
- Flores-Monroy, J.; Nakano-Miyatake, M.; Escamilla-Hernandez, E.; Sanchez-Perez, G.; Perez-Meana, H. SOMN_IA: Portable and Universal Device for Real-Time Detection of Driver’s Drowsiness and Distraction Levels. Electronics 2022, 11, 2558.
- Singh, N.T.; Saurav; Pathak, N.; Raizada, A.; Shukla, S. Real-Time Driver Drowsiness Detection System Using Cascaded ConvNet Framework. In Proceedings of the 2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS), Coimbatore, India, 14–16 June 2023; pp. 828–833.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity Mappings in Deep Residual Networks. In Proceedings of the 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2016; Volume 9908, pp. 630–645.
- Sikander, G.; Anwar, S. Driver Fatigue Detection Systems: A Review. IEEE Trans. Intell. Transp. Syst. 2018, 20, 2339–2352.
- Lugaresi, C.; Tang, J.; Nash, H.; McClanahan, C.; Uboweja, E.; Hays, M.; Zhang, F.; Chang, C.L.; Yong, M.G.; Lee, J.; et al. MediaPipe: A Framework for Building Perception Pipelines. arXiv 2019, arXiv:1906.08172.
- Florez, R.; Palomino-Quispe, F.; Coaquira-Castillo, R.J.; Herrera-Levano, J.C.; Paixão, T.; Alvarez, A.B. A CNN-Based Approach for Driver Drowsiness Detection by Real-Time Eye State Identification. Appl. Sci. 2023, 13, 7849.
- Florez Zela, R.D. Diseño e Implementación de un Sistema Detector de Somnolencia en Tiempo Real Mediante Visión Computacional Usando Redes Neuronales Convolucionales Aplicado a Conductores; Universidad Nacional de San Antonio Abad del Cusco: Cusco, Peru, 2024. (In Spanish)
- Weng, C.-H.; Lai, Y.-H.; Lai, S.-H. Driver Drowsiness Detection via a Hierarchical Temporal Deep Belief Network. In Proceedings of the Computer Vision–ACCV 2016 Workshops: ACCV 2016 International Workshops, Taipei, Taiwan, 20–24 November 2016; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2017; Volume 10111, pp. 117–133.
- King, D.E. Dlib-ml: A Machine Learning Toolkit. J. Mach. Learn. Res. 2009, 10, 1755–1758.
- Abtahi, S.; Omidyeganeh, M.; Shirmohammadi, S.; Hariri, B. YawDD: A Yawning Detection Dataset. In Proceedings of the 5th ACM Multimedia Systems Conference (MMSys), Brisbane, Australia, 2–5 June 2014; pp. 24–28.
- Koestinger, M.; Wohlhart, P.; Roth, P.M.; Bischof, H. Annotated Facial Landmarks in the Wild: A Large-Scale, Real-World Database for Facial Landmark Localization. In Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, 6–13 November 2011; IEEE: New York, NY, USA, 2011; pp. 2144–2151.
- Cech, J.; Soukupova, T. Real-Time Eye Blink Detection Using Facial Landmarks; Center for Machine Perception, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague: Praha, Czechia, 2016; pp. 1–8.
- ELP. 2MP CMOS OV2710 Sensor Free Driver Night Vision IR USB Camera Module Full HD 1080P. Available online: https://www.elpcctv.com/2mp-cmos-ov2710-sensor-free-driver-night-vision-ir-usb-camera-module-full-hd-1080p-p-383.html (accessed on 2 March 2023).
- Grishchenko, I.; Ablavatski, A.; Kartynnik, Y.; Raveendran, K.; Grundmann, M. Attention Mesh: High-Fidelity Face Mesh Prediction in Real-Time. arXiv 2020, arXiv:2006.10962.
- Petrellis, N.; Zogas, S.; Christakos, P.; Mousouliotis, P.; Keramidas, G.; Voros, N.; Antonopoulos, C. Software Acceleration of the Deformable Shape Tracking Application: How to Eliminate the Eigen Library Overhead. In Proceedings of the 2021 European Symposium on Software Engineering, Larissa, Greece, 19–21 November 2021; pp. 51–57.
- Caelen, O. A Bayesian Interpretation of the Confusion Matrix. Ann. Math. Artif. Intell. 2017, 81, 429–450.
- Kwon, K.-A.; Shipley, R.J.; Edirisinghe, M.; Ezra, D.G.; Rose, G.; Best, S.M.; Cameron, R.E. High-Speed Camera Characterization of Voluntary Eye Blinking Kinematics. J. R. Soc. Interface 2013, 10, 20130227.
- Süzen, A.A.; Duman, B.; Şen, B. Benchmark Analysis of Jetson TX2, Jetson Nano and Raspberry Pi Using Deep-CNN. In Proceedings of the 2020 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Ankara, Turkey, 26–28 June 2020; pp. 1–5.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626.
- Magán, E.; Sesmero, M.P.; Alonso-Weber, J.M.; Sanchis, A. Driver Drowsiness Detection by Applying Deep Learning Techniques to Sequences of Images. Appl. Sci. 2022, 12, 1145.
Factor | 2019 | 2020 | 2021 | 2022 |
---|---|---|---|---|
Human factor | 75.5% | 73.8% | 69.1% | 70.4% |
Vehicle factor | 2.1% | 2.0% | 2.0% | 1.7% |
Infrastructure factor | 3.3% | 2.7% | 3.2% | 2.9% |
Other factors | 19.1% | 21.5% | 25.8% | 24.9% |
Data Set | Drowsy | Not Drowsy |
---|---|---|
Training Set | 2380 | 2380 |
Validation Set | 510 | 510 |
Test Set | 510 | 510 |
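The counts in the dataset table correspond to a balanced 70/15/15 split, which can be checked with a few lines of arithmetic. The dictionary below simply restates the table; the variable names are illustrative.

```python
# Per-class image counts copied from the dataset table (identical for both classes).
per_class = {"training": 2380, "validation": 510, "test": 510}

total_per_class = sum(per_class.values())
shares = {name: count / total_per_class for name, count in per_class.items()}
total_images = 2 * total_per_class  # two balanced classes: drowsy and not drowsy

for name, share in shares.items():
    print(f"{name}: {share:.0%}")
print("images in total:", total_images)
```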
Hyper-Parameters | Value |
---|---|
Optimizer | ADAM |
Learning rate | 0.001 |
β1 | 0.9 |
β2 | 0.999 |
Epochs | 30 |
Batch size | 32 |
Number of experiments | 10 for each CNN |
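Combined with the training-set size from the dataset table, the batch size and epoch count fix the number of weight updates per training run. The short calculation below makes that explicit; it assumes the final partial batch of each epoch is kept, which the source does not state.

```python
import math

train_images = 2 * 2380  # drowsy + not drowsy training images
batch_size = 32
epochs = 30

# A partial last batch still counts as one optimizer step (assumption).
steps_per_epoch = math.ceil(train_images / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```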
Model | Class Name | Precision | Recall | F1-Score | Accuracy |
---|---|---|---|---|---|
InceptionV3 | Not drowsy | 0.9852 ± 0.003 | 0.9888 ± 0.003 | 0.9870 ± 0.002 | 0.9870 ± 0.002 |
InceptionV3 | Drowsy | 0.9888 ± 0.003 | 0.9851 ± 0.003 | 0.9869 ± 0.002 | 0.9870 ± 0.002 |
VGG16 | Not drowsy | 0.9865 ± 0.004 | 0.9849 ± 0.004 | 0.9857 ± 0.002 | 0.9857 ± 0.002 |
VGG16 | Drowsy | 0.9849 ± 0.004 | 0.9865 ± 0.004 | 0.9857 ± 0.002 | 0.9857 ± 0.002 |
ResNet50V2 | Not drowsy | 0.9926 ± 0.003 | 0.9953 ± 0.003 | 0.9939 ± 0.002 | 0.9939 ± 0.002 |
ResNet50V2 | Drowsy | 0.9953 ± 0.003 | 0.9926 ± 0.003 | 0.9939 ± 0.002 | 0.9939 ± 0.002 |
Proposed DD-AI | Not drowsy | 0.9980 ± 0.000 | 0.9988 ± 0.001 | 0.9984 ± 0.001 | 0.9984 ± 0.001 |
Proposed DD-AI | Drowsy | 0.9988 ± 0.001 | 0.9980 ± 0.000 | 0.9984 ± 0.001 | 0.9984 ± 0.001 |
Model | Class Name | Precision | Recall | F1-Score | Accuracy |
---|---|---|---|---|---|
InceptionV3 | Not drowsy | 0.9895 ± 0.006 | 0.9896 ± 0.006 | 0.9895 ± 0.002 | 0.9895 ± 0.002 |
InceptionV3 | Drowsy | 0.9897 ± 0.005 | 0.9894 ± 0.006 | 0.9895 ± 0.002 | 0.9895 ± 0.002 |
VGG16 | Not drowsy | 0.9868 ± 0.001 | 0.9798 ± 0.006 | 0.9833 ± 0.003 | 0.9833 ± 0.003 |
VGG16 | Drowsy | 0.9800 ± 0.006 | 0.9869 ± 0.001 | 0.9834 ± 0.003 | 0.9833 ± 0.003 |
ResNet50V2 | Not drowsy | 0.9967 ± 0.003 | 0.9931 ± 0.003 | 0.9949 ± 0.003 | 0.9949 ± 0.003 |
ResNet50V2 | Drowsy | 0.9932 ± 0.003 | 0.9967 ± 0.003 | 0.9949 ± 0.003 | 0.9949 ± 0.003 |
Proposed DD-AI | Not drowsy | 0.9988 ± 0.001 | 0.9988 ± 0.001 | 0.9988 ± 0.001 | 0.9988 ± 0.001 |
Proposed DD-AI | Drowsy | 0.9988 ± 0.001 | 0.9988 ± 0.001 | 0.9988 ± 0.001 | 0.9988 ± 0.001 |
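The F1-score column is the harmonic mean of precision and recall, so a reported row can be sanity-checked directly. The sketch below does this for the VGG16 "Not drowsy" row of the table directly above. Because the table reports means over repeated experiments, the F1 of the mean precision and recall only approximates the mean F1, so agreement to about four decimals is what one would expect rather than exact equality.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# VGG16, "Not drowsy" row: precision 0.9868, recall 0.9798, reported F1 0.9833.
reported_f1 = 0.9833
computed_f1 = f1_score(0.9868, 0.9798)
print(f"computed {computed_f1:.4f}, reported {reported_f1:.4f}")
```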
Model | Training Time | File Size (KB) | Parameters | Testing Response Time |
---|---|---|---|---|
InceptionV3 | 4.3 min ± 2 s | 98,055 | 22,828,286 | 51.01 ms |
VGG16 | 3.1 min ± 3 s | 69,586 | 15,740,190 | 39.00 ms |
ResNet50V2 | 3.4 min ± 3 s | 140,594 | 27,662,302 | 60.02 ms |
Proposed DD-AI | 3.1 min ± 1 s | 75,618 | 6,448,002 | 33.00 ms |
CNN | Accuracy | FPS |
---|---|---|
InceptionV3 | 92.45% | 9–14 |
VGG16 | 90.27% | 9–14 |
ResNet50V2 | 95.86% | 9–14 |
DD-AI | 96.55% | 9–14 |
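The response times in the performance table bound the frame rate each classifier could reach on its own. Converting them shows a gap to the 9–14 FPS measured for the full system. One possible reading, offered as an interpretation rather than a claim from the source and assuming both figures were measured on the same hardware, is that acquisition, landmark detection, and ROI handling take a substantial share of each frame's time budget.

```python
# Per-inference response times in milliseconds, copied from the performance table.
response_ms = {"InceptionV3": 51.01, "VGG16": 39.00, "ResNet50V2": 60.02, "DD-AI": 33.00}

# Upper bound on frames per second if classification were the only work per frame.
max_fps = {model: 1000.0 / ms for model, ms in response_ms.items()}

for model, fps in max_fps.items():
    print(f"{model}: at most {fps:.1f} FPS from inference alone")
```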
System | Used Hardware | FPS | Accuracy |
---|---|---|---|
Reddy et al. [11] | NVIDIA Jetson TK1 | 14.9 | 89.5% |
Jabbar et al. [12] | Android Phone | - | 81% |
Jabbar et al. [13] | Samsung Galaxy S8 Plus | 234.25 | 83.3% |
He et al. [14] | Raspberry Pi 4 | 10.4 | 94.7% |
Çivik and Yüzgeç [15] | NVIDIA Jetson Nano | 6 | 94.05% |
Li et al. [16] | NVIDIA Jetson Nano | 58 | 89.55% |
Rahman et al. [17] | NVIDIA Jetson Nano | - | 97.44% |
Flores-Monroy et al. [18] | NVIDIA Jetson Nano | 21 | 95.77% |
Singh et al. [19] | Unspecified | - | 98.1% |
Proposed DD-AI | NVIDIA Jetson Nano | 14 | 96.55% |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Florez, R.; Palomino-Quispe, F.; Alvarez, A.B.; Coaquira-Castillo, R.J.; Herrera-Levano, J.C. A Real-Time Embedded System for Driver Drowsiness Detection Based on Visual Analysis of the Eyes and Mouth Using Convolutional Neural Network and Mouth Aspect Ratio. Sensors 2024, 24, 6261. https://doi.org/10.3390/s24196261