Blink Detection Using 3D Convolutional Neural Architectures and Analysis of Accumulated Frame Predictions
Abstract
1. Introduction
- Embedding temporal information into convolutional neural networks (3D CNN, 3D autoencoder and 3D ResNet) through three-dimensional (3D) input whose third dimension spans a typical blink duration, combined with a simple classifier that classifies the image sequences during training.
- An inference mode (blink detection in unseen videos) that uses one accumulator per eye, aggregating the predictions from a single forward pass over all subsequences of the unseen video with step = 1. The accumulators are then processed with morphological operations and watershed segmentation to robustly detect blinks of any duration.
- A novel and accurate definition of blink detection metrics that accounts for many-to-one correspondences between predicted and actual blinks (and vice versa).
2. Materials and Methods
2.1. Related Work
2.2. Overview of the Proposed Methodology
2.3. Eye Region Detection and Preparation of Dataset for the Training Phase
2.4. Deep Learning Architectures for Blink Detection
2.4.1. Three-Dimensional CNN Architectures for Subsequence Classification
2.4.2. Three-Dimensional CNN Autoencoder
2.5. Supervised Training Mode
2.6. Inference Mode: Identifying Blinks in Unseen Videos
Algorithm 1. Prediction Accumulator
Input: video sequence
Output: the accumulator A
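
The steps of Algorithm 1 are not reproduced here, so the following is only a minimal sketch of how such an accumulator could be built and post-processed. It assumes a 12-frame window (the sequence length quoted in the timing table), a voting scheme in which every window classified as a blink adds one vote to each frame it covers, and a vote threshold followed by a 1D binary opening. The names `accumulate_predictions`, `opening_1d`, `runs` and `toy_classifier` are hypothetical, the classifier is a toy stand-in for the trained 3D network, and the watershed step mentioned in the text is omitted.

```python
from typing import Callable, List, Sequence, Tuple


def accumulate_predictions(
    frames: Sequence[int],
    classify: Callable[[Sequence[int]], int],
    window: int = 12,  # assumed window length
) -> List[int]:
    """Slide a window over the video with step = 1; every window classified
    as a blink (label 1) adds one vote to each frame it covers."""
    accumulator = [0] * len(frames)
    for start in range(len(frames) - window + 1):
        if classify(frames[start:start + window]) == 1:
            for i in range(start, start + window):
                accumulator[i] += 1
    return accumulator


def opening_1d(mask: List[bool], size: int) -> List[bool]:
    """Binary opening with a flat element of length `size`: keeps only
    frames that belong to a run of at least `size` consecutive True values."""
    opened = [False] * len(mask)
    for start in range(len(mask) - size + 1):
        if all(mask[start:start + size]):
            for i in range(start, start + size):
                opened[i] = True
    return opened


def runs(mask: List[bool]) -> List[Tuple[int, int]]:
    """Return the inclusive (first, last) frame indices of each True run."""
    found, start = [], None
    for i, value in enumerate(mask):
        if value and start is None:
            start = i
        elif not value and start is not None:
            found.append((start, i - 1))
            start = None
    if start is not None:
        found.append((start, len(mask) - 1))
    return found


def toy_classifier(clip: Sequence[int]) -> int:
    """Toy stand-in for the network: 'blink' if two or more closed frames."""
    return 1 if sum(clip) >= 2 else 0


# Toy video: 30 frames, eye closed (1) on frames 14 to 16.
frames = [0] * 30
frames[14] = frames[15] = frames[16] = 1

acc = accumulate_predictions(frames, toy_classifier, window=12)
mask = [votes >= 10 for votes in acc]      # assumed vote threshold
blinks = runs(opening_1d(mask, size=3))    # assumed minimum run length
print(blinks)  # [(13, 17)]
```

Because each frame is covered by up to 12 overlapping windows, the accumulator peaks at the centre of the event and decays towards its edges, which is what makes a later segmentation step meaningful.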
2.7. Definition of Blink Detection Metrics During Inference
2.8. Clinical Setting
3. Results
3.1. The Available Dataset
3.2. Quantitative Results
4. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Morimoto, C.H.; Mimica, M.R. Eye gaze tracking techniques for interactive applications. Comput. Vis. Image Underst. 2005, 98, 4–24.
- Stern, J.A.; Boyer, D.; Schroeder, D. Blink rate: A possible measure of fatigue. Hum. Factors 1994, 36, 285–297.
- Maffei, A.; Angrilli, A. Spontaneous eye blink rate: An index of dopaminergic component of sustained attention and fatigue. Int. J. Psychophysiol. 2018, 123, 58–63.
- VanderWerf, F.; Brassinga, P.; Reits, D.; Aramideh, M.; Ongerboer de Visser, B. Eyelid movements: Behavioral studies of blinking in humans under different stimulus conditions. J. Neurophysiol. 2003, 89, 2784–2796.
- Cruz, A.A.; Garcia, D.M.; Pinto, C.T.; Cechetti, S.P. Spontaneous eyeblink activity. Ocul. Surf. 2011, 9, 29–41.
- Hasan, S.A.; Baker, R.S.; Sun, W.S.; Rouholiman, B.R.; Chuke, J.C.; Cowen, D.E.; Porter, J.D. The role of blink adaptation in the pathophysiology of benign essential blepharospasm. Arch. Ophthalmol. 1997, 115, 631–636.
- Kimura, N.; Watanabe, A.; Suzuki, K.; Toyoda, H.; Hakamata, N.; Fukuoka, H.; Washimi, Y.; Arahata, Y.; Takeda, A.; Kondo, M.; et al. Measurement of spontaneous blinks in patients with Parkinson’s disease using a new high-speed blink analysis system. J. Neurol. Sci. 2017, 380, 200–204.
- Ogawa, K.; Okazaki, M.; Mori, H.; Hidaka, T.; Tomioka, Y.; Tanaka, K.; Uemura, N.; Akiyama, M. Comparative blink analysis in patients with established facial paralysis using high-speed video analysis. J. Craniofacial Surg. 2022, 33, 797–802.
- Cohen, D.J.; Detlor, J.; Young, J.G.; Shaywitz, B.A. Clonidine ameliorates Gilles de la Tourette syndrome. Arch. Gen. Psychiatry 1980, 37, 1350–1357.
- Osaki, M.H.; Osaki, T.H.; Garcia, D.M.; Osaki, T.; Gameiro, G.R.; Belfort, R.; Cruz, A.A.V. Analysis of blink activity and anomalous eyelid movements in patients with hemifacial spasm. Graefe’s Arch. Clin. Exp. Ophthalmol. 2020, 258, 669–674.
- Craig, J.P.; Nichols, K.K.; Akpek, E.K.; Caffery, B.; Dua, H.S.; Joo, C.K.; Liu, Z.; Nelson, J.D.; Nichols, J.J.; Tsubota, K.; et al. TFOS DEWS II definition and classification report. Ocul. Surf. 2017, 15, 276–283.
- Stevens, J.R. Eye blink and schizophrenia: Psychosis or tardive dyskinesia? Am. J. Psychiatry 1978, 135, 223–226.
- Dawson, D.; Searle, A.K.; Paterson, J.L. Look before you (s)leep: Evaluating the use of fatigue detection technologies within a fatigue risk management system for the road transport industry. Sleep Med. Rev. 2014, 18, 141–152.
- Ezzat, M.; Maged, M.; Gamal, Y.; Adel, M.; Alrahmawy, M.; El-Metwally, S. Blink-To-Live eye-based communication system for users with speech impairments. Sci. Rep. 2023, 13, 7961.
- Soukupova, T.; Cech, J. Eye blink detection using facial landmarks. In Proceedings of the 21st Computer Vision Winter Workshop, Rimske Toplice, Slovenia, 3–5 February 2016; Volume 2.
- Ge, Z. YOLOX: Exceeding YOLO series in 2021. arXiv 2021, arXiv:2107.08430.
- Kartynnik, Y.; Ablavatski, A.; Grishchenko, I.; Grundmann, M. Real-time facial surface geometry from monocular video on mobile GPUs. arXiv 2019, arXiv:1907.06724.
- Viola, P.; Jones, M.J. Robust real-time face detection. Int. J. Comput. Vis. 2004, 57, 137–154.
- Li, J.W. Eye blink detection based on multiple Gabor response waves. In Proceedings of the 2008 International Conference on Machine Learning and Cybernetics, Kunming, China, 12–15 July 2008; IEEE: Piscataway, NJ, USA, 2008; Volume 5, pp. 2852–2856.
- Pauly, L.; Sankar, D. A novel method for eye tracking and blink detection in video frames. In Proceedings of the 2015 IEEE International Conference on Computer Graphics, Vision and Information Security (CGVIS), Bhubaneshwar, India, 2–3 November 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 252–257.
- Królak, A.; Strumiłło, P. Eye-blink detection system for human–computer interaction. Univers. Access Inf. Soc. 2012, 11, 409–419.
- Choi, I.; Han, S.; Kim, D. Eye detection and eye blink detection using AdaBoost learning and grouping. In Proceedings of the 2011 20th International Conference on Computer Communications and Networks (ICCCN), Maui, HI, USA, 31 July–4 August 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 1–4.
- Al-gawwam, S.; Benaissa, M. Robust eye blink detection based on eye landmarks and Savitzky–Golay filtering. Information 2018, 9, 93.
- Drutarovsky, T.; Fogelton, A. Eye blink detection using variance of motion vectors. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 436–448.
- Fogelton, A.; Benesova, W. Eye blink detection based on motion vectors analysis. Comput. Vis. Image Underst. 2016, 148, 23–33.
- Farnebäck, G. Two-frame motion estimation based on polynomial expansion. In Proceedings of the Image Analysis: 13th Scandinavian Conference, SCIA 2003, Halmstad, Sweden, 29 June–2 July 2003; Proceedings 13. Springer: Berlin/Heidelberg, Germany, 2003; pp. 363–370.
- Fogelton, A.; Benesova, W. Eye blink completeness detection. Comput. Vis. Image Underst. 2018, 176, 78–85.
- de la Cruz, G.; Lira, M.; Luaces, O.; Remeseiro, B. Eye-LRCN: A long-term recurrent convolutional network for eye blink completeness detection. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 5130–5140.
- Nousias, G.; Panagiotopoulou, E.K.; Delibasis, K.; Chaliasou, A.M.; Tzounakou, A.M.; Labiris, G. Video-based eye blink identification and classification. IEEE J. Biomed. Health Inform. 2022, 26, 3284–3293.
- Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
- Nellas, I.A.; Tasoulis, S.K.; Georgakopoulos, S.V.; Plagianakos, V.P. Two phase cooperative learning for supervised dimensionality reduction. Pattern Recognit. 2023, 144, 109871.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Beucher, S. Watershed, hierarchical segmentation and waterfall algorithm. In Mathematical Morphology and Its Applications to Image Processing; Springer: Berlin/Heidelberg, Germany, 1994; pp. 69–76.
- Talking Face Video. Available online: https://personalpages.manchester.ac.uk/staff/Timothy.F.Cootes/data/talking_face/talking_face.html (accessed on 31 March 2021).

Method | Actual | Total: Predicted Blink | Total: Predicted No Blink | Right: Predicted Blink | Right: Predicted No Blink | Left: Predicted Blink | Left: Predicted No Blink |
---|---|---|---|---|---|---|---|
de la Cruz et al. [28] | Blink | 952 | 220 | 480 | 106 | 472 | 114 |
de la Cruz et al. [28] | No Blink | 185 | - | 100 | - | 85 | - |
Fogelton et al. [27] | Blink | 833 | 339 | 479 | 107 | 354 | 232 |
Fogelton et al. [27] | No Blink | 68 | - | 41 | - | 27 | - |
Nousias et al. [29] | Blink | 1113 | 59 | 562 | 24 | 551 | 35 |
Nousias et al. [29] | No Blink | 265 | - | 155 | - | 110 | - |
3D CNN | Blink | 1056 | 116 | 522 | 64 | 534 | 52 |
3D CNN | No Blink | 126 | - | 64 | - | 62 | - |
3D autoencoder | Blink | 1050 | 122 | 518 | 68 | 532 | 54 |
3D autoencoder | No Blink | 121 | - | 61 | - | 60 | - |
3D ResNet | Blink | 1106 | 66 | 541 | 45 | 565 | 21 |
3D ResNet | No Blink | 94 | - | 40 | - | 54 | - |

Methods | Our Dataset: Accuracy (%) | Our Dataset: F1-Score (%) | Talking Face [34]: Accuracy (%) | Talking Face [34]: F1-Score (%) |
---|---|---|---|---|
de la Cruz et al. [28] | 70.15 | 82.46 | -** | 97.90 * |
Fogelton et al. [27] | 67.18 | 80.37 | -** | 97.10 * |
Nousias et al. [29] | 77.45 | 87.29 | 86.51 | 92.80 |
3D CNN | 81.36 | 89.72 | 90.48 | 95.00 |
3D autoencoder | 81.21 | 89.63 | 91.27 | 95.44 |
3D ResNet | 87.36 | 93.25 | 92.86 | 96.30 |
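
The metric formulas themselves are not reproduced in this excerpt, but the "Our Dataset" figures above can be recomputed from the confusion counts. The sketch below assumes that accuracy is TP / (TP + FP + FN), since true negatives are marked "-" in the confusion table, and that F1 is 2TP / (2TP + FP + FN). It takes TP and FN from the "Blink" row and FP from the "Non Blink" row of the "Total" columns. The function name `blink_scores` is hypothetical, and the formulas are an inference from the numbers rather than a quoted definition.

```python
from typing import Tuple


def blink_scores(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (accuracy, F1) in percent, without a true-negative term."""
    accuracy = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return round(100 * accuracy, 2), round(100 * f1, 2)


# "Total" counts as (TP, FP, FN) for two of the methods in the confusion table.
counts = {
    "de la Cruz et al. [28]": (952, 185, 220),
    "3D ResNet": (1106, 94, 66),
}

for method, (tp, fp, fn) in counts.items():
    accuracy, f1 = blink_scores(tp, fp, fn)
    print(f"{method}: accuracy = {accuracy:.2f}, F1 = {f1:.2f}")
# de la Cruz et al. [28]: accuracy = 70.15, F1 = 82.46
# 3D ResNet: accuracy = 87.36, F1 = 93.25
```

Under these assumptions the recomputed values match the tabulated ones for all six methods on the authors' dataset.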

Models | 12-Frame Sequence, Single-Eye Inference | 60 s Video, Single-Eye Inference | 60 s Video, Both Eyes (Total Time) | Learnable Parameters |
---|---|---|---|---|
3D CNN | 0.13 s | 195 s | 510 s | 20,590,034 |
3D autoencoder | 0.15 s | 225 s | 570 s | 35,375,507 |
3D ResNet | 0.09 s | 135 s | 390 s | 174,500 |
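
The timings above can be checked against a simple cost model. The sketch below only does arithmetic on the tabulated figures: it divides the single-eye time for a 60 s video by the per-sequence time, and subtracts twice the single-eye time from the both-eyes total. Reading the first result as the number of forward passes, and the second as time spent on steps shared by both eyes, is an assumption, because the source does not break these figures down here. The dictionary and function names are hypothetical.

```python
# Figures copied from the table above.
PER_SEQUENCE_MS = {"3D CNN": 130, "3D autoencoder": 150, "3D ResNet": 90}
SINGLE_EYE_S = {"3D CNN": 195, "3D autoencoder": 225, "3D ResNet": 135}
BOTH_EYES_S = {"3D CNN": 510, "3D autoencoder": 570, "3D ResNet": 390}


def implied_passes(model: str) -> int:
    """Single-eye video time divided by the per-sequence inference time."""
    return SINGLE_EYE_S[model] * 1000 // PER_SEQUENCE_MS[model]


def remaining_time_s(model: str) -> int:
    """Both-eyes total minus two single-eye inference runs."""
    return BOTH_EYES_S[model] - 2 * SINGLE_EYE_S[model]


for model in PER_SEQUENCE_MS:
    print(f"{model}: {implied_passes(model)} passes, "
          f"{remaining_time_s(model)} s outside inference")
# 3D CNN: 1500 passes, 120 s outside inference
# 3D autoencoder: 1500 passes, 120 s outside inference
# 3D ResNet: 1500 passes, 120 s outside inference
```

All three models give the same two numbers, which is consistent with one forward pass per frame position (step = 1) plus a fixed cost that does not depend on the network.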

Methods | IoU = 0.2, TP | IoU = 0.2, FP | IoU = 0.3, TP | IoU = 0.3, FP | IoU = 0.4, TP | IoU = 0.4, FP |
---|---|---|---|---|---|---|
3D CNN | 1056 | 172 | 1056 | 162 | 1050 | 174 |
3D autoencoder | 1050 | 166 | 1052 | 163 | 1044 | 167 |
3D ResNet | 1106 | 172 | 1100 | 166 | 1102 | 167 |
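
The intersection-over-union thresholds in the table above suggest that predicted and annotated blinks are compared as frame intervals. The following is a minimal sketch of temporal IoU and threshold-based counting, assuming inclusive `(first_frame, last_frame)` intervals. The counting rule used here (an actual blink is a TP if any prediction reaches the threshold, and a prediction is an FP if it reaches the threshold with no actual blink) is a simplification. It is not the many-to-one definition proposed in the paper, which is not reproduced in this excerpt. The names `interval_iou` and `count_matches` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (first_frame, last_frame)


def interval_iou(a: Interval, b: Interval) -> float:
    """Intersection over union of two inclusive frame intervals."""
    intersection = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if intersection <= 0:
        return 0.0
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - intersection
    return intersection / union


def count_matches(
    actual: List[Interval], predicted: List[Interval], threshold: float
) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) under the simplified matching rule."""
    tp = sum(
        1 for a in actual
        if any(interval_iou(a, p) >= threshold for p in predicted)
    )
    fp = sum(
        1 for p in predicted
        if not any(interval_iou(a, p) >= threshold for a in actual)
    )
    fn = len(actual) - tp
    return tp, fp, fn


actual = [(10, 19), (40, 49)]
predicted = [(12, 21), (70, 75)]

print(round(interval_iou(actual[0], predicted[0]), 3))  # 0.667
print(count_matches(actual, predicted, threshold=0.2))  # (1, 1, 1)
print(count_matches(actual, predicted, threshold=0.7))  # (0, 2, 2)
```

Raising the threshold turns a loosely overlapping detection from a TP into both an FP and an FN, which is why TP and FP counts are reported per threshold.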
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Nousias, G.; Delibasis, K.K.; Labiris, G. Blink Detection Using 3D Convolutional Neural Architectures and Analysis of Accumulated Frame Predictions. J. Imaging 2025, 11, 27. https://doi.org/10.3390/jimaging11010027