A Survey of Video Analysis Based on Facial Expression Recognition †
Abstract
1. Introduction
2. Research Methodology
3. Database
- The Extended Cohn-Kanade dataset (CK+) [6]: This is the most widely used laboratory-controlled dataset for emotion detection; it contains a total of 593 video sequences, with individual sequences ranging from 10 to 60 frames in duration.
- FER2013 [7]: This dataset was compiled using the Google image search API; as a result, it is a large-scale, unconstrained dataset, containing 28,709 training images, 3589 validation images, and 3589 test images. It is the second most used dataset after CK+.
- The Japanese Female Facial Expression (JAFFE) database [4]: This is a laboratory-controlled image database containing 213 expression samples, in which each of 10 women provides 3–4 images for each of the six basic facial expressions plus a neutral expression.
Dataset | Reference | Description |
---|---|---|
CASME II http://casme.psych.ac.cn/casme/e2 (Accessed on 15 March 2023) | Singh (2019) [5] | This dataset contains a total of 3000 facial movements and 247 microexpressions, which are divided into positive, negative and neutral categories: anger: 37 negative videos; contempt: 17 negative videos; disgust: 31 negative videos; fear: 12 negative videos; sadness: 23 negative videos; joy: 26 positive videos; surprise: 25 positive videos; and neutral: 46 videos. |
SMIC | Singh (2019) [5] | The SMIC dataset contains 164 spontaneous microexpressions from 16 participants. Joy: 69 videos; sadness: 75 videos; anger: 69 videos; disgust: 72 videos; surprise: 72 videos; and fear: 72 videos. |
KDEF | Akhand (2021) [4], Shehu (2022) [8] | The dataset contains a total of 4900 static images divided into 6 classes of expressions. Anger: 50 images; fear: 40 images; disgust: 40 images; happiness: 69 images; sadness: 59 images; and surprise: 28 images. |
CK+ https://www.kaggle.com/datasets/davilsena/ckdataset (Accessed on 20 February 2023) | Lucey (2010) [6], Li (2020) [9], Cai (2018) [10], Shehu (2022) [8] | This dataset contains a total of 920 grayscale images, each with a resolution of 48 × 48 pixels (2304 pixels in total). The dataset is split into 80% for training, 10% for public testing, and 10% for private testing. The number of images for each expression is as follows: anger: 45 images; disgust: 59 images; fear: 25 images; happiness: 69 images; sadness: 28 images; surprise: 83 images; neutral: 593 images; and contempt: 18 images. |
JAFFE https://zenodo.org/record/3451524#.Y4fh2KLMKbU (Accessed on 10 March 2023) | Akhand (2021) [4], Penny, S. (1998) [11], Li (2020) [9] | This dataset contains a total of 213 grayscale images of Japanese female facial expressions, with a resolution of 256 × 256 pixels. The number of images for each expression is as follows: anger: 30 images; happiness: 31 images; sadness: 31 images; surprise: 31 images; fear: 30 images; disgust: 31 images; and neutral: 29 images. |
FER2013 | Monica, B. (2013) [7], Melinte, D.O. (2020) [12], Li (2020) [9] | This dataset contains a total of 35,887 grayscale images of human faces labeled across six categories; each image has a resolution of 48 × 48 pixels. The number of images for each category is as follows: anger: 3180 images; fear: 1005 images; happiness: 7264 images; sadness: 5178 images; surprise: 2114 images; and neutral: 17,146 images. |
AffectNet | Mollahosseini (2019) [13] | This dataset consists of over one million images in JPEG format, where each image is labeled with one or multiple facial expressions. The category of neutral has 459,652 images, happiness 224,500 images, sadness 1,113,997 images, anger 44,418 images, fear 33,345 images, surprise 28,881 images, disgust 3102 images, and contempt 1211 images. |
MMI https://mmifacedb.eu/ (Accessed on 12 March 2023) | Pantic (2005) [14], Cai (2018) [10] | This dataset consists of over 2900 high-resolution videos, in which the identified emotions are joy, sadness, anger, surprise, fear, and disgust. |
AFEW https://ibug.doc.ic.ac.uk/resources/afew-va-database/ (Accessed on 24 February 2023) | Dhall (2012) [15] | This dataset contains approximately 1809 videos of facial expressions, including videos of individuals expressing six basic emotions: happiness (571 videos), sadness (527 videos), anger (485 videos), surprise (465 videos), fear (215 videos), and disgust (256 videos). |
Yale B https://www.kaggle.com/datasets/olgabelitskaya/yale-face-database (Accessed on 7 March 2023) | Bendjillali (2022) [16] | This dataset contains a total of 165 GIF images from 15 subjects, with 11 images per subject. The facial expressions used in the images include happiness, neutral, sadness, sleepy, and surprised. |
CMU PIE https://www.cs.cmu.edu/afs/cs/project/PIE/MultiPie/Multi-Pie/Home.html (Accessed on 1 March 2023) | Bendjillali (2022) [16] | This dataset contains over 750,000 images of 337 individuals. The subjects were photographed from 15 viewpoints and under 19 lighting conditions while displaying a variety of facial expressions. Additionally, high-resolution frontal images were also acquired. In total, the database contains over 305 GB of facial data, including 40 images of happiness expressions, 38 of sadness, 46 of anger, 51 of surprise, 42 of disgust, 38 of fear, 20 of neutral expressions, and 48 of smiles. |
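As a point of reference, the 48 × 48 grayscale images in FER2013 (and the Kaggle CK+ export above) are commonly distributed as CSV rows of flattened, space-separated pixel strings. The following is a minimal parsing sketch, assuming the familiar `emotion`/`pixels`/`Usage` column layout; the sample data here is synthetic, not taken from the real dataset:

```python
import csv
import io
import random

# Synthetic stand-in for a FER2013-style CSV: a label (0-6), a string of
# 48*48 space-separated pixel intensities, and a usage tag per row.
SAMPLE = "emotion,pixels,Usage\n" + "\n".join(
    f"{random.randrange(7)},{' '.join('0' for _ in range(48 * 48))},Training"
    for _ in range(5)
)

def load_fer_rows(text):
    """Parse FER2013-style CSV text into (label, 48x48 grid, usage) tuples."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        values = [int(v) for v in rec["pixels"].split()]
        # Reshape the flat 2304-value list into 48 rows of 48 pixels.
        grid = [values[r * 48:(r + 1) * 48] for r in range(48)]
        rows.append((int(rec["emotion"]), grid, rec["Usage"]))
    return rows

rows = load_fer_rows(SAMPLE)
print(len(rows), len(rows[0][1]), len(rows[0][1][0]))  # 5 48 48
```

The `Usage` column is what encodes the 80/10/10 training/public-test/private-test split mentioned for the Kaggle CK+ export; filtering rows by that tag reproduces the split.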
4. Facial Expression Recognition
4.1. Architectures
4.2. Comparison
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Roopa, S.N. Research on face expression recognition. Int. J. Innov. Technol. Explor. Eng. 2019, 8, 88–91.
- Ekman, P.; Friesen, W.V. Unmasking the Face: A Guide to Recognizing Emotions from Facial Clues; Prentice-Hall: Cambridge, MA, USA, 2003.
- Shi, M.; Xu, L.; Chen, X. A Novel Facial Expression Intelligent Recognition Method Using Improved Convolutional Neural Network. IEEE Access 2020, 8, 57606–57614.
- Akhand, M.A.H.; Roy, S.; Siddique, N.; Kamal, M.A.S.; Shimamura, T. Facial Emotion Recognition Using Transfer Learning in the Deep CNN. Electronics 2021, 10, 1036.
- Singh, S.; Nasoz, F. Facial Expression Recognition with Convolutional Neural Networks. In Proceedings of the 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 6–8 January 2020; pp. 324–328.
- Lucey, P.; Cohn, J.F.; Kanade, T.; Saragih, J.; Ambadar, Z.; Matthews, I. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, San Francisco, CA, USA, 13–18 June 2010; pp. 94–101.
- Monica, B.; Marco, M.; Lakhmi, C.J. Neural Information Processing; Springer: Berlin/Heidelberg, Germany, 2013.
- Shehu, H.A.; Browne, W.N.; Eisenbarth, H. An anti-attack method for emotion categorization from images. Appl. Soft Comput. 2022, 128, 109456.
- Li, J.; Jin, K.; Zhou, D.; Kubota, N.; Ju, Z. Attention mechanism-based CNN for facial expression recognition. Neurocomputing 2020, 411, 340–350.
- Cai, J.; Meng, Z.; Khan, A.S.; Li, Z.; O’Reilly, J.; Tong, Y. Island Loss for Learning Discriminative Features in Facial Expression Recognition. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 15–19 May 2018.
- Penny, S. In Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 14–16 April 1998; IEEE Computer Society: Los Alamitos, CA, USA, 1998.
- Melinte, D.O.; Vladareanu, L. Facial expressions recognition for human–robot interaction using deep convolutional neural networks with rectified adam optimizer. Sensors 2020, 20, 2393.
- Mollahosseini, A.; Hasani, B.; Mahoor, M.H. AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild. IEEE Trans. Affect. Comput. 2019, 10, 18–31.
- Pantic, M.; Valstar, M.; Rademaker, R.; Maat, L. Web-based database for facial expression analysis. In Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands, 6 July 2005; Volume 2005, pp. 317–321.
- Dhall, A.; Goecke, R.; Lucey, S.; Gedeon, T. Collecting large, richly annotated facial-expression databases from movies. IEEE Multimed. 2012, 19, 34–41.
- Bendjillali, R.I.; Beladgham, M.; Merit, K.; Taleb-Ahmed, A. Illumination-robust face recognition based on deep convolutional neural networks architectures. Indones. J. Electr. Eng. Comput. Sci. 2020, 18, 1015–1027.
- Li, S.; Guo, L.; Liu, J. Towards East Asian Facial Expression Recognition in the Real World: A New Database and Deep Recognition Baseline. Sensors 2022, 22, 8089.
- Bhatt, D.; Patel, C.; Talsania, H.; Patel, J.; Vaghela, R.; Pandya, S.; Modi, K.; Ghayvat, H. CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope. Electronics 2021, 10, 2470.
Dataset | # Classes | Img. Res | Vid. Res | Fps | Duration | Public? |
---|---|---|---|---|---|---|
CASME II | 8 | 640 × 480 px–1280 × 960 px | 640 × 480 px–1280 × 960 px | 60 | 5 s | Yes |
SMIC | 6 | 320 × 240 px–640 × 480 px | 640 × 480 px–1280 × 960 px | 25–30 | 2 h 30 min | No |
KDEF | 6 | 490 × 640 px | - | - | - | Yes |
CK+ | 8 | 48 × 48 px | - | - | - | Yes |
JAFFE | 7 | 256 × 256 px | - | - | - | Yes |
FER2013 | 6 | 48 × 48 px | - | - | - | Yes |
AffectNet | 8 | 224 × 224 px–512 × 512 px | - | - | - | Yes |
MMI | 6 | 640 × 480 px–800 × 600 px–1280 × 960 px | 640 × 480 px–800 × 600 px–1280 × 960 px | 30 | - | Yes |
AFEW | 6 | 640 × 480 px–1920 × 1080 px | 640 × 480 px–1920 × 1080 px | 25 | 1 s to 6 s | Yes |
Yale B | 5 | 192 × 168 px | - | - | - | Yes |
CMU PIE | 8 | 320 × 240 px–640 × 480 px | - | - | - | Yes |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Díaz, P.; Vásquez, E.; Shiguihara, P. A Survey of Video Analysis Based on Facial Expression Recognition. Eng. Proc. 2023, 42, 3. https://doi.org/10.3390/engproc2023042003