
Deep Learning Based Face Recognition and Feature Extraction

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 15 April 2025 | Viewed by 13908

Special Issue Editors


Prof. Dr. Bogdan Smolka
Guest Editor
Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
Interests: image denoising; image segmentation; image super-resolution; object detection; deep learning-based filtering

Dr. Karolina Nurzynska
Guest Editor
Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
Interests: image understanding; machine learning; artificial intelligence; emotion analysis

Dr. Michal Kawulok
Guest Editor
Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
Interests: machine learning; pattern recognition; image processing; super-resolution reconstruction; face and gesture recognition; remote sensing

Special Issue Information

Dear Colleagues,

Human faces play a central role in interpersonal communication and social relationships, which is why their automatic recognition and analysis have attracted the attention of the computer vision community for decades. State-of-the-art facial recognition techniques enable the identification of a person for ID verification as well as the recognition and understanding of a person's psychophysical state, which is essential for smooth, high-quality human–computer interaction. Efficient facial recognition, verification, and identification algorithms are essential for developing reliable access control, surveillance, and security systems. In addition, the analysis of users' emotional states provides valuable feedback, improving the user experience in many application areas.

In recent years, a number of face recognition methods based on deep learning and various feature extraction techniques have been developed, leading to significant advances in the field. Indeed, face recognition is one of the most active areas of computer vision research, and recent advances in deep learning-based systems have significantly improved performance compared to solutions using classical machine learning and pattern recognition techniques. This Special Issue seeks original technical and review papers on the latest applications of deep learning, including, but not limited to:

  • Face and facial landmark detection and tracking,
  • Face recognition and identification,
  • Large-scale face recognition,
  • Facial expression recognition and analysis,
  • 3D face modeling,
  • Applications of face and facial expression recognition,
  • Fusion of multiple modalities,
  • Face anti-spoofing techniques,
  • Face liveness detection from images and videos,
  • Surveillance applications,
  • Benchmarking and new protocols,
  • Psychological and behavioral analysis.

Prof. Dr. Bogdan Smolka
Dr. Karolina Nurzynska
Dr. Michal Kawulok
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)


Research

18 pages, 1139 KiB  
Article
Facial Movements Extracted from Video for the Kinematic Classification of Speech
by Richard Palmer, Roslyn Ward, Petra Helmholz, Geoffrey R. Strauss, Paul Davey, Neville Hennessey, Linda Orton and Aravind Namasivayam
Sensors 2024, 24(22), 7235; https://doi.org/10.3390/s24227235 - 12 Nov 2024
Viewed by 768
Abstract
Speech Sound Disorders (SSDs) are prevalent communication problems in children that pose significant barriers to academic success and social participation. Accurate diagnosis is key to mitigating life-long impacts. We are developing a novel software solution, the Speech Movement and Acoustic Analysis Tracking (SMAAT) system, to facilitate rapid and objective assessment of the motor speech control issues underlying SSDs. This study evaluates the feasibility of using automatically extracted three-dimensional (3D) facial measurements from single two-dimensional (2D) front-facing video cameras for classifying speech movements. Videos were recorded of 51 adults and 77 children between 3 and 4 years of age (all typically developing for their age) saying 20 words from the mandibular and labial-facial levels of the Motor-Speech Hierarchy Probe Wordlist (MSH-PW). Measurements around the jaw and lips were automatically extracted from the 2D video frames using a state-of-the-art facial mesh detection and tracking algorithm, and each individual measurement was tested in a Leave-One-Out Cross-Validation (LOOCV) framework for its word classification performance. Statistics were evaluated at the α = 0.05 significance level, and several measurements were found to exhibit significant classification performance in both the adult and child cohorts. Importantly, measurements of depth indirectly inferred from the 2D video frames were among those found to be significant. The significant measurements were shown to match expectations of facial movements across the 20 words, demonstrating their potential applicability in supporting clinical evaluations of speech production.
(This article belongs to the Special Issue Deep Learning Based Face Recognition and Feature Extraction)
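For readers who want to experiment with the evaluation protocol this abstract describes, the following minimal Python sketch runs a per-measurement leave-one-out cross-validation for word classification. The data, the classifier choice, and all names are illustrative assumptions; the published SMAAT pipeline is considerably more involved.

```python
# Minimal sketch of a per-measurement LOOCV word-classification test.
# Everything here is illustrative; the paper's pipeline (mesh tracking,
# significance testing at alpha = 0.05) is more involved.
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

def loocv_accuracy(feature: np.ndarray, words: np.ndarray) -> float:
    """LOOCV word-classification accuracy for one scalar measurement.

    feature: shape (n_utterances,), e.g. peak jaw opening per utterance
    words:   shape (n_utterances,), word label of each utterance
    """
    X = feature.reshape(-1, 1)
    hits = 0
    for train_idx, test_idx in LeaveOneOut().split(X):
        clf = KNeighborsClassifier(n_neighbors=3)
        clf.fit(X[train_idx], words[train_idx])
        hits += int(clf.predict(X[test_idx])[0] == words[test_idx][0])
    return hits / len(X)

# Toy usage: 60 utterances of 20 words (3 repeats each), synthetic feature.
rng = np.random.default_rng(0)
words = np.repeat(np.arange(20), 3)
feature = words + rng.normal(0, 0.3, size=words.size)  # word-dependent cue
print(f"LOOCV accuracy: {loocv_accuracy(feature, words):.2f}")
```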

26 pages, 18348 KiB  
Article
Multitask Learning Strategy with Pseudo-Labeling: Face Recognition, Facial Landmark Detection, and Head Pose Estimation
by Yongju Lee, Sungjun Jang, Han Byeol Bae, Taejae Jeon and Sangyoun Lee
Sensors 2024, 24(10), 3212; https://doi.org/10.3390/s24103212 - 18 May 2024
Viewed by 1110
Abstract
Most facial analysis methods perform well in standardized testing but not in real-world testing. The main reason is that training models cannot easily learn various human features and background noise, especially for facial landmark detection and head pose estimation tasks with limited and noisy training datasets. To alleviate the gap between standardized and real-world testing, we propose a pseudo-labeling technique using a face recognition dataset consisting of various people and background noise. The use of our pseudo-labeled training dataset can help to overcome the lack of diversity among the people in the dataset. Our integrated framework is constructed using complementary multitask learning methods to extract robust features for each task. Furthermore, introducing pseudo-labeling and multitask learning improves the face recognition performance by enabling the learning of pose-invariant features. Our method achieves state-of-the-art (SOTA) or near-SOTA performance on the AFLW2000-3D and BIWI datasets for facial landmark detection and head pose estimation, with competitive face verification performance on the IJB-C test dataset for face recognition. We demonstrate this through a novel testing methodology that categorizes cases as soft, medium, and hard based on the pose values of IJB-C. The proposed method achieves stable performance even when the dataset lacks diverse face identifications.
(This article belongs to the Special Issue Deep Learning Based Face Recognition and Feature Extraction)
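The pseudo-labeling idea above can be illustrated with a small, self-contained PyTorch sketch: a teacher network annotates unlabeled face-recognition images with landmark and pose pseudo-labels, which then supervise a shared-backbone multitask student. The toy modules and the unweighted loss sum are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of pseudo-labeled multitask training: the teacher's
# landmark/pose outputs stand in for missing annotations on a large,
# diverse face-recognition dataset. Architectures are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMultitaskNet(nn.Module):
    """Toy shared-backbone network with three task heads (illustrative)."""
    def __init__(self, n_ids=100, n_landmarks=68):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.id_head = nn.Linear(16, n_ids)            # face recognition
        self.lm_head = nn.Linear(16, n_landmarks * 2)  # landmark detection
        self.pose_head = nn.Linear(16, 3)              # yaw / pitch / roll

    def forward(self, x):
        f = self.backbone(x)
        return self.id_head(f), self.lm_head(f), self.pose_head(f)

@torch.no_grad()
def pseudo_label(teacher, faces):
    """Annotate unlabeled recognition images with landmark/pose targets."""
    teacher.eval()
    _, lm, pose = teacher(faces)
    return lm, pose

student, teacher = TinyMultitaskNet(), TinyMultitaskNet()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
faces = torch.randn(8, 3, 112, 112)          # stand-in image batch
ids = torch.randint(0, 100, (8,))            # real identity labels
lm_pl, pose_pl = pseudo_label(teacher, faces)

id_logits, lm, pose = student(faces)
loss = (F.cross_entropy(id_logits, ids)      # supervised recognition
        + F.l1_loss(lm, lm_pl)               # pseudo-labeled landmarks
        + F.l1_loss(pose, pose_pl))          # pseudo-labeled head pose
loss.backward(); opt.step()
```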

15 pages, 2180 KiB  
Article
Predicting Facial Attractiveness from Colour Cues: A New Analytic Framework
by Yan Lu, Kaida Xiao, Michael Pointer and Yandan Lin
Sensors 2024, 24(2), 391; https://doi.org/10.3390/s24020391 - 9 Jan 2024
Viewed by 1459
Abstract
Various facial colour cues have been identified as valid predictors of facial attractiveness, yet the conventional univariate approach has simplified the complex nature of attractiveness judgement for real human faces. Predicting attractiveness from colour cues is difficult due to the high number of candidate variables and their inherent correlations. Using datasets from Chinese subjects, this study proposed a novel analytic framework for modelling attractiveness from various colour characteristics. One hundred images of real human faces were used in experiments, and an extensive set of 65 colour features was extracted. Two separate attractiveness evaluation sets of data were collected through psychophysical experiments in the UK and China as training and testing datasets, respectively. Eight multivariate regression strategies were compared for their predictive accuracy and simplicity. The proposed methodology achieved a comprehensive assessment of diverse facial colour features and their role in attractiveness judgements of real faces; improved the predictive accuracy (the best-fit model achieved an out-of-sample accuracy of 0.66 on a 7-point scale) and significantly mitigated the issue of model overfitting; and effectively simplified the model and identified the most important colour features. It can serve as a useful and repeatable analytic tool for future research on facial impression modelling using high-dimensional datasets.
(This article belongs to the Special Issue Deep Learning Based Face Recognition and Feature Extraction)
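As a rough illustration of comparing multivariate regression strategies on high-dimensional, correlated colour features, the sketch below fits three common regressors on synthetic stand-in data and reports out-of-sample error. The study compared eight strategies on real colour measurements and ratings; everything here, including the sparse ground-truth weights, is an assumption.

```python
# Illustrative model comparison for a "many correlated predictors, few
# truly relevant" regression problem, loosely mirroring 65 colour features
# and 7-point attractiveness ratings. Data are synthetic stand-ins.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
X_train = rng.normal(size=(100, 65))              # 100 faces x 65 features
w = rng.normal(size=65) * (rng.random(65) < 0.1)  # few truly relevant cues
y_train = X_train @ w + rng.normal(0, 0.5, 100)   # centered ratings
X_test = rng.normal(size=(50, 65))
y_test = X_test @ w + rng.normal(0, 0.5, 50)

models = {
    "ridge": RidgeCV(alphas=np.logspace(-3, 3, 13)),
    "lasso": LassoCV(cv=5, random_state=0),
    "pls":   PLSRegression(n_components=5),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    err = mean_absolute_error(y_test, np.ravel(model.predict(X_test)))
    print(f"{name}: out-of-sample MAE = {err:.2f}")
```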

22 pages, 5540 KiB  
Article
Research on Fatigued-Driving Detection Method by Integrating Lightweight YOLOv5s and Facial 3D Keypoints
by Xiansheng Ran, Shuai He and Rui Li
Sensors 2023, 23(19), 8267; https://doi.org/10.3390/s23198267 - 6 Oct 2023
Cited by 3 | Viewed by 1677
Abstract
In response to the problem of the high computational and parameter requirements of fatigued-driving detection models, as well as their weak facial-feature keypoint extraction capability, this paper proposes a lightweight, real-time fatigued-driving detection model based on an improved YOLOv5s and the Attention Mesh 3D keypoint extraction method. The main strategies are as follows: (1) Using Shufflenetv2_BD to reconstruct the Backbone network to reduce parameter complexity and computational load. (2) Introducing and improving the fusion method of the Cross-scale Aggregation Module (CAM) between the Backbone and Neck networks to reduce information loss in shallow features of the closed-eyes and closed-mouth categories. (3) Building a lightweight Context Information Fusion Module by combining the Efficient Multi-Scale Module (EAM) and Depthwise Over-Parameterized Convolution (DoConv) to enhance the Neck network's ability to extract facial features. (4) Redefining the loss function using Wise-IoU (WIoU) to accelerate model convergence. Finally, the fatigued-driving detection model is constructed by combining the classification detection results with the thresholds of continuous closed-eye frames, continuous yawning frames, and the PERCLOS (Percentage of Eyelid Closure over the Pupil over Time) of the eyes and mouth. With the number of parameters and the size of the baseline model reduced by 58% and 56.3%, respectively, and floating-point computation of only 5.9 GFLOPs, the average accuracy of the baseline model is increased by 1%, and the fatigue-recognition rate is 96.3%, which shows that the proposed algorithm achieves accurate and stable real-time detection while remaining lightweight. It provides strong support for lightweight deployment on vehicle terminals.
(This article belongs to the Special Issue Deep Learning Based Face Recognition and Feature Extraction)
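The frame-aggregation logic named in this abstract (PERCLOS plus runs of consecutive closed-eye and yawning frames) can be expressed compactly. The sketch below is a minimal interpretation; the thresholds are placeholders, not the paper's tuned values.

```python
# Minimal sketch of fatigue decision logic: PERCLOS plus longest runs of
# consecutive closed-eye / yawning frames. Thresholds are placeholders.
def longest_run(flags):
    """Length of the longest run of consecutive True frames."""
    best = cur = 0
    for f in flags:
        cur = cur + 1 if f else 0
        best = max(best, cur)
    return best

def is_fatigued(eye_closed, mouth_open,
                perclos_thr=0.25, eye_run_thr=45, yawn_run_thr=60):
    """Fuse per-frame detector outputs into a fatigue decision.

    eye_closed, mouth_open: per-frame booleans from the detector.
    """
    perclos = sum(eye_closed) / len(eye_closed)  # fraction of closed frames
    return (perclos > perclos_thr
            or longest_run(eye_closed) > eye_run_thr
            or longest_run(mouth_open) > yawn_run_thr)

# Toy usage over a 10 s window at 30 fps: occasional blinks, no yawns.
frames = 300
eyes = [i % 10 == 0 for i in range(frames)]
mouth = [False] * frames
print(is_fatigued(eyes, mouth))  # False: normal blinking only
```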

22 pages, 8887 KiB  
Article
GANMasker: A Two-Stage Generative Adversarial Network for High-Quality Face Mask Removal
by Mohamed Mahmoud and Hyun-Soo Kang
Sensors 2023, 23(16), 7094; https://doi.org/10.3390/s23167094 - 10 Aug 2023
Cited by 9 | Viewed by 2237
Abstract
Deep-learning-based image inpainting methods have made remarkable advancements, particularly in object removal tasks. The removal of face masks has gained significant attention, especially in the wake of the COVID-19 pandemic, and while numerous methods have successfully addressed the removal of small objects, removing large and complex masks from faces remains demanding. This paper presents a novel two-stage network for unmasking faces that considers the intricate facial features typically concealed by masks, such as noses, mouths, and chins. Additionally, the scarcity of paired datasets comprising masked and unmasked face images poses an additional challenge. In the first stage of our proposed model, we employ an autoencoder-based network for binary segmentation of the face mask. Subsequently, in the second stage, we introduce a generative adversarial network (GAN)-based network enhanced with attention and Masked–Unmasked Region Fusion (MURF) mechanisms to focus on the masked region. Our network generates realistic and accurate unmasked faces that resemble the original faces. We train our model on paired unmasked and masked face images sourced from CelebA, a large public dataset, and evaluate its performance on multi-scale masked faces. The experimental results illustrate that the proposed method surpasses the current state-of-the-art techniques in both qualitative and quantitative metrics. It achieves a Peak Signal-to-Noise Ratio (PSNR) improvement of 4.18 dB over the second-best method, with the PSNR reaching 30.96. Additionally, it exhibits a 1% increase in the Structural Similarity Index Measure (SSIM), achieving a value of 0.95.
(This article belongs to the Special Issue Deep Learning Based Face Recognition and Feature Extraction)
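The two-stage structure described above, segmentation first and region-focused inpainting second, can be sketched as follows. The tiny networks are placeholders for the paper's autoencoder segmenter and attention/MURF-enhanced GAN generator; only the stage composition is the point.

```python
# Hedged sketch of a two-stage unmasking pipeline: stage 1 predicts a
# binary mask-region segmentation, stage 2 inpaints only that region.
# Both modules are toy placeholders, not the paper's architecture.
import torch
import torch.nn as nn

seg_net = nn.Sequential(                      # stage 1: where is the mask?
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid())

generator = nn.Sequential(                    # stage 2: fill the region
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())

def unmask(face: torch.Tensor) -> torch.Tensor:
    """face: (B, 3, H, W) in [-1, 1]; returns the reconstructed face."""
    mask = (seg_net(face) > 0.5).float()      # (B, 1, H, W) binary mask
    inpainted = generator(torch.cat([face * (1 - mask), mask], dim=1))
    # Composite: keep original pixels outside the predicted mask region.
    return face * (1 - mask) + inpainted * mask

out = unmask(torch.randn(2, 3, 128, 128).clamp(-1, 1))
print(out.shape)  # torch.Size([2, 3, 128, 128])
```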

42 pages, 3753 KiB  
Article
Enhancing Smart Home Security: Anomaly Detection and Face Recognition in Smart Home IoT Devices Using Logit-Boosted CNN Models
by Asif Rahim, Yanru Zhong, Tariq Ahmad, Sadique Ahmad, Paweł Pławiak and Mohamed Hammad
Sensors 2023, 23(15), 6979; https://doi.org/10.3390/s23156979 - 6 Aug 2023
Cited by 15 | Viewed by 5697
Abstract
Internet of Things (IoT) devices for the home have made a lot of people’s lives better, but their popularity has also raised privacy and safety concerns. This study explores the application of deep learning models for anomaly detection and face recognition in IoT devices within the context of smart homes. Six models, namely, LR-XGB-CNN, LR-GBC-CNN, LR-CBC-CNN, LR-HGBC-CNN, LR-ABC-CNN, and LR-LGBM-CNN, were proposed and evaluated for their performance. The models were trained and tested on labeled datasets of sensor readings and face images, using a range of performance metrics to assess their effectiveness. Performance evaluations were conducted for each of the proposed models, revealing their strengths and areas for improvement. Comparative analysis of the models showed that the LR-HGBC-CNN model consistently outperformed the others in both anomaly detection and face recognition tasks, achieving high accuracy, precision, recall, F1 score, and AUC-ROC values. For anomaly detection, the LR-HGBC-CNN model achieved an accuracy of 94%, a precision of 91%, a recall of 96%, an F1 score of 93%, and an AUC-ROC of 0.96. In face recognition, the LR-HGBC-CNN model demonstrated an accuracy of 88%, precision of 86%, recall of 90%, F1 score of 88%, and an AUC-ROC of 0.92. The models exhibited promising capabilities in detecting anomalies, recognizing faces, and integrating these functionalities within smart home IoT devices. The study’s findings underscore the potential of deep learning approaches for enhancing security and privacy in smart homes. However, further research is warranted to evaluate the models’ generalizability, explore advanced techniques such as transfer learning and hybrid methods, investigate privacy-preserving mechanisms, and address deployment challenges.
(This article belongs to the Special Issue Deep Learning Based Face Recognition and Feature Extraction)
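The hybrid model names suggest a CNN feature extractor combined with a boosted classifier and a logistic ("logit") layer. The sketch below shows one plausible reading of the LR-HGBC-CNN variant on stand-in data; it is an interpretation of the naming convention, not the authors' code.

```python
# Hedged sketch of a logit-boosted CNN hybrid: a small CNN embeds the
# inputs, a histogram gradient-boosting classifier operates on the
# embeddings, and logistic regression stacks the boosted probabilities.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

cnn = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten())  # 8-dim embedding

with torch.no_grad():
    X_img = torch.randn(200, 1, 32, 32)       # stand-in sensor/face inputs
    feats = cnn(X_img).numpy()
y = np.random.default_rng(2).integers(0, 2, 200)  # toy binary labels

hgbc = HistGradientBoostingClassifier().fit(feats, y)
probs = hgbc.predict_proba(feats)[:, [1]]     # boosted probability feature
stack = LogisticRegression().fit(probs, y)    # logit layer on top
print(stack.score(probs, y))
```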
