GAN-Based High-Quality Face-Swapping Composite Network
Abstract
1. Introduction
- We design a composite face-swapping generation network to address feature loss and image blurring during face swapping. The model comprises two main modules: a facial feature extraction module and a facial feature fusion generation module.
- To address unnatural facial feature fusion and poor image quality in face-swapping tasks, we combine variational autoencoders (VAEs) [17] with GANs to improve the quality of the swapped image.
- Experiments show that the proposed model performs better than the compared models in face recognition (ID retrieval), pose verification, and image quality assessment. For image quality, it lowers the FID score to 0.46 and reaches an SSIM score of 0.91.
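As an illustration of how a VAE term and a GAN term can be combined in one generator objective, the following is a minimal sketch in plain Python. It is not the loss used in this paper: the L1 reconstruction term, the non-saturating adversarial term, the weights `beta` and `lam`, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """VAE reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def l1_reconstruction(target, generated):
    """Mean absolute pixel difference (assumed reconstruction term)."""
    return sum(abs(t - g) for t, g in zip(target, generated)) / len(target)


def generator_adversarial_loss(d_fake):
    """Non-saturating GAN generator loss -log D(G(z)); d_fake must lie in (0, 1]."""
    return -math.log(d_fake)


def vae_gan_generator_loss(target, generated, mu, log_var, d_fake,
                           beta=1.0, lam=0.1):
    """Weighted sum of reconstruction, KL, and adversarial terms.

    beta and lam are hypothetical weights, not values taken from the paper.
    """
    return (l1_reconstruction(target, generated)
            + beta * kl_to_standard_normal(mu, log_var)
            + lam * generator_adversarial_loss(d_fake))


# Toy usage with flattened "images" and a 2-D latent code.
rng = random.Random(0)
z = reparameterize([0.3, -0.1], [-1.0, -2.0], rng)
loss = vae_gan_generator_loss(
    target=[0.2, 0.5, 0.9], generated=[0.25, 0.45, 0.8],
    mu=[0.3, -0.1], log_var=[-1.0, -2.0], d_fake=0.4)
print(z, loss)
```

The KL term keeps the latent code close to a standard normal prior, while the adversarial term pushes the generated image towards outputs the discriminator accepts as real; how the two are balanced in practice is a tuning choice.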
2. Related Works
3. Methods
3.1. Facial Feature Extraction Module
3.2. Facial Feature Fusion Generation Module
4. Experiments
4.1. Training and Implementation Details
4.2. Competing Methods
4.3. Performance Evaluation Metrics
5. Conclusions and Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Li, F.-F. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Man, Q.; Cho, Y.I.; Jang, S.G.; Lee, H.J. Transformer-based gan for new hairstyle generative networks. Electronics 2022, 11, 2106.
- Bitouk, D.; Kumar, N.; Dhillon, S.; Belhumeur, P.; Nayar, S.K. Face swapping: Automatically replacing faces in photographs. ACM Trans. Graph. 2008, 27, 1–8.
- Cootes, T.F.; Edwards, G.J.; Taylor, C.J. Active appearance models. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 681–685.
- Blanz, V.; Scherbaum, K.; Vetter, T.; Seidel, H.P. Exchanging faces in images. In Computer Graphics Forum; Blackwell Publishing, Inc.: Oxford, UK; Boston, MA, USA, 2004; Volume 23, pp. 669–676.
- Agarwala, A.; Dontcheva, M.; Agrawala, M.; Drucker, S.; Colburn, A.; Curless, B.; Salesin, D.; Cohen, M. Interactive digital photomontage. ACM Trans. Graph. 2004, 23, 294–302.
- Phan, H.; Nguyen, A. DeepFace-EMD: Re-ranking using patch-wise earth mover’s distance improves out-of-distribution face identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 20259–20269.
- Chang, F.J.; Tuan Tran, A.; Hassner, T.; Masi, I.; Nevatia, R.; Medioni, G. Faceposenet: Making a case for landmark-free face alignment. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Venice, Italy, 22–29 October 2017; pp. 1599–1608.
- Korshunova, I.; Shi, W.; Dambre, J.; Theis, L. Fast face-swap using convolutional neural networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3677–3685.
- Nirkin, Y.; Masi, I.; Tuan, A.T.; Hassner, T.; Medioni, G. On face segmentation, face swapping, and face perception. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 15–19 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 98–105.
- Pérez, P.; Gangnet, M.; Blake, A. Poisson image editing. In ACM SIGGRAPH 2003 Papers (SIGGRAPH ’03); Association for Computing Machinery: New York, NY, USA, 2003; pp. 313–318.
- Li, A.; Hu, J.; Fu, C.; Zhang, X.; Zhou, J. Attribute-conditioned face swapping network for low-resolution images. In Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Singapore, 22–27 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 2305–2309.
- Liu, K.; Perov, I.; Gao, D.; Chervoniy, N.; Zhou, W.; Zhang, W. Deepfacelab: Integrated, flexible and extensible face-swapping framework. Pattern Recognit. 2023, 141, 109628.
- FaceSwap, “FaceSwap”. Available online: https://github.com/MarekKowalski/FaceSwap/ (accessed on 15 November 2019).
- Nirkin, Y.; Keller, Y.; Hassner, T. Fsgan: Subject agnostic face swapping and reenactment. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Long Beach, CA, USA, 15–20 June 2019; pp. 7184–7193.
- Xu, Z.; Yu, X.; Hong, Z.; Zhu, Z.; Han, J.; Liu, J.; Ding, E.; Bai, X. Facecontroller: Controllable attribute editing for face in the wild. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 3083–3091.
- Razavi, A.; Van den Oord, A.; Vinyals, O. Generating diverse high-fidelity images with vq-vae-2. Adv. Neural Inf. Process. Syst. 2019, 32, 14866–14876.
- Parkhi, O.; Vedaldi, A.; Zisserman, A. Deep face recognition. In Proceedings of the British Machine Vision Conference 2015, Swansea, UK, 7–10 September 2015; British Machine Vision Association: Glasgow, UK, 2015.
- Taigman, Y.; Yang, M.; Ranzato, M.A.; Wolf, L. Deepface: Closing the gap to human-level performance in face verification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1701–1708.
- Sun, Y.; Wang, X.; Tang, X. Deep learning face representation from predicting 10,000 classes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1891–1898.
- Deng, J.; Guo, J.; Xue, N.; Zafeiriou, S. Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4690–4699.
- Deng, J.; Guo, J.; Zhou, Y.; Yu, J.; Kotsia, I.; Zafeiriou, S. Retinaface: Single-stage dense face localisation in the wild. arXiv 2019, arXiv:1905.00641.
- Schroff, F.; Kalenichenko, D.; Philbin, J. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 815–823.
- Thies, J.; Zollhofer, M.; Stamminger, M.; Theobalt, C.; Nießner, M. Face2face: Real-time face capture and reenactment of rgb videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2387–2395.
- Blanz, V.; Romdhani, S.; Vetter, T. Face identification across different poses and illuminations with a 3d morphable model. In Proceedings of the Fifth IEEE International Conference on Automatic Face Gesture Recognition, Washington, DC, USA, 20–21 May 2002; IEEE: Piscataway, NJ, USA, 2002; pp. 202–207.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
- Mao, X.; Li, Q.; Xie, H.; Lau, R.Y.; Wang, Z.; Paul Smolley, S. Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2794–2802.
- Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232.
- Li, L.; Bao, J.; Yang, H.; Chen, D.; Wen, F. Faceshifter: Towards high fidelity and occlusion aware face swapping. arXiv 2019, arXiv:1912.13457.
- Chen, R.; Chen, X.; Ni, B.; Ge, Y. Simswap: An efficient framework for high fidelity face swapping. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 2003–2011.
- Wang, T.; Li, Z.; Liu, R.; Wang, Y.; Nie, L. An efficient attribute-preserving framework for face swapping. IEEE Trans. Multimed. 2024, 26, 6554–6565.
- Li, Q.; Wang, W.; Xu, C.; Sun, Z.; Yang, M.-H. Learning disentangled representation for one-shot progressive face swapping. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 1–17.
- Yu, F.; Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv 2015, arXiv:1511.07122.
- Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of gans for improved quality, stability, and variation. arXiv 2017, arXiv:1710.10196.
- Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4401–4410.
- Rossler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. Faceforensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Long Beach, CA, USA, 15–20 June 2019; pp. 1–11.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
Configuration | Value |
---|---|
Operating system | Windows 10 |
Development language | Python 3.11 |
Framework | PyTorch 2.1.0 + CUDA 11.8 |
CPU | AMD 5800X |
GPU | 2 × NVIDIA RTX 3090 (24 GB each) |
Memory | 48 GB |
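When reproducing a setup like the one in the configuration table, it can help to record the local environment next to the results. The snippet below is a small sketch of such a check, not part of the original work; the function name `describe_environment` is hypothetical, and the PyTorch import is guarded so the script also runs on machines without PyTorch installed.

```python
import platform


def describe_environment():
    """Collect OS, Python, and (if available) PyTorch/CUDA/GPU information."""
    info = {
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
    }
    try:
        import torch  # optional dependency; absent on some machines
    except ImportError:
        info["torch"] = None
        info["cuda"] = None
        info["gpu_count"] = 0
    else:
        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda          # None for CPU-only builds
        info["gpu_count"] = torch.cuda.device_count()
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```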
Methods | ID Retrieval | Pose |
---|---|---|
Faceswap [14] | 54.19 | 5.73 |
FSGAN [15] | 60.34 | 5.28 |
Deepface [13] | 81.96 | 4.29 |
Simswap [30] | 92.65 | 2.74 |
Ours | 97.82 | 1.55 |
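The two columns above are commonly computed from identity embeddings and head-pose estimates. The sketch below shows one plausible way to do so in plain Python, assuming ID retrieval is the percentage of swapped faces whose nearest source embedding (by cosine similarity) is the correct source, and pose is the mean Euclidean distance between pose vectors of target and swapped faces. These definitions, the function names, and the toy vectors are assumptions for illustration; the embeddings and poses would come from external recognition and pose-estimation models.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def id_retrieval_accuracy(swapped_embeddings, source_embeddings, true_source_index):
    """Percentage of swapped faces whose most similar source is the true source."""
    hits = 0
    for emb, truth in zip(swapped_embeddings, true_source_index):
        best = max(range(len(source_embeddings)),
                   key=lambda i: cosine_similarity(emb, source_embeddings[i]))
        hits += int(best == truth)
    return 100.0 * hits / len(swapped_embeddings)


def mean_pose_error(target_poses, swapped_poses):
    """Mean Euclidean distance between pose vectors, e.g. (yaw, pitch, roll)."""
    total = sum(math.dist(t, s) for t, s in zip(target_poses, swapped_poses))
    return total / len(target_poses)


# Toy data: two source identities, three swapped faces.
sources = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
print(id_retrieval_accuracy(swapped, sources, [0, 1, 1]))
print(mean_pose_error([[10.0, 0.0, 0.0]], [[12.0, 1.0, 0.0]]))
```

Under these assumed definitions, a higher ID retrieval value and a lower pose value are better.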
Methods | FID | SSIM |
---|---|---|
Faceswap | 0.57 | 0.75 |
FSGAN | 0.63 | 0.54 |
Deepface | 0.59 | 0.80 |
Simswap | 0.53 | 0.85 |
Ours | 0.46 | 0.91 |
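For readers unfamiliar with SSIM, the following sketch computes a simplified, single-window version over two flattened grayscale images in plain Python. It is an illustration only: standard SSIM averages the same expression over local sliding windows, and the evaluation code behind the table is not shown in the source. The constants follow the usual convention C1 = (0.01 L)^2 and C2 = (0.03 L)^2, with L the pixel value range; the function name `global_ssim` is hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two equally sized, flattened grayscale images.

    Uses population (divide-by-N) variance and covariance. This is a
    simplification of windowed SSIM, intended only to show the formula.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2   # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy 2x2 images, flattened row by row.
reference = [10.0, 200.0, 30.0, 180.0]
slightly_noisy = [12.0, 198.0, 33.0, 176.0]
print(global_ssim(reference, reference))
print(global_ssim(reference, slightly_noisy))
```

SSIM is 1 for identical images and decreases as structure diverges, so higher is better, whereas for FID lower is better.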
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Man, Q.; Cho, Y.-I.; Gee, S.-J.; Kim, W.-J.; Jang, K.-A. GAN-Based High-Quality Face-Swapping Composite Network. Electronics 2024, 13, 3092. https://doi.org/10.3390/electronics13153092