Spotting Deepfakes and Face Manipulations by Fusing Features from Multi-Stream CNNs Models
Abstract
1. Introduction
2. Proposed Multi-Stream Deepfake Detection Methodology
2.1. CNN and Transfer Learning
2.2. Dataset
3. Experiments
3.1. Experimental Protocol
3.2. Experimental Results
3.2.1. Scenario 1
3.2.2. Scenario 2
3.2.3. Performance under Novel Type of Face Manipulation
3.2.4. Comparison with the Existing Deepfake Detection Methods
3.2.5. Performance of the Proposed Method on Existing Datasets
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Conflicts of Interest
References
- Akhtar, Z.; Dasgupta, D. A Comparative Evaluation of Local Feature Descriptors for DeepFakes Detection. In Proceedings of the IEEE International Symposium on Technologies for Homeland Security (HST), Woburn, MA, USA, 5–6 November 2019; pp. 1–5.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 1–9.
- Yang, X.; Li, Y.; Lyu, S. Exposing deep fakes using inconsistent head poses. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 1–4.
- Korshunov, P.; Marcel, S. Deepfakes: A new threat to face recognition? Assessment and detection. arXiv 2018, arXiv:1812.08685.
- Li, Y.; Yang, X.; Sun, P.; Qi, H.; Lyu, S. Celeb-DF: A new dataset for DeepFake forensics. arXiv 2019, arXiv:1909.12962.
- Deep Fake Detection Dataset. Available online: https://ai.googleblog.com/2019/09/contributing-data-to-deepfake-detection.html (accessed on 25 June 2021).
- Dolhansky, B.; Howes, R.; Pflaum, B.; Baram, N.; Ferrer, C. The Deepfake Detection Challenge (DFDC) Preview Dataset. arXiv 2019, arXiv:1910.08854.
- Afchar, D.; Nozick, V.; Yamagishi, J.; Echizen, I. MesoNet: A compact facial video forgery detection network. In Proceedings of the IEEE International Workshop on Information Forensics and Security (WIFS), Hong Kong, China, 11–13 December 2018; pp. 1–7.
- Li, Y.; Lyu, S. Exposing DeepFake videos by detecting face warping artifacts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 1–7.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1–12.
- Zhou, P.; Han, X.; Morariu, V.I.; Davis, L.S. Two-stream neural networks for tampered face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1–9.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–12.
- McCloskey, S.; Albright, M. Detecting GAN-generated imagery using color cues. arXiv 2018, arXiv:1812.08247.
- Guan, H.; Kozak, M.; Robertson, E.; Lee, Y.; Yates, A.; Delgado, A.; Zhou, D.; Kheyrkhah, T.; Smith, J.; Fiscus, J. MFC datasets: Large-scale benchmark datasets for media forensic challenge evaluation. In Proceedings of the IEEE Winter Applications of Computer Vision Workshops, Waikoloa Village, HI, USA, 7–11 January 2019; pp. 63–72.
- Nataraj, L.; Mohammed, T.; Manjunath, B.; Chandrasekaran, S.; Flenner, A.; Bappy, J.; Roy-Chowdhury, A. Detecting GAN generated fake images using co-occurrence matrices. arXiv 2019, arXiv:1903.06836.
- Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv 2017, arXiv:1703.10593.
- Choi, Y.; Choi, M.; Kim, M.; Ha, J.; Kim, S.; Choo, J. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8789–8797.
- Neves, J.; Tolosana, R.; Vera-Rodriguez, R.; Lopes, V.; Proenca, H. Real or fake? Spoofing state-of-the-art face synthesis detection systems. arXiv 2019, arXiv:1911.05351.
- Nguyen, H.H.; Yamagishi, J.; Echizen, I. Use of a capsule network to detect fake images and videos. arXiv 2019, arXiv:1910.12467.
- Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. In Proceedings of the Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; pp. 1–11.
- Rossler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. FaceForensics++: Learning to detect manipulated facial images. arXiv 2019, arXiv:1901.08971.
- Matern, F.; Riess, C.; Stamminger, M. Exploiting visual artifacts to expose DeepFakes and face manipulations. In Proceedings of the IEEE Winter Applications of Computer Vision Workshops (WACVW), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 83–92.
- Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision Workshop (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1–11.
- Guarnera, L.; Giudice, O.; Battiato, S. DeepFake Detection by Analyzing Convolutional Traces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 1–10.
- Moon, T.K. The Expectation-Maximization Algorithm. IEEE Signal Process. Mag. 1996, 13, 47–60.
- He, Z.; Zuo, W.; Kan, M.; Shan, S.; Chen, X. AttGAN: Facial Attribute Editing by Only Changing What You Want. IEEE Trans. Image Process. 2019, 28, 5464–5478.
- Cho, W.; Choi, S.; Park, D.K.; Shin, I.; Choo, J. Image-to-Image Translation via Group-Wise Deep Whitening-and-Coloring Transformation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1–15.
- Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and Improving the Image Quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 1–21.
- Hernandez-Ortega, J.; Tolosana, R.; Fierrez, J.; Morales, A. DeepFakesON-Phys: DeepFakes detection based on heart rate estimation. In Proceedings of the 35th AAAI Conference on Artificial Intelligence Workshops, Online, 17 April 2021; pp. 1–8.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015; pp. 1–14.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
- Deniz, E.; Şengür, A.; Kadiroğlu, Z.; Guo, Y.; Bajaj, V.; Budak, U. Transfer learning based histopathologic image classification for breast cancer detection. Health Inf. Sci. Syst. 2018, 6, 1–7.
- Orenstein, E.C.; Beijbom, O. Transfer Learning and Deep Feature Extraction for Planktonic Image Data Sets. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 1082–1088.
- Viola, P.A.; Jones, M.J. Rapid Object Detection using a Boosted Cascade of Simple Features. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 8–14 December 2001; pp. 1–8.
- Akhtar, Z.; Mouree, M.R.; Dasgupta, D. Utility of Deep Learning Features for Facial Attributes Manipulation Detection. In Proceedings of the IEEE International Conference on Humanized Computing and Communication with Artificial Intelligence (HCCAI), Irvine, CA, USA, 21–23 September 2020; pp. 55–60.
- Yu, N.; Davis, L.; Fritz, M. Attributing Fake Images to GANs: Analyzing Fingerprints in Generated Images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 1–11.
- Miyato, T.; Kataoka, T.; Koyama, M.; Yoshida, Y. Spectral Normalization for Generative Adversarial Networks. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–26.
- Bellemare, M.; Danihelka, I.; Dabney, W.; Mohamed, S.; Lakshminarayanan, B.; Hoyer, S.; Munos, R. The Cramer Distance as a Solution to Biased Wasserstein Gradients. arXiv 2017, arXiv:1705.10743.
- Binkowski, M.; Sutherland, D.; Arbel, M.; Gretton, A. Demystifying MMD GANs. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–36.
- Ganaie, M.A.; Hu, M. Ensemble deep learning: A review. arXiv 2021, arXiv:2104.02395.
- Ciftci, U.A.; Demir, I.; Yin, L. FakeCatcher: Detection of Synthetic Portrait Videos Using Biological Signals. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 1–17.
- Rossler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces. arXiv 2018, arXiv:1803.09179.
- Jafar, M.T.; Ababneh, M.; Al-Zoube, M.; Elhassan, A. Forensics and Analysis of Deepfake Videos. In Proceedings of the 11th IEEE International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 7–9 April 2020; pp. 53–58.
Methods | Techniques | Dataset | Year |
---|---|---|---|
Afchar et al. [8] | Designed CNN | Private dataset | 2018 |
Li et al. [9] | Face Warping Features | UADFV and DeepfakeTIMIT | 2019 |
Zhou et al. [11] | GoogLeNet model | Private dataset | 2017 |
McCloskey and Albright [13] | GAN-Pipeline Features | NIST MFC2018 | 2018 |
Nataraj et al. [15] | Steganalysis Features | 100K-Faces | 2019 |
Nguyen et al. [19] | Capsule Network | FaceForensics++ | 2019 |
Matern et al. [22] | CNN and Logistic Regression Model | Private dataset | 2019 |
Guarnera et al. [24] | GAN-Pipeline Features | Private dataset | 2020 |
Hernandez-Ortega et al. [29] | Convolutional Attention Network (CAN) | Celeb-DF and DFDC | 2021 |

 | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Scenario 1 | 99.71% | 99.67% | 99.75% |

 | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Scenario 2 | 99.82% | 99.83% | 99.97% |
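
For reference, accuracy, sensitivity, and specificity in these tables follow the standard binary confusion-matrix definitions, treating manipulated faces as the positive class. The snippet below is a minimal illustrative sketch; the counts are hypothetical placeholders and are not values reported in the paper.

```python
# Minimal sketch: deriving accuracy, sensitivity, and specificity from a
# binary confusion matrix. The counts below are hypothetical placeholders,
# not results from the paper.
tp, fn = 995, 5   # manipulated (positive) samples: correctly / wrongly classified
tn, fp = 998, 2   # genuine (negative) samples: correctly / wrongly classified

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true positive rate (recall on manipulated faces)
specificity = tn / (tn + fp)   # true negative rate (recall on genuine faces)

print(f"Accuracy:    {accuracy:.2%}")
print(f"Sensitivity: {sensitivity:.2%}")
print(f"Specificity: {specificity:.2%}")
```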

Trained on Manipulation Type | Tested on Manipulation Type | Performance (%) |
---|---|---|
Age | Beard | 88.20 |
 | Face Swap | 86.25 |
 | Glasses | 89.60 |
 | Hair Color | 76.21 |
 | Hairstyle | 75.23 |
 | Makeup | 86.30 |
 | Smiling | 92.10 |
 | Average | 84.84 |
Face Swap | Age | 95.26 |
 | Beard | 97.81 |
 | Glasses | 94.30 |
 | Hair Color | 75.46 |
 | Hairstyle | 77.28 |
 | Makeup | 69.80 |
 | Smiling | 95.40 |
 | Average | 86.47 |
Glasses | Age | 86.30 |
 | Beard | 82.14 |
 | Face Swap | 80.63 |
 | Hair Color | 75.26 |
 | Hairstyle | 72.44 |
 | Makeup | 71.50 |
 | Smiling | 70.78 |
 | Average | 77.01 |
Hair Color | Age | 84.65 |
 | Beard | 66.36 |
 | Face Swap | 69.82 |
 | Glasses | 75.35 |
 | Hairstyle | 88.37 |
 | Makeup | 67.20 |
 | Smiling | 63.27 |
 | Average | 73.57 |
Hairstyle | Age | 82.36 |
 | Beard | 73.26 |
 | Face Swap | 78.20 |
 | Glasses | 75.60 |
 | Hair Color | 83.26 |
 | Makeup | 65.46 |
 | Smiling | 69.45 |
 | Average | 75.37 |
Smiling | Age | 77.85 |
 | Beard | 79.80 |
 | Face Swap | 82.99 |
 | Glasses | 73.21 |
 | Hair Color | 68.47 |
 | Hairstyle | 75.80 |
 | Makeup | 81.70 |
 | Average | 77.12 |
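
Each "Average" row in the table above is the arithmetic mean of the seven cross-manipulation accuracies for that training type. A quick check against the Age-trained row reproduces the reported value:

```python
# Verify the "Average" entry for the Age-trained model using the seven
# cross-manipulation scores listed in the table.
age_scores = [88.20, 86.25, 89.60, 76.21, 75.23, 86.30, 92.10]
average = sum(age_scores) / len(age_scores)
print(f"{average:.2f}")  # prints 84.84, matching the table
```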
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).