HE-CycleGAN: A Symmetric Network Based on High-Frequency Features and Edge Constraints Used to Convert Facial Sketches to Images
Abstract
1. Introduction
- (1) We propose HE-CycleGAN, a network for converting facial sketch images to facial images.
- (2) We add a high-frequency feature extractor (HFFE) to the generator of HE-CycleGAN, which alleviates the loss of detail that the traditional CycleGAN incurs while satisfying the cycle-consistency constraint.
- (3) We design a multi-scale wavelet edge discriminator (MSWED) that addresses the problem of edge overflow in generated faces.
- (4) Finally, we validate the effectiveness of the proposed HE-CycleGAN both quantitatively and qualitatively.
2. Related Work
3. The Proposed Method
3.1. Generator Network Structure
Algorithm 1. Extract features using the Haar wavelet transform

Input: F // the input feature map
Output: W // the four feature components
K[ ] // the four filtering kernels, defined as a list
Initialize the four feature components W[ ]
Step ← 2 // stride used during filtering
for each position (i, j) in the output feature W do
    for each kernel K[k] do
        for each element (m, n) of K[k] do
            S ← F[i × Step + m, j × Step + n] × K[k][m, n]
            Add S to W[k][i, j]
        end
    end
end
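The strided Haar filtering of Algorithm 1 can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: the function name `haar_wavelet_features` and the 1/2-scaled 2 × 2 kernels are our assumptions, following the standard Haar decomposition into four half-resolution components.

```python
import numpy as np

def haar_wavelet_features(F):
    """Decompose a 2D feature map F into four half-resolution
    components via 2x2 Haar kernels applied with stride 2."""
    # The four standard 2x2 Haar filtering kernels (low-low, low-high,
    # high-low, high-high), scaled by 1/2 for orthonormality.
    kernels = {
        "LL": 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]]),
        "LH": 0.5 * np.array([[-1.0, -1.0], [1.0, 1.0]]),
        "HL": 0.5 * np.array([[-1.0, 1.0], [-1.0, 1.0]]),
        "HH": 0.5 * np.array([[1.0, -1.0], [-1.0, 1.0]]),
    }
    out_h, out_w = F.shape[0] // 2, F.shape[1] // 2
    W = {name: np.zeros((out_h, out_w)) for name in kernels}
    for name, k in kernels.items():          # one pass per component
        for i in range(out_h):               # traverse output rows
            for j in range(out_w):           # traverse output columns
                patch = F[2 * i:2 * i + 2, 2 * j:2 * j + 2]  # stride-2 window
                W[name][i, j] = np.sum(patch * k)            # filter response
    return W

# Demo on a small 4x4 feature map.
feats = haar_wavelet_features(np.arange(16, dtype=float).reshape(4, 4))
```

The LL component carries the low-frequency content, while LH, HL, and HH carry the horizontal, vertical, and diagonal high-frequency detail that the HFFE is designed to preserve.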
3.2. Discriminator Network Structure
3.3. The Loss Function
3.3.1. Adversarial Loss
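For context, HE-CycleGAN builds on the CycleGAN framework [1], whose standard adversarial loss for the mapping $G: X \to Y$ with discriminator $D_Y$ is given below; the paper may add its own modifications to this baseline.

```latex
\mathcal{L}_{GAN}(G, D_Y, X, Y) =
  \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\log D_Y(y)\big]
+ \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log\big(1 - D_Y(G(x))\big)\big]
```

A symmetric term is defined for the reverse mapping $F: Y \to X$ with discriminator $D_X$.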
3.3.2. Multi-Scale Wavelet Edge Discrimination Adversarial Loss
3.3.3. Cycle-Consistency Loss
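The cycle-consistency constraint referenced in the contributions is the standard CycleGAN term [1], which requires each translated image to map back to its source under the reverse generator:

```latex
\mathcal{L}_{cyc}(G, F) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
+ \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]
```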
3.3.4. Color-Identity Loss
3.3.5. HE-CycleGAN Objective Function
4. Experiments
4.1. Datasets
4.2. Experimental Procedure
4.3. Result Analysis
4.4. Ablation Studies
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhu, J.Y.; Park, T.; Isola, P. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
- Babu, K.K.; Dubey, S.R. CSGAN: Cyclic-synthesized generative adversarial networks for image-to-image transformation. Expert Syst. Appl. 2021, 169, 114431. [Google Scholar] [CrossRef]
- Babu, K.K.; Dubey, S.R. CDGAN: Cyclic discriminative generative adversarial networks for image-to-image transformation. J. Vis. Commun. Image Represent. 2022, 82, 103382. [Google Scholar] [CrossRef]
- Wang, G.; Shi, H.; Chen, Y.; Wu, B. Unsupervised image-to-image translation via long-short cycle-consistent adversarial networks. Appl. Intell. 2023, 53, 17243–17259. [Google Scholar] [CrossRef]
- Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134. [Google Scholar]
- Senapati, R.K.; Satvika, R.; Anmandla, A.; Ashesh Reddy, G.; Anil Kumar, C. Image-to-image translation using Pix2Pix GAN and cycle GAN. In International Conference on Data Intelligence and Cognitive Informatics; Springer Nature Singapore: Singapore, 2023; pp. 573–586. [Google Scholar]
- Zhang, Y.; Yu, L.; Sun, B.; He, J. ENG-Face: Cross-domain heterogeneous face synthesis with enhanced asymmetric CycleGAN. Appl. Intell. 2022, 52, 15295–15307. [Google Scholar] [CrossRef]
- Chu, C.; Zhmoginov, A.; Sandler, M. CycleGAN, a master of steganography. arXiv 2017, arXiv:1712.02950. [Google Scholar]
- Porav, H.; Musat, V.; Newman, P. Reducing Steganography In Cycle-consistency GANs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 15–20 June 2019; pp. 78–82. [Google Scholar]
- Gao, Y.; Wei, F.; Bao, J.; Gu, S.; Chen, D.; Wen, F.; Lian, Z. High-fidelity and arbitrary face editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 16115–16124. [Google Scholar]
- Lin, C.T.; Kew, J.L.; Chan, C.S.; Lai, S.H.; Zach, C. Cycle-object consistency for image-to-image domain adaptation. Pattern Recognit. 2023, 138, 109416. [Google Scholar] [CrossRef]
- Wang, X.; Tang, X. Face photo-sketch synthesis and recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 31, 1955–1967. [Google Scholar] [CrossRef] [PubMed]
- Xiao, B.; Gao, X.; Tao, D.; Li, X. A new approach for face recognition by sketches in photos. Signal Process. 2009, 89, 1576–1588. [Google Scholar] [CrossRef]
- Bono, F.M.; Radicioni, L.; Cinquemani, S.; Conese, C.; Tarabini, M. Development of soft sensors based on neural networks for detection of anomaly working condition in automated machinery. In Proceedings of the NDE 4.0, Predictive Maintenance, and Communication and Energy Systems in a Globally Networked World, Long Beach, CA, USA, 4–10 April 2022; pp. 56–70. [Google Scholar]
- Zhang, L.; Lin, L.; Wu, X.; Ding, S.; Zhang, L. End-to-end photo-sketch generation via fully convolutional representation learning. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, Shanghai, China, 23–26 June 2015. [Google Scholar]
- Zhou, G.; Fan, Y.; Shi, J.; Lu, Y.; Shen, J. Conditional generative adversarial networks for domain transfer: A survey. Appl. Sci. 2022, 12, 8350. [Google Scholar] [CrossRef]
- Porkodi, S.P.; Sarada, V.; Maik, V.; Gurushankar, K. Generic image application using gans (generative adversarial networks): A review. Evol. Syst. 2023, 14, 903–917. [Google Scholar] [CrossRef]
- Li, Y.; Chen, X.; Wu, F.; Zha, Z.J. Linestofacephoto: Face photo generation from lines with conditional self-attention generative adversarial networks. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2323–2331. [Google Scholar]
- Chen, S.Y.; Su, W.; Gao, L.; Xia, S.; Fu, H. Deep generation of face images from sketches. arXiv 2020, arXiv:2006.01047. [Google Scholar]
- Li, L.; Tang, J.; Shao, Z.; Tan, X.; Ma, L. Sketch-to-photo face generation based on semantic consistency preserving and similar connected component refinement. Vis. Comput. 2022, 38, 3577–3594. [Google Scholar] [CrossRef]
- Sun, J.; Yu, H.; Zhang, J.J.; Dong, J.; Yu, H.; Zhong, G. Face image-sketch synthesis via generative adversarial fusion. Neural Netw. 2022, 154, 179–189. [Google Scholar] [CrossRef] [PubMed]
- Shao, X.; Qiang, Z.; Dai, F.; He, L.; Lin, H. Face Image Completion Based on GAN Prior. Electronics 2022, 11, 1997. [Google Scholar] [CrossRef]
- Ren, G.; Geng, W.; Guan, P.; Cao, Z.; Yu, J. Pixel-wise grasp detection via twin deconvolution and multi-dimensional attention. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 4002–4010. [Google Scholar] [CrossRef]
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
- Gao, G.; Lai, H.; Jia, Z. Unsupervised image dedusting via a cycle-consistent generative adversarial network. Remote Sens. 2023, 15, 1311. [Google Scholar] [CrossRef]
- Zhang, W.; Wang, X.; Tang, X. Coupled information-theoretic encoding for face photo-sketch recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 20–25 June 2011. [Google Scholar]
- Koch, B.; Grbić, R. One-shot lip-based biometric authentication: Extending behavioral features with authentication phrase information. Image Vis. Comput. 2024, 142, 104900. [Google Scholar] [CrossRef]
- Liu, F.; Chen, D.; Wang, F.; Li, Z.; Xu, F. Deep learning based single sample face recognition: A survey. Artif. Intell. Rev. 2023, 56, 2723–2748. [Google Scholar] [CrossRef]
- Rajeswari, G.; Ithaya Rani, P. Face occlusion removal for face recognition using the related face by structural similarity index measure and principal component analysis. J. Intell. Fuzzy Syst. 2022, 42, 5335–5350. [Google Scholar] [CrossRef]
- Ko, K.; Yeom, T.; Lee, M. Superstargan: Generative adversarial networks for image-to-image translation in large-scale domains. Neural Netw. 2023, 162, 330–339. [Google Scholar] [CrossRef]
- Kynkäänniemi, T.; Karras, T.; Aittala, M.; Aila, T.; Lehtinen, J. The role of ImageNet classes in Fréchet inception distance. arXiv 2022, arXiv:2203.06026. [Google Scholar]
- Song, Z.; Zhang, Z.; Fang, F.; Fan, Z.; Lu, J. Deep semantic-aware remote sensing image deblurring. Signal Process. 2023, 211, 109108. [Google Scholar] [CrossRef]
- Jayasumana, S.; Ramalingam, S.; Veit, A.; Glasner, D.; Chakrabarti, A.; Kumar, S. Rethinking FID: Towards a better evaluation metric for image generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023. [Google Scholar]
Datasets | Number of Sample Pairs | Size | Train/Test |
---|---|---|---|
CUHK_student | 188 | 250 × 200 | 100/88 |
XM2VTS | 295 | 250 × 200 | 195/100 |
AR | 123 | 250 × 200 | 80/43 |
Metric | Pix2Pix [5] | CycleGAN [1] | CSGAN [2] | CDGAN [3] | LSCIT [4] | Ours
---|---|---|---|---|---|---|
SSIM ⬆ | 0.6866 | 0.6938 | 0.6837 | 0.6916 | 0.6944 | 0.7118 |
LPIPS ⬇ | 0.2756 | 0.2300 | 0.2524 | 0.2270 | 0.2281 | 0.2017 |
FID ⬇ | 127.6600 | 65.5658 | 87.0142 | 60.4071 | 67.2641 | 51.4870 |
Metric | Pix2Pix [5] | CycleGAN [1] | CSGAN [2] | CDGAN [3] | LSCIT [4] | Ours
---|---|---|---|---|---|---|
SSIM ⬆ | 0.5834 | 0.5940 | 0.5984 | 0.5967 | 0.6057 | 0.6109 |
LPIPS ⬇ | 0.2481 | 0.2371 | 0.2426 | 0.2453 | 0.2452 | 0.2207 |
FID ⬇ | 66.5135 | 47.1245 | 58.0198 | 47.1513 | 50.8334 | 41.2961 |
Metric | Pix2Pix [5] | CycleGAN [1] | CSGAN [2] | CDGAN [3] | LSCIT [4] | Ours
---|---|---|---|---|---|---|
SSIM ⬆ | 0.6836 | 0.6801 | 0.6930 | 0.6830 | 0.6816 | 0.7048 |
LPIPS ⬇ | 0.2423 | 0.2596 | 0.2276 | 0.2585 | 0.2529 | 0.2128 |
FID ⬇ | 99.0394 | 92.0436 | 77.0084 | 77.5533 | 74.7337 | 51.8288 |
Method | SSIM ⬆ | LPIPS ⬇ | FID ⬇ |
---|---|---|---|
CycleGAN [1] | 0.6938 | 0.2300 | 65.5658 |
+HFFE | 0.7034 | 0.2080 | 54.0132 |
+MSWED | 0.7085 | 0.2107 | 56.7533 |
HE-CycleGAN | 0.7118 | 0.2017 | 51.4870 |
Method | SSIM ⬆ | LPIPS ⬇ | FID ⬇ |
---|---|---|---|
CycleGAN [1] | 0.5940 | 0.2371 | 47.1245 |
+HFFE | 0.6004 | 0.2288 | 43.3593 |
+MSWED | 0.6031 | 0.2250 | 41.9002 |
HE-CycleGAN | 0.6109 | 0.2207 | 41.2961 |
Method | SSIM ⬆ | LPIPS ⬇ | FID ⬇ |
---|---|---|---|
CycleGAN [1] | 0.6801 | 0.2596 | 92.0436 |
+HFFE | 0.7029 | 0.2169 | 52.6087 |
+MSWED | 0.6978 | 0.2240 | 64.8579 |
HE-CycleGAN | 0.7048 | 0.2128 | 51.8288 |
Method | SSIM ⬆ | LPIPS ⬇ | FID ⬇ |
---|---|---|---|
+HFFE | 0.7034 | 0.2080 | 54.0132 |
HFFE − ECANet [24] | 0.6969 | 0.2115 | 56.8686 |
Method | SSIM ⬆ | LPIPS ⬇ | FID ⬇ |
---|---|---|---|
+HFFE | 0.6004 | 0.2288 | 43.3593 |
HFFE − ECANet [24] | 0.5978 | 0.2340 | 44.9255 |
Method | SSIM ⬆ | LPIPS ⬇ | FID ⬇ |
---|---|---|---|
+HFFE | 0.7029 | 0.2169 | 52.6087 |
HFFE − ECANet [24] | 0.6971 | 0.2203 | 56.2123 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, B.; Du, R.; Li, J.; Tang, Y. HE-CycleGAN: A Symmetric Network Based on High-Frequency Features and Edge Constraints Used to Convert Facial Sketches to Images. Symmetry 2024, 16, 1015. https://doi.org/10.3390/sym16081015