An End-to-End Generation Model for Chinese Calligraphy Characters Based on Dense Blocks and Capsule Network
Abstract
1. Introduction
- (1) A self-attention mechanism and a densely connected module are employed in the generator to reduce redundant and missing strokes (a code sketch of these components follows this list).
- (2) To reduce twisted and deformed strokes, a capsule network and a fully connected network are employed in the design of the discriminator.
- (3) Additionally, a perceptual loss is introduced to enhance the stylistic similarity between the generated calligraphy and authentic works.
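To make these contributions concrete, below is a minimal PyTorch sketch of the three pieces named above: a dense block, a SAGAN-style self-attention layer, and a VGG16-based perceptual loss in the spirit of Johnson et al. This is an illustrative reconstruction, not the authors' released code; all sizes (`growth_rate`, `num_layers`, the VGG cut-off index) are assumptions, and the perceptual loss expects 3-channel inputs, so grayscale character images would need to be repeated across channels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class DenseBlock(nn.Module):
    """DenseNet-style block: each layer receives all preceding feature
    maps, so low-level stroke details survive deep into the generator."""

    def __init__(self, in_channels: int, growth_rate: int = 32, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                          kernel_size=3, padding=1, bias=False),
            )
            for i in range(num_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))  # dense connectivity
        return torch.cat(features, dim=1)


class SelfAttention(nn.Module):
    """SAGAN-style self-attention: every spatial position attends to all
    others, keeping long-range stroke structure consistent."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual blend

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # B x N x C'
        k = self.key(x).view(b, -1, h * w)                      # B x C' x N
        attn = F.softmax(torch.bmm(q, k), dim=-1)               # B x N x N
        v = self.value(x).view(b, -1, h * w)                    # B x C x N
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x


class PerceptualLoss(nn.Module):
    """Feature-space L2 distance on frozen VGG16 activations
    (Johnson et al.), pulling generated characters toward the
    reference style. Assumes 3-channel inputs."""

    def __init__(self, cutoff: int = 16):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.features = vgg.features[:cutoff].eval()
        for p in self.features.parameters():
            p.requires_grad = False

    def forward(self, generated: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        return F.mse_loss(self.features(generated), self.features(target))
```

As a sanity check on the dense connectivity, a feature map entering `DenseBlock(64)` leaves with 64 + 4 × 32 = 192 channels; DenseNet-style generators typically compress this again with a 1 × 1 transition convolution.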
2. Related Work
2.1. Image-to-Image Translation
2.2. Calligraphy Generation
3. Method
3.1. Network Architecture
3.2. Generator
3.2.1. Generator Structure
3.2.2. Dense Blocks
3.2.3. Self-Attention Mechanism
3.3. Discriminator
3.3.1. Discriminator Structure
3.3.2. CapsNet
3.4. Loss Function
3.4.1. Adversarial Loss
3.4.2. Cycle Consistency Loss
3.4.3. Perceptual Loss
3.5. Discussion of Proposed Method
4. Experiment
4.1. Dataset
4.2. Training Process
4.3. Comparative Experiments
4.3.1. Qualitative Comparison
- (1) Yan Zhenqing's regular script
- (2) Deng Shiru's clerical script
- (3) Wang Xizhi's running script
4.3.2. Quantitative Comparison
4.4. Ablation Study
4.5. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Sample Category | Source | Quantity |
|---|---|---|
| Regular Script | Ouyang Xun's "Inscription on the Sweet Spring in the Jiucheng Palace" | 1107 |
| Regular Script | Yan Zhenqing's "Duobao Pagoda Stele" | 479 |
| Regular Script | Liu Gongquan's "Xuanmi Pagoda Stele" | 1315 |
| Regular Script | Zhao Mengfu's "Danba Stele" | 902 |
| Clerical Script | Deng Shiru's "Shaoxue Qinshu Clerical Script Album" | 245 |
| Running Script | Wang Xizhi's "Orchid Pavilion Preface" | 324 |
| Total | | 4372 |
| Methods | Strokes | Structure | Style |
|---|---|---|---|
| pix2pix | broken | incomplete | dissimilar |
| AGGAN | broken | incomplete | dissimilar |
| CycleGAN | distorted | deformed | dissimilar |
| StarGAN-v2 | redundant | deformed | dissimilar |
| Ours | clear | complete | similar |
| Methods | SSIM (↑) | MSE (↓) | PSNR (↑) |
|---|---|---|---|
| pix2pix | 0.622 | 29.761 | 10.240 |
| AGGAN | 0.630 | 29.459 | 10.481 |
| CycleGAN | 0.635 | 29.216 | 10.483 |
| StarGAN-v2 | 0.742 | 29.120 | 10.629 |
| Ours | 0.758 | 28.680 | 10.853 |
| Methods | SSIM (↑) | MSE (↓) | PSNR (↑) |
|---|---|---|---|
| pix2pix | 0.549 | 31.823 | 9.259 |
| AGGAN | 0.558 | 31.687 | 9.684 |
| CycleGAN | 0.563 | 30.837 | 9.741 |
| StarGAN-v2 | 0.571 | 30.587 | 9.839 |
| Ours | 0.613 | 28.484 | 10.753 |
| Methods | SSIM (↑) | MSE (↓) | PSNR (↑) |
|---|---|---|---|
| pix2pix | 0.511 | 30.685 | 9.203 |
| AGGAN | 0.527 | 30.497 | 9.429 |
| CycleGAN | 0.532 | 29.910 | 9.482 |
| StarGAN-v2 | 0.537 | 29.791 | 9.562 |
| Ours | 0.594 | 27.843 | 10.655 |
| Methods | SSIM (↑) | MSE (↓) | PSNR (↑) |
|---|---|---|---|
| Ours (full) | 0.594 | 27.843 | 10.655 |
| Ours w/o perceptual loss | 0.532 | 31.292 | 9.488 |
| Ours w/o CapsNet | 0.526 | 31.717 | 9.711 |
| Ours w/o dense blocks | 0.518 | 32.727 | 9.578 |
| Ours w/o self-attention | 0.510 | 31.012 | 9.763 |
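For reference, the three metrics reported in the tables above can be computed per image pair as in the sketch below, using scikit-image and NumPy. This is a generic implementation, not the paper's evaluation script; the scale of the reported MSE values suggests a normalization convention not stated here, so absolute numbers from this sketch may sit on a different scale even when the rankings agree.

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio


def evaluate_pair(generated: np.ndarray, reference: np.ndarray):
    """Compute SSIM (higher is better), MSE (lower is better), and PSNR
    in dB (higher is better) for one generated/ground-truth pair of
    uint8 grayscale character images with identical shapes."""
    ssim = structural_similarity(reference, generated, data_range=255)
    diff = reference.astype(np.float64) - generated.astype(np.float64)
    mse = float(np.mean(diff ** 2))
    psnr = peak_signal_noise_ratio(reference, generated, data_range=255)
    return ssim, mse, psnr
```

Per-script scores like those in the tables would then be obtained by averaging these per-pair values over the corresponding test set.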