TOLGAN: An End-To-End Framework for Producing Traditional Orient Landscape
Abstract
1. Introduction
2. Related Work
2.1. Deep Learning-Based General Approach
2.2. Deep Learning-Based TOL Generation Approach
3. TOL Generation Framework
3.1. Overview
3.2. Model Architecture
3.2.1. Generator
3.2.2. Discriminator
3.2.3. Loss Function
4. Training
4.1. Building TOL Image Dataset
4.2. Training Our Model
5. Implementation and Results
6. Analysis
6.1. Comparison
6.2. Quantitative Evaluation
6.3. Qualitative Evaluation
6.3.1. User Study
- Question 1 (Quality of the generated image): Among the five result images, select the one that most resembles the TOL style.
- Question 2 (Preservation of the input photograph): Among the five result images, select the one that most resembles the input photograph.
6.3.2. Focus Group Evaluation
6.4. Ablation Study
6.5. Diversity
6.6. Limitation
7. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Strassmann, S. Hairy Brushes. ACM Comput. Graph. 1986, 20, 225–232. [Google Scholar] [CrossRef]
- Guo, Q.; Kunii, T. Modeling the Diffuse Paintings of ‘Sumie’. Proc. Model. Comput. Graph. 1991, 1991, 329–338. [Google Scholar]
- Lee, J. Simulating Oriental Black-ink Painting. IEEE Comput. Graph. Appl. 1999, 19, 74–81. [Google Scholar] [CrossRef]
- Lee, J. Diffusion Rendering of Black Ink Paintings using new Paper and Ink Models. Comput. Graph. 2001, 25, 295–308. [Google Scholar] [CrossRef]
- Way, D.; Lin, Y.; Shih, Z. The Synthesis of Trees in Chinese Landscape Painting Using Silhouette and Texture Strokes. J. WSCG 2002, 10, 499–506. [Google Scholar]
- Huang, S.; Way, D.; Shih, Z. Physical-based Model of Ink Diffusion in Chinese Ink Paintings. Proc. WSCG 2003, 2003, 33–40. [Google Scholar]
- Yu, J.; Luo, G.; Peng, Q. Image-based Synthesis of Chinese Landscape Painting. J. Comput. Sci. Technol. 2003, 18, 22–28. [Google Scholar] [CrossRef]
- Xu, S.; Xu, Y.; Kang, S.; Salesin, D.; Pan, Y.; Shum, H. Animating Chinese Paintings through Stroke-based Decomposition. ACM Trans. Graph. 2006, 25, 239–267. [Google Scholar] [CrossRef]
- Zhang, S.; Chen, T.; Zhang, Y.; Hu, S.; Martin, R. Video-based Running Water Animation in Chinese Painting Style. Sci. China Ser. F Inf. Sci. 2009, 52, 162–171. [Google Scholar] [CrossRef]
- Shi, W. Shan Shui in the World: A Generative Approach to Traditional Chinese Landscape Painting. In Proceedings of the IEEE VIS 2016 Arts Program, Baltimore, MD, USA, 23–28 October 2016; pp. 41–47. [Google Scholar]
- Gatys, L.; Ecker, A.; Bethge, M. Image Style Transfer Using Convolutional Neural Networks. In Proceedings of the CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 2414–2423. [Google Scholar]
- Ulyanov, D.; Lebedev, V.; Vedaldi, A.; Lempitsky, V. Texture Networks: Feed-forward Synthesis of Textures and Stylized Images. In Proceedings of the ICML 2016, New York, NY, USA, 19–24 June 2016; pp. 1349–1357. [Google Scholar]
- Huang, X.; Belongie, S. Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization. In Proceedings of the ICCV 2017, Venice, Italy, 22–29 October 2017; pp. 1501–1510. [Google Scholar]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. In Proceedings of the NIPS 2014, Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680. [Google Scholar]
- Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784. [Google Scholar]
- Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A. Image-to-image Translation with Conditional Adversarial Networks. In Proceedings of the CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134. [Google Scholar]
- Zhu, J.; Park, T.; Isola, P.; Efros, A. Unpaired Image-to-image Translation using Cycle-Consistent Adversarial Networks. In Proceedings of the ICCV 2017, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
- Park, T.; Liu, M.; Wang, T.; Zhu, J. Semantic Image Synthesis with Spatially-Adaptive Normalization. In Proceedings of the CVPR 2019, Long Beach, CA, USA, 15–20 June 2019; pp. 2337–2346. [Google Scholar]
- Huang, X.; Liu, M.-Y.; Belongie, S.; Kautz, J. Multimodal Unsupervised Image-to-Image Translation. In Proceedings of the ECCV 2018, Munich, Germany, 8–14 September 2018; pp. 172–189. [Google Scholar]
- Li, B.; Xiong, C.; Wu, T.; Zhou, Y.; Zhang, L.; Chu, R. Neural Abstract Style Transfer for Chinese Traditional Painting. In Proceedings of the ACCV 2018, Perth, Australia, 2–6 December 2018; pp. 212–227. [Google Scholar]
- Lin, D.; Wang, Y.; Xu, G.; Li, J.; Fu, K. Transform a Simple Sketch to a Chinese Painting by a Multiscale Deep Neural Network. Algorithms 2018, 11, 4. [Google Scholar] [CrossRef]
- He, B.; Gao, F.; Ma, D.; Shi, B.; Duan, L. Chipgan: A Generative Adversarial Network for Chinese Ink Wash Painting Style Transfer. In Proceedings of the ACM Multimedia 2018, Seoul, Republic of Korea, 22–26 October 2018; pp. 1172–1180. [Google Scholar]
- Zhou, L.; Wang, Q.-F.; Huang, K.; Lo, C.-H.Q. ShanshuiDaDA: An Interactive and Generative Approach to Chinese Shanshui Painting Document. In Proceedings of the International Conference on Document Analysis and Recognition 2019, Sydney, Australia, 20–25 September 2019; pp. 819–824. [Google Scholar]
- Xue, A. End-to-End Chinese Landscape Painting Creation Using Generative Adversarial Networks. In Proceedings of the WACV 2021, Online, 5–9 January 2021; pp. 3863–3871. [Google Scholar]
- Hung, M.; Trang, M.; Nakatsu, R.; Tosa, N. Unusual Transformation: A Deep Learning Approach to Create Art. In Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Springer: Cham, Switzerland, 2022; Volume 422, pp. 309–320. [Google Scholar]
- Chung, C.; Huang, H. Interactively Transforming Chinese Ink Paintings into Realistic Images Using a Border Enhance Generative Adversarial Network. Multimed. Tools Appl. 2023, 82, 11663–11696. [Google Scholar]
- Wright, M.; Ommer, B. ArtFID: Quantitative Evaluation of Neural Style Transfer. In Proceedings of the German Conference on Pattern Recognition 2022, Konstanz, Germany, 27–30 September 2022; pp. 560–576. [Google Scholar]
Parameter | Value
---|---
D steps per G | 1
batch size | 1
optimizer | Adam
preprocessing | resize, crop, flip
crop size | 256
number of labels | 12
parameter initialization | Xavier
GAN mode | hinge
training scheme | TTUR
discriminator | multiscale (2 subnetworks, 4 layers each)
number of epochs | 120
normalization | instance normalization
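As a minimal sketch of how this configuration might be wired together in PyTorch: the hinge adversarial loss and the two-time-scale update rule (TTUR) amount to the loss functions and the pair of Adam optimizers below. The one-layer networks are placeholders, not the TOLGAN architecture, and the learning rates and Adam betas (which the table does not preserve) are illustrative TTUR-style values, not the authors' settings.

```python
import torch
import torch.nn as nn

def d_hinge_loss(real_logits: torch.Tensor, fake_logits: torch.Tensor) -> torch.Tensor:
    """Hinge loss for D: push real logits above +1 and fake logits below -1."""
    return torch.relu(1.0 - real_logits).mean() + torch.relu(1.0 + fake_logits).mean()

def g_hinge_loss(fake_logits: torch.Tensor) -> torch.Tensor:
    """Hinge loss for G: raise the discriminator's score on generated images."""
    return -fake_logits.mean()

# Placeholder networks (the real generator/discriminator are far deeper).
G = nn.Sequential(nn.Conv2d(12, 3, 3, padding=1))  # 12 semantic labels -> RGB
D = nn.Sequential(nn.Conv2d(3, 1, 4, stride=2))    # patch-wise logits

# TTUR: D typically trains with a larger learning rate than G.
# lr and betas here are assumptions, not values from the paper.
opt_G = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.0, 0.9))
opt_D = torch.optim.Adam(D.parameters(), lr=4e-4, betas=(0.0, 0.9))

label_map = torch.randn(1, 12, 256, 256)  # batch size 1, crop size 256
real = torch.randn(1, 3, 256, 256)

# One D step per G step, as in the table.
fake = G(label_map)
loss_D = d_hinge_loss(D(real), D(fake.detach()))
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

loss_G = g_hinge_loss(D(G(label_map)))
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```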
Metric | MUNIT | CycleGAN | ChipGAN | NST | Ours
---|---|---|---|---|---
FID | 213.5 | 198.4 | 235.7 | 187.6 | 154.5
ArtFID | 185.7 | 156.3 | 176.3 | 163.3 | 131.5
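For reference, both metrics are distances, so lower is better. FID compares the Inception feature statistics (mean $\mu$ and covariance $\Sigma$) of generated and reference image sets, and ArtFID (Wright and Ommer, cited above) additionally folds in the LPIPS content distance between each input and its stylized output:

```latex
% FID between reference features r and generated features g (lower is better)
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
             + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

% ArtFID combines content preservation (LPIPS) with style fidelity (FID)
\mathrm{ArtFID} = \left( 1 + \mathrm{LPIPS} \right)\left( 1 + \mathrm{FID} \right)
```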
Question | Model | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | Sum
---|---|---|---|---|---|---|---|---|---|---
1 | MUNIT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | ChipGAN | 4 | 6 | 0 | 9 | 3 | 12 | 10 | 12 | 56
1 | CycleGAN | 0 | 0 | 5 | 5 | 0 | 0 | 4 | 0 | 14
1 | NST | 0 | 4 | 4 | 1 | 2 | 4 | 3 | 2 | 20
1 | Ours | 26 | 20 | 21 | 15 | 25 | 14 | 13 | 16 | 150
2 | MUNIT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | ChipGAN | 3 | 2 | 0 | 8 | 1 | 10 | 7 | 9 | 40
2 | CycleGAN | 0 | 5 | 5 | 8 | 2 | 3 | 5 | 0 | 28
2 | NST | 6 | 12 | 11 | 3 | 14 | 8 | 8 | 10 | 72
2 | Ours | 21 | 11 | 14 | 11 | 13 | 9 | 10 | 16 | 150
Model | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | Average
---|---|---|---|---|---|---|---|---|---
MUNIT | 2.5 | 1.8 | 2.2 | 2.1 | 2.2 | 2.1 | 2.2 | 1.9 | 2.13
ChipGAN | 4.2 | 4.7 | 3.7 | 7.2 | 6.3 | 7.2 | 7.0 | 6.4 | 5.84
CycleGAN | 3.3 | 4.0 | 5.8 | 5.7 | 2.9 | 5.4 | 5.7 | 1.7 | 4.31
NST | 2.9 | 5.8 | 5.2 | 4.2 | 6.1 | 6.3 | 6.9 | 4.5 | 5.24
Ours | 6.9 | 7.4 | 7.6 | 7.7 | 7.6 | 6.5 | 7.1 | 6.3 | 7.14
Configuration | FID | ArtFID
---|---|---
(a) without AdaIN | 194.7 | 178.6
(b) with AdaIN | 157.4 | 140.8
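Since the ablation isolates AdaIN, a minimal sketch of adaptive instance normalization as defined by Huang and Belongie (cited above) follows: content features are normalized per channel, then rescaled and shifted with the style features' statistics. This is the generic operation, not necessarily the exact module used in TOLGAN.

```python
import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Align the per-channel mean/std of `content` features to those of `style`.

    Both tensors are (N, C, H, W) feature maps.
    """
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps  # eps avoids division by zero
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True)
    return s_std * (content - c_mean) / c_std + s_mean

# Example: transfer channel statistics from style features to content features.
content_feat = torch.randn(1, 64, 32, 32)
style_feat = torch.randn(1, 64, 32, 32)
out = adain(content_feat, style_feat)  # same shape as content_feat
```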