Material Translation Based on Neural Style Transfer with Ideal Style Image Retrieval
Abstract
1. Introduction
- We propose a single-material translation framework based on real-time material segmentation and neural style transfer (NST) with automatic style image retrieval (a minimal illustrative sketch of this pipeline follows the list below).
- We present a human perceptual study conducted with 100 participants to evaluate how well our generated results fool human perception of objects whose materials have been translated.
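As a rough illustration only (not the paper's released implementation), the following Python sketch shows the data flow implied by the proposed framework: segment the region of the source material, automatically retrieve a style exemplar of the target material, stylize the image, and composite the stylized pixels back under the mask. The helpers `segment_material`, `retrieve_style_image`, and `style_transfer` are hypothetical stubs standing in for the real components.

```python
import numpy as np

# Hypothetical sketch of the single-material translation pipeline.
# None of these helpers come from the authors' code; they only
# illustrate the data flow: segmentation -> retrieval -> NST -> compositing.

def segment_material(image: np.ndarray, material: str) -> np.ndarray:
    """Stand-in for the real-time material segmentation step.
    Returns a boolean mask (H, W) marking pixels of `material`."""
    return np.ones(image.shape[:2], dtype=bool)  # stub: whole image

def retrieve_style_image(content: np.ndarray, target_material: str) -> np.ndarray:
    """Stand-in for automatic style image retrieval.
    A real system would rank a material database and return the best exemplar."""
    return np.random.rand(*content.shape)  # stub exemplar

def style_transfer(content: np.ndarray, style: np.ndarray) -> np.ndarray:
    """Stand-in for the NST backbone."""
    return 0.5 * content + 0.5 * style  # stub blend

def translate_material(image: np.ndarray, source: str, target: str) -> np.ndarray:
    """Translate only the pixels of `source` material into `target` material."""
    mask = segment_material(image, source)
    style = retrieve_style_image(image, target)
    stylized = style_transfer(image, style)
    out = image.copy()
    out[mask] = stylized[mask]  # composite: background stays untouched
    return out

if __name__ == "__main__":
    img = np.random.rand(256, 256, 3)
    print(translate_material(img, source="wood", target="metal").shape)
```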
2. Related Work
3. Proposed Method
3.1. Style Image Retrieval
3.2. Material Translation with NST
3.2.1. Real-Time Material Segmentation
3.2.2. Material Translation
4. Experimental Results
4.1. Implementation Details
4.2. Datasets
4.3. Ablation Study
4.4. Comparison among SOTA NST Methods
4.5. Human Perceptual Study
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022; pp. 11976–11986.
- Gatys, L.A.; Ecker, A.S.; Bethge, M. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2414–2423.
- Jing, Y.; Yang, Y.; Feng, Z.; Ye, J.; Yu, Y.; Song, M. Neural style transfer: A review. IEEE Trans. Vis. Comput. Graph. 2019, 26, 3365–3385.
- Siarohin, A.; Zen, G.; Majtanovic, C.; Alameda-Pineda, X.; Ricci, E.; Sebe, N. How to make an image more memorable? A deep style transfer approach. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, Mountain View, CA, USA, 6–9 June 2017; pp. 322–329.
- Yanai, K.; Tanno, R. Conditional fast style transfer network. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, Mountain View, CA, USA, 6–9 June 2017; pp. 434–437.
- Li, T.; Qian, R.; Dong, C.; Liu, S.; Yan, Q.; Zhu, W.; Lin, L. BeautyGAN: Instance-level facial makeup transfer with deep generative adversarial network. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Korea, 22–26 October 2018; pp. 645–653.
- Matsuo, S.; Shimoda, W.; Yanai, K. Partial style transfer using weakly supervised semantic segmentation. In Proceedings of the IEEE International Conference on Multimedia & Expo Workshops, Hong Kong, China, 10–14 July 2017; pp. 267–272.
- Ulyanov, D.; Vedaldi, A.; Lempitsky, V. Instance normalization: The missing ingredient for fast stylization. arXiv 2016, arXiv:1607.08022.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Ahn, J.; Kwak, S. Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4981–4990.
- Chao, P.; Kao, C.Y.; Ruan, Y.S.; Huang, C.H.; Lin, Y.L. HarDNet: A low memory traffic network. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 3552–3561.
- Sharan, L.; Rosenholtz, R.; Adelson, E. Material perception: What can you see in a brief glance? J. Vis. 2009, 9, 784.
- Zhang, Y.; Ozay, M.; Liu, X.; Okatani, T. Integrating deep features for material recognition. In Proceedings of the 23rd International Conference on Pattern Recognition, Cancun, Mexico, 4–8 December 2016; pp. 3697–3702.
- Benitez-Garcia, G.; Shimoda, W.; Yanai, K. Style Image Retrieval for Improving Material Translation Using Neural Style Transfer. In Proceedings of the 2020 Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia (MMArt-ACM ’20), Dublin, Ireland, 26–29 October 2020; pp. 8–13.
- Johnson, J.; Alahi, A.; Fei-Fei, L. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 694–711.
- Huang, X.; Belongie, S. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1501–1510.
- Li, Y.; Fang, C.; Yang, J.; Wang, Z.; Lu, X.; Yang, M.H. Universal style transfer via feature transforms. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 386–396.
- Li, X.; Liu, S.; Kautz, J.; Yang, M.H. Learning linear transformations for fast arbitrary style transfer. arXiv 2018, arXiv:1808.04537.
- Zhang, C.; Zhu, Y.; Zhu, S.C. MetaStyle: Three-way trade-off among speed, flexibility, and quality in neural style transfer. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 1254–1261.
- Kolkin, N.; Salavon, J.; Shakhnarovich, G. Style Transfer by Relaxed Optimal Transport and Self-Similarity. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 10051–10060.
- Xu, Z.; Hou, L.; Zhang, J. IFFMStyle: High-Quality Image Style Transfer Using Invalid Feature Filter Modules. Sensors 2022, 22, 6134.
- Kim, M.; Choi, H.C. Total Style Transfer with a Single Feed-Forward Network. Sensors 2022, 22, 4612.
- Shimoda, W.; Yanai, K. Distinct class-specific saliency maps for weakly supervised semantic segmentation. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 218–234.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Huang, X.; Liu, M.Y.; Belongie, S.; Kautz, J. Multimodal unsupervised image-to-image translation. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 172–189.
- Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2921–2929.
- Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
- Wang, T.C.; Liu, M.Y.; Zhu, J.Y.; Tao, A.; Kautz, J.; Catanzaro, B. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8798–8807.
- Chen, Q.; Koltun, V. Photographic image synthesis with cascaded refinement networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1511–1520.
| Method | Training Set | Test Set |
|---|---|---|
| PSA | 10,000 (EFMD) | 1000 (FMD) |
| HarDNet-base | 10,000 (EFMD) | 1000 (FMD) |
| InceptionV3 | 10,000 (EFMD) | 1000 (FMD) |
| HarDNet | 900 (FMD) | 100 (FMD) |
| NST-based | - | 100 (FMD) |
| Method | acc (w/o Refine) | mIoU (w/o Refine) | acc (w/ Refine) | mIoU (w/ Refine) |
|---|---|---|---|---|
| Baseline | - | - | 0.556 | 0.4860 |
| VGG19-IN | 0.409 | 0.3967 | 0.572 | 0.5062 |
| VGG19-BN | 0.291 | 0.3612 | 0.543 | 0.4887 |
| VGG19 | 0.270 | 0.3520 | 0.506 | 0.4845 |
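For reference, acc and mIoU in the ablation are the standard per-pixel accuracy and mean intersection-over-union. The paper's exact evaluation code is not reproduced here; the NumPy sketch below only shows how these metrics are commonly computed from integer label maps.

```python
import numpy as np

def pixel_accuracy(pred: np.ndarray, gt: np.ndarray) -> float:
    """Fraction of correctly labelled pixels."""
    return float((pred == gt).mean())

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Mean IoU over classes, skipping classes absent from both maps.
    `pred` and `gt` are integer label maps of identical shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```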
| Method | acc ↑ | mIoU ↑ | IS ↑ | FID ↓ | Inference Time ↓ |
|---|---|---|---|---|---|
| Gatys [4] | 0.572 | 0.5062 | 4.181 | 61.30 | 45.6545 s |
| STROTSS [22] | 0.515 | 0.4887 | 4.046 | 60.29 | 89.1562 s |
| Johnson’s [17] | 0.506 | 0.4464 | 3.887 | 68.44 | 0.0881 s |
| MetaStyle [21] | 0.442 | 0.4674 | 3.635 | 61.93 | 0.1868 s |
| WCT [19] | 0.353 | 0.4079 | 3.604 | 64.53 | 1.0151 s |
| LST [20] | 0.343 | 0.3606 | 3.569 | 62.95 | 0.4816 s |
| AdaIN [18] | 0.304 | 0.2780 | 3.129 | 74.52 | 0.1083 s |
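Several of the compared methods trace back to the Gram-matrix style loss of Gatys et al. [4]; others, such as AdaIN [18] and WCT [19], match feature statistics in different ways. As background only, and not the exact objective of any individual method in the table, a short PyTorch sketch of the classic Gram-matrix style loss:

```python
import torch

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Gram matrix of CNN features with shape (B, C, H, W),
    normalized by the number of feature elements."""
    b, c, h, w = features.shape
    f = features.reshape(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def style_loss(output_feats, style_feats) -> torch.Tensor:
    """Sum of squared Gram-matrix differences over a list of feature layers."""
    loss = torch.zeros(())
    for fo, fs in zip(output_feats, style_feats):
        loss = loss + torch.mean((gram_matrix(fo) - gram_matrix(fs)) ** 2)
    return loss
```

Optimization-based methods (Gatys [4], STROTSS [22]) minimize a loss of roughly this kind per image, while feed-forward methods (Johnson’s [17], MetaStyle [21], AdaIN [18]) amortize it into a network, which is consistent with the large inference-time gap reported in the table.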
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).