SR-Inpaint: A General Deep Learning Framework for High Resolution Image Inpainting
Abstract
1. Introduction
- A novel deep learning framework for high-resolution inpainting that accepts 2K and 4K input images and yields equally sharp results.
- A "degradation and refinement" strategy is proposed to suppress memory and computational overhead while preserving inpainting quality at high resolution.
- The structural coherence and visual fidelity of the inpainted results surpass those of state-of-the-art methods.
- A general high-resolution inpainting pipeline consisting of an independent inpainter and refiner in series, which can be trained and modified separately.
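The two-stage pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration of the data flow only: the `lr_inpaint` and `sr_refine` functions below are hypothetical placeholders (mean fill and nearest-neighbour upsampling) standing in for the learned LR inpainting network and SR network of the paper.

```python
import numpy as np

def degrade(img, factor=4):
    # Box-average downsampling, a stand-in for the paper's degradation step.
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def lr_inpaint(lr_img, lr_mask):
    # Placeholder for the learned LR inpainting network: fill the hole
    # with the mean of the known pixels.
    out = lr_img.copy()
    hole = lr_mask > 0.5
    out[hole] = lr_img[~hole].mean()
    return out

def sr_refine(lr_img, factor=4):
    # Placeholder for the learned SR network: nearest-neighbour upsampling.
    return np.repeat(np.repeat(lr_img, factor, axis=0), factor, axis=1)

def sr_inpaint(hr_img, hr_mask, factor=4):
    # Degrade -> inpaint at low resolution -> super-resolve -> composite.
    lr_filled = lr_inpaint(degrade(hr_img, factor), degrade(hr_mask, factor))
    hr_filled = sr_refine(lr_filled, factor)
    # Keep the original pixels outside the hole for exact fidelity.
    return np.where(hr_mask > 0.5, hr_filled, hr_img)

# Toy 8x8 grayscale image with a 4x4 hole in the middle.
img = np.arange(64, dtype=float).reshape(8, 8) / 63.0
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
result = sr_inpaint(img, mask, factor=2)
```

The compositing step at the end is what guarantees that known high-resolution pixels pass through unchanged; only the hole region is synthesized at low resolution and upsampled.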
2. Related Work
2.1. Image Inpainting
2.2. High-Frequency Image Content Reconstruction
3. Method
3.1. Framework and Flow
3.2. LR Inpainting Network
3.3. SR Network
4. Training Configuration and Strategy
5. Experimental Results and Discussion
5.1. Implementation Details
5.2. Qualitative Evaluation
5.3. Quantitative Evaluation
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| HR | High-Resolution |
| LR | Low-Resolution |
| SR | Super-Resolution |
| CNN | Convolutional Neural Network |
| GAN | Generative Adversarial Network |
| Method | PSNR↑ (2K) | SSIM↑ (2K) | NRMSE↓ (2K) | FID↓ (2K) | PSNR↑ (4K) | SSIM↑ (4K) | NRMSE↓ (4K) | FID↓ (4K) |
|---|---|---|---|---|---|---|---|---|
| HiFill | 21.386 | 0.810 | 0.175 | 1.193 | 20.503 | 0.813 | 0.239 | 2.082 |
| SR-Inpaint | 27.364 | 0.923 | 0.092 | 0.097 | 26.065 | 0.910 | 0.130 | 0.138 |
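Of the metrics in the table above, PSNR and NRMSE follow directly from the mean squared error and can be computed in plain NumPy (SSIM and FID require their standard reference implementations). This is a sketch from the textbook definitions; the paper does not specify which implementation it used.

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    # Peak signal-to-noise ratio in dB; higher is better.
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def nrmse(ref, test):
    # Root-mean-square error normalised by the RMS of the reference;
    # lower is better.
    return np.sqrt(np.mean((ref - test) ** 2)) / np.sqrt(np.mean(ref ** 2))

# Demo on a synthetic reference image with mild Gaussian noise.
rng = np.random.default_rng(0)
ref = rng.random((64, 64))
noisy = np.clip(ref + rng.normal(0.0, 0.05, ref.shape), 0.0, 1.0)
print(f"PSNR = {psnr(ref, noisy):.2f} dB, NRMSE = {nrmse(ref, noisy):.3f}")
```

Note that both metrics are purely pixel-wise: they reward exact reconstruction but say nothing about perceptual plausibility, which is why the table also reports SSIM and FID.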
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, H.; Li, X.; Zhang, K.; He, Y.; Fan, H.; Liu, S.; Hao, C.; Jiang, B. SR-Inpaint: A General Deep Learning Framework for High Resolution Image Inpainting. Algorithms 2021, 14, 236. https://doi.org/10.3390/a14080236
Xu H, Li X, Zhang K, He Y, Fan H, Liu S, Hao C, Jiang B. SR-Inpaint: A General Deep Learning Framework for High Resolution Image Inpainting. Algorithms. 2021; 14(8):236. https://doi.org/10.3390/a14080236
Chicago/Turabian Style: Xu, Haoran, Xinya Li, Kaiyi Zhang, Yanbai He, Haoran Fan, Sijiang Liu, Chuanyan Hao, and Bo Jiang. 2021. "SR-Inpaint: A General Deep Learning Framework for High Resolution Image Inpainting" Algorithms 14, no. 8: 236. https://doi.org/10.3390/a14080236
APA Style: Xu, H., Li, X., Zhang, K., He, Y., Fan, H., Liu, S., Hao, C., & Jiang, B. (2021). SR-Inpaint: A General Deep Learning Framework for High Resolution Image Inpainting. Algorithms, 14(8), 236. https://doi.org/10.3390/a14080236