Progressive-Augmented-Based DeepFill for High-Resolution Image Inpainting
Abstract
1. Introduction
- We propose a novel progressive image inpainting paradigm that generates high-resolution content for missing regions within an end-to-end training framework.
- We design a novel gathered attention block that learns the global importance of different pixels and channels in semantic feature maps across multiple scales.
- We design a discriminator for image inpainting that separates real pixels from synthesized pixels in the completed image and learns a distinct weight for each synthesized region according to how well it matches its surroundings, thereby improving the model's inpainting quality.
- We conduct extensive experiments, and the results demonstrate that our approach outperforms existing methods in image reconstruction accuracy and visual authenticity.
2. Related Works
2.1. Traditional Image Inpainting Methods
2.1.1. Diffusion-Based Methods
- These methods model the image in a variational space, treating it as a piecewise-smooth function; they can recover continuous structure but reproduce no texture.
- The algorithm is essentially a diffusion process that propagates information from the boundary of the missing region toward its interior; it fails once the region to be repaired is large or its texture is complex [27], as the sketch below illustrates.
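To make the diffusion idea concrete, here is a minimal NumPy sketch of isotropic diffusion inpainting on a grayscale image; the function name and iteration count are illustrative, and real diffusion-based methods use more sophisticated (e.g., anisotropic or fractional-order) PDEs:

```python
import numpy as np

def diffusion_inpaint(img, mask, iters=500):
    """Isotropic diffusion inpainting: repeatedly replace each missing pixel
    with the average of its four neighbours, so known values flow inward
    from the hole boundary. img: float grayscale in [0, 1]; mask: 1 = missing."""
    out = img.copy()
    out[mask == 1] = 0.0
    for _ in range(iters):
        # 4-neighbour average via shifted copies; np.roll wraps at the borders,
        # which is acceptable for a sketch (a real implementation would pad)
        avg = 0.25 * (np.roll(out, 1, axis=0) + np.roll(out, -1, axis=0)
                      + np.roll(out, 1, axis=1) + np.roll(out, -1, axis=1))
        out[mask == 1] = avg[mask == 1]  # update only the hole pixels
    return out
```

Because each missing pixel only ever averages its neighbours, the fill is smooth by construction, which is exactly why such methods recover continuous structure but cannot reproduce texture.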
2.1.2. Patch-Based Methods
2.2. Deep Learning-Based Methods
3. Methodology
3.1. Gathered Attention Block
3.1.1. Pixel Attention Block
- Perform a convolution on the input feature map;
- The convolutional layer reduces the channel dimension of the feature map, which is equivalent to a sparse coding of the features;
- Each pixel of the input feature map is multiplied by its corresponding pixel attention weight to obtain the final feature map (see the sketch after this list).
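A minimal PyTorch sketch of these three steps; the reduction ratio, 1×1 kernels, and sigmoid gating are our assumptions, since the text specifies only the steps above:

```python
import torch.nn as nn

class PixelAttentionBlock(nn.Module):
    """Sketch of the pixel attention steps above: a 1x1 convolution reduces
    the channel dimension (a sparse coding of the features), a second
    convolution yields one weight per spatial location, and the input is
    reweighted pixel-wise."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),  # channel reduction
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 1, kernel_size=1),  # one weight per pixel
            nn.Sigmoid(),  # weights in (0, 1)
        )

    def forward(self, x):  # x: (B, C, H, W)
        w = self.attn(x)   # (B, 1, H, W)
        return x * w       # broadcast over channels
```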
3.1.2. Channel Attention Block
- The input feature map is compressed into a one-dimensional vector after passing through the global max-pooling layer;
- Employ convolution to learn an attention weight for each channel dimension of the original feature map;
- Multiply each channel by the corresponding attention weight to obtain the final feature map (see the sketch after this list).
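A matching sketch for the channel branch, under the same assumptions (the squeeze ratio and sigmoid are ours; only the pooling, convolution, and rescaling steps come from the text):

```python
import torch.nn as nn

class ChannelAttentionBlock(nn.Module):
    """Sketch of the channel attention steps above: global max pooling
    squeezes each channel map to one scalar, 1x1 convolutions map these
    scalars to per-channel weights, and each channel is rescaled."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.pool = nn.AdaptiveMaxPool2d(1)  # (B, C, H, W) -> (B, C, 1, 1)
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, x):
        w = self.attn(self.pool(x))  # (B, C, 1, 1)
        return x * w                 # rescale each channel
```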
3.2. Image Enhancement Module
3.2.1. Adaptive Parameter Control Network
3.2.2. Multi-Dimensional Image Enhancement Network
3.3. Generator Network
3.3.1. The First-Stage Generator
3.3.2. The Second-Stage Generator
3.4. MT-PatchGAN
3.5. End-to-End Optimization
4. Experiments
4.1. Experimental Settings
4.1.1. Datasets
4.1.2. Baselines
4.1.3. Parameter Settings
4.2. Quantitative Analysis
Qualitative Analysis
4.3. Ablation Study
4.4. Application
4.4.1. Logo Removal
4.4.2. Object Removal
4.4.3. Face Editing
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Wan, Z.; Zhang, B.; Chen, D.; Zhang, P.; Chen, D.; Liao, J.; Wen, F. Bringing old photos back to life. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2747–2757.
- Jo, Y.; Park, J. SC-FEGAN: Face editing generative adversarial network with user's sketch and color. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1745–1753.
- Lin, S.; Wang, X.; Xiao, G.; Yan, Y.; Wang, H. Hierarchical representation via message propagation for robust model fitting. IEEE Trans. Ind. Electron. 2021, 68, 8582–8592.
- Lin, S.; Xiao, G.; Yan, Y.; Suter, D.; Wang, H. Hypergraph optimization for multi-structural geometric model fitting. Proc. AAAI Conf. Artif. Intell. 2019, 33, 8730–8737.
- Lin, S.; Luo, H.; Yan, Y.; Xiao, G.; Wang, H. Co-clustering on bipartite graphs for robust model fitting. IEEE Trans. Image Process. 2022, 31, 6605–6620.
- Bertalmio, M.; Sapiro, G.; Caselles, V.; Ballester, C. Image inpainting. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 23–28 July 2000; pp. 417–424.
- Chan, T.F.; Shen, J. Mathematical models for local nontexture inpaintings. SIAM J. Appl. Math. 2002, 62, 1019–1043.
- Sridevi, G.; Srinivas Kumar, S. Image inpainting based on fractional-order nonlinear diffusion for image reconstruction. Circuits Syst. Signal Process. 2019, 38, 3802–3817.
- Criminisi, A.; Pérez, P.; Toyama, K. Object removal by exemplar-based inpainting. In Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Madison, WI, USA, 18–20 June 2003; Volume 2, pp. II-721–II-728.
- Cheng, W.-H.; Hsieh, C.-W.; Lin, S.-K.; Wang, C.-W.; Wu, J.-L. Robust algorithm for exemplar-based image inpainting. In Proceedings of the International Conference on Computer Graphics, Imaging and Visualization, Beijing, China, 26–29 July 2005; pp. 64–69.
- Xu, Z.; Sun, J. Image inpainting by patch propagation using patch sparsity. IEEE Trans. Image Process. 2010, 19, 1153–1165.
- Le Meur, O.; Gautier, J.; Guillemot, C. Examplar-based inpainting based on local geometry. In Proceedings of the 2011 18th IEEE International Conference on Image Processing, Brussels, Belgium, 11–14 September 2011; pp. 3401–3404.
- Yan, Z.; Li, X.; Li, M.; Zuo, W.; Shan, S. Shift-Net: Image inpainting via deep feature rearrangement. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 1–17.
- Lin, S.; Yang, A.; Lai, T.; Weng, J.; Wang, H. Multi-motion segmentation via co-attention-induced heterogeneous model fitting. IEEE Trans. Circuits Syst. Video Technol. 2023, 1–13.
- Rawat, W.; Wang, Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural Comput. 2017, 29, 2352–2449.
- Yang, H.; Lin, S.; Lin, C.; Yang, L.; Wang, H. SCINet: Semantic cue infusion network for lane detection. In Proceedings of the IEEE International Conference on Image Processing, Bordeaux, France, 16–19 October 2022; pp. 1811–1815.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
- Zeng, Y.; Fu, J.; Chao, H.; Guo, B. Aggregated contextual transformations for high-resolution image inpainting. IEEE Trans. Vis. Comput. Graph. 2022, 29, 3266–3280.
- Yu, J.; Lin, Z.; Yang, J.; Shen, X.; Lu, X.; Huang, T.S. Generative image inpainting with contextual attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5505–5514.
- Yu, J.; Lin, Z.; Yang, J.; Shen, X.; Lu, X.; Huang, T.S. Free-form image inpainting with gated convolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4471–4480.
- Romero, A.; Castillo, A.; Abril-Nova, J.; Timofte, R.; Das, R.; Hira, S.; Pan, Z.; Zhang, M.; Li, B.; He, D.; et al. NTIRE 2022 image inpainting challenge: Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–22 June 2022; pp. 1150–1182.
- Liu, G.; Reda, F.A.; Shih, K.J.; Wang, T.-C.; Tao, A.; Catanzaro, B. Image inpainting for irregular holes using partial convolutions. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 85–100.
- Wang, Y.; Li, C.; Liu, Z.; Li, M.; Tang, J.; Xie, X.; Chen, L.; Yu, P.S. An adaptive graph pre-training framework for localized collaborative filtering. ACM Trans. Inf. Syst. 2022, 41, 1–27.
- Liu, H.; Jiang, B.; Xiao, Y.; Yang, C. Coherent semantic attention for image inpainting. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4170–4179.
- Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3730–3738.
- Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Part V; Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755.
- Wang, X.; Niu, S.; Wang, H. Image inpainting detection based on multi-task deep learning network. IETE Tech. Rev. 2021, 38, 149–157.
- Hays, J.; Efros, A.A. Scene completion using millions of photographs. Commun. ACM 2008, 51, 87–94.
- Pathak, D.; Krahenbuhl, P.; Donahue, J.; Darrell, T.; Efros, A.A. Context encoders: Feature learning by inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2536–2544.
- Iizuka, S.; Simo-Serra, E.; Ishikawa, H. Globally and locally consistent image completion. ACM Trans. Graph. 2017, 36, 1–14.
- Yeh, R.A.; Chen, C.; Lim, T.Y.; Schwing, A.G.; Hasegawa-Johnson, M.; Do, M.N. Semantic image inpainting with deep generative models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5485–5493.
- Li, Y.; Liu, S.; Yang, J.; Yang, M.-H. Generative face completion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3911–3919.
- Mnih, V.; Heess, N.; Graves, A. Recurrent models of visual attention. Adv. Neural Inf. Process. Syst. 2014, 27.
- Pang, B.; Li, C.; Liu, Y.; Lian, J.; Zhao, J.; Sun, H.; Deng, W.; Xie, X.; Zhang, Q. Improving relevance modeling via heterogeneous behavior graph learning in Bing Ads. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 3713–3721.
- Zhang, X.; Wang, X.; Shi, C.; Yan, Z.; Li, X.; Kong, B.; Lyu, S.; Zhu, B.; Lv, J.; Yin, Y.; et al. DE-GAN: Domain embedded GAN for high quality face image inpainting. Pattern Recognit. 2022, 124, 108415.
- Zhou, Y.; Barnes, C.; Shechtman, E.; Amirghodsi, S. TransFill: Reference-guided image inpainting by merging multiple color and spatial transformations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2266–2276.
- Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134.
- Guo, X.; Yang, H.; Huang, D. Image inpainting via conditional texture and structure dual generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 14134–14143.
- Nazeri, K.; Ng, E.; Joseph, T.; Qureshi, F.Z.; Ebrahimi, M. EdgeConnect: Generative image inpainting with adversarial edge learning. arXiv 2019, arXiv:1901.00212.
- Ren, Y.; Yu, X.; Zhang, R.; Li, T.H.; Liu, S.; Li, G. StructureFlow: Image inpainting via structure-aware appearance flow. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 181–190.
- Liu, W.; Ren, G.; Yu, R.; Guo, S.; Zhu, J.; Zhang, L. Image-adaptive YOLO for object detection in adverse weather conditions. Proc. AAAI Conf. Artif. Intell. 2022, 36, 1792–1800.
- Hu, Y.; He, H.; Xu, C.; Wang, B.; Lin, S. Exposure: A white-box photo post-processing framework. ACM Trans. Graph. 2018, 37, 26:1–26:17.
- Xu, Y.; Feng, K.; Yan, X.; Yan, R.; Ni, Q.; Sun, B.; Lei, Z.; Zhang, Y.; Liu, Z. CFCNN: A novel convolutional fusion framework for collaborative fault identification of rotating machinery. Inf. Fusion 2023, 95, 1–16.
- Polesel, A.; Ramponi, G.; Mathews, V.J. Image enhancement via adaptive unsharp masking. IEEE Trans. Image Process. 2000, 9, 505–510.
- Gatys, L.A.; Ecker, A.S.; Bethge, M. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2414–2423.
- Johnson, J.; Alahi, A.; Fei-Fei, L. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 694–711.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Xu, Y.; Yan, X.; Sun, B.; Zhai, J.; Liu, Z. Multireceptive field denoising residual convolutional networks for fault diagnosis. IEEE Trans. Ind. Electron. 2022, 69, 11686–11696.
- Su, H.; Zhu, X.; Gong, S. Open logo detection challenge. arXiv 2018, arXiv:1807.01964.
| Filter | Parameters | Mapping Function |
|---|---|---|
| White Balance | $W_r$, $W_g$, $W_b$ | $P_o = (W_r \cdot r,\; W_g \cdot g,\; W_b \cdot b)$ |
| Gamma | $G$ | $P_o = P_i^{G}$ |
| Contrast | $\alpha$ | $P_o = \alpha \cdot \mathrm{En}(P_i) + (1 - \alpha) \cdot P_i$ |
| Tone | $\{t_0, \ldots, t_{L-1}\}$ | $P_o = \frac{1}{T_L} \sum_{j=0}^{L-1} \mathrm{clip}(L \cdot P_i - j, 0, 1) \cdot t_j$, with $T_L = \sum_{k=0}^{L-1} t_k$ |
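To illustrate how such filters remain differentiable so that the adaptive parameter control network can be trained end to end, here is a minimal sketch of the first three; the tensor shapes and the luminance-based choice of $\mathrm{En}$ follow the Image-Adaptive YOLO formulation cited in the references and are assumptions here, not this paper's exact implementation (the piecewise-linear tone filter is omitted for brevity):

```python
import torch

def white_balance(img, w):
    """img: (B, 3, H, W) in [0, 1]; w: (B, 3) channel gains predicted by the
    parameter-control network. Implements P_o = (W_r*r, W_g*g, W_b*b)."""
    return img * w.view(-1, 3, 1, 1)

def gamma(img, g, eps=1e-6):
    """Implements P_o = P_i^G; the clamp keeps gradients finite at zero."""
    return torch.clamp(img, min=eps) ** g.view(-1, 1, 1, 1)

def contrast(img, alpha):
    """Implements P_o = alpha*En(P_i) + (1-alpha)*P_i, with a luminance-based
    enhancement En (an assumption borrowed from Image-Adaptive YOLO)."""
    a = alpha.view(-1, 1, 1, 1)
    lum = 0.299 * img[:, 0:1] + 0.587 * img[:, 1:2] + 0.114 * img[:, 2:3]
    en = img * (0.5 * (1.0 - torch.cos(torch.pi * lum)) / (lum + 1e-6))
    return a * en + (1.0 - a) * img
```

Because every operation is differentiable with respect to the predicted parameters, gradients from the downstream inpainting losses can flow back into the parameter-control network.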
| Metric | Mask (%) | CA | DeepFillv2 | PConv | CTSDG | AOT-GAN | PA-DeepFill |
|---|---|---|---|---|---|---|---|
| ℓ1 (×10⁻²)↓ | 0–10 | 1.57 | 1.31 | 1.43 | 0.55 | 1.14 | 1.23 |
| | 10–20 | 2.19 | 2.01 | 2.09 | 1.29 | 1.89 | 1.81 |
| | 20–30 | 3.69 | 3.31 | 3.47 | 2.31 | 2.76 | 2.33 |
| | 30–40 | 4.56 | 4.11 | 4.17 | 3.44 | 3.91 | 3.23 |
| | 40–50 | 5.33 | 5.14 | 5.30 | 4.84 | 5.01 | 4.84 |
| | 50–60 | 8.33 | 7.97 | 8.03 | 7.65 | 7.88 | 7.57 |
| PSNR↑ | 0–10 | 28.61 | 29.94 | 29.37 | 34.15 | 32.99 | 33.78 |
| | 10–20 | 25.92 | 26.77 | 26.13 | 28.77 | 28.47 | 28.49 |
| | 20–30 | 20.43 | 21.10 | 20.99 | 25.32 | 25.19 | 25.49 |
| | 30–40 | 18.96 | 19.73 | 19.30 | 23.03 | 22.94 | 23.47 |
| | 40–50 | 16.47 | 18.02 | 16.45 | 21.17 | 21.07 | 21.14 |
| | 50–60 | 14.33 | 14.57 | 13.69 | 18.43 | 18.51 | 18.44 |
| SSIM (×10⁻¹)↑ | 0–10 | 9.33 | 9.55 | 9.20 | 9.75 | 9.44 | 9.33 |
| | 10–20 | 8.94 | 9.08 | 8.94 | 9.33 | 9.19 | 9.11 |
| | 20–30 | 8.33 | 8.77 | 8.43 | 8.79 | 8.60 | 8.96 |
| | 30–40 | 7.99 | 8.15 | 7.94 | 8.22 | 8.11 | 8.24 |
| | 40–50 | 7.45 | 7.20 | 7.46 | 7.59 | 7.30 | 7.61 |
| | 50–60 | 6.00 | 6.25 | 6.33 | 6.70 | 6.37 | 7.01 |
| FID↓ | 0–10 | 4.25 | 3.89 | 3.91 | 3.01 | 3.14 | 3.11 |
| | 10–20 | 13.79 | 13.20 | 15.37 | 8.89 | 9.01 | 8.94 |
| | 20–30 | 22.38 | 22.05 | 27.37 | 17.09 | 15.79 | 15.44 |
| | 30–40 | 34.33 | 35.97 | 38.44 | 26.97 | 25.14 | 24.75 |
| | 40–50 | 53.66 | 50.19 | 58.47 | 40.46 | 44.37 | 43.74 |
| | 50–60 | 84.94 | 81.46 | 91.37 | 68.31 | 70.59 | 64.58 |
| Metric | Mask (%) | CA | DeepFillv2 | PConv | CTSDG | AOT-GAN | PA-DeepFill |
|---|---|---|---|---|---|---|---|
| ℓ1 (×10⁻²)↓ | 0–10 | 1.05 | 0.92 | 0.98 | 0.54 | 0.78 | 0.73 |
| | 10–20 | 1.35 | 1.29 | 1.30 | 0.89 | 1.14 | 0.91 |
| | 20–30 | 2.04 | 1.88 | 1.94 | 1.59 | 1.73 | 1.43 |
| | 30–40 | 2.99 | 2.71 | 2.85 | 2.37 | 2.35 | 2.27 |
| | 40–50 | 4.01 | 3.72 | 3.70 | 3.47 | 3.55 | 3.36 |
| | 50–60 | 6.58 | 6.35 | 6.44 | 6.13 | 6.07 | 6.19 |
| PSNR↑ | 0–10 | 34.12 | 37.51 | 35.09 | 39.48 | 38.39 | 38.91 |
| | 10–20 | 31.32 | 33.05 | 32.74 | 34.15 | 33.60 | 34.33 |
| | 20–30 | 27.45 | 29.34 | 28.37 | 30.18 | 29.78 | 30.43 |
| | 30–40 | 24.27 | 26.35 | 25.91 | 27.04 | 26.73 | 27.11 |
| | 40–50 | 22.00 | 24.05 | 24.13 | 24.67 | 24.37 | 24.51 |
| | 50–60 | 18.97 | 20.12 | 19.87 | 20.68 | 20.50 | 20.44 |
| SSIM (×10⁻¹)↑ | 0–10 | 9.59 | 9.76 | 9.65 | 9.84 | 9.78 | 9.71 |
| | 10–20 | 9.40 | 9.54 | 9.55 | 9.67 | 9.57 | 9.57 |
| | 20–30 | 8.91 | 9.23 | 9.21 | 9.31 | 9.16 | 9.27 |
| | 30–40 | 8.51 | 8.84 | 8.80 | 8.92 | 8.88 | 8.94 |
| | 40–50 | 8.13 | 8.40 | 8.40 | 8.49 | 8.44 | 8.64 |
| | 50–60 | 7.31 | 7.60 | 7.41 | 7.65 | 7.55 | 7.65 |
| FID↓ | 0–10 | 2.01 | 1.75 | 1.83 | 1.21 | 1.34 | 1.33 |
| | 10–20 | 4.16 | 3.87 | 4.01 | 3.13 | 3.37 | 3.13 |
| | 20–30 | 7.90 | 7.31 | 7.51 | 6.29 | 6.84 | 6.22 |
| | 30–40 | 13.96 | 11.37 | 13.37 | 10.01 | 10.75 | 9.74 |
| | 40–50 | 19.39 | 15.35 | 18.05 | 13.66 | 14.64 | 13.43 |
| | 50–60 | 31.33 | 24.25 | 29.54 | 20.77 | 22.67 | 20.74 |
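For reference, a sketch of how the first three metrics in the tables above can be computed with scikit-image (FID additionally requires a pretrained Inception network and is omitted); the ×10⁻² and ×10⁻¹ scalings are our reading of the table headers:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def table_metrics(gt, pred):
    """gt, pred: float arrays in [0, 1], shape (H, W, 3).
    Returns (l1 * 10^2, PSNR in dB, SSIM * 10) to match the table scaling."""
    l1 = np.abs(gt - pred).mean() * 100.0  # mean absolute error, x10^-2 scale
    psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
    ssim = structural_similarity(gt, pred, channel_axis=2, data_range=1.0) * 10.0
    return l1, psnr, ssim
```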
| Model | PAB | CAB | ℓ1 (×10⁻²)↓ | PSNR↑ | SSIM (×10⁻¹)↑ | FID↓ |
|---|---|---|---|---|---|---|
| 1 | × | × | 4.01 | 21.97 | 7.33 | 19.97 |
| 2 | ✓ | × | 3.59 | 22.74 | 7.93 | 21.78 |
| 3 | × | ✓ | 3.57 | 22.75 | 7.99 | 21.32 |
| 4 (PA-DeepFill) | ✓ | ✓ | 3.23 | 23.47 | 8.24 | 24.75 |
| IE Module | ℓ1 (×10⁻²)↓ | PSNR↑ | SSIM (×10⁻¹)↑ | FID↓ |
|---|---|---|---|---|
| × | 3.74 | 22.44 | 8.11 | 27.74 |
| ✓ | 3.23 | 23.47 | 8.24 | 24.75 |
| Model | ℓ1 (×10⁻²)↓ | PSNR↑ | SSIM (×10⁻¹)↑ | FID↓ |
|---|---|---|---|---|
| PatchGAN | 3.67 | 22.13 | 8.07 | 41.44 |
| SN-PatchGAN | 3.34 | 23.07 | 8.26 | 33.47 |
| MT-PatchGAN | 3.23 | 23.47 | 8.24 | 24.75 |
| Size | ℓ1 (×10⁻²)↓ | PSNR↑ | SSIM (×10⁻¹)↑ | FID↓ |
|---|---|---|---|---|
| | 3.39 | 23.00 | 8.23 | 27.71 |
| | 3.39 | 23.01 | 8.24 | 27.52 |
| | 3.23 | 23.47 | 8.24 | 24.75 |
| | 3.41 | 22.64 | 8.21 | 29.69 |