Universal Image Restoration with Text Prompt Diffusion
Abstract
1. Introduction
- We propose ETDiffIR, a text prompt diffusion model for universal image restoration. To the best of our knowledge, this is the first attempt to incorporate text prompts into universal image restoration.
- We design an effective denoising network for diffusion-based image restoration. By injecting text prior information into the diffusion model through an efficient restoration block (ERB), ETDiffIR achieves excellent restoration results.
- We construct a combined dataset covering three different degradation types and generate synthetic captions with the vision-language model MiniGPT-4, yielding a high-quality dataset of paired text and images.
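The text prior described above has to be fused into image features somewhere in the denoiser. The paper's TIFB/ERB internals are not reproduced here; the following is a minimal sketch, assuming a standard single-head cross-attention in which flattened image tokens query the caption's token embeddings (dimensions, weight names, and the residual injection are all illustrative assumptions, not the authors' implementation).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def text_image_fusion(img_tokens, txt_tokens, Wq, Wk, Wv):
    """Single-head cross-attention: image tokens query the text-caption
    tokens, and the attended text prior is added back residually."""
    q = img_tokens @ Wq                      # (N, d) queries from image
    k = txt_tokens @ Wk                      # (L, d) keys from caption
    v = txt_tokens @ Wv                      # (L, C) values projected to image dim
    attn = softmax(q @ k.T / np.sqrt(Wq.shape[1]))
    return img_tokens + attn @ v             # residual injection of text prior

C, D, d = 64, 512, 32                        # image dim, text dim, head dim (assumed)
img = rng.standard_normal((256, C))          # a 16x16 feature map, flattened
txt = rng.standard_normal((77, D))           # CLIP-style caption embedding length
Wq = rng.standard_normal((C, d)) * 0.05
Wk = rng.standard_normal((D, d)) * 0.05
Wv = rng.standard_normal((D, C)) * 0.05
fused = text_image_fusion(img, txt, Wq, Wk, Wv)
print(fused.shape)                           # (256, 64)
```

The residual form keeps the image pathway intact when the caption is uninformative, which is one common design choice for conditioning blocks of this kind.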
2. Related Work
2.1. Universal Image Restoration
2.2. Diffusion-Based Restoration
3. Method
3.1. Preliminary
3.2. Overview
3.3. Text–Image Fusion Block
3.4. Efficient Restoration Block
3.5. Efficient Restoration U-Shaped Network for Noise Prediction
3.6. Optimization and Inference
Algorithm 1: Training of ETDiffIR

Algorithm 2: Inference of ETDiffIR. Input: LR image, text caption c, total step T. Output: the restored image.
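The training/inference split above follows the usual diffusion recipe: training regresses the injected noise at a random step, and inference iteratively denoises from pure noise while the condition (LR image plus caption c) steers every step. ETDiffIR itself builds on a mean-reverting SDE formulation (IR-SDE [31]); the sketch below is a generic DDPM-style stand-in that shows only where the conditioning enters, with all names and schedules assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100                                      # total diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)           # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def train_step(x0, cond, predict_noise):
    """One training step: noise clean image x0 to a random step t,
    then regress the injected noise given cond = (LR image, caption)."""
    t = int(rng.integers(T))
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return np.mean((predict_noise(xt, t, cond) - eps) ** 2)

def restore(shape, cond, predict_noise):
    """Ancestral sampling from pure noise; cond steers every reverse step."""
    x = rng.standard_normal(shape)
    for t in reversed(range(T)):
        eps_hat = predict_noise(x, t, cond)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                            # no noise on the final step
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# A zero predictor stands in for the ERUNet noise-prediction network.
dummy = lambda xt, t, cond: np.zeros_like(xt)
loss = train_step(np.zeros((8, 8)), cond=None, predict_noise=dummy)
out = restore((8, 8), cond=None, predict_noise=dummy)
```

In the paper's setting `predict_noise` would be the text-conditioned ERUNet, with the caption embedding fused in via the TIFB at each step.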
4. Experiments
4.1. Datasets
4.2. Implementation Details
4.3. Metrics
4.4. Multiple Degradation Universal Restoration Results
4.5. Single Degradation Results
4.6. Ablation Studies
4.6.1. Importance of TIFB and ERUNet
4.6.2. Effect of Text Prompts
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Chen, C.; Shi, X.; Qin, Y.; Li, X.; Han, X.; Yang, T.; Guo, S. Real-world blind super-resolution via feature matching with implicit high-resolution priors. In Proceedings of the 30th ACM International Conference on Multimedia, Lisbon, Portugal, 10–14 October 2022; pp. 1329–1338. [Google Scholar]
- Ji, S.W.; Lee, J.; Kim, S.W.; Hong, J.P.; Baek, S.J.; Jung, S.W.; Ko, S.J. XYDeblur: Divide and conquer for single image deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 17421–17430. [Google Scholar]
- Yao, M.; Huang, J.; Jin, X.; Xu, R.; Zhou, S.; Zhou, M.; Xiong, Z. Generalized Lightness Adaptation with Channel Selective Normalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 4–6 October 2023; pp. 10668–10679. [Google Scholar]
- Jiang, K.; Wang, Z.; Yi, P.; Chen, C.; Huang, B.; Luo, Y.; Ma, J.; Jiang, J. Multi-scale progressive fusion network for single image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8346–8355. [Google Scholar]
- Tu, Z.; Talebi, H.; Zhang, H.; Yang, F.; Milanfar, P.; Bovik, A.; Li, Y. Maxim: Multi-axis mlp for image processing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 5769–5780. [Google Scholar]
- Chen, J.; Zhao, G. Contrastive Multiscale Transformer for Image Dehazing. Sensors 2024, 24, 2041. [Google Scholar] [CrossRef]
- Zhao, M.; Yang, R.; Hu, M.; Liu, B. Deep Learning-Based Technique for Remote Sensing Image Enhancement Using Multiscale Feature Fusion. Sensors 2024, 24, 673. [Google Scholar] [CrossRef] [PubMed]
- Xu, J.; Chen, Z.X.; Luo, H.; Lu, Z.M. An efficient dehazing algorithm based on the fusion of transformer and convolutional neural network. Sensors 2022, 23, 43. [Google Scholar] [CrossRef] [PubMed]
- Tan, C.; Wang, L.; Cheng, S. Image super-resolution via dual-level recurrent residual networks. Sensors 2022, 22, 3058. [Google Scholar] [CrossRef] [PubMed]
- Han, W.; Zhu, H.; Qi, C.; Li, J.; Zhang, D. High-resolution representations network for single image dehazing. Sensors 2022, 22, 2257. [Google Scholar] [CrossRef] [PubMed]
- Jiang, Y.; Zhang, Z.; Xue, T.; Gu, J. Autodir: Automatic all-in-one image restoration with latent diffusion. arXiv 2023, arXiv:2310.10123. [Google Scholar]
- Kawar, B.; Elad, M.; Ermon, S.; Song, J. Denoising diffusion restoration models. Adv. Neural Inf. Process. Syst. 2022, 35, 23593–23606. [Google Scholar]
- Wang, Y.; Yu, J.; Zhang, J. Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Yang, P.; Zhou, S.; Tao, Q.; Loy, C.C. PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance. arXiv 2023, arXiv:2309.10810. [Google Scholar]
- Garber, T.; Tirer, T. Image Restoration by Denoising Diffusion Models with Iteratively Preconditioned Guidance. arXiv 2023, arXiv:2312.16519. [Google Scholar]
- Zhu, Y.; Zhang, K.; Liang, J.; Cao, J.; Wen, B.; Timofte, R.; Van Gool, L. Denoising Diffusion Models for Plug-and-Play Image Restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 1219–1229. [Google Scholar]
- Li, B.; Liu, X.; Hu, P.; Wu, Z.; Lv, J.; Peng, X. All-in-one image restoration for unknown corruption. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 17452–17462. [Google Scholar]
- Luo, Z.; Gustafsson, F.K.; Zhao, Z.; Sjölund, J.; Schön, T.B. Controlling vision-language models for universal image restoration. arXiv 2023, arXiv:2310.01018. [Google Scholar]
- Park, D.; Lee, B.H.; Chun, S.Y. All-in-one image restoration for unknown degradations using adaptive discriminative filters for specific degradations. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 5815–5824. [Google Scholar]
- Zhang, J.; Huang, J.; Yao, M.; Yang, Z.; Yu, H.; Zhou, M.; Zhao, F. Ingredient-Oriented Multi-Degradation Learning for Image Restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 5825–5835. [Google Scholar]
- Chen, W.T.; Huang, Z.K.; Tsai, C.C.; Yang, H.H.; Ding, J.J.; Kuo, S.Y. Learning multiple adverse weather removal via two-stage knowledge learning and multi-contrastive regularization: Toward a unified model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 17653–17662. [Google Scholar]
- Kim, K.; Oh, Y.; Ye, J.C. Zegot: Zero-shot segmentation through optimal transport of text prompts. arXiv 2023, arXiv:2301.12171. [Google Scholar]
- Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 21–24 June 2022; pp. 10684–10695. [Google Scholar]
- Liu, V.; Chilton, L.B. Design guidelines for prompt engineering text-to-image generative models. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 30 April–5 May 2022; pp. 1–23. [Google Scholar]
- Lyu, Y.; Lin, T.; Li, F.; He, D.; Dong, J.; Tan, T. Deltaedit: Exploring text-free training for text-driven image manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 6894–6903. [Google Scholar]
- Zhu, D.; Chen, J.; Shen, X.; Li, X.; Elhoseiny, M. Minigpt-4: Enhancing vision-language understanding with advanced large language models. arXiv 2023, arXiv:2304.10592. [Google Scholar]
- Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 286–301. [Google Scholar]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 8748–8763. [Google Scholar]
- Trockman, A.; Kolter, J.Z. Patches are all you need? arXiv 2022, arXiv:2201.09792. [Google Scholar]
- Woo, S.; Debnath, S.; Hu, R.; Chen, X.; Liu, Z.; Kweon, I.S.; Xie, S. Convnext v2: Co-designing and scaling convnets with masked autoencoders. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 16133–16142. [Google Scholar]
- Luo, Z.; Gustafsson, F.K.; Zhao, Z.; Sjölund, J.; Schön, T.B. Image restoration with mean-reverting stochastic differential equations. arXiv 2023, arXiv:2301.11699. [Google Scholar]
- Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739. [Google Scholar]
- Ren, W.; Pan, J.; Zhang, H.; Cao, X.; Yang, M.H. Single image dehazing via multi-scale convolutional neural networks with holistic edges. Int. J. Comput. Vis. 2020, 128, 240–259. [Google Scholar] [CrossRef]
- Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H.; Shao, L. Cycleisp: Real image restoration via improved data synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2696–2705. [Google Scholar]
- Jiang, X.; Zhao, C.; Zhu, M.; Hao, Z.; Gao, W. Residual Spatial and Channel Attention Networks for Single Image Dehazing. Sensors 2021, 21, 7922. [Google Scholar] [CrossRef]
- Yan, Q.; Jiang, A.; Chen, K.; Peng, L.; Yi, Q.; Zhang, C. Textual Prompt Guided Image Restoration. arXiv 2023, arXiv:2312.06162. [Google Scholar]
- Xia, B.; Zhang, Y.; Wang, S.; Wang, Y.; Wu, X.; Tian, Y.; Yang, W.; Van Gool, L. Diffir: Efficient diffusion model for image restoration. arXiv 2023, arXiv:2303.09472. [Google Scholar]
- Li, H.; Yang, Y.; Chang, M.; Chen, S.; Feng, H.; Xu, Z.; Li, Q.; Chen, Y. Srdiff: Single image super-resolution with diffusion probabilistic models. Neurocomputing 2022, 479, 47–59. [Google Scholar] [CrossRef]
- Anderson, B.D. Reverse-time diffusion equation models. Stoch. Process. Their Appl. 1982, 12, 313–326. [Google Scholar] [CrossRef]
- Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851. [Google Scholar]
- Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258. [Google Scholar]
- Sinha, D.; El-Sharkawy, M. Thin mobilenet: An enhanced mobilenet architecture. In Proceedings of the 2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York City, NY, USA, 10–12 October 2019; pp. 0280–0285. [Google Scholar]
- Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A convnet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11976–11986. [Google Scholar]
- Xiao, Y.; Yuan, Q.; Jiang, K.; He, J.; Jin, X.; Zhang, L. EDiffSR: An efficient diffusion probabilistic model for remote sensing image super-resolution. IEEE Trans. Geosci. Remote Sens. 2023, 62, 5601514. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- Kloeden, P.E.; Platen, E.; Kloeden, P.E.; Platen, E. Stochastic Differential Equations; Springer: Berlin/Heidelberg, Germany, 1992. [Google Scholar]
- Ma, K.; Duanmu, Z.; Wu, Q.; Wang, Z.; Yong, H.; Li, H.; Zhang, L. Waterloo exploration database: New challenges for image quality assessment models. IEEE Trans. Image Process. 2016, 26, 1004–1016. [Google Scholar] [CrossRef] [PubMed]
- Arbelaez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 898–916. [Google Scholar] [CrossRef] [PubMed]
- Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision, ICCV, Vancouver, BC, Canada, 7–14 July 2001; Volume 2, pp. 416–423. [Google Scholar]
- Yang, F.; Yang, H.; Fu, J.; Lu, H.; Guo, B. Learning texture transformer network for image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5791–5800. [Google Scholar]
- Li, B.; Ren, W.; Fu, D.; Tao, D.; Feng, D.; Zeng, W.; Wang, Z. Benchmarking single-image dehazing and beyond. IEEE Trans. Image Process. 2018, 28, 492–505. [Google Scholar] [CrossRef] [PubMed]
- Xie, C.; Ning, Q.; Dong, W.; Shi, G. Tfrgan: Leveraging text information for blind face restoration with extreme degradation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 2534–2544. [Google Scholar]
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef]
- Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 586–595. [Google Scholar]
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Adv. Neural Inf. Process. Syst. 2017, 30, 6626–6637. [Google Scholar]
- Zhu, X.; Hu, H.; Lin, S.; Dai, J. Deformable convnets v2: More deformable, better results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9308–9316. [Google Scholar]
- Fan, Q.; Chen, D.; Yuan, L.; Hua, G.; Yu, N.; Chen, B. A general decoupled learning framework for parameterized image operators. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 33–47. [Google Scholar] [CrossRef]
- Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11908–11915. [Google Scholar]
- Yang, W.; Tan, R.T.; Feng, J.; Liu, J.; Guo, Z.; Yan, S. Deep joint rain detection and removal from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1357–1366. [Google Scholar]
- Chen, D.; He, M.; Fan, Q.; Liao, J.; Zhang, L.; Hou, D.; Yuan, L.; Hua, G. Gated context aggregation network for image dehazing and deraining. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 1375–1383. [Google Scholar]
- Liu, X.; Ma, Y.; Shi, Z.; Chen, J. Griddehazenet: Attention-based multi-scale network for image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7314–7323. [Google Scholar]
- Song, Y.; He, Z.; Qian, H.; Du, X. Vision transformers for single image dehazing. IEEE Trans. Image Process. 2023, 32, 1927–1941. [Google Scholar] [CrossRef]
- Yang, W.; Tan, R.T.; Feng, J.; Guo, Z.; Yan, S.; Liu, J. Joint rain detection and removal from a single image with contextualized deep networks. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 1377–1393. [Google Scholar] [CrossRef]
- Ren, D.; Zuo, W.; Hu, Q.; Zhu, P.; Meng, D. Progressive image deraining networks: A better and simpler baseline. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3937–3946. [Google Scholar]
- Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H.; Shao, L. Multi-stage progressive image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 10–25 June 2021; pp. 14821–14831. [Google Scholar]
- Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Color image denoising via sparse 3D collaborative filtering with grouping constraint in luminance-chrominance space. In Proceedings of the 2007 IEEE International Conference on Image Processing, San Antonio, TX, USA, 16–19 September 2007; Volume 1, pp. I–313. [Google Scholar]
- Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155. [Google Scholar] [CrossRef]
- Zhang, K.; Zuo, W.; Zhang, L. FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Trans. Image Process. 2018, 27, 4608–4622. [Google Scholar] [CrossRef]
- Fan, C.M.; Liu, T.J.; Liu, K.H. SUNet: Swin transformer UNet for image denoising. In Proceedings of the 2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, 28 May–1 June 2022; pp. 2333–2337. [Google Scholar]
| Method | Denoising (CBSD68) PSNR↑ | SSIM↑ | LPIPS↓ | FID↓ | Deraining (Rain100L) PSNR↑ | SSIM↑ | LPIPS↓ | FID↓ | Dehazing (SOTS) PSNR↑ | SSIM↑ | LPIPS↓ | FID↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SD [23] | 20.13 | 0.476 | 0.286 | 109.64 | 26.21 | 0.712 | 0.094 | 32.62 | 25.49 | 0.805 | 0.098 | 22.40 |
| AirNet [17] | 28.00 | 0.797 | 0.209 | 88.22 | 34.90 | 0.968 | 0.028 | 19.03 | 27.94 | 0.962 | 0.056 | 18.39 |
| TKMANet [21] | 23.94 | 0.556 | 0.275 | 95.68 | 34.83 | 0.970 | 0.021 | 13.10 | 30.38 | 0.957 | 0.047 | 8.84 |
| Universal-IR [18] | 24.36 | 0.579 | 0.269 | 75.03 | 35.28 | 0.968 | 0.017 | 11.78 | 30.04 | 0.962 | 0.038 | 5.57 |
| Ours-LQ | 25.30 | 0.641 | 0.261 | 70.46 | 36.01 | 0.966 | 0.016 | 11.61 | 31.21 | 0.964 | 0.036 | 5.46 |
| Ours | 25.84 | 0.653 | 0.252 | 67.97 | 36.08 | 0.969 | 0.016 | 11.55 | 31.54 | 0.968 | 0.034 | 5.38 |
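The PSNR columns in the tables above are straightforward to reproduce from image pairs. As a reference point, here is a minimal standalone PSNR implementation (the `data_range=255` default and the toy images are assumptions for illustration; the paper's exact evaluation pipeline is not specified here):

```python
import numpy as np

def psnr(ref, test, data_range=255.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - test) ** 2)
    if mse == 0.0:
        return float("inf")            # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

clean = np.full((32, 32), 100.0)
degraded = clean + 16.0                # uniform error of 16 gray levels
print(round(psnr(clean, degraded), 2))  # 24.05
```

SSIM, LPIPS, and FID require windowed statistics or pretrained networks and are typically taken from library implementations rather than written by hand.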
| Dataset | Method | PSNR↑ | SSIM↑ | LPIPS↓ | FID↓ |
|---|---|---|---|---|---|
| Dehaze | GCANet [60] | 26.59 | 0.935 | 0.052 | 11.52 |
| | GridDehazeNet [61] | 25.86 | 0.944 | 0.048 | 10.62 |
| | MAXIM [5] | 29.12 | 0.932 | 0.043 | 8.12 |
| | DehazeFormer [62] | 30.29 | 0.964 | 0.045 | 7.58 |
| | IR-SDE [31] | 25.25 | 0.906 | 0.060 | 8.33 |
| | Ours | 30.44 | 0.934 | 0.027 | 5.37 |
| Deraining | JORDER [63] | 26.25 | 0.835 | 0.197 | 94.58 |
| | PReNet [64] | 29.46 | 0.899 | 0.128 | 52.67 |
| | MPRNet [65] | 30.41 | 0.891 | 0.158 | 61.59 |
| | MAXIM [5] | 30.81 | 0.903 | 0.133 | 58.72 |
| | IR-SDE [31] | 31.65 | 0.904 | 0.047 | 18.64 |
| | Ours | 31.35 | 0.907 | 0.038 | 14.75 |
| Denoising | CBM3D [66] | 24.66 | 0.675 | 0.467 | 144.48 |
| | DnCNN [67] | 28.01 | 0.802 | 0.221 | 87.23 |
| | FFDNet [68] | 27.97 | 0.789 | 0.244 | 98.76 |
| | SUNet [69] | 27.88 | 0.804 | 0.223 | 68.76 |
| | IR-SDE [31] | 25.54 | 0.689 | 0.219 | 97.95 |
| | Ours | 26.28 | 0.695 | 0.213 | 63.71 |
| Method | TIFB | ERUNet | UNet | Param. (M) | GFLOPs | PSNR↑ | SSIM↑ | LPIPS↓ | FID↓ |
|---|---|---|---|---|---|---|---|---|---|
| Baseline | ✗ | ✗ | ✓ | 48.98 | 129.02 | 27.66 | 0.787 | 0.133 | 31.11 |
| Model-1 | ✗ | ✓ | ✗ | 33.73 | 75.81 | 28.54 | 0.798 | 0.135 | 30.04 |
| Model-2 | ✓ | ✗ | ✓ | 174.18 | 129.52 | 28.86 | 0.841 | 0.085 | 28.76 |
| Ours | ✓ | ✓ | ✗ | 158.93 | 76.31 | 29.36 | 0.845 | 0.093 | 27.94 |
Yu, B.; Fan, Z.; Xiang, X.; Chen, J.; Huang, D. Universal Image Restoration with Text Prompt Diffusion. Sensors 2024, 24, 3917. https://doi.org/10.3390/s24123917