Deep HDR Deghosting by Motion-Attention Fusion Network
Abstract
1. Introduction
- We demonstrate that motion information from the LDR images can distinguish saturated areas from motion areas. Hence, we propose to use motion information (e.g., optical flow) to guide the fusion of foreground and background details and to prevent ghosting in the HDR image.
- We propose an end-to-end attention-based fusion framework that uses a motion estimation module and a correlation estimation module to obtain optical-flow and image-correlation clues, respectively. The estimated optical flow and the correlation clue then guide an attention-based fusion module to pay more attention to features from saturated and motion areas, directing the network to reconstruct credible HDR details in the presence of saturation, occlusion, and underexposure.
- On datasets both with and without ground truth, we report state-of-the-art fusion results.
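The second contribution can be sketched as a minimal fusion step. This is an illustrative sketch, not the authors' exact DDFNet architecture: the module name, channel counts, and the way the flow and correlation clues are injected are assumptions. It only shows the core idea of turning motion and correlation cues into per-channel attention weights that modulate the non-reference features before merging.

```python
import torch
import torch.nn as nn

class MotionAttentionFusion(nn.Module):
    """Hypothetical fusion step: optical-flow and correlation clues are
    converted into per-pixel attention maps that re-weight the non-reference
    features before they are merged with the reference features."""

    def __init__(self, channels=64):
        super().__init__()
        # flow (2 ch) + correlation (1 ch) + moving features -> attention maps
        self.attn = nn.Sequential(
            nn.Conv2d(2 + 1 + channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Sigmoid(),  # attention weights in [0, 1]
        )
        self.merge = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feat_ref, feat_mov, flow, corr):
        # attention conditioned on the motion (flow) and correlation clues
        a = self.attn(torch.cat([flow, corr, feat_mov], dim=1))
        # suppress ghost-prone regions of the non-reference features, then merge
        return self.merge(torch.cat([feat_ref, a * feat_mov], dim=1))

fusion = MotionAttentionFusion(64)
f_ref = torch.randn(1, 64, 32, 32)  # reference-exposure features
f_mov = torch.randn(1, 64, 32, 32)  # non-reference (moving) features
flow = torch.randn(1, 2, 32, 32)    # estimated optical flow (dx, dy)
corr = torch.randn(1, 1, 32, 32)    # correlation clue
out = fusion(f_ref, f_mov, flow, corr)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```

The sigmoid gate lets the network down-weight moving, saturated, or occluded regions of the non-reference frame instead of discarding them outright, which is what distinguishes attention-based fusion from hard masking.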
2. Related Work
3. Proposed Method
3.1. Pipeline
3.2. Network Structure
3.2.1. Encoder
3.2.2. Motion Estimation Module
3.2.3. Correlation Estimation Module
3.2.4. Fusion Module
3.2.5. Decoder
3.3. Training Details
4. Experimental Results
4.1. Comparison with the State of the Art
4.1.1. Comparison on Dataset with Ground-Truth
4.1.2. Comparison on Dataset without Ground-Truth
4.2. Further Analysis
4.2.1. Analysis of Network Structure
4.2.2. Analysis of Encoder
4.2.3. Image Fusion for Denoising
4.2.4. Image Fusion for Traffic Scenes
4.2.5. Limitation of DDFNet
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Delbracio, M.; Kelly, D.; Brown, M.S.; Milanfar, P. Mobile computational photography: A tour. Annu. Rev. Vis. Sci. 2021, 7, 571–604.
- Wang, L.; Yoon, K.J. Deep learning for HDR imaging: State-of-the-art and future trends. IEEE Trans. Pattern Anal. Mach. Intell. 2021.
- Zhang, X. Benchmarking and comparing multi-exposure image fusion algorithms. Inf. Fusion 2021, 74, 111–131.
- Fang, Y.; Zhu, H.; Ma, K.; Wang, Z.; Li, S. Perceptual evaluation for multi-exposure image fusion of dynamic scenes. IEEE Trans. Image Process. 2019, 29, 1127–1138.
- Sen, P.; Kalantari, N.K.; Yaesoubi, M.; Darabi, S.; Goldman, D.B.; Shechtman, E. Robust patch-based HDR reconstruction of dynamic scenes. ACM Trans. Graph. 2012, 31, 203:1–203:11.
- Srikantha, A.; Sidibé, D. Ghost detection and removal for high dynamic range images: Recent advances. Signal Process. Image Commun. 2012, 27, 650–662.
- Tiwari, G.; Rani, P. A review on high-dynamic-range imaging with its technique. Int. J. Signal Process. Image Process. Pattern Recognit. 2015, 8, 93–100.
- Tursun, O.T.; Akyüz, A.O.; Erdem, A.; Erdem, E. The state of the art in HDR deghosting: A survey and evaluation. In Computer Graphics Forum; Wiley Online Library: Hoboken, NJ, USA, 2015; Volume 34, pp. 683–707.
- Johnson, A.K. High dynamic range imaging—A review. Int. J. Image Process. (IJIP) 2015, 9, 198.
- Hu, J.; Gallo, O.; Pulli, K.; Sun, X. HDR deghosting: How to deal with saturation? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 1163–1170.
- Lee, C.; Li, Y.; Monga, V. Ghost-free high dynamic range imaging via rank minimization. IEEE Signal Process. Lett. 2014, 21, 1045–1049.
- Oh, T.H.; Lee, J.Y.; Tai, Y.W.; Kweon, I.S. Robust high dynamic range imaging by rank minimization. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 1219–1232.
- Li, Z.; Zheng, J.; Zhu, Z.; Wu, S. Selectively detail-enhanced fusion of differently exposed images with moving objects. IEEE Trans. Image Process. 2014, 23, 4372–4382.
- Ma, K.; Li, H.; Yong, H.; Wang, Z.; Meng, D.; Zhang, L. Robust multi-exposure image fusion: A structural patch decomposition approach. IEEE Trans. Image Process. 2017, 26, 2519–2532.
- Wu, S.; Xu, J.; Tai, Y.W.; Tang, C.K. Deep high dynamic range imaging with large foreground motions. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 117–132.
- Yan, Q.; Gong, D.; Shi, Q.; Hengel, A.v.d.; Shen, C.; Reid, I.; Zhang, Y. Attention-guided network for ghost-free high dynamic range imaging. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1751–1760.
- Prabhakar, K.R.; Agrawal, S.; Singh, D.K.; Ashwath, B.; Babu, R.V. Towards practical and efficient high-resolution HDR deghosting with CNN. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2020; pp. 497–513.
- Kalantari, N.K.; Ramamoorthi, R. Deep high dynamic range imaging of dynamic scenes. ACM Trans. Graph. 2017, 36, 144:1–144:12.
- Liu, C. Beyond Pixels: Exploring New Representations and Applications for Motion Analysis. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2009.
- Peng, F.; Zhang, M.; Lai, S.; Tan, H.; Yan, S. Deep HDR reconstruction of dynamic scenes. In Proceedings of the 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC), Chongqing, China, 27–29 June 2018; pp. 347–351.
- Metwaly, K.; Monga, V. Attention-mask dense merger (AttenDense) deep HDR for ghost removal. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 2623–2627.
- Niu, Y.; Wu, J.; Liu, W.; Guo, W.; Lau, R.W. HDR-GAN: HDR image reconstruction from multi-exposed LDR images with large motions. IEEE Trans. Image Process. 2021, 30, 3885–3896.
- Choi, S.; Cho, J.; Song, W.; Choe, J.; Yoo, J.; Sohn, K. Pyramid inter-attention for high dynamic range imaging. Sensors 2020, 20, 5102.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
- Yan, Q.; Gong, D.; Zhang, P.; Shi, Q.; Sun, J.; Reid, I.; Zhang, Y. Multi-scale dense networks for deep high dynamic range imaging. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 41–50.
- Dosovitskiy, A.; Fischer, P.; Ilg, E.; Hausser, P.; Hazirbas, C.; Golkov, V.; Van Der Smagt, P.; Cremers, D.; Brox, T. FlowNet: Learning optical flow with convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 2758–2766.
- Sun, D.; Yang, X.; Liu, M.Y.; Kautz, J. PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8934–8943.
- Bhat, G.; Danelljan, M.; Van Gool, L.; Timofte, R. Deep burst super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 9209–9218.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2472–2481.
- Yu, F.; Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv 2015, arXiv:1511.07122.
- Mantiuk, R.; Kim, K.J.; Rempel, A.G.; Heidrich, W. HDR-VDP-2: A calibrated visual metric for visibility and quality predictions in all luminance conditions. ACM Trans. Graph. 2011, 30, 1–14.
- Qin, X.; Shen, J.; Mao, X.; Li, X.; Jia, Y. Robust match fusion using optimization. IEEE Trans. Cybern. 2014, 45, 1549–1560.
- Photomatix. Commercially Available HDR Processing Software. Available online: https://www.hdrsoft.com/ (accessed on 11 October 2022).
- Li, S.; Kang, X. Fast multi-exposure image fusion with median filter and recursive filter. IEEE Trans. Consum. Electron. 2012, 58, 626–632.
- Zheng, J.; Li, Z.; Zhu, Z.; Wu, S.; Rahardja, S. Hybrid patching for a sequence of differently exposed images with moving objects. IEEE Trans. Image Process. 2013, 22, 5190–5201.
- Criminisi, A.; Pérez, P.; Toyama, K. Region filling and object removal by exemplar-based image inpainting. IEEE Trans. Image Process. 2004, 13, 1200–1212.
- Van Engelen, J.E.; Hoos, H.H. A survey on semi-supervised learning. Mach. Learn. 2020, 109, 373–440.
- Wang, L.; Yoon, K.J. Knowledge distillation and student-teacher learning for visual intelligence: A review and new outlooks. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3048–3068.
| Block | Layer | Filter Size | Dilation | Padding | Input | Output |
|---|---|---|---|---|---|---|
| Conv_layer | Conv + ReLU | 3 × 3 | 1 | 1 | 64 | 128 |
| Conv_layer | Conv + ReLU | 3 × 3 | 1 | 1 | 128 | 64 |
| DenseBlock 1 | Conv | 3 × 3 | 2 | 2 | 64 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 96 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 128 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 160 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 192 | 32 |
| Conv_layer | Conv | 3 × 3 | 1 | 1 | 224 | 64 |
| DenseBlock 2 | Conv | 3 × 3 | 2 | 2 | 64 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 96 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 128 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 160 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 192 | 32 |
| Conv_layer | Conv | 3 × 3 | 1 | 1 | 224 | 64 |
| DenseBlock 3 | Conv | 3 × 3 | 2 | 2 | 64 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 96 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 128 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 160 | 32 |
|  | Conv | 3 × 3 | 2 | 2 | 192 | 32 |
| Conv_layer | Conv | 3 × 3 | 1 | 1 | 224 | 64 |
| Tail | Conv | 3 × 3 | 1 | 1 | 192 | 64 |
|  | Conv | 3 × 3 | 1 | 1 | 64 | 64 |
|  | Conv + ReLU | 3 × 3 | 1 | 1 | 64 | 3 |
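One dense block from the table above (five dilated convolutions with growth rate 32, dilation 2, padding 2, inputs growing 64 → 96 → 128 → 160 → 192) plus its channel-reducing Conv_layer (224 → 64) could be sketched as follows. This is an assumption-laden sketch, not the verified DDFNet code: the 3 × 3 kernel is inferred from padding = dilation preserving the spatial size, and the ReLU placement inside the block is assumed.

```python
import torch
import torch.nn as nn

class DilatedDenseBlock(nn.Module):
    """Sketch of one dilated dense block from the network-structure table:
    each 3x3 conv (dilation 2, padding 2) sees the concatenation of the block
    input and all earlier layer outputs, then a 3x3 conv reduces 224 -> 64."""

    def __init__(self, in_ch=64, growth=32, n_layers=5):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(
                nn.Conv2d(ch, growth, kernel_size=3, dilation=2, padding=2))
            ch += growth  # input channels: 64 -> 96 -> 128 -> 160 -> 192
        # channel-reducing Conv_layer after the dense block (224 -> 64)
        self.transition = nn.Conv2d(ch, in_ch, kernel_size=3, padding=1)

    def forward(self, x):
        feats = [x]
        for conv in self.layers:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return self.transition(torch.cat(feats, dim=1))  # back to 64 channels

block = DilatedDenseBlock()
y = block(torch.randn(1, 64, 48, 48))
print(y.shape)  # torch.Size([1, 64, 48, 48])
```

Dilation widens the receptive field without downsampling, which helps fill large saturated regions while dense connections keep earlier features accessible.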
| Methods | Pre-Align. (O.F.) | Pre-Align. (Homo.) | Boundary Cropping | PSNR-μ | SSIM-μ | PSNR-L | SSIM-L | HDR-VDP-2 |
|---|---|---|---|---|---|---|---|---|
| Sen [5] |  |  |  | 43.49 | 0.9860 | 40.14 | 0.9764 | 65.58 |
| Kalantari [18] | ✓ |  | ✓ | 42.17 | 0.9828 | 42.26 | 0.9841 | 67.88 |
| DeepHDR [15] |  | ✓ |  | 44.44 | 0.9917 | 44.01 | 0.9902 | 62.82 |
| Prabhakar [17] | ✓ | ✓ | ✓ | 42.82 | – | 41.33 | – | – |
| AHDRNet [16] |  |  |  | 46.16 | 0.9927 | 43.24 | 0.9901 | 68.46 |
| Ours |  |  |  | 46.53 | 0.9924 | 43.38 | 0.9916 | 69.17 |
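The metrics in the tables follow the usual HDR deghosting convention: PSNR-L and SSIM-L are computed on the linear HDR values, while PSNR-μ and SSIM-μ are computed after μ-law tonemapping, commonly with μ = 5000 as popularized by Kalantari and Ramamoorthi [18]. A minimal sketch of the two PSNR variants, assuming that convention (the exact μ used here is an assumption):

```python
import numpy as np

MU = 5000.0  # mu-law compression constant commonly used in HDR deghosting work

def mu_law(h):
    """Tonemap a linear HDR image in [0, 1] with the mu-law."""
    return np.log1p(MU * h) / np.log1p(MU)

def psnr(a, b):
    """PSNR in dB for images with peak value 1."""
    mse = np.mean((a - b) ** 2)
    return 10.0 * np.log10(1.0 / mse)

# PSNR-L compares linear HDR values; PSNR-mu compares tonemapped values,
# which weights errors in dark regions more heavily (closer to perception).
gt = np.clip(np.random.rand(64, 64, 3), 0.0, 1.0)
pred = np.clip(gt + np.random.normal(0.0, 0.01, gt.shape), 0.0, 1.0)
print(psnr(pred, gt))                   # PSNR-L
print(psnr(mu_law(pred), mu_law(gt)))   # PSNR-mu
```

Because the μ-law expands shadows, PSNR-μ and PSNR-L can rank methods differently, which is why both are reported.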
| Modules | PSNR-μ | SSIM-μ | PSNR-L | SSIM-L | HDR-VDP-2 |
|---|---|---|---|---|---|
| (a) No Attention | 40.38 | 0.9886 | 39.75 | 0.9813 | 67.50 |
| (b) Only Corre. Attention | 46.38 | 0.9918 | 43.35 | 0.9915 | 68.45 |
| (c) Only Motion Attention | 45.50 | 0.9911 | 42.52 | 0.9892 | 68.09 |
| (d) Corre. + Motion + Warp | 46.39 | 0.9922 | 43.33 | 0.9915 | 68.48 |
| (e) Corre. + Motion + Atten. | 46.53 | 0.9924 | 43.38 | 0.9916 | 69.17 |
| Encoder Input | PSNR-μ | SSIM-μ | PSNR-L | SSIM-L | HDR-VDP-2 |
|---|---|---|---|---|---|
| (a) LDR | 43.42 | 0.9890 | 40.32 | 0.9834 | 67.32 |
| (b) LDR + GC | 46.53 | 0.9924 | 43.38 | 0.9916 | 69.17 |
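The "LDR + GC" row indicates that the encoder receives each LDR frame together with a gamma-corrected counterpart mapped to the HDR domain. A common form of this mapping (used, e.g., by Kalantari and Ramamoorthi [18]) is H_i = L_i^γ / t_i with γ = 2.2, where t_i is the exposure time; the exact γ and normalization used by DDFNet are assumptions here.

```python
import numpy as np

GAMMA = 2.2  # assumed gamma; the value used by DDFNet is not stated here

def ldr_to_hdr_domain(ldr, exposure_time):
    """Map an LDR image in [0, 1] to the linear HDR domain by gamma
    correction and exposure normalization: H = L**gamma / t."""
    return np.power(ldr, GAMMA) / exposure_time

# Concatenate the LDR frame with its HDR-domain version along channels,
# giving the encoder a 6-channel input per frame.
ldr = np.random.rand(8, 8, 3)
x = np.concatenate([ldr, ldr_to_hdr_domain(ldr, 4.0)], axis=-1)
print(x.shape)  # (8, 8, 6)
```

Feeding both domains lets the network detect saturation in the LDR domain while aligning brightness across exposures in the HDR domain, which is consistent with the large gain from (a) to (b) in the table.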
Share and Cite
Xiao, Y.; Veelaert, P.; Philips, W. Deep HDR Deghosting by Motion-Attention Fusion Network. Sensors 2022, 22, 7853. https://doi.org/10.3390/s22207853