Multiscale Normalization Attention Network for Water Body Extraction from Remote Sensing Imagery
Abstract
1. Introduction
- To strengthen the attention representation of water body semantic features, a multiscale normalization attention (MSNA) module was designed. It derives attention weights from batch normalization (BN) rather than global average pooling (GAP), which reduces the number of model parameters while retaining spatial information. In addition, the grouping strategy used in MSNA augments feature representations with multiscale context information and establishes long-distance feature dependencies between channels.
- To achieve high-level semantic understanding of multiscale water bodies, an optimized atrous spatial pyramid pooling (OASPP) module was designed. OASPP adds a global maximum pooling branch to ASPP, which alleviates the negative impact of GAP fusing too much noise.
- To reduce training time and accelerate model convergence, a feature-enhancing head module (FEH) was designed. It uses a three-layer convolution to refine the representation for decoding. Furthermore, it concatenates average pooling and maximum pooling to compress the size of the input, a step whose effect on model performance has been shown to be negligible.
- Based on the above modules, MSNANet is proposed to extract water bodies from remote sensing imagery (RSI). MSNANet fully samples multiscale context information in the encoder stage and reconstructs the resolution in the decoder stage to achieve accurate dense prediction.
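
The BN-based weighting named in the first contribution can be illustrated with a minimal sketch in the style of a normalization-based attention module, where channel importance is read from the BN scale factors instead of from a pooled descriptor. This is an assumption about the general mechanism, not the authors' MSNA implementation: the grouping strategy and the multiscale branches are omitted, and the function name `bn_channel_attention`, the flattened channel layout, and the `gamma` and `beta` values are hypothetical.

```python
import math

def bn_channel_attention(x, gamma, beta, eps=1e-5):
    """Gate each channel with weights derived from BN scale factors.

    x:     list of C channels, each a flat list of floats (all positions of a batch)
    gamma: list of C BN scale factors (learned in a real network, fixed here)
    beta:  list of C BN shift factors
    """
    # Channel importance: w_c = |gamma_c| / sum_j |gamma_j|
    total = sum(abs(g) for g in gamma)
    weights = [abs(g) / total for g in gamma]

    out = []
    for channel, g, b, w in zip(x, gamma, beta, weights):
        mean = sum(channel) / len(channel)
        var = sum((v - mean) ** 2 for v in channel) / len(channel)
        std = math.sqrt(var + eps)
        gated = []
        for v in channel:
            bn = g * (v - mean) / std + b              # batch normalization
            gate = 1.0 / (1.0 + math.exp(-w * bn))     # sigmoid of the weighted BN output
            gated.append(v * gate)                     # per-position gate, no spatial pooling
        out.append(gated)
    return weights, out

# Hypothetical two-channel input with four positions per channel.
x = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 1.5, 1.5]]
weights, out = bn_channel_attention(x, gamma=[3.0, 1.0], beta=[0.0, 0.0])
```

Because the gate is computed per position from the normalized activation, the spatial layout is kept, and the only parameters involved are the scale and shift factors that the BN layer already has.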
2. Related Works
2.1. Dilated Convolution
2.2. Attention Mechanism
2.3. Batch Normalization
3. The Proposed Method
3.1. Multiscale Normalization Attention Module
3.2. Optimized Atrous Spatial Pyramid Pooling Module
3.3. Head Module for Feature Enhancing
4. Experiments
4.1. Dataset
4.1.1. Surface Water Dataset
4.1.2. Qinghai–Tibet Plateau Lake Dataset
4.2. Experimental Details
4.3. Evaluation Metrics
4.4. Results and Analysis
4.4.1. Results for the Surface Water Dataset
4.4.2. Results for the Qinghai–Tibet Plateau Lake Dataset
4.5. Ablation Study
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Hyperparameter | Setting |
---|---|
Batch size | 4 |
Loss function | Cross entropy loss |
Optimizer | Adadelta |
Initial learning rate | 0.1 |
Maximum epochs | 200 |
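
The settings in the table above can be collected into a configuration, together with a sketch of the Adadelta update rule they refer to. The decay rate `rho` and the constant `eps` are assumed values (common defaults), not taken from the paper, and the quadratic toy objective only stands in for the cross-entropy loss.

```python
import math

# Settings as listed in the hyperparameter table.
CONFIG = {
    "batch_size": 4,
    "loss": "cross_entropy",
    "optimizer": "adadelta",
    "initial_lr": 0.1,
    "max_epochs": 200,
}

def adadelta_step(theta, grad, state, lr, rho=0.9, eps=1e-6):
    """One Adadelta update for a scalar parameter.

    state holds running averages of squared gradients and squared updates.
    rho and eps are assumed defaults, not values reported in the paper.
    """
    state["sq_grad"] = rho * state["sq_grad"] + (1 - rho) * grad ** 2
    delta = math.sqrt(state["sq_delta"] + eps) / math.sqrt(state["sq_grad"] + eps) * grad
    state["sq_delta"] = rho * state["sq_delta"] + (1 - rho) * delta ** 2
    return theta - lr * delta

# Toy objective f(theta) = (theta - 3)^2, one update per "epoch" for illustration.
theta, state = 0.0, {"sq_grad": 0.0, "sq_delta": 0.0}
for _ in range(CONFIG["max_epochs"]):
    grad = 2 * (theta - 3.0)
    theta = adadelta_step(theta, grad, state, lr=CONFIG["initial_lr"])
```

Adadelta scales each step by the ratio of the two running averages, so the learning rate of 0.1 acts as an additional multiplier on an already adaptive step rather than as a fixed step size.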
Method | OA | F1-Score | Kappa | WIoU | MIoU |
---|---|---|---|---|---|
MSNANet | 94.44 | 92.12 | 0.87908 | 85.4 | 88.66 |
Attention U-Net | 93.34 | 90.56 | 0.85479 | 82.76 | 86.54 |
DeeplabV3+ | 94.3 | 91.87 | 0.87487 | 84.96 | 88.28 |
PSPNet | 94.17 | 91.73 | 0.87299 | 84.72 | 88.12 |
SegNet | 94.15 | 91.55 | 0.8701 | 84.41 | 87.87 |
LANet | 94.14 | 91.62 | 0.87114 | 84.52 | 87.96 |
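
The columns of the table above can be computed from a binary confusion matrix. The sketch below assumes that WIoU denotes the IoU of the water class, that MIoU averages the water and background IoU, and that F1-Score refers to the water class. These readings are assumptions, although the water-class reading is consistent with the MSNANet row, where 2 × 85.4 / (100 + 85.4) is approximately 92.1. The function name `water_metrics` and the example counts are hypothetical.

```python
def water_metrics(tp, fp, fn, tn):
    """Segmentation metrics for a binary water/background confusion matrix.

    tp: water pixels predicted as water      fp: background predicted as water
    fn: water pixels predicted as background tn: background predicted as background
    """
    total = tp + fp + fn + tn
    oa = (tp + tn) / total                                  # overall accuracy
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)      # water-class F1
    # Cohen's kappa: agreement beyond what class frequencies alone would give.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    water_iou = tp / (tp + fp + fn)
    background_iou = tn / (tn + fp + fn)
    miou = (water_iou + background_iou) / 2
    return {"OA": oa, "F1": f1, "Kappa": kappa, "WIoU": water_iou, "MIoU": miou}

# Hypothetical pixel counts, not data from the paper.
m = water_metrics(tp=40, fp=5, fn=5, tn=50)
```

For a single class, F1 and IoU are tied by F1 = 2·IoU / (1 + IoU), which is why the F1-Score and WIoU columns rank the methods identically.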
Method | Params (M) | Training Time (s/epoch) | FLOPs (G) |
---|---|---|---|
MSNANet | 72.3 | 627 | 61.943 |
Attention U-Net | 34.9 | 459 | 66.636 |
DeeplabV3+ | 54.7 | 348 | 20.757 |
PSPNet | 46.8 | 413 | 46.112 |
SegNet | 29.4 | 271 | 40.169 |
LANet | 24.0 | 255 | 8.309 |
Method | OA | F1-Score | Kappa | WIoU | MIoU |
---|---|---|---|---|---|
MSNANet | 98.47 | 98.11 | 0.96824 | 96.29 | 96.88 |
Attention U-Net | 98.24 | 97.87 | 0.96347 | 95.76 | 96.42 |
DeeplabV3+ | 98.39 | 98.03 | 0.96691 | 96.14 | 96.75 |
PSPNet | 98.4 | 98.03 | 0.9671 | 96.15 | 96.77 |
SegNet | 98.12 | 97.69 | 0.96135 | 95.49 | 96.21 |
LANet | 98.29 | 97.89 | 0.96457 | 95.86 | 96.52 |
Dataset | Method | Params (M) | OA | F1-Score | Kappa | WIoU | MIoU | Training Time (s/epoch) |
---|---|---|---|---|---|---|---|---|
SW dataset | MSNANet | 72.3 | 94.44 | 92.12 | 0.87908 | 85.4 | 88.66 | 627 |
SW dataset | MSNANet (without FEH) | 72.2 | 94.22 | 91.63 | 0.87214 | 84.55 | 88.05 | 2629 |
SW dataset | MSNANet (without OASPP) | 39.6 | 94.33 | 91.91 | 0.87531 | 85.04 | 88.32 | 373 |
SW dataset | MSNANet (without MSNA) | 70.6 | 94.26 | 91.79 | 0.87381 | 84.83 | 88.19 | 611 |
QTPL dataset | MSNANet | 72.3 | 98.47 | 98.11 | 0.96824 | 96.29 | 96.88 | 326 |
QTPL dataset | MSNANet (without FEH) | 72.2 | 98.31 | 97.96 | 0.96585 | 96.01 | 96.65 | 1097 |
QTPL dataset | MSNANet (without OASPP) | 39.6 | 98.41 | 98.05 | 0.96713 | 96.17 | 96.77 | 157 |
QTPL dataset | MSNANet (without MSNA) | 70.6 | 98.37 | 97.98 | 0.96621 | 96.06 | 96.68 | 254 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lyu, X.; Fang, Y.; Tong, B.; Li, X.; Zeng, T. Multiscale Normalization Attention Network for Water Body Extraction from Remote Sensing Imagery. Remote Sens. 2022, 14, 4983. https://doi.org/10.3390/rs14194983