Research on Ground Object Classification Method of High Resolution Remote-Sensing Images Based on Improved DeeplabV3+
Abstract
1. Introduction
2. Related Research
2.1. DeeplabV3+ Network Model
2.2. Modified Aligned Xception Network Model
3. Improved DeeplabV3+ Network Model
3.1. Optimized Feature Extraction Module
3.2. Adding the CA Module
3.3. Optimizing the Loss Function
4. Experiment and Result Analysis
4.1. Experimental Data
4.2. Experimental Environment and Evaluation Criteria
4.3. Experimental Process
4.4. Ablation Experiment
4.5. Comparison of Segmentation Performance of Different Methods
4.6. Comparison with Existing Ground-Object Classification Methods for High-Resolution Remote-Sensing Images
- (1) The ground-object classification methods adopted in references [32,33,34] require manual selection of the spectral, textural, geometric, shadow, background, and geoscience auxiliary features of the ground-object categories in remote-sensing images, so the segmentation results depend on how well those features are chosen. In this paper, a deep neural network extracts image features automatically, which makes full use of the strengths of deep neural networks in image segmentation and offers a new approach to ground-object classification of remote-sensing images.
- (2) The advantage of the method adopted in reference [35] is that only a small number of images are needed as the training set: the RGB value of each pixel and one additional feature related to local entropy are extracted, and the final classification accuracy differs by only 0.06% from that of a deep-learning model (FCN) trained on a large number of images. The classification rules of the method are explained in that paper: when a pixel in a WHDLD image has high RGB values, it is classified as a building, so roads and sidewalks are also assigned to the building class (see the sketch after this list). Moreover, the method takes a long time to train even on a small image dataset, and its classification rules prevent it from accurately separating classes with similar RGB values. In contrast, although deep learning requires a large number of training images, it can segment image datasets with more label classes more effectively.
- (3) In this paper, an attention mechanism is applied to the deep-neural-network model for ground-object classification of remote-sensing images, and the lightweight MobileNetV2 network is used as the backbone feature-extraction network, which improves the classification accuracy of the model while reducing its parameters and training cost.
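To make the limitation of the RGB-based rule in point (2) concrete, the following minimal Python sketch applies a per-pixel brightness threshold. The threshold value and the building/non-building labeling are illustrative assumptions, not the exact rule published in reference [35]:

```python
import numpy as np

def rgb_rule_classify(image, bright_thresh=180):
    """Toy per-pixel rule: bright (high-RGB) pixels -> 'building'.

    `bright_thresh` is a hypothetical value chosen for illustration.
    Roads and sidewalks also have high RGB values, so they fall into
    the same class -- the confusion discussed in point (2) above.
    """
    # image: H x W x 3 uint8 array
    brightness = image.astype(np.float32).mean(axis=2)
    return (brightness > bright_thresh).astype(np.uint8)  # 1 = building-like

# Example: a bright concrete road patch is misclassified as building.
road_patch = np.full((4, 4, 3), 200, dtype=np.uint8)
print(rgb_rule_classify(road_patch))  # all ones
```

Because such rules act on color values alone, classes with similar RGB statistics are inherently inseparable, whereas a deep network can also rely on learned texture and context features.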
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Shao, Z.; Tang, P.; Wang, Z.; Saleem, N.; Yam, S.; Sommai, C. BRRNet: A fully convolutional neural network for automatic building extraction from high-resolution remote sensing images. Remote Sens. 2020, 12, 1050.
2. Huang, X.; Wang, Y. Investigating the effects of 3D urban morphology on the surface urban heat island effect in urban functional zones by using high-resolution remote sensing data: A case study of Wuhan, Central China. ISPRS J. Photogramm. Remote Sens. 2019, 152, 119–131.
3. Chen, Y.; Fan, R.; Yang, X.; Wang, J.; Latif, A. Extraction of urban water bodies from high-resolution remote-sensing imagery using deep learning. Water 2018, 10, 585.
4. Yuan, Y.; Chen, X.; Wang, J. Object-contextual representations for semantic segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 173–190.
5. Cao, X.; Li, T.; Li, H.; Xia, S.; Ren, F.; Sun, Y.; Xu, X. A Robust Parameter-Free Thresholding Method for Image Segmentation. IEEE Access 2019, 7, 3448–3458.
6. Li, J.; Cheng, X.; Wu, Z.; Guo, W. An Over-Segmentation-Based Uphill Clustering Method for Individual Trees Extraction in Urban Street Areas from MLS Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2206–2221.
7. Pan, S.; Tao, Y.; Nie, C.; Chong, Y. PEGNet: Progressive Edge Guidance Network for Semantic Segmentation of Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2021, 18, 637–641.
8. Minaee, S.; Boykov, Y.; Porikli, F.; Plaza, A.; Kehtarnavaz, N.; Terzopoulos, D. Image segmentation using deep learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3523–3542.
9. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6230–6239.
10. Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651.
11. Yang, F.; Li, X.; Shen, J. MSB-FCN: Multi-Scale Bidirectional FCN for Object Skeleton Extraction. IEEE Trans. Image Process. 2021, 30, 2301–2312.
12. Yu, Y.; Rashidi, M.; Samali, B.; Mohammadi, M.; Nguyen, T.N.; Zhou, X. Crack detection of concrete structures using deep convolutional neural networks optimized by enhanced chicken swarm algorithm. Struct. Health Monit. 2022, 21, 2244–2263.
13. Fan, T.; Sun, T.; Xie, X.; Liu, H.; Na, Z. Automatic Micro-Crack Detection of Polycrystalline Solar Cells in Industrial Scene. IEEE Access 2022, 10, 16269–16282.
14. Li, R.; Duan, C.; Zheng, S.; Zhang, C.; Atkinson, P.M. MACU-Net for Semantic Segmentation of Fine-Resolution Remotely Sensed Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
15. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015; pp. 234–241.
16. Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. arXiv 2015, arXiv:1412.7062.
17. Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
18. Chen, L.-C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017, arXiv:1706.05587.
19. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
20. Wang, C.; Du, P.; Wu, H.; Li, J.; Zhao, C.; Zhu, H. A cucumber leaf disease severity classification method based on the fusion of DeepLabV3+ and U-Net. Comput. Electron. Agric. 2021, 189, 106373.
21. Azad, R.; Asadi-Aghbolaghi, M.; Fathy, M.; Escalera, S. Attention DeepLabV3+: Multi-level context attention mechanism for skin lesion segmentation. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 251–266.
22. Li, Z.; Wang, R.; Zhang, W.; Hu, F.; Meng, L. Multiscale Features Supported DeepLabV3+ Optimization Scheme for Accurate Water Semantic Segmentation. IEEE Access 2019, 7, 155787–155804.
23. da Cruz, L.B.; Júnior, D.A.D.; Diniz, J.O.B.; Silva, A.C.; de Almeida, J.D.S.; de Paiva, A.C.; Gattass, M. Kidney tumor segmentation from computed tomography images using DeepLabv3+ 2.5D model. Expert Syst. Appl. 2022, 192, 116270.
24. Du, S.; Du, S.; Liu, B.; Zhang, X. Incorporating DeepLabv3+ and object-based image analysis for semantic segmentation of very high resolution remote sensing images. Int. J. Digit. Earth 2021, 14, 357–378.
25. Koh, J.C.O.; Spangenberg, G.; Kant, S. Automated machine learning for high-throughput image-based plant phenotyping. Remote Sens. 2021, 13, 858.
26. Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722.
27. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.
28. Zhou, D.; Hou, Q.; Chen, Y.; Feng, J.; Yan, S. Rethinking bottleneck structure for efficient mobile network design. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 680–697.
29. Luo, H.; Chen, C.; Fang, L.; Zhu, X.; Lu, L. High-resolution aerial images semantic segmentation using deep fully convolutional network with channel attention mechanism. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 3492–3507.
30. Zhang, Y.; Li, K.; Li, K.; Wang, L.; Zhong, B.; Fu, Y. Image Super-Resolution Using Very Deep Residual Channel Attention Networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 286–301.
31. Chu, Q.; Ouyang, W.; Li, H.; Wang, X.; Liu, B.; Yu, N. Online Multi-Object Tracking Using CNN-Based Single Object Tracker with Spatial-Temporal Attention Mechanism. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4846–4855.
32. Deng, Z.; Li, D.; Ke, Y.; Wu, Y.; Li, X.; Gong, H. An improved SVM algorithm for high spatial resolution remote sensing image classification. Remote Sens. Land Resour. 2016, 3, 12–18.
33. Zhou, Y.; Zhang, R.; Wang, S.; Wang, F. Feature Selection Method Based on High-Resolution Remote Sensing Images and the Effect of Sensitive Features on Classification Accuracy. Sensors 2018, 18, 2013.
34. Shi, L.; Wan, Y.C.; Gao, X.J.; Wang, M. Feature Selection for Object-Based Classification of High-Resolution Remote Sensing Images Based on the Combination of a Genetic Algorithm and Tabu Search. Comput. Intell. Neurosci. 2018, 2018, 6595792.
35. Castellano, G.; Castiello, C.; Montemurro, A.; Vessio, G.; Zaza, G. Segmentation of remotely sensed images with a neuro-fuzzy inference system. In Proceedings of the 13th International Workshop on Fuzzy Logic and Applications (WILF 2021), Vietri sul Mare, Italy, 20–22 December 2021. Available online: ceur-ws.org/Vol-3074/paper15.pdf (accessed on 1 January 2020).
| Input | Operation | Channel Expansion Factor | Output Channels | Repetitions | Stride |
|---|---|---|---|---|---|
| 256 × 256 × 3 | Conv2d | – | 32 | 1 | 2 |
| 128 × 128 × 32 | Bottleneck | 1 | 16 | 1 | 1 |
| 128 × 128 × 16 | Bottleneck | 6 | 24 | 2 | 2 |
| 64 × 64 × 24 | Bottleneck | 6 | 32 | 3 | 2 |
| 32 × 32 × 32 | Bottleneck | 6 | 64 | 4 | 2 |
| 16 × 16 × 64 | Bottleneck | 6 | 96 | 3 | 1 |
| 16 × 16 × 96 | Bottleneck | 6 | 160 | 3 | 2 |
| 8 × 8 × 160 | Bottleneck | 6 | 320 | 1 | 1 |
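Each bottleneck row in the table corresponds to MobileNetV2's inverted-residual block (1×1 expansion, 3×3 depthwise convolution, 1×1 linear projection) [27]. The PyTorch sketch below shows one such block as a minimal illustration of the structure, not the exact code used in the paper; within a stage, only the first repetition applies the stride.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """One MobileNetV2 bottleneck: 1x1 expand -> 3x3 depthwise -> 1x1 project.

    `t` is the channel expansion factor and `s` the stride, matching the
    table columns above. The residual connection is used only when the
    block keeps both the resolution and the channel count.
    """
    def __init__(self, c_in, c_out, t, s):
        super().__init__()
        c_mid = c_in * t
        self.use_residual = (s == 1 and c_in == c_out)
        layers = []
        if t != 1:  # the first bottleneck stage (t = 1) skips the expansion
            layers += [nn.Conv2d(c_in, c_mid, 1, bias=False),
                       nn.BatchNorm2d(c_mid), nn.ReLU6(inplace=True)]
        layers += [
            nn.Conv2d(c_mid, c_mid, 3, stride=s, padding=1,
                      groups=c_mid, bias=False),           # depthwise conv
            nn.BatchNorm2d(c_mid), nn.ReLU6(inplace=True),
            nn.Conv2d(c_mid, c_out, 1, bias=False),        # linear projection
            nn.BatchNorm2d(c_out),
        ]
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

# Third table row: 128 x 128 x 16 -> 64 x 64 x 24 (first of its 2 repetitions).
block = InvertedResidual(c_in=16, c_out=24, t=6, s=2)
print(block(torch.randn(1, 16, 128, 128)).shape)  # torch.Size([1, 24, 64, 64])
```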
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 72.52 | 71.74 | 60.52 |
| Scheme 1 | 74.34 | 73.78 | 62.67 |
| Scheme 2 | 76.14 | 75.45 | 64.22 |
| Scheme 3 | 75.72 | 75.07 | 63.84 |
| Scheme 4 | 76.75 | 76.12 | 64.76 |
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 71.24 | 70.44 | 59.23 |
| Scheme 1 | 74.09 | 73.58 | 62.47 |
| Scheme 2 | 75.91 | 75.31 | 64.07 |
| Scheme 3 | 75.46 | 74.87 | 63.64 |
| Scheme 4 | 76.57 | 75.95 | 64.58 |
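For reference, all three metrics in the ablation tables above can be computed from a per-class confusion matrix. A minimal NumPy sketch follows; recall and IoU use the standard definitions, and mPA is read here as mean per-class precision, which is an assumption about the paper's notation:

```python
import numpy as np

def segmentation_metrics(conf):
    """Metrics from a K x K confusion matrix where conf[i, j] counts
    pixels of true class i predicted as class j."""
    conf = conf.astype(np.float64)
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp   # predicted as the class but wrong
    fn = conf.sum(axis=1) - tp   # pixels of the class that were missed
    precision = tp / np.maximum(tp + fp, 1)   # read here as per-class PA
    recall = tp / np.maximum(tp + fn, 1)
    iou = tp / np.maximum(tp + fp + fn, 1)
    return precision.mean(), recall.mean(), iou.mean()  # mPA, mRecall, mIoU
```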
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 72.52 | 71.74 | 60.52 |
| U-Net | 70.12 | 69.62 | 58.32 |
| PSP-Net | 67.23 | 66.56 | 55.46 |
| MACU-Net | 74.28 | 73.67 | 62.37 |
| Paper Method | 76.75 | 76.12 | 64.76 |
| Method | mPA (%) | mRecall (%) | mIoU (%) |
|---|---|---|---|
| Traditional DeeplabV3+ | 71.24 | 70.44 | 59.23 |
| U-Net | 70.01 | 69.41 | 58.11 |
| PSP-Net | 66.90 | 66.32 | 55.23 |
| MACU-Net | 73.95 | 73.36 | 62.07 |
| Paper Method | 76.57 | 75.95 | 64.58 |
| Method | Training Time per Epoch (s) | Parameters (M) |
|---|---|---|
| Traditional DeeplabV3+ | 331 | 209.71 |
| U-Net | 245 | 10.86 |
| PSP-Net | 211 | 9.31 |
| MACU-Net | 317 | 5.12 |
| Paper Method | 265 | 22.51 |
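The parameter counts in the table above can be reproduced for any PyTorch model by summing tensor sizes. A short sketch follows, using a torchvision DeepLabV3 variant as a stand-in, since the paper's exact MobileNetV2-based DeeplabV3+ is not distributed with torchvision; num_classes=6 (a WHDLD-style label set) is also an assumption:

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# Stand-in model; the paper's improved DeeplabV3+ with a MobileNetV2
# backbone would be counted the same way.
model = deeplabv3_resnet50(weights=None, num_classes=6)
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{n_params / 1e6:.2f} M trainable parameters")
```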
| Reference | Dataset | Spatial Resolution | Feature Source | Classification Method | Feature Extraction |
|---|---|---|---|---|---|
| [32] | WorldView-2 high-resolution satellite remote-sensing image data | 0.5 m and 1.8 m | Paper | Support vector machine | Manual |
| [33] | Self-made dataset | 1 m, 0.5 m, and 0.2 m | Paper | ReliefF algorithm, genetic algorithm, and support vector machine | Manual |
| [34] | WorldView-2 and QuickBird | 0.5 m and 0.6 m | Paper | Tabu search algorithm, genetic algorithm, and support vector machine | Manual |
| [35] | WHDLD | 2 m | Image | Adaptive neuro-fuzzy inference system | Automatic |
| This paper | WHDLD and CCF BDCI | 2 m | Image | Deep neural network | Automatic |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).