Remote Sensing Scene Image Classification Based on Self-Compensating Convolution Neural Network
Abstract
1. Introduction
- (1) From the perspective of the filter, a self-compensating convolution (SCC) is proposed. It consists of three stages: reducing the number of filters, compensating the output channels with the input features, and channel reassignment. This provides a new convolution scheme for remote sensing scene classification (an illustrative sketch is given after this list);
- (2) A self-compensating bottleneck module (SCBM) built on the self-compensating convolution is presented. Its wider channel shortcut allows more shallow information to be transmitted to the deeper layers, which helps improve the feature extraction ability of the model;
- (3) Based on the self-compensating bottleneck module, a lightweight modular self-compensating convolution neural network (SCCNN) is constructed for remote sensing scene image classification. Experiments show that the proposed method classifies remote sensing scene images effectively with fewer parameters, and its accuracy is comparable to, or better than, that of several state-of-the-art methods, demonstrating the effectiveness of the proposed lightweight network.
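For readers who want a concrete picture of contributions (1) and (2), the following PyTorch-style sketch shows one plausible way to organize the three SCC stages and an SCBM-like residual block. The reduction ratio, the 1×1 projection used for channel compensation, the channel-shuffle reassignment, and the 1×1 shortcut are illustrative assumptions, not the exact design from the paper.

```python
# Illustrative sketch only (assumptions noted above); not the authors' exact implementation.
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Stage 3 (assumed): reassign channels by interleaving them across groups."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(n, c, h, w)


class SCC(nn.Module):
    """Sketch of a three-stage self-compensating convolution.

    Stage 1: convolve with a reduced number of filters (reduction ratio is an assumption).
    Stage 2: compensate the remaining channels with a cheap 1x1 projection of the input.
    Stage 3: reassign the concatenated channels (here, a channel shuffle).
    """

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, reduction=2):
        super().__init__()
        reduced = out_channels // reduction       # filters actually learned in stage 1
        compensated = out_channels - reduced      # channels borrowed from the input
        padding = kernel_size // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, reduced, kernel_size, stride, padding, bias=False),
            nn.BatchNorm2d(reduced),
            nn.ReLU(inplace=True),
        )
        self.compensate = nn.Sequential(          # assumed compensation path
            nn.Conv2d(in_channels, compensated, 1, stride, 0, bias=False),
            nn.BatchNorm2d(compensated),
        )

    def forward(self, x):
        main = self.primary(x)                    # stage 1: fewer filters
        comp = self.compensate(x)                 # stage 2: compensation from the input
        out = torch.cat([main, comp], dim=1)      # back to the full channel count
        return channel_shuffle(out, groups=2)     # stage 3: reassignment (even channel count assumed)


class SCBM(nn.Module):
    """Sketch of a self-compensating bottleneck module: two SCC layers plus a shortcut
    that passes shallow information forward (the 1x1 shortcut is an assumption)."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(SCC(channels, channels), SCC(channels, channels))
        self.shortcut = nn.Conv2d(channels, channels, 1, bias=False)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.shortcut(x))


if __name__ == "__main__":
    x = torch.randn(1, 32, 64, 64)
    print(SCBM(32)(x).shape)                      # torch.Size([1, 32, 64, 64])
```

The point of the sketch is the parameter saving: only a fraction of the output filters are learned in the main branch, while the rest are filled in cheaply from the input features, which is consistent with the very small parameter count (0.49 M) reported for the full SCCNN in the tables below.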
2. Related Works
2.1. Convolution Methods
2.2. Residual Bottleneck Blocks
3. Methodology
3.1. Self-Compensation Convolution
3.2. Self-Compensating Bottleneck Module
3.3. Self-Compensating Convolution Neural Network
4. Experiment and Results
4.1. Datasets
4.2. Setting of the Experiments
4.3. Visual Analysis of the Model
4.4. Comparison with Advanced Methods
4.5. Performance Evaluation of the Proposed SCCNN Method
4.6. Comparison of Model Running Time
4.7. Comparison of Computational Complexity of Models
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
References
1. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
2. Han, K.; Guo, J.; Zhang, C.; Zhu, M. Attribute-aware attention model for fine-grained representation learning. In Proceedings of the 26th ACM International Conference on Multimedia (MM’18), Seoul, Korea, 22–26 October 2018; pp. 2040–2048.
3. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99.
4. Lin, T.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal loss for dense object detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
5. Chen, L.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A. Semantic image segmentation with deep convolutional nets and fully connected CRFs. In Proceedings of the ICLR, San Diego, CA, USA, 7–9 May 2015.
6. Luo, J.; Wu, J.; Lin, W. ThiNet: A filter level pruning method for deep neural network compression. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5058–5066.
7. Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 2704–2713.
8. You, S.; Xu, C.; Xu, C.; Tao, D. Learning from multiple teacher networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 1285–1294.
9. Han, S.; Mao, H.; Dally, W.J. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv 2016, arXiv:1510.00149.
10. Han, S.; Pool, J.; Tran, J.; Dally, W.J. Learning both weights and connections for efficient neural networks. arXiv 2015, arXiv:1506.02626v3.
11. Rastegari, M.; Ordonez, V.; Redmon, J.; Farhadi, A. XNOR-Net: ImageNet classification using binary convolutional neural networks. In Proceedings of the 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; pp. 525–542.
12. Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531v1.
13. Xu, Z.; Hsu, Y.C.; Huang, J. Training shallow and thin networks for acceleration via knowledge distillation with conditional adversarial networks. arXiv 2018, arXiv:1709.00513v2.
14. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
15. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
16. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861v1.
17. Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. GhostNet: More features from cheap operations. arXiv 2020, arXiv:1911.11907v2.
18. Singh, P.; Verma, V.K.; Rai, P.; Namboodiri, V.P. HetConv: Heterogeneous kernel-based convolutions for deep CNNs. arXiv 2019, arXiv:1903.04120v2.
19. Yang, B.; Bender, G.; Le, Q.V.; Ngiam, J. CondConv: Conditionally parameterized convolutions for efficient inference. arXiv 2020, arXiv:1904.04971v3.
20. Liu, J.J.; Hou, Q.; Cheng, M.M.; Wang, C.; Feng, J. Improving convolutional networks with self-calibrated convolutions. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10096–10105.
21. Yu, F.; Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv 2015, arXiv:1511.07122.
22. Ding, X.; Guo, Y.; Ding, G.; Han, J. ACNet: Strengthening the kernel skeletons for powerful CNN via asymmetric convolution blocks. arXiv 2019, arXiv:1908.03930.
23. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
24. Zagoruyko, S.; Komodakis, N. Wide residual networks. arXiv 2017, arXiv:1605.07146.
25. Chen, Y.; Li, J.; Xiao, H.; Jin, X.; Yan, S.; Feng, J. Dual path networks. Adv. Neural Inf. Process. Syst. 2017, 30, 4467–4475.
26. Xie, S.; Girshick, R.; Dollar, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500.
27. Shi, C.; Zhang, X.; Sun, J.; Wang, L. Remote sensing scene image classification based on dense fusion of multi-level features. Remote Sens. 2021, 13, 4379.
28. Zhao, X.; Zhang, J.; Tian, J.; Zhuo, L.; Zhang, J. Residual dense network based on channel-spatial attention for the scene classification of a high-resolution remote sensing image. Remote Sens. 2020, 12, 1887.
29. Dong, R.; Xu, D.; Jiao, L.; Zhao, J.; An, J. A fast deep perception network for remote sensing scene classification. Remote Sens. 2020, 12, 729.
30. Yang, Y.; Newsam, S. Bag-of-visual-words and spatial extensions for land-use classification. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA, 3–5 November 2010; pp. 270–279.
31. Zou, Q.; Ni, L.; Zhang, T.; Wang, Q. Deep learning based feature selection for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2015, 12, 2321–2325.
32. Xia, G.S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Zhang, L. AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3965–3981.
33. Cheng, G.; Han, J.; Lu, X. Remote sensing image scene classification: Benchmark and state of the art. Proc. IEEE 2017, 105, 1865–1883.
34. Xia, G.S.; Yang, W.; Delon, J.; Gousseau, Y.; Sun, H.; Maitre, H. Structural high-resolution satellite image indexing. In Proceedings of the ISPRS TC VII—100 Years ISPRS, Vienna, Austria, 5–7 July 2010; Volume 38, pp. 298–303.
35. Zhao, B.; Zhong, Y.; Xia, G.S.; Zhang, L. Dirichlet-derived multiple topic scene classification model for high spatial resolution remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2016, 54, 2108–2123.
36. Liu, X.; Zhou, Y.; Zhao, J.; Yao, R.; Liu, B.; Zheng, Y. Siamese convolutional neural networks for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1200–1204.
37. Zhou, Y.; Liu, X.; Zhao, J.; Ma, D.; Yao, R.; Liu, B.; Zheng, Y. Remote sensing scene classification based on rotation-invariant feature learning and joint decision making. EURASIP J. Image Video Process. 2019, 2019, 3.
38. Lu, X.; Ji, W.; Li, X.; Zheng, X. Bidirectional adaptive feature fusion for remote sensing scene classification. Neurocomputing 2019, 328, 135–146.
39. Liu, Y.; Zhong, Y.; Qin, Q. Scene classification based on multiscale convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 7109–7121.
40. Cao, R.; Fang, L.; Lu, T.; He, N. Self-attention-based deep feature fusion for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2021, 18, 43–47.
41. Zhao, F.; Mu, X.; Yang, Z.; Yi, Z. A novel two-stage scene classification model based on feature variable significance in high-resolution remote sensing. Geocarto Int. 2019, 35, 1603–1614.
42. Liu, B.D.; Meng, J.; Xie, W.Y.; Shao, S.; Li, Y.; Wang, Y. Weighted spatial pyramid matching collaborative representation for remote-sensing-image scene classification. Remote Sens. 2019, 11, 518.
43. He, N.; Fang, L.; Li, S.; Plaza, J.; Plaza, A. Skip-connected covariance network for remote sensing scene classification. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 1461–1474.
44. He, N.; Fang, L.; Li, S.; Plaza, A.; Plaza, J. Remote sensing scene classification using multilayer stacked covariance pooling. IEEE Trans. Geosci. Remote Sens. 2018, 56, 6899–6910.
45. Sun, H.; Li, S.; Zheng, X.; Lu, X. Remote sensing scene classification by gated bidirectional network. IEEE Trans. Geosci. Remote Sens. 2020, 58, 82–96.
46. Lu, X.; Sun, H.; Zheng, X. A feature aggregation convolutional neural network for remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 7894–7906.
47. Li, B.; Su, W.; Wu, H.; Li, R.; Zhang, W.; Qin, W.; Zhang, S. Aggregated deep fisher feature for VHR remote sensing scene classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 3508–3523.
48. Cheng, G.; Yang, C.; Yao, X.; Guo, L.; Han, J. When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2811–2821.
49. Boualleg, Y.; Farah, M.; Farah, I.R. Remote sensing scene classification using convolutional features and deep forest classifier. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1944–1948.
50. Xie, J.; He, N.; Fang, L.; Plaza, A. Scale-free convolutional neural network for remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6916–6928.
51. Zhang, W.; Tang, P.; Zhao, L. Remote sensing image scene classification using CNN-CapsNet. Remote Sens. 2019, 11, 494.
52. Zhang, D.; Li, N.; Ye, Q. Positional context aggregation network for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2019, 17, 943–947.
53. Shi, C.; Wang, T.; Wang, L. Branch feature fusion convolution network for remote sensing scene classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5194–5210.
54. Li, J.; Lin, D.; Wang, Y.; Xu, G.; Zhang, Y.; Ding, C.; Zhou, Y. Deep discriminative representation learning with attention map for scene classification. Remote Sens. 2020, 12, 1366.
55. Liu, M.; Jiao, L.; Liu, X.; Li, L.; Liu, F.; Yang, S. C-CNN: Contourlet convolutional neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 2636–2649.
56. Zhang, B.; Zhang, Y.; Wang, S. A lightweight and discriminative model for remote sensing scene classification with multidilation pooling module. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 2636–2653.
57. Li, W.; Wang, Z.; Wang, Y.; Wu, J.; Wang, J.; Jia, Y.; Gui, G. Classification of high-spatial-resolution remote sensing scenes method using transfer learning and deep convolutional neural network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 1986–1995.
58. Wang, C.; Lin, W.; Tang, P. Multiple resolution block feature for remote-sensing scene classification. Int. J. Remote Sens. 2019, 40, 6884–6904.
59. Zhong, Y.; Fei, F.; Zhang, L. Large patch convolutional neural networks for the scene classification of high spatial resolution imagery. J. Appl. Remote Sens. 2016, 10, 25006.
60. Zhong, Y.; Fei, F.; Liu, Y.; Zhao, B.; Jiao, H.; Zhang, L. SatCNN: Satellite image dataset classification using agile convolutional neural networks. Remote Sens. Lett. 2016, 8, 136–145.
61. Han, X.; Zhong, Y.; Cao, L.; Zhang, L. Pre-trained AlexNet architecture with pyramid pooling and supervision for high spatial resolution remote sensing image scene classification. Remote Sens. 2017, 9, 848.
62. Liu, Y.; Zhong, Y.; Fei, F.; Zhu, Q.; Qin, Q. Scene classification based on a deep random-scale stretched convolutional neural network. Remote Sens. 2018, 10, 444.
63. Chaib, S.; Liu, H.; Gu, Y.; Yao, H. Deep feature fusion for VHR remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4775–4784.
64. Yu, Y.; Liu, F. A two-stream deep fusion framework for high-resolution aerial scene classification. Comput. Intell. Neurosci. 2018, 2018, 8639367.
65. Yan, P.; He, F.; Yang, Y.; Hu, F. Semi-supervised representation learning for remote sensing image classification based on generative adversarial networks. IEEE Access 2020, 8, 54135–54144.
Input Dimension | Operator | Output Dimension |
---|---|---|
 | Conv | |
 | SSC | |
 | SSC | |
 | Conv | |
Datasets | Number of Images in Each Category | Number of Categories | Total Number of Images | Spatial Resolution (m) | Image Size |
---|---|---|---|---|---|
UCM21 | 100 | 21 | 2100 | 0.3 | 256 × 256 |
RSSCN7 | 400 | 7 | 2800 | - | 400 × 400 |
AID | 200~400 | 30 | 10,000 | 0.5~0.8 | 600 × 600 |
NWPU45 | 700 | 45 | 31,500 | 30~0.2 | 256 × 256 |
WHU-RS19 | ~50 | 19 | 1005 | 0.5 | 600 × 600 |
SIRI-WHU | 200 | 12 | 2400 | 2 | 200 × 200 |
The Network Model | OA (%) | Number of Parameters |
---|---|---|
Siamese CNN [36] | 94.29 | - |
Siamese ResNet50 with R.D method [37] | 94.76 | - |
Bidirectional adaptive feature fusion method [38] | 95.48 | 130 M |
Multiscale CNN [39] | 96.66 ± 0.90 | 60 M |
VGG_VD16 with SAFF method [40] | 97.02 ± 0.78 | 15 M |
Variable-weighted multi-fusion method [41] | 98.56 ± 0.23 | 15 M |
ResNet + WSPM-CRC method [42] | 97.95 | 23 M |
Skip-connected CNN [43] | 97.98 ± 0.56 | 6 M |
VGG16 with MSCP [44] | 98.36 ± 0.58 | - |
Gated bidirectional + global feature method [45] | 98.57 ± 0.48 | 138 M |
Feature aggregation CNN [46] | 98.81 ± 0.24 | 130 M |
Aggregated deep fisher feature method [47] | 98.81 ± 0.51 | 23 M |
Discriminative CNN [48] | 98.93 ± 0.10 | 130 M |
VGG16-DF method [49] | 98.97 | 130 M |
Scale-Free CNN [50] | 99.05 ± 0.27 | 130 M |
Inceptionv3 + CapsNet method [51] | 99.05 ± 0.24 | 22 M |
Positional context aggregation method [52] | 99.21 ± 0.18 | 28 M |
LCNN-BFF method [53] | 99.29 ± 0.24 | 6.2 M |
DDRL-AM method [54] | 99.05 ± 0.08 | - |
Contourlet CNN [55] | 99.25 ± 0.49 | 12.6 M |
SE-MDPMNet [56] | 99.09 ± 0.78 | 5.17 M |
ResNet50 [57] | 98.76 ± 0.15 | 25.61 M |
Multiple resolution block feature method [58] | 94.19 ± 1.5 | - |
LPCNN [59] | 99.56 ± 0.58 | 5.6 M |
SICNN [60] | 98.67 ± 0.82 | 23 M |
Pre-trained-AlexNet-SPP-SS [61] | 98.99 ± 0.48 | 38 M |
SRSCNN [62] | 97.62 ± 0.28 | - |
DCA by addition [63] | 99.46 ± 0.37 | 45 M |
Two-stream deep fusion framework [64] | 99.35 ± 0.3 | 6 M |
Semi-supervised representation learning method [65] | 94.05 ± 1.2 | 210 M |
Proposed | 99.76 ± 0.05 | 0.49 M |
The Network Model | OA (%) | Number of Parameters |
---|---|---|
VGG16 + SVM method [32] | 87.18 | 130 M |
Variable-weighted multi-fusion method [41] | 89.1 | 53 M |
ResNet + SPM-CRC method [42] | 93.86 | 23 M |
Aggregated deep fisher feature method [47] | 95.21 ± 0.50 | 23 M |
LCNN-BFF method [53] | 94.64 ± 0.21 | 6.2 M |
Contourlet CNN [55] | 95.54 ± 0.17 | 12.6 M |
SE-MDPMNet [56] | 92.64 ± 0.66 | 5.17 M |
Proposed | 97.21 ± 0.32 | 0.49 M |
The Network Model | OA (20/80%) | OA (50/50%) | Number of Parameters |
---|---|---|---|
Bidirectional adaptive feature fusion method [38] | - | 93.56 | 130 M |
VGG_VD16 with SAFF method [40] | 90.25 ± 0.29 | 93.83 ± 0.28 | 15 M |
Skip-connected CNN [43] | 91.10 ± 0.15 | 93.30 ± 0.13 | 6 M |
Gated bidirectional method [45] | 90.16 ± 0.24 | 93.72 ± 0.34 | 18 M |
Gated bidirectional + global feature method [45] | 92.20 ± 0.23 | 95.48 ± 0.12 | 138 M |
Feature aggregation CNN [46] | - | 95.45 ± 0.11 | 130 M |
Discriminative CNN [48] | 85.62 ± 0.10 | 94.47 ± 0.12 | 60 M |
LCNN-BFF method [53] | 91.66 ± 0.48 | 94.64 ± 0.16 | 6.2 M |
ResNet50 [57] | 92.39 ± 0.15 | 94.69 ± 0.19 | 25.61 M |
Fine-tuning method [32] | 86.59 ± 0.29 | 89.64 ± 0.36 | 130 M |
Proposed | 93.15 ± 0.25 | 97.31 ± 0.10 | 0.49 M |
The Network Model | OA (50/50%) | OA (80/20%) | Number of Parameters |
---|---|---|---|
DMTM [35] | 91.52 | - | - |
Siamese ResNet_50 [36] | 95.75 | 97.50 | - |
Siamese AlexNet [36] | 83.25 | 88.96 | - |
Siamese VGG-16 [36] | 94.50 | 97.30 | - |
Fine-tune MobileNetV2 [56] | 95.77 ± 0.16 | 96.21 ± 0.31 | 3.5 M |
SE-MDPMNet [56] | 96.96 ± 0.19 | 98.77 ± 0.19 | 5.17 M |
LPCNN [59] | - | 89.88 | - |
SICNN [60] | - | 93.00 | - |
Pre-trained-AlexNet-SPP-SS [61] | - | 95.07 ± 1.09 | - |
SRSCNN [62] | 93.44 | 94.76 | - |
Proposed | 98.08 ± 0.45 | 99.37 ± 0.26 | 0.49 M |
The Network Model | OA (40/60%) | OA (60/40%) | Number of Parameters |
---|---|---|---|
CaffeNet [32] | 95.11 ± 1.20 | 96.24 ± 0.56 | 60.97 M |
VGG-VD-16 [32] | 95.44 ± 0.60 | 96.05 ± 0.91 | 138.36 M |
GoogLeNet [32] | 93.12 ± 0.82 | 94.71 ± 1.33 | 7 M |
Fine-tune MobileNetV2 [56] | 96.82 ± 0.35 | 98.14 ± 0.33 | 3.5 M |
SE-MDPMNet [56] | 98.46 ± 0.21 | 98.97 ± 0.24 | 5.17 M |
DCA by addition [63] | - | 98.70 ± 0.22 | - |
Two-stream deep fusion framework [64] | 98.23 ± 0.56 | 98.92 ± 0.52 | - |
TEX-Net-LF [45] | 98.48 ± 0.37 | 98.88 ± 0.49 | - |
Proposed | 98.65 ± 0.45 | 99.51 ± 0.15 | 0.49 M |
The Network Model | OA (10/90%) | OA (20/80%) | Number of Parameters |
---|---|---|---|
VGG_VD16 with SAFF method [40] | 84.38 ± 0.19 | 87.86 ± 0.14 | 15 M |
Skip-connected CNN [43] | 84.33 ± 0.19 | 87.30 ± 0.23 | 6 M |
Discriminative with AlexNet [48] | 85.56 ± 0.20 | 87.24 ± 0.12 | 130 M |
Discriminative with VGG16 [48] | 89.22 ± 0.50 | 91.89 ± 0.22 | 130 M |
VGG16 + CapsNet [51] | 85.05 ± 0.13 | 89.18 ± 0.14 | 130 M |
LCNN-BFF method [53] | 86.53 ± 0.15 | 91.73 ± 0.17 | 6.2 M |
Contourlet CNN [55] | 85.93 ± 0.51 | 89.57 ± 0.45 | 12.6 M |
ResNet50 [57] | 86.23 ± 0.41 | 88.93 ± 0.12 | 25.61 M |
InceptionV3 [57] | 85.46 ± 0.33 | 87.75 ± 0.43 | 45.37 M |
Fine-tuning method [32] | 87.15 ± 0.45 | 90.36 ± 0.18 | 130 M |
Proposed | 92.02 ± 0.50 | 94.39 ± 0.16 | 0.49 M |
The Network Model | Time Required to Process Each Image (s) |
---|---|
Siamese ResNet_50 [36] | 0.053 |
Siamese AlexNet [36] | 0.028 |
Siamese VGG-16 [36] | 0.039 |
LCNN-BFF [53] | 0.029 |
GBNet + global feature [45] | 0.052 |
GBNet [45] | 0.048 |
Proposed | 0.014 |
The Network Model | OA (%) | Number of Parameters | FLOPs |
---|---|---|---|
LCNN-BFF [53] | 94.64 | 6.1 M | 24.6 M |
GoogLeNet [32] | 85.84 | 7 M | 1.5 G |
CaffeNet [32] | 88.25 | 60.97 M | 715 M |
VGG-VD-16 [32] | 87.18 | 138 M | 15.5 G |
Fine-tune MobileNetV2 [56] | 94.71 | 3.5 M | 334 M |
SE-MDPMNet [56] | 92.64 | 5.17 M | 3.27 G |
Contourlet CNN [55] | 95.54 | 12.6 M | 2.1 G |
Proposed | 97.21 | 0.49 M | 1.9 M |
The Network Model | OA AID (%) | OA NWPU (%) | Kappa AID (%) | Kappa NWPU (%) | ATT AID | ATT NWPU | FLOPs AID | FLOPs NWPU | Number of Parameters |
---|---|---|---|---|---|---|---|---|---|
SCBM | 97.31 | 94.39 | 97.05 | 93.68 | 0.045 s | 0.093 s | 1.9 M | 2.0 M | 0.49 M |
CBM | 95.56 | 92.45 | 94.64 | 91.49 | 0.135 s | 0.245 s | 3.3 M | 3.4 M | 0.85 M |
The Network Model | OA AID (%) | OA NWPU (%) | Kappa AID (%) | Kappa NWPU (%) | ATT AID | ATT NWPU | FLOPs AID | FLOPs NWPU | Number of Parameters |
---|---|---|---|---|---|---|---|---|---|
SCCNN | 97.31 | 94.39 | 97.05 | 93.68 | 0.045 s | 0.093 s | 1.99 M | 2.03 M | 0.49 M |
CN | 96.41 | 93.29 | 96.35 | 93.08 | 0.045 s | 0.093 s | 1.99 M | 2.03 M | 0.49 M |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shi, C.; Zhang, X.; Sun, J.; Wang, L. Remote Sensing Scene Image Classification Based on Self-Compensating Convolution Neural Network. Remote Sens. 2022, 14, 545. https://doi.org/10.3390/rs14030545