A Remote-Sensing Scene-Image Classification Method Based on Deep Multiple-Instance Learning with a Residual Dense Attention ConvNet
Abstract
1. Introduction
2. Methodology
2.1. Instance Extraction and Classifier
2.2. MIL Pooling Based on Channel Attention
2.3. Bag-Level Classification
3. Experiment and Results
3.1. Datasets Description
3.1.1. UCM Dataset
3.1.2. SIRI-WHU Dataset
3.1.3. AID Dataset
3.1.4. NWPU Dataset
3.2. Experimental Settings
3.3. Results and Comparison
3.3.1. Experiments on the UCM Dataset
3.3.2. Experiments on the SIRI-WHU Dataset
3.3.3. Experiments on the AID Dataset
3.3.4. Experiments on the NWPU Dataset
3.3.5. Prediction Time
3.3.6. Model Size
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Wang, X.; Yuan, L.; Xu, H.; Wen, X. CSDS: End-to-end aerial scenes classification with depthwise separable convolution and an attention mechanism. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 10484–10499.
- Cheng, G.; Xie, X.; Han, J.; Guo, L.; Xia, G.S. Remote sensing image scene classification meets deep learning: Challenges, methods, benchmarks, and opportunities. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3735–3756.
- Zhang, L.; Han, Y.; Yang, Y.; Song, M.; Yan, S.; Tian, Q. Discovering discriminative graphlets for aerial image categories recognition. IEEE Trans. Image Process. 2013, 22, 5071–5084.
- Longbotham, N.; Chaapel, C.; Bleiler, L.; Padwick, C.; Emery, W.J.; Pacifici, F. Very high resolution multiangle urban classification analysis. IEEE Trans. Geosci. Remote Sens. 2011, 50, 1155–1170.
- Tayyebi, A.; Pijanowski, B.C.; Tayyebi, A.H. An urban growth boundary model using neural networks, GIS and radial parameterization: An application to Tehran, Iran. Landsc. Urban Plan. 2011, 100, 35–44.
- Chen, W.; Li, X.; He, H.; Wang, L. Assessing different feature sets’ effects on land cover classification in complex surface-mined landscapes by ZiYuan-3 satellite imagery. Remote Sens. 2017, 10, 23.
- Li, X.; Shao, G. Object-based urban vegetation mapping with high-resolution aerial photography as a single data source. Int. J. Remote Sens. 2013, 34, 771–789.
- Ahmed, Z.; Hussain, A.J.; Khan, W.; Baker, T.; Al-Askar, H.; Lunn, J.; Al-Shabandar, R.; Al-Jumeily, D.; Liatsis, P. Lossy and lossless video frame compression: A novel approach for high-temporal video data analytics. Remote Sens. 2020, 12, 1004.
- Kleanthous, N.; Hussain, A.; Khan, W.; Sneddon, J.; Liatsis, P. Deep transfer learning in sheep activity recognition using accelerometer data. Expert Syst. Appl. 2022, 207, 117925.
- Hu, J.; Xia, G.S.; Hu, F.; Sun, H.; Zhang, L. A comparative study of sampling analysis in scene classification of high-resolution remote sensing imagery. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 26–31 July 2015; pp. 2389–2392.
- Swain, M.J.; Ballard, D.H. Color indexing. Int. J. Comput. Vis. 1991, 7, 11–32.
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–26 June 2005; Volume 1, pp. 886–893.
- Oliva, A.; Torralba, A. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. J. Comput. Vis. 2001, 42, 145–175.
- Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987.
- Zhu, Q.; Zhong, Y.; Zhao, B.; Xia, G.S.; Zhang, L. Bag-of-visual-words scene classifier with local and global features for high spatial resolution remote sensing imagery. IEEE Geosci. Remote Sens. Lett. 2016, 13, 747–751.
- Hofmann, T. Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 2001, 42, 177–196.
- Blei, D.; Ng, A.; Jordan, M. Latent Dirichlet allocation. In Proceedings of the 2001 Neural Information Processing Systems (NIPS) Conference, Vancouver, BC, Canada, 3–8 December 2001.
- Hu, F.; Xia, G.S.; Hu, J.; Zhang, L. Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery. Remote Sens. 2015, 7, 14680–14707.
- Penatti, O.A.; Nogueira, K.; Dos Santos, J.A. Do deep features generalize from everyday objects to remote sensing and aerial scenes domains? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, MA, USA, 7–12 June 2015; pp. 44–51.
- Chen, J.; Huang, H.; Peng, J.; Zhu, J.; Chen, L.; Tao, C.; Li, H. Contextual information-preserved architecture learning for remote-sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–14.
- Tang, X.; Lin, W.; Ma, J.; Zhang, X.; Liu, F.; Jiao, L. Class-level prototype guided multiscale feature learning for remote sensing scene classification with limited labels. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15.
- Bi, Q.; Qin, K.; Zhang, H.; Xia, G.S. Local semantic enhanced convnet for aerial scene recognition. IEEE Trans. Image Process. 2021, 30, 6498–6511.
- Cheng, G.; Yang, C.; Yao, X.; Guo, L.; Han, J. When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2811–2821.
- Xia, G.S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Zhang, L.; Lu, X. AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3965–3981.
- Liu, X.; Zhou, Y.; Zhao, J.; Yao, R.; Liu, B.; Zheng, Y. Siamese convolutional neural networks for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1200–1204.
- Han, X.; Zhong, Y.; Cao, L.; Zhang, L. Pre-trained AlexNet architecture with pyramid pooling and supervision for high spatial resolution remote sensing image scene classification. Remote Sens. 2017, 9, 848.
- Tang, X.; Ma, Q.; Zhang, X.; Liu, F.; Ma, J.; Jiao, L. Attention consistent network for remote sensing scene classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2030–2045.
- Cao, R.; Fang, L.; Lu, T.; He, N. Self-attention-based deep feature fusion for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2020, 18, 43–47.
- Sun, H.; Li, S.; Zheng, X.; Lu, X. Remote sensing scene classification by gated bidirectional network. IEEE Trans. Geosci. Remote Sens. 2019, 58, 82–96.
- Tang, P.; Wang, X.; Feng, B.; Liu, W. Learning multi-instance deep discriminative patterns for image classification. IEEE Trans. Image Process. 2016, 26, 3385–3396.
- Dietterich, T.G.; Lathrop, R.H.; Lozano-Pérez, T. Solving the multiple instance problem with axis-parallel rectangles. Artif. Intell. 1997, 89, 31–71.
- Wang, X.; Wang, B.; Bai, X.; Liu, W.; Tu, Z. Max-margin multiple-instance dictionary learning. In Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013; pp. 846–854.
- Wang, Q.; Yuan, Y.; Yan, P.; Li, X. Saliency detection by multiple-instance learning. IEEE Trans. Cybern. 2013, 43, 660–672.
- Wang, C.; Huang, K.; Ren, W.; Zhang, J.; Maybank, S. Large-scale weakly supervised object localization via latent category learning. IEEE Trans. Image Process. 2015, 24, 1371–1385.
- Bi, Q.; Zhou, B.; Qin, K.; Ye, Q.; Xia, G.S. All grains, one scheme (AGOS): Learning multi-grain instance representation for aerial scene classification. arXiv 2022, arXiv:2205.03371.
- Wang, X.; Yan, Y.; Tang, P.; Bai, X.; Liu, W. Revisiting multiple instance neural networks. Pattern Recognit. 2018, 74, 15–24.
- Pinheiro, P.O.; Collobert, R. From image-level to pixel-level labeling with convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1713–1721.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
- Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2472–2481.
- Bi, Q.; Qin, K.; Li, Z.; Zhang, H.; Xu, K.; Xia, G.S. A multiple-instance densely-connected ConvNet for aerial scene classification. IEEE Trans. Image Process. 2020, 29, 4911–4926.
- Yang, Y.; Newsam, S. Bag-of-visual-words and spatial extensions for land-use classification. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA, 2–5 November 2010; pp. 270–279.
- Zhao, B.; Zhong, Y.; Xia, G.S.; Zhang, L. Dirichlet-derived multiple topic scene classification model for high spatial resolution remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2015, 54, 2108–2123.
- Cheng, G.; Han, J.; Lu, X. Remote sensing image scene classification: Benchmark and state of the art. Proc. IEEE 2017, 105, 1865–1883.
- Anwer, R.M.; Khan, F.S.; van de Weijer, J.; Molinier, M.; Laaksonen, J. Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification. ISPRS J. Photogramm. Remote Sens. 2018, 138, 74–85.
- Gong, X.; Xie, Z.; Liu, Y.; Shi, X.; Zheng, Z. Deep salient feature based anti-noise transfer network for scene classification of remote sensing imagery. Remote Sens. 2018, 10, 410.
- Zhang, W.; Tang, P.; Zhao, L. Remote sensing image scene classification using CNN-CapsNet. Remote Sens. 2019, 11, 494.
- Li, B.; Su, W.; Wu, H.; Li, R.; Zhang, W.; Qin, W.; Zhang, S. Aggregated deep fisher feature for VHR remote sensing scene classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 3508–3523.
- Qi, K.; Guan, Q.; Yang, C.; Peng, F.; Shen, S.; Wu, H. Concentric circle pooling in deep convolutional networks for remote sensing scene classification. Remote Sens. 2018, 10, 934.
- Liu, B.D.; Meng, J.; Xie, W.Y.; Shao, S.; Li, Y.; Wang, Y. Weighted spatial pyramid matching collaborative representation for remote-sensing-image scene classification. Remote Sens. 2019, 11, 518.
- Xu, K.; Huang, H.; Deng, P.; Li, Y. Deep feature aggregation framework driven by graph convolutional network for scene classification in remote sensing. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 5751–5765.
- Zhang, B.; Zhang, Y.; Wang, S. A lightweight and discriminative model for remote sensing scene classification with multidilation pooling module. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 2636–2653.
- Wang, J.; Zhong, Y.; Zheng, Z.; Ma, A.; Zhang, L. RSNet: The search for remote sensing deep neural networks in recognition tasks. IEEE Trans. Geosci. Remote Sens. 2020, 59, 2520–2534.
- Li, Z.; Wu, Q.; Cheng, B.; Cao, L.; Yang, H. Remote sensing image scene classification based on object relationship reasoning CNN. IEEE Geosci. Remote Sens. Lett. 2020, 19, 1–5.
- Zhong, Y.; Fei, F.; Zhang, L. Large patch convolutional neural networks for the scene classification of high spatial resolution imagery. J. Appl. Remote Sens. 2016, 10, 025006.
- Zhong, Y.; Fei, F.; Liu, Y.; Zhao, B.; Jiao, H.; Zhang, L. SatCNN: Satellite image dataset classification using agile convolutional neural networks. Remote Sens. Lett. 2017, 8, 136–145.
- Liu, Y.; Zhong, Y.; Fei, F.; Zhu, Q.; Qin, Q. Scene classification based on a deep random-scale stretched convolutional neural network. Remote Sens. 2018, 10, 444.
- Shi, C.; Wang, T.; Wang, L. Branch feature fusion convolution network for remote sensing scene classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5194–5210.
- He, N.; Fang, L.; Li, S.; Plaza, A.; Plaza, J. Remote sensing scene classification using multilayer stacked covariance pooling. IEEE Trans. Geosci. Remote Sens. 2018, 56, 6899–6910.
- Bi, Q.; Qin, K.; Zhang, H.; Li, Z.; Xu, K. RADC-Net: A residual attention based convolution network for aerial scene classification. Neurocomputing 2020, 377, 345–359.
- Chaib, S.; Liu, H.; Gu, Y.; Yao, H. Deep feature fusion for VHR remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4775–4784.
- He, N.; Fang, L.; Li, S.; Plaza, J.; Plaza, A. Skip-connected covariance network for remote sensing scene classification. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 1461–1474.
- Sun, X.; Zhu, Q.; Qin, Q. A multi-level convolution pyramid semantic fusion framework for high-resolution remote sensing image scene classification and annotation. IEEE Access 2021, 9, 18195–18208.
- Liu, M.; Jiao, L.; Liu, X.; Li, L.; Liu, F.; Yang, S. C-CNN: Contourlet convolutional neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 2636–2649.
- Xu, K.; Huang, H.; Deng, P.; Shi, G. Two-stream feature aggregation deep neural network for scene classification of remote sensing images. Inf. Sci. 2020, 539, 250–268.
- Xu, K.; Huang, H.; Li, Y.; Shi, G. Multilayer feature fusion network for scene classification in remote sensing. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1894–1898.
- Wang, X.; Duan, L.; Shi, A.; Zhou, H. Multilevel feature fusion networks with adaptive channel dimensionality reduction for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
- Shi, C.; Zhao, X.; Wang, L. A multi-branch feature fusion strategy based on an attention mechanism for remote sensing image scene classification. Remote Sens. 2021, 13, 1950.
Details of the four benchmark datasets:

Datasets | No. of Classes | Images per Class | No. of Images | Spatial Resolution (m) | Image Size | Training Ratio Setting |
---|---|---|---|---|---|---|
UCM | 21 | 100 | 2100 | 0.3 | 256 × 256 | 50%, 80% |
SIRI-WHU | 12 | 200 | 2400 | 2 | 200 × 200 | 50%, 80% |
AID | 30 | 220–400 | 10,000 | 0.5–8 | 600 × 600 | 20%, 50% |
NWPU | 45 | 700 | 31,500 | 0.2–30 | 256 × 256 | 10%, 20% |
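The training ratios in the table above correspond to class-stratified random splits: the same fraction of each class goes into the training set, and the remainder is held out for testing. A minimal sketch of such a split (a hypothetical helper, not the authors' exact protocol):

```python
import random
from collections import defaultdict

def stratified_split(labels, train_ratio, seed=0):
    """Split sample indices into train/test, keeping the given
    fraction of each class in the training set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        k = int(round(train_ratio * len(idxs)))
        train.extend(idxs[:k])
        test.extend(idxs[k:])
    return sorted(train), sorted(test)

# Example: 21 classes x 100 images each, as in the UCM dataset, 50% training ratio.
labels = [c for c in range(21) for _ in range(100)]
train, test = stratified_split(labels, 0.5)
```

Seeding the RNG makes each repeated run reproducible while still giving different splits for different seeds, which is how mean-and-deviation accuracies over several runs are typically obtained.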
Overall accuracy (OA) comparison on the UCM dataset:

Methods | OA (50/50) (%) | OA (80/20) (%) | Number of Parameters |
---|---|---|---|
AlexNet [24] | 93.98 ± 0.67 | 95.02 ± 0.81 | 60 M |
VGGNet-16 [24] | 94.14 ± 0.69 | 95.21 ± 1.20 | 130 M |
GoogLeNet [24] | 92.70 ± 0.60 | 94.31 ± 0.89 | 7 M |
TEX-Net with VGG [47] | 94.22 ± 0.50 | 95.31 ± 0.69 | 130 M |
D-CNN with AlexNet [23] | — | 96.67 ± 0.10 | 60 M |
CCP-net [51] | — | 97.52 ± 0.97 | 130 M |
ADFF [50] | 97.22 ± 0.45 | 98.81 ± 0.51 | 23 M |
DSFATN [48] | — | 98.25 | 143 M |
Inception-v3-CapsNet [49] | 97.59 ± 0.16 | 99.05 ± 0.24 | 22 M |
WSPM-CRC [52] | — | 97.95 | 23 M |
SAFF with AlexNet [28] | — | 96.13 ± 0.97 | 60 M |
DFAGCN [53] | — | 98.48 ± 0.42 | 130 M |
Fine-tune MobileNetV2 [54] | 97.88 ± 0.31 | 98.13 ± 0.33 | 3.5 M |
DC-Net [43] | 94.52 ± 0.63 | 96.21 ± 0.67 | 0.5 M |
GBNet [29] | 97.05 ± 0.19 | 98.57 ± 0.48 | 18 M |
LSENet [22] | 97.94 ± 0.35 | 98.69 ± 0.53 | 130 M |
RSNet [55] | — | 96.78 ± 0.60 | 1.22 M |
CIPAL [20] | 91.96 ± 0.91 | 96.58 ± 0.76 | 1.53 M |
ORRCNN [56] | 96.58 | 96.42 | — |
MILRDA (ours) | 98.19 ± 0.54 | 98.81 ± 0.12 | 0.66 M |
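Entries of the form "mean ± deviation" in these tables summarize overall accuracy over repeated runs with different random splits. A small illustration of how such an entry is computed (the per-run accuracies below are made up):

```python
from statistics import mean, stdev

def summarize_oa(accuracies):
    """Format per-run overall accuracies (%) as 'mean ± sample std dev'."""
    return f"{mean(accuracies):.2f} ± {stdev(accuracies):.2f}"

# Hypothetical accuracies from five independent train/test splits.
runs = [98.10, 98.57, 97.86, 98.33, 98.10]
print(summarize_oa(runs))  # prints: 98.19 ± 0.27
```

Note that `statistics.stdev` is the sample standard deviation (n − 1 denominator); papers do not always state which variant they report.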
Comparison on the SIRI-WHU dataset:

Methods | OA (50/50) (%) | OA (80/20) (%) | Number of Parameters |
---|---|---|---|
AlexNet [25] | 82.50 | 88.33 | 60 M |
VGGNet-16 [25] | 94.92 | 96.25 | 130 M |
ResNet-50 [25] | 94.67 | 95.63 | 26 M |
DMTM [45] | 91.52 | — | — |
Siamese AlexNet [25] | 83.25 | 88.96 | 60 M |
Siamese VGG16 [25] | 94.50 | 97.30 | 130 M |
Siamese ResNet50 [25] | 95.75 | 97.50 | 26 M |
Fine-tune MobileNetV2 [54] | 95.77 ± 0.16 | 96.21 ± 0.31 | 3.5 M |
SE-MDPMNet [54] | 96.96 ± 0.19 | 98.77 ± 0.19 | 5.17 M |
LPCNN [57] | — | 89.88 | — |
SICNN [58] | — | 93.00 | — |
Pre-trained-AlexNet-SPP-SS [26] | — | 95.07 ± 1.09 | — |
SRSCNN [59] | 93.44 | 94.76 | — |
MILRDA (ours) | 97.16 ± 0.37 | 98.75 ± 0.18 | 0.66 M |
Comparison on the AID dataset:

Methods | OA (20/80) (%) | OA (50/50) (%) | Number of Parameters |
---|---|---|---|
AlexNet [24] | 86.86 ± 0.47 | 89.53 ± 0.31 | 60 M |
VGGNet-16 [24] | 86.59 ± 0.29 | 89.64 ± 0.36 | 130 M |
GoogLeNet [24] | 83.44 ± 0.40 | 86.39 ± 0.55 | 7 M |
TEX-Net with VGG [47] | 87.32 ± 0.37 | 90.00 ± 0.33 | 130 M |
D-CNN with AlexNet [23] | 85.62 ± 0.10 | 94.47 ± 0.12 | 60 M |
Fusion by addition [63] | — | 91.87 ± 0.36 | — |
WSPM-CRC [52] | — | 95.11 | 23 M |
DFAGCN [53] | — | 94.88 ± 0.22 | 130 M |
SAFF with AlexNet [28] | 87.51 ± 0.36 | 91.83 ± 0.27 | 60 M |
VGG16+MSCP [61] | 91.52 ± 0.21 | 94.42 ± 0.17 | 130 M |
AlexNet+MSCP [61] | 88.99 ± 0.38 | 92.36 ± 0.21 | 60 M |
GBNet [29] | 90.16 ± 0.24 | 93.72 ± 0.34 | 18 M |
DC-Net [43] | 87.37 ± 0.41 | 91.49 ± 0.22 | 0.5 M |
LCNN-BFF [60] | 91.66 ± 0.48 | 94.64 ± 0.16 | 6.2 M |
Skip-connected CNN [64] | 91.10 ± 0.15 | 93.30 ± 0.13 | 6 M |
CIPAL [20] | 91.22 ± 0.83 | 93.45 ± 0.31 | 1.53 M |
ORRCNN [56] | 86.42 | 92.00 | — |
LCPP [65] | 90.96 ± 0.33 | 93.12 ± 0.28 | — |
MILRDA (ours) | 91.95 ± 0.19 | 95.46 ± 0.26 | 0.66 M |
Comparison on the NWPU dataset:

Methods | OA (10/90) (%) | OA (20/80) (%) | Number of Parameters |
---|---|---|---|
AlexNet [46] | 76.69 ± 0.21 | 79.85 ± 0.13 | 60 M |
VGGNet-16 [46] | 76.47 ± 0.18 | 79.79 ± 0.15 | 130 M |
GoogLeNet [46] | 76.19 ± 0.38 | 78.48 ± 0.26 | 7 M |
Fine-tuned VGG16 [46] | 87.15 ± 0.45 | 90.36 ± 0.18 | 130 M |
Fine-tuned AlexNet [46] | 81.22 ± 0.19 | 85.16 ± 0.18 | 60 M |
Fine-tuned GoogLeNet [46] | 82.57 ± 0.12 | 86.02 ± 0.18 | 7 M |
DFAGCN [53] | — | 89.29 ± 0.28 | 130 M |
TFADNN [67] | 87.78 ± 0.11 | 90.86 ± 0.24 | 130 M |
SAFF with AlexNet [28] | 80.05 ± 0.29 | 84.00 ± 0.17 | 60 M |
Contourlet CNN [66] | 85.93 ± 0.51 | 89.57 ± 0.45 | 12.6 M |
Inception-v3-CapsNet [49] | 89.03 ± 0.21 | 92.60 ± 0.11 | 22 M |
SCCov [64] | 89.30 ± 0.35 | 92.10 ± 0.25 | 13 M |
MFNet [68] | 90.17 ± 0.25 | 92.73 ± 0.21 | — |
LCNN-BFF [60] | 86.53 ± 0.15 | 91.73 ± 0.17 | 6.2 M |
ACNet [27] | 91.09 ± 0.13 | 92.42 ± 0.16 | 130 M |
ACR-MLFF [69] | 90.01 ± 0.33 | 92.45 ± 0.20 | 26 M |
AMB-CNN [70] | 88.99 ± 0.14 | 92.42 ± 0.14 | 5.6 M |
MILRDA (ours) | 91.56 ± 0.18 | 92.87 ± 0.26 | 0.66 M |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, X.; Xu, H.; Yuan, L.; Dai, W.; Wen, X. A Remote-Sensing Scene-Image Classification Method Based on Deep Multiple-Instance Learning with a Residual Dense Attention ConvNet. Remote Sens. 2022, 14, 5095. https://doi.org/10.3390/rs14205095