Region-Focusing Data Augmentation via Salient Region Activation and Bitplane Recombination for Target Detection
Abstract
1. Introduction
- To the best of our knowledge, this is the first work to examine data augmentation from the perspective of meaningful focusing regions rather than the whole sample image, considering both positive regions and typical negative regions at the same time.
- A region-based bitplane recombination strategy is proposed that maintains the internal structures of the focusing regions. Combined with the region-focusing strategy, it multiplies the achievable rate of data augmentation.
2. Related Works
2.1. Saliency Detection
2.2. Bitplane Techniques
3. Methods
3.1. Empirical Risk of Data Augmentation
3.2. Region-Focusing Based on Salient Region Activation
3.3. Region-Based Data Augmentation
1. The meaningful focusing regions are all selected, and the extracted bitplanes of different regions are the same order bitplanes, as shown in Figure 3c.
2. The meaningful focusing regions are partially selected, but the extracted bitplanes of different regions are the same order bitplanes, as shown in Figure 3d.
3. The meaningful focusing regions are all selected, but the extracted bitplanes of different regions are different order bitplanes, as shown in Figure 3e.
4. The meaningful focusing regions are partially selected, and the extracted bitplanes of different regions are different order bitplanes, as shown in Figure 3f.
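The four cases above differ only in which focusing regions are selected and whether the regions share the same bitplane orders. The following is a minimal sketch of that idea, not the paper's Equations (21) and (26), which are not reproduced here. It assumes 8-bit grayscale images stored as plain Python lists, rectangular regions given as `(top, left, bottom, right)`, and "recombination" simplified to keeping a chosen subset of bitplanes inside each region. The names `keep_bitplanes`, `recombine_regions`, `sample_case`, and the parameter `n_keep` are hypothetical.

```python
import random

def keep_bitplanes(pixel, orders):
    """Rebuild an 8-bit pixel from the bitplanes listed in `orders` (0 = LSB)."""
    mask = 0
    for k in orders:
        mask |= 1 << k
    return pixel & mask

def recombine_regions(image, regions, orders_per_region):
    """Return a copy of `image` in which each region keeps only its chosen bitplanes.

    image: list of rows of 8-bit ints
    regions: list of (top, left, bottom, right) boxes, bottom/right exclusive
    orders_per_region: one list of bitplane orders per region
    Pixels outside every region are left untouched.
    """
    out = [row[:] for row in image]
    for (top, left, bottom, right), orders in zip(regions, orders_per_region):
        for r in range(top, bottom):
            for c in range(left, right):
                out[r][c] = keep_bitplanes(image[r][c], orders)
    return out

def sample_case(regions, select_all, same_order, rng, n_keep=4):
    """Pick regions and bitplane orders for one of the four cases (assumed sampling)."""
    if select_all:
        chosen = list(regions)
    else:
        # Partial selection: a random subset, proper whenever more than one region exists.
        count = rng.randint(1, max(1, len(regions) - 1))
        chosen = rng.sample(regions, count)
    if same_order:
        shared = sorted(rng.sample(range(8), n_keep))
        orders = [shared for _ in chosen]
    else:
        # Independent draws per region, so the orders usually differ between regions.
        orders = [sorted(rng.sample(range(8), n_keep)) for _ in chosen]
    return chosen, orders

image = [[200, 201, 50, 51],
         [202, 203, 52, 53],
         [10, 11, 255, 254],
         [12, 13, 253, 252]]
regions = [(0, 0, 2, 2), (2, 2, 4, 4)]

# Case 1: all regions selected, same bitplane orders (the four highest planes).
augmented = recombine_regions(image, regions, [[7, 6, 5, 4], [7, 6, 5, 4]])

# Case 4: partial selection with independently drawn orders.
chosen, orders = sample_case(regions, select_all=False, same_order=False,
                             rng=random.Random(0))
augmented_partial = recombine_regions(image, chosen, orders)
```

Because pixels outside the selected regions are copied unchanged, the background context of each sample is preserved while the regions themselves vary.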
3.4. Total Framework of the Proposed Method
Algorithm 1. ROBIT data augmentation
Input: training dataset.
Parameter setting: the number of image blocks per image; the probability threshold of the data augmentation.
Output: augmented set.
For each image in the training dataset do
  Crop the image into image blocks.
  For each image block do
    For each epoch do
      While the averaged focusing map is generated
        Extract the focusing regions;
        In each focusing region do
          Randomly generate the probability of data augmentation.
          If the probability satisfies the threshold condition do
            Region focusing based on salient region activation;
            Region-focused bitplane extraction as in Equation (21);
            Bitplane recombination to generate new images as in Equation (26).
          End
        End
      End
    End
  End
End
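The control flow of Algorithm 1 can be sketched roughly as follows. This is an illustration under several assumptions, not the authors' implementation: the per-epoch loop and the averaged focusing map are abstracted into a caller-supplied `extract_regions` function, the region-level operations are abstracted into `augment_region`, and a region is augmented when a uniform draw falls below the threshold (the direction of the comparison is assumed). All function and parameter names are hypothetical.

```python
import random

def crop_blocks(image, block_size):
    """Split an image (list of rows) into non-overlapping square blocks."""
    blocks = []
    for top in range(0, len(image), block_size):
        for left in range(0, len(image[0]), block_size):
            blocks.append([row[left:left + block_size]
                           for row in image[top:top + block_size]])
    return blocks

def robit_like_augment(dataset, block_size, threshold,
                       extract_regions, augment_region, rng):
    """Collect augmented blocks following the loop structure of Algorithm 1."""
    augmented = []
    for image in dataset:
        for block in crop_blocks(image, block_size):
            new_block = [row[:] for row in block]
            changed = False
            for region in extract_regions(block):
                # Assumed rule: augment when the random draw is below the threshold.
                if rng.random() < threshold:
                    augment_region(new_block, region)
                    changed = True
            if changed:
                augmented.append(new_block)
    return augmented

# Stand-ins for the focusing-region extractor and the bitplane operations.
def whole_block(block):
    return [(0, 0, len(block), len(block[0]))]

def keep_high_bits(block, region):
    top, left, bottom, right = region
    for r in range(top, bottom):
        for c in range(left, right):
            block[r][c] &= 0b11110000

dataset = [[[255] * 4 for _ in range(4)]]
always = robit_like_augment(dataset, 2, 1.0, whole_block, keep_high_bits,
                            random.Random(0))
never = robit_like_augment(dataset, 2, 0.0, whole_block, keep_high_bits,
                           random.Random(0))
```

Passing the region extractor and the region operation as arguments keeps the loop independent of how focusing regions are actually computed.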
4. Results
4.1. Dataset
4.2. Experimental Settings of Our Proposed Method
4.3. Comparison with Different Object Detection Methods
4.4. Comparison with Different Data Augmentation Methods
4.5. Effectiveness of the Sub-Policies in Region-Based Operations
4.6. Ablation Study
4.7. Meaningful Focusing Regions vs. Only Positive Regions
4.8. Comparison on the HRSC2016 Dataset
5. Discussion
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
1. Sun, C.; Shrivastava, A.; Singh, S.; Gupta, A. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 843–852.
2. Xie, Y.; Zhan, N.; Zhu, J.; Xu, B.; Chen, H.; Mao, W.; Luo, X.; Hu, Y. Landslide Extraction from Aerial Imagery Considering Context Association Characteristics. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103950.
3. Zhu, J.; Zhang, J.; Chen, H.; Xie, Y.; Gu, H.; Lian, H. A Cross-View Intelligent Person Search Method Based on Multi-Feature Constraints. Int. J. Digit. Earth 2024, 17, 2346259.
4. Xu, Y.; Hou, J.; Zhu, X.; Wang, C.; Shi, H.; Wang, J.; Li, Y.; Ren, P. Hyperspectral Image Super-Resolution with ConvLSTM Skip-Connections. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5519016.
5. Cao, S.; Feng, D.; Liu, S.; Xu, W.; Chen, H.; Xie, Y.; Zhang, H.; Pirasteh, S.; Zhu, J. BEMRF-Net: Boundary Enhancement and Multiscale Refinement Fusion for Building Extraction from Remote Sensing Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 16342–16358.
6. Zhang, H.; Han, X.; Deng, J.; Sun, W. How to Evaluate and Remove the Weakened Bands in Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2024.
7. Yun, S.; Han, D.; Oh, S.J.; Chun, S.; Choe, J.; Yoo, Y. Cutmix: Regularization Strategy to Train Strong Classifiers with Localizable Features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6023–6032.
8. Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-scale Dataset for Object Detection in Aerial Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3974–3983.
9. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
10. Zhang, H.; Xu, Z.; Han, X.; Sun, W. Data Augmentation using Bitplane Information Recombination Model. IEEE Trans. Image Process. 2022, 31, 3713–3725.
11. Wang, H.; Zhang, W.; Bai, L.; Ren, P. Metalantis: A Comprehensive Underwater Image Enhancement Framework. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5618319.
12. Liang, J.; Liang, S.; Liu, A.; Ma, K.; Li, J.; Cao, X. Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023.
13. Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond Empirical Risk Minimization. Int. Conf. Learn. Represent. 2018.
14. Verma, V.; Lamb, A.; Beckham, C.; Najafi, A.; Mitliagkas, I.; Lopez-Paz, D.; Bengio, Y. Manifold mixup: Better Representations by Interpolating Hidden States. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 6438–6447.
15. Mariani, G.; Scheidegger, F.; Istrate, R.; Bekas, C.; Malossi, C. Bagan: Data Augmentation with Balancing GAN. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018.
16. Zhao, S.; Liu, Z.; Lin, J.; Zhu, J.Y.; Han, S. Differentiable Augmentation for Data-efficient GAN Training. Adv. Neural Inf. Process. Syst. 2020, 33, 7559–7570.
17. Jiang, L.; Dai, B.; Wu, W.; Loy, C.C. Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data. Adv. Neural Inf. Process. Syst. 2021, 34, 21655–21667.
18. DeVries, T.; Taylor, G.W. Improved Regularization of Convolutional Neural Networks with Cutout. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
19. Zhong, Z.; Zheng, L.; Kang, G.; Li, S.; Yang, Y. Random Erasing Data Augmentation. Proc. AAAI Conf. Artif. Intell. 2020, 34, 13001–13008.
20. Kumar Singh, K.; Jae Lee, Y. Hide-and-seek: Forcing a Network to Be Meticulous for Weakly-supervised Object and Action Localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3524–3533.
21. Chen, P.; Liu, S.; Zhao, H.; Wang, X.; Jia, J. Gridmask Data Augmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020.
22. Takahashi, R.; Matsubara, T.; Uehara, K. Ricap: Random Image Cropping and Patching Data Augmentation for Deep CNNs. In Proceedings of the Asian Conference on Machine Learning, PMLR, Beijing, China, 14–16 November 2018; pp. 786–798.
23. Droste, R.; Jiao, J.; Noble, J. Unified Image and Video Saliency Modeling. In Proceedings of the European Computer Vision Conference, Glasgow, UK, 23–28 August 2020; Volume 16, pp. 419–435.
24. Liu, N.; Han, J. A Deep Spatial Contextual Longterm Recurrent Convolutional Network for Saliency Detection. IEEE Trans. Image Process. 2018, 27, 3264–3274.
25. Djilali, Y.A.D.; McGuinness, K.; O’Connor, N. Learning Saliency from Fixations. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2024; pp. 383–393.
26. Hosseini, A.; Kazerouni, A.; Akhavan, S.; Brudno, M.; Taati, B. SUM: Saliency Unification through Mamba for Visual Attention Modeling. arXiv 2024, arXiv:2406.17815.
27. Khan, Z.F.; Kannan, A. Intelligent Segmentation of Medical Images using Fuzzy Bitplane Thresholding. Meas. Sci. Rev. 2014, 14, 94–101.
28. Dubey, S.R.; Singh, S.K.; Singh, R.K. Local Bit-plane Decoded Pattern: A Novel Feature Descriptor for Biomedical Image Retrieval. IEEE J. Biomed. Health Inform. 2015, 20, 1139–1147.
29. Tuan, T.A.; Kim, J.Y.; Bao, P.T. 3D Brain Magnetic Resonance Imaging Segmentation by Using Bitplane and Adaptive Fast Marching. Int. J. Imaging Syst. Technol. 2018, 28, 223–230.
30. Vapnik, V. Statistical Learning Theory; Wiley: New York, NY, USA, 1998.
31. He, Z.; Xie, L.; Chen, X.; Zhang, Y.; Wang, Y.; Tian, Q. Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
32. Zhang, J.; Sclaroff, S. Exploiting Surroundedness for Saliency Detection: A Boolean Map Approach. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 889–902.
33. Koller, D.; Friedman, N. Probabilistic Graphical Models: Principles and Techniques; MIT Press: Cambridge, MA, USA, 2009.
34. Liu, Z.; Yuan, L.; Weng, L.; Yang, Y. A High Resolution Optical Satellite Image Dataset for Ship Recognition and Some New Baselines. Int. Conf. Pattern Recognit. Appl. Methods 2017, 2, 324–331.
35. Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q. Learning RoI Transformer for Oriented Object Detection in Aerial Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2849–2858.
36. Ma, J.; Shao, W.; Ye, H.; Wang, L.; Wang, H.; Zheng, Y.; Xue, X. Arbitrary-oriented Scene Text Detection via Rotation Proposals. IEEE Trans. Multimed. 2018, 20, 3111–3122.
37. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 764–773.
38. Azimi, S.M.; Vig, E.; Bahmanyar, R.; Körner, M.; Reinartz, P. Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery. In Asian Conference on Computer Vision; Springer International Publishing: Cham, Switzerland, 2018; pp. 150–165.
39. Fu, K.; Chang, Z.; Zhang, Y.; Xu, G.; Zhang, K.; Sun, X. Rotation-aware and Multi-scale Convolutional Neural Network for Object Detection in Remote Sensing Images. ISPRS J. Photogramm. Remote Sens. 2020, 161, 294–308.
40. Yang, X.; Sun, H.; Sun, X.; Yan, M.; Guo, Z.; Fu, K. Position Detection and Direction Prediction for Arbitrary-oriented Ships via Multiscale Rotation Region Convolutional Neural Network. IEEE Access 2018, 6, 50839–50849.
41. Ding, J.; Xue, N.; Xia, G.-S.; Bai, X.; Yang, W.; Yang, M.Y.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; et al. Object Detection in Aerial Images: A Large-scale Benchmark and Challenges. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 7778–7796.
42. Zhang, H.; Leng, W.; Han, X.; Sun, W. MOON: A Subspace-Based Multi-Branch Network for Object Detection in Remotely Sensed Images. Remote Sens. 2023, 15, 4201.
43. Yi, J.; Wu, P.; Liu, B.; Huang, Q.; Qu, H.; Metaxas, D. Oriented Object Detection in Aerial Images with Box Boundary-aware Vectors. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 2150–2159.
44. Zhang, H.; Xu, Z.; Han, X.; Sun, W. Refining FFT-based Heatmap for the Detection of Cluster Distributed Targets in Satellite Images. In Proceedings of the British Machine Vision Conference, Online, 22–25 November 2021.
45. Han, J.; Ding, J.; Xue, N.; Xia, G.S. Redet: A Rotation-equivariant Detector for Aerial Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 2786–2795.
46. Yang, X.; Yan, J.; Feng, Z.; He, T. R3det: Refined Single-stage Detector with Feature Refinement for Rotating Object. Proc. AAAI Conf. Artif. Intell. 2021, 35, 3163–3171.
Method | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RRPN [36] | 80.9 | 65.8 | 35.3 | 67.4 | 59.9 | 50.9 | 55.8 | 90.7 | 66.9 | 72.4 | 55.1 | 52.2 | 55.1 | 53.4 | 48.2 | 61.0 |
Yang et al. [40] | 81.3 | 71.4 | 36.5 | 67.4 | 61.2 | 50.9 | 56.6 | 90.7 | 68.1 | 72.4 | 55.1 | 55.6 | 62.4 | 53.4 | 51.5 | 62.3 |
DCN [37] | 85.0 | 75.7 | 31.6 | 73.4 | 58.9 | 46.1 | 60.7 | 89.6 | 75.2 | 76.8 | 52.3 | 61.7 | 47.6 | 58.7 | 44.0 | 62.5 |
ICN [38] | 81.4 | 74.3 | 47.7 | 70.3 | 64.9 | 67.8 | 70.0 | 90.8 | 79.1 | 78.2 | 53.6 | 62.9 | 67.0 | 64.2 | 50.2 | 68.2 |
FFA-3 [39] | 88.8 | 74.4 | 48.8 | 57.9 | 63.6 | 75.9 | 79.6 | 90.8 | 80.3 | 82.9 | 54.3 | 60.0 | 66.9 | 66.8 | 42.5 | 68.9 |
RT-Mxnet [35] | 88.6 | 78.5 | 43.4 | 75.9 | 68.8 | 73.7 | 83.6 | 90.7 | 77.3 | 81.5 | 58.4 | 53.5 | 62.8 | 58.9 | 47.7 | 69.6 |
RT-Pytorch [41] | 88.0 | 76.1 | 52.6 | 72.1 | 78.0 | 77.8 | 87.3 | 90.4 | 84.7 | 82.7 | 53.9 | 62.7 | 75.4 | 68.2 | 56.4 | 73.8 |
RT-Pytorch + GHM [42] | 88.7 | 77.4 | 53.9 | 77.4 | 77.6 | 77.6 | 87.7 | 90.8 | 86.8 | 85.6 | 61.9 | 60.1 | 76.1 | 70.5 | 64.3 | 75.8 |
RT-Pytorch + ROBIT | 89.2 | 84.1 | 54.8 | 76.0 | 78.4 | 82.8 | 87.7 | 90.8 | 84.5 | 85.4 | 65.6 | 63.2 | 77.1 | 71.6 | 59.5 | 76.7 |
Method | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RT-Pytorch | 88.0 | 76.1 | 52.6 | 72.1 | 78.0 | 77.8 | 87.3 | 90.4 | 84.7 | 82.7 | 53.9 | 62.7 | 75.4 | 68.2 | 56.4 | 73.8 |
RT-Pytorch + C&R | 89.4 | 78.2 | 54.3 | 72.7 | 72.4 | 76.5 | 87.7 | 90.8 | 80.5 | 85.7 | 60.1 | 61.9 | 76.0 | 72.4 | 58.7 | 74.5 |
RT-Pytorch + C&S | 88.9 | 76.4 | 54.3 | 76.8 | 73.4 | 76.7 | 87.4 | 90.8 | 86.9 | 78.7 | 64.7 | 63.0 | 75.8 | 71.1 | 55.2 | 74.7 |
RT-Pytorch + BIRD | 88.9 | 78.6 | 52.7 | 75.7 | 71.8 | 76.9 | 87.6 | 90.8 | 85.6 | 84.2 | 63.4 | 62.3 | 77.0 | 70.8 | 55.1 | 74.8 |
RT-Pytorch + rBIRD | 88.9 | 82.6 | 53.0 | 77.1 | 73.1 | 77.1 | 87.5 | 90.8 | 86.3 | 84.4 | 62.7 | 61.3 | 74.8 | 71.1 | 58.4 | 75.3 |
RT-Pytorch + IKD | 89.2 | 83.2 | 53.7 | 75.5 | 77.7 | 82.2 | 87.6 | 90.7 | 86.9 | 84.9 | 61.9 | 62.8 | 77.1 | 71.7 | 58.7 | 76.2 |
RT-Pytorch + ROBIT | 89.2 | 84.1 | 54.8 | 76.0 | 78.4 | 82.8 | 87.7 | 90.8 | 84.5 | 85.4 | 65.6 | 63.2 | 77.1 | 71.6 | 59.5 | 76.7 |
Method | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RT-Pytorch | 88.0 | 76.1 | 52.6 | 72.1 | 78.0 | 77.8 | 87.3 | 90.4 | 84.7 | 82.7 | 53.9 | 62.7 | 75.4 | 68.2 | 56.4 | 73.8 |
RT-Pytorch + | 88.9 | 78.2 | 54.9 | 75.7 | 78.1 | 77.9 | 87.5 | 90.9 | 86.3 | 85.4 | 61.0 | 65.5 | 77.2 | 71.8 | 58.0 | 75.8 |
RT-Pytorch + | 89.1 | 83.3 | 54.6 | 74.3 | 78.3 | 77.7 | 87.8 | 90.8 | 87.0 | 85.7 | 67.2 | 62.2 | 77.1 | 70.9 | 63.6 | 76.6 |
RT-Pytorch + | 89.3 | 84.7 | 54.7 | 76.7 | 78.4 | 82.7 | 87.6 | 90.7 | 87.1 | 85.5 | 64.9 | 62.2 | 76.9 | 71.5 | 56.6 | 76.6 |
RT-Pytorch + | 89.2 | 84.1 | 54.8 | 76.0 | 78.4 | 82.8 | 87.7 | 90.8 | 84.5 | 85.4 | 65.6 | 63.2 | 77.1 | 71.6 | 59.5 | 76.7 |
Method | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RT-Pytorch | 88.0 | 76.1 | 52.6 | 72.1 | 78.0 | 77.8 | 87.3 | 90.4 | 84.7 | 82.7 | 53.9 | 62.7 | 75.4 | 68.2 | 56.4 | 73.8 |
RT-Pytorch + BR | 88.9 | 78.6 | 52.7 | 75.7 | 71.8 | 76.9 | 87.6 | 90.8 | 85.6 | 84.2 | 63.4 | 62.3 | 77.0 | 70.8 | 55.1 | 74.8 |
RT-Pytorch + RF | 88.9 | 78.0 | 54.6 | 75.3 | 73.7 | 77.7 | 87.5 | 90.8 | 86.5 | 85.6 | 60.1 | 61.6 | 76.8 | 69.7 | 59.2 | 75.1 |
RT-Pytorch + ROBIT | 89.2 | 84.1 | 54.8 | 76.0 | 78.4 | 82.8 | 87.7 | 90.8 | 84.5 | 85.4 | 65.6 | 63.2 | 77.1 | 71.6 | 59.5 | 76.7 |
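
As a quick arithmetic check on the table above, the reported mAP for the RT-Pytorch and RT-Pytorch + ROBIT rows matches the unweighted mean of their 15 per-class AP values, rounded to one decimal. The short sketch below recomputes both values, the overall gain, and the class with the largest per-class gain; the variable and function names are illustrative only.

```python
classes = ["PL", "BD", "BR", "GTF", "SV", "LV", "SH", "TC",
           "BC", "ST", "SBF", "RA", "HA", "SP", "HC"]

# Per-class AP values copied from the RT-Pytorch and RT-Pytorch + ROBIT rows.
baseline = [88.0, 76.1, 52.6, 72.1, 78.0, 77.8, 87.3, 90.4,
            84.7, 82.7, 53.9, 62.7, 75.4, 68.2, 56.4]
robit = [89.2, 84.1, 54.8, 76.0, 78.4, 82.8, 87.7, 90.8,
         84.5, 85.4, 65.6, 63.2, 77.1, 71.6, 59.5]

def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)

baseline_map = round(mean_ap(baseline), 1)
robit_map = round(mean_ap(robit), 1)
gain = round(robit_map - baseline_map, 1)

# Per-class difference between the two rows, and the class that gains most.
per_class_gain = {name: round(r - b, 1)
                  for name, b, r in zip(classes, baseline, robit)}
best_class = max(per_class_gain, key=per_class_gain.get)

print(baseline_map, robit_map, gain, best_class)
```
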
Method | mAP |
---|---|
HeatNet | 74.8 |
HeatNet + only positive samples | 75.2 |
HeatNet + ROBIT | 75.5 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, H.; Han, X.; Sun, W. Region-Focusing Data Augmentation via Salient Region Activation and Bitplane Recombination for Target Detection. Remote Sens. 2024, 16, 4806. https://doi.org/10.3390/rs16244806