Learning Response-Consistent and Background-Suppressed Correlation Filters for Real-Time UAV Tracking
Abstract
1. Introduction
- A novel response-consistent module is proposed to minimize the difference between the responses generated by the filter in adjacent frames. This avoids mutations of the target response caused by background interference and thus enhances the discriminative ability of the learned filter.
- A novel background-suppressed module is proposed, which employs an attention mask matrix to make the learned filter recognize background information and thus repress the interference caused by distractors.
- Extensive experiments on three challenging public UAV benchmarks demonstrate that our tracker achieves excellent performance against 22 other state-of-the-art trackers while running in real time on a single CPU.
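To make the two contributions concrete, the following is a minimal NumPy sketch of the two ideas: a penalty on the difference between adjacent-frame response maps, and an attention-style mask that zeroes the target region so that background energy can be penalized. The function names, the quadratic loss form, and the box-mask construction are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def correlation_response(h_hat, x_hat):
    """Response map of filter h on sample x, both given in the Fourier domain."""
    return np.real(np.fft.ifft2(np.conj(h_hat) * x_hat))

def consistency_penalty(resp_t, resp_prev, gamma):
    """Response-consistent term: penalize abrupt changes between the
    response maps of adjacent frames (illustrative quadratic form)."""
    return gamma * np.sum((resp_t - resp_prev) ** 2)

def background_mask(shape, target_box, outside_weight=1.0):
    """Attention-style mask: 0 inside a (hypothetical) target box,
    `outside_weight` elsewhere, so background energy can be suppressed."""
    m = np.full(shape, outside_weight)
    y, x, h, w = target_box
    m[y:y + h, x:x + w] = 0.0
    return m

# Toy usage on random data.
rng = np.random.default_rng(0)
x_hat = np.fft.fft2(rng.standard_normal((32, 32)))
h_hat = np.fft.fft2(rng.standard_normal((32, 32)))
resp = correlation_response(h_hat, x_hat)
prev = np.roll(resp, 1, axis=0)            # stand-in for the previous frame's response
mask = background_mask((32, 32), (12, 12, 8, 8))
print(consistency_penalty(resp, prev, gamma=1.0))  # non-negative by construction
print(mask[14, 14], mask[0, 0])                    # 0.0 inside the box, 1.0 outside
```

In the actual objective, terms of this kind are added to the standard DCF ridge-regression loss and solved with ADMM (Section 3.5); the sketch only illustrates how a consistency penalty and a background mask enter the picture.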
2. Related Work
2.1. Discriminative Correlation Filter Methods
2.2. Tracking with Reducing Boundary Effect
2.3. Tracking with Response-Based Approach
2.4. Tracking with Background Distractors Suppression
3. Proposed Method
3.1. Baseline Tracker
3.2. Response-Consistent Module
3.3. Background-Suppressed Module
3.4. Overall Objective Function
3.5. Optimization
3.6. Updating Appearance Model
3.7. Target Localization and Scale Estimation
3.8. Tracking Pipeline
Algorithm 1: Response-Consistent and Background-Suppressed Correlation Filter (RCBSCF)
4. Experiments
4.1. Implementation Details
4.2. Benchmark and Metric
4.3. Comparison with Handcrafted-Based Features
4.3.1. Analysis of Tracking Performance
4.3.2. Attribute-Based Analysis
4.4. Comparison with Deep-Based Features
4.5. Ablation Study
4.6. Analysis of Key Parameters
- (1) Parameters of the response-consistent module: Figure 9 reports the results of the proposed tracker when the two parameters of the response-consistent module are varied while the parameter of the background-suppressed module is fixed at 6. The first parameter varies from 2 to 44 with a step size of 2, and the second from 2 to 22 with a step size of 1. We first increase the first parameter while the second is set to 16.4; the best precision and success rates are obtained when it reaches 37. We then analyze the second parameter with the first fixed at 37. As Figure 9 shows, both the precision and success rates peak when the second parameter is set to 16.4.
- (2) Parameter of the background-suppressed module: To study its impact, we vary the parameter of the background-suppressed module while the two parameters of the response-consistent module are fixed at 37 and 16.4, respectively. It varies from 1 to 12 with a step size of 1. As Figure 9 shows, both the precision and success rates reach their best scores when it is set to 6.
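The one-at-a-time parameter sweep described above can be sketched as follows. `sweep` and the toy `dummy_precision` objective are illustrative stand-ins; in the actual experiments, the score would come from running the tracker on the benchmark.

```python
# Hypothetical one-dimensional grid search: fix all other parameters, vary one
# over a grid, keep the value with the best score.
def sweep(values, evaluate):
    scores = {v: evaluate(v) for v in values}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy objective peaking near 37, standing in for benchmark precision.
dummy_precision = lambda v: -abs(v - 37)

best, score = sweep(range(2, 45, 2), dummy_precision)
print(best)  # grid value closest to 37
```

Repeating such a sweep for each key parameter in turn (as in Section 4.6) gives the settings used in the final tracker.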
4.7. Failure Cases
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Gracchi, T.; Rossi, G.; Tacconi Stefanelli, C.; Tanteri, L.; Pozzani, R.; Moretti, S. Tracking the evolution of riverbed morphology on the basis of UAV photogrammetry. Remote Sens. 2021, 13, 829.
- Lin, S.; Garratt, M.A.; Lambert, A.J. Monocular vision-based real-time target recognition and tracking for autonomously landing an UAV in a cluttered shipboard environment. Auton. Robot. 2017, 41, 881–901.
- Xu, C.; Huang, D.; Liu, J. Target location of unmanned aerial vehicles based on the electro-optical stabilization and tracking platform. Measurement 2019, 147, 106848.
- Lo, L.Y.; Yiu, C.H.; Tang, Y.; Yang, A.S.; Li, B.; Wen, C.Y. Dynamic Object Tracking on Autonomous UAV System for Surveillance Applications. Sensors 2021, 21, 7888.
- Wang, J.; Simeonova, S.; Shahbazi, M. Orientation- and scale-invariant multi-vehicle detection and tracking from unmanned aerial videos. Remote Sens. 2019, 11, 2155.
- Danelljan, M.; Häger, G.; Khan, F.S.; Felsberg, M. Learning spatially regularized correlation filters for visual tracking. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, 7–13 December 2015.
- Lukezic, A.; Vojir, T.; Cehovin Zajc, L.; Matas, J.; Kristan, M. Discriminative correlation filter with channel and spatial reliability. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017.
- Li, F.; Tian, C.; Zuo, W.; Zhang, L.; Yang, M.H. Learning spatial-temporal regularized correlation filters for visual tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, 18–22 June 2018.
- Dai, K.; Wang, D.; Lu, H.; Sun, C.; Li, J. Visual tracking via adaptive spatially-regularized correlation filters. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, 16–20 June 2019.
- Li, Y.; Fu, C.; Ding, F.; Huang, Z.; Lu, G. AutoTrack: Towards high-performance visual tracking for UAV with automatic spatio-temporal regularization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, 13–19 June 2020.
- Kiani Galoogahi, H.; Sim, T.; Lucey, S. Correlation filters with limited boundaries. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, 7–12 June 2015.
- Kiani Galoogahi, H.; Fagg, A.; Lucey, S. Learning background-aware correlation filters for visual tracking. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017.
- Huang, Z.; Fu, C.; Li, Y.; Lin, F.; Lu, P. Learning aberrance repressed correlation filters for real-time UAV tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Republic of Korea, 27 October–2 November 2019.
- Zhang, F.; Ma, S.; Qiu, Z.; Qi, T. Learning target-aware background-suppressed correlation filters with dual regression for real-time UAV tracking. Signal Process. 2022, 191, 108352.
- Ye, J.; Fu, C.; Lin, F.; Ding, F.; An, S.; Lu, G. Multi-regularized correlation filter for UAV tracking and self-localization. IEEE Trans. Ind. Electron. 2021, 69, 6004–6014.
- Mueller, M.; Smith, N.; Ghanem, B. Context-aware correlation filter tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017.
- Zhang, W.; Jiao, L.; Li, Y.; Liu, J. Sparse learning-based correlation filter for robust tracking. IEEE Trans. Image Process. 2020, 30, 878–891.
- Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2011, 3, 1–122.
- Mueller, M.; Smith, N.; Ghanem, B. A benchmark and simulator for UAV tracking. In Proceedings of the Computer Vision—ECCV 2016—14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016.
- Li, S.; Yeung, D.Y. Visual object tracking for unmanned aerial vehicles: A benchmark and new motion models. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
- Du, D.; Qi, Y.; Yu, H.; Yang, Y.; Duan, K.; Li, G.; Zhang, W.; Huang, Q.; Tian, Q. The unmanned aerial vehicle benchmark: Object detection and tracking. In Proceedings of the Computer Vision—ECCV 2018—15th European Conference, Munich, Germany, 8–14 September 2018.
- Bolme, D.S.; Beveridge, J.R.; Draper, B.A.; Lui, Y.M. Visual object tracking using adaptive correlation filters. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, USA, 13–18 June 2010.
- Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. Exploiting the circulant structure of tracking-by-detection with kernels. In Proceedings of the Computer Vision—ECCV 2012—12th European Conference on Computer Vision, Florence, Italy, 7–13 October 2012.
- Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. High-speed tracking with kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 583–596.
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, San Diego, CA, USA, 20–26 June 2005.
- Danelljan, M.; Shahbaz Khan, F.; Felsberg, M.; Van de Weijer, J. Adaptive color attributes for real-time visual tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, Columbus, OH, USA, 23–28 June 2014.
- Li, Y.; Zhu, J. A scale adaptive kernel correlation filter tracker with feature integration. In Proceedings of the Computer Vision—ECCV 2014 Workshops, Zurich, Switzerland, 6–7 September 2014.
- Van De Weijer, J.; Schmid, C.; Verbeek, J.; Larlus, D. Learning color names for real-world applications. IEEE Trans. Image Process. 2009, 18, 1512–1523.
- Bertinetto, L.; Valmadre, J.; Golodetz, S.; Miksik, O.; Torr, P.H. Staple: Complementary learners for real-time tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016.
- Danelljan, M.; Robinson, A.; Shahbaz Khan, F.; Felsberg, M. Beyond correlation filters: Learning continuous convolution operators for visual tracking. In Proceedings of the Computer Vision—ECCV 2016—14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016.
- Danelljan, M.; Bhat, G.; Shahbaz Khan, F.; Felsberg, M. ECO: Efficient Convolution Operators for Tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017.
- Xu, T.; Feng, Z.H.; Wu, X.J.; Kittler, J. Joint group feature selection and discriminative filter learning for robust visual object tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Republic of Korea, 27 October–2 November 2019.
- Lin, F.; Fu, C.; He, Y.; Guo, F.; Tang, Q. Learning temporary block-based bidirectional incongruity-aware correlation filters for efficient UAV object tracking. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 2160–2174.
- Lin, F.; Fu, C.; He, Y.; Xiong, W.; Li, F. ReCF: Exploiting Response Reasoning for Correlation Filters in Real-Time UAV Tracking. IEEE Trans. Intell. Transp. Syst. 2021, 23, 10469–10480.
- Fu, C.; Ding, F.; Li, Y.; Jin, J.; Feng, C. Learning dynamic regression with automatic distractor repression for real-time UAV tracking. Eng. Appl. Artif. Intell. 2021, 98, 104116.
- Huang, B.; Xu, T.; Shen, Z.; Jiang, S.; Li, J. BSCF: Learning background suppressed correlation filter tracker for wireless multimedia sensor networks. Ad Hoc Netw. 2021, 111, 102340.
- Liu, H.; Li, B. Target tracker with masked discriminative correlation filter. IET Image Process. 2020, 14, 2227–2234.
- Wang, W.; Zhang, K.; Lv, M.; Wang, J. Discriminative visual tracking via spatially smooth and steep correlation filters. Inf. Sci. 2021, 578, 147–165.
- Huo, Y.; Wang, Y.; Yan, X.; Dai, K. Soft mask correlation filter for visual object tracking. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018.
- Xu, T.; Feng, Z.H.; Wu, X.J.; Kittler, J. Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking. IEEE Trans. Image Process. 2019, 28, 5596–5609.
- Wang, N.; Zhou, W.; Tian, Q.; Hong, R.; Wang, M.; Li, H. Multi-cue correlation filters for robust visual tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, 18–22 June 2018.
- Wang, C.; Zhang, L.; Xie, L.; Yuan, J. Kernel cross-correlator. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
- Danelljan, M.; Häger, G.; Khan, F.S.; Felsberg, M. Discriminative scale space tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1561–1575.
- Bertinetto, L.; Valmadre, J.; Henriques, J.F.; Vedaldi, A.; Torr, P.H. Fully-Convolutional Siamese Networks for Object Tracking. In Proceedings of the Computer Vision—ECCV 2016 Workshops, Amsterdam, The Netherlands, 11–14 October 2016.
- Li, F.; Yao, Y.; Li, P.; Zhang, D.; Zuo, W.; Yang, M.H. Integrating boundary and center correlation filters for visual tracking with aspect ratio variation. In Proceedings of the IEEE International Conference on Computer Vision Workshops, ICCV Workshops 2017, Venice, Italy, 22–29 October 2017.
- Wang, N.; Song, Y.; Ma, C.; Zhou, W.; Liu, W.; Li, H. Unsupervised deep tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, 16–20 June 2019.
- Choi, J.; Chang, H.J.; Fischer, T.; Yun, S.; Lee, K.; Jeong, J.; Demiris, Y.; Choi, J.Y. Context-aware deep feature compression for high-speed visual tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, 18–22 June 2018.
- Guo, Q.; Feng, W.; Zhou, C.; Huang, R.; Wan, L.; Wang, S. Learning dynamic siamese network for visual object tracking. In Proceedings of the IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 22–29 October 2017.
- Wang, N.; Zhou, W.; Song, Y.; Ma, C.; Liu, W.; Li, H. Unsupervised deep representation learning for real-time tracking. Int. J. Comput. Vis. 2021, 129, 400–418.
Trackers | Source | Avg. Success Rate | Avg. Precision Rate | Avg. FPS | CPU |
---|---|---|---|---|---|
RCBSCF | This work | 0.485 | 0.711 | 36.21 | √ |
DR2Track [35] | EAAI21 | 0.447 | 0.657 | 28.47 | √ |
AutoTrack [10] | CVPR20 | 0.472 | 0.707 | 28.97 | √ |
ARCF [13] | ICCV19 | 0.471 | 0.700 | 24.08 | √ |
LADCF-HC [40] | TIP19 | 0.444 | 0.651 | 19.36 | √ |
STRCF [8] | CVPR18 | 0.436 | 0.637 | 21.83 | √ |
MCCT-H [41] | CVPR18 | 0.416 | 0.626 | 47.03 | √ |
KCC [42] | AAAI18 | 0.354 | 0.546 | 28.9 | √ |
Staple_CA [16] | CVPR17 | 0.388 | 0.596 | 47.29 | √ |
BACF [12] | ICCV17 | 0.416 | 0.617 | 41.49 | √ |
fDSST [43] | TPAMI16 | 0.376 | 0.579 | 173.25 | √ |
SRDCF [6] | ICCV15 | 0.403 | 0.588 | 9.01 | √ |
KCF [24] | TPAMI15 | 0.279 | 0.483 | 574.78 | √ |
SAMF [27] | ECCV14 | 0.332 | 0.519 | 8.51 | √ |
Trackers | UAV123@10fps: BC | SOB | CM | FM | FOC | POC | DTB70: BC | SOA | OCC | FCM | UAVDT: BC | CM | LO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DR2Track [35] | 0.267 | 0.424 | 0.413 | 0.321 | 0.200 | 0.356 | 0.366 | 0.492 | 0.420 | 0.505 | 0.348 | 0.367 | 0.335 |
AutoTrack [10] | 0.313 | 0.471 | 0.460 | 0.341 | 0.240 | 0.407 | 0.379 | 0.470 | 0.412 | 0.490 | 0.403 | 0.438 | 0.367 |
ARCF [13] | 0.291 | 0.462 | 0.435 | 0.326 | 0.231 | 0.408 | 0.377 | 0.484 | 0.446 | 0.496 | 0.414 | 0.450 | 0.387 |
LADCF-HC [40] | 0.301 | 0.461 | 0.467 | 0.340 | 0.237 | 0.391 | 0.350 | 0.458 | 0.447 | 0.474 | 0.382 | 0.415 | 0.356 |
STRCF [8] | 0.317 | 0.455 | 0.443 | 0.328 | 0.232 | 0.389 | 0.369 | 0.447 | 0.400 | 0.467 | 0.346 | 0.372 | 0.319 |
MCCT-H [41] | 0.285 | 0.439 | 0.397 | 0.271 | 0.234 | 0.372 | 0.325 | 0.426 | 0.384 | 0.436 | 0.345 | 0.368 | 0.327 |
KCC [42] | 0.237 | 0.376 | 0.331 | 0.217 | 0.185 | 0.302 | 0.191 | 0.298 | 0.279 | 0.306 | 0.348 | 0.367 | 0.304 |
Staple_CA [16] | 0.297 | 0.454 | 0.384 | 0.217 | 0.220 | 0.354 | 0.200 | 0.357 | 0.360 | 0.368 | 0.345 | 0.367 | 0.324 |
BACF [12] | 0.273 | 0.418 | 0.392 | 0.274 | 0.183 | 0.323 | 0.339 | 0.417 | 0.346 | 0.432 | 0.379 | 0.401 | 0.331 |
fDSST [43] | 0.209 | 0.384 | 0.325 | 0.240 | 0.197 | 0.314 | 0.225 | 0.351 | 0.310 | 0.388 | 0.328 | 0.376 | 0.332 |
SRDCF [6] | 0.263 | 0.421 | 0.399 | 0.311 | 0.229 | 0.355 | 0.256 | 0.379 | 0.310 | 0.398 | 0.358 | 0.390 | 0.327 |
KCF [24] | 0.125 | 0.278 | 0.209 | 0.145 | 0.135 | 0.223 | 0.182 | 0.275 | 0.270 | 0.280 | 0.240 | 0.271 | 0.229 |
SAMF [27] | 0.167 | 0.343 | 0.274 | 0.217 | 0.190 | 0.279 | 0.221 | 0.319 | 0.321 | 0.333 | 0.275 | 0.307 | 0.255 |
RCBSCF (Ours) | 0.319 | 0.483 | 0.470 | 0.361 | 0.257 | 0.434 | 0.391 | 0.495 | 0.476 | 0.510 | 0.417 | 0.453 | 0.391 |
Trackers | Source | Avg. Success Rate | Avg. Precision Rate | Avg. FPS | GPU |
---|---|---|---|---|---|
RCBSCF | This work | 0.485 | 0.711 | 36.21 | × |
ECO [31] | CVPR17 | 0.493 | 0.716 | 11.80 | √ |
SiamFC [44] | ECCV16 | 0.479 | 0.706 | 40.10 | √ |
IBCCF [45] | ICCV17 | 0.445 | 0.648 | 2.40 | √ |
DSiam [48] | ICCV17 | 0.468 | 0.696 | 17.82 | √ |
TRACA [47] | CVPR18 | 0.400 | 0.594 | 77.83 | √ |
UDT [46] | CVPR19 | 0.433 | 0.622 | 48.53 | √ |
UDT+ [46] | CVPR19 | 0.454 | 0.683 | 34.83 | √ |
LUDT [49] | IJCV21 | 0.416 | 0.593 | 45.27 | √ |
LUDT+ [49] | IJCV21 | 0.448 | 0.681 | 33.69 | √ |
Metric | RCBSCF-N | RCBSCF-RC | RCBSCF-BS | RCBSCF |
---|---|---|---|---|
Success rate | 0.453 | 0.466 | 0.475 | 0.492 |
Precision rate | 0.628 | 0.649 | 0.665 | 0.682 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, H.; Li, Y.; Liu, H.; Yuan, D.; Yang, Y. Learning Response-Consistent and Background-Suppressed Correlation Filters for Real-Time UAV Tracking. Sensors 2023, 23, 2980. https://doi.org/10.3390/s23062980