High-Performance Detection-Based Tracker for Multiple Object Tracking in UAVs
Abstract
1. Introduction
- We propose a balanced JDR network for multiple object tracking (MOT); its symmetric framework design ensures fairness between the detection task and the Re-ID task.
- To address irregular object motion, we use a set-membership filter (SMF) for trajectory state prediction and update, in which the state is described by a bounded set. Based on this bounded set, we design the appearance matching cascade (AMC) module, which selects candidates for appearance matching accurately and thereby reduces false matches.
- We propose a motion-mutation filter (MMF) module to address abrupt movement of the UAV; it determines the matching strategy based on the UAV motion state.
2. Related Work
2.1. Detection Method
2.2. Re-ID Method
2.3. Matching Method
2.3.1. Matching Based on Appearance Information
2.3.2. Matching Based on Motion Sign
2.3.3. Matching Based on Appearance Information and Motion Sign
2.4. Motion Model
3. Methodology
3.1. Overall Framework
3.2. JDR Network
3.3. Matching Model
3.3.1. Motion Module
- Initialization. Set the initial prior range .
- Prediction. For , the prior range is
- Update. For , given , the posterior range is
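
The three steps above can be illustrated with a minimal sketch. It assumes a scalar state, closed intervals as the bounded sets, a random-walk model with process noise bounded by `w_max`, and a direct measurement with noise bounded by `v_max`. The model, the names `Interval`, `predict`, `update`, and `run_filter`, and the fallback for an empty intersection are all illustrative assumptions, not the formulation used in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi] used as a one-dimensional bounded set."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("empty interval")


def predict(posterior: Interval, w_max: float) -> Interval:
    """Prior range: all states reachable from the posterior when |w| <= w_max."""
    return Interval(posterior.lo - w_max, posterior.hi + w_max)


def update(prior: Interval, y: float, v_max: float) -> Interval:
    """Posterior range: states in the prior consistent with y when |v| <= v_max."""
    lo = max(prior.lo, y - v_max)
    hi = min(prior.hi, y + v_max)
    if lo > hi:
        # Assumption: if the bounds are violated, fall back to the measurement set.
        return Interval(y - v_max, y + v_max)
    return Interval(lo, hi)


def run_filter(initial_prior: Interval, measurements, w_max: float, v_max: float):
    """Initialization, then alternate update and prediction over the measurements."""
    prior = initial_prior
    posteriors = []
    for k, y in enumerate(measurements):
        if k > 0:
            prior = predict(posteriors[-1], w_max)
        posteriors.append(update(prior, y, v_max))
    return posteriors


if __name__ == "__main__":
    for rng in run_filter(Interval(0.0, 10.0), [4.0, 5.0], w_max=1.0, v_max=2.0):
        print(rng)
```

Unlike a Kalman filter, this kind of estimator carries a guaranteed range rather than a mean and covariance, which is what makes the range usable as a hard gate for later matching.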
3.3.2. Appearance Matching Cascade Module
Algorithm 1. AMC algorithm
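
As a rough illustration of selecting candidates from a bounded set and then matching by appearance, and not a reproduction of Algorithm 1, the sketch below gates detections by whether their center lies inside a track's predicted box and then matches greedily on cosine similarity. The box-shaped gate, the greedy assignment, the threshold `min_similarity`, and every function name are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def inside(box, point):
    """True if point (x, y) lies in the box (x_lo, y_lo, x_hi, y_hi)."""
    x_lo, y_lo, x_hi, y_hi = box
    x, y = point
    return x_lo <= x <= x_hi and y_lo <= y <= y_hi


def gated_appearance_match(tracks, detections, min_similarity=0.5):
    """Match tracks to detections.

    tracks: {track_id: (bounded_box, embedding)}
    detections: list of (center, embedding)
    Returns {track_id: detection_index}.
    """
    scored = []
    for track_id, (box, track_emb) in tracks.items():
        for det_idx, (center, det_emb) in enumerate(detections):
            if not inside(box, center):
                continue  # outside the bounded set: never an appearance candidate
            sim = cosine_similarity(track_emb, det_emb)
            if sim >= min_similarity:
                scored.append((sim, track_id, det_idx))

    # Greedy one-to-one assignment, most similar pairs first.
    scored.sort(key=lambda item: item[0], reverse=True)
    matches, used = {}, set()
    for _, track_id, det_idx in scored:
        if track_id in matches or det_idx in used:
            continue
        matches[track_id] = det_idx
        used.add(det_idx)
    return matches


if __name__ == "__main__":
    tracks = {
        1: ((0, 0, 10, 10), [1.0, 0.0]),
        2: ((20, 20, 30, 30), [0.0, 1.0]),
    }
    detections = [
        ((25, 25), [1.0, 0.0]),  # looks like track 1 but lies outside its set
        ((5, 5), [0.9, 0.1]),
        ((22, 28), [0.1, 0.9]),
    ]
    print(gated_appearance_match(tracks, detections))
```

In the usage example, the first detection has the same embedding as track 1 but is excluded by the gate, which is the kind of false match that restricting candidates is meant to prevent.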
3.3.3. Motion-Mutation Filter Module
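
The idea of choosing a matching strategy from the UAV motion state can be sketched as follows. This is a purely hypothetical illustration: the motion-state estimate (median displacement of corresponding object centers between two frames), the threshold `mutation_threshold`, and the two strategy labels are assumptions introduced here, not the authors' definitions.

```python
import math
from statistics import median


def apparent_shift(prev_centers, curr_centers):
    """Median displacement (in pixels) between corresponding centers."""
    dists = [
        math.hypot(cx - px, cy - py)
        for (px, py), (cx, cy) in zip(prev_centers, curr_centers)
    ]
    return median(dists) if dists else 0.0


def choose_strategy(prev_centers, curr_centers, mutation_threshold=20.0):
    """Pick a matching strategy label from the estimated motion state."""
    if apparent_shift(prev_centers, curr_centers) > mutation_threshold:
        # Abrupt platform motion: motion predictions are assumed unreliable.
        return "appearance_only"
    return "motion_then_appearance"


if __name__ == "__main__":
    prev = [(0, 0), (10, 10), (5, 5)]
    print(choose_strategy(prev, [(1, 0), (11, 10), (6, 5)]))
    print(choose_strategy(prev, [(30, 40), (40, 50), (35, 45)]))
```

The median is used so that a few fast-moving objects do not trigger the abrupt-motion branch on their own; a whole-frame shift moves most centers at once.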
4. Experiments
4.1. Datasets and Metrics
4.2. Implementation Details
4.3. Comparison with State-of-the-Art Methods
4.4. Ablation Study
4.4.1. Analysis of SMF Module
4.4.2. Analysis of AMC Module
4.4.3. Analysis of MMF Module
4.5. Qualitative Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Bewley, A.; Ge, Z.; Ott, L.; Ramos, F.; Upcroft, B. Simple online and realtime tracking. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3464–3468.
- Wojke, N.; Bewley, A.; Paulus, D. Simple online and realtime tracking with a deep association metric. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3645–3649.
- Zhang, Y.; Wang, C.; Wang, X.; Zeng, W.; Liu, W. FairMOT: On the fairness of detection and re-identification in multiple object tracking. Int. J. Comput. Vis. 2021, 129, 3069–3087.
- Yu, F.; Li, W.; Li, Q.; Liu, Y.; Shi, X.; Yan, J. POI: Multiple Object Tracking with High Performance Detection and Appearance Feature. In Proceedings of the Computer Vision – ECCV 2016 Workshops, Amsterdam, The Netherlands, 8–10 October 2016; Hua, G., Jégou, H., Eds.; Springer: Cham, Switzerland, 2016; pp. 36–42.
- Liu, S.; Li, X.; Lu, H.; He, Y. Multi-Object Tracking Meets Moving UAV. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 8876–8885.
- Wang, C.; Wang, Y.; Yuille, A.L. An Approach to Pose-Based Action Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, USA, 23–28 June 2013.
- Braso, G.; Leal-Taixe, L. Learning a Neural Solver for Multiple Object Tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
- Huang, C.; Wu, B.; Nevatia, R. Robust Object Tracking by Hierarchical Association of Detection Responses. In Proceedings of the Computer Vision – ECCV 2008, Marseille, France, 12–18 October 2008; Forsyth, D., Torr, P., Zisserman, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2008; pp. 788–801.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
- Cong, Y.; Wang, X.; Zhou, X. Rethinking the Mathematical Framework and Optimality of Set-Membership Filtering. IEEE Trans. Autom. Control 2022, 67, 2544–2551.
- Wen, L.; Zhu, P.; Du, D.; Bian, X.; Ling, H.; Hu, Q.; Zheng, J.; Peng, T.; Wang, X.; Zhang, Y.; et al. VisDrone-MOT2019: The Vision Meets Drone Multiple Object Tracking Challenge Results. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Seoul, Republic of Korea, 27–28 October 2019.
- Zhang, Y.; Sun, P.; Jiang, Y.; Yu, D.; Weng, F.; Yuan, Z.; Luo, P.; Liu, W.; Wang, X. ByteTrack: Multi-object Tracking by Associating Every Detection Box. In Proceedings of the Computer Vision – ECCV 2022, Tel Aviv, Israel, 23–27 October 2022; Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T., Eds.; Springer: Cham, Switzerland, 2022; pp. 1–21.
- Wang, Z.; Zheng, L.; Liu, Y.; Li, Y.; Wang, S. Towards Real-Time Multi-Object Tracking. In Proceedings of the Computer Vision – ECCV 2020, Glasgow, UK, 23–28 August 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M., Eds.; Springer: Cham, Switzerland, 2020; pp. 107–122.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Law, H.; Deng, J. CornerNet: Detecting Objects as Paired Keypoints. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Duan, K.; Bai, S.; Xie, L.; Qi, H.; Huang, Q.; Tian, Q. CenterNet: Keypoint Triplets for Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October 2019.
- Voigtlaender, P.; Krause, M.; Osep, A.; Luiten, J.; Sekar, B.B.G.; Geiger, A.; Leibe, B. MOTS: Multi-Object Tracking and Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
- He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Sadeghian, A.; Alahi, A.; Savarese, S. Tracking the Untrackable: Learning to Track Multiple Cues With Long-Term Dependencies. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Tang, S.; Andriluka, M.; Andres, B.; Schiele, B. Multiple People Tracking by Lifted Multicut and Person Re-Identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
- Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. 1960, 82, 35–45.
- Bochinski, E.; Eiselein, V.; Sikora, T. High-Speed tracking-by-detection without using image information. In Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, Italy, 29 August 2017; pp. 1–6.
- Chen, L.; Ai, H.; Zhuang, Z.; Shang, C. Real-Time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification. In Proceedings of the 2018 IEEE International Conference on Multimedia and Expo (ICME), San Diego, CA, USA, 23–27 July 2018; pp. 1–6.
- Yang, F.; Wang, Z.; Wu, Y.; Sakti, S.; Nakamura, S. Tackling multiple object tracking with complicated motions – Re-designing the integration of motion and appearance. Image Vis. Comput. 2022, 124, 104514.
- Scott, J.K.; Raimondo, D.M.; Marseglia, G.R.; Braatz, R.D. Constrained zonotopes: A new tool for set-based estimation and fault detection. Automatica 2016, 69, 126–136.
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision – ECCV 2014, Zurich, Switzerland, 6–12 September 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Springer: Cham, Switzerland, 2014; pp. 740–755.
- Robbins, H.; Monro, S. A stochastic approximation method. Ann. Math. Stat. 1951, 22, 400–407.
- Pirsiavash, H.; Ramanan, D.; Fowlkes, C.C. Globally-optimal greedy algorithms for tracking a variable number of objects. In Proceedings of the CVPR 2011, Colorado Springs, CO, USA, 20–25 June 2011; pp. 1201–1208.
- Zeng, F.; Dong, B.; Zhang, Y.; Wang, T.; Zhang, X.; Wei, Y. MOTR: End-to-end multiple-object tracking with transformer. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 659–675.
- Meinhardt, T.; Kirillov, A.; Leal-Taixé, L.; Feichtenhofer, C. TrackFormer: Multi-Object Tracking with Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 8844–8854.

Method | MOTA ↑ (%) | MOTP ↑ (%) | IDF1 ↑ (%) | MT ↑ | ML ↓ | FP ↓ | FN ↓ | IDs ↓ | FM ↓ |
---|---|---|---|---|---|---|---|---|---|
MOTDT [23] | −0.8 | 68.5 | 21.6 | 87 | 1196 | 44,548 | 185,453 | 1437 | 3609 |
SORT [1] | 14.0 | 73.2 | 38.0 | 506 | 545 | 80,845 | 112,954 | 3629 | 4838 |
IOUT [22] | 28.1 | 74.7 | 38.9 | 467 | 670 | 36,158 | 126,549 | 2393 | 3829 |
GOG [28] | 28.7 | 76.1 | 36.4 | 346 | 836 | 17,706 | 144,657 | 1387 | 2237 |
MOTR [29] | 22.8 | 72.8 | 41.4 | 272 | 825 | 28,407 | 147,937 | 959 | 3980 |
TrackFormer [30] | 25.0 | 73.9 | 30.5 | 385 | 770 | 25,856 | 141,526 | 4840 | 4855 |
UAVMOT [5] | 36.1 | 74.2 | 51.0 | 520 | 574 | 27,983 | 115,925 | 2775 | 7396 |
Ours | 43.6 | 76.4 | 54.9 | 656 | 469 | 32,599 | 86,654 | 2113 | 4042 |

SMF | AMC | MMF | MOTA ↑ (%) | MOTP ↑ (%) | IDF1 ↑ (%) | IDs ↓ | IDP ↑ (%) | IDR ↑ (%) |
---|---|---|---|---|---|---|---|---|
✕ | ✕ | ✕ | 42.3 | 75.3 | 53.9 | 2314 | 61.7 | 47.9 |
√ | ✕ | ✕ | 42.5 | 76.6 | 49.7 | 3200 | 60.0 | 42.4 |
√ | √ | ✕ | 43.3 | 76.4 | 53.9 | 2155 | 63.1 | 47.0 |
√ | √ | √ | 43.6 | 76.4 | 54.9 | 2113 | 64.1 | 48.0 |
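
As a consistency check on the ablation table above, recall that IDF1 is by definition the harmonic mean of IDP and IDR. The short sketch below recomputes it from the reported IDP and IDR columns; the function name `idf1` and the row labels are illustrative, and the numbers are copied from the table.

```python
def idf1(idp: float, idr: float) -> float:
    """Harmonic mean of ID precision and ID recall (both in percent)."""
    if idp + idr == 0:
        return 0.0
    return 2 * idp * idr / (idp + idr)


# (label, IDP, IDR, reported IDF1) taken from the ablation table.
rows = [
    ("baseline", 61.7, 47.9, 53.9),
    ("+SMF", 60.0, 42.4, 49.7),
    ("+SMF+AMC", 63.1, 47.0, 53.9),
    ("+SMF+AMC+MMF", 64.1, 48.0, 54.9),
]

if __name__ == "__main__":
    for label, idp, idr, reported in rows:
        print(f"{label}: recomputed {idf1(idp, idr):.1f}, reported {reported}")
```

Rounded to one decimal place, the recomputed values agree with the IDF1 column in every row.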

Method | Prcn ↑ (%) | FP ↓ |
---|---|---|
Baseline | 77.9 | 36,868 |
Baseline + SMF | 81.2 | 28,614 |

Method | IDF1 ↑ (%) | IDs ↓ | IDP ↑ (%) | IDR ↑ (%) |
---|---|---|---|---|
Baseline + SMF | 49.7 | 3200 | 60.0 | 42.4 |
Baseline + SMF + AMC | 53.9 | 2155 | 63.1 | 47.0 |

Method | IDF1 ↑ (%) | IDs ↓ | IDP ↑ (%) | IDR ↑ (%) |
---|---|---|---|---|
Baseline + SMF + AMC | 53.9 | 2155 | 63.1 | 47.0 |
Baseline + SMF + AMC + MMF | 54.9 | 2113 | 64.1 | 48.0 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, X.; Zhu, R.; Yu, X.; Wang, X. High-Performance Detection-Based Tracker for Multiple Object Tracking in UAVs. Drones 2023, 7, 681. https://doi.org/10.3390/drones7110681