Weighted Kernel Filter Based Anti-Air Object Tracking for Thermal Infrared Systems
Abstract
1. Introduction
- (1) We create a new dataset of anti-air TIR images and propose an image-enhancement method for anti-air targets.
- (2) We propose a Siamese architecture with a modified backbone and fused RPNs. The network is trained fully end-to-end, without parameters pre-trained on the RGB domain.
- (3) In the inference phase, we utilize weighted kernel filters that are updated at every frame (a code sketch of the RPN fusion and this per-frame update follows this list).
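The following minimal PyTorch sketch illustrates the two ideas named in contributions (2) and (3): fusing the response maps of several RPN heads by a weighted sum, and refreshing the template ("kernel") features at every frame during inference. The depth-wise cross-correlation, the softmax weighting, and the exponential moving-average update rule are illustrative assumptions; the exact formulations are the subject of Sections 3.2.2 and 3.3.1.

```python
import torch
import torch.nn.functional as F

def depthwise_xcorr(search_feat: torch.Tensor, kernel_feat: torch.Tensor) -> torch.Tensor:
    """Depth-wise cross-correlation of search features with template (kernel) features."""
    b, c, h, w = search_feat.shape
    x = search_feat.reshape(1, b * c, h, w)                     # fold batch into channels
    k = kernel_feat.reshape(b * c, 1, *kernel_feat.shape[2:])   # one filter per channel
    out = F.conv2d(x, k, groups=b * c)
    return out.reshape(b, c, out.shape[2], out.shape[3])

def fuse_rpn_maps(maps, weights):
    """Weighted sum of response maps coming from several RPN heads (same spatial size assumed)."""
    w = torch.softmax(torch.as_tensor(weights, dtype=torch.float32), dim=0)
    return sum(wi * m for wi, m in zip(w, maps))

def update_kernel(prev_kernel: torch.Tensor, new_kernel: torch.Tensor, lr: float = 0.1) -> torch.Tensor:
    """Per-frame template refresh; an exponential moving average is assumed here."""
    return (1.0 - lr) * prev_kernel + lr * new_kernel

# Toy usage: two RPN heads producing 25x25 classification maps for 10 anchor channels.
cls_maps = [torch.randn(1, 10, 25, 25), torch.randn(1, 10, 25, 25)]
fused_cls = fuse_rpn_maps(cls_maps, weights=[0.6, 0.4])
```

The softmax keeps the fusion weights positive and summing to one, so individual RPN heads can be emphasized without rescaling the fused response map.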
2. Related Work
2.1. Datasets
2.2. Trackers
2.3. Deep Architectures
3. Proposed Method
3.1. Anti-Air TIR Dataset
3.1.1. Data Collection
3.1.2. Preprocessing
3.2. Training: Siamese-Based Deep Network with RPNs
3.2.1. Siamese-Based Feature Extraction Network
3.2.2. Weighted Sum of the Region Proposal Networks
3.2.3. Loss and Optimization Strategy
3.3. Inference Process
3.3.1. Kernel Filter
3.3.2. Box Decoding and Selection
4. Experimental Results
4.1. Implementation Details
4.2. Ablation Analysis
4.2.1. Ablation Analysis on Selecting the Feature Extraction Layers
4.2.2. Ablation Analysis on Adopting the Weighted Kernel Filters
4.2.3. Ablation Analysis on Adopting the Preprocessing Method
4.3. Evaluation Methodology
4.4. Evaluation Using the Anti-Air TIR Dataset
4.5. Qualitative Evaluation
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
| Template Input | Search Input | Operator | t | c | n | s | Remark |
|---|---|---|---|---|---|---|---|
| | | conv2D | - | 32 | 1 | 2 | Layer 0 |
| | | bottleneck | 1 | 16 | 1 | 1 | Layer 1 |
| | | bottleneck | 6 | 24 | 2 | 2 | Layer 2 |
| | | bottleneck | 6 | 32 | 3 | 1 | Layer 3 |
| | | bottleneck | 6 | 64 | 3 | 1 | Layer 4 |
| | | bottleneck | 6 | 160 | 1 | 1 | Layer 5 |
| - | - | - | - | - | - | - | - |
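Assuming the MobileNetV2 convention that the bottleneck operator in the table appears to follow, t is the expansion factor, c the number of output channels, n the repeat count, and s the stride of the first repetition in a stage. The PyTorch sketch below shows one such inverted-residual stage; only the t/c/n/s values are read off the table, and the class and helper names are ours.

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """MobileNetV2-style inverted-residual ("bottleneck") block."""
    def __init__(self, in_ch: int, out_ch: int, stride: int, expansion: int):
        super().__init__()
        hidden = in_ch * expansion
        self.use_residual = stride == 1 and in_ch == out_ch
        layers = []
        if expansion != 1:
            # 1x1 expansion convolution.
            layers += [nn.Conv2d(in_ch, hidden, 1, bias=False),
                       nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True)]
        layers += [
            # Depth-wise 3x3 convolution (stride applied here).
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            # Linear 1x1 projection back to out_ch channels.
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        ]
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

def make_stage(in_ch: int, out_ch: int, n: int, stride: int, expansion: int) -> nn.Sequential:
    """Stack n bottlenecks; only the first block uses the given stride."""
    blocks = [Bottleneck(in_ch, out_ch, stride, expansion)]
    blocks += [Bottleneck(out_ch, out_ch, 1, expansion) for _ in range(n - 1)]
    return nn.Sequential(*blocks)

# Stages read off the table rows (c, n, s, t):
layer1 = make_stage(32, 16, n=1, stride=1, expansion=1)   # Layer 1
layer2 = make_stage(16, 24, n=2, stride=2, expansion=6)   # Layer 2
```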
No. | Layer 2 | Layer 3 | Layer 4 | Layer 5 | Weighted Kernel Filter | Overlap Ratio |
---|---|---|---|---|---|---|
1 | O | X | O | X | X | 0.663 |
2 | O | X | X | O | X | 0.687 |
3 | X | O | X | O | X | 0.668 |
4 | X | O | O | O | X | 0.729 |
5 | O | X | O | O | X | 0.757 |
No. | Layer 2 | Layer 3 | Layer 4 | Layer 5 | Weighted Kernel Filter | Overlap Ratio |
---|---|---|---|---|---|---|
1 | O | X | O | X | O | 0.724 |
2 | O | X | X | O | O | 0.762 |
3 | X | O | X | O | O | 0.701 |
4 | X | O | O | O | O | 0.765 |
5 | O | X | O | O | O | 0.772 |
No. | Layer 2 | Layer 3 | Layer 4 | Layer 5 | Weighted Kernel Filter | Overlap Ratio |
---|---|---|---|---|---|---|
1 | O | X | O | O | X | 0.757 |
2 | O | X | O | O | O | 0.772 |
No. | Preprocessing Method | Overlap Ratio |
---|---|---|
1 | Proposed | 0.772 |
2 | Min/Max Normalized | 0.691 |
3 | Bit-shift | 0.604 |
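For context, the two baseline schemes in the table can be sketched as below, assuming raw thermal frames stored as unsigned 14/16-bit integers that must be mapped to 8-bit before tracking; the bit depth and the shift amount are assumptions, and the proposed enhancement method itself (Section 3.1.2) is not reproduced here.

```python
import numpy as np

def minmax_normalize(frame_raw: np.ndarray) -> np.ndarray:
    """Stretch the frame's own min/max range linearly onto [0, 255]."""
    f = frame_raw.astype(np.float32)
    lo, hi = f.min(), f.max()
    if hi <= lo:                                  # flat frame: avoid division by zero
        return np.zeros(frame_raw.shape, dtype=np.uint8)
    return np.round(255.0 * (f - lo) / (hi - lo)).astype(np.uint8)

def bit_shift(frame_raw: np.ndarray, shift: int = 6) -> np.ndarray:
    """Keep only the most significant bits, e.g. 14-bit -> 8-bit with shift=6."""
    return np.clip(frame_raw >> shift, 0, 255).astype(np.uint8)
```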
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).