FSVM: A Few-Shot Threat Detection Method for X-ray Security Images
Abstract
1. Introduction
- We propose a few-shot threat detection model for X-ray security images, called FSVM, which enables the detection of novel contraband items from only a small number of samples.
- To generate an informative embedding space, an SVM embedding module is proposed; it can be trained end-to-end and freely embedded into an object detection model.
- The SVM loss is proposed as part of a joint loss function. This additional constraint, applied in the fine-tuning stage, back-propagates supervised information to earlier layers. Trained with this joint loss, the proposed model is better suited to few-shot application scenarios.
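The joint objective sketched in the bullets above (a detection loss plus a weighted SVM constraint) can be illustrated as follows. This is a minimal sketch, not the paper's exact formulation: the Crammer–Singer multi-class hinge form, the margin of 1.0, and the weighting factor `lam` are assumptions for illustration.

```python
import numpy as np

def multiclass_svm_loss(scores, labels, margin=1.0):
    """Crammer-Singer style multi-class hinge loss.

    scores: (n, num_classes) classifier scores for n RoI features.
    labels: (n,) integer ground-truth class indices.
    The margin value 1.0 is an assumed default.
    """
    n = scores.shape[0]
    # Score of the correct class for each sample, shaped (n, 1) for broadcasting.
    correct = scores[np.arange(n), labels][:, None]
    # Hinge margins against every class; correct class contributes 0.
    margins = np.maximum(0.0, scores - correct + margin)
    margins[np.arange(n), labels] = 0.0
    return margins.sum(axis=1).mean()

def joint_loss(det_loss, scores, labels, lam=0.5):
    """Joint fine-tuning objective: detection loss plus a weighted SVM term.

    lam=0.5 is a hypothetical weight (0.1 and 0.5 appear as settings in the
    ablation tables, but the exact role of those values is assumed here).
    """
    return det_loss + lam * multiclass_svm_loss(scores, labels)

# Toy example: two RoIs, three classes.
scores = np.array([[3.0, 2.5, 0.5],
                   [0.2, 1.0, 1.5]])
labels = np.array([0, 1])
print(round(multiclass_svm_loss(scores, labels), 3))  # → 1.1
```

In a real pipeline, `scores` would come from the SVM embedding module's output over region proposals, and the gradient of this hinge term is what carries the extra supervised signal back to earlier layers during fine-tuning.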
2. Related Work
2.1. Automatic Threat Detection for X-ray Security Images
2.2. Few-Shot Learning
2.3. End-to-End Trainable Embedding Layer
3. Method
3.1. Problem Formulation
3.2. The SVM-Constrained FSOD Architecture
3.3. SVM Embedding Module
3.4. Joint Loss Function
4. Experiment
4.1. Dataset
4.2. Experimental Setups
4.2.1. Category Division
4.2.2. Implementation Details
4.2.3. Evaluation Metrics
4.3. Comparison Experiment
4.4. Ablation Experiment
4.4.1. FSVM
4.4.2. Dimension Control
4.4.3. Input Feature Control
4.4.4. The Embedded SVM Layer Control
4.5. Visual Results and Comparison
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Michel, S.; Koller, S.M.; de Ruiter, J.C.; Moerland, R.; Hogervorst, M.; Schwaninger, A. Computer-based training increases efficiency in X-ray image interpretation by aviation security screeners. In Proceedings of the 2007 41st Annual IEEE International Carnahan Conference on Security Technology, Ottawa, ON, Canada, 8–11 October 2007; pp. 201–206. [Google Scholar]
- Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 10781–10790. [Google Scholar]
- Akcay, S.; Breckon, T. Towards automatic threat detection: A survey of advances of deep learning within X-ray security imaging. Pattern Recognit. 2022, 122, 108245. [Google Scholar] [CrossRef]
- Elsayed, G.; Krishnan, D.; Mobahi, H.; Regan, K.; Bengio, S. Large margin deep networks for classification. Adv. Neural Inf. Process. Syst. 2018, 31, 850–860. [Google Scholar]
- Wang, H.; Wang, Y.; Zhou, Z.; Ji, X.; Gong, D.; Zhou, J.; Li, Z.; Liu, W. Cosface: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 5265–5274. [Google Scholar]
- Yang, J.; Guo, X.; Li, Y.; Marinello, F.; Ercisli, S.; Zhang, Z. A survey of few-shot learning in smart agriculture: Developments, applications, and challenges. Plant Methods 2022, 18, 28. [Google Scholar] [CrossRef] [PubMed]
- Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1126–1135. [Google Scholar]
- Ravi, S.; Larochelle, H. Optimization as a model for few-shot learning. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017. [Google Scholar]
- Zhang, P.; Bai, Y.; Wang, D.; Bai, B.; Li, Y. Few-shot classification of aerial scene images via meta-learning. Remote Sens. 2020, 13, 108. [Google Scholar] [CrossRef]
- Yao, X.; Cao, Q.; Feng, X.; Cheng, G.; Han, J. Scale-aware detailed matching for few-shot aerial image semantic segmentation. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5611711. [Google Scholar] [CrossRef]
- Cai, J.; Zhang, Y.; Guo, J.; Zhao, X.; Lv, J.; Hu, Y. St-pn: A spatial transformed prototypical network for few-shot sar image classification. Remote Sens. 2022, 14, 2019. [Google Scholar] [CrossRef]
- Rostami, M.; Kolouri, S.; Eaton, E.; Kim, K. Deep transfer learning for few-shot SAR image classification. Remote Sens. 2019, 11, 1374. [Google Scholar] [CrossRef]
- Paul, A.; Tang, Y.X.; Summers, R.M. Fast few-shot transfer learning for disease identification from chest X-ray images using autoencoder ensemble. In Proceedings of the Medical Imaging 2020: Computer-Aided Diagnosis, Orlando, FL, USA, 11–16 February 2020; Volume 11314, pp. 33–38. [Google Scholar]
- Cherti, M.; Jitsev, J. Effect of Pre-Training Scale on Intra-and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-ray Chest Images. arXiv 2021, arXiv:2106.00116. [Google Scholar]
- Singh, M.; Singh, S. Optimizing image enhancement for screening luggage at airports. In Proceedings of the 2005 IEEE International Conference on Computational Intelligence for Homeland Security and Personal Safety (CIHSPS 2005), Orlando, FL, USA, 31 March–1 April 2005; pp. 131–136. [Google Scholar]
- Akcay, S.; Kundegorski, M.E.; Willcocks, C.G.; Breckon, T.P. Using deep convolutional neural network architectures for object classification and detection within X-ray baggage security imagery. IEEE Trans. Inf. Forensics Secur. 2018, 13, 2203–2215. [Google Scholar] [CrossRef]
- Hassan, T.; Khan, S.H.; Akcay, S.; Bennamoun, M.; Werghi, N. Deep CMST framework for the autonomous recognition of heavily occluded and cluttered baggage items from multivendor security radiographs. CoRR 2019, 14, 17. [Google Scholar]
- Xu, M.; Zhang, H.; Yang, J. Prohibited item detection in airport X-ray security images via attention mechanism based CNN. In Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Guangzhou, China, 23–26 November 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 429–439. [Google Scholar]
- Antonelli, S.; Avola, D.; Cinque, L.; Crisostomi, D.; Foresti, G.L.; Galasso, F.; Marini, M.R.; Mecca, A.; Pannone, D. Few-shot object detection: A survey. ACM Comput. Surv. (CSUR) 2021, 54, 1–37. [Google Scholar] [CrossRef]
- Wang, X.; Huang, T.E.; Darrell, T.; Gonzalez, J.E.; Yu, F. Frustratingly simple few-shot object detection. arXiv 2020, arXiv:2003.06957. [Google Scholar]
- Sun, B.; Li, B.; Cai, S.; Yuan, Y.; Zhang, C. Fsce: Few-shot object detection via contrastive proposal encoding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 7352–7362. [Google Scholar]
- Tian, Y.; Wang, Y.; Krishnan, D.; Tenenbaum, J.B.; Isola, P. Rethinking few-shot image classification: A good embedding is all you need? In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 266–282. [Google Scholar]
- Long, M.; Cao, Y.; Wang, J.; Jordan, M. Learning transferable features with deep adaptation networks. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 97–105. [Google Scholar]
- Hoffer, E.; Ailon, N. Deep metric learning using triplet network. In Proceedings of the International Workshop on Similarity-Based Pattern Recognition, Copenhagen, Denmark, 12–14 October 2015; pp. 84–92. [Google Scholar]
- Li, P.; Xie, J.; Wang, Q.; Zuo, W. Is second-order information helpful for large-scale visual recognition? In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2070–2078. [Google Scholar]
- Wang, Q.; Li, P.; Zhang, L. G2DeNet: Global Gaussian distribution embedding network and its application to visual recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2730–2739. [Google Scholar]
- Lee, K.; Maji, S.; Ravichandran, A.; Soatto, S. Meta-learning with differentiable convex optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 10657–10665. [Google Scholar]
- Sun, B.; Feng, J.; Saenko, K. Return of frustratingly easy domain adaptation. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30. [Google Scholar]
- Cheng, Z.; Chen, C.; Chen, Z.; Fang, K.; Jin, X. Robust and high-order correlation alignment for unsupervised domain adaptation. Neural Comput. Appl. 2021, 33, 6891–6903. [Google Scholar] [CrossRef]
- Cervantes, J.; Garcia-Lamont, F.; Rodríguez-Mazahua, L.; Lopez, A. A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing 2020, 408, 189–215. [Google Scholar] [CrossRef]
- Crammer, K.; Singer, Y. On the algorithmic implementation of multiclass kernel-based vector machines. J. Mach. Learn. Res. 2001, 2, 265–292. [Google Scholar]
- Barratt, S. On the differentiability of the solution to convex optimization problems. arXiv 2018, arXiv:1804.05098. [Google Scholar]
- Amos, B.; Kolter, J.Z. Optnet: Differentiable optimization as a layer in neural networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 136–145. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
- Miao, C.; Xie, L.; Wan, F.; Su, C.; Liu, H.; Jiao, J.; Ye, Q. Sixray: A large-scale security inspection X-ray benchmark for prohibited item discovery in overlapping images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2119–2128. [Google Scholar]
- Yan, X.; Chen, Z.; Xu, A.; Wang, X.; Liang, X.; Lin, L. Meta r-cnn: Towards general solver for instance-level low-shot learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9577–9586. [Google Scholar]
- Xiao, Y.; Marlet, R. Few-shot object detection and viewpoint estimation for objects in the wild. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 192–210. [Google Scholar]
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar]
(a) 10-Shot Result (columns 2–5: Novel Split1; 6–9: Novel Split2; 10–13: Novel Split3)

| Model | bAP50 | Gun | Knife | nAP50 | bAP50 | Pliers | Wrench | nAP50 | bAP50 | Scissors | Gun | nAP50 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Meta R-CNN [36] | 76.8 | 0.8 | 10.7 | 5.8 | 80.3 | 12.4 | 4.5 | 8.4 | 73.9 | 0.9 | 9.1 | 5.0 |
| FsdetView [37] | 74.7 | 1.1 | 11.5 | 6.3 | 77.7 | 11.3 | 12.8 | 12.1 | 76.9 | 1.3 | 5.5 | 3.3 |
| TFA [20] | 78.5 | 31.3 | 15.0 | 23.2 | 78.1 | 11.5 | 9.4 | 10.5 | 77.1 | 15.5 | 37.3 | 21.4 |
| FSCE [21] | 79.8 | 50.9 | 24.4 | 37.7 | 80.3 | 24.9 | 10.2 | 17.5 | 82.2 | 24.3 | 50.3 | 37.3 |
| FSVM-L * (Ours) | 83.4 | 52.4 | 28.9 | 40.6 | 83.4 | 26.6 | 10.1 | 18.3 | 85.8 | 31.8 | 52.9 | 42.2 |
| FSVM-G * (Ours) | 85.7 | 54.8 | 26.6 | 40.7 | 82.8 | 26.8 | 12.5 | 19.6 | 84.5 | 33.1 | 55.4 | 44.2 |

(b) 30-Shot Result (columns 2–5: Novel Split1; 6–9: Novel Split2; 10–13: Novel Split3)

| Model | bAP50 | Gun | Knife | nAP50 | bAP50 | Pliers | Wrench | nAP50 | bAP50 | Scissors | Gun | nAP50 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Meta R-CNN [36] | 75.9 | 0.4 | 10.4 | 5.4 | 79.9 | 9.3 | 4.6 | 6.9 | 76.2 | 3.5 | 9.1 | 6.3 |
| FsdetView [37] | 78.6 | 9.1 | 11.6 | 10.3 | 78.3 | 14.5 | 12.2 | 13.4 | 76.4 | 5.8 | 5.2 | 5.5 |
| TFA [20] | 80.9 | 39.8 | 14.1 | 26.9 | 83.0 | 15.4 | 16.5 | 15.9 | 78.8 | 14.7 | 39.5 | 27.1 |
| FSCE [21] | 85.4 | 62.8 | 29.8 | 46.2 | 85.7 | 25.6 | 16.5 | 21.1 | 82.5 | 36.4 | 63.0 | 49.7 |
| FSVM-L * (Ours) | 88.6 | 70.1 | 34.6 | 52.4 | 87.2 | 31.1 | 19.4 | 25.2 | 86.7 | 40.9 | 65.7 | 53.3 |
| FSVM-G * (Ours) | 88.4 | 69.3 | 38.9 | 54.1 | 88.4 | 35.2 | 23.1 | 29.1 | 85.1 | 41.7 | 66.8 | 54.2 |
| Model | Base-Training mAP50 | Fine-Tuning bAP50 | Fine-Tuning nAP50 |
|---|---|---|---|
| TFA [20] | 82.9 | 80.9 | 26.9 |
| FSCE [21] | 90.8 | 85.4 | 46.2 |
| FSVM-L (Ours) | 90.8 | 88.6 | 52.4 |
| FSVM-G (Ours) | - | 88.4 | 54.1 |
| SVM Module | Remap Layer | Freeze FCs | 10-Shot Novel mAP50 | 30-Shot Novel mAP50 |
|---|---|---|---|---|
|  |  |  | 39.6 | 49.4 |
| √ |  |  | 42.0 | 52.1 |
| √ | √ |  | 42.5 | 53.6 |
| √ |  | √ | 43.4 | 53.9 |
| √ | √ | √ | 44.9 | 54.5 |
| Dim | 16 | 64 | 128 | 256 | 1024 | 4096 |
|---|---|---|---|---|---|---|
| nAP50 | 44.7 | 45.5 | 45.6 | 45.7 | 45.4 | 43.9 |
| nAP75 | 6.9 | 7.2 | 7.9 | 7.5 | 7.3 | 7.4 |
| IoU | 10-Shot nAP50 | 10-Shot nAP75 | 30-Shot nAP50 | 30-Shot nAP75 |
|---|---|---|---|---|
| 0.5 | 39.7 | 3.8 | 51.6 | 7.4 |
| 0.7 | 41.7 | 5.4 | 51.6 | 8.7 |
|  | Linear Kernel (0.1) | Linear Kernel (0.5) | Gaussian Kernel (0.1) | Gaussian Kernel (0.5) |
|---|---|---|---|---|
|  | 52.7 | 53.5 | 53.8 | 54.9 |
|  | 52.2 | 54.3 | 53.4 | 54.5 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Fang, C.; Liu, J.; Han, P.; Chen, M.; Liao, D. FSVM: A Few-Shot Threat Detection Method for X-ray Security Images. Sensors 2023, 23, 4069. https://doi.org/10.3390/s23084069