Deep Cascade AdaBoost with Unsupervised Clustering in Autonomous Vehicles
Abstract
1. Introduction
- We propose a framework that combines deep learning with traditional machine learning. As we describe throughout this article, our model can handle object detection tasks in arbitrary scenes, greatly broadening the applicability of traditional detection algorithms. By establishing the relationship between clustering and the AdaBoost algorithm, we obtain the Deep Cascade AdaBoost model.
- We design a training method based on Cascade-AdaBoost with multi-category samples. Instead of global samples, ordered categorical samples are used, and the models trained at each level are then cascaded. We also demonstrate the effectiveness of our method with a theoretical formulation.
- We compare our model with the traditional AdaBoost model on multiple classes of vision tasks, and it achieves the best results in both accuracy and time. On our test dataset, the detection time was shortened and the false detection rate was reduced, even though our model required a shorter training time. In addition, we added interference to the experiments and found that the new model has a remarkable ability to screen out external noise.
2. Materials and Methods
2.1. Deep Clustering
2.2. Traditional Learning Method
2.3. Dataset for AdaBoost Training
2.4. Data Processing
3. Proposed Methodology
3.1. Deep Cascade AdaBoost
3.1.1. Deep Unsupervised Clustering
3.1.2. Classifier-Based Cascade AdaBoost
3.1.3. Cascade AdaBoost
Algorithm 1: The Deep Cascade AdaBoost

Input: (1) the feature extraction function $f_\theta$; (2) the parameter vector $\theta$; (3) the linear classifier $W$; (4) the number of clusters $k$.
Clustering output: the centroid matrix $C$ and the $k$ cluster sample sets $S_1, \ldots, S_k$.
for $j = 1, \ldots, k$ do
  Initialize the weights $w_i = 1/N_j$, $i = 1, \ldots, N_j$, where $N_j = |S_j|$.
  for $m = 1, \ldots, M$ do
    (a) Fit a classifier $G_m(x)$ to the training set $S_j$ using the weights $w_i$.
    (b) Compute $\mathrm{err}_m = \sum_{i=1}^{N_j} w_i \, I(y_i \neq G_m(x_i)) \big/ \sum_{i=1}^{N_j} w_i$.
    (c) Compute $\alpha_m = \log\big((1-\mathrm{err}_m)/\mathrm{err}_m\big)$.
    (d) Set $w_i \leftarrow w_i \cdot \exp\big(\alpha_m \, I(y_i \neq G_m(x_i))\big)$, $i = 1, \ldots, N_j$.
    (e) Set $w_i \leftarrow w_i / \sum_{l=1}^{N_j} w_l$ (renormalize).
  endfor
  Output the stage classifier $G^{(j)}(x) = \mathrm{sign}\big(\sum_{m=1}^{M} \alpha_m G_m(x)\big)$.
endfor
Output: the cascade of stage classifiers $G^{(1)}, \ldots, G^{(k)}$.
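To make the control flow of Algorithm 1 concrete, here is a minimal Python sketch, not the authors' implementation: it assumes the deep features are precomputed embeddings, substitutes plain k-means for the deep clustering step, and trains one discrete AdaBoost (SAMME) stage per cluster with scikit-learn (>= 1.2); `deep_cascade_adaboost` and `predict_cascade` are illustrative names.

```python
# A minimal sketch of Algorithm 1's control flow, not the authors' code.
# Assumptions (ours): the deep features are precomputed embeddings X,
# plain k-means stands in for the deep clustering step, and each cluster
# gets its own discrete AdaBoost (SAMME) stage built from decision stumps.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def deep_cascade_adaboost(X, y, k=3, n_rounds=50, seed=0):
    """Cluster the samples, then fit one AdaBoost stage per cluster.

    X: (n_samples, n_features) embedding matrix; y: binary labels.
    Returns the centroid matrix C and the list of stage classifiers.
    Each cluster must contain both classes for the fit to succeed.
    """
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    C = km.cluster_centers_  # centroid matrix C from Algorithm 1

    stages = []
    for j in range(k):
        mask = km.labels_ == j  # the cluster sample set S_j
        stage = AdaBoostClassifier(
            estimator=DecisionTreeClassifier(max_depth=1),  # weak learner G_m
            n_estimators=n_rounds,      # M boosting rounds
            algorithm="SAMME",          # discrete AdaBoost, as in Algorithm 1
            random_state=seed,
        )
        stage.fit(X[mask], y[mask])    # steps (a)-(e) happen inside fit()
        stages.append(stage)
    return C, stages

def predict_cascade(x, C, stages):
    """Route a sample to the stage of its nearest centroid, then classify."""
    j = int(np.argmin(np.linalg.norm(C - x, axis=1)))
    return stages[j].predict(x.reshape(1, -1))[0]
```

In the full pipeline, the embeddings would come from the deep clustering network and the weak learners would act on hand-crafted descriptors; the sketch mirrors only the cluster-then-boost structure.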
3.1.4. Clustering-Based Cascade AdaBoost
3.2. Experiment
3.2.1. Clustering
3.2.2. Multi-Cascade-AdaBoost
- Number of stages. Increasing the number of stages usually yields a more accurate detector, but training takes longer. More stages may also require more training images, and the demand for such images tends to grow exponentially while the improvement to the detector is marginal. In our training, the number of stages was set to 30.
- Object training size. By default, the training function resizes the training samples to [24 24]. To make the training results more accurate, we matched the expected size of the target in the actual images and set the training sample size to [30 30].
- False alarm rate. Higher per-stage false alarm rates tend to require more cascade stages to achieve reasonable detection accuracy and increase the memory consumption of the model, while lower values increase the complexity of each stage. The false alarm rate defaults to 0.5; we set it to 0.3 in order to use fewer stages.
- Feature selection. For AdaBoost-based vehicle detection, the three classical features are Haar, LBP, and HOG. Haar features capture light/dark contrast in image pixel values, LBP describes the texture of a local image region, and HOG reflects the edge gradient information of the image. In this work, we train the vehicle detector with each of these three features and compare the resulting detection performance; a hedged training sketch follows this list.
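As a rough illustration of these feature choices (a sketch under our own assumptions, not the authors' training code), the snippet below extracts each of the three descriptors with scikit-image and fits an AdaBoost classifier with scikit-learn. The helper names `extract_features` and `train_detector` are hypothetical, the 30x30 window matches the object training size chosen above, and a plain `AdaBoostClassifier` stands in for the staged cascade, so the stage count and false alarm rate have no direct analogue here.

```python
# Hedged sketch: Haar/LBP/HOG descriptors feeding AdaBoost, using real
# scikit-image and scikit-learn APIs. Helper names are illustrative.
import numpy as np
from skimage.feature import hog, local_binary_pattern, haar_like_feature
from skimage.transform import integral_image
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def extract_features(sample, kind="hog"):
    """Return a 1-D descriptor for one 30x30 grayscale sample."""
    if kind == "hog":   # edge/gradient structure
        return hog(sample, orientations=9, pixels_per_cell=(6, 6),
                   cells_per_block=(2, 2))
    if kind == "lbp":   # local texture, summarized as a histogram
        codes = local_binary_pattern(sample, P=8, R=1, method="uniform")
        return np.histogram(codes, bins=10, range=(0, 10), density=True)[0]
    if kind == "haar":  # light/dark rectangle contrasts
        ii = integral_image(sample)
        return haar_like_feature(ii, 0, 0, 30, 30, feature_type="type-2-x")
    raise ValueError(f"unknown feature type: {kind}")

def train_detector(samples, labels, kind="hog", n_weak=200):
    """Fit AdaBoost on the chosen descriptor (not a true sliding-window
    cascade; requires scikit-learn >= 1.2 for the `estimator` argument)."""
    X = np.array([extract_features(s, kind) for s in samples])
    clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                             n_estimators=n_weak)
    return clf.fit(X, np.asarray(labels))
```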
4. Results and Discussion
4.1. Effect on the Training Process
4.2. Comparison of Detector Performance
4.3. Analysis of the Size of Dataset and Noise Resistance Performance
4.4. Sensitivity Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
IoT | Internet of Things |
NLP | Natural language processing |
CNN | Convolutional neural network |
HAAR | Haar-like features |
HOG | Histogram of oriented gradients |
LBP | Local binary pattern |
ROI | Region of interest |
TPR | True positive rate |
FPR | False positive rate |
References
- Wang, L.; Ouyang, W.; Wang, X.; Lu, H. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015.
- Milan, A.; Rezatofighi, S.H.; Dick, A.; Schindler, K.; Reid, I. Online Multi-Target Tracking Using Recurrent Neural Networks. arXiv 2016, arXiv:1604.03635.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv 2015, arXiv:1506.01497.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. arXiv 2013, arXiv:1311.2524.
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. arXiv 2016, arXiv:1612.03144.
- Zhou, Z.; Zhao, X.; Wang, Y.; Wang, P.; Foroosh, H. CenterFormer: Center-Based Transformer for 3D Object Detection. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2022.
- Pavlitskaya, S.; Polley, N.; Weber, M.; Zöllner, J.M. Adversarial Vulnerability of Temporal Feature Networks for Object Detection. arXiv 2022, arXiv:2208.10773.
- Ham, J.; Chen, Y.; Crawford, M.M.; Ghosh, J. Investigation of the Random Forest Framework for Classification of Hyperspectral Data. IEEE Trans. Geosci. Remote Sens. 2005, 43, 492–501.
- Bruno, A.; Moroni, D.; Martinelli, M. Efficient Adaptive Ensembling for Image Classification. arXiv 2022, arXiv:2206.07394.
- Schuldt, C.; Laptev, I.; Caputo, B. Recognizing Human Actions: A Local SVM Approach. In Proceedings of the International Conference on Pattern Recognition (ICPR), Cambridge, UK, 23–26 August 2004.
- Ahn, D.; Kim, S.; Hong, H.; Ko, B.C. STAR-Transformer: A Spatio-Temporal Cross Attention Transformer for Human Action Recognition. arXiv 2022, arXiv:2210.07503.
- Viola, P.A.; Jones, M.J. Rapid Object Detection Using a Boosted Cascade of Simple Features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Kauai, HI, USA, 8–14 December 2001.
- Islam, M.T.; Ahmed, T.; Raihanur Rashid, A.B.M.; Islam, T.; Rahman, S.; Habib, T. Convolutional Neural Network Based Partial Face Detection. In Proceedings of the 2022 IEEE 7th International Conference for Convergence in Technology (I2CT), Mumbai, India, 7–9 April 2022.
- Ojala, T.; Pietikäinen, M.; Mäenpää, T. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987.
- Arreola, L.; Gudiño, G.; Flores, G. Object Recognition and Tracking Using Haar-Like Features Cascade Classifiers: Application to a Quad-Rotor UAV. arXiv 2019, arXiv:1903.03947.
- Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, USA, 20–25 June 2005.
- Kitayama, M.; Kiya, H. Generation of Gradient-Preserving Images Allowing HOG Feature Extraction. In Proceedings of the 2021 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), Penghu, Taiwan, 15–17 September 2021.
- Alhindi, T.J.; Kalra, S.; Ng, K.H.; Afrin, A.; Tizhoosh, H.R. Comparing LBP, HOG and Deep Features for Classification of Histopathology Images. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018.
- Abdelhamid, A.A.; El-Kenawy, E.-S.M.; Khodadadi, N.; Mirjalili, S.; Khafaga, D.S.; Alharbi, A.H.; Ibrahim, A.; Eid, M.M.; Saber, M. Classification of Monkeypox Images Based on Transfer Learning and the Al-Biruni Earth Radius Optimization Algorithm. Mathematics 2022, 10, 3614.
- Hui, Y.; Cheng, N.; Su, Z.; Huang, Y.; Zhao, P.; Luan, T.H.; Li, C. Secure and Personalized Edge Computing Services in 6G Heterogeneous Vehicular Networks. IEEE Internet Things J. 2022, 9, 5920–5931.
- Hui, Y.; Su, Z.; Luan, T.H. Unmanned Era: A Service Response Framework in Smart City. IEEE Trans. Intell. Transp. Syst. 2022, 23, 5791–5805.
- Schmidhuber, J. Deep Learning in Neural Networks: An Overview. Neural Netw. 2015, 61, 85–117.
- Xie, J.; Girshick, R.; Farhadi, A. Unsupervised Deep Embedding for Clustering Analysis. arXiv 2015, arXiv:1511.06335.
- Mong, Y.-L.; Ackley, K.; Killestein, T.L.; Galloway, D.K.; Vassallo, C.; Dyer, M.; Cutter, R.; Brown, M.J.I.; Lyman, J.; Ulaczyk, K.; et al. Self-Supervised Clustering on Image-Subtracted Data with Deep-Embedded Self-Organizing Map. Mon. Not. R. Astron. Soc. 2022, 518, 752–762.
- Yang, J.; Parikh, D.; Batra, D. Joint Unsupervised Learning of Deep Representations and Image Clusters. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 5147–5156.
- Chang, J.; Wang, L.; Meng, G.; Xiang, S.; Pan, C. Deep Adaptive Image Clustering. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Quinlan, J.R. Induction of Decision Trees. Mach. Learn. 1986, 1, 81–106.
- Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: San Mateo, CA, USA, 1993.
- Nasraoui, O. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data. ACM SIGKDD Explor. Newsl. 2008, 10, 23–25.
- Aytekin, C. Neural Networks Are Decision Trees. arXiv 2022, arXiv:2210.05189.
- Louppe, G. Understanding Random Forests: From Theory to Practice. arXiv 2014, arXiv:1407.7502.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data via the EM Algorithm. J. R. Stat. Soc. Ser. B 1977, 39, 1–22.
- McLachlan, G.J.; Krishnan, T. The EM Algorithm and Extensions, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2007.
- Freund, Y.; Schapire, R.E. A Short Introduction to Boosting (translated by Abe, N.). J. Jpn. Soc. Artif. Intell. 1999, 14, 771–780. (In Japanese)
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: New York, NY, USA, 2001.
- Caron, M.; Bojanowski, P.; Joulin, A.; Douze, M. Deep Clustering for Unsupervised Learning of Visual Features. arXiv 2018, arXiv:1807.05520.
- El-Kenawy, E.-S.M.; Albalawi, F.; Ward, S.A.; Ghoneim, S.S.M.; Eid, M.M.; Abdelhamid, A.A.; Bailek, N.; Ibrahim, A. Feature Selection and Classification of Transformer Faults Based on Novel Meta-Heuristic Algorithm. Mathematics 2022, 10, 3144.
- Confalonieri, R.; Bellocchi, G.; Bregaglio, S.; Donatelli, M.; Acutis, M. Comparison of Sensitivity Analysis Techniques: A Case Study with the Rice Model WARM. Ecol. Model. 2010, 221, 1897–1906.
Method | Time Cost (h) | Number of Features | Number of Training Stages |
---|---|---|---|
HAAR + AdaBoost | 57 ± 0.5 | 635 | 30 |
LBP + AdaBoost | 12 ± 0.5 | 302 | 20 |
HOG + AdaBoost | 33 ± 0.5 | 477 | 25 |
DeepCluster + HAAR + AdaBoost | 23 ± 0.5 | 538 | 26 |
DeepCluster + LBP + AdaBoost | 9 ± 0.5 | 187 | 16 |
DeepCluster + HOG + AdaBoost | 20 ± 0.5 | 412 | 22 |
Method | TPR | FPR |
---|---|---|
HAAR + AdaBoost | 91.88% | 6.83% |
LBP + AdaBoost | 86.75% | 12.73% |
HOG + AdaBoost | 90.76% | 9.66% |
DeepCluster + HAAR + AdaBoost | 94.40% | 3.59% |
DeepCluster + LBP + AdaBoost | 88.22% | 10.49% |
DeepCluster + HOG + AdaBoost | 92.55% | 7.31% |
Method | Detection Time (ms) |
---|---|
HAAR + AdaBoost | 42 |
LBP + AdaBoost | 26 |
HOG + AdaBoost | 33 |
DeepCluster + HAAR + AdaBoost | 36 |
DeepCluster + LBP + AdaBoost | 24 |
DeepCluster + HOG + AdaBoost | 28 |
Number of Training Stages | | | Size of Training Images | | | False Alarm Rate | | |
---|---|---|---|---|---|---|---|---|
Values | Run Time (ms) | TPR | Values (px) | Run Time (ms) | TPR | Values | Run Time (ms) | TPR |
20 | 40.96 | 80.12% | 20 | 41.72 | 90.27% | 0.05 | 41.92 | 91.88% |
21 | 41.83 | 82.71% | 21 | 42.33 | 90.93% | 0.10 | 42.34 | 91.77% |
22 | 42.91 | 83.90% | 22 | 41.96 | 91.24% | 0.15 | 41.04 | 91.54% |
23 | 43.01 | 85.24% | 23 | 42.22 | 91.36% | 0.20 | 42.85 | 90.93% |
24 | 45.20 | 87.93% | 24 | 42.17 | 91.88% | 0.25 | 42.15 | 90.77% |
25 | 46.36 | 89.90% | 25 | 42.12 | 91.96% | 0.30 | 42.18 | 89.64% |
26 | 48.22 | 90.24% | 26 | 42.13 | 91.99% | 0.35 | 42.44 | 87.53% |
27 | 45.13 | 90.89% | 27 | 42.09 | 92.03% | 0.40 | 42.07 | 85.15% |
28 | 44.78 | 91.20% | 28 | 42.11 | 91.89% | 0.45 | 43.71 | 82.38% |
29 | 43.17 | 91.77% | 29 | 42.10 | 91.77% | 0.50 | 44.21 | 79.94% |
30 | 42.10 | 91.88% | 30 | 42.10 | 91.86% | 0.55 | 44.79 | 77.42% |
Number of Training Stages | | | Size of Training Images | | | False Alarm Rate | | |
---|---|---|---|---|---|---|---|---|
Values | Run Time (ms) | TPR | Values (px) | Run Time (ms) | TPR | Values | Run Time (ms) | TPR |
20 | 34.81 | 85.68% | 20 | 35.55 | 94.16% | 0.05 | 35.93 | 94.90% |
21 | 35.68 | 87.59% | 21 | 36.16 | 94.82% | 0.10 | 36.54 | 94.67% |
22 | 36.76 | 88.95% | 22 | 35.79 | 95.13% | 0.15 | 35.92 | 94.44% |
23 | 37.86 | 89.80% | 23 | 36.05 | 95.62% | 0.20 | 36.25 | 93.81% |
24 | 39.05 | 90.07% | 24 | 35.97 | 95.85% | 0.25 | 36.16 | 93.48% |
25 | 40.21 | 91.33% | 25 | 35.95 | 95.87% | 0.30 | 36.44 | 92.61% |
26 | 40.07 | 92.75% | 26 | 35.96 | 95.89% | 0.35 | 36.85 | 90.55% |
27 | 38.18 | 93.32% | 27 | 35.92 | 95.93% | 0.40 | 37.11 | 87.22% |
28 | 37.63 | 93.79% | 28 | 35.94 | 95.88% | 0.45 | 37.64 | 85.56% |
29 | 36.32 | 94.43% | 29 | 35.93 | 95.95% | 0.50 | 38.13 | 83.49% |
30 | 35.95 | 94.90% | 30 | 35.93 | 95.89% | 0.55 | 38.49 | 82.90% |