A Novel Progressive Image Classification Method Based on Hierarchical Convolutional Neural Networks
Abstract
1. Introduction
2. Related Work
3. The Proposed Hierarchical CNNs (HCNNs)
3.1. The Model Framework of HCNNs
3.2. Multi-Class Joint Loss in HCNNs
3.3. Model Testing
- Step 1: The test image is input into the m-th sub-network for visual feature learning (m = 1 for the first sub-network). The model then outputs the probability distribution of the image classification results P = {p_1, p_2, ..., p_C}, where C is the number of image categories.
- Step 2: The maximum classification probability p_max is compared with the confidence threshold θ.
- Step 3: If p_max ≥ θ, or the current sub-network is the last one, the model outputs the classification result corresponding to p_max; otherwise, m = m + 1, and return to Step 1.
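The three steps above can be sketched as a simple cascade loop. This is a minimal illustration, not the authors' implementation: the function and variable names (`progressive_classify`, `threshold`) are hypothetical, and the paper's actual threshold value and network architectures are not specified here.

```python
import math

def softmax(logits):
    """Convert raw scores to a probability distribution over C classes."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def progressive_classify(image, sub_networks, threshold=0.9):
    """Run the image through the sub-networks in order (Step 1).

    Stop early as soon as the maximum class probability reaches the
    confidence threshold (Steps 2-3); otherwise fall through to the
    last sub-network and accept its prediction.
    """
    for m, net in enumerate(sub_networks, start=1):
        probs = softmax(net(image))   # P = {p_1, ..., p_C} from the m-th sub-network
        p_max = max(probs)            # maximum classification probability
        label = probs.index(p_max)
        # Accept if confident enough, or if this is the final sub-network.
        if p_max >= threshold or m == len(sub_networks):
            return label, p_max
```

Easy samples thus exit after a cheap early sub-network, while ambiguous ones are deferred to deeper sub-networks, which is what makes the classification "progressive".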
4. Experimental Results and Analysis
4.1. Image Classification Datasets
- (1) Ultrasonic image dataset of prostate
- (2) The chimpanzee facial image dataset
- (3) CIFAR-10 and CIFAR-100
4.2. Experimental Setup
4.3. Experimental Results
- (1) Experimental results on the ultrasonic image dataset
- (2) Experimental results on the chimpanzee facial image dataset
- (3) Experimental results on CIFAR-10
- (4) Experimental results on CIFAR-100
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Nie, L.; Zhang, L.; Meng, L.; Song, X.; Chang, X.; Li, X. Modeling disease progression via multisource multitask learners: A case study with Alzheimer’s disease. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 1508–1519. [Google Scholar] [CrossRef]
- Luo, M.; Chang, X.; Nie, L.; Yang, Y.; Hauptmann, A.G.; Zheng, Q. An adaptive semisupervised feature analysis for video semantic recognition. IEEE Trans. Cybern. 2017, 48, 648–660. [Google Scholar] [CrossRef]
- Wang, S.; Chang, X.; Li, X.; Long, G.; Yao, L.; Sheng, Q.Z. Diagnosis code assignment using sparsity-based disease correlation embedding. IEEE Trans. Knowl. Data Eng. 2016, 28, 3191–3202. [Google Scholar] [CrossRef] [Green Version]
- Qi, L.; Tang, W.; Zhou, L.; Huang, Y.; Zhao, S.; Liu, L.; Li, M.; Zhang, L.; Feng, S.; Hou, D.; et al. Long-term follow-up of persistent pulmonary pure ground-glass nodules with deep learning–assisted nodule segmentation. Eur. Radiol. 2020, 30, 744–755. [Google Scholar] [CrossRef] [PubMed]
- Munir, K.; Elahi, H.; Ayub, A.; Frezza, F.; Rizzi, A. Cancer Diagnosis Using Deep Learning: A Bibliographic Review. Cancers 2019, 11, 1235. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Liu, B.; Chi, W.; Li, X.; Li, P.; Liang, W.; Liu, H.; Wang, W.; He, J. Evolving the pulmonary nodules diagnosis from classical approaches to deep learning-aided decision support: Three decades' development course and future prospect. J. Cancer Res. Clin. Oncol. 2020, 146, 153–185. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Cheng, Z.; Chang, X.; Zhu, L.; Kanjirathinkal, R.C.; Kankanhalli, M. MMALFM: Explainable recommendation by leveraging reviews and images. ACM Trans. Inf. Syst. (TOIS) 2019, 37, 1–28. [Google Scholar] [CrossRef]
- Li, Z.; Yao, L.; Chang, X.; Zhan, K.; Sun, J.; Zhang, H. Zero-shot event detection via event-adaptive concept relevance mining. Pattern Recognit. 2019, 88, 595–603. [Google Scholar] [CrossRef]
- Yu, E.; Sun, J.; Li, J.; Chang, X.; Han, X.H.; Hauptmann, A.G. Adaptive semi-supervised feature selection for cross-modal retrieval. IEEE Trans. Multimed. 2018, 21, 1276–1288. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Iandola, F.; Moskewicz, M.; Karayev, S.; Girshick, R.; Darrell, T.; Keutzer, K. DenseNet: Implementing efficient convnet descriptor pyramids. arXiv 2014, arXiv:1404.1869. [Google Scholar]
- Liu, Q.; Xiang, X.; Qin, J.; Tan, Y.; Tan, J.; Luo, Y. Coverless steganography based on image retrieval of DenseNet features and DWT sequence mapping. Knowl.-Based Syst. 2020, 192, 105375. [Google Scholar]
- Zhang, D.; Yao, L.; Chen, K.; Wang, S.; Chang, X.; Liu, Y. Making sense of spatio-temporal preserving representations for EEG-based human intention recognition. IEEE Trans. Cybern. 2019, 50, 3033–3044. [Google Scholar] [CrossRef] [PubMed]
- Nie, L.; Zhang, L.; Yan, Y.; Chang, X.; Liu, M.; Shaoling, L. Multiview physician-specific attributes fusion for health seeking. IEEE Trans. Cybern. 2016, 47, 3680–3691. [Google Scholar] [CrossRef] [Green Version]
- Ren, P.; Xiao, Y.; Chang, X.; Huang, P.Y.; Li, Z.; Chen, X.; Wang, X. A comprehensive survey of neural architecture search: Challenges and solutions. ACM Comput. Surv. (CSUR) 2021, 54, 1–34. [Google Scholar] [CrossRef]
- Li, Z.; Nie, F.; Chang, X.; Yang, Y. Beyond trace ratio: Weighted harmonic mean of trace ratios for multiclass discriminant analysis. IEEE Trans. Knowl. Data Eng. 2017, 29, 2100–2110. [Google Scholar] [CrossRef]
- Yuan, D.; Chang, X.; Huang, P.Y.; Liu, Q.; He, Z. Self-supervised deep correlation tracking. IEEE Trans. Image Process. 2020, 30, 976–985. [Google Scholar] [CrossRef]
- Ma, Z.; Chang, X.; Yang, Y.; Sebe, N.; Hauptmann, A.G. The many shades of negativity. IEEE Trans. Multimed. 2017, 19, 1558–1568. [Google Scholar] [CrossRef]
- Li, Z.; Nie, F.; Chang, X.; Yang, Y.; Zhang, C.; Sebe, N. Dynamic affinity graph construction for spectral clustering using multiple features. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 6323–6332. [Google Scholar] [CrossRef] [PubMed]
- Yan, C.; Zheng, Q.; Chang, X.; Luo, M.; Yeh, C.H.; Hauptman, A.G. Semantics-preserving graph propagation for zero-shot object detection. IEEE Trans. Image Process. 2020, 29, 8163–8176. [Google Scholar] [CrossRef] [PubMed]
- Ren, P.; Xiao, Y.; Chang, X.; Huang, P.Y.; Li, Z.; Gupta, B.; Wang, X. A Survey of Deep Active Learning. ACM Comput. Surv. (CSUR) 2021, 54, 1–40. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Hinton, G. Learning multiple layers of features from tiny images. Tech. Rep. 2009, 7, 1–60. [Google Scholar]
- Loos, A.; Ernst, A. An automated chimpanzee identification system using face detection and recognition. EURASIP J. Image Video Process. 2013, 2013, 49. [Google Scholar] [CrossRef]
- Khan, S.; Nazir, S.; Garcia-Magarino, I.; Hussain, A. Deep learning-based urban big data fusion in smart cities: Towards traffic monitoring and flow-preserving fusion. Comput. Electr. Eng. 2021, 89, 106906. [Google Scholar] [CrossRef]
- Orozco, M.C.E.; Rebong, C.B. Vehicular detection and classification for intelligent transportation system: A deep learning approach using Faster R-CNN model. Int. J. Simul. Syst. 2019, 180, 36551. [Google Scholar]
- Zhenghao, X.; Niu, Y.; Chen, J.; Kan, X.; Liu, H. Facial expression recognition of industrial internet of things by parallel neural networks combining texture features. IEEE Trans. Ind. Inform. 2020, 17, 2784–2793. [Google Scholar]
- Hossain, M.S.; Muhammad, G.; Amin, S.U. Improving consumer satisfaction in smart cities using edge computing and caching: A case study of date fruits classification. Future Gener. Comput. Syst. 2018, 88, 333–341. [Google Scholar] [CrossRef]
- Gören, S.; Óncevarlk, D.F.; Yldz, K.D.; Hakyemez, T.Z. On-street parking spot detection for smart cities. In Proceedings of the IEEE International Smart Cities Conference (ISC2), Casablanca, Morocco, 14–17 October 2019; pp. 292–295. [Google Scholar]
- Yao, H.; Gao, P.; Wang, J.; Zhang, P.; Jiang, C.; Han, Z. Capsule network assisted IoT traffic classification mechanism for smart cities. IEEE Internet Things J. 2019, 6, 7515–7525. [Google Scholar] [CrossRef]
- Hassan, A.; Liu, F.; Wang, F.; Wang, Y. Secure image classification with deep neural networks for IoT applications. J. Ambient Intell. Humaniz. Comput. 2020, 12, 8319–8337. [Google Scholar] [CrossRef]
- Vasan, D.; Alazab, M.; Wassan, S.; Naeem, H.; Safaei, B.; Zheng, Q. IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture. Comput. Netw. 2020, 171, 107138. [Google Scholar] [CrossRef]
- Ciregan, D.; Meier, U.; Schmidhuber, J. Multi-column deep neural networks for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3642–3649. [Google Scholar]
- Frazao, X.; Alexandre, L.A. Weighted convolutional neural network ensemble. In Iberoamerican Congress on Pattern Recognition; Springer: Cham, Switzerland, 2014; pp. 674–681. [Google Scholar]
- Tajbakhsh, N.; Gurudu, S.R.; Liang, J. Automatic polyp detection in colonoscopy videos using an ensemble of convolutional neural networks. In Proceedings of the 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), Brooklyn, NY, USA, 16–19 April 2015; pp. 79–83. [Google Scholar]
- Ijjina, E.P.; Mohan, C.K. Hybrid deep neural network model for human action recognition. Appl. Soft Comput. 2016, 46, 936–952. [Google Scholar] [CrossRef]
- Taherkhani, A.; Cosma, G.; McGinnity, T.M. AdaBoost-CNN: An adaptive boosting algorithm for convolutional neural networks to classify multi-class imbalanced datasets using transfer learning. Neurocomputing 2020, 404, 351–366. [Google Scholar] [CrossRef]
- Movshovitz-Attias, Y.; Toshev, A.; Leung, T.K.; Ioffe, S.; Singh, S. No fuss distance metric learning using proxies. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 360–368. [Google Scholar]
- Liu, W.; Wen, Y.; Yu, Z.; Li, M.; Raj, B.; Song, L. SphereFace: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 212–220. [Google Scholar]
- Chen, W.; Chen, X.; Zhang, J.; Huang, K. Beyond triplet loss: A deep quadruplet network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 403–412. [Google Scholar]
- Schroff, F.; Kalenichenko, D.; Philbin, J. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 815–823. [Google Scholar]
- Wang, X.; Han, X.; Huang, W.; Dong, D.; Scott, M.R. Multi-similarity loss with general pair weighting for deep metric learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 5022–5030. [Google Scholar]
- Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A discriminative feature learning approach for deep face recognition. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 499–515. [Google Scholar]
- Wang, J.; Zhou, F.; Wen, S.; Liu, X.; Lin, Y. Deep metric learning with angular loss. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2593–2601. [Google Scholar]
- Jiang, L.; Meng, D.; Yu, S.-I.; Lan, Z.; Shan, S.; Hauptmann, A. Self-paced learning with diversity. Adv. Neural Inf. Process. Syst. 2014, 27, 2078–2086. [Google Scholar]
- Chang, X.; Yu, Y.L.; Yang, Y.; Xing, E.P. Semantic pooling for complex event analysis in untrimmed videos. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1617–1632. [Google Scholar] [CrossRef]
- Yan, C.; Chang, X.; Li, Z.; Guan, W.; Ge, Z.; Zhu, L.; Zheng, Q. ZeroNAS: Differentiable Generative Adversarial Networks Search for Zero-Shot Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2021. [Google Scholar] [CrossRef] [PubMed]
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4510–4520. [Google Scholar]
- Munir, K.; Frezza, F.; Rizzi, A. Deep Learning for Brain Tumor Segmentation. In Deep Learning for Cancer Diagnosis; Springer: Singapore, 2020; pp. 189–201. [Google Scholar]
- Munir, K.; Elahi, H.; Farooq, M.U.; Ahmed, S.; Frezza, F.; Rizzi, A. Detection and screening of COVID-19 through chest computed tomography radiographs using deep neural networks. In Data Science for COVID-19; Academic Press: Cambridge, MA, USA, 2021; pp. 63–73. [Google Scholar]
- Munir, K.; Frezza, F.; Rizzi, A. Brain Tumor Segmentation Using 2D-UNET Convolutional Neural Network. In Deep Learning for Cancer Diagnosis; Springer: Singapore, 2020; pp. 239–248. [Google Scholar]
- Fakoor, R.; Ladhak, F.; Nazi, A.; Huber, M. Using deep learning to enhance cancer diagnosis and classification. In Proceedings of the 30th International Conference on Machine Learning (ICML 2013), Atlanta, GA, USA, 16–21 June 2013. [Google Scholar]
- Zagoruyko, S.; Komodakis, N. Wide residual networks. arXiv 2016, arXiv:1605.07146. [Google Scholar]
- Yun, S.; Han, D.; Oh, S.J.; Chun, S.; Choe, J.; Yoo, Y. CutMix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27–28 October 2019; pp. 6023–6032. [Google Scholar]
- Huang, G.; Liu, Z.; Maaten, L.V.D.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
Table 1. Statistics of the image classification datasets.

Datasets | Category | Train | Validation | Test |
---|---|---|---|---|
Ultrasonic image dataset | 2 | 746 | 93 | 93 |
Chimpanzee facial image dataset | 52 | 1689 | 292 | 540 |
CIFAR-10 | 10 | 40,000 | 10,000 | 10,000 |
CIFAR-100 | 100 | 45,000 | 5000 | 10,000 |
Table 2. Experimental results on the ultrasonic image dataset of prostate.

Models | ACC | F1 Score | Recall | Precision |
---|---|---|---|---|
Alexnet [10] | 0.6989 | 0.7200 | 0.8571 | 0.6207 |
VGG16 [11] | 0.7634 | 0.7381 | 0.7381 | 0.7381 |
Inception V3 [49] | 0.7500 | 0.7164 | 0.7273 | 0.7059 |
Mobilenet V2 [50] | 0.6989 | 0.6499 | 0.6190 | 0.6842 |
Resnet50 [12] | 0.8065 | 0.7805 | 0.7619 | 0.8000 |
Alexnet+VGG16 | 0.7361 | 0.7077 | 0.7188 | 0.6970 |
Alexnet+VGG16+Inception V3 | 0.7639 | 0.7385 | 0.7500 | 0.7273 |
Alexnet+VGG16+Inception V3+Mobilenet V2 | 0.7917 | 0.7693 | 0.7813 | 0.7576 |
HCNN | 0.8333 | 0.8125 | 0.8125 | 0.8125 |
Table 3. Experimental results on the chimpanzee facial image dataset.

Models | ACC | F1 Score | Recall | Precision |
---|---|---|---|---|
Alexnet [10] | 0.5532 | 0.5470 | 0.5428 | 0.5512 |
VGG16 [11] | 0.6885 | 0.6836 | 0.6818 | 0.6854 |
Inception V3 [49] | 0.7008 | 0.6976 | 0.6956 | 0.6996 |
Mobilenet V2 [50] | 0.5737 | 0.5699 | 0.5701 | 0.5698 |
Resnet50 [12] | 0.7336 | 0.7327 | 0.7321 | 0.7334 |
Alexnet+VGG16 | 0.7023 | 0.7010 | 0.6998 | 0.7023 |
Alexnet+VGG16+Inception V3 | 0.7234 | 0.7200 | 0.7199 | 0.7201 |
Alexnet+VGG16+Inception V3+Mobilenet V2 | 0.7349 | 0.7316 | 0.7288 | 0.7344 |
HCNN | 0.7455 | 0.7435 | 0.7451 | 0.7419 |
Table 4. Experimental results on CIFAR-10.

Models | ACC | F1 Score | Recall | Precision |
---|---|---|---|---|
Alexnet [10] | 0.7881 | 0.7881 | 0.7894 | 0.7868 |
VGG16 [11] | 0.8860 | 0.8810 | 0.8818 | 0.8802 |
Inception V3 [49] | 0.8922 | 0.8921 | 0.8928 | 0.8914 |
Mobilenet V2 [50] | 0.8704 | 0.8639 | 0.8647 | 0.8631 |
Resnet50 [12] | 0.9080 | 0.9008 | 0.9011 | 0.9005 |
Alexnet+VGG16 | 0.8877 | 0.8878 | 0.8880 | 0.8877 |
Alexnet+VGG16+Inception V3 | 0.9023 | 0.9086 | 0.9087 | 0.9085 |
Alexnet+VGG16+Inception V3+Mobilenet V2 | 0.9104 | 0.9106 | 0.9107 | 0.9105 |
HCNN | 0.9226 | 0.9221 | 0.9222 | 0.9221 |
Table 5. Comparison with Adaboost-CNN [38] on CIFAR-10.

Models | ACC |
---|---|
Adaboost-CNN [38] | 0.8140 |
HCNN | 0.9226 |
Table 6. Experimental results on CIFAR-100.

Models | ACC | F1 Score | Recall | Precision |
---|---|---|---|---|
Alexnet [10] | 0.5347 | 0.5326 | 0.5329 | 0.5323 |
VGG16 [11] | 0.6556 | 0.6548 | 0.6556 | 0.6540 |
Inception V3 [49] | 0.7675 | 0.7686 | 0.7690 | 0.7682 |
Mobilenet V2 [50] | 0.6601 | 0.6634 | 0.6645 | 0.6623 |
Resnet50 [12] | 0.6031 | 0.6033 | 0.6034 | 0.6032 |
Alexnet+VGG16 | 0.6623 | 0.6640 | 0.6646 | 0.6635 |
Alexnet+VGG16+Inception V3 | 0.7742 | 0.7767 | 0.7778 | 0.7757 |
Alexnet+VGG16+Inception V3+Mobilenet V2 | 0.7798 | 0.7740 | 0.7746 | 0.7735 |
HCNN | 0.7847 | 0.7846 | 0.7844 | 0.7848 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, C.; Miao, F.; Gao, G. A Novel Progressive Image Classification Method Based on Hierarchical Convolutional Neural Networks. Electronics 2021, 10, 3183. https://doi.org/10.3390/electronics10243183