A 3D Shape Recognition Method Using Hybrid Deep Learning Network CNN–SVM
Abstract
1. Introduction
2. Related Studies
2.1. Descriptors of Hand-Crafted Shape
2.2. CNN-Based Method
3. Methodology
3.1. The Creation of 3D Point Clouds Data
3.2. 3D Point Clouds Data Augmentation
3.3. The Architecture of CNN–SVM
4. Experimental Results
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
Methods | DeepPano | 3V-DepthPano | Proposed Method |
---|---|---|---|
Approach | View-based | View-based | View-based |
Transform 3D shape | No | No | Interpolate vertices |
Type of 2D views | Panoramic views | Panoramic views | Single 2D projection |
Multi-CNN structure | Single CNN | Three-branch CNN | Single CNN |
Class Name | Total | Class Name | Total | Class Name | Total | Class Name | Total |
---|---|---|---|---|---|---|---|
airplane | 726 | cup | 99 | laptop | 169 | sofa | 780 |
bathtub | 156 | curtain | 158 | mantel | 384 | stairs | 144 |
bed | 615 | desk | 286 | monitor | 565 | stool | 110 |
bench | 193 | door | 129 | night_stand | 286 | table | 492 |
bookshelf | 672 | dresser | 286 | person | 108 | tent | 183 |
bottle | 435 | flower_pot | 169 | piano | 331 | toilet | 444 |
bowl | 84 | glass_box | 271 | plant | 340 | tv_stand | 367 |
car | 297 | guitar | 255 | radio | 124 | vase | 575 |
chair | 989 | keyboard | 165 | range_hood | 215 | wardrobe | 107 |
cone | 187 | lamp | 144 | sink | 148 | xbox | 123 |
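
The per-class totals above can be tallied to check the overall dataset size and to see how unevenly the classes are populated. The following is a minimal Python sketch with the counts copied from the table; the variable names are illustrative and not taken from the paper.

```python
# Per-class model counts, copied from the class table above.
CLASS_TOTALS = {
    "airplane": 726, "bathtub": 156, "bed": 615, "bench": 193,
    "bookshelf": 672, "bottle": 435, "bowl": 84, "car": 297,
    "chair": 989, "cone": 187, "cup": 99, "curtain": 158,
    "desk": 286, "door": 129, "dresser": 286, "flower_pot": 169,
    "glass_box": 271, "guitar": 255, "keyboard": 165, "lamp": 144,
    "laptop": 169, "mantel": 384, "monitor": 565, "night_stand": 286,
    "person": 108, "piano": 331, "plant": 340, "radio": 124,
    "range_hood": 215, "sink": 148, "sofa": 780, "stairs": 144,
    "stool": 110, "table": 492, "tent": 183, "toilet": 444,
    "tv_stand": 367, "vase": 575, "wardrobe": 107, "xbox": 123,
}

NUM_CLASSES = len(CLASS_TOTALS)
TOTAL_MODELS = sum(CLASS_TOTALS.values())

# Names of the most and least populated classes.
LARGEST = max(CLASS_TOTALS, key=CLASS_TOTALS.get)
SMALLEST = min(CLASS_TOTALS, key=CLASS_TOTALS.get)

if __name__ == "__main__":
    print(f"{NUM_CLASSES} classes, {TOTAL_MODELS} models in total")
    print(f"largest: {LARGEST} ({CLASS_TOTALS[LARGEST]})")
    print(f"smallest: {SMALLEST} ({CLASS_TOTALS[SMALLEST]})")
```

The largest class (chair) holds more than ten times as many models as the smallest (bowl), which is worth keeping in mind when reading per-class accuracy figures.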

Layers | DeepPano | Parameters | PanoView | Parameters |
---|---|---|---|---|
Input | 160 × 64 | 0 | 108 × 36 | 0 |
Conv1 | (5, 96) | 2496 | (1, 64) | 128 |
Conv2 | (5, 256) | 6656 | (2, 80) | 400 |
Conv3 | (3, 384) | 3840 | (4, 160) | 2720 |
Conv4 | (3, 512) | 5120 | (6, 320) | 11,840 |
FC1 | N/A | N/A | 512 | N/A |
FC2 | N/A | N/A | 1024 | N/A |
Total | - | 18,112 | - | 15,088 |

Layers | 3V-DepthPano | Parameters | Our Method | Parameters |
---|---|---|---|---|
Input | 227 × 227 | 0 | 32 × 32 × 12 | 0 |
Conv1 | (11, 96) | 11,712 | (2, 8) | 40 |
Conv2 | (5, 256) | 6656 | (2, 32) | 160 |
Conv3 | (3, 384) | 3840 | (2, 128) | 640 |
Conv4 | (3, 384) | 3840 | (2, 512) | 2560 |
Conv5 | (3, 256) | 2560 | N/A | N/A |
FC1 | 4096 | N/A | 128 | N/A |
FC2 | 4096 | N/A | N/A | N/A |
Total | - | 28,608 | - | 3400 |
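
The parameter columns in the two layer tables above appear consistent with counting k × k weights plus one bias per filter, that is (k² + 1) · n for a layer written as (k, n), without multiplying by the number of input channels. This reading is inferred from the listed numbers and is not stated in the tables themselves; the function and variable names below are illustrative. A minimal Python sketch that reproduces the listed totals under that assumption:

```python
def conv_params(kernel_size, num_filters):
    """Count parameters the way the tables appear to.

    Assumption: k*k weights plus one bias per filter, ignoring the number
    of input channels. This matches the listed numbers but differs from the
    usual count for a full convolution layer.
    """
    return (kernel_size * kernel_size + 1) * num_filters


# (kernel size, number of filters) per convolutional layer, copied from the tables.
NETWORKS = {
    "DeepPano": [(5, 96), (5, 256), (3, 384), (3, 512)],
    "PanoView": [(1, 64), (2, 80), (4, 160), (6, 320)],
    "3V-DepthPano": [(11, 96), (5, 256), (3, 384), (3, 384), (3, 256)],
    "Our Method": [(2, 8), (2, 32), (2, 128), (2, 512)],
}


def total_conv_params(layers):
    """Sum the per-layer counts over all convolutional layers of one network."""
    return sum(conv_params(k, n) for k, n in layers)


TOTALS = {name: total_conv_params(layers) for name, layers in NETWORKS.items()}

if __name__ == "__main__":
    for name, total in TOTALS.items():
        print(f"{name}: {total}")
```

Only convolutional layers enter these totals, since the tables give no parameter counts for the fully connected layers.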
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Hoang, L.; Lee, S.-H.; Kwon, K.-R. A 3D Shape Recognition Method Using Hybrid Deep Learning Network CNN–SVM. Electronics 2020, 9, 649. https://doi.org/10.3390/electronics9040649