PointSCNet: Point Cloud Structure and Correlation Learning Based on Space-Filling Curve-Guided Sampling
Abstract
1. Introduction
- An end-to-end point cloud processing network, PointSCNet, is proposed to learn the structure and correlation information between local regions of a point cloud for shape classification and part segmentation.
- The idea of a space-filling curve is adopted for point sampling and local sub-cloud generation. Specifically, points are encoded and sorted by Z-order curve coding, which gives the points a meaningful geometric ordering.
- An information fusion module is designed to represent local region correlation and shape structure information. The fusion is achieved by correlating the local and structure features via a correlation tensor and by skip-connection operations.
- A channel-spatial attention module is adopted to identify significant points and crucial feature channels. The channel-spatial attention weights are learned to refine the point cloud feature.
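The Z-order encoding behind the sampling step can be sketched as follows. This is a minimal illustration of Morton coding, not the paper's implementation: coordinates are assumed normalized to [0, 1], quantized to a fixed number of bits, and the bits of x, y, z are interleaved so that sorting by the resulting code orders the points along a Z-order space-filling curve, placing spatially nearby points close together in the sequence.

```python
import numpy as np

def morton_encode(points, bits=10):
    """Z-order (Morton) codes for an (N, 3) array of points in [0, 1]^3.

    Each coordinate is quantized to `bits` bits, and the bits of the three
    coordinates are interleaved into a single integer per point.
    """
    # Quantize each coordinate to an integer grid with 2**bits cells per axis.
    grid = (points * (2 ** bits - 1)).astype(np.uint64)
    codes = np.zeros(len(points), dtype=np.uint64)
    for b in range(bits):
        for axis in range(3):
            # Place bit b of this axis at interleaved position 3*b + axis.
            codes |= ((grid[:, axis] >> b) & 1) << (3 * b + axis)
    return codes

# Sort a random cloud along the Z-order curve; consecutive points in
# `sorted_pts` then tend to be spatial neighbors, which is what makes
# curve-guided sub-cloud generation meaningful.
pts = np.random.rand(1024, 3)
order = np.argsort(morton_encode(pts))
sorted_pts = pts[order]
```

Contiguous slices of `sorted_pts` can then serve as candidate local sub-clouds, since the curve preserves spatial locality far better than an arbitrary point ordering.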
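The channel-spatial attention pattern can be illustrated with a parameter-free sketch in the CBAM style [35]: channel weights are derived by pooling over points, spatial (per-point) weights by pooling over channels, each squashed through a sigmoid. The paper's module learns these weights with trainable layers; the learned projections are omitted here to show only the pooling/reweighting structure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_attention(feat):
    """Refine (N, C) per-point features with channel then spatial attention.

    Channel attention: pool over the N points to score each of the C channels.
    Spatial attention: pool over the C channels to score each of the N points.
    Trainable MLPs/convolutions from the actual module are omitted.
    """
    # Channel attention: combine max- and mean-pooled statistics -> (C,) weights in (0, 1).
    ch = sigmoid(feat.max(axis=0) + feat.mean(axis=0))
    feat = feat * ch  # reweight crucial feature channels
    # Spatial attention: pool over channels -> (N, 1) weights in (0, 1).
    sp = sigmoid(feat.max(axis=1, keepdims=True) + feat.mean(axis=1, keepdims=True))
    return feat * sp  # emphasize significant points
```

Because both weight sets lie in (0, 1), the module acts as a soft mask: uninformative channels and points are attenuated rather than hard-pruned, which keeps the refinement differentiable end-to-end.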
2. Related Work
2.1. Traditional Point Cloud Processing Methods
2.2. Point-Wise Embedding
2.3. Point Cloud Structure Reasoning
2.4. Attention in Point Cloud Processing
3. Method
3.1. Initial Sampling and Grouping
3.2. Z-Order Curve-Guided Sampling Module
3.3. Information Fusion of Local Feature and Structure Feature
3.4. Points Channel-Spatial Attention Module
4. Experiments
4.1. Implementation Details
4.2. Shape Classification on ModelNet40
4.3. Part Segmentation on ShapeNet
4.4. Additional Quantitative Analyses
4.5. Additional Visualization Experiments
4.6. Ablation Study
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Guo, Y.; Sohel, F.; Bennamoun, M.; Lu, M.; Wan, J. Rotational projection statistics for 3D local surface description and object recognition. Int. J. Comput. Vis. 2013, 105, 63–86.
- Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep learning for 3D point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338–4364.
- Qi, C.R.; Su, H.; Nießner, M.; Dai, A.; Yan, M.; Guibas, L.J. Volumetric and multi-view CNNs for object classification on 3D data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5648–5656.
- Su, H.; Maji, S.; Kalogerakis, E.; Learned-Miller, E. Multi-view convolutional neural networks for 3D shape recognition. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 945–953.
- Chen, X.; Ma, H.; Wan, J.; Li, B.; Xia, T. Multi-view 3D object detection network for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6526–6534.
- Yu, T.; Meng, J.; Yuan, J. Multi-view harmonized bilinear network for 3D object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 186–194.
- Yang, Z.; Wang, L. Learning relationships for multi-view 3D object recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 7504–7513.
- Maturana, D.; Scherer, S. VoxNet: A 3D convolutional neural network for real-time object recognition. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 922–928.
- Riegler, G.; Osman Ulusoy, A.; Geiger, A. OctNet: Learning deep 3D representations at high resolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6620–6629.
- Wang, P.S.; Liu, Y.; Guo, Y.X.; Sun, C.Y.; Tong, X. O-CNN: Octree-based convolutional neural networks for 3D shape analysis. ACM Trans. Graph. (TOG) 2017, 36, 1–11.
- Le, T.; Duan, Y. PointGrid: A deep network for 3D shape understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9204–9214.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 77–85.
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep hierarchical feature learning on point sets in a metric space. arXiv 2017, arXiv:1706.02413.
- Duan, Y.; Zheng, Y.; Lu, J.; Zhou, J.; Tian, Q. Structural relational reasoning of point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 949–958.
- Yin, K.; Huang, H.; Cohen-Or, D.; Zhang, H. P2P-Net: Bidirectional point displacement net for shape transform. ACM Trans. Graph. (TOG) 2018, 37, 1–13.
- Yang, J.; Zhang, Q.; Ni, B.; Li, L.; Liu, J.; Zhou, M.; Tian, Q. Modeling point clouds with self-attention and gumbel subset sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3318–3327.
- Sarode, V.; Li, X.; Goforth, H.; Aoki, Y.; Srivatsan, R.A.; Lucey, S.; Choset, H. PCRNet: Point cloud registration network using PointNet encoding. arXiv 2019, arXiv:1908.07906.
- Lin, Z.; Feng, M.; Santos, C.N.d.; Yu, M.; Xiang, B.; Zhou, B.; Bengio, Y. A structured self-attentive sentence embedding. arXiv 2017, arXiv:1703.03130.
- Thabet, A.; Alwassel, H.; Ghanem, B. MortonNet: Self-supervised learning of local features in 3D point clouds. arXiv 2019, arXiv:1904.00230.
- Wu, Y.; He, F.; Yang, Y. A grid-based secure product data exchange for cloud-based collaborative design. Int. J. Coop. Inf. Syst. 2020, 29, 2040006.
- Klokov, R.; Lempitsky, V. Escape from cells: Deep kd-networks for the recognition of 3D point cloud models. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 863–872.
- Dai, A.; Chang, A.X.; Savva, M.; Halber, M.; Funkhouser, T.; Nießner, M. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2432–2443.
- Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; Xiao, J. 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1912–1920.
- Johnson, J.; Hariharan, B.; Van Der Maaten, L.; Hoffman, J.; Li, F.-F.; Lawrence Zitnick, C.; Girshick, R. Inferring and executing programs for visual reasoning. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3008–3017.
- Li, Y.; Pirk, S.; Su, H.; Qi, C.R.; Guibas, L.J. FPNN: Field probing neural networks for 3D data. Adv. Neural Inf. Process. Syst. 2016, 29, 307–315.
- Wang, D.Z.; Posner, I. Voting for voting in online point cloud object detection. In Robotics: Science and Systems; Sapienza University of Rome: Rome, Italy, 2015; Volume 1, pp. 10–15.
- Sun, X.; Lian, Z.; Xiao, J. SRINet: Learning strictly rotation-invariant representations for point cloud classification and segmentation. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 980–988.
- Joseph-Rivlin, M.; Zvirin, A.; Kimmel, R. Momen(e)t: Flavor the moments in learning to classify shapes. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Korea, 27–28 October 2019; pp. 4085–4094.
- Achlioptas, P.; Diamanti, O.; Mitliagkas, I.; Guibas, L. Learning representations and generative models for 3D point clouds. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholmsmässan, Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 40–49.
- Lin, H.; Xiao, Z.; Tan, Y.; Chao, H.; Ding, S. JustLookup: One millisecond deep feature extraction for point clouds by lookup tables. In Proceedings of the 2019 IEEE International Conference on Multimedia and Expo (ICME), Shanghai, China, 8–12 July 2019; pp. 326–331.
- Zhang, D.; He, F.; Tu, Z.; Zou, L.; Chen, Y. Pointwise geometric and semantic learning network on 3D point clouds. Integr. Comput.-Aided Eng. 2020, 27, 57–75.
- Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. (TOG) 2019, 38, 1–12.
- Guo, M.H.; Cai, J.X.; Liu, Z.N.; Mu, T.J.; Martin, R.R.; Hu, S.M. PCT: Point cloud transformer. Comput. Vis. Media 2021, 7, 187–199.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Li, X.; Yu, L.; Fu, C.W.; Cohen-Or, D.; Heng, P.A. Unsupervised detection of distinctive regions on 3D shapes. ACM Trans. Graph. (TOG) 2020, 39, 1–14.
- Zhao, H.; Jiang, L.; Jia, J.; Torr, P.H.; Koltun, V. Point transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 16259–16268.
- Shaw, P.; Uszkoreit, J.; Vaswani, A. Self-attention with relative position representations. arXiv 2018, arXiv:1803.02155.
- Phan, A.V.; Le Nguyen, M.; Nguyen, Y.L.H.; Bui, L.T. DGCNN: A convolutional neural network over large-scale labeled graphs. Neural Netw. 2018, 108, 533–543.
- Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. PointCNN: Convolution on X-transformed points. Adv. Neural Inf. Process. Syst. 2018, 31, 820–830.
- Liu, Y.; Fan, B.; Xiang, S.; Pan, C. Relation-shape convolutional neural network for point cloud analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8887–8896.
- Xu, M.; Ding, R.; Zhao, H.; Qi, X. PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 3172–3181.
- Muzahid, A.; Wan, W.; Sohel, F.; Wu, L.; Hou, L. CurveNet: Curvature-based multitask learning deep networks for 3D object recognition. IEEE/CAA J. Autom. Sin. 2020, 8, 1177–1187.
- Ran, H.; Zhuo, W.; Liu, J.; Lu, L. Learning Inner-Group Relations on Point Clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 15477–15487.
- Xu, Y.; Fan, T.; Xu, M.; Zeng, L.; Qiao, Y. SpiderCNN: Deep learning on point sets with parameterized convolutional filters. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 87–102.
- Komarichev, A.; Zhong, Z.; Hua, J. A-CNN: Annularly convolutional neural networks on point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7413–7422.
- Yan, X.; Zheng, C.; Li, Z.; Wang, S.; Cui, S. PointASNL: Robust point clouds processing using nonlocal neural networks with adaptive sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5588–5597.
- Li, J.; Chen, B.M.; Lee, G.H. SO-Net: Self-organizing network for point cloud analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9397–9406.
- Chang, A.X.; Funkhouser, T.; Guibas, L.; Hanrahan, P.; Huang, Q.; Li, Z.; Savarese, S.; Savva, M.; Song, S.; Su, H.; et al. ShapeNet: An information-rich 3D model repository. arXiv 2015, arXiv:1512.03012.
- Atzmon, M.; Maron, H.; Lipman, Y. Point convolutional neural networks by extension operators. arXiv 2018, arXiv:1803.10091.
- Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
- Zhang, D.; Zhang, Z.; Zou, L.; Xie, Z.; He, F.; Wu, Y.; Tu, Z. Part-based visual tracking with spatially regularized correlation filters. Vis. Comput. 2020, 36, 509–527.
- Zhang, D.; Wu, Y.; Guo, M.; Chen, Y. Deep Learning Methods for 3D Human Pose Estimation under Different Supervision Paradigms: A Survey. Electronics 2021, 10, 2267.
- Wu, Y.; Ma, S.; Zhang, D.; Sun, J. 3D Capsule Hand Pose Estimation Network Based on Structural Relationship Information. Symmetry 2020, 12, 1636.
Method | Input | #Points | Acc (%) |
---|---|---|---|
Pointnet [12] | xyz | 1024 | 89.2 |
Pointnet++ [13] | xyz | 1024 | 90.7 |
Kd-Net [21] | xyz | 32k | 91.8 |
DGCNN [40] | xyz | 1024 | 92.9 |
SRN [14] | xyz | 1024 | 91.5 |
PointGrid [11] | xyz | 1024 | 92.0 |
PointCNN [41] | xyz | 1024 | 92.2 |
RS-CNN [42] | xyz | 1024 | 93.6 |
PCT [33] | xyz | 1024 | 93.6 |
PAConv [43] | xyz | 1024 | 93.9 |
CurveNet [44] | xyz | 1024 | 93.8 |
RPNet-W9 [45] | xyz | 1024 | 93.9 |
Pointnet++ [13] | xyz,nr | 1024 | 91.7 |
PAT [16] | xyz,nr | 1024 | 91.7 |
SpiderCNN [46] | xyz,nr | 5k | 92.4 |
A-CNN [47] | xyz,nr | 1024 | 92.6 |
PointASNL [48] | xyz,nr | 1024 | 93.2 |
SO-Net [49] | xyz,nr | 1024 | 93.4 |
PointSCNet | xyz,nr | 1024 | 93.7 |
Class (IoU, %) | Pointnet [12] | Pointnet++ [13] | SRN [14] | PCNN [51] | PointCNN [41] | PointSCNet |
---|---|---|---|---|---|---|
Airplane | 83.4 | 82.3 | 82.4 | 82.4 | 84.1 | 83.3 |
Bag | 78.7 | 79.7 | 79.8 | 80.1 | 86.4 | 84.3 |
Cap | 82.5 | 86.1 | 88.1 | 85.5 | 86.0 | 88.1 |
Car | 74.9 | 78.2 | 77.9 | 79.5 | 80.8 | 79.2 |
Chair | 89.6 | 90.5 | 90.7 | 90.8 | 90.6 | 91.0 |
Earphone | 73.0 | 73.7 | 69.6 | 73.2 | 79.7 | 74.3 |
Guitar | 91.5 | 91.5 | 90.9 | 91.3 | 92.3 | 91.2 |
Knife | 85.9 | 86.2 | 86.3 | 86.0 | 88.4 | 87.4 |
Lamp | 80.8 | 83.6 | 84.0 | 85.0 | 85.3 | 84.5 |
Laptop | 95.3 | 95.2 | 95.4 | 95.7 | 96.1 | 95.7 |
Motorbike | 65.2 | 71.0 | 72.2 | 73.2 | 77.2 | 73.4 |
Mug | 93.0 | 94.5 | 94.9 | 94.8 | 95.3 | 95.3 |
Pistol | 91.2 | 80.8 | 81.3 | 83.3 | 84.2 | 81.7 |
Rocket | 57.9 | 57.7 | 62.1 | 51.0 | 64.2 | 60.7 |
Skateboard | 72.8 | 74.8 | 75.9 | 75.0 | 80.0 | 75.9 |
Mean | 83.7 | 85.1 | 85.3 | 85.1 | 86.1 | 85.6 |
Method | Params | Acc (%) |
---|---|---|
Pointnet [12] | 3.472 M | 89.2 |
Pointnet++ [13] | 1.748 M | 91.9 |
SRN [14] | 3.743 M | 91.5 |
DGCNN [40] | 1.811 M | 92.9 |
NPCT [33] | 1.36 M | 91.0 |
SPCT [33] | 1.36 M | 92.0 |
PCT [33] | 2.88 M | 93.2 |
PointSCNet | 1.827 M | 93.7 |
Method | ZS | C&S | AM | Acc (%) | Epochs to Best Acc |
---|---|---|---|---|---|
A | ✓ | × | × | 93.0 | 87 |
B | ✓ | ✓ | × | 93.4 | 95 |
C | ✓ | × | ✓ | 93.2 | 85 |
D | × | ✓ | ✓ | 93.3 | 120 |
E | × | ✓ | × | 93.2 | 148 |
F | × | × | ✓ | 93.2 | 73 |
PointSCNet | ✓ | ✓ | ✓ | 93.7 | 67 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chen, X.; Wu, Y.; Xu, W.; Li, J.; Dong, H.; Chen, Y. PointSCNet: Point Cloud Structure and Correlation Learning Based on Space-Filling Curve-Guided Sampling. Symmetry 2022, 14, 8. https://doi.org/10.3390/sym14010008