MSFA-Net: A Multiscale Feature Aggregation Network for Semantic Segmentation of Historical Building Point Clouds
Abstract
1. Introduction
- (1) This paper proposes a novel semantic segmentation network, MSFA-Net. It designs a double attention aggregation (DAA) module consisting of a bidirectional adaptive pooling (BAP) block and a multiscale attention aggregation (MSAA) block. By combining two different attention mechanisms, the module captures multiscale features of the target during sampling and reduces redundant information (an illustrative sketch of this idea follows the list).
- (2) This paper proposes a contextual feature enhancement (CFE) module, which strengthens contextual connections within the model by fusing local and global features across the encoding and decoding layers while fully accounting for the semantic gap between neighboring features (sketched after the list).
- (3) This paper proposes an edge interactive classifier (EIC), which feeds the features of each point into the classifier to obtain per-point edge features. Through information transfer between nodes, it improves label prediction and yields smoother segmentation of object edges (sketched after the list).
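For the DAA idea in (1), below is a rough, illustrative PyTorch sketch of combining attention over neighbors with attention over channels. The class name, layer shapes, and fusion scheme are assumptions made for this example, not the published module definition; Section 2.2 describes the actual DAA/BAP/MSAA structure.

```python
# Illustrative sketch only: a generic "double attention" neighbor aggregation in
# PyTorch, combining attention over neighbors with attention over channels.
# Layer names, dimensions, and the fusion scheme are assumptions of this example.
import torch
import torch.nn as nn

class DualAttentionPooling(nn.Module):
    """Aggregate k-NN neighbor features along two attention 'directions':
    (a) across the K neighbors (which neighbor matters for this point) and
    (b) across the C channels (which feature channel matters)."""

    def __init__(self, channels: int):
        super().__init__()
        self.neighbor_score = nn.Linear(channels, channels, bias=False)
        self.channel_score = nn.Linear(channels, channels, bias=False)
        self.fuse = nn.Linear(2 * channels, channels)
        self.bn = nn.BatchNorm1d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, neigh_feats: torch.Tensor) -> torch.Tensor:
        # neigh_feats: (B, N, K, C) features of the K neighbors of each of N points
        # (a) attention over the K neighbors, followed by a weighted sum
        w_k = torch.softmax(self.neighbor_score(neigh_feats), dim=2)
        f_neighbor = (w_k * neigh_feats).sum(dim=2)               # (B, N, C)

        # (b) channel attention applied to the max-pooled neighborhood
        f_max = neigh_feats.max(dim=2).values                     # (B, N, C)
        w_c = torch.softmax(self.channel_score(f_max), dim=-1)
        f_channel = w_c * f_max                                   # (B, N, C)

        # fuse both branches with a shared MLP (BatchNorm1d expects (B, C, N))
        fused = self.fuse(torch.cat([f_neighbor, f_channel], dim=-1))
        fused = self.bn(fused.transpose(1, 2)).transpose(1, 2)
        return self.act(fused)                                    # (B, N, C)

if __name__ == "__main__":
    x = torch.randn(2, 1024, 16, 64)  # 2 clouds, 1024 points, 16 neighbors, 64 channels
    print(DualAttentionPooling(64)(x).shape)  # torch.Size([2, 1024, 64])
```

The only point of the sketch is that the two attention branches look at the same neighborhood from different directions before fusion, which is how redundant neighbor information can be suppressed during downsampling.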
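For the CFE idea in (2), a minimal sketch of gated local and global feature fusion follows; the sigmoid gate and all names are hypothetical choices made purely for illustration, while Section 2.3 gives the actual module.

```python
# Illustrative sketch only: gated fusion of per-point (local) features with a
# scene-level (global) descriptor. The sigmoid gate and all names are assumptions
# of this example.
import torch
import torch.nn as nn

class LocalGlobalFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * channels, channels), nn.Sigmoid())
        self.proj = nn.Linear(2 * channels, channels)

    def forward(self, local_feats: torch.Tensor) -> torch.Tensor:
        # local_feats: (B, N, C) per-point features from an encoding/decoding layer
        global_feat = local_feats.max(dim=1, keepdim=True).values   # (B, 1, C)
        both = torch.cat([local_feats, global_feat.expand_as(local_feats)], dim=-1)
        # the gate decides, per point and per channel, how much fused context to
        # inject, which is one simple way to soften the local/global semantic gap
        g = self.gate(both)
        return g * self.proj(both) + (1.0 - g) * local_feats

if __name__ == "__main__":
    x = torch.randn(2, 1024, 64)
    print(LocalGlobalFusion(64)(x).shape)  # torch.Size([2, 1024, 64])
```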
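For the EIC idea in (3), here is a hedged sketch of per-point edge features built from k-nearest-neighbor differences, followed by a simple exchange of class scores between neighboring points. The value of k, the layer sizes, the logit averaging, and all names are assumptions of this example; Section 2.4 gives the actual classifier.

```python
# Illustrative sketch only: per-point "edge" features from k-NN differences plus a
# simple exchange of class scores between neighbors, loosely in the spirit of the
# EIC description. k, layer sizes, the logit averaging, and all names are
# assumptions of this example.
import torch
import torch.nn as nn

def knn_indices(xyz: torch.Tensor, k: int) -> torch.Tensor:
    # xyz: (B, N, 3) coordinates; returns (B, N, k) indices of the k nearest points
    # (each point counts itself as its own nearest neighbor, which is fine here)
    dists = torch.cdist(xyz, xyz)                                  # (B, N, N)
    return dists.topk(k, dim=-1, largest=False).indices

class EdgeInteractiveHead(nn.Module):
    def __init__(self, channels: int, num_classes: int, k: int = 8):
        super().__init__()
        self.k = k
        self.edge_mlp = nn.Sequential(nn.Linear(2 * channels, channels), nn.ReLU())
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # xyz: (B, N, 3), feats: (B, N, C); a real implementation would gather
        # neighbor features more memory-efficiently for large point clouds
        B, N, C = feats.shape
        idx = knn_indices(xyz, self.k)                             # (B, N, k)
        neigh = torch.gather(
            feats.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C))             # (B, N, k, C)

        # edge feature: each point concatenated with its difference to each neighbor
        center = feats.unsqueeze(2).expand_as(neigh)
        edge = self.edge_mlp(torch.cat([center, neigh - center], dim=-1))
        edge = edge.max(dim=2).values                              # (B, N, C)
        logits = self.classifier(edge)                             # (B, N, num_classes)

        # "interaction": blend each point's class scores with its neighbors' scores,
        # which smooths predictions along object edges
        neigh_logits = torch.gather(
            logits.unsqueeze(1).expand(B, N, N, logits.size(-1)), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, logits.size(-1)))
        return 0.5 * logits + 0.5 * neigh_logits.mean(dim=2)

if __name__ == "__main__":
    xyz, feats = torch.randn(1, 512, 3), torch.randn(1, 512, 32)
    print(EdgeInteractiveHead(32, num_classes=6)(xyz, feats).shape)  # (1, 512, 6)
```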
2. Materials and Methods
2.1. MSFA-Net Model Construction
2.2. The Structure of the Double Attention Aggregation (DAA) Module
2.2.1. Multiscale Attention Aggregation Module
2.2.2. Bidirectional Adaptive Pooling Module
2.3. The Structure of the Contextual Feature Enhancement (CFE) Module
2.4. The Structure of the Edge Interactive Classifier (EIC) Module
3. Experiment
3.1. Experimental Platform
3.2. Dataset
3.3. Evaluation Indicators
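The tables in Section 4 report overall accuracy (OA), mean class accuracy (mAcc), and mean intersection over union (mIoU). Assuming the standard definitions of these metrics, the following minimal NumPy sketch shows how they can be computed from a per-class confusion matrix; function and variable names are hypothetical.

```python
# Minimal sketch of the standard metrics reported in Section 4 (OA, mAcc, mIoU),
# computed from a confusion matrix; names are assumptions of this example.
import numpy as np

def segmentation_metrics(conf: np.ndarray):
    """conf[i, j] counts points of true class i predicted as class j."""
    tp = np.diag(conf).astype(float)
    per_class_acc = tp / conf.sum(axis=1).clip(min=1)                     # recall per class
    iou = tp / (conf.sum(axis=1) + conf.sum(axis=0) - tp).clip(min=1)     # per-class IoU
    oa = tp.sum() / conf.sum()                                            # overall accuracy
    return oa, per_class_acc.mean(), iou.mean()                           # OA, mAcc, mIoU

# toy example with 3 classes
conf = np.array([[50, 2, 1],
                 [3, 40, 4],
                 [0, 5, 45]])
oa, macc, miou = segmentation_metrics(conf)
print(f"OA={oa:.3f}, mAcc={macc:.3f}, mIoU={miou:.3f}")
```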
4. Results and Discussion
4.1. Comparison with Other Methods
4.2. Ablation Study
4.2.1. Ablation Experiment of DAA Module
- (1) Ablation experiment of the BAP module
- (2) Ablation experiment of the MSAA module
- (3) DAA comparative experimental results and analysis
4.2.2. Ablation Experiment of the CFE Module
4.2.3. Ablation Experiment of the EIC Module
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Li, R.; Luo, T.; Zha, H. 3D Digitization and Its Applications in Cultural Heritage. In Proceedings of the Euro-Mediterranean Conference, Lemesos, Cyprus, 8–13 November 2010; pp. 381–388.
- Ji, A.; Chew, A.W.Z.; Xue, X.; Zhang, L. An encoder-decoder deep learning method for multi-class object segmentation from 3D tunnel point clouds. Autom. Constr. 2022, 137, 104187.
- Xie, Y.; Tian, J.; Zhu, X.X. Linking points with labels in 3D: A review of point cloud semantic segmentation. IEEE Geosci. Remote Sens. Mag. 2020, 8, 38–59.
- Cheng, S.; Chen, X.; He, X.; Liu, Z.; Bai, X. PRA-Net: Point relation-aware network for 3D point cloud analysis. IEEE Trans. Image Process. 2021, 30, 4436–4448.
- Chen, Y.; Liu, X.; Xiao, Y.; Zhao, Q.; Wan, S. Three-Dimensional Urban Land Cover Classification by Prior-Level Fusion of LiDAR Point Cloud and Optical Imagery. Remote Sens. 2021, 13, 4928.
- Pérez-Sinticala, C.; Janvier, R.; Brunetaud, X.; Treuillet, S.; Aguilar, R.; Castañeda, B. Evaluation of Primitive Extraction Methods from Point Clouds of Cultural Heritage Buildings. In Structural Analysis of Historical Constructions; RILEM Bookseries; Springer: Cham, Switzerland, 2019; pp. 2332–2341.
- Kivilcim, C.Ö.; Duran, Z. Parametric Architectural Elements from Point Clouds for HBIM Applications. Int. J. Environ. Geoinformatics 2021, 8, 144–149.
- Cheng, M.; Hui, L.; Xie, J.; Yang, J. SSPC-Net: Semi-supervised semantic 3D point cloud segmentation network. Proc. AAAI Conf. Artif. Intell. 2021, 35, 1140–1147.
- Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep learning for 3D point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338–4364.
- Le, T.; Duan, Y. PointGrid: A Deep Network for 3D Shape Understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9204–9214.
- Meng, H.Y.; Gao, L.; Lai, Y.K.; Manocha, D. VV-Net: Voxel VAE net with group convolutions for point cloud segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8500–8508.
- Milioto, A.; Vizzo, I.; Behley, J.; Stachniss, C. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 4213–4220.
- Lyu, Y.; Huang, X.; Zhang, Z. Learning to segment 3D point clouds in 2D image space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 12255–12264.
- Triess, L.T.; Peter, D.; Rist, C.B.; Zöllner, J.M. Scan-based Semantic Segmentation of LiDAR Point Clouds: An Experimental Study. In Proceedings of the 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, NV, USA, 19 October–13 November 2020; pp. 1116–1121.
- Chen, Y.; Liu, G.; Xu, Y.; Pan, P.; Xing, Y. PointNet++ Network Architecture with Individual Point Level and Global Features on Centroid for ALS Point Cloud Classification. Remote Sens. 2021, 13, 472.
- Qian, G.; Hammoud, H.; Li, G.; Thabet, A.; Ghanem, B. ASSANet: An Anisotropic Separable Set Abstraction for Efficient Point Cloud Representation Learning. Adv. Neural Inf. Process. Syst. 2021, 34, 28119–28130.
- Qian, G.; Li, Y.; Peng, H.; Mai, J.; Hammoud, H.; Elhoseiny, M.; Ghanem, B. PointNeXt: Revisiting PointNet++ with improved training and scaling strategies. Adv. Neural Inf. Process. Syst. 2022, 35, 23192–23204.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
- Hu, Q.; Yang, B.; Xie, L.; Rosa, S.; Guo, Y.; Wang, Z.; Trigoni, N.; Markham, A. RandLA-Net: Efficient semantic segmentation of large-scale point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11108–11117.
- Fan, S.; Dong, Q.; Zhu, F.; Lv, Y.; Ye, P.; Wang, F.Y. SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 14499–14508.
- Zeng, Z.; Xu, Y.; Xie, Z.; Tang, W.; Wan, J.; Wu, W. LACV-Net: Semantic Segmentation of Large-Scale Point Cloud Scene via Local Adaptive and Comprehensive VLAD. arXiv 2022, arXiv:2210.05870.
- Mao, Y.; Sun, X.; Chen, K.; Diao, W.; Guo, Z.; Lu, X.; Fu, K. Semantic segmentation for point cloud scenes via dilated graph feature aggregation and pyramid decoders. arXiv 2022, arXiv:2204.04944.
- Xue, Y.; Zhang, R.; Wang, J.; Zhao, J.; Pang, L. EEI-Net: Edge-enhanced interpolation network for semantic segmentation of historical building point clouds. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, 10, 239–245.
- Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
- Armeni, I.; Sax, S.; Zamir, A.R.; Savarese, S. Joint 2D-3D-semantic data for indoor scene understanding. arXiv 2017, arXiv:1702.01105.
- Shuai, H.; Xu, X.; Liu, Q. Backward Attentive Fusing Network With Local Aggregation Classifier for 3D Point Cloud Semantic Segmentation. IEEE Trans. Image Process. 2021, 30, 4973–4984.
- Su, Y.; Liu, W.; Yuan, Z.; Cheng, M.; Zhang, Z.; Shen, X.; Wang, C. DLA-Net: Learning dual local attention features for semantic segmentation of large-scale building facade point clouds. Pattern Recognit. 2022, 123, 108372.
- Landrieu, L.; Simonovsky, M. Large-scale point cloud semantic segmentation with superpoint graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4558–4567.
- Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. PointCNN: Convolution on X-transformed points. arXiv 2018, arXiv:1801.07791.
- Zhao, H.; Jiang, L.; Fu, C.W.; Jia, J. PointWeb: Enhancing local neighborhood features for point cloud processing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5565–5573.
- Zhang, Z.; Hua, B.S.; Yeung, S.K. ShellNet: Efficient point cloud convolutional neural networks using concentric shells statistics. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1607–1616.
Symbol | Explanation |
---|---|
 | K nearest neighbor |
 | Sum |
 | Subtract |
 | Concatenation |
 | Softmax |
 | Hadamard product |
 | Batch normalization with ReLU activation |
 | Transpose |
Method | OA (%) | mAcc (%) | mIoU (%) | Column | Door | Forehead | Wall | Roof | Window |
---|---|---|---|---|---|---|---|---|---|
RandLA-Net | 94.5 | 91.1 | 84.6 | 80.0 | 88.1 | 83.7 | 85.8 | 98.1 | 71.9 |
BAF-LAC | 93.4 | 90.2 | 82.4 | 79.1 | 85.2 | 83.4 | 80.3 | 97.1 | 69.3 |
DLA-Net | 93.8 | 91.7 | 84.2 | 85.8 | 86.0 | 82.8 | 80.0 | 96.6 | 76.9 |
SCF-Net | 95.0 | 92.0 | 85.0 | 81.9 | 86.9 | 84.5 | 84.8 | 98.3 | 73.8 |
Ours | 95.2 | 92.5 | 86.2 | 84.9 | 88.9 | 84.7 | 84.8 | 98.3 | 75.8 |
Method | OA (%) | mAcc (%) | mIoU (%) | Ceiling | Floor | Wall | Beam | Column | Window | Door | Table | Chair | Sofa | Bookcase | Board | Clutter |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PointNet | 78.6 | 66.2 | 47.6 | 88.0 | 88.7 | 69.3 | 42.4 | 23.1 | 47.5 | 51.6 | 51.4 | 42.0 | 9.6 | 38.2 | 29.4 | 35.2 |
SPG | 86.4 | 73.0 | 62.1 | 89.9 | 95.1 | 76.4 | 62.8 | 47.1 | 55.3 | 68.4 | 73.5 | 69.2 | 63.2 | 45.9 | 8.7 | 52.9 |
PointCNN | 88.1 | 75.6 | 65.4 | 94.8 | 97.3 | 75.8 | 63.3 | 51.7 | 58.4 | 57.2 | 71.6 | 69.1 | 39.1 | 61.2 | 52.2 | 58.6 |
PointWeb | 87.3 | 76.2 | 66.7 | 93.5 | 94.2 | 80.8 | 52.4 | 41.3 | 64.9 | 68.1 | 71.4 | 67.1 | 50.3 | 62.7 | 62.2 | 58.5 |
ShellNet | 87.1 | - | 66.8 | 90.2 | 93.6 | 79.9 | 60.4 | 44.1 | 64.9 | 52.9 | 71.6 | 84.7 | 53.8 | 64.6 | 48.6 | 59.4 |
KPConv | - | 79.1 | 70.6 | 93.6 | 92.4 | 83.1 | 63.9 | 54.3 | 66.1 | 76.6 | 57.8 | 64.0 | 69.3 | 74.9 | 61.3 | 60.3 |
RandLA-Net | 88.0 | 82.0 | 70.0 | 93.1 | 96.1 | 80.6 | 62.4 | 48.0 | 64.4 | 69.4 | 69.4 | 76.4 | 60.0 | 64.2 | 65.9 | 60.1 |
Ours | 88.7 | 82.6 | 71.6 | 92.8 | 97.0 | 81.7 | 64.0 | 53.8 | 64.1 | 70.8 | 72.5 | 81.9 | 61.5 | 64.5 | 66.2 | 60.6 |
Model Name | BAP | MSAA | CFE | EIC | OA (%) | mAcc (%) | mIoU (%) |
---|---|---|---|---|---|---|---|
None | | | | | 87.8 | 81.3 | 71.1 |
BAP | √ | | | | 94.1 | 89.5 | 83.0 |
MSAA | | √ | | | 94.8 | 92.1 | 85.3 |
CFE | | | √ | | 89.0 | 79.9 | 71.5 |
EIC | | | | √ | 88.8 | 80.8 | 71.7 |
BAP + MSAA (DAA) | √ | √ | | | 94.9 | 92.5 | 85.8 |
BAP + MSAA + CFE | √ | √ | √ | | 95.0 | 92.4 | 86.0 |
Ours | √ | √ | √ | √ | 95.2 | 92.5 | 86.2 |
Configuration | mIoU (%) |
---|---|
(1) None | 73.0 |
(2) CFE | 75.3 |
(3) Only global features | 63.2 |
(4) Only local features | 73.5 |
Configuration | mIoU (%) |
---|---|
(1) None | 73.0 |
(2) Only EIC | 75.9 |
(3) Replace edge features with neighbor features | 69.3 |
(4) Replace with average pooling | 74.0 |