Towards Urban Scene Semantic Segmentation with Deep Learning from LiDAR Point Clouds: A Case Study in Baden-Württemberg, Germany
Abstract
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Study Area and Data Acquisition
3.2. Reference Data Annotating
- (0) Unclassified: scanning reflections and classes including too few points, i.e., persons and bikes;
- (1) Natural ground: natural ground, terrain and grass;
- (2) Low vegetation: flowers, shrubs and small bushes;
- (3) High vegetation: trees and large bushes higher than 2 m;
- (4) Buildings: commercial and residential buildings (our dataset contained no industrial buildings);
- (5) Traffic roads: main roads, minor streets and highways;
- (6) Wire-structure connectors: power lines and utility lines;
- (7) Vehicles: cars, lorries and trucks;
- (8) Poles;
- (9) Hardscape: a cluttered class, including sculptures, stone statues and fountains;
- (10) Barriers: walls, fences and barriers; and
- (11) Pavements: footpaths, alleys and cycle paths.
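The class scheme above can be written down as a simple lookup table. This is a minimal sketch: the names `CLASS_NAMES` and `EVALUATED_CLASSES` are hypothetical, and excluding label 0 from evaluation is an assumption inferred from the results tables, which report 11 per-class columns.

```python
# Class indices and names as listed in Section 3.2.
CLASS_NAMES = {
    0: "Unclassified",
    1: "Natural ground",
    2: "Low vegetation",
    3: "High vegetation",
    4: "Buildings",
    5: "Traffic roads",
    6: "Wire-structure connectors",
    7: "Vehicles",
    8: "Poles",
    9: "Hardscape",
    10: "Barriers",
    11: "Pavements",
}

# Assumption: label 0 is ignored during evaluation, because the
# results tables list only the 11 classes from 1 to 11.
EVALUATED_CLASSES = [idx for idx in CLASS_NAMES if idx != 0]

if __name__ == "__main__":
    for idx in EVALUATED_CLASSES:
        print(idx, CLASS_NAMES[idx])
```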
3.3. Data of Train/Validation/Test Splitting
3.4. End-to-End Deep Learning Applied to Urban Furniture Segmentation
3.5. Evaluation
3.6. Implementation
4. Results
5. Discussion
5.1. Data Preparation
5.2. Point Clouds Sampling
5.3. Class-Balanced Loss Function
5.4. Appearance Information
5.5. Reference Data
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
ALS | Airborne LiDAR scanning |
ARandLA-Net | Adapted RandLA-Net |
CAD | Computer-aided design |
CE | Cross-Entropy |
CNNs | Convolutional neural networks |
DL | Deep learning |
DP | Dropout |
DNN | Deep neural network |
FC | Fully connected |
FP | False positive |
FN | False negative |
GPS | Global positioning system |
GT | Ground truths |
IoU | Intersection-over-Union |
LFA | Local feature aggregation |
LiDAR | Light detection and ranging |
Lovász | Lovász-Softmax |
mean Acc | Mean accuracy |
mean IoU | Mean Intersection-over-Union |
MLS | Mobile laser scanning |
MLPs | Multi-Layer Perceptrons |
NIR | Near infrared |
OA | Overall accuracy |
RF | Random forest |
RGB | Red–green–blue |
RNN | Recurrent Neural Network |
RS | Random sampling |
SVM | Support vector machine |
TLS | Terrestrial LiDAR scanning |
TN | True negative |
TP | True positive |
ULS | Unmanned LiDAR scanning |
UP | Up-Sampling |
UTM | Universal Transverse Mercator |
WCE | Weighted cross-entropy with inverse frequency |
WCES | Weighted cross-entropy with inverse square root frequency |
WCESL | A combination of the WCES and the Lovász |
2D | Two-Dimensional |
3D | Three-Dimensional |
Surroundings | Average Speed (km/h) | Points/m² on the Ground | Points/m² on the Wall | Points/m² on the Ceiling |
---|---|---|---|---|
Closed locality | 40 | 2500 | 1900 | 1500 |
Country road | 80 | 1250 | 950 | 750 |
Highway | 120 | 833 | 633 | 500 |
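Across the three rows, the listed densities scale inversely with the average driving speed (for example, 2500 × 40/80 = 1250). The arithmetic can be sketched as follows, assuming a constant scanner measurement rate so that density is proportional to 1/speed. The function name `scale_density` is hypothetical and not taken from the paper.

```python
def scale_density(ref_density, ref_speed_kmh, speed_kmh):
    """Scale a point density, assuming it is inversely proportional to speed."""
    return ref_density * ref_speed_kmh / speed_kmh


# Reference row of the table: closed locality at 40 km/h (points per square metre).
REF_SPEED_KMH = 40
REF_DENSITY = {"ground": 2500, "wall": 1900, "ceiling": 1500}

if __name__ == "__main__":
    for speed in (80, 120):
        row = {
            surface: round(scale_density(density, REF_SPEED_KMH, speed))
            for surface, density in REF_DENSITY.items()
        }
        print(speed, row)
```

Rounded to whole points, the scaled values match the country road and highway rows of the table.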
Sampling Method | mIoU (%) | OA (%) |
---|---|---|
Random sampling with constant density | 52.2 | 83.4 |
Grid sampling with constant density | 53.7 | 83.8 |
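The OA, mIoU and mAcc values reported in the tables can be related to the TP, FP and FN counts from the abbreviation list. The sketch below uses the standard definitions computed from a confusion matrix. Reading "mean Acc" as the mean per-class recall is an assumption, the function name `segmentation_metrics` is hypothetical, and every class is assumed to occur at least once so that no division by zero arises.

```python
import numpy as np


def segmentation_metrics(conf):
    """Compute OA, mIoU and mAcc from a confusion matrix.

    conf[i, j] is the number of points whose reference class is i
    and whose predicted class is j.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)            # correctly labelled points per class
    fn = conf.sum(axis=1) - tp    # reference points of the class that were missed
    fp = conf.sum(axis=0) - tp    # points wrongly assigned to the class
    iou = tp / (tp + fp + fn)     # per-class Intersection-over-Union
    acc = tp / (tp + fn)          # per-class accuracy (recall), an assumed reading of "Acc"
    return {
        "OA": tp.sum() / conf.sum(),
        "mIoU": iou.mean(),
        "mAcc": acc.mean(),
        "IoU": iou,
    }


if __name__ == "__main__":
    toy = [[8, 2],
           [1, 9]]
    print(segmentation_metrics(toy))
```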
Grid (m) | OA (%) | mIoU (%) | mAcc (%) | Natural Ground | Low Vegetation | High Vegetation | Buildings | Traffic Roads | Wire-Structure Connectors | Vehicles | Poles | Hardscape | Barriers | Pavements |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.04 | 72.7 | 47.6 | 65.0 | 42.1 | 43.5 | 75.6 | 74.5 | 78.2 | 23.5 | 71.5 | 52.0 | 4.6 | 13.7 | 43.9 |
0.06 | 77.7 | 50.6 | 67.0 | 48.0 | 55.5 | 78.7 | 73.8 | 74.9 | 27.9 | 74.3 | 51.9 | 4.7 | 23.3 | 43.3 |
0.08 | 79.9 | 53.6 | 71.1 | 53.1 | 53.7 | 82.5 | 76.5 | 73.7 | 31.1 | 80.0 | 58.5 | 6.4 | 25.2 | 48.4 |
0.10 | 81.3 | 52.4 | 76.5 | 55.6 | 53.1 | 84.8 | 75.9 | 69.5 | 30.6 | 79.4 | 54.0 | 7.8 | 19.6 | 52.4 |
0.12 | 81.3 | 50.9 | 76.9 | 56.6 | 51.2 | 85.3 | 75.7 | 69.4 | 27.1 | 74.3 | 53.6 | 5.3 | 15.6 | 45.4 |
0.14 | 82.8 | 52.1 | 80.1 | 60.5 | 53.1 | 86.0 | 76.2 | 67.3 | 35.5 | 74.0 | 56.1 | 6.7 | 15.1 | 42.7 |
0.20 | 83.9 | 54.4 | 80.5 | 63.3 | 53.7 | 86.7 | 73.9 | 63.2 | 57.8 | 75.7 | 64.9 | 8.7 | 11.2 | 39.6 |
0.30 | 85.0 | 50.0 | 81.6 | 65.8 | 54.6 | 87.8 | 71.2 | 64.0 | 53.9 | 69.9 | 24.9 | 9.2 | 10.2 | 38.4 |
0.40 | 85.4 | 47.8 | 81.9 | 65.1 | 56.1 | 88.7 | 70.1 | 64.4 | 41.8 | 59.6 | 19.1 | 8.7 | 11.0 | 40.7 |
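The grid sizes compared above control how strongly the point cloud is thinned before training. The following is a minimal sketch of grid subsampling, assuming that each occupied cubic cell is replaced by the centroid of its points. The function name `grid_subsample` is hypothetical, and the authors' implementation may select the representative point differently.

```python
import numpy as np


def grid_subsample(points, cell_size):
    """Return one centroid per occupied cubic cell of edge length `cell_size`.

    `points` is an (N, D) array whose first three columns are x, y, z;
    any extra columns (for example RGB) are averaged as well.
    """
    points = np.asarray(points, dtype=np.float64)
    # Integer cell index of every point along x, y and z.
    cells = np.floor(points[:, :3] / cell_size).astype(np.int64)
    # Group points by cell; `inverse` maps each point to its cell's row.
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.ravel()  # the shape of `inverse` differs between NumPy versions
    # Sum all points that fall into the same cell, then divide by the cell count.
    sums = np.zeros((counts.size, points.shape[1]))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]


if __name__ == "__main__":
    cloud = np.random.default_rng(0).uniform(0.0, 1.0, size=(1000, 3))
    for size in (0.04, 0.20, 0.40):
        print(size, grid_subsample(cloud, size).shape[0])
```

A larger cell size keeps fewer points per scene, which is consistent with the trade-off visible in the table between thin classes such as poles and large classes such as natural ground.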
Input Feature | mIoU (%) | OA (%) |
---|---|---|
x-y-z coordinates | 50.7 | 83.8 |
x-y-z coordinates and RGB values | 54.0 | 84.1 |
Input Feature | OA (%) | mIoU (%) | Natural Ground | Low Vegetation | High Vegetation | Buildings | Traffic Roads | Wire-Structure Connectors | Vehicles | Poles | Hardscape | Barriers | Pavements |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
x-y-z plus RGB | 83.0 | 51.4 | 61.4 | 50.7 | 86.9 | 71.4 | 62.2 | 47.4 | 76.1 | 57.1 | 7.7 | 10.9 | 34.3 |
x-y-z | 82.4 | 46.8 | 69.4 | 58.0 | 84.7 | 62.9 | 80.1 | 68.5 | 65.6 | 18.1 | 3.8 | 1.9 | 2.5 |
Loss | OA (%) | mIoU (%) | Natural Ground | Low Vegetation | High Vegetation | Buildings | Traffic Roads | Wire-Structure Connectors | Vehicles | Poles | Hardscape | Barriers | Pavements |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CE | 83.9 | 50.7 | 63.2 | 55.0 | 86.2 | 77.6 | 65.1 | 36.5 | 70.9 | 49.4 | 6.2 | 9.8 | 37.7 |
WCE | 84.2 | 53.2 | 63.3 | 56.9 | 87.0 | 73.9 | 66.1 | 46.6 | 73.3 | 58.1 | 7.5 | 12.6 | 40.4 |
WCES | 83.9 | 54.4 | 63.3 | 53.7 | 86.7 | 73.9 | 63.2 | 57.8 | 75.7 | 64.9 | 8.7 | 11.2 | 39.6 |
WCESL | 84.0 | 53.2 | 62.3 | 55.0 | 87.2 | 73.0 | 65.0 | 47.4 | 71.9 | 59.7 | 5.8 | 11.6 | 37.6 |
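The WCE and WCES rows differ only in how the class weights are derived from the class frequencies: inverse frequency versus inverse square root frequency. The sketch below shows both weighting schemes and a weighted cross-entropy in NumPy. The function names are hypothetical, and the absence of any further weight normalisation and the plain mean over points are assumptions, since the exact formulation is not given here.

```python
import numpy as np


def class_weights(counts, scheme):
    """Per-class weights from point counts.

    scheme "WCE"  -> 1 / frequency
    scheme "WCES" -> 1 / sqrt(frequency)
    """
    counts = np.asarray(counts, dtype=np.float64)
    freq = counts / counts.sum()
    if scheme == "WCE":
        return 1.0 / freq
    if scheme == "WCES":
        return 1.0 / np.sqrt(freq)
    raise ValueError(f"unknown scheme: {scheme}")


def weighted_cross_entropy(logits, labels, weights):
    """Mean weighted cross-entropy over N points.

    logits: (N, C) raw scores, labels: (N,) integer classes, weights: (C,).
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)
    weights = np.asarray(weights, dtype=np.float64)
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the reference class, scaled by its class weight.
    per_point = -weights[labels] * log_probs[np.arange(labels.size), labels]
    return per_point.mean()


if __name__ == "__main__":
    counts = [900, 100]  # toy example: one frequent and one rare class
    print(class_weights(counts, "WCE"))
    print(class_weights(counts, "WCES"))
```

The square root dampens the weights, so rare classes are still emphasised but less aggressively than with plain inverse frequency.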
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zou, Y.; Weinacker, H.; Koch, B. Towards Urban Scene Semantic Segmentation with Deep Learning from LiDAR Point Clouds: A Case Study in Baden-Württemberg, Germany. Remote Sens. 2021, 13, 3220. https://doi.org/10.3390/rs13163220