Visual Place Recognition of Robots via Global Features of Scan-Context Descriptors with Dictionary-Based Coding
Abstract
1. Introduction
- The accuracy of LiDAR place recognition is improved by extracting global features from the descriptors, which weakens the influence of similar descriptors generated by different regions.
- The retrieval dataset is transformed into text information by dictionary-based coding. This simplifies the final retrieval dataset and significantly improves retrieval speed.
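The page does not spell out the coding scheme, but the general idea behind dictionary-based coding for retrieval can be sketched as follows: cluster the global feature vectors into a small codebook of "words" with k-means, then index each place by its codeword, so that online retrieval is one encode plus one dictionary lookup instead of a full descriptor comparison. All function names, parameters, and sizes below are illustrative, not the authors' implementation:

```python
from collections import defaultdict
import numpy as np

def build_codebook(features, k, iters=20, seed=0):
    """Naive k-means: cluster global feature vectors into k codewords."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assign every feature to its nearest center
        dists = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return centers

def encode(feature, centers):
    """Map one feature vector to the index of its nearest codeword."""
    return int(np.argmin(np.linalg.norm(centers - feature, axis=1)))

def build_index(features, place_ids, centers):
    """Offline: word id -> list of place ids (the 'text' form of the dataset)."""
    index = defaultdict(list)
    for f, pid in zip(features, place_ids):
        index[encode(f, centers)].append(pid)
    return index

def retrieve(query_feature, centers, index):
    """Online: O(k) encode plus an O(1) lookup, no scan over all places."""
    return index.get(encode(query_feature, centers), [])
```

Because the database is reduced to a word-to-places dictionary, the per-query cost no longer grows with the number of stored descriptors, which is the mechanism behind the claimed retrieval-speed improvement.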
2. Related Works
3. Approach
3.1. Generation of Descriptors
3.2. Global Feature Extraction by CNN and Reclassification
3.3. Deep-Learning Network
3.4. Place Recognition
4. Experiments
4.1. Benchmark Datasets
4.2. Baseline Method
4.2.1. Scan Context
4.2.2. Pole Extraction
4.3. Experimental Results
4.3.1. Comparison of Accuracy
4.3.2. Global Feature Analysis
4.3.3. Runtime Evaluation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| Layer | Output Shape / Operation |
|---|---|
| In | (batch_size, 128, 128, 3) |
| Conv | VGG-16 network |
| Conv1 | Flatten (input_shape = vgg16.output_shape) |
| Fc1 | ReLU (FC (256, Conv1)) |
| Fc2 | Softmax (FC (N, Dropout (Fc1))) |
| Out | (batch_size, N) |
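The classification head in the table (Flatten, then FC-256 with ReLU, Dropout, and FC-N with Softmax on top of the VGG-16 backbone) can be sketched as an inference-time forward pass in numpy. The backbone itself is stubbed out; with a standard VGG-16 convolutional stack and its five pooling stages, a 128×128×3 input yields 4×4×512 feature maps, and dropout is the identity at inference. All weights below are illustrative placeholders:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def head_forward(backbone_out, w1, b1, w2, b2):
    """Flatten -> FC(256)+ReLU -> FC(N)+Softmax (dropout omitted at inference)."""
    x = backbone_out.reshape(backbone_out.shape[0], -1)  # Flatten
    x = relu(x @ w1 + b1)                                # Fc1: 256 units
    return softmax(x @ w2 + b2)                          # Fc2: N place classes

# Illustrative shapes; N = 579 matches the number of places in the dataset table
batch, n_places = 2, 579
feat = np.random.default_rng(0).normal(size=(batch, 4, 4, 512))
w1 = np.zeros((4 * 4 * 512, 256)); b1 = np.zeros(256)
w2 = np.zeros((256, n_places));    b2 = np.zeros(n_places)
probs = head_forward(feat, w1, b1, w2, b2)   # shape (batch, n_places)
```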
| Dataset | Training: 15 January 2012 | Testing | 8 January 2012 | 17 March 2012 | 4 August 2012 | 28 September 2012 |
|---|---|---|---|---|---|---|
| NCLT | 579 places | visible | 6171 | 5449 | 5464 | 4626 |
| NCLT | 579 places | invisible | 292 | 428 | 490 | 919 |
| Accuracy (NCLT Dataset) | 8 January 2012 | 17 March 2012 | 4 August 2012 | 28 September 2012 |
|---|---|---|---|---|
| BGF-system (Top_5) | 0.9033 | 0.7962 | 0.6817 | 0.7187 |
| Scan-context (Top_1) | 0.6924 | 0.6206 | 0.5644 | 0.5443 |
| Scan-context (Top_25) | 0.8004 | 0.7507 | 0.7313 | 0.7122 |
| Pole_extraction | 0.3725 | 0.3462 | 0.2888 | 0.2477 |
| Method | Descriptor Generation (s) | Retrieval (s) | Total (s) |
|---|---|---|---|
| BGF-system | 0.0434 | 0.000016 | 0.04341 |
| Scan-context | 0.0434 | 0.0047 | 0.0481 |
| Pole_extraction | 0.09 | 0.1 | 0.19 |
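The large gap in retrieval time is consistent with the asymptotics: scan-context retrieval compares the query against every stored descriptor, while the dictionary-coded dataset reduces the online step to an encode plus a lookup into a precomputed word-to-places index. The toy sketch below contrasts the two access patterns; the database size and the 2-bit sign code are purely illustrative, not the paper's coding:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 256
db = rng.normal(size=(n, d))          # stored global feature vectors
query = db[42] * 1.001                # slightly perturbed revisit of place 42

# Linear retrieval (scan-context style): one distance per stored descriptor
linear_hit = int(np.argmin(np.linalg.norm(db - query, axis=1)))

# Dictionary-coded retrieval: each place is encoded to a 'word' offline,
# so online retrieval is one encode plus one dictionary lookup.
words = (db[:, 0] > 0).astype(int) * 2 + (db[:, 1] > 0).astype(int)  # toy 2-bit code
index = {}
for pid, w in enumerate(words):
    index.setdefault(int(w), []).append(pid)
query_word = int(query[0] > 0) * 2 + int(query[1] > 0)
candidates = index[query_word]        # small candidate set instead of a full scan
```

The linear scan costs O(n·d) per query, while the lookup cost is independent of the database size, matching the table's near-constant BGF-system retrieving time.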
Share and Cite
Ye, M.; Tanaka, K. Visual Place Recognition of Robots via Global Features of Scan-Context Descriptors with Dictionary-Based Coding. Appl. Sci. 2023, 13, 9040. https://doi.org/10.3390/app13159040