Research on a Density-Based Clustering Method for Eliminating Inter-Frame Feature Mismatches in Visual SLAM Under Dynamic Scenes
Abstract
1. Introduction
- This paper proposes a density-based segmentation RANSAC method (DSSAC), which pre-processes the data by clustering them into dynamic and static clusters so that dynamic feature points can be accurately eliminated. Because the data are pre-clustered, the number of iterations required is significantly lower than in traditional RANSAC. The method combines geometric analysis with clustering, offering an efficient and accurate strategy for eliminating mismatches.
- RANSAC, a highly robust fitting method, excels at fitting high-quality data. When DSSAC is combined with RANSAC, the former removes noise and clustered dynamic points, while the latter performs a secondary screening of mismatched points, achieving higher accuracy and computational efficiency.
- This study evaluates the performance of the DSSAC-RANSAC method and verifies its effectiveness in eliminating mismatches in indoor and outdoor dynamic scenes. Using reprojection error mean, reprojection error variance, and processing time as evaluation metrics, experimental results demonstrate that DSSAC-RANSAC significantly outperforms traditional methods in eliminating dynamic feature points. The algorithm is applied to the initialization thread of ORB-SLAM2 and the tracking thread of ORB-SLAM3 to validate its feasibility within visual SLAM systems.
2. Related Works
3. Methodology
3.1. DSSAC-RANSAC Framework
1. Capture two corresponding 2D images with the camera.
2. Detect key points in both images, compute their descriptors, and output the ORB feature points.
3. Perform brute-force matching of the ORB features based on Hamming distance [19] to obtain the initial matching set.
4. Down-sample the initial matching set to obtain a reduced matching set.
5. Apply the DSSAC method proposed in this paper for local clustering to obtain the clustered set, then remove dynamic features to obtain the static point set.
6. Use the RANSAC method to perform local optimization on the static point set.
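Step (3), brute-force Hamming matching of binary ORB descriptors, can be sketched as follows. This is an illustrative NumPy version (the function name and the cross-check policy are assumptions for illustration; in practice OpenCV's `BFMatcher` with a Hamming norm and cross-checking plays this role):

```python
import numpy as np

def hamming_bf_match(desc1, desc2):
    """Brute-force matching of binary (ORB-style) descriptors by
    Hamming distance, with cross-checking: keep a pair (i, j) only if
    j is i's nearest neighbour AND i is j's nearest neighbour."""
    # XOR of byte rows, then popcount, gives pairwise Hamming distances.
    x = desc1[:, None, :] ^ desc2[None, :, :]      # (n1, n2, n_bytes)
    dist = np.unpackbits(x, axis=2).sum(axis=2)    # (n1, n2) bit counts
    fwd = dist.argmin(axis=1)                      # best j for each i
    bwd = dist.argmin(axis=0)                      # best i for each j
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]
```

The cross-check discards asymmetric nearest-neighbour pairs, which serves as a cheap first filter before the clustering-based elimination described below.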
3.2. Density-Based Segmentation RANSAC
1. Compute the distance matrix and cluster the matched point set into sub-clusters.
2. Initialize the parameters: set the reprojection error threshold and the minimum inlier ratio, select the matching set of the current cluster, and initialize the inlier count to zero.
3. Select four matching pairs from the current cluster and compute the homography matrix.
4. Using the homography matrix, count the inliers whose reprojection error, when points are projected from the first image into the second, is below the threshold.
5. Compute the inlier ratio of the cluster from the total number of matches in the cluster and the number of inliers that satisfy the condition.
6. If the inlier ratio is not less than the minimum inlier ratio, classify the cluster as a static cluster.
7. Iterate through all clusters in the dataset by their labels, repeating steps (3) to (6), and add all static points to the static point set.
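Steps (3)-(7) can be sketched as a minimal NumPy implementation. The parameter names (`err_thresh`, `min_inlier_ratio`) are assumptions, the homography is fitted with a plain DLT (four correspondences determine it exactly), and the cluster labels are assumed to come from the density-based clustering of step (1); the paper's exact procedure may differ in details:

```python
import numpy as np

def homography_from_pairs(src, dst):
    """Direct Linear Transform: estimate the 3x3 homography H
    mapping src points to dst points (>= 4 pairs required)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)       # null-space vector of A
    return H / H[2, 2]

def reprojection_errors(H, src, dst):
    """Project src through H and measure Euclidean distance to dst."""
    pts = np.hstack([src, np.ones((len(src), 1))])
    proj = pts @ H.T
    proj = proj[:, :2] / proj[:, 2:3]  # back to inhomogeneous coords
    return np.linalg.norm(proj - dst, axis=1)

def classify_clusters(matches, labels, err_thresh=3.0,
                      min_inlier_ratio=0.6, rng=None):
    """Per-cluster test: fit a homography on four sampled pairs, count
    inliers under the reprojection-error threshold, and keep the whole
    cluster (static) if the inlier ratio clears the minimum.
    `matches` is an (N, 4) array of (x1, y1, x2, y2) rows."""
    rng = np.random.default_rng(rng)
    static = []
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        if len(idx) < 4:
            continue  # too few pairs to fit a homography
        sample = rng.choice(idx, 4, replace=False)
        H = homography_from_pairs(matches[sample, :2], matches[sample, 2:])
        errs = reprojection_errors(H, matches[idx, :2], matches[idx, 2:])
        if np.mean(errs < err_thresh) >= min_inlier_ratio:
            static.extend(idx)     # static cluster: keep all its points
    return np.sort(np.array(static, int))
```

A static cluster moves coherently with the camera, so one homography explains most of its matches; a dynamic cluster does not, its inlier ratio stays low, and the whole cluster is discarded, which is what spares the subsequent RANSAC pass most of its iterations.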
3.3. Image Feature Mismatch Elimination Method Combining DSSAC and RANSAC
4. Experiments
4.1. Datasets and Evaluation Metrics
4.2. Mismatch Elimination in Different Scenarios
4.3. Improvements to the ORB-SLAM3 Feature Matching Module Using the DSSAC-RANSAC Algorithm
4.3.1. Monocular Initialization of ORB-SLAM2
4.3.2. Tracking Thread of ORB-SLAM3
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Abaspur Kazerouni, I.; Fitzgerald, L.; Dooly, G.; Toal, D. A Survey of State-of-the-Art on Visual SLAM. Expert Syst. Appl. 2022, 205, 117734.
2. Fuentes-Pacheco, J.; Ruiz-Ascencio, J.; Rendón-Mancha, J.M. Visual Simultaneous Localization and Mapping: A Survey. Artif. Intell. Rev. 2015, 43, 55–81.
3. Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.; Leonard, J.J. Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age. IEEE Trans. Robot. 2016, 32, 1309–1332.
4. Raguram, R.; Chum, O.; Pollefeys, M.; Matas, J.; Frahm, J.-M. USAC: A Universal Framework for Random Sample Consensus. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 2022–2038.
5. Peng, L.; Zhang, Y.; Zhou, H.; Lu, T. A Robust Method for Estimating Image Geometry With Local Structure Constraint. IEEE Access 2018, 6, 20734–20747.
6. Chum, O.; Matas, J. Matching with PROSAC—Progressive Sample Consensus. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 220–226.
7. Chum, O.; Matas, J.; Kittler, J. Locally Optimized RANSAC. In Pattern Recognition; Michaelis, B., Krell, G., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2003; Volume 2781, pp. 236–243. ISBN 978-3-540-40861-1.
8. Yang, Z.; He, Y.; Zhao, K.; Lang, Q.; Duan, H.; Xiong, Y.; Zhang, D. Research on Inter-Frame Feature Mismatch Removal Method of VSLAM in Dynamic Scenes. Sensors 2024, 24, 1007.
9. Myatt, D.R. Robust Estimation in High Noise and Highly Dimensional Data Sets with Applications to Machine Vision. arXiv 2024, arXiv:2409.12805.
10. Dubrofsky, E. Homography Estimation. Master’s Thesis, University of British Columbia, Vancouver, BC, Canada, 2009.
11. Fischler, M.A.; Bolles, R.C. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Commun. ACM 1981, 24, 381–395.
12. Ester, M.; Kriegel, H.-P.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), Portland, OR, USA, 2–4 August 1996; Volume 96, pp. 226–231.
13. Mumtaz, K. An Analysis on Density Based Clustering of Multi Dimensional Spatial Data. Indian J. Comput. Sci. Eng. 2010, 1, 8–12.
14. Jiang, X.; Ma, J.; Jiang, J.; Guo, X. Robust Feature Matching Using Spatial Clustering with Heavy Outliers. IEEE Trans. Image Process. 2020, 29, 736–746.
15. Liu, P.; Zhou, D.; Wu, N. VDBSCAN: Varied Density Based Spatial Clustering of Applications with Noise. In Proceedings of the 2007 International Conference on Service Systems and Service Management, Chengdu, China, 9–11 June 2007.
16. Liu, B. A Fast Density-Based Clustering Algorithm for Large Databases. In Proceedings of the 2006 International Conference on Machine Learning and Cybernetics, Dalian, China, 13–16 August 2006; pp. 996–1000.
17. Hu, L.; Zuo, W.; Zhang, J. A Mismatch Elimination Method Based on Reverse Nearest Neighborhood and Influence Space. J. Comput. Aided Des. Comput. Graph. 2022, 34, 449–458.
18. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An Efficient Alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
19. Hadrovic, E.; Osmankovic, D.; Velagic, J. Aerial Image Mosaicing Approach Based on Feature Matching. In Proceedings of the 2017 International Symposium ELMAR, Zadar, Croatia, 18–20 September 2017; pp. 177–180.
20. Tenorth, M.; Bandouch, J.; Beetz, M. The TUM Kitchen Data Set of Everyday Manipulation Activities for Motion Tracking and Action Recognition. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), Kyoto, Japan, 27 September–4 October 2009; pp. 1089–1096.
21. Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision Meets Robotics: The KITTI Dataset. Int. J. Robot. Res. 2013, 32, 1231–1237.
Reprojection error and processing time with 1500 ORB feature points:

| Sequences | Metrics | RAN | G-R | G-ATR | D-R |
|---|---|---|---|---|---|
| TUM (Indoor dynamic) | Error mean | 2.437 | 1.913 | 1.237 | 1.084 |
| | Error variance | 1.237 | 0.813 | 0.684 | 0.284 |
| | Time (ms) | 185 | 73 | 67 | 67 |
| KITTI (Outdoor dynamic) | Error mean | 2.372 | 1.897 | 1.351 | 0.957 |
| | Error variance | 1.113 | 0.901 | 0.705 | 0.253 |
| | Time (ms) | 187 | 82 | 70 | 66 |
| Self-collected (Indoor dynamic) | Error mean | 2.675 | 2.125 | 1.411 | 1.021 |
| | Error variance | 1.352 | 1.021 | 0.825 | 0.275 |
| | Time (ms) | 178 | 78 | 68 | 58 |
| Self-collected (Outdoor dynamic) | Error mean | 2.771 | 2.095 | 1.382 | 1.102 |
| | Error variance | 1.531 | 1.103 | 0.833 | 0.301 |
| | Time (ms) | 189 | 79 | 63 | 54 |
Reprojection error and processing time with 2000 ORB feature points:

| Sequences | Metrics | RAN | G-R | G-ATR | D-R |
|---|---|---|---|---|---|
| TUM (Indoor dynamic) | Error mean | 2.455 | 1.701 | 1.114 | 1.048 |
| | Error variance | 1.155 | 0.801 | 0.501 | 0.248 |
| | Time (ms) | 240 | 111 | 83 | 73 |
| KITTI (Outdoor dynamic) | Error mean | 2.351 | 1.738 | 1.125 | 1.106 |
| | Error variance | 1.038 | 0.786 | 0.522 | 0.283 |
| | Time (ms) | 237 | 109 | 83 | 71 |
| Self-collected (Indoor dynamic) | Error mean | 2.536 | 1.835 | 1.221 | 0.973 |
| | Error variance | 1.257 | 0.897 | 0.613 | 0.375 |
| | Time (ms) | 221 | 107 | 79 | 69 |
| Self-collected (Outdoor dynamic) | Error mean | 2.702 | 1.844 | 1.432 | 1.124 |
| | Error variance | 1.215 | 0.911 | 0.671 | 0.331 |
| | Time (ms) | 217 | 109 | 80 | 71 |
Reprojection error and processing time with 2500 ORB feature points:

| Sequences | Metrics | RAN | G-R | G-ATR | D-R |
|---|---|---|---|---|---|
| TUM (Indoor dynamic) | Error mean | 2.625 | 1.515 | 1.187 | 1.138 |
| | Error variance | 1.325 | 0.715 | 0.578 | 0.314 |
| | Time (ms) | 299 | 131 | 102 | 82 |
| KITTI (Outdoor dynamic) | Error mean | 2.513 | 1.419 | 1.232 | 1.056 |
| | Error variance | 1.287 | 0.749 | 0.613 | 0.289 |
| | Time (ms) | 287 | 134 | 102 | 81 |
| Self-collected (Indoor dynamic) | Error mean | 2.633 | 1.533 | 1.319 | 1.121 |
| | Error variance | 1.397 | 0.813 | 0.647 | 0.348 |
| | Time (ms) | 275 | 129 | 100 | 83 |
| Self-collected (Outdoor dynamic) | Error mean | 2.671 | 1.601 | 1.281 | 1.005 |
| | Error variance | 1.433 | 0.781 | 0.634 | 0.482 |
| | Time (ms) | 273 | 129 | 99 | 77 |
Improvement of D-R over the other methods with 1500 ORB feature points:

| Sequences | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) |
|---|---|---|---|---|---|---|---|---|---|
| TUM (Indoor dynamic) | 55.5 | 77.0 | 12.4 | 43.3 | 65.0 | 58.5 | 63.8 | 8.2 | 0.0 |
| KITTI (Outdoor dynamic) | 59.7 | 77.2 | 29.2 | 49.6 | 77.3 | 64.1 | 64.7 | 19.5 | 5.7 |
| Self-collected (Indoor dynamic) | 61.8 | 79.7 | 27.6 | 52.0 | 73.1 | 66.7 | 71.4 | 25.6 | 14.7 |
| Self-collected (Outdoor dynamic) | 60.2 | 80.3 | 20.2 | 47.4 | 72.7 | 63.7 | 71.4 | 31.6 | 14.3 |
| AVG. | 59.3 | 78.6 | 22.4 | 48.1 | 72.0 | 63.3 | 67.8 | 21.2 | 8.7 |
Improvement of D-R over the other methods with 2000 ORB feature points:

| Sequences | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) |
|---|---|---|---|---|---|---|---|---|---|
| TUM (Indoor dynamic) | 57.3 | 38.4 | 6.0 | 78.5 | 69.0 | 50.0 | 69.6 | 34.2 | 12.0 |
| KITTI (Outdoor dynamic) | 53.0 | 36.4 | 2.0 | 72.7 | 64.0 | 45.8 | 70.0 | 34.9 | 14.5 |
| Self-collected (Indoor dynamic) | 61.6 | 47.0 | 20.3 | 70.2 | 58.2 | 38.8 | 68.8 | 35.5 | 12.3 |
| Self-collected (Outdoor dynamic) | 58.4 | 39.0 | 21.5 | 72.8 | 63.7 | 50.7 | 67.3 | 34.9 | 11.2 |
| AVG. | 57.6 | 40.2 | 12.5 | 73.6 | 63.7 | 46.3 | 68.9 | 34.9 | 12.5 |
Improvement of D-R over the other methods with 2500 ORB feature points:

| Sequences | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) |
|---|---|---|---|---|---|---|---|---|---|
| TUM (Indoor dynamic) | 56.6 | 24.9 | 4.1 | 76.3 | 56.1 | 45.7 | 72.6 | 37.4 | 19.6 |
| KITTI (Outdoor dynamic) | 58.0 | 25.6 | 14.3 | 77.5 | 61.4 | 52.9 | 71.8 | 39.6 | 20.6 |
| Self-collected (Indoor dynamic) | 57.4 | 26.9 | 15.0 | 75.1 | 57.2 | 46.2 | 69.8 | 35.7 | 17.0 |
| Self-collected (Outdoor dynamic) | 62.4 | 37.2 | 21.5 | 66.4 | 38.3 | 24.0 | 71.8 | 40.3 | 22.2 |
| AVG. | 58.6 | 28.7 | 13.7 | 73.8 | 53.3 | 42.2 | 71.5 | 38.3 | 19.9 |
Average improvement of D-R at each ORB feature-point budget:

| ORB Feature Points | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) | For RAN (%) | For G-R (%) | For G-ATR (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1500 | 59.3 | 78.6 | 22.4 | 48.1 | 72.0 | 63.3 | 67.8 | 21.2 | 8.7 |
| 2000 | 57.6 | 40.2 | 12.5 | 73.6 | 63.7 | 46.3 | 68.9 | 34.9 | 12.5 |
| 2500 | 58.6 | 28.7 | 13.7 | 73.8 | 53.3 | 42.2 | 71.5 | 38.3 | 19.9 |
| AVG. | 58.5 | 49.2 | 16.2 | 65.2 | 63.0 | 50.6 | 69.4 | 31.5 | 13.7 |
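Each AVG. value above is the column-wise mean of the per-budget rows, which a short script can confirm:

```python
# Nine improvement percentages per ORB feature-point budget,
# copied from the rows of the summary table above.
rows = {
    1500: [59.3, 78.6, 22.4, 48.1, 72.0, 63.3, 67.8, 21.2, 8.7],
    2000: [57.6, 40.2, 12.5, 73.6, 63.7, 46.3, 68.9, 34.9, 12.5],
    2500: [58.6, 28.7, 13.7, 73.8, 53.3, 42.2, 71.5, 38.3, 19.9],
}
# Column-wise mean, rounded to one decimal as in the table.
avg = [round(sum(col) / len(col), 1) for col in zip(*rows.values())]
print(avg)  # [58.5, 49.2, 16.2, 65.2, 63.0, 50.6, 69.4, 31.5, 13.7]
```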
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, Z.; Zhao, K.; Yang, S.; Xiong, Y.; Zhang, C.; Deng, L.; Zhang, D. Research on a Density-Based Clustering Method for Eliminating Inter-Frame Feature Mismatches in Visual SLAM Under Dynamic Scenes. Sensors 2025, 25, 622. https://doi.org/10.3390/s25030622