Range-Intensity-Profile-Guided Gated Light Ranging and Imaging Based on a Convolutional Neural Network
Abstract
1. Introduction
- (1) The RIP-Gated3D method is proposed to obtain depth maps of high spatial resolution and high accuracy with the gated LiRAI system;
- (2) A network that exploits both "range-intensity" depth cues and semantic depth cues in two gated images is proposed to generate depth maps;
- (3) Synthetic training data are generated from the real RIP of our gated LiRAI system and scene data from GTAV; the network is trained mainly on these synthetic data and fine-tuned with a small amount of real range-intensity profile data;
- (4) We validate our method on a synthetic dataset and a real-scene dataset; the network generates depth maps of high accuracy and avoids the distorted and blurry edges produced by other deep-learning-based methods.
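As background for the "range-intensity" depth cue in contribution (2): in classical triangular-RIP range-gated imaging, two delayed gates produce range-intensity profiles that overlap, and within the overlap the normalized intensity ratio of the two gated images maps linearly to depth. The sketch below illustrates this principle only; the gate range `Z_NEAR`/`Z_FAR` values and the ideal-triangular-profile assumption are hypothetical, not taken from this paper.

```python
import numpy as np

# Hypothetical gate geometry: two gates whose triangular range-intensity
# profiles overlap between Z_NEAR and Z_FAR (metres). In the overlap,
# gate 1 falls off linearly with range while gate 2 rises, so the
# normalized ratio I2 / (I1 + I2) increases linearly from 0 to 1.
Z_NEAR, Z_FAR = 30.0, 60.0

def depth_from_gated_pair(i1, i2, eps=1e-6):
    """Per-pixel depth from two gated images, assuming ideal triangular RIPs."""
    i1 = np.asarray(i1, dtype=float)
    i2 = np.asarray(i2, dtype=float)
    r = i2 / (i1 + i2 + eps)          # 0 near Z_NEAR, 1 near Z_FAR
    return Z_NEAR + r * (Z_FAR - Z_NEAR)
```

For example, a pixel with equal intensity in both gated images lies halfway through the overlap region. Real profiles deviate from this ideal triangle, which is why the paper learns from measured RIPs instead of relying on this closed form.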
2. RIP-Gated3D Method
2.1. General Technical Solution
2.2. Dataset
2.2.1. Real Range-Intensity Profile
2.2.2. Real Data
2.2.3. Synthetic Data
2.3. Network Architecture
2.3.1. GIR Module
2.3.2. Multi-Scale Semantic Module
2.4. Implementation Details
3. Experiment and Results
3.1. Experiment
3.2. Results on Synthetic Dataset
3.3. Results on Real-Scene Dataset
3.4. Ablation Study
4. Conclusions and Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| Method | MAE [m] | RMSE [m] | AbsRel [%] | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
|---|---|---|---|---|---|---|
| Numerical method | 0.624 | 0.686 | 71.14 | 0.367 | 0.523 | 0.678 |
| Multilayer perceptron | 0.249 | 0.288 | 30.20 | 0.589 | 0.780 | 0.936 |
| Gated2Depth network | 0.040 | 0.088 | 2.00 | 0.996 | 0.999 | 1.000 |
| FCRN | 0.081 | 0.179 | 4.48 | 0.988 | 0.997 | 0.999 |
| DORN | 0.795 | 0.825 | 63.70 | 0.155 | 0.280 | 0.363 |
| RIRS-net (Ours) | 0.014 | 0.022 | 0.91 | 1.000 | 1.000 | 1.000 |
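The error metrics reported in these tables (MAE, RMSE, AbsRel, and the three threshold-accuracy columns) can be computed as in the following sketch. It assumes the conventional depth-estimation definitions, including δ accuracy thresholds of 1.25, 1.25², and 1.25³; the paper's exact definitions may differ.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Common depth-estimation metrics over valid pixels.

    pred, gt: predicted and ground-truth depth maps in metres.
    Returns (MAE [m], RMSE [m], AbsRel [%], [delta1, delta2, delta3]).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    err = pred - gt
    mae = np.abs(err).mean()                     # mean absolute error
    rmse = np.sqrt((err ** 2).mean())            # root-mean-square error
    absrel = (np.abs(err) / gt).mean() * 100.0   # absolute relative error, %
    # Fraction of pixels whose prediction/truth ratio is within each threshold.
    ratio = np.maximum(pred / gt, gt / pred)
    deltas = [float((ratio < 1.25 ** k).mean()) for k in (1, 2, 3)]
    return mae, rmse, absrel, deltas
```

Under these definitions, columns five to seven of the tables above approach 1.000 as nearly all pixels fall within the tightest ratio threshold.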
| Method | Target Reflectance | MAE [m] | RMSE [m] | AbsRel [%] |
|---|---|---|---|---|
| Numerical method | 10% | 0.554 | 0.651 | 3.57 |
| Numerical method | 90% | 0.563 | 0.656 | 3.62 |
| Multilayer perceptron | 10% | 0.060 | 0.169 | 0.93 |
| Multilayer perceptron | 90% | 0.044 | 0.159 | 0.87 |
| Gated2Depth network | 10% | 0.052 | 0.073 | 0.39 |
| Gated2Depth network | 90% | 0.037 | 0.059 | 0.31 |
| FCRN | 10% | 0.053 | 0.075 | 0.40 |
| FCRN | 90% | 0.034 | 0.063 | 0.33 |
| DORN | 10% | 0.808 | 0.814 | 5.24 |
| DORN | 90% | 0.812 | 0.817 | 5.26 |
| RIRS-net (Ours) | 10% | 0.045 | 0.067 | 0.36 |
| RIRS-net (Ours) | 90% | 0.027 | 0.054 | 0.28 |
| Method | MAE [m] | RMSE [m] | AbsRel [%] | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
|---|---|---|---|---|---|---|
| GIR module | 0.063 | 0.087 | 4.24 | 0.996 | 1.000 | 1.000 |
| Multi-scale semantic module without spatial attention module | 0.037 | 0.073 | 2.10 | 0.997 | 1.000 | 1.000 |
| Multi-scale semantic module | 0.028 | 0.050 | 1.88 | 0.999 | 1.000 | 1.000 |
| RIRS-net without spatial attention module | 0.014 | 0.023 | 1.24 | 1.000 | 1.000 | 1.000 |
| RIRS-net (Ours) | 0.014 | 0.022 | 0.91 | 1.000 | 1.000 | 1.000 |
| Method | Target Reflectance | MAE [m] | RMSE [m] | AbsRel [%] |
|---|---|---|---|---|
| GIR module | 10% | 0.064 | 0.108 | 0.57 |
| GIR module | 90% | 0.039 | 0.073 | 0.39 |
| Multi-scale semantic module without spatial attention module | 10% | 0.050 | 0.072 | 0.39 |
| Multi-scale semantic module without spatial attention module | 90% | 0.036 | 0.060 | 0.32 |
| Multi-scale semantic module | 10% | 0.049 | 0.071 | 0.38 |
| Multi-scale semantic module | 90% | 0.033 | 0.060 | 0.32 |
| RIRS-net without spatial attention module | 10% | 0.046 | 0.067 | 0.36 |
| RIRS-net without spatial attention module | 90% | 0.028 | 0.054 | 0.28 |
| RIRS-net (Ours) | 10% | 0.045 | 0.067 | 0.36 |
| RIRS-net (Ours) | 90% | 0.027 | 0.054 | 0.28 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xia, C.; Wang, X.; Sun, L.; Zhang, Y.; Song, B.; Zhou, Y. Range-Intensity-Profile-Guided Gated Light Ranging and Imaging Based on a Convolutional Neural Network. Sensors 2024, 24, 2151. https://doi.org/10.3390/s24072151