Efficient Stereo Depth Estimation for Pseudo-LiDAR: A Self-Supervised Approach Based on Multi-Input ResNet Encoder
Abstract
1. Introduction
2. Related Works
3. Method
3.1. Stereo Training Using Depth Network
3.2. Dataset Splitting
3.3. Point Cloud Back-Projection
3.4. Post-Processing Step
3.5. Evaluation Metric
4. Experiment and Results
Algorithm 1. Overall steps: image pair to point-cloud generation.

Input: Stereo image pair
Output: 3D point cloud of the environment

1: Initialize the encoder and decoder models
2: Initialize the model and input size
3: Initialize the calibration parameters (intrinsic and projection matrices)
4: while image frames are available do
5:   Read the image pair
6:   Convert the images to torch tensors
7:   Concatenate the image pair along the channel dimension
8:   Extract features using the encoder network
9:   Predict the depth map using the decoder network
10:  Interpolate the output if its size differs from the target size
11:  Squeeze the output to an array
12:  Back-project the disparity to 3D points
13:  Convert the points to a point field for visualization
14: end while
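To make the loop concrete, below is a minimal PyTorch/NumPy sketch of one pass through Algorithm 1. It assumes a trained encoder/decoder pair that maps a 6-channel stereo tensor to a single-channel disparity map, and a `calib` dictionary holding focal length `f` (pixels), baseline `B` (metres), and principal point `(cx, cy)`; these names and interfaces are illustrative placeholders, not the authors' released code.

```python
import numpy as np
import torch
import torch.nn.functional as F

def disparity_to_point_cloud(disp, focal_px, baseline_m, cx, cy):
    """Back-project a disparity map (H x W, in pixels) to an N x 3 point cloud."""
    h, w = disp.shape
    depth = focal_px * baseline_m / np.clip(disp, 1e-6, None)   # Z = f * B / d
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

@torch.no_grad()
def frame_to_points(encoder, decoder, left, right, calib, out_hw=(192, 640)):
    """One iteration of the while-loop in Algorithm 1 (steps 5-13)."""
    # Steps 5-7: images arrive as H x W x 3 uint8 arrays; stack the left and
    # right views into a single 6-channel tensor for the multi-input encoder.
    pair = np.concatenate([left, right], axis=-1).astype(np.float32) / 255.0
    x = torch.from_numpy(pair).permute(2, 0, 1).unsqueeze(0)    # 1 x 6 x H x W
    # Steps 8-9: encoder features -> decoder disparity (assumed here to return
    # a single-channel map; the real decoder may emit multi-scale outputs).
    disp = decoder(encoder(x))                                   # 1 x 1 x h x w
    # Step 10: resize to the working resolution if the decoder output differs.
    if disp.shape[-2:] != out_hw:
        disp = F.interpolate(disp, size=out_hw, mode="bilinear",
                             align_corners=False)
    # Steps 11-12: squeeze to a NumPy array and back-project with calibration.
    disp_np = disp.squeeze().cpu().numpy()
    return disparity_to_point_cloud(disp_np, calib["f"], calib["B"],
                                    calib["cx"], calib["cy"])
```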
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Jennings, D.; Figliozzi, M. Study of Road Autonomous Delivery Robots and Their Potential Effects on Freight Efficiency and Travel. Transp. Res. Rec. 2020, 2674, 1019–1029. [Google Scholar] [CrossRef]
- Chang, J.R.; Chen, Y.S. Pyramid Stereo Matching Network. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5410–5418. [Google Scholar]
- Godard, C.; Aodha, O.M.; Firman, M.; Brostow, G. Digging into Self-Supervised Monocular Depth Estimation. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3827–3837. [Google Scholar]
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision Meets Robotics: The KITTI Dataset. Int. J. Rob. Res. 2013, 32, 1231–1237. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015; Springer: Berlin/Heidelberg, Germany, 2015; Volume 9351, pp. 234–241. [Google Scholar]
- Godard, C.; Mac Aodha, O.; Brostow, G.J. Unsupervised Monocular Depth Estimation with Left-Right Consistency. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 6602–6611. [Google Scholar]
- Yang, Z.; Wang, P.; Xu, W.; Zhao, L.; Nevatia, R. Unsupervised Learning of Geometry from Videos with Edge-Aware Depth-Normal Consistency. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence AAAI 2018, New Orleans, LA, USA, 2–7 February 2018; pp. 7493–7500. [Google Scholar]
- Saxena, A.; Chung, S.H.; Ng, A.Y. Learning Depth from Single Monocular Images. Adv. Neural Inf. Process. Syst. 2005, 18, 1161–1168. [Google Scholar]
- Eigen, D.; Puhrsch, C.; Fergus, R. Depth Map Prediction from a Single Image Using a Multi-Scale Deep Network. Adv. Neural Inf. Process. Syst. 2014, 27, 2366–2374. [Google Scholar] [CrossRef]
- Kundu, J.N.; Uppala, P.K.; Pahuja, A.; Babu, R.V. AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2656–2665. [Google Scholar]
- Lee, J.H.; Han, M.-K.; Ko, D.W.; Suh, I.H. From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation. arXiv 2019, arXiv:1907.10326. [Google Scholar]
- Miangoleh, S.H.M.; Dille, S.; Mai, L.; Paris, S.; Aksoy, Y. Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 9680–9689. [Google Scholar]
- Xie, J.; Girshick, R.; Farhadi, A. Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks. In Computer Vision—ECCV 2016; Springer: Berlin/Heidelberg, Germany, 2016; Volume 9908 LNCS, pp. 842–857. [Google Scholar]
- Pilzer, A.; Xu, D.; Puscas, M.; Ricci, E.; Sebe, N. Unsupervised Adversarial Depth Estimation Using Cycled Generative Networks. In Proceedings of the 2018 International Conference on 3D Vision, 3DV 2018, Verona, Italy, 5–8 September 2018; pp. 587–595. [Google Scholar]
- Luo, Y.; Ren, J.; Lin, M.; Pang, J.; Sun, W.; Li, H.; Lin, L. Single View Stereo Matching. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 155–163. [Google Scholar]
- Guo, X.; Yang, K.; Yang, W.; Wang, X.; Li, H. Group-Wise Correlation Stereo Network. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3268–3277. [Google Scholar]
- Lipson, L.; Teed, Z.; Deng, J. RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching. In Proceedings of the 2021 International Conference on 3D Vision, 3DV 2021, London, UK, 1–3 December 2021; pp. 218–227. [Google Scholar]
- Gu, X.; Fan, Z.; Zhu, S.; Dai, Z.; Tan, F.; Tan, P. Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2492–2501. [Google Scholar]
- Jensen, R.; Dahl, A.; Vogiatzis, G.; Tola, E.; Aanaes, H. Large Scale Multi-View Stereopsis Evaluation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 406–413. [Google Scholar]
- Shen, Z.; Dai, Y.; Rao, Z. CFNET: Cascade and Fused Cost Volume for Robust Stereo Matching. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13901–13910. [Google Scholar]
- Chabra, R.; Straub, J.; Sweeney, C.; Newcombe, R.; Fuchs, H. Stereodrnet: Dilated Residual Stereonet. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 11778–11787. [Google Scholar]
- Cheng, X.; Zhong, Y.; Harandi, M.; Dai, Y.; Chang, X.; Drummond, T.; Li, H.; Ge, Z. Hierarchical Neural Architecture Search for Deep Stereo Matching. Adv. Neural Inf. Process. Syst. 2020, 2020, 22158–22169. [Google Scholar]
- Menze, M.; Heipke, C.; Geiger, A. Joint 3d Estimation of Vehicles and Scene Flow. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, 2, 427. [Google Scholar] [CrossRef]
- Menze, M.; Heipke, C.; Geiger, A. Object Scene Flow. ISPRS J. Photogramm. Remote Sens. 2018, 140, 60–76. [Google Scholar] [CrossRef]
- Xu, G.; Cheng, J.; Guo, P.; Yang, X. Attention Concatenation Volume for Accurate and Efficient Stereo Matching. arXiv 2022, arXiv:2203.02146. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
- Zhou, T.; Brown, M.; Snavely, N.; Lowe, D.G. Unsupervised Learning of Depth and Ego-Motion from Video. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 6612–6621. [Google Scholar]
- Wang, C.; Buenaposada, J.M.; Zhu, R.; Lucey, S. Learning Depth from Monocular Videos Using Direct Methods. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2022–2030. [Google Scholar]
- Wang, Y.; Chao, W.L.; Garg, D.; Hariharan, B.; Campbell, M.; Weinberger, K.Q. Pseudo-Lidar from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8437–8445. [Google Scholar]
- Uhrig, J.; Schneider, N.; Schneider, L.; Franke, U.; Brox, T.; Geiger, A. Sparsity Invariant CNNs. In Proceedings of the 2017 International Conference on 3D Vision, 3DV 2017, Qingdao, China, 10–12 October 2017; pp. 11–20. [Google Scholar]
- Guo, X.; Li, H.; Yi, S.; Ren, J.; Wang, X. Learning Monocular Depth by Distilling Cross-Domain Stereo Networks. In Computer Vision—ECCV 2018; Springer: Berlin/Heidelberg, Germany, 2018; Volume 11215 LNCS, pp. 506–523. [Google Scholar]
- Kuznietsov, Y.; Stückler, J.; Leibe, B. Semi-Supervised Deep Learning for Monocular Depth Map Prediction. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 2215–2223. [Google Scholar]
- Yang, N.; Wang, R.; Stückler, J.; Cremers, D. Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry. In Computer Vision—ECCV 2018; Springer: Berlin/Heidelberg, Germany, 2018; Volume 11212 LNCS, pp. 835–852. [Google Scholar]
- Fu, H.; Gong, M.; Wang, C.; Batmanghelich, K.; Tao, D. Deep Ordinal Regression Network for Monocular Depth Estimation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2002–2011. [Google Scholar]
- Mahjourian, R.; Wicke, M.; Angelova, A. Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5667–5675. [Google Scholar]
- Yin, Z.; Shi, J. GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1983–1992. [Google Scholar]
- Ranjan, A.; Jampani, V.; Balles, L.; Kim, K.; Sun, D.; Wulff, J.; Black, M.J. Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 12232–12241. [Google Scholar]
- Luo, C.; Yang, Z.; Wang, P.; Wang, Y.; Xu, W.; Nevatia, R.; Yuille, A. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2624–2641. [Google Scholar] [CrossRef] [PubMed]
- Casser, V.; Pirk, S.; Mahjourian, R.; Angelova, A. Depth Prediction without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 8001–8008. [Google Scholar]
- Garg, R.; Vijay Kumar, B.G.; Carneiro, G.; Reid, I. Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue. In Computer Vision—ECCV 2016; Springer: Berlin/Heidelberg, Germany, 2016; Volume 9912 LNCS, pp. 740–756. [Google Scholar]
- Mehta, I.; Sakurikar, P.; Narayanan, P.J. Structured Adversarial Training for Unsupervised Monocular Depth Estimation. In Proceedings of the 2018 International Conference on 3D Vision, 3DV 2018, Verona, Italy, 5–8 September 2018; pp. 314–323. [Google Scholar]
- Poggi, M.; Tosi, F.; Mattoccia, S. Learning Monocular Depth Estimation with Unsupervised Trinocular Assumptions. In Proceedings of the 2018 International Conference on 3D Vision, 3DV 2018, Verona, Italy, 5–8 September 2018; pp. 324–333. [Google Scholar]
- Pillai, S.; Ambruş, R.; Gaidon, A. Superdepth: Self-Supervised, Super-Resolved Monocular Depth Estimation. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 9250–9256. [Google Scholar]
- Peña, D.; Sutherland, A. Disparity Estimation by Simultaneous Edge Drawing. In Computer Vision—ACCV 2016 Workshops; Springer: Berlin/Heidelberg, Germany, 2017; Volume 10117 LNCS, pp. 124–135. [Google Scholar]
- Teed, Z.; Deng, J. RAFT-3D: Scene Flow Using Rigid-Motion Embeddings. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8371–8380. [Google Scholar]
- Brickwedde, F.; Abraham, S.; Mester, R. Mono-SF: Multi-View Geometry Meets Single-View Depth for Monocular Scene Flow Estimation of Dynamic Traffic Scenes. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 2780–2790. [Google Scholar]
- Tosi, F.; Aleotti, F.; Poggi, M.; Mattoccia, S. Learning Monocular Depth Estimation Infusing Traditional Stereo Knowledge. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9791–9801. [Google Scholar]
- Seki, A.; Pollefeys, M. Patch Based Confidence Prediction for Dense Disparity Map. In Proceedings of the British Machine Vision Conference 2016, BMVC 2016, York, UK, 19–22 September 2016; pp. 23.1–23.13. [Google Scholar]
Layer (Type: Depth-Idx) | Output Shape | Param
---|---|---
Conv2d: 1–1 | [1, 64, 96, 320] | 18,816
BatchNorm2d: 1–2 | [1, 64, 96, 320] | 128 |
ReLU: 1–3 | [1, 64, 96, 320] | -- |
MaxPool2d: 1–4 | [1, 64, 48, 160] | -- |
Sequential: 1–5 | [1, 64, 48, 160] | -- |
BasicBlock: 2–1 | [1, 64, 48, 160] | 73,984 |
BasicBlock: 2–2 | [1, 64, 48, 160] | 73,984 |
Sequential: 1–6 | [1, 128, 24, 80] | -- |
BasicBlock: 2–3 | [1, 128, 24, 80] | 230,144 |
BasicBlock: 2–4 | [1, 128, 24, 80] | 295,424 |
Sequential: 1–7 | [1, 256, 12, 40] | -- |
BasicBlock: 2–5 | [1, 256, 12, 40] | 919,040 |
BasicBlock: 2–6 | [1, 256, 12, 40] | 1,180,672 |
Sequential: 1–8 | [1, 512, 6, 20] | -- |
BasicBlock: 2–7 | [1, 512, 6, 20] | 3,673,088 |
BasicBlock: 2–8 | [1, 512, 6, 20] | 4,720,640 |
Total params: 11,185,920. Trainable params: 11,185,920. Non-trainable params: 0.
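The summary above corresponds to a standard ResNet-18 feature extractor whose first convolution accepts 6 input channels (7 × 7 × 6 × 64 = 18,816 weights), i.e., the concatenated stereo pair. Below is a hedged sketch of how such a multi-input encoder could be built and summarized; the `torchinfo` call and the weight-tiling initialization are assumptions for illustration, not the authors' training code.

```python
import torch
import torch.nn as nn
import torchvision.models as models
from torchinfo import summary

def make_stereo_resnet18_encoder(pretrained=False):
    resnet = models.resnet18(weights="IMAGENET1K_V1" if pretrained else None)
    # Replace the 3-channel stem with a 6-channel one for the stereo pair.
    old_conv = resnet.conv1
    resnet.conv1 = nn.Conv2d(6, 64, kernel_size=7, stride=2, padding=3, bias=False)
    if pretrained:
        # A common initialization (assumption): tile the RGB weights across
        # both views and halve them to keep activation magnitudes similar.
        with torch.no_grad():
            resnet.conv1.weight.copy_(torch.cat([old_conv.weight] * 2, dim=1) / 2.0)
    # Keep only the feature-extraction layers (drop avgpool and fc).
    return nn.Sequential(
        resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool,
        resnet.layer1, resnet.layer2, resnet.layer3, resnet.layer4,
    )

encoder = make_stereo_resnet18_encoder()
# A 1 x 6 x 192 x 640 input reproduces the output shapes and the
# 11,185,920-parameter total listed in the table above.
summary(encoder, input_size=(1, 6, 192, 640))
```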
Method | Train Input | Abs Rel | Sq Rel | RMSE | RMSE log | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
---|---|---|---|---|---|---|---|---|
AdaDepth 2018 [10] | D* | 0.167 | 1.257 | 5.578 | 0.237 | 0.771 | 0.922 | 0.971 |
Kuznietsov 2017 [33] | DS | 0.113 | 0.741 | 4.621 | 0.189 | 0.862 | 0.96 | 0.986 |
DVSO 2018 [34] | D*S | 0.097 | 0.734 | 4.442 | 0.187 | 0.888 | 0.958 | 0.98 |
SVSM FT 2018 [15] | DS | 0.094 | 0.626 | 4.252 | 0.177 | 0.891 | 0.965 | 0.984 |
Guo 2018 [32] | DS | 0.096 | 0.641 | 4.095 | 0.168 | 0.892 | 0.967 | 0.986 |
DORN 2018 [35] | D | 0.072 | 0.307 | 2.727 | 0.12 | 0.932 | 0.984 | 0.994 |
Zhou 2017 [28] | M | 0.183 | 1.595 | 6.709 | 0.27 | 0.734 | 0.902 | 0.959 |
Yang 2018 [7] | M | 0.182 | 1.481 | 6.501 | 0.267 | 0.725 | 0.906 | 0.963 |
Mahjourian 2018 [36] | M | 0.163 | 1.24 | 6.22 | 0.25 | 0.762 | 0.916 | 0.968 |
GeoNet 2018 [37] | M | 0.149 | 1.06 | 5.567 | 0.226 | 0.796 | 0.935 | 0.975 |
DDVO 2018 [29] | M | 0.151 | 1.257 | 5.583 | 0.228 | 0.81 | 0.936 | 0.974 |
Ranjan 2019 [38] | M | 0.148 | 1.149 | 5.464 | 0.226 | 0.815 | 0.935 | 0.973 |
EPC++ 2020 [39] | M | 0.141 | 1.029 | 5.35 | 0.216 | 0.816 | 0.941 | 0.976 |
Struct2depth 2019 [40] | M | 0.141 | 1.026 | 5.291 | 0.215 | 0.816 | 0.945 | 0.979 |
Monodepth2 w/o pretraining 2019 [3] | M | 0.132 | 1.044 | 5.142 | 0.21 | 0.845 | 0.948 | 0.977 |
Monodepth2 (640 × 192), 2019 [3] | M | 0.115 | 0.903 | 4.863 | 0.193 | 0.877 | 0.959 | 0.981 |
Monodepth2 (1024 × 320), 2019 [3] | M | 0.115 | 0.882 | 4.701 | 0.19 | 0.879 | 0.961 | 0.982 |
BTS ResNet50, 2019 [11] | M | 0.061 | 0.261 | 2.834 | 0.099 | 0.954 | 0.992 | 0.998 |
Garg 2016 [41] | S | 0.152 | 1.226 | 5.849 | 0.246 | 0.784 | 0.921 | 0.967 |
Monodepth R50 2017 [6] | S | 0.133 | 1.142 | 5.533 | 0.23 | 0.83 | 0.936 | 0.97 |
StrAT 2018 [42] | S | 0.128 | 1.019 | 5.403 | 0.227 | 0.827 | 0.935 | 0.971 |
3Net (VGG), 2018 [43] | S | 0.119 | 1.201 | 5.888 | 0.208 | 0.844 | 0.941 | 0.978 |
SuperDepth & PP, 2019 [44] (1024 × 382) | S | 0.112 | 0.875 | 4.958 | 0.207 | 0.852 | 0.947 | 0.977 |
Monodepth2 w/o pretraining 2019 [3] | S | 0.13 | 1.144 | 5.485 | 0.232 | 0.831 | 0.932 | 0.968 |
Monodepth2 (640 × 192), 2019 [3] | S | 0.109 | 0.873 | 4.96 | 0.209 | 0.864 | 0.948 | 0.975 |
Monodepth2 (1024 × 320) 2019 [3] | S | 0.107 | 0.849 | 4.764 | 0.201 | 0.874 | 0.953 | 0.977 |
Monodepth2 w/o pretraining, 2019 [3] | MS | 0.127 | 1.031 | 5.266 | 0.221 | 0.836 | 0.943 | 0.974 |
Monodepth2 (640 × 192), 2019 [3] | MS | 0.106 | 0.818 | 4.75 | 0.196 | 0.874 | 0.957 | 0.979 |
Monodepth2 (1024 × 320), 2019 [3] | MS | 0.106 | 0.806 | 4.63 | 0.193 | 0.876 | 0.958 | 0.98 |
Ours w/o pretraining (640 × 192) | S | 0.083 | 0.768 | 4.467 | 0.185 | 0.911 | 0.959 | 0.977 |
Ours (640 × 192) | S | 0.080 | 0.747 | 4.346 | 0.181 | 0.918 | 0.961 | 0.978 |
Ours (1024 × 320) | S | 0.077 | 0.723 | 4.233 | 0.179 | 0.922 | 0.961 | 0.978 |
Ours (1024 × 320) + PP | S | 0.075 | 0.700 | 4.196 | 0.176 | 0.924 | 0.963 | 0.979 |
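For reference, the error and accuracy columns in the table above are the standard KITTI depth-evaluation measures. The sketch below shows how they are commonly computed from matched ground-truth/prediction pairs; the validity masking and depth-capping conventions of the Eigen split are assumed and may differ in detail from the authors' evaluation script.

```python
import numpy as np

def depth_metrics(gt, pred):
    """gt, pred: 1-D arrays of valid ground-truth and predicted depths (metres)."""
    ratio = np.maximum(gt / pred, pred / gt)
    a1 = (ratio < 1.25).mean()          # δ < 1.25
    a2 = (ratio < 1.25 ** 2).mean()     # δ < 1.25²
    a3 = (ratio < 1.25 ** 3).mean()     # δ < 1.25³
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return {"abs_rel": abs_rel, "sq_rel": sq_rel, "rmse": rmse,
            "rmse_log": rmse_log, "a1": a1, "a2": a2, "a3": a3}
```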
Method | D1-bg (%, NOC) | D1-fg (%, NOC) | D1-All (%, NOC) | Runtime (s)
---|---|---|---|---
SED [45] | 24.67 | 39.95 | 27.19 | 0.68 |
Raft-3D [46] | 1.34 | 3.11 | 1.63 | 2 |
Mono-SF [47] | 13.72 | 26.36 | 15.81 | 41 |
LEAStereo [22] | 1.29 | 2.65 | 1.51 | 0.3 |
ACVNet [25] | 1.26 | 2.84 | 1.52 | 0.2 |
CFNet [20] | 1.43 | 3.25 | 1.73 | 0.18 |
monoResMatch [48] | 21.65 | 19.08 | 21.23 | 0.16 |
PBCP [49] | 2.27 | 7.71 | 3.17 | 68 |
PSMNet [2] | 1.38 | 3.45 | 1.72 | 0.41 |
Ours (1024 × 320) | 7.00 | 12.53 | 7.84 | 0.03 |
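The D1 columns above follow the KITTI 2015 stereo benchmark convention: a pixel is counted as an outlier when its disparity error exceeds 3 px and 5% of the ground-truth disparity, averaged over background, foreground, or all non-occluded (NOC) pixels. A minimal sketch of that computation is shown below; the mask handling is illustrative rather than the benchmark's exact evaluation code.

```python
import numpy as np

def d1_outlier_rate(gt_disp, pred_disp, valid_mask):
    """Percentage of valid pixels whose disparity error is > 3 px and > 5% of GT."""
    err = np.abs(gt_disp - pred_disp)
    outlier = (err > 3.0) & (err > 0.05 * gt_disp)
    return 100.0 * outlier[valid_mask].mean()
```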
Method | Image Resolution | FPS (Avg.)
---|---|---
PSMNet [2,30] | 720 × 480 | 2.857
PSMNet [2,30] | 1080 × 720 | 1.298
PSMNet [2,30] | 640 × 256 | 6.25
PSMNet [2,30] | 1024 × 320 | 3.125
Stereo Depth Estimation, Monodepth2 (640 × 192) [3] | 720 × 480 | 40.6
Stereo Depth Estimation, Monodepth2 (640 × 192) [3] | 1080 × 720 | 21.7
Stereo Depth Estimation, Monodepth2 (640 × 192) [3] | 640 × 192 | 60.9 **
Stereo Depth Estimation, Monodepth2 (640 × 192) [3] | 1024 × 320 | 42.9
Ours (640 × 192) | 720 × 480 | 32.1
Ours (640 × 192) | 1080 × 720 | 19.1
Ours (640 × 192) | 640 × 192 | 57.2 **
Ours (640 × 192) | 1024 × 320 | 34.2
Ours (1024 × 320) | 720 × 480 | 23.1
Ours (1024 × 320) | 1080 × 720 | 15.64
Ours (1024 × 320) | 640 × 192 | 26.6
Ours (1024 × 320) | 1024 × 320 | 31.9 **
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).