A Deep Joint Network for Monocular Depth Estimation Based on Pseudo-Depth Supervision
Abstract
1. Introduction
2. Related Work
2.1. Traditional Methods
2.2. Deep Learning–Based Methods
3. The Proposed Method
3.1. Framework Overview
3.2. Pseudo-Depth Net
3.3. Depth Net
3.4. Loss Function
Algorithm 1: The procedure for edge-guided sampling
4. Experiments
4.1. Experimental Setup
4.2. Evaluation Metrics
4.3. Evaluation Results
4.4. Ablation Study
4.5. Running Time
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Siddiqui, Y.; Porzi, L.; Bulò, S.; Muller, N.; Nießner, M.; Dai, A.; Kontschieder, P. Panoptic lifting for 3D scene understanding with neural fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 9043–9052.
- Ali, S.; Pandey, A. ArthroNet: A monocular depth estimation technique with 3D segmented maps for knee arthroscopy. Intell. Med. 2023, 3, 129–138.
- Yang, B.; Xu, X.; Ren, J.; Cheng, L.; Guo, L.; Zhang, Z. SAM-Net: Semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications. Pattern Recognit. Lett. 2022, 153, 126–135.
- Zhou, C.; Yan, Q.; Shi, Y.; Sun, L. DoubleStar: Long-Range Attack Towards Depth Estimation based Obstacle Avoidance in Autonomous Systems. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 1885–1902.
- Tesla Uses Per-Pixel Depth Estimation with Self-Supervised Learning. Available online: https://youtu.be/hx7BXih7zx8?t=1334 (accessed on 21 April 2020).
- Tesla AI Day. Available online: https://youtu.be/j0z4FweCy4M?t=5295 (accessed on 20 August 2021).
- Zheng, X.; Sun, H.; Lu, X.; Xie, W. Rotation-Invariant Attention Network for Hyperspectral Image Classification. IEEE Trans. Image Process. 2022, 31, 4251–4265.
- Zheng, X.; Gong, T.; Li, X.; Lu, X. Generalized Scene Classification from Small-Scale Datasets with Multi-Task Learning. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–11.
- Saxena, A.; Sun, M.; Ng, A.Y. Make3D: Learning 3D scene structure from a single still image. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 824–840.
- Eigen, D.; Puhrsch, C.; Fergus, R. Depth map prediction from a single image using a multiscale deep network. In Proceedings of the NeurIPS, Montreal, QC, Canada, 8–13 December 2014; pp. 2366–2374.
- Laina, I.; Rupprecht, C.; Belagiannis, V.; Tombari, F.; Navab, N. Deeper Depth Prediction with Fully Convolutional Residual Networks. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 239–248.
- Hu, J.; Fan, C.; Jiang, H.; Guo, X.; Gao, Y.; Lu, X.; Lam, T. Boosting lightweight depth estimation via knowledge distillation. In Proceedings of the International Conference on Knowledge Science, Engineering and Management, Guangzhou, China, 15–18 August 2023; pp. 27–39.
- Lopez-Rodriguez, A.; Mikolajczyk, K. DESC: Domain adaptation for depth estimation via semantic consistency. Int. J. Comput. Vis. 2023, 131, 752–771.
- Agarwal, A.; Arora, C. Attention attention everywhere: Monocular depth prediction with skip attention. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 5861–5870.
- Yin, Z.; Shi, J. GeoNet: Unsupervised learning of dense depth, optical flow and camera pose. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1983–1992.
- Bian, J.; Zhan, H.; Wang, N.; Chin, T.; Shen, C.; Reid, I. Auto-Rectify Network for Unsupervised Indoor Depth Estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 9802–9813.
- Sun, L.; Bian, J.; Zhan, H.; Yin, W.; Reid, I.; Shen, C. SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes. arXiv 2022, arXiv:2211.03660.
- Masoumian, A.; Rashwan, H.; Abdulwahab, S.; Cristiano, J.; Puig, D. GCNDepth: Self-supervised monocular depth estimation based on graph convolutional network. Neurocomputing 2023, 517, 81–92.
- Hoyer, L.; Dai, D.; Wang, Q.; Chen, Y.; Gool, L. Improving semi-supervised and domain-adaptive semantic segmentation with self-supervised depth estimation. Int. J. Comput. Vis. 2023, 131, 2070–2096.
- Fu, H.; Gong, M.; Wang, C.; Batmanghelich, K.; Tao, D. Deep Ordinal Regression Network for Monocular Depth Estimation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2002–2011.
- Lee, J.H.; Han, M.-K.; Ko, D.W.; Suh, I.H. From big to small: Multi-scale local planar guidance for monocular depth estimation. arXiv 2019, arXiv:1907.10326.
- Bhat, S.F.; Alhashim, I.; Wonka, P. AdaBins: Depth Estimation Using Adaptive Bins. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 4008–4017.
- Ranftl, R.; Bochkovskiy, A.; Koltun, V. Vision Transformers for Dense Prediction. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 12159–12168.
- Saxena, A.; Chung, S.H.; Ng, A.Y. Learning Depth from Single Monocular Images. In Proceedings of the NeurIPS, Vancouver, BC, Canada, 5–8 December 2005; pp. 1161–1168.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.H.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. In Proceedings of the ICLR, Vienna, Austria, 4–9 May 2021; pp. 1–22.
- Yang, G.; Tang, H.; Ding, M.; Sebe, N.; Ricci, E. Transformers solve the limited receptive field for monocular depth prediction. arXiv 2021, arXiv:2103.12091.
- Liu, B.; Gould, S.; Koller, D. Single image depth estimation from predicted semantic labels. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 1253–1260.
- Karsch, K.; Liu, C.; Kang, S.B. Depth transfer: Depth extraction from video using non-parametric sampling. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 2144–2158.
- Liu, M.; Salzmann, M.; He, X. Discrete-continuous depth estimation from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014; pp. 716–723.
- Spencer, J.; Qian, C.; Russell, C.; Hadfield, S.; Graf, E.; Adams, W.; Schofield, A.; Elder, J.; Bowden, R.; Cong, H.; et al. The monocular depth estimation challenge. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–7 January 2023; pp. 623–632.
- Eigen, D.; Fergus, R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In Proceedings of the ICCV, Santiago, Chile, 11–18 December 2015; pp. 2650–2658.
- Liu, F.; Shen, C.; Lin, G.; Reid, I. Learning depth from single monocular images using deep convolutional neural fields. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 2024–2039.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the CVPR, Las Vegas, NV, USA, 26–30 June 2016; pp. 770–778.
- Cao, Y.; Wu, Z.; Shen, C. Estimating depth from monocular images as classification using deep fully convolutional residual networks. IEEE Trans. Circuits Syst. Video Technol. 2017, 28, 3174–3182.
- Li, B.; Dai, Y.; He, M. Monocular depth estimation with hierarchical fusion of dilated CNNs and soft-weighted-sum inference. Pattern Recognit. 2018, 83, 328–339.
- Hu, J.; Ozay, M.; Zhang, Y.; Okatani, T. Revisiting single image depth estimation: Toward higher resolution maps with accurate object boundaries. In Proceedings of the WACV, Waikoloa Village, HI, USA, 7–11 January 2019; pp. 1043–1051.
- Chen, X.; Chen, X.; Zha, Z.-J. Structure-aware residual pyramid network for monocular depth estimation. arXiv 2019, arXiv:1907.06023.
- Ye, X.; Chen, S.; Xu, R. DPNet: Detail-preserving network for high quality monocular depth estimation. Pattern Recognit. 2021, 109, 107578.
- Godard, C.; Aodha, O.M.; Brostow, G.J. Unsupervised monocular depth estimation with left-right consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 270–279.
- Bian, J.; Zhan, H.; Wang, N.; Li, Z.; Zhang, L.; Shen, C.; Cheng, M.; Reid, I. Unsupervised Scale-consistent Depth Learning from Video. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 8–14 December 2019; pp. 1–16.
- Bian, J.; Li, Z.; Wang, N.; Zhan, H.; Shen, C.; Cheng, M.M.; Reid, I. Unsupervised scale-consistent depth and ego-motion learning from monocular video. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32, pp. 35–45.
- Klingner, M.; Termöhlen, J.-A.; Mikolajczyk, J.; Fingscheidt, T. Self-supervised monocular depth estimation: Solving the dynamic object problem by semantic guidance. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 582–600.
- Heise, P.; Klose, S.; Jensen, B.; Knoll, A. PM-Huber: PatchMatch with Huber Regularization for Stereo Matching. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 1–8 December 2013; pp. 2360–2367.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Xian, K.; Zhang, J.; Wang, O.; Mai, L.; Lin, Z.; Cao, Z. Structure-guided ranking loss for single image depth prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 611–620.
- Silberman, N.; Hoiem, D.; Kohli, P.; Fergus, R. Indoor segmentation and support inference from RGBD images. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, 7–13 October 2012; pp. 746–760.
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets robotics: The KITTI dataset. Int. J. Robot. Res. (IJRR) 2013, 32, 1231–1237.
- Guizilini, V.; Ambrus, R.; Pillai, S.; Raventos, A.; Gaidon, A. 3D packing for self-supervised monocular depth estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 2485–2494.
- Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Li, F.F. ImageNet: A Large-Scale Hierarchical Image Database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami Beach, FL, USA, 20–26 June 2009; pp. 248–255.
- Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. arXiv 2017, arXiv:1711.05101.
Layer | Input | Output | Details |
---|---|---|---|
Input_image | - | In | Input size: 304 × 228 × 3 |
Convolution_1 | In | Conv_1 | Kernel number: 64, Kernel size: 7 × 7, stride: 2 |
Batch Norm_1 | Conv_1 | BN_1 | |
ReLU_1 | BN_1 | ReLU_1 | |
Projection_1 | ReLU_1 | Pro_1 | Kernel number: 256 |
Skip_1 | Pro_1 | S_1 | Kernel number: 256 |
Skip_2 | S_1 | S_2 | Kernel number: 256 |
Projection_2 | S_2 | Pro_2 | Kernel number: 512 |
Skip_3 | Pro_2 | S_3 | Kernel number: 512 |
Skip_4 | S_3 | S_4 | Kernel number: 512 |
Skip_5 | S_4 | S_5 | Kernel number: 512 |
Projection_3 | S_5 | Pro_3 | Kernel number: 1024 |
Skip_6 | Pro_3 | S_6 | Kernel number: 1024 |
Skip_7 | S_6 | S_7 | Kernel number: 1024 |
Skip_8 | S_7 | S_8 | Kernel number: 1024 |
Skip_9 | S_8 | S_9 | Kernel number: 1024 |
Skip_10 | S_9 | S_10 | Kernel number: 1024 |
Projection_4 | S_10 | Pro_4 | Kernel number: 2048 |
Skip_11 | Pro_4 | S_11 | Kernel number: 2048 |
Skip_12 | S_11 | S_12 | Kernel number: 2048 |
Convolution_2 | S_12 | Conv_2 | Kernel number: 1024, Kernel size: 7 × 7, stride: 1 |
Batch Norm_2 | Conv_2 | BN_2 | |
Up-projection_1 | BN_2 | U_1 | Kernel number: 512 |
Up-projection_2 | U_1 | U_2 | Kernel number: 256 |
Up-projection_3 | U_2 | U_3 | Kernel number: 128 |
Up-projection_4 | U_3 | U_4 | Kernel number: 64 |
Convolution_3 | U_4 | Conv_3 | Kernel number: 1, Kernel size: 3 × 3, stride: 1 |
ReLU_3 | Conv_3 | Out | 
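The table above composes into a ResNet-50-style bottleneck encoder (Projection_1–4 with their Skip blocks give the familiar 3, 4, 6, 3 stage pattern at 256/512/1024/2048 channels), followed by a four-stage upsampling decoder and a single-channel depth head. The following PyTorch code is a minimal sketch of that composition, not the authors' implementation; it assumes a torchvision ResNet-50 backbone (whose stem also contains a max-pooling layer not listed in the table) and uses a plain upsample-plus-convolution stage as a stand-in for the up-projection block detailed in a later table.

```python
import torch
import torch.nn as nn
from torchvision import models


def up_stage(in_ch, out_ch):
    # Placeholder for one decoder stage; the up-projection block
    # (detailed in a later table) would replace this plain upsample + conv.
    return nn.Sequential(nn.Upsample(scale_factor=2, mode="nearest"),
                         nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.ReLU(inplace=True))


class EncoderDecoderSketch(nn.Module):
    def __init__(self):
        super().__init__()
        r = models.resnet50(weights=None)
        # Stem plus the four bottleneck stages (3, 4, 6, 3 blocks giving
        # 256/512/1024/2048 channels), matching Projection_1..4 and Skip_1..12.
        self.encoder = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool,
                                     r.layer1, r.layer2, r.layer3, r.layer4)
        self.neck = nn.Sequential(nn.Conv2d(2048, 1024, 7, padding=3),  # Convolution_2
                                  nn.BatchNorm2d(1024))                 # Batch Norm_2
        self.decoder = nn.Sequential(up_stage(1024, 512), up_stage(512, 256),
                                     up_stage(256, 128), up_stage(128, 64))
        self.head = nn.Sequential(nn.Conv2d(64, 1, 3, padding=1),       # Convolution_3
                                  nn.ReLU(inplace=True))                # ReLU_3

    def forward(self, x):  # x: B x 3 x 228 x 304
        return self.head(self.decoder(self.neck(self.encoder(x))))


if __name__ == "__main__":
    print(EncoderDecoderSketch()(torch.randn(1, 3, 228, 304)).shape)
```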
Layer | Input | Output | Details |
---|---|---|---|
Input_feature | - | In | Input size: M × N × C |
Convolution_1 | In | Conv_1 | Kernel size: 1 × 1, stride: 1 |
Batch Norm_1 | Conv_1 | BN_1 | |
ReLU_1 | BN_1 | ReLU_1 | |
Convolution_2 | ReLU_1 | Conv_2 | Kernel size: 3 × 3, stride: 1 |
Batch Norm_2 | Conv_2 | BN_2 | |
ReLU_2 | BN_2 | ReLU_2 | |
Convolution_3 | ReLU_2 | Conv_3 | Kernel size: 1 × 1, stride: 1 |
Batch Norm_3 | Conv_3 | BN_3 | |
Skip Connection | In, BN_3 | SC | |
ReLU_3 | SC | Out |
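This block is a standard bottleneck residual unit whose skip path is the identity, so the input and output channel counts must match. A minimal PyTorch sketch; the intermediate width `mid_channels` is an assumption, since the table does not specify it.

```python
import torch
import torch.nn as nn


class IdentitySkipBlock(nn.Module):
    """Bottleneck unit: the input is added to BN_3 before the final ReLU (ReLU_3)."""

    def __init__(self, channels, mid_channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid_channels, 1), nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, 3, padding=1), nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, channels, 1), nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))  # Skip Connection followed by ReLU_3


block = IdentitySkipBlock(channels=256, mid_channels=64)
print(block(torch.randn(1, 256, 57, 76)).shape)  # unchanged: torch.Size([1, 256, 57, 76])
```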
Layer | Input | Output | Details |
---|---|---|---|
Input_feature | - | In | Input size: M × N × C |
Convolution_1 | In | Conv_1 | Kernel size: 1 × 1, stride: 1 |
Batch Norm_1 | Conv_1 | BN_1 | |
ReLU_1 | BN_1 | ReLU_1 | |
Convolution_2 | ReLU_1 | Conv_2 | Kernel size: 3 × 3, stride: 1 |
Batch Norm_2 | Conv_2 | BN_2 | |
ReLU_2 | BN_2 | ReLU_2 | |
Convolution_3 | ReLU_2 | Conv_3 | Kernel size: 1 × 1, stride: 1 |
Batch Norm_3 | Conv_3 | BN_3 | 
Convolution_4 | In | Conv_4 | Kernel size: 1 × 1, stride: 1 |
Batch Norm_4 | Conv_4 | BN_4 | 
Skip Connection | BN_3, BN_4 | SC | |
ReLU_3 | SC | Out |
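This variant routes the skip path through a 1 × 1 convolution and batch normalization (Convolution_4/BN_4), which lets the block change the number of channels. A minimal sketch under the same assumptions as the previous block:

```python
import torch
import torch.nn as nn


class ProjectionSkipBlock(nn.Module):
    """Bottleneck unit whose skip path is Convolution_4 + BN_4, so the channel count may change."""

    def __init__(self, in_channels, mid_channels, out_channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, 1), nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, 3, padding=1), nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, out_channels, 1), nn.BatchNorm2d(out_channels),
        )
        self.proj = nn.Sequential(nn.Conv2d(in_channels, out_channels, 1),
                                  nn.BatchNorm2d(out_channels))

    def forward(self, x):
        return torch.relu(self.body(x) + self.proj(x))  # Skip Connection followed by ReLU_3


block = ProjectionSkipBlock(in_channels=512, mid_channels=256, out_channels=1024)
print(block(torch.randn(1, 512, 16, 20)).shape)  # torch.Size([1, 1024, 16, 20])
```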
Layer | Input | Output | Details |
---|---|---|---|
Input_feature | - | In | Input size: M × N × C |
Up-pooling | In | Up | 2 × 2 upsampling |
Convolution_1 | Up | Conv_1 | Kernel size: 5 × 5, stride: 1 |
ReLU_1 | Conv_1 | ReLU_1 | |
Convolution_2 | ReLU_1 | Conv_2 | Kernel size: 3 × 3, stride: 1 |
Convolution_3 | Up | Conv_3 | Kernel size: 5 × 5, stride: 1 |
Skip Connection | Conv_2, Conv_3 | SC | |
ReLU_2 | SC | Out |
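The up-projection block doubles the spatial resolution and sums two convolutional branches before the final ReLU. A minimal PyTorch sketch, assuming nearest-neighbour up-pooling (the table states only 2 × 2 upsampling):

```python
import torch
import torch.nn as nn


class UpProjection(nn.Module):
    """2x2 up-pooling, then a 5x5 -> ReLU -> 3x3 branch plus a parallel 5x5 branch, summed and passed through ReLU."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")  # assumption: the table only says "2 x 2 upsampling"
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 5, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
        )
        self.branch2 = nn.Conv2d(in_channels, out_channels, 5, padding=2)

    def forward(self, x):
        x = self.up(x)
        return torch.relu(self.branch1(x) + self.branch2(x))  # Skip Connection followed by ReLU_2


up = UpProjection(1024, 512)
print(up(torch.randn(1, 1024, 8, 10)).shape)  # torch.Size([1, 512, 16, 20])
```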
Layer | Input | Output | Details |
---|---|---|---|
conv1 | - | conv1 | Kernel size: 7 × 7, stride: 2 |
conv2 | conv1 | conv2 | Kernel size: 5 × 5, stride: 2 |
conv3a | conv2 | conv3a | Kernel size: 5 × 5, stride: 2 |
conv3b | conv3a | conv3b | Kernel size: 3 × 3, stride: 1 |
conv4a | conv3b | conv4a | Kernel size: 3 × 3, stride: 2 |
conv4b | conv4a | conv4b | Kernel size: 3 × 3, stride: 1 |
conv5a | conv4b | conv5a | Kernel size: 3 × 3, stride: 2 |
conv5b | conv5a | conv5b | Kernel size: 3 × 3, stride: 1 |
conv6a | conv5b | conv6a | Kernel size: 3 × 3, stride: 2 |
conv6b | conv6a | conv6b | Kernel size: 3 × 3, stride: 1 |
pr6+loss6 | conv6b | conv6b | Kernel size: 3 × 3, stride: 1 |
upconv5 | conv6b | upconv5+pr6+conv5b | Kernel size: 4 × 4, stride: 2 |
iconv5 | upconv5+pr6+conv5b | iconv5 | Kernel size: 3 × 3, stride: 1 |
pr5+loss5 | iconv5 | iconv5 | Kernel size: 3 × 3, stride: 1 |
upconv4 | iconv5 | upconv4+pr5+conv4b | Kernel size: 4 × 4, stride: 2 |
iconv4 | upconv4+pr5+conv4b | iconv4 | Kernel size: 3 × 3, stride: 1 |
pr4+loss4 | iconv4 | iconv4 | Kernel size: 3 × 3, stride: 1 |
upconv3 | iconv4 | upconv3+pr4+conv3b | Kernel size: 4 × 4, stride: 2 |
iconv3 | upconv3+pr4+conv3b | iconv3 | Kernel size: 3 × 3, stride: 1 |
pr3+loss3 | iconv3 | iconv3 | Kernel size: 3 × 3, stride: 1 |
upconv2 | iconv3 | upconv2+pr3+conv2 | Kernel size: 4 × 4, stride: 2 |
iconv2 | upconv2+pr3+conv2 | iconv2 | Kernel size: 3 × 3, stride: 1 |
pr2+loss2 | iconv2 | iconv2 | Kernel size: 3 × 3, stride: 1 |
upconv1 | iconv2 | upconv1+pr2+conv1 | Kernel size: 4 × 4, stride: 2 |
iconv1 | upconv1+pr2+conv1 | iconv1 | Kernel size: 3 × 3, stride: 1 |
pr1+loss1 | iconv1 | output | Kernel size: 3 × 3, stride: 1 |
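The layout above follows a DispNet-style design: a strided convolutional encoder (conv1–conv6b) and a decoder in which each level upsamples the previous features with a 4 × 4 transposed convolution, combines them (the '+' entries) with the matching encoder feature and the lower-scale prediction, and emits a new prediction (pr) that is supervised at that scale. Below is a minimal sketch of one such decoder level, with the combination implemented as channel-wise concatenation as in DispNet; the channel counts are illustrative only, since the table lists kernel sizes and strides but not widths.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DecoderLevel(nn.Module):
    """One 'upconvN / iconvN / prN' level of the decoder; channel counts are illustrative."""

    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.upconv = nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1)  # upconvN
        self.iconv = nn.Conv2d(out_ch + skip_ch + 1, out_ch, 3, padding=1)       # iconvN
        self.pr = nn.Conv2d(out_ch, 1, 3, padding=1)                             # prN (loss applied at this scale)

    def forward(self, feat, skip, prev_pred):
        up = self.upconv(feat)
        # Assumption: the coarser prediction is bilinearly upsampled before concatenation.
        prev_up = F.interpolate(prev_pred, scale_factor=2, mode="bilinear", align_corners=False)
        x = torch.relu(self.iconv(torch.cat([up, skip, prev_up], dim=1)))
        return x, self.pr(x)


# One level that doubles resolution and refines the coarser prediction.
level = DecoderLevel(in_ch=512, skip_ch=256, out_ch=256)
feat = torch.randn(1, 512, 8, 10)   # iconv features from the level below
skip = torch.randn(1, 256, 16, 20)  # matching encoder feature (e.g., conv5b)
pred = torch.randn(1, 1, 8, 10)     # prediction from the level below
x, pr = level(feat, skip, pred)
print(x.shape, pr.shape)            # torch.Size([1, 256, 16, 20]) torch.Size([1, 1, 16, 20])
```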
Method | Params (M) | RMSE (linear) | RMSE (log) | ARD | SRD | Runtime on RTX 3090 (s) |
---|---|---|---|---|---|---|
FCRN | 63.6 | 0.573 | 0.195 | 0.152 | 0.121 | 83.52 |
SC-DepthV2 | 40.9 | 0.554 | 0.186 | 0.142 | 0.112 | 37.48 |
DORN | 110.3 | 0.509 | 0.172 | 0.115 | 0.082 | 925.22 |
SC-DepthV3 | 28.7 | 0.486 | 0.165 | 0.123 | 0.090 | 44.37 |
AdaBins | 78 | 0.364 | 0.122 | 0.103 | 0.070 | 1377.6 |
DPT | 123.1 | 0.357 | 0.121 | 0.110 | 0.077 | 81.85 |
Ours | 19.8 | 0.342 | 0.110 | 0.095 | 0.064 | 36.24 |
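In these comparisons, ARD and SRD denote the absolute and squared relative difference, the two RMSE columns are computed on linear and logarithmic depth, and the last column reports the measured runtime in seconds on an RTX 3090 GPU. A minimal NumPy sketch of these standard error metrics, assuming invalid ground-truth pixels are masked out first:

```python
import numpy as np


def depth_errors(pred, gt):
    """RMSE (linear), RMSE (log), ARD and SRD over valid ground-truth pixels."""
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    rmse_linear = np.sqrt(np.mean((pred - gt) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))
    ard = np.mean(np.abs(pred - gt) / gt)   # absolute relative difference
    srd = np.mean((pred - gt) ** 2 / gt)    # squared relative difference
    return rmse_linear, rmse_log, ard, srd


# Toy example with synthetic depths in metres.
rng = np.random.default_rng(0)
gt = rng.uniform(0.5, 10.0, size=(228, 304))
pred = gt * rng.uniform(0.9, 1.1, size=gt.shape)
print(depth_errors(pred, gt))
```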
Method | Params (M) | RMSE (linear) | RMSE (log) | ARD | SRD | Runtime on RTX 3090 (s) |
---|---|---|---|---|---|---|
SC-DepthV1 | 27.9 | 4.997 | 0.196 | 0.118 | 0.870 | 38.39 |
SC-DepthV3 | 28.7 | 4.699 | 0.188 | 0.119 | 0.756 | 38.51 |
DORN | 110.3 | 2.727 | 0.120 | 0.072 | 0.307 | 985.59 |
DPT | 123.1 | 2.573 | 0.092 | 0.062 | 0.221 | 96.91 |
AdaBins | 78 | 2.360 | 0.088 | 0.058 | 0.190 | 1504.35 |
Ours | 19.8 | 2.351 | 0.085 | 0.058 | 0.187 | 36.27 |
Method | Params (M) | RMSE (linear) | RMSE (log) | ARD | SRD | Runtime on RTX 3090 (s) |
---|---|---|---|---|---|---|
SC-DepthV1 | 27.9 | 16.118 | 0.279 | 0.168 | 3.825 | 88.04 |
SC-DepthV3 | 28.7 | 15.702 | 0.248 | 0.143 | 3.008 | 88.27 |
Ours | 19.8 | 13.427 | 0.218 | 0.118 | 2.861 | 82.18 |
Method | RMSE (linear) | RMSE (log) | ARD | SRD |
---|---|---|---|---|
Ours | 0.342 | 0.110 | 0.095 | 0.064 |
Ours w/o Pseudo-Depth Net | 0.378 | 0.129 | 0.194 | 0.095 |
Ours w/o Depth Net | 0.461 | 0.142 | 0.253 | 0.142 |