High-Precision Depth Map Estimation from Missing Viewpoints for 360-Degree Digital Holography
Abstract
1. Introduction
2. Previous Research
3. Proposed Method
3.1. Data Preparations
3.2. Proposed Model
4. Experiment Results and Discussion
4.1. Quantitative Results
4.2. Qualitative Results of the Image Quality
4.2.1. Comparison of the Proposed Model (HDD) with the Ground Truth Using Holographic 3D Images Reconstructed from CGH
4.2.2. Comparison of the Proposed Model (HDD) with Previous Models (Using Four Kinds of Objects Used in the Course of the Training Phase)
4.2.3. Comparison of the Proposed Model (HDD) with Previous Models (Using Two Kinds of Complicated Objects Not Used in the Course of the Training Phase)
4.2.4. Further Comparison of the Proposed Model (HDD) with Previous Models (Using Complicated Object)
5. Discussion and Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
CGH | Computer-generated holograms |
CNN | Convolutional neural network |
CDD | Conventional dense depth |
AR/VR | Augmented reality/virtual reality |
HDD | Holographic dense depth |
MSE | Mean squared error |
SSIM | Structural similarity |
PSNR | Peak signal-to-noise ratio |
ACC | Accuracy |
SLM | Spatial light modulator |
LCoS | Liquid crystal on silicon |
References
- Brown, B.R.; Lohmann, A.W. Complex spatial filtering with binary masks. Appl. Opt. 1966, 5, 967–969.
- Horisaki, R.; Takagi, R.; Tanida, J. Deep-learning-generated holography. Appl. Opt. 2018, 57, 3859–3863.
- Battiato, S.; Curti, S.; La Cascia, M.; Tortora, M.; Scordato, E. Depth map generation by image classification. In Proceedings of the Three-Dimensional Image Capture and Applications VI, International Society for Optics and Photonics, San Jose, CA, USA, 18 January 2004; Volume 5302, pp. 95–104.
- Eigen, D.; Puhrsch, C.; Fergus, R. Depth map prediction from a single image using a multi-scale deep network. arXiv 2014, arXiv:1406.2283.
- Koch, T.; Liebel, L.; Fraundorfer, F.; Korner, M. Evaluation of cnn-based single-image depth estimation methods. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8–14 September 2018.
- Li, B.; Shen, C.; Dai, Y.; Van Den Hengel, A.; He, M. Depth and surface normal estimation from monocular images using regression on deep features and hierarchical crfs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1119–1127.
- Liu, F.; Shen, C.; Lin, G.; Reid, I. Learning depth from single monocular images using deep convolutional neural fields. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 2024–2039.
- Wang, P.; Shen, X.; Lin, Z.; Cohen, S.; Price, B.; Yuille, A.L. Towards unified depth and semantic prediction from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2800–2809.
- Lore, K.G.; Reddy, K.; Giering, M.; Bernal, E.A. Generative adversarial networks for depth map estimation from RGB video. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 1258–12588.
- Aleotti, F.; Tosi, F.; Poggi, M.; Mattoccia, S. Generative adversarial networks for unsupervised monocular depth prediction. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8–14 September 2018.
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
- Alhashim, I.; Wonka, P. High quality monocular depth estimation via transfer learning. arXiv 2018, arXiv:1812.11941.
- Alagoz, B.B. Obtaining depth maps from color images by region based stereo matching algorithms. arXiv 2008, arXiv:0812.1340.
- Martins, D.; Van Hecke, K.; De Croon, G. Fusion of stereo and still monocular depth estimates in a self-supervised learning context. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 849–856.
- Wang, H.; Sang, X.; Chen, D.; Wang, P.; Ye, X.; Qi, S.; Yan, B. Self-supervised stereo depth estimation based on bi-directional pixel-movement learning. Appl. Opt. 2022, 61, D7–D14.
- Nievergelt, J.; Preparata, F.P. Plane-sweep algorithms for intersecting geometric figures. Commun. ACM 1982, 25, 739–747.
- Choi, S.; Kim, S.; Park, K.; Sohn, K. Learning descriptor, confidence, and depth estimation in multi-view stereo. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 276–282.
- Im, S.; Jeon, H.G.; Lin, S.; Kweon, I.S. Dpsnet: End-to-end deep plane sweep stereo. arXiv 2019, arXiv:1905.00538.
- Pei, Z.; Wen, D.; Zhang, Y.; Ma, M.; Guo, M.; Zhang, X.; Yang, Y.H. MDEAN: Multi-view disparity estimation with an asymmetric network. Electronics 2020, 9, 924.
- Wang, K.; Shen, S. MVDepthNet: Real-time multiview depth estimation neural network. In Proceedings of the 2018 International Conference on 3d Vision (3DV), Verona, Italy, 5–8 September 2018; pp. 248–257.
- Shi, L.; Li, B.; Kim, C.; Kellnhofer, P.; Matusik, W. Towards real-time photorealistic 3D holography with deep neural networks. Nature 2021, 591, 234–239.
- Nishitsuji, T.; Kakue, T.; Blinder, D.; Shimobaba, T.; Ito, T. An interactive holographic projection system that uses a hand-drawn interface with a consumer CPU. Sci. Rep. 2021, 11, 1–10.
- Park, B.J.; Hunt, S.J.; Nadolski, G.J.; Gade, T.P. Augmented reality improves procedural efficiency and reduces radiation dose for CT-guided lesion targeting: A phantom study using HoloLens 2. Sci. Rep. 2020, 10, 1–8.
- Miller, M.R.; Herrera, F.; Jun, H.; Landay, J.A.; Bailenson, J.N. Personal identifiability of user tracking data during observation of 360-degree VR video. Sci. Rep. 2020, 10, 1–10.
- Maya, Autodesk. 2018. Available online: https://www.autodesk.com/products/maya/overview (accessed on 17 September 2022).
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Eybposh, M.H.; Caira, N.W.; Atisa, M.; Chakravarthula, P.; Pégard, N.C. DeepCGH: 3D computer-generated holography using deep learning. Opt. Express 2020, 28, 26636–26650.
- Lee, W.H. Sampled Fourier transform hologram generated by computer. Appl. Opt. 1970, 9, 639–643.
- Yoon, M.S.; Oh, K.J.; Choo, H.G.; Kim, J. A spatial light modulating LC device applicable to amplitude-modulated holographic mobile devices. In Proceedings of the 2015 IEEE 13th International Conference on Industrial Informatics (INDIN), Cambridge, UK, 22–24 July 2015; pp. 677–681.
- Bhat, S.F.; Alhashim, I.; Wonka, P. Adabins: Depth estimation using adaptive bins. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 4009–4018.
Parameter | Value
---|---
Distance from virtual camera to 255 depth | 11 cm
Margin from depth boundary to object | 2 cm
Distance from virtual camera to 0 depth | 28.7 cm
Distance between two objects (center to center) | 8.3 cm
Distance from 0 depth to 255 depth | 14.2 cm
Radius of camera rotation path (R) | 20 cm
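For illustration, the sketch below shows one way the 360-degree viewpoints on the circular camera rotation path (R = 20 cm in the table above) could be sampled; the even azimuthal spacing, the fixed camera height, and the choice of one viewpoint per rendered image are assumptions made for this sketch, not details taken from the paper.

```python
# Minimal sketch (illustrative assumptions): sample virtual-camera positions on a
# circle of radius R = 20 cm around the scene, one position per rendered view.
import numpy as np


def camera_positions(num_views: int = 1024, radius_cm: float = 20.0,
                     height_cm: float = 0.0) -> np.ndarray:
    """Return (num_views, 3) camera positions; each camera is meant to look at the origin."""
    angles = np.linspace(0.0, 2.0 * np.pi, num_views, endpoint=False)
    x = radius_cm * np.cos(angles)
    z = radius_cm * np.sin(angles)
    y = np.full(num_views, height_cm)
    return np.stack([x, y, z], axis=1)


positions = camera_positions()
print(positions.shape)  # (1024, 3)
```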
Data Set 1 (Depth Map Estimation for Objects Used in Training)

Generated sample | Total | Shape | Count | Training set | Test set
---|---|---|---|---|---
RGB images | 4096 | Torus | 1024 | 614 | 410
| | Cube | 1024 | 614 | 410
| | Cone | 1024 | 614 | 410
| | Sphere | 1024 | 614 | 410
Depth map images | 4096 | Torus | 1024 | 614 | 410
| | Cube | 1024 | 614 | 410
| | Cone | 1024 | 614 | 410
| | Sphere | 1024 | 614 | 410

Data Set 2 (Depth Map Estimation for Objects Not Used in Training)

Generated sample | Total | Shape | Count | Training set | Test set
---|---|---|---|---|---
RGB images | 820 | Dodecahedron | 410 | - | 410
| | Icosahedron | 410 | - | 410
Depth map images | 820 | Dodecahedron | 410 | - | 410
| | Icosahedron | 410 | - | 410
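A short sketch of the split implied by the tables: each trained shape contributes 614 training and 410 test RGB/depth pairs (Data Set 1), while the two unseen shapes are test-only (Data Set 2). The directory layout and file naming below are hypothetical.

```python
# Minimal sketch of the train/test split; the "dataset/<shape>/rgb_*.png" layout is hypothetical.
from pathlib import Path

TRAINED_SHAPES = ["torus", "cube", "cone", "sphere"]   # Data Set 1
UNSEEN_SHAPES = ["dodecahedron", "icosahedron"]        # Data Set 2 (test only)


def split_shape(shape: str, root: str = "dataset", n_train: int = 614, n_test: int = 410):
    """Return (training files, test files) for one shape."""
    files = sorted(Path(root, shape).glob("rgb_*.png"))
    return files[:n_train], files[n_train:n_train + n_test]


# train, test = split_shape("torus")                        # 614 / 410 pairs
# _, unseen_test = split_shape("dodecahedron", n_train=0)   # 0 / 410 pairs
```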
Architecture of the Proposed Model

Stage | Layers | Output [ch, shape]
---|---|---
Input | Input (RGB images) | [ch = 3, shape = 640 × 360]
Convolution | 7 × 7 convolution, stride 2 | [ch = 96, shape = 320 × 180]
Encoder (pre-trained DenseNet-161) | Batch normalization, ReLU, 3 × 3 max pooling | [ch = 96, shape = 160 × 90]
| Dense block (6 dense layers), transition layer | [ch = 192, shape = 80 × 45]
| Dense block (12 dense layers), transition layer | [ch = 384, shape = 40 × 22]
| Dense block (36 dense layers), transition layer | [ch = 1056, shape = 20 × 11]
| Dense block (24 dense layers) | [ch = 2208, shape = 20 × 11]
| Batch normalization | [ch = 2208, shape = 20 × 11]
Bottleneck | 1 × 1 convolution | [ch = 1104, shape = 20 × 11]
Decoder (Dense Depth) | Up-sampling layer | [ch = 552, shape = 40 × 22]
| Up-sampling layer | [ch = 276, shape = 80 × 45]
| Up-sampling layer | [ch = 138, shape = 160 × 90]
| Up-sampling layer | [ch = 69, shape = 320 × 180]
Convolution | 3 × 3 convolution | [ch = 1, shape = 320 × 180]
Output | Bilinear interpolation | [ch = 1, shape = 640 × 360]
Dense layer | Transition layer | Up-sampling layer
---|---|---
Batch normalization | Batch normalization | Bilinear interpolation
ReLU | ReLU | Skip connection
1 × 1 convolution | 1 × 1 convolution | 3 × 3 convolution
Batch normalization | 2 × 2 max pooling | Batch normalization
ReLU | - | ReLU
3 × 3 convolution | - | 3 × 3 convolution
- | - | Batch normalization
- | - | ReLU
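Because the two tables above fix the tensor shapes at every stage, an equivalent encoder-decoder can be sketched directly in PyTorch. The torchvision feature names, the skip-connection points, and the up-sampling block composition below follow the tables and the usual DenseNet-161/Dense Depth conventions; they are assumptions for illustration, not the authors' released implementation.

```python
# Minimal PyTorch sketch of the encoder-decoder in the tables above (assumed layout,
# not the authors' released code); requires torchvision >= 0.13 for the weights API.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class UpBlock(nn.Module):
    """Up-sampling layer: bilinear interpolation, skip connection, then two 3x3 conv/BN/ReLU stages."""

    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch + skip_ch, out_ch, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(out_ch)

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[2:], mode="bilinear", align_corners=True)
        x = torch.cat([x, skip], dim=1)
        x = F.relu(self.bn1(self.conv1(x)))
        return F.relu(self.bn2(self.conv2(x)))


class DepthEstimator(nn.Module):
    """DenseNet-161 encoder + Dense Depth style decoder with the table's channel widths."""

    def __init__(self):
        super().__init__()
        self.encoder = models.densenet161(weights="DEFAULT").features
        self.bottleneck = nn.Conv2d(2208, 1104, 1)   # 1 x 1 convolution at 20 x 11
        self.up1 = UpBlock(1104, 384, 552)           # skip: transition2 output, -> 40 x 22
        self.up2 = UpBlock(552, 192, 276)            # skip: transition1 output, -> 80 x 45
        self.up3 = UpBlock(276, 96, 138)             # skip: pool0 output,       -> 160 x 90
        self.up4 = UpBlock(138, 96, 69)              # skip: relu0 output,       -> 320 x 180
        self.out_conv = nn.Conv2d(69, 1, 3, padding=1)

    def forward(self, x):
        feats, out = {}, x
        for name, layer in self.encoder.named_children():
            out = layer(out)
            feats[name] = out                        # keep intermediate maps for skip connections
        d = self.bottleneck(feats["norm5"])
        d = self.up1(d, feats["transition2"])
        d = self.up2(d, feats["transition1"])
        d = self.up3(d, feats["pool0"])
        d = self.up4(d, feats["relu0"])
        d = self.out_conv(d)
        # bilinear interpolation back to the full 640 x 360 input resolution
        return F.interpolate(d, size=x.shape[2:], mode="bilinear", align_corners=True)


model = DepthEstimator().eval()
rgb = torch.randn(1, 3, 360, 640)                    # [ch = 3, shape = 640 x 360] as (N, C, H, W)
with torch.no_grad():
    print(model(rgb).shape)                          # torch.Size([1, 1, 360, 640])
```

A forward pass on a 640 × 360 RGB tensor reproduces the intermediate and output shapes listed in the architecture table.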
(a) SSIM

Models | Torus | Cube | Cone | Sphere
---|---|---|---|---
HDD | 0.9999 | 0.9999 | 0.9999 | 0.9999
CDD | 0.9999 | 0.9999 | 0.9999 | 0.9999

(b) PSNR (dB)

Models | Torus | Cube | Cone | Sphere
---|---|---|---|---
HDD | 84.95 | 84.42 | 84.90 | 85.03
CDD | 84.64 | 83.94 | 84.68 | 84.62

(c) ACC

Models | Torus | Cube | Cone | Sphere
---|---|---|---|---
HDD | 0.9933 ± 0.0040 | 0.9933 ± 0.0030 | 0.9928 ± 0.0037 | 0.9965 ± 0.0012
CDD | 0.9925 ± 0.0039 | 0.9934 ± 0.0027 | 0.9926 ± 0.0036 | 0.9959 ± 0.0013

(d) Abs rel

Models | Torus | Cube | Cone | Sphere
---|---|---|---|---
HDD | 0.022 | 0.018 | 0.022 | 0.017
CDD | 0.019 | 0.017 | 0.016 | 0.017

(e) Sq rel

Models | Torus | Cube | Cone | Sphere
---|---|---|---|---
HDD | 0.0058 | 0.0046 | 0.0058 | 0.0043
CDD | 0.0062 | 0.0052 | 0.0061 | 0.0047

(f) RMSE

Models | Torus | Cube | Cone | Sphere
---|---|---|---|---
HDD | 0.0009 | 0.0007 | 0.0006 | 0.0005
CDD | 0.0012 | 0.0013 | 0.0008 | 0.0013

(g) LRMSE

Models | Torus | Cube | Cone | Sphere
---|---|---|---|---
HDD | 0.0114 | 0.0117 | 0.0111 | 0.0110
CDD | 0.0116 | 0.0122 | 0.0113 | 0.0112
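A minimal NumPy sketch of the depth metrics in tables (c)-(g); the exact definitions (for example, the δ < 1.25 threshold behind ACC) follow common monocular depth estimation conventions and are assumptions here. SSIM and PSNR in tables (a) and (b) would typically be computed with an image-quality library such as scikit-image.

```python
# Minimal sketch (assumed metric definitions) of the depth-map evaluation measures.
import numpy as np


def depth_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> dict:
    """pred, gt: depth maps of identical shape; eps keeps divisions and logs finite."""
    pred = pred.astype(np.float64) + eps
    gt = gt.astype(np.float64) + eps
    ratio = np.maximum(pred / gt, gt / pred)
    return {
        "ACC": float(np.mean(ratio < 1.25)),                # threshold accuracy (assumed delta < 1.25)
        "Abs rel": float(np.mean(np.abs(pred - gt) / gt)),   # absolute relative error
        "Sq rel": float(np.mean((pred - gt) ** 2 / gt)),     # squared relative error
        "RMSE": float(np.sqrt(np.mean((pred - gt) ** 2))),   # root mean squared error
        "LRMSE": float(np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))),  # log-scale RMSE
    }


rng = np.random.default_rng(0)
gt = 0.1 + 0.9 * rng.random((360, 640))                   # synthetic ground-truth depth
pred = np.clip(gt + 0.01 * rng.standard_normal(gt.shape), 0.0, 1.0)
print(depth_metrics(pred, gt))
```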
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kim, H.; Lim, H.; Jee, M.; Lee, Y.; Yoon, M.; Kim, C. High-Precision Depth Map Estimation from Missing Viewpoints for 360-Degree Digital Holography. Appl. Sci. 2022, 12, 9432. https://doi.org/10.3390/app12199432