Figure 1.
Examples of depth maps containing large areas of missing data. (a) Reference color image. (b) Depth map; holes (in black) are due to occlusions. (c) Color reference image; a white square indicates reflections. (d) Depth map with holes in areas of high reflection, caused by uncertainty in the stereo-matching process.
Figure 2.
Proposed pipeline structure. It consists of three steps. CS1 extracts color features from the reference image using Gabor filters. Then, the algorithm propagates the available depth data, guided by the color image, to all regions of the sparse depth map by solving the infinity Laplacian equation (interpolator). Finally, a convolutional stage (CS2) removes outliers and noise from the completed depth map.
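To make the CS1 stage concrete, the following is a minimal sketch of Gabor-based feature extraction using OpenCV. The bank size and filter parameters (ksize, sigma, lambd, gamma) are illustrative assumptions only, not the tuned values of the pipeline, and taking the per-pixel maximum response is one simple way to aggregate the bank.

```python
# Minimal sketch of CS1-style Gabor feature extraction (illustrative only).
# Filter sizes and parameters below are assumptions, not the tuned values.
import cv2
import numpy as np

def extract_color_features(image_bgr, n_filters=8):
    """Convolve a grayscale version of the reference image with a bank of
    Gabor filters at n_filters orientations and keep the strongest
    per-pixel response."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    responses = []
    for j in range(n_filters):
        theta = j * np.pi / n_filters  # filter orientation
        kernel = cv2.getGaborKernel(ksize=(15, 15), sigma=3.0, theta=theta,
                                    lambd=8.0, gamma=0.5, psi=0.0)
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))
    return np.max(np.stack(responses), axis=0)  # per-pixel strongest response

features = extract_color_features(cv2.imread("reference.png"))  # placeholder path
```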
Figure 3.
Examples of color reference images processed by the first convolutional stage. Figures (a–c) show color reference images extracted from the KITTI data set. In (d–f), we present their respective outputs after being processed by the first convolutional stage (color feature images). We observe in (d–f) that the processing mainly emphasizes horizontal edges to improve the diffusion process.
Figure 4.
Example of Texture+Structure decomposition. (a) Original color reference image. (b) Low-spatial-variation part of the image (structure component). (c) High-spatial-variation part of the image (texture component).
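The caption does not specify the decomposition filter, so the sketch below uses a Gaussian low-pass as a simple stand-in: the smoothed image plays the role of the structure (low-spatial-variation) component and the residual is the texture component.

```python
# Minimal sketch of a Texture+Structure split as in Figure 4, using a
# Gaussian low-pass as a stand-in for the paper's decomposition filter.
import numpy as np
from scipy.ndimage import gaussian_filter

def texture_structure_split(image, sigma=5.0):
    """Return (structure, texture): a low-spatial-variation component and
    the residual high-spatial-variation component."""
    structure = gaussian_filter(image.astype(np.float32), sigma=sigma)
    texture = image.astype(np.float32) - structure  # residual detail
    return structure, texture
```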
Figure 5.
Example of depth completion by the infinity Laplacian. We show the reference image in (a): a white rectangle with a black circle at its center. We show the depth data in (b): the orange region represents a constant depth of 0.5, and the blue circle represents missing data, except for a single value equal to 0.68 at its center. We show the interpolated depth map in (c): the infinity Laplacian generates a cone, connecting the circular contour with the point at the circle's center.
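A minimal sketch of the behavior illustrated in Figure 5: the infinity-harmonic interpolant can be approximated by iterating the midpoint of the neighborhood maximum and minimum over the unknown pixels. This is the plain, unguided scheme; the paper's interpolator additionally weights the neighborhood with color features.

```python
# Minimal sketch of infinity-Laplacian inpainting (unguided version):
# iterate u <- (max of neighbors + min of neighbors) / 2 on unknown pixels.
import numpy as np

def infinity_laplacian_fill(depth, known_mask, n_iters=2000):
    u = depth.astype(np.float32).copy()
    u[~known_mask] = depth[known_mask].mean()  # neutral initialization
    for _ in range(n_iters):
        pad = np.pad(u, 1, mode="edge")
        # stack the 8 shifted neighbor images of the 3x3 window
        stack = np.stack([pad[dy:dy + u.shape[0], dx:dx + u.shape[1]]
                          for dy in range(3) for dx in range(3)
                          if not (dy == 1 and dx == 1)])
        update = 0.5 * (stack.max(axis=0) + stack.min(axis=0))
        u = np.where(known_mask, depth, update)  # keep known samples fixed
    return u
```

On the Figure 5 input (a boundary at depth 0.5 and a single 0.68 sample at the circle's center), this iteration converges to the cone shown in (c).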
Figure 6.
Example of depth map completion. (a) shows a two-color reference image. (b) shows random samples of a two-level depth map; white pixels represent a depth value of 1, and orange pixels a depth value of 0.5. (c) shows a 3D representation of the completed depth map, where we observe the two-level surface.
Figure 7.
Example of the balance map for three example images. Figures (a–c) show the considered reference color images. Figures (d–f) show color-coded values of the balance map: magenta shows the largest values, blue and cyan show intermediate values, and green shows the lowest values.
Figure 8.
Examples of reference color images from the KITTI data set, sparse depth maps, and ground truth. First row: color reference images. Second row: available sparse depth maps. Third row: corresponding ground truth. We color-coded the available depth values using the MATLAB jet color map; in the second and third rows, black means a lack of data, and red and yellow mean small and large depth values, respectively.
Figure 9.
Examples from the NYU_V2 data set. The first row shows three indoor color images of the NYU_V2 data set: Img_1199, Img_1372, and Img_1424, respectively. The second row shows the corresponding depth maps, color-coded using the MATLAB jet colormap: dark red means small depth values, and bright yellow means large depth values.
Figure 10.
Evolution of the PSO algorithm. Using this optimization algorithm, we selected the best set of parameters for the model. We set the number of individuals, that is, independent realizations, to 50. (a) Evolution of the depth error (MSE + MAE) of each of the 50 model parameter candidates over successive iterations. (b) For clarity, the evolution of the best of the 50 individuals shown in (a).
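A minimal PSO sketch with 50 individuals, matching the setup of Figure 10. The inertia and acceleration constants (w, c1, c2) and the search range are generic textbook choices, not the paper's settings; here `cost` would evaluate the depth error (MSE + MAE) of the pipeline for a candidate parameter vector.

```python
# Minimal particle swarm optimization sketch (generic hyperparameters).
import numpy as np

def pso(cost, dim, n_particles=50, n_iters=100, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                             # particle velocities
    pbest, pbest_cost = x.copy(), np.array([cost(p) for p in x])
    gbest = pbest[pbest_cost.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # pull each particle toward its personal best and the global best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        cost_x = np.array([cost(p) for p in x])
        improved = cost_x < pbest_cost
        pbest[improved], pbest_cost[improved] = x[improved], cost_x[improved]
        gbest = pbest[pbest_cost.argmin()].copy()
    return gbest, pbest_cost.min()
```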
Figure 11.
Example of the subsampled image. We subsampled the original depth map in (a) to obtain (b). In (c), we zoom in on a region (on the bed) of the depth map in (a). In (d), we show the corresponding zoom of the subsampled depth map in (b).
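A minimal sketch of the subsampling in Figure 11, assuming one retained pixel per k × k block (the exact sampling pattern symbol is missing from the caption):

```python
# Minimal sketch of depth-map subsampling: keep one pixel per k x k block
# and mark the rest as missing (0 here denotes no data).
import numpy as np

def subsample_depth(depth, k=4):
    sparse = np.zeros_like(depth)
    sparse[::k, ::k] = depth[::k, ::k]  # retain every k-th pixel per axis
    return sparse
```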
Figure 12.
Error in the training set as a function of the number of images in the training set.
Figure 13.
Examples of lack of information in depth maps. Figures (a–c) show reference color images. Figures (d–f) show the corresponding ground-truth depth maps.
Figure 14.
Examples of qualitative and quantitative results obtained by our best model on the KITTI data set. We show color reference images in (a,b). In (c,d), we show the sparse depth maps used as input to our model. In (e,f), we show the completed depth maps and the error obtained for each image.
Figure 15.
Examples of qualitative and quantitative errors obtained by our model on the KITTI data set; we show our worst results. In (a–c), we show color reference images. We present the sparse depth maps used as input to our model in (d–f). In (g–i), we show the completed depth maps.
Figure 16.
Error in the training and validation sets as a function of the number of images in the training set.
Figure 17.
Comparison of the error obtained by different models: our proposal, Bicubic, TGV [10], Ham [28], JBU, and Park [2].
Figure 19.
A 3D reconstruction of an urban scene from a depth map and a color frame. The first row shows the interpolated 3D depth of an urban scene. The second row shows the 3D scene reconstructed using the color reference and the completed depth map.
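A minimal sketch of how such a reconstruction can be produced: back-project each pixel of the completed depth map through a pinhole camera model and attach the reference color. The intrinsics fx, fy, cx, cy below are placeholders, not the KITTI calibration values.

```python
# Minimal sketch of back-projecting a completed depth map into a colored
# 3D point cloud (as in Figure 19); fx, fy, cx, cy are placeholder
# pinhole intrinsics.
import numpy as np

def depth_to_pointcloud(depth, color, fx=700.0, fy=700.0, cx=None, cy=None):
    h, w = depth.shape
    cx = w / 2.0 if cx is None else cx
    cy = h / 2.0 if cy is None else cy
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx            # pinhole back-projection
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = color.reshape(-1, 3)
    valid = points[:, 2] > 0          # drop pixels with no depth
    return points[valid], colors[valid]
```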
Table 1.
Model training comparison.
Order Number | Model | Number of Parameters | Number of GPUs | Number of Images in the Training Set
---|---|---|---|---
1 | Eldesokey et al. [22] | 330,000 | - | -
2 | Lu et al. [4] | - | 2 | 85,898
3 | Bai et al. [17] | 134,000 | 1 | 85,898
4 | Lin et al. [15] | - | 4 | 50,000
5 | Morpho Networks [23] | - | 1 | 4300
6 | Chen et al. [24] | - | 8 | -
Table 2.
Number of points per neighborhood size.
Order Number | Neighborhood Size | Parameter P
---|---|---
1 | 3 × 3 | 9
2 | 5 × 5 | 25
3 | 7 × 7 | 49
4 | 9 × 9 | 81
5 | 11 × 11 | 121
6 | 13 × 13 | 169
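The two columns of Table 2 are linked by the usual square-window point count: a (2r+1) × (2r+1) neighborhood contains

```latex
P = (2r+1)^2, \qquad r = 1, \dots, 6 \;\Rightarrow\; P = 9,\, 25,\, 49,\, 81,\, 121,\, 169,
```

which recovers the neighborhood sizes 3 × 3 through 13 × 13 listed above.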
Table 3.
Running time of different models.
Model | Processing Time
---|---
Our model | 1 ms
Eldesokey et al. [22] | 20 ms
Lu et al. [4] | 50 ms
Bai et al. [17] | 90 ms
Imran et al. [8] | 150 ms
Lin et al. [15] | 160 ms
Morpho. Networks [23] | 170 ms
Krauss et al. [20] | 840 ms
Conv. Spatial Prop. Networks [24] | 1000 ms
Table 4.
Parameters of the proposed model.
Parameter | Description | Number of Parameters
---|---|---
…, …, p, q, s, …, … | Parameters of the metric | 7
…, … | Parameters of the anisotropic metric | 2
radius | Neighborhood size of … | 1
… | Number of iterations | 1
… with j = 1,…,8 | Parameters of the Gabor filters CS1 | 16
… with … | Weights of the convolutional filter stage CS2 | 36
… | Coefficients of the matrices A and C | 4
Reversible | Sequence order | 1
NFilt, NFilt2 | Number of filters considered in stages CS1 and CS2, respectively | 2
…, … | Parameters of the balanced infinity Laplacian | 2
…, … | Parameters of the double balanced infinity Laplacian | 2
Total | | 74
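As a consistency check, the per-group counts in Table 4 do sum to the stated total:

```latex
7 + 2 + 1 + 1 + 16 + 36 + 4 + 1 + 2 + 2 + 2 = 74.
```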
Table 5.
Experiments performed with our model to obtain the MSE + MAE error.
Experiment Number | Used Metric | Training Set | Experiment
---|---|---|---
1 | Positive definite metric | KITTI training set | Interpolation of the complete KITTI data set
2 | Anisotropic metric | KITTI training set | Interpolation of the complete KITTI data set
3 | Anisotropic metric + biased infinity Laplacian | KITTI training set | Interpolation of the complete KITTI data set
4 | Anisotropic metric + double biased infinity Laplacian | KITTI training set | Interpolation of the complete KITTI data set
5 | Positive definite metric | NYU_V2 4× training set | Upsampling of the complete NYU_V2 4× data set
6 | Anisotropic metric | NYU_V2 4× training set | Upsampling of the complete NYU_V2 4× data set
7 | Positive definite metric | NYU_V2 8× training set | Upsampling of the complete NYU_V2 8× data set
8 | Anisotropic metric | NYU_V2 8× training set | Upsampling of the complete NYU_V2 8× data set
9 | Positive definite metric | NYU_V2 16× training set | Upsampling of the complete NYU_V2 16× data set
10 | Anisotropic metric | NYU_V2 16× training set | Upsampling of the complete NYU_V2 16× data set
11 | Anisotropic metric | KITTI training set | Interpolation of the complete KITTI data set
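For reference, the MSE + MAE objective named in Table 5 is typically computed only over pixels that have ground-truth depth; a minimal sketch:

```python
# Minimal sketch of the MSE + MAE depth-completion error reported in the
# tables below, evaluated only on pixels with ground-truth data.
import numpy as np

def depth_error(pred, gt, valid_mask):
    diff = pred[valid_mask] - gt[valid_mask]
    mse = np.mean(diff ** 2)          # mean squared error
    mae = np.mean(np.abs(diff))       # mean absolute error
    return mse, mae, mse + mae
```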
Table 6.
Depth completion error on the KITTI data set.
Model | MSE | MAE | MSE + MAE
---|---|---|---
Proposed model (Texture+Structure) | 1.1395 | 0.3127 | 1.4522
Proposed model (anisotropic metric) | 1.1269 | 0.3107 | 1.4376
Balanced biased infinity Laplacian | 1.1254 | 0.3205 | 1.4459
Double balanced biased infinity Laplacian | 1.1256 | 0.3223 | 1.4479
Results presented in PDAA [1] | 1.1397 | 0.3132 | 1.4529
Conv. Spatial Prop. Networks [24] | 1.0196 | 0.2795 | 1.2991
Morpho. Networks [23] | 1.0455 | 0.3105 | 1.3560
Deep Fuse Networks [13] | 1.2067 | 0.4299 | 1.6366
Table 7.
Depth completion error obtained by varying the number of images in the training set.
Number of Images in the Training Set | MSE (cm) | MAE (cm) | MSE + MAE (cm)
---|---|---|---
3 | 1.1254 | 0.3205 | 1.4459
4 | 1.1272 | 0.3139 | 1.4411
6 | 1.2617 | 0.4237 | 1.6854
8 | 1.1355 | 0.3226 | 1.4581
10 | 1.2826 | 0.4247 | 1.7073
12 | 1.1353 | 0.3234 | 1.4586
14 | 1.4396 | 0.7735 | 2.2132
16 | 1.4412 | 0.7675 | 2.2087
18 | 1.4365 | 0.7603 | 2.1968
Table 8.
Depth completion error obtained by different models on NYU_V2.
Model | Error 4× (cm) | Error 8× (cm) | Error 16× (cm)
---|---|---|---
Proposed model | 8.31 | 9.38 | 17.56
Proposed model (anisotropic metric) | 8.31 | 9.35 | 17.16
Bicubic | 8.46 | 14.22 | 22.32
TGV [10] | 6.98 | 11.23 | 28.13
Ham [28] | 5.27 | 12.31 | 19.24
JBU | 4.07 | 8.39 | 13.25
Park [2] | 5.21 | 9.56 | 18.10
DJF [29] | 3.54 | 6.20 | 10.21
Table 9.
Results of the ablation test.
Stage | MSE
---|---
Complete | 8.3136
Convolutional Stage CS1 | 8.3272
Anisotropic Metric | 8.3140
Convolutional Stage CS2 | 8.6529
Table 10.
Best results obtained on the KITTI data set.
Model | MSE | MAE
---|---|---
Anisotropic Metric | 1.1269 | 0.3107
Table 11.
Best results obtained on the NYU_V2 data set.
Model | MSE 4× (cm) | MSE 8× (cm) | MSE 16× (cm)
---|---|---|---
Anisotropic Metric | 8.31 | 9.38 | 17.56