1. Introduction
Fringe projection profilometry (FPP) provides a convenient way to measure dense and accurate three-dimensional (3D) surface point clouds of target objects. It plays an increasingly important role in various fields such as industrial quality inspection, prototyping, cultural heritage preservation, and the movie industry [1,2,3,4,5]. Owing to the limited field of view (FOV) and object self-occlusion, a 3D point cloud obtained from a single viewpoint contains only partial surface shape data. To reconstruct complete surface models, 3D measurements from multiple viewpoints are required to cover the whole object, and the sensor poses need to be precisely tracked so that these partial surface point clouds can be transformed into a global coordinate system [6,7,8,9].
Existing sensor pose tracking solutions are mostly based on external assistance, such as attaching artificial markers or using external positioning equipment such as a laser tracker or optical coordinate measuring machines (CMMs) [10]; their usage flexibility is therefore inherently limited. Alternatively, sensor poses can be estimated directly by using 3D registration techniques [11,12,13] to compute the relative pose between two sequential measurements. However, sensor pose estimation drift inevitably exists due to 3D registration inaccuracy. A small sensor pose estimation error, which may seem negligible on a local scale, can drastically accumulate along a long scanning trajectory [12,14]. The accumulated error directly leads to surface point cloud inconsistency between the first and last scans, and finally breaks the reconstruction result.
Different optimization methods have been adopted to solve the accumulated error problem. Among them, bundle adjustment (BA) is one of the most well-known approaches; it performs global optimization by minimizing the reprojection error across different frames. Specifically, BA first identifies the same visual feature points appearing in multiple frames, and then adjusts the estimated 3D locations of the feature points together with the camera poses [7,9]. Nevertheless, BA only optimizes sparse 3D feature points and camera poses, so it does not guarantee local shape consistency of the reconstructed 3D models [14]. Besides, visual feature detection is a prerequisite for BA optimization; it cannot be fulfilled when no valid color image is available or when the target object surface is textureless (e.g., industrial parts).
Instead of optimizing the accumulated error to resolve surface inconsistency, Zhou et al. [15] and Whelan et al. [16] chose to deform the inconsistent local point clouds together using non-rigid 3D registration techniques, taking consumer RGB-D sensors as the depth input. Shape deformation provides a simple yet useful approach to obtain globally consistent models, especially in applications such as indoor reconstruction [12], where surface consistency rather than accuracy is of the most importance. However, shape deformation is not desirable in our problem, because it directly ruins the surface measurement accuracy. Furthermore, since the FPP sensor provides high-accuracy surface point cloud measurements, in theory, once sufficiently accurate sensor poses are recovered, the individual local 3D point clouds should integrate into a globally consistent model using only rigid transformations.
Differently, Cao et al. [17] and Yue et al. [18] optimized the accumulated error by first identifying the loop closures formed through successful 3D registration between each current frame and earlier frames, and then performing a pose graph optimization [19] to reduce the sensor pose drift. However, in their works the loop closures are identified either by manually checking the 3D point cloud overlapping ratio [17], or by using the measurement system setup information [18], which prevents their further usage in a practical 3D scanning system. Moreover, the pose graph optimization in [17,18] only optimizes the inconsistency between two associated sensor poses and their relative pose constraint; it ignores the important surface consistency information in the 3D registration process [6].
According to the above analysis, the key to accurate surface reconstruction lies in reducing the accumulated sensor pose estimation error. In this paper, we present a flexible and accurate method for high-accuracy globally consistent surface reconstruction using a single FPP sensor. The accumulated error problem is addressed from two aspects: (1) Observing the underlying principle that surface curvature remains invariant under measurement viewpoint changes, a novel 3D registration method is proposed which fuses both dense geometric and curvature consistency constraints to jointly optimize the relative sensor pose estimation. The introduced curvature consistency constraint implicitly pays attention to high-curvature surfaces, which helps to generate more accurate 3D registration results [20]. (2) We utilize 6-DOF pose distances for adaptive keyframe determination, and use a two-step checking scheme for automatic loop closure detection. By modelling the surface inconsistency information as a pre-computed covariance matrix and formulating the multi-view point cloud registration problem in a pose graph optimization framework, the accumulated error can be effectively reduced to obtain the final accurate sensor pose estimations.
The effectiveness of the proposed method is demonstrated by reconstructing a 1300 mm × 400 mm workpiece with an FPP sensor. The results show that the proposed method substantially reduces the accumulated error, making the sensor pose estimation accuracy match the measurement accuracy well. Our method shows the ability to accomplish industrial-level surface model reconstruction without any external positional assistance, using only a single FPP sensor.
2. Measurement Principle
In our FPP sensor, a series of sinusoidal fringe patterns with a constant phase shift along the horizontal axis of the projector image frame is projected onto a target object, and two cameras capture the distorted fringe images synchronously. The captured images can be expressed as:

$$ I_n(u, v) = A(u, v) + B(u, v)\cos\left[\varphi(u, v) + n\delta\right], \tag{1} $$

where $(u, v)$ is the pixel coordinate and is omitted in the following expressions, $I_n$ denotes the recorded intensity, $A$ indicates the average intensity, $B$ represents the modulation intensity, $\delta = 2\pi/N$ is the constant phase shift, $n$ is the phase-shift number, and $\varphi$ is the desired phase information. By solving Equation (1) in the least-squares sense over the $N$ phase-shifted images, the phase value $\varphi$ can be obtained according to:

$$ \varphi = -\arctan\frac{\sum_{n=0}^{N-1} I_n \sin(n\delta)}{\sum_{n=0}^{N-1} I_n \cos(n\delta)}. \tag{2} $$

The arctangent function in Equation (2) results in a phase value wrapped within the range of $(-\pi, \pi]$, with $2\pi$ discontinuities. In our sensor, multi-frequency heterodyne technology is adopted to construct the continuous phase map [21], so that the correspondence between the two camera views can be established unambiguously. Finally, the 3D result is obtained according to the pre-calibrated camera intrinsic and extrinsic parameters. The measurement principle of the FPP sensor is shown in Figure 1.
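For illustration, the following is a minimal NumPy sketch of the N-step phase-shifting computation in Equations (1) and (2); the function name and array conventions are our own assumptions, and the actual sensor pipeline may differ.

```python
import numpy as np

def wrapped_phase(images):
    """Recover the wrapped phase map from N phase-shifted fringe images.

    images: (N, H, W) array of recorded intensities I_n, captured with a
    constant phase shift delta = 2*pi/N as in Equation (1).
    Returns the phase of Equation (2), wrapped to one 2*pi period.
    """
    n_steps = images.shape[0]
    delta = 2.0 * np.pi / n_steps
    n = np.arange(n_steps).reshape(-1, 1, 1)
    num = np.sum(images * np.sin(n * delta), axis=0)
    den = np.sum(images * np.cos(n * delta), axis=0)
    # Least-squares phase solution; arctan2 keeps the correct quadrant.
    return -np.arctan2(num, den)
```

The wrapped phase obtained at each fringe frequency would then be passed to the multi-frequency heterodyne unwrapping step [21].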
3. Relative Sensor Pose Estimation
The relative sensor pose estimation between two sequential measurements (also called frames in the following) is the basis for obtaining the initial global sensor pose estimation of each measurement. In this section, we introduce the proposed method, which estimates the relative sensor pose (a rigid transformation) by registering two depth maps in 3D while jointly optimizing the dense geometric and curvature inconsistency errors. The whole process is conducted by first computing the curvature map of each depth map, and then iteratively performing data association and error minimization steps.
3.1. Curvature Map Estimation
Similar to a depth map (also called a depth image), a curvature map is a 2D image in which the value of each pixel is a surface curvature value instead of a depth value. Specifically, for each pixel $p = (u, v)$ in the depth map with valid depth $d$, its corresponding 3D point coordinate $x$ can be computed using the inverse of the projection function $\pi$ as:

$$ x = \pi^{-1}(p, d) = \left( \frac{(u - c_x)\,d}{f_x},\; \frac{(v - c_y)\,d}{f_y},\; d \right)^{\mathsf T}, \tag{3} $$

where $f_x$, $f_y$ are the focal lengths and $c_x$, $c_y$ are the principal point coordinates, respectively. The mean curvature of each point on the surface is approximated using the surface variation notion in [22]. Hence, the surface curvature value $c(p)$ at pixel $p$ is estimated by the eigen-analysis of the covariance matrix of the local neighbor points of point $x$. The covariance matrix is defined as:

$$ C = \sum_{i=1}^{k} (x_i - \bar{x})(x_i - \bar{x})^{\mathsf T}, \tag{4} $$

where $x_i$ is one of the $k$ nearest neighbor points of $x$ and $\bar{x}$ is the centroid of these neighbors. Then $c(p)$ can be computed as:

$$ c(p) = \frac{\lambda_0}{\lambda_0 + \lambda_1 + \lambda_2}, \tag{5} $$

where $\lambda_0 \le \lambda_1 \le \lambda_2$ are the eigenvalues of the covariance matrix $C$.
To speed up the nearest neighbor search, we take advantage of the organized point cloud structure embedded in the depth map, taking only adjacent pixels as candidate neighbors. Meanwhile, geometric continuity constraints are also considered to filter out potential depth gaps by specifying a maximum allowed distance. A pixel $q$ is accepted as a nearest neighbor of pixel $p$ only when it satisfies $\|q - p\|_\infty \le r_p$ and $\|x_q - x_p\| \le r_x$, where $r_p$ and $r_x$ represent the pixel and point nearest-neighbor distance thresholds, respectively. In this paper, the two thresholds are chosen (with an average point cloud density of 0.275 mm) to allow approximately 30 nearest neighbor points for curvature value estimation.
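To make the computation concrete, below is a minimal NumPy sketch of the curvature map estimation under the stated neighborhood rules; the function name, default threshold values, and invalid-pixel conventions are illustrative assumptions, not the exact implementation.

```python
import numpy as np

def curvature_map(depth, fx, fy, cx, cy, r_p=2, r_x=2.0):
    """Per-pixel surface-variation curvature (Equations (3)-(5)).

    depth: (H, W) depth map; invalid pixels are NaN or <= 0.
    r_p, r_x: pixel / point neighbor thresholds (illustrative values).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every pixel with Equation (3).
    pts = np.dstack(((u - cx) * depth / fx, (v - cy) * depth / fy, depth))
    valid = np.isfinite(depth) & (depth > 0)
    curv = np.full((h, w), np.nan)
    for i in range(h):
        for j in range(w):
            if not valid[i, j]:
                continue
            # Candidate neighbors come only from the organized pixel grid.
            win = pts[max(i - r_p, 0):i + r_p + 1,
                      max(j - r_p, 0):j + r_p + 1].reshape(-1, 3)
            win = win[np.isfinite(win[:, 2]) & (win[:, 2] > 0)]
            # Depth-gap filtering with the point distance threshold r_x.
            nb = win[np.linalg.norm(win - pts[i, j], axis=1) <= r_x]
            if len(nb) < 3:
                continue
            d = nb - nb.mean(axis=0)
            lam = np.linalg.eigvalsh(d.T @ d)   # Equation (4), ascending
            s = lam.sum()
            curv[i, j] = lam[0] / s if s > 0 else 0.0  # Equation (5)
    return curv
```

The windowed grid lookup is what makes the organized structure pay off: no k-d tree is needed, and the depth-gap filter naturally handles the discontinuous boundary case.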
Figure 2a shows a depth map measured with the FPP sensor, and Figure 2b shows the curvature map estimated with our method. Figure 2c is the corresponding 3D point cloud, with colors mapped from the curvature map; a local detail is displayed in Figure 2d. It can be seen that the estimated curvature map exhibits high consistency with the point cloud surface variation. Furthermore, by carefully handling the discontinuous boundary case, the curvature values at boundary points can also be robustly estimated, as shown in Figure 2d.
3.2. Data Association
Data association identifies the corresponding points between two sequential frames; the correspondence set is then fed to the optimization process to find the optimal relative sensor pose estimation. Assuming small camera motion between sequential frames, the projective data association algorithm [12] is conducted to produce the point correspondence set. Given the relative sensor pose estimation $T$ between the current frame $i$ and its previous frame $i-1$, for each pixel $p$ with valid depth in frame $i$, we first transform its corresponding 3D point $x_p$ into the local coordinate system of the previous frame as $x'_p = T x_p$. Then the corresponding pixel $q$ of $x'_p$ in frame $i-1$ can be computed with the perspective projection:

$$ q = \pi\!\left(K x'_p\right), \tag{6} $$

where $K$ is the camera intrinsic matrix. Note that for simplicity of notation, we omit the conversions between vectors and their homogeneous counterparts throughout this paper.
With projective data association, multiple pixels in the source depth image $D_i$ may correspond to a common pixel in the target depth image $D_{i-1}$. To solve this many-to-one problem, the z-buffer technique is adopted: for each pixel in the target depth map, we only keep the corresponding pixel in the source depth map with the minimum depth. All corresponding point pairs together construct the correspondence set $\mathcal{K}$ between frames $i$ and $i-1$.
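The following sketch illustrates the projective data association with z-buffering described above; it is a plain per-pixel loop, with the depth-compatibility threshold `max_dist` added as an assumed rejection step not spelled out in the text.

```python
import numpy as np

def projective_data_association(depth_src, depth_tgt, T, K, max_dist=2.0):
    """Build the correspondence set between frame i and frame i-1.

    depth_src: depth map of the current frame i (mm).
    depth_tgt: depth map of the previous frame i-1 (mm).
    T: 4x4 relative pose mapping frame-i points into frame i-1.
    Returns a list of ((v_src, u_src), (v_tgt, u_tgt)) pixel pairs.
    """
    h, w = depth_src.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    zbuf = np.full((h, w), np.inf)  # minimum warped depth per target pixel
    best = {}                       # target pixel -> source pixel
    for v in range(h):
        for u in range(w):
            d = depth_src[v, u]
            if not (np.isfinite(d) and d > 0):
                continue
            x = np.array([(u - cx) * d / fx, (v - cy) * d / fy, d, 1.0])
            xp = T @ x              # warp into the previous frame
            if xp[2] <= 0:
                continue
            ut = int(round(fx * xp[0] / xp[2] + cx))  # Equation (6)
            vt = int(round(fy * xp[1] / xp[2] + cy))
            if not (0 <= ut < w and 0 <= vt < h):
                continue
            dt = depth_tgt[vt, ut]
            if not (np.isfinite(dt) and dt > 0) or abs(xp[2] - dt) > max_dist:
                continue
            # z-buffer: keep only the source pixel with minimum warped depth.
            if xp[2] < zbuf[vt, ut]:
                zbuf[vt, ut] = xp[2]
                best[(vt, ut)] = (v, u)
    return [(src, tgt) for tgt, src in best.items()]
```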
3.3. Minimization
The relative sensor pose optimization function $E(T)$ is defined as:

$$ E(T) = E_g(T) + w_c E_c(T), \tag{7} $$

where $E_g$ denotes the geometric inconsistency error, $E_c$ denotes the curvature inconsistency error, and $w_c$ is the weight of the curvature inconsistency error.
The geometric error is defined as the point-to-plane error [11] between the current and previous frames:

$$ E_g = \sum_{(p, q) \in \mathcal{K}} \left[ \left( \hat{T} T x_p - y_q \right) \cdot n_q \right]^2, \tag{8} $$

where $(p, q)$ is one corresponding pixel pair in the correspondence set $\mathcal{K}$, $x_p$ is the local 3D point in the current frame $i$, and $y_q$ and $n_q$ are the corresponding 3D point and normal in the previous frame, respectively. $T$ is the current estimate of the relative sensor pose between the two frames, and $\hat{T}$ is the incremental transformation to be estimated in each iteration, with which the pose is updated as $T \leftarrow \hat{T} T$.
The curvature inconsistency error $E_c$ is defined as the curvature value inconsistency between the warped curvature map of the current frame $i$ and the curvature map of the previous frame $i-1$:

$$ E_c = \sum_{(p, q) \in \mathcal{K}} \left[ C_i(p) - C_{i-1}(q) \right]^2, \tag{9} $$

where $C_i(p)$ is the curvature value at pixel $p$ of the current frame, and $C_{i-1}(q)$ is the curvature value at pixel $q$ of the previous frame.
Assuming the incremental pose transformation $\hat{T}$ to be optimized at each iteration is small, it can be linearized as $\hat{T} \approx I + \hat{\xi}$, where $\xi = (\alpha, \beta, \gamma, t_x, t_y, t_z)^{\mathsf T}$ is the corresponding Lie algebra element:

$$ \hat{\xi} = \begin{pmatrix} 0 & -\gamma & \beta & t_x \\ \gamma & 0 & -\alpha & t_y \\ -\beta & \alpha & 0 & t_z \\ 0 & 0 & 0 & 0 \end{pmatrix}, \tag{10} $$

where $\,\hat{}\,$ is the linear skew-symmetric operator (see [23] for details).
With this linearization and the shorthand notation $\tilde{x}_p = T x_p$, the error term $E_g$ becomes:

$$ E_g \approx \left\| J_g \xi + r_g \right\|^2, \tag{11} $$

where $J_g$ is the Jacobian matrix and $r_g$ is the residual vector. Similarly, the error term $E_c$ becomes:

$$ E_c \approx \left\| J_c \xi + r_c \right\|^2. \tag{12} $$
With the above linearization, minimizing Equation (7) amounts to solving the following linear system:

$$ \left( J_g^{\mathsf T} J_g + w_c J_c^{\mathsf T} J_c \right) \xi = -\left( J_g^{\mathsf T} r_g + w_c J_c^{\mathsf T} r_c \right). \tag{13} $$
In each iteration, we compute the Jacobians $J_g$, $J_c$ and residuals $r_g$, $r_c$ at the current relative sensor pose estimation $T$, and solve the linear system in Equation (13) to find the $\xi$ that best satisfies the geometric and curvature consistency constraints. The relative pose $T$ is then updated to $\hat{T} T$ and taken as the initialization for the next iteration.
When the optimization converges, $T$ is taken as the final relative sensor pose estimation between the two frames. We fix the sensor pose of the first frame as the identity and regard it as the world coordinate system. Then the initial global sensor pose of frame $i$ is computed by chaining the relative poses, $T_i = T_{i-1} T$.
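A compact sketch of one iteration of this Gauss-Newton scheme is given below; it assumes the per-correspondence Jacobian rows are stacked into matrices upstream, shows the point-to-plane row of Equation (8) explicitly, and leaves out the curvature rows, whose assembly follows the same pattern. Names and signatures are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def se3_hat(xi):
    """Map xi = (alpha, beta, gamma, tx, ty, tz) to the matrix of Equation (10)."""
    a, b, g, tx, ty, tz = xi
    return np.array([[0.0,  -g,   b,  tx],
                     [g,   0.0,  -a,  ty],
                     [-b,    a, 0.0,  tz],
                     [0.0, 0.0, 0.0, 0.0]])

def point_to_plane_row(x_src, y_tgt, n_tgt, T):
    """Jacobian row and residual of one point-to-plane term in Equation (8)."""
    xw = (T @ np.append(x_src, 1.0))[:3]          # source point warped by T
    J = np.hstack([np.cross(xw, n_tgt), n_tgt])   # d r / d (alpha..tz)
    r = np.dot(xw - y_tgt, n_tgt)
    return J, r

def gauss_newton_step(J_g, r_g, J_c, r_c, w_c, T):
    """Solve Equation (13) once and apply the incremental update to T."""
    A = J_g.T @ J_g + w_c * (J_c.T @ J_c)
    b = -(J_g.T @ r_g + w_c * (J_c.T @ r_c))
    xi = np.linalg.solve(A, b)
    return expm(se3_hat(xi)) @ T   # update T <- exp(xi^) T
```

Applying the full (rather than linearized) exponential map in the update keeps the pose estimate on SE(3) at every iteration.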
Figure 3 shows a comparison of the 3D registration results between the proposed method and two other methods. The sensor pose estimation accuracy is directly reflected in the surface shape consistency of the two registered point clouds. When each registration result is visually inspected on its own, every method seems to converge to a correct result. However, when comparing the registration results in Figure 3b–d, it is not hard to see that the relative sensor pose estimation accuracy of our method outperforms the other two methods.
Figure 4a,b shows the curvature value difference maps between the source and target point clouds before and after the 3D registration, respectively. The curvature difference map is built on the target frame $i-1$, with correspondences established using the above data association method. Gray pixels indicate that no correspondence was established. It can be seen that the curvature value difference decreases dramatically over the whole map from Figure 4a to Figure 4b, which demonstrates the significance of introducing curvature map consistency into the 3D registration constraints.
5. Experiment
In the experiment, an FPP sensor is constructed using (1) a Texas Instruments LightCrafter 4500 board (Texas Instruments, Dallas, TX, USA) for fringe pattern projection, and (2) two Basler acA1300-30gm cameras (Basler AG, Ahrensburg, Germany) that synchronously capture the modulated images with a pixel resolution of 1296 × 966. The proposed method is validated by scanning a 1300 mm × 400 mm sheet metal part with the FPP sensor, as shown in Figure 8; the 3D measurement and model reconstruction are conducted on a desktop PC with a 3.3 GHz Intel Xeon CPU and 16 GB RAM. By moving the FPP sensor around the part, a complete scan of the sheet metal is accomplished, with 146 frames (depth maps) acquired in total.
To test and verify the accuracy and effectiveness of the proposed relative sensor pose estimation method and the global optimization method, a ceramic ball bar is placed beside the measured sheet metal. The reconstruction accuracy can then be examined both qualitatively, by observing the surface consistency, and quantitatively, by analyzing the size fitting results of the reconstructed ceramic ball bar.
5.1. Relative Sensor Pose Estimation Accuracy
The accuracy of our proposed relative sensor pose estimation method is tested first. The sensor pose of each frame relative to the world coordinate system (frame 1) is estimated separately by (1) jointly optimizing the geometric and curvature consistency constraints (our method), and (2) optimizing only the geometric consistency constraint, for comparison. With the estimated sensor poses, the 3D point cloud of each frame is transformed into the world coordinate system and further voxel-downsampled into a unified 3D point cloud.
Figure 9a shows the reconstructed surface of the sheet metal with our method; the overall shape of the reconstruction result matches the actual sheet metal shape well. The point clouds are rendered with the Open3D library [24].
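For reference, the fusion step can be expressed in a few lines of Open3D; `clouds` and `poses` stand for the per-frame point clouds and the estimated global poses $T_i$, which are assumed to be loaded beforehand, and the voxel size is set near the measured point cloud density.

```python
import open3d as o3d

# Fuse the per-frame point clouds into the world frame (Section 5.1).
merged = o3d.geometry.PointCloud()
for cloud, pose in zip(clouds, poses):
    cloud.transform(pose)    # move frame i into the world coordinate system
    merged += cloud
merged = merged.voxel_down_sample(voxel_size=0.275)  # ~ point cloud density
o3d.visualization.draw_geometries([merged])
```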
On the other hand, sensor pose estimation error inevitably accumulates in the reconstruction process, which leads to obvious surface shape artifacts, as shown in Figure 9b,c. Figure 9b shows the local surface inconsistency at three different places using our method, while Figure 9c shows the corresponding results using only the geometric consistency constraint. From this comparison, it is not hard to see that introducing the curvature consistency constraint effectively improves the sensor pose estimation accuracy, which provides a good foundation for the subsequent global optimization.
5.2. Global Sensor Pose Optimization Accuracy
Based on the sensor pose estimation results above, the global optimization is performed by (1) keyframe selection, (2) loop closure detection and (3) pose graph optimization. Then the globally optimized reconstruction result is obtained with the optimized sensor poses.
Figure 10a,b show the optimized surface model and its local details, respectively. With the global model optimization, we obtained globally consistent surface model, surface inconsistencies due to the accumulated error are well optimized as shown in
Figure 10b.
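As an illustration of step (3), the sketch below assembles and optimizes a pose graph with Open3D's built-in optimizer [24]; this is not necessarily the solver used in the paper, and the construction of the edges and their 6 × 6 information matrices (which encode the surface inconsistency) is assumed to happen upstream.

```python
import numpy as np
import open3d as o3d

def optimize_pose_graph(init_poses, odometry_edges, loop_edges):
    """Pose graph optimization over the keyframe poses.

    init_poses: list of initial global poses T_i (4x4 numpy arrays).
    odometry_edges / loop_edges: lists of (i, j, T_ij, info) tuples, where
    info is the 6x6 information matrix of the relative pose constraint.
    """
    reg = o3d.pipelines.registration
    graph = reg.PoseGraph()
    for T in init_poses:
        graph.nodes.append(reg.PoseGraphNode(T))
    for i, j, T_ij, info in odometry_edges:
        graph.edges.append(reg.PoseGraphEdge(i, j, T_ij, info, uncertain=False))
    for i, j, T_ij, info in loop_edges:   # loop closures are marked uncertain
        graph.edges.append(reg.PoseGraphEdge(i, j, T_ij, info, uncertain=True))
    reg.global_optimization(
        graph,
        reg.GlobalOptimizationLevenbergMarquardt(),
        reg.GlobalOptimizationConvergenceCriteria(),
        reg.GlobalOptimizationOption(reference_node=0))  # fix frame 1 as world
    return [np.asarray(node.pose) for node in graph.nodes]
```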
To further quantitatively analyze the accuracy improvement brought by the global optimization, we computed the relative translation and rotation changes of each keyframe pose before and after global optimization, as shown in Figure 11; the optimized poses are taken as the reference values here. It demonstrates that even very small translation estimation errors (less than 2.0 mm) and rotation estimation errors, over the reconstruction range of 1300 mm × 400 mm, are enough to cause obvious surface inconsistency (as shown in Figure 9b) and to make the reconstruction results unusable for high-accuracy dimensional inspection.
Meanwhile, the absolute accuracy of the reconstructed surface model can be directly and precisely tested by comparing (1) the fitted diameters of the two spheres, (2) the standard deviations of the Euclidean distances between the sphere surface 3D points and the fitted sphere surfaces, and (3) the Euclidean distance between the two sphere centers. The comparison is made between the not-optimized model, the globally optimized model, and the ground truth. The ground truth is obtained from the fitting values of frame 130: since both spheres are measured in this single frame, its fitting values depend only on the measurement accuracy of our FPP sensor and are not affected by any sensor pose estimation error. Specifically, for each data source, we manually cropped the points belonging to the two sphere surfaces, and fitted the diameter and standard deviation values using the Geomagic software.
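For readers without access to Geomagic, the same three quantities can be estimated with a plain algebraic least-squares sphere fit, as sketched below; this is a generic fit for illustration, not the Geomagic workflow used in the paper.

```python
import numpy as np

def fit_sphere(points):
    """Least-squares sphere fit to cropped ball-bar points.

    points: (N, 3) array of 3D points on one sphere surface.
    Returns (center, diameter, std), where std is the standard deviation
    of the signed point-to-sphere distances.
    """
    # ||p||^2 = 2 c . p + (r^2 - ||c||^2) is linear in (c, r^2 - ||c||^2).
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = np.sum(points ** 2, axis=1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = w[:3]
    radius = np.sqrt(w[3] + center @ center)
    dists = np.linalg.norm(points - center, axis=1) - radius
    return center, 2.0 * radius, dists.std()

# The sphere center distance of Table 2 would then be, e.g.:
# c1, _, _ = fit_sphere(points_sphere1); c2, _, _ = fit_sphere(points_sphere2)
# center_distance = np.linalg.norm(c1 - c2)
```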
Table 1 shows the comparison of the fitted diameter and standard deviation values of the two spheres. The standard deviation values directly reflect the surface consistency of our reconstructed model. After the global optimization, the standard deviation decreases from 0.1971 mm to 0.0282 mm for sphere 1, and from 0.2534 mm to 0.0301 mm for sphere 2. Furthermore, the standard deviation of the globally optimized model is very close to that of a single measurement (frame 130), which demonstrates that our reconstructed surface exhibits very good shape consistency.
We also compared the sphere center distances of the not-optimized and globally optimized models, as shown in Table 2. The absolute error of the sphere center distance relative to the ground truth decreases from 0.2080 mm to 0.0205 mm, and the relative error decreases from 0.1387% to 0.0137%.
Both of the above comparisons explain the surface shape consistency refinement from Figure 9a,b to Figure 10a,b, and illustrate that with the global optimization (1) the accumulated error is reduced to a small fraction of that of the not-optimized reconstruction result, and (2) the final sensor pose estimation accuracy matches the measurement accuracy of our FPP sensor well.
6. Conclusions
In this paper, we present a high-accuracy globally consistent surface reconstruction method using fringe projection profilometry. The accumulated sensor pose estimation error problem is solved with a relative sensor pose estimation step followed by a global sensor pose optimization step. The former reduces the accumulated error by maximizing the relative sensor pose estimation accuracy; it helps to ensure that the initial sensor poses lie in the convergence basin of the subsequent global optimization. The latter globally optimizes the sensor poses through multi-view point cloud registration formulated in a pose graph optimization framework. Besides, adaptive keyframe selection and loop closure detection methods are proposed to efficiently and automatically build the point cloud connections and their relative pose constraints, which are the prerequisites of the global sensor pose optimization. By qualitatively observing and quantitatively analyzing the reconstruction results of a 1300 mm × 400 mm workpiece, we validated the effectiveness and accuracy of our method. Our method demonstrates the ability to accomplish industrial-level surface model reconstruction without any external positional assistance, using only a single FPP sensor.
Since our reconstruction method is based on 3D registration, it shares some limitations with most 3D registration based surface reconstruction methods [7,12,16]. For example, when the target object is nearly planar, 3D registration may not converge to a correct result due to insufficient geometric constraints [11], which prevents the sensor poses from being robustly tracked. A possible solution is to further exploit surface texture constraints to help track the sensor poses robustly.