Article

Distributed High-Speed Videogrammetry for Real-Time 3D Displacement Monitoring of Large Structure on Shaking Table

1 College of Surveying and Geo-Informatics, Tongji University, Shanghai 200092, China
2 Key Laboratory for Urban Geomatics of National Administration of Surveying, Mapping and Geoinformation, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
3 College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
4 The State Key Laboratory of Disaster Reduction in Civil Engineering, Tongji University, Shanghai 200092, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(23), 4345; https://doi.org/10.3390/rs16234345
Submission received: 27 September 2024 / Revised: 7 November 2024 / Accepted: 18 November 2024 / Published: 21 November 2024

Abstract
The accurate and timely acquisition of high-frequency three-dimensional (3D) displacement responses of large structures is crucial for evaluating their condition during seismic excitation on shaking tables. This paper presents a distributed high-speed videogrammetric method designed to rapidly measure the 3D displacement of large shaking table structures at high sampling frequencies. The method uses non-coded circular targets affixed to key points on the structure and an automatic correspondence approach to efficiently estimate the extrinsic parameters of multiple cameras with large fields of view. This process eliminates the need for large calibration boards or manual visual adjustments. A distributed computation and reconstruction strategy, employing the alternating direction method of multipliers, enables the global reconstruction of time-sequenced 3D coordinates for all points of interest across multiple devices simultaneously. The accuracy and efficiency of this method were validated through comparisons with total stations, contact sensors, and conventional approaches in shaking table tests involving large structures with replaceable coupling beams (RCBs). Additionally, the proposed method achieved at least a sixfold speedup over advanced commercial photogrammetric software. It can acquire 3D displacement responses of large structures at high sampling frequencies in real time without requiring a high-performance computing cluster.

Graphical Abstract

1. Introduction

Innovative structures with replaceable coupling beams (RCBs) and dissipative seismic devices have recently emerged to mitigate damage and expedite repairs in buildings that need to remain operational after extreme seismic events, such as hospitals, fire stations, and government buildings [1,2]. Large shaking table tests are essential for investigating structural seismic performance and optimizing designs [3]. Contact sensors, such as displacement meters, accelerometers, and strain gauges, are frequently used in shaking table tests to record high-frequency displacement responses and other dynamic behaviors of structures in real time. These data are crucial not only for post-test seismic performance analysis but also for assessing the current state of structures to ensure the safety and success of subsequent tests [4]. However, installing numerous contact sensors is cumbersome, time-consuming, and expensive. Additionally, the reliance on wired networks for signal transmission increases the weight of the measured structures, potentially impacting their physical characteristics. Therefore, vision-based measurement methods are increasingly accepted in structural displacement measurement due to their flexibility and accuracy compared to contact sensors [5,6].
Ri et al. [7] presented a framework utilizing drone cameras for high-precision displacement measurement, achieving sub-millimeter accuracy. This method combines phase-based sampling moiré techniques with four-degrees-of-freedom geometric modeling to accurately distinguish bridge displacements from those induced by camera motion. Wen et al. [8] introduced the deep optical flow method RAFT-GOCor for extracting structural displacements and proposed Bayesian RAFT-GOCor, incorporating Monte Carlo dropout, to analyze displacement measurement uncertainty. Weng et al. [9] proposed a complementary approach that integrates phase-based optical flow with template matching, leveraging optical flow to capture high-frequency structural movement components and reducing drift error through template-matching measurements. Their results demonstrate sub-pixel accuracy for large-scale, drift-free displacement measurement and robustness against background clutter. Shi et al. [10] developed a videogrammetric method for monitoring the 3D deformation of critical internal nodes in large-scale suspended dome structures, even under camera instability. Experimental validation confirmed the method’s effectiveness, improving the positioning accuracy for internal nodes from 47.24 mm to 1.62 mm following dynamic parameter correction. Zhang et al. [11] proposed a non-overlapping dual camera measurement model with the aid of a global navigation satellite system (GNSS) to sense the 3D displacements of high-rise structures. A simulation and experiment demonstrate the feasibility and correctness of the proposed method.
However, to the best of our knowledge, there are few reports on vision techniques that can measure high-frequency 3D dynamic responses of large-scale structures in real time or near real time during seismic excitation on shaking tables. Due to the high cost and complexity of large-scale shaking tables, it is generally necessary to intermittently load diverse types and magnitudes of seismic waves onto the structure within a limited period. Long standby times for such large apparatuses pose risks, as they must operate under safe conditions [12]. If structural displacement response data are not promptly acquired after each seismic wave and only simple visual inspection is relied upon, it is impossible to accurately evaluate the current state of the structure, which complicates the safe and successful conduct of subsequent experiments. This issue is particularly critical for structures incorporating RCBs with dissipative seismic devices, as it is essential to study not only the seismic performance of these components under various seismic events but also their performance after a replacement following damage [13]. Evaluating the status of the structures through commonly used simple visual inspection increases the probability of experiment failure.
The efficient and accurate calibration of the camera parameters with large fields of view (FOVs) is crucial for enhancing measurement efficiency in large shaking table tests. These tests usually require stereo cameras to create multiple FOVs encompassing the entire structural multi-façade [14]. In vision-based structural health monitoring (SHM), the extrinsic calibration of cameras with large FOVs is typically achieved using artificial markers uniformly distributed on the measured structure or background. These markers can be categorized into coded and non-coded markers [15]. Coded markers are designed to ensure uniqueness in appearance, which primarily reduces the time-consuming process of target correspondence in different images [16,17]. However, using coded targets introduces a decoding process that can fail under poor image conditions, occlusions, and significant perspective variations [18]. In complex scenarios with cameras having wide baselines and limited target sizes, non-coded markers (e.g., circular targets and spheres) are preferred due to their scale invariance and robustness to image distortions, occlusions, and significant perspective variations. Zhang et al. [19] devised a stereoscopic calibration object comprising multiple high-precision sphere targets and proposed a separate parameter calibration method for cameras with large FOVs. Tong et al. [20] used non-coded circular targets to calibrate a stereo camera and measure the deformation of laminated rubber bearings on a shaking table. However, these methods typically require manual recognition and selection to establish correspondences among non-coded targets between different views and across 2D and 3D space (measured by other 3D sensors). Although non-coded circular targets are widely used in SHM [21], there remains a lack of reports about automatically establishing correspondences for numerous similar and randomly distributed non-coded circular targets. 
Even state-of-the-art commercial videogrammetric software, such as PhotoModeler (Version 2024), has yet to provide procedures to automate the determination of these correspondences. Excessive manual intervention can substantially reduce computational efficiency, particularly during the calibration of multiple cameras.
Besides camera calibration, target tracking and time-sequence 3D coordinate reconstruction are crucial for enhancing measurement efficiency. Many high-speed tracking methods have been proposed to meet specific measurement requirements [22]. A rapid sub-pixel tracking method based on normalized cross-correlation (NCC) is used in this paper to extract time-sequence 2D coordinates of the dynamic structure from image sequences [23]. Separated bundle adjustment (BA) involves conducting 3D reconstruction for interest points in partial areas of the structure (the FOV of a pair of cameras in a multi-camera system) and then combining the 3D measurements of the entire structure through physical stitching [24,25]. However, these methods overlook the motion consistency of the entire structure and the potential improvement in measurement accuracy from points in the overlapping regions between each stereo FOV [26]. Global BA is more accurate and suitable, as it can integrally reconstruct the time-sequence 3D coordinates of all interest points on the structure [27]. However, conventional global BA generally only supports optimization on a single computing device, posing challenges to the performance of that device due to the massive number of parameters requiring simultaneous optimization [28]. Some reports presented high-performance computing (HPC) clusters for the real-time calculation of structural dynamic parameters [29]. However, HPC clusters are extremely expensive to purchase and maintain, which can significantly increase experimental costs. Majchrowicz et al. [30] proposed a new parallel approach for 3D ECT image reconstruction based on multi-GPU, multi-node algorithms in a heterogeneous distributed system. The application of the framework with a new network communication layer reduced data transfer times significantly and improved the overall system efficiency. Xu et al. [31] presented an innovative image-based 3D reconstruction pipeline for the precise and efficient geometry measurement of bridge structures. By decomposing the large-scale reconstruction task into distributed sub-models using sub-image sets, the method significantly improves computational efficiency without compromising accuracy compared to conventional approaches. These studies illustrate that distributed systems and computational strategies can effectively boost efficiency without requiring HPC resources. However, the currently proposed distributed strategies and systems are unsuitable for real-time shaking table test processing.
This paper presents a distributed high-speed videogrammetric method with rapid calibration to efficiently and accurately measure the dynamic response of a large shaking table structure. A seven-story structure with RCBs equipped with hybrid devices was secured onto a shaking table, employing a videogrammetric network constructed by six high-speed cameras to monitor structural 3D displacement responses in real time. The primary contributions of this paper are as follows:
(1)
An efficient dynamic measurement method for large shaking table structures based on distributed high-speed videogrammetry was developed, which can obtain the high-frequency 3D displacement responses of all the interest points of large structures in real time without necessitating an HPC cluster. The efficiency and accuracy of the proposed method were verified through a shaking table test of a large structure with RCBs.
(2)
A fast calibration method for multiple cameras with large FOVs was proposed, which automatically and accurately estimates extrinsic parameters using non-coded circular targets fixed on the structural interest points. This method eliminates the need for artificial visual assistance to determine the target correspondence across different camera views and 2D-3D space.
(3)
A distributed computation and reconstruction strategy based on the Alternating Direction Method of Multipliers (ADMM) was proposed to fully exploit the computing resources of the conventional high-speed videogrammetric network. This strategy circumvents the time-consuming transmission of large-volume image data and achieves global reconstruction across different computing devices without compromising measurement precision.

2. Methodology

Figure 1 illustrates the framework of the proposed method for the real-time, high-frequency 3D displacement measurement of a large-scale structure on a shaking table. The method consists of four key components: (1) the establishment of a distributed videogrammetric network that integrates a synchronized multi-camera system, high-speed camera workstations, and a distributed measurement network; (2) the rapid calibration of the multi-camera system with large fields of view (FOVs) through non-coded circular targets affixed to key structural points; (3) the high-speed tracking of these circular targets using an efficient NCC method; and (4) the implementation of a distributed computation and reconstruction strategy, including a distributed computation strategy and the ADMM-based distributed reconstruction method.

2.1. Construction of Distributed Videogrammetric Network

Multiple high-speed cameras are essential for the 3D dynamic measurement of large-scale shaking table structures with high sampling frequencies. Since high-speed cameras can capture a significant amount of image data quickly, they typically require the following equipment to ensure proper functionality [32,33]: (1) a high-speed image capture card to prevent image loss; (2) a large-volume data storage device for lossless image storage; and (3) a multicore CPU for the real-time operation and control of the camera. Therefore, a corresponding minicomputer (workstation) with sufficient computational power was necessary to integrate these functions.
In pursuit of efficient 3D dynamic measurement for large-scale shaking table structures, this study established a general distributed videogrammetric network, illustrated in Figure 2. The network comprises the following components: (1) Multiple camera system: multiple high-speed cameras are dispersed throughout the network, each equipped with a minicomputer (workstation). (2) Synchronous control module: each camera is connected to the synchronous controller via a CamLink cable, ensuring synchronized image capture. (3) Principal workstation: this workstation oversees the synchronous control module and the multi-camera network, managing camera parameter adjustments, as well as data acquisition, transmission, and storage. (4) Distributed calculation module: this connects the primary and subordinate workstations with network cables and switches, facilitating signal and data transmission. OpenMPI v4.1 was employed for inter-device communication.

2.2. Fast Calibration of Multi-Camera System

2.2.1. Stereo Correspondence of Circular Targets

Since points of interest on the shaking table structure often lack distinctive texture, artificial markers are typically affixed to key points of the structure to enhance the accuracy of dynamic measurements [34]. These markers can serve both as tracking points for capturing the dynamic behavior of the structure and as control points for determining the position and orientation of the cameras.
The non-coded targets used in this study were simple white circular markers on a black background, as shown in Figure 3. A reflective sheet was affixed to the center of each marker to assist the auxiliary measuring equipment in acquiring the 3D coordinates of the target’s center. In practical testing, various complex environments hinder the precise identification of circular targets. For example, low-light indoor environments reduce the contrast of circular marks, overexposure in outdoor scenes obscures certain target outlines, and numerous circular-like interferences increase the likelihood of false detections. This paper employed CMNet to accurately recognize and locate the centers of circular targets [35]. Specifically, circular marks were first detected using an improved YOLOv4 model to restrict the search region of the circular contour. Next, the BASNet saliency object detection model extracted the contours of these circular marks. Finally, least squares fitting was applied to calculate the central pixel coordinates of the identified contours on the saliency map. The method was trained on a large dataset of circular mark images captured in diverse environments, achieving a high recognition rate and detection accuracy within a relatively short processing time.
To automatically establish stereo correspondences for numerous similar circular targets across different camera views without manual intervention, we proposed a stereo-matching method for circular targets based on affine invariance and epipolar constraints, as illustrated in Figure 3. In the illustration, $s_1$ and $s_2$ denote the optical centers of a camera pair; $p_1$ and $p_2$ signify the center points of the detected circular targets in the two views; $e_1$ and $e_2$ denote the epipoles; $l_1$ and $l_2$ are the epipolar lines; and the red points represent the matched feature points of the two images.
An affine map offers a local approximation of the apparent deformations induced by changes in camera viewpoints. This capability proves valuable in addressing the matching challenges posed by cameras with a wide baseline.
Any affine map A is uniquely decomposed as Equation (1).
$$A = \lambda R_1(\psi)\, T_t\, R_2(\phi) = \lambda \begin{bmatrix} \cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{bmatrix} \begin{bmatrix} t & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \quad (1)$$
where $\lambda > 0$, $t > 0$, $\psi \in [0, 2\pi]$, and $\phi \in [0, \pi]$; $R_1$ and $R_2$ represent rotations; $T_t = \mathrm{diag}(t, 1)$ represents the tilt, which scales the image in one direction; $\psi$ parameterizes the camera spin; $\lambda$ corresponds to the camera zoom; and the longitude $\phi$ and latitude $\theta = \arccos(1/t)$ parameterize the camera's tilt [36].
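As an illustrative check of this decomposition, the zoom $\lambda$ and tilt $t$ of a 2D affine map can be recovered from its singular values, since the singular values of $A$ in Equation (1) are $\lambda t$ and $\lambda$. The following is a minimal NumPy sketch, not the paper's implementation; all parameter values are hypothetical:

```python
import numpy as np

def make_affine(lam, psi, t, phi):
    """Compose A = lam * R(psi) @ diag(t, 1) @ R(phi), as in Equation (1)."""
    R = lambda a: np.array([[np.cos(a), -np.sin(a)],
                            [np.sin(a),  np.cos(a)]])
    return lam * R(psi) @ np.diag([t, 1.0]) @ R(phi)

def decompose_affine(A):
    """Recover (lam, t): the singular values of A are lam*t and lam."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[1], s[0] / s[1]   # zoom lam, tilt t >= 1

# Hypothetical parameters for a synthetic viewpoint change
A = make_affine(lam=0.8, psi=0.3, t=2.0, phi=1.1)
lam, t = decompose_affine(A)
theta = np.arccos(1.0 / t)     # tilt latitude
```

This singular-value view is what makes the tilt parameter observable independently of the spin and longitude.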
The conventional method implements affine matching for different views by simulating images with a range of optical tilts and employing brute force matching. To reduce the processing time, a near-optimal discrete set of affine transformations strategy was used, minimizing the number of simulation angles for the camera axis.
From Equation (1), two classes $[A]$ and $[B]$ in the space of tilts are equal if $T_{t(A)} R_{\phi(A)} = T_{t(B)} R_{\phi(B)}$. Thus, the space of tilts can be parametrized by picking a representative affine map for each class:
$$\Omega = \{[\mathrm{Id}]\} \cup \{\, [T_t R_\phi] \mid (t, \phi) \in (1, \infty) \times [0, \pi) \,\} \quad (2)$$
Here, $\mathrm{Id}$ denotes the identity map, and $d$ is a metric acting on the space of tilts that measures the affine distortion from a fixed affine viewpoint to the surrounding affine viewpoints:
$$d : \Omega \times \Omega \to \mathbb{R}_+, \quad ([A], [B]) \mapsto \log\!\big(t(B A^{-1})\big) \quad (3)$$
The root scale-invariant feature transform (Root-SIFT) was selected as the descriptor for matching image pairs due to its improved robustness to viewpoint changes compared to the SIFT [37]. This improved process involved taking the square root of a SIFT descriptor after normalization. A state-of-the-art estimator was then utilized to recover the fundamental matrix from these matched feature points [38]. Furthermore, the epipolar constraint was utilized to narrow the search area for the circular target from different views, and affine template matching was applied to establish the correspondence of circular targets along the polar line, further reducing mismatches [39].
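The Root-SIFT transformation itself is simple: each descriptor is L1-normalized and then square-rooted, so that Euclidean distance on the result corresponds to the Hellinger kernel on the original descriptors. A minimal NumPy sketch follows; the descriptor values are random placeholders, not real SIFT output:

```python
import numpy as np

def root_sift(desc, eps=1e-7):
    """Root-SIFT: L1-normalize each SIFT descriptor (rows of an N x 128
    array), then take the element-wise square root."""
    desc = desc / (np.abs(desc).sum(axis=1, keepdims=True) + eps)
    return np.sqrt(desc)

# Placeholder descriptors (N x 128), standing in for real SIFT output
rng = np.random.default_rng(0)
d = rng.random((4, 128)).astype(np.float32)
r = root_sift(d)
```

Each transformed descriptor has unit L2 norm, so standard nearest-neighbor matching code needs no other changes.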

2.2.2. Cross-Modal Correspondence of Circular Targets

Auxiliary devices, such as the total station, were employed to measure the 3D coordinates of the targets fixed on the key points of the structure. These data serve as control and check points to ensure consistency between videogrammetric and structural measurement coordinate systems. However, a cross-modal discrepancy arose between the 3D coordinates measured by the total station and the 2D coordinates of the targets. The conventional method for establishing this correspondence relies on manual visual selection, as exemplified by commercial photogrammetry software like PhotoModeler 2024. However, numerous correspondences are required for large-scale structure measurement when utilizing multiple cameras. The manual correspondence procedure may increase the likelihood of obtaining erroneous correspondences and is time-consuming.
In this study, 2D and 3D correspondences of circular targets were established in 3D space through multi-view geometry. The specific process was as follows: assuming that $x_i \leftrightarrow x_i'$ is a set of corresponding circular targets between different images, with more than five correspondences, the fundamental matrix $F$ was estimated from these matched circular targets [40]. The intrinsic parameter matrix $K$ was obtained in advance using Zhang's calibration method. The distance aspect ratio between the reconstructed space and the real 3D space was consistent.
The essential matrix can be calculated as
$$E = K^T F K \quad (4)$$
The $3 \times 3$ essential matrix has two equal singular values and one zero singular value. Suppose the first camera normalization matrix is $P = [I \mid 0]$ and the singular value decomposition (SVD) of the essential matrix is $E = U\, \mathrm{diag}(1, 1, 0)\, V^T$. The second camera normalization matrix $P'$ then has four possible choices:
$$P' = [UWV^T \mid +u_3], \quad [UWV^T \mid -u_3], \quad [UW^TV^T \mid +u_3], \quad [UW^TV^T \mid -u_3] \quad (5)$$
where the translation vector $t = \pm u_3 = U(0, 0, 1)^T$, the rotation matrix $R = UW^TV^T$ or $UWV^T$, and
$$W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad (6)$$
To determine the correct solution for $P'$ among the four possibilities, one can triangulate a single world point with each of the four candidate extrinsic matrices and choose the one for which the reconstructed point lies in front of both cameras. The center points of the circular targets can then be metrically reconstructed using the selected camera parameters $P'$.
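A minimal NumPy sketch of this four-solution decomposition follows. The synthetic pose values are illustrative, not from the experiment; the check simply confirms that the true rotation and (sign-ambiguous) translation direction appear among the candidates:

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x."""
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]], float)

def decompose_essential(E):
    """Return the four candidate (R, t) pairs from an essential matrix.
    Cheirality (points in front of both cameras) selects the physical one."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
    u3 = U[:, 2]
    cands = []
    for R in (U @ W @ Vt, U @ W.T @ Vt):
        if np.linalg.det(R) < 0:   # E is only defined up to sign
            R = -R
        cands.append((R, u3))
        cands.append((R, -u3))
    return cands

# Synthetic check: rebuild E = [t]_x R from a known pose (illustrative values)
c, s = np.cos(0.2), np.sin(0.2)
R_true = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t_true = np.array([1.0, 0.0, 0.0])
E = skew(t_true) @ R_true
candidates = decompose_essential(E)
```
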
Then, we can achieve cross-modal correspondence by establishing the correspondence between the metrically reconstructed circular targets and the circular targets measured by the auxiliary sensor (e.g., total station) in 3D space ($P_i^1 \leftrightarrow P_i^2$). Since the circular targets in both point sets have a consistent quantity and scale, we can identify four pairs of potential corresponding points by selecting the maximum and minimum spatial distances between points in each set. The two 3D point sets are related by a rigid transformation:
$$P_i^2 = R P_i^1 + T = T_{3 \times 4} P_i^1 \quad (7)$$
where $T_{3 \times 4}$ represents the combined $3 \times 4$ transformation matrix, which includes both the rotation and translation, mapping $P_i^1$ (in homogeneous coordinates) directly to $P_i^2$.
Thus, we were able to obtain four potential aligning rigid transformations. The 4PCS algorithm could then be used to determine the optimal rigid transformation, thereby achieving registration between the two 3D point sets [41]. Once the correspondences between the circular targets in 2D and 3D were established, the PnP algorithm was able to be used to obtain the external parameters of the cameras in the world coordinate system.
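Once the point-to-point correspondences are fixed, the rigid transformation above can be estimated in closed form. The sketch below uses the standard SVD-based (Kabsch) least-squares fit, which underlies the alignment step once 4PCS has selected correspondences; the points and pose are synthetic, for illustration only:

```python
import numpy as np

def rigid_transform(P1, P2):
    """Least-squares rigid fit (Kabsch): find R, T with P2 = R @ P1 + T
    for paired 3D point sets given as 3 x N arrays."""
    c1 = P1.mean(axis=1, keepdims=True)
    c2 = P2.mean(axis=1, keepdims=True)
    H = (P1 - c1) @ (P2 - c2).T
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:           # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    T = c2 - R @ c1
    return R, T

# Synthetic check: recover a known rotation Q and translation T0
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1
P1 = rng.standard_normal((3, 10))
T0 = np.array([[1.0], [2.0], [3.0]])
P2 = Q @ P1 + T0
R, T = rigid_transform(P1, P2)
```
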

2.3. Sub-Pixel Tracking of Circular Targets

Normalized cross-correlation is a robust measure for assessing the similarity between corresponding object areas in image sequences. Since the shaking table test is conducted indoors, the structure benefits from stable and consistent lighting provided by professional indoor equipment. Additionally, the non-coded targets remain unobstructed during movement, allowing for clear imaging throughout the sequence. Thus, this study employed a fast NCC method for target tracking [23]. Using pre-calculated sum tables for the numerator and denominator in the NCC definition eliminated redundancy in repeated calculations. Since NCC only provides integer-pixel accuracy, a sub-pixel localization algorithm was used to enhance the tracking precision. This algorithm involves extracting a $3 \times 3$ region around the matched point to fit a 2D quadratic surface, whose extremum is considered the sub-pixel position $c_1$ of the circular target [42]. Although using an area larger than $3 \times 3$ could result in a more accurate estimation, it also increases the computational complexity and time. The equation for the quadratic surface is shown in Equation (8), and the cross-correlation coefficient matrix $M$ is described in Equation (9).
$$f(x, y) = a_0 x^2 + a_1 y^2 + a_2 x y + a_3 x + a_4 y + a_5 \quad (8)$$
$$M = \begin{bmatrix} m_1 & m_2 & m_3 \\ m_4 & m_5 & m_6 \\ m_7 & m_8 & m_9 \end{bmatrix} \quad (9)$$
The coefficients $a_i$ ($i = 0, 1, \ldots, 5$) can be estimated by substituting Equation (9) into Equation (8), and $c_1$ can be calculated using the following equations:
$$x = \frac{2 a_1 a_3 - a_2 a_4}{a_2^2 - 4 a_0 a_1}, \qquad y = \frac{2 a_0 a_4 - a_2 a_3}{a_2^2 - 4 a_0 a_1} \quad (10)$$
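A minimal NumPy sketch of this sub-pixel refinement follows. The correlation patch below is a synthetic quadratic with a known peak, for illustration only:

```python
import numpy as np

def subpixel_peak(M):
    """Fit f(x, y) = a0*x^2 + a1*y^2 + a2*x*y + a3*x + a4*y + a5 to a
    3x3 correlation patch M (Equations (8) and (9)) by least squares,
    then return the extremum offset from the centre pixel."""
    ys, xs = np.mgrid[-1:2, -1:2]
    x, y = xs.ravel(), ys.ravel()
    A = np.column_stack([x * x, y * y, x * y, x, y, np.ones(9)])
    a0, a1, a2, a3, a4, _ = np.linalg.lstsq(A, M.ravel(), rcond=None)[0]
    den = a2 * a2 - 4.0 * a0 * a1
    return (2 * a1 * a3 - a2 * a4) / den, (2 * a0 * a4 - a2 * a3) / den

# Synthetic patch whose true peak sits at (0.3, -0.2) from the centre
yy, xx = np.mgrid[-1:2, -1:2]
M = -(xx - 0.3) ** 2 - (yy + 0.2) ** 2
dx, dy = subpixel_peak(M)
```

Because the synthetic patch is exactly quadratic, the least-squares fit recovers the peak offset exactly; with real NCC patches the fit is an approximation.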

2.4. Distributed Computation and Reconstruction

2.4.1. Distributed Computation Strategy

The videogrammetry system recorded dynamic information of structures in image sequences, which were stored at individual workstations. Global reconstruction of the 3D displacement response of the entire structure required centralizing images from different workstations. Traditionally, this involves transmitting large images to a single workstation, hindering the real-time or near-real-time 3D dynamic response measurement due to the extensive transmission time. Ngeljaratan et al. [43] explored image compression techniques, such as non-adaptive linear interpolation and wavelet transform algorithms, to facilitate image storage and data transfer. Although this method reduces the data volume while preserving structural motion information, it requires additional image pre-processing time.
The proposed distributed computation and reconstruction strategy, illustrated in Figure 4, harnesses the computational capabilities of each subordinate workstation within the videogrammetric network to track circular targets in parallel. In this case, only text data need to be transmitted, so the data transmission time is minimal. The massive set of parameters is optimized globally across the computing devices and solved simultaneously using the ADMM. This computation strategy eliminates the need for extensive image data transmission within the local network, facilitating the rapid measurement of the structural 3D dynamic response.

2.4.2. Distributed 3D Reconstruction Based on ADMM

After acquiring the initial intrinsic and extrinsic parameters of the multiple cameras and the sequential image coordinates of all structural points of interest, BA is commonly used to obtain accurate time-sequence 3D coordinates. Bundle adjustment minimizes the reprojection error between the 2D locations of observed and predicted image points, where the predicted points are reprojected from 3D points using the camera parameters [44]. The BA problem can be described by Equation (11).
$$f(Cam, X) = \sum_{j=1}^{n} \sum_{i=1}^{m} \left\| u_{ij} - \pi(Cam_i, X_j) \right\|^2 \quad (11)$$
where $u_{ij}$ is the 2D observation of the 3D point $X_j$ in the $i$-th image; $\pi(Cam_i, X_j)$ is the nonlinear operation that reprojects the 3D point $X_j$ into pixel coordinates using the estimated image parameters $Cam_i$; and $Cam_i$ includes the extrinsic and intrinsic parameters.
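A minimal sketch of one residual term inside Equation (11) follows (NumPy; the intrinsic matrix and test point are hypothetical values, not the experiment's calibration):

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole reprojection pi(Cam_i, X_j): world point -> pixel (u, v)."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

def reproj_error(K, R, t, X, uv):
    """One squared residual term of the BA objective, Equation (11)."""
    return float(np.sum((uv - project(K, R, t, X)) ** 2))

# Hypothetical camera: 800 px focal length, principal point (320, 240)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
err = reproj_error(K, R, t, np.array([0.0, 0.0, 5.0]), np.array([320.0, 240.0]))
```

BA stacks such residuals over all cameras and points and minimizes them jointly over $Cam$ and $X$.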
As shown in Equation (11), conventional global BA supports only the integral computation and optimization of all parameters. This poses a challenge to the computational efficiency of a single computing device, owing to the simultaneous optimization of a substantial number of parameters, when a high-performance computing device is unavailable.
Thus, considering the multiple computing devices in the multi-camera measurement network and the consensus characteristic of the ADMM, the global BA was divided into multiple sub-blocks using the ADMM to achieve distributed computation [45]. When solved with the ADMM, the consensus problem takes the general form of Equation (12).
$$\min \sum_{i=1}^{k} f_i(x_i) \quad \text{subject to} \quad x_i = z, \; i = 1, \ldots, k \quad (12)$$
The parameter space is divided into $k$ blocks. The parameters $x_i$ of each sub-objective function $f_i(x_i)$ have different dimensions and are called local variables. Mapping each part of $x_i$ to the corresponding part $z_i$ of the global variable $z$, the ADMM iterations can be derived as
$$x_i^{t+1} = \arg\min_{x_i} \left( f_i(x_i) + \frac{\rho}{2} \left\| x_i - (z^t - y_i^t) \right\|_2^2 \right) \quad (13)$$
$$z^{t+1} = \frac{1}{k} \sum_{i=1}^{k} x_i^{t+1} \quad (14)$$
$$y_i^{t+1} = y_i^t + x_i^{t+1} - z^{t+1} \quad (15)$$
where $\rho > 0$ is the penalty parameter, and Equation (14) averages the optimization results of the different blocks.
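The consensus iterations above can be illustrated on a toy problem (a NumPy sketch with made-up values, not the BA objective): two workers each hold a scalar quadratic, and the consensus variable converges to the minimizer of their sum, i.e., the mean of the two targets:

```python
import numpy as np

# Toy consensus ADMM: worker i minimizes f_i(x) = 0.5 * (x - b[i])**2
# subject to x_i = z. The consensus optimum is z* = mean(b) = 2.0.
b = np.array([1.0, 3.0])   # illustrative per-worker targets
rho = 1.0                  # penalty parameter
x = np.zeros(2)            # local variables
y = np.zeros(2)            # scaled dual variables
z = 0.0                    # global consensus variable
for _ in range(100):
    # x-update: argmin_x f_i(x) + rho/2 * (x - (z - y_i))^2, in closed form
    x = (b + rho * (z - y)) / (1.0 + rho)
    # z-update: average the local solutions
    z = x.mean()
    # dual update
    y = y + x - z
```

In the distributed BA, the same three-step pattern is applied per sub-block, with the per-block nonlinear solve replacing the closed-form x-update.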
The global BA problem is separated into multiple blocks, as shown in Figure 4 (left side). Supposing the number of sub-blocks is $K$, the camera parameters $Cam_i^k$ and the corresponding 3D points $X_j^k$ are estimated in each block, and Equation (12) can be written as
$$\min \sum_{k=1}^{K} f(Cam^k, X^k) \quad \text{subject to} \quad Cam_i^k = Cam_i, \;\; X_j^k = X_j, \quad i = 1, \ldots, n, \; j = 1, \ldots, m, \; k = 1, \ldots, K \quad (16)$$
Leveraging the ADMM algorithm, the iterations are given by
$$(Cam^k, X^k)^{t+1} = \arg\min_{Cam^k, X^k} \left( f(Cam^k, X^k) + \frac{\rho}{2} \left\| X^k - \big( X^t - \tilde{X}^{k(t)} \big) \right\|_2^2 \right), \quad \forall k \quad (17)$$
$$X_j^{t+1} = \frac{1}{K} \sum_{k=1}^{K} (X_j^k)^{t+1}, \quad \forall j \quad (18)$$
$$(\tilde{X}_j^k)^{t+1} = (\tilde{X}_j^k)^{t} + (X_j^k)^{t+1} - X_j^{t+1}, \quad \forall j, k \quad (19)$$
where $\tilde{X}_j^k$ is the scaled dual variable that drives the iterative process of Equation (17). The Schur complement was used to increase the computational efficiency: it simplifies the block diagonal matrix, allowing the subsequent optimization of the 3D points. During the optimization process, the Huber loss function was applied. The initial values of the 3D points could be calculated from the initial camera parameters via triangulation, allowing Equation (16) to be easily distributed across multiple processors and solved using the Gauss–Newton method. In this study, the number of sub-blocks equaled the number of CPU cores, and the termination tolerance for the reduction in the reprojection error between iterations was set to 1 × 10−5.
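The Schur complement step can be sketched as follows (NumPy, with a small synthetic symmetric positive-definite system standing in for real BA normal equations, where the point block is block-diagonal and therefore cheap to invert):

```python
import numpy as np

def schur_solve(B, E, C, v, w):
    """Solve [[B, E], [E^T, C]] [dc; dx] = [v; w] by eliminating the
    point block C first (the Schur complement trick used in BA):
      (B - E C^-1 E^T) dc = v - E C^-1 w,   dx = C^-1 (w - E^T dc)."""
    Cinv = np.linalg.inv(C)        # block-diagonal in real BA: cheap
    S = B - E @ Cinv @ E.T         # reduced camera system
    dc = np.linalg.solve(S, v - E @ Cinv @ w)
    dx = Cinv @ (w - E.T @ dc)
    return dc, dx

# Synthetic SPD system (illustrative sizes: 2 camera + 3 point unknowns)
rng = np.random.default_rng(1)
Mrand = rng.standard_normal((5, 5))
H = Mrand @ Mrand.T + 5.0 * np.eye(5)   # SPD => both blocks invertible
B, E, C = H[:2, :2], H[:2, 2:], H[2:, 2:]
v, w = rng.standard_normal(2), rng.standard_normal(3)
dc, dx = schur_solve(B, E, C, v, w)
```

Because the point block dominates the parameter count in BA, reducing to the camera system before solving is what keeps each Gauss–Newton step tractable per sub-block.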
Based on the reconstructed high-frequency time-sequence 3D coordinates of the tracked targets, the dynamic response and properties of the structural areas of interest can be readily acquired through numerical analysis and calculation [46]. These can be converted into physical response indicators for safety assessments and seismic performance research.

3. Experimental Results and Analysis

A large shaking table test on a structure with RCBs was conducted to verify the accuracy and efficiency of the proposed method. First, details regarding the structural model, ground motion, and measurement setup are presented. Next, the accuracy and efficiency of the proposed method are verified and analyzed by a comparison with various approaches, including total station measurements, contact sensors, and conventional methods. Finally, the 3D dynamic response measurements under different seismic wave loads are presented and analyzed.

3.1. Structure Model and Experiment Setup

3.1.1. Structure Model and Ground Motion

The seven-story structural model with RCBs is depicted in Figure 5a and has dimensions of 1.8 m in length, 1.3 m in width, and 6.3 m in height. Details of the used shaking table are shown in Table 1.
Figure 5a also details the non-replaceable parts of the RCB, which connect to the hybrid device through end-plate bolted connections, facilitating easy removal and replacement after major earthquakes. Three pairs of ground motion records and six experiments were selected from the shaking table test, as shown in Table 2. These records include: (1) Shanghai—an artificial seismic record from the appendix of the Shanghai Code for Seismic Design of Buildings 57; (2) El Centro—the 1940 Imperial Valley earthquake recorded at the El Centro station in California; and (3) Northridge—the 1994 Northridge earthquake recorded at the Sylmar Olive View FF station in Sylmar, California. The vertical component of the ground motion records was neglected.

3.1.2. Measurement Setup

Based on the field test environment, a video measurement network was established using six high-speed cameras to ensure that the stereo FOV covered both elevations of the building while minimizing the number of cameras required. The critical parameters of the camera are shown in Table 3. Figure 5b illustrates the camera layout and the corresponding measurement areas. Specifically, Cameras A and B, Cameras B and C, Cameras D and E, and Cameras E and F establish four stereoscopic FOVs, each responsible for the measurement areas [A-B], [B-C], [D-E], and [E-F], respectively.
The videogrammetric network comprised six workstations (standard computers): five subordinate workstations, each equipped with quad-core 3.6 GHz CPUs and 16 GB RAM, and one primary workstation equipped with a six-core 3.2 GHz CPU, 16 GB RAM, and a GeForce GTX 1650. CPU and GPU algorithms were implemented in C++ and CUDA, with inter-workstation communications handled via OpenMPI v4.1.
Non-coded circular targets were affixed to the structural points of interest, as illustrated in Figure 5c. Blue numerals B_i indicate measurement points on the structural coupling beams, while red numerals R_i denote measurement points on the shear wall. A displacement meter in the X direction was installed at point R_3, and an accelerometer in the Y direction was installed at point R_18. The local spatial coordinate system was established using high-precision total stations, aligning the X and Y axes with the shaking table load directions and completing the right-handed coordinate system with the Z axis. The 3D coordinates of all measurement points were determined using the total station before and after each shaking table load, while the structure remained stationary.

3.2. Accuracy Verification and Analysis

3.2.1. Accuracy of Calibration

Twenty-six circular targets were selected as checkpoints to assess the accuracy of the extrinsic calibration method. The extrinsic parameters of the six cameras were estimated from the remaining circular targets using the proposed method, with global BA employed to compute the 3D coordinates of the checkpoints. Figure 6 shows the checkpoint measurement errors between the videogrammetry and the total station in the X, Y, and Z directions. The root mean square error (RMSE) of the 26 checkpoints in the X, Y, and Z directions was 0.58 mm, 0.60 mm, and 0.57 mm, respectively. The overall 3D positioning error of the checkpoints, calculated as the root sum of squares of the RMSE values in the three directions, was 1.02 mm. These results demonstrate the accuracy of the proposed calibration method.
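The error statistics above combine as the root sum of squares of the per-axis RMSEs; a minimal sketch (hypothetical helper names, using the per-axis values reported above):

```python
import numpy as np

def rmse(errors):
    """Root mean square of a 1D sequence of signed checkpoint errors."""
    errors = np.asarray(errors, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))

def positioning_error_3d(rmse_x, rmse_y, rmse_z):
    """Combine per-axis RMSEs into a single 3D positioning error."""
    return float(np.sqrt(rmse_x ** 2 + rmse_y ** 2 + rmse_z ** 2))

# with the reported per-axis RMSE values (mm):
err_3d = positioning_error_3d(0.58, 0.60, 0.57)
```

With the rounded per-axis values, the combined error evaluates to about 1.01 mm; the reported 1.02 mm follows from the unrounded checkpoint errors.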

3.2.2. Accuracy of Distributed Computation and Reconstruction

The sub-pixel tracking method was employed to track all circular targets during the six ground motions. The time sequence 3D coordinates of all points of interest were reconstructed using separated BA (BA performed separately for each measurement area), global BA (BA integrated for all measurement areas on one computing device), and the proposed method (BA integrated for all measurement areas across different computing devices). Since the total station measured the 3D coordinates of the checkpoints after each seismic wave load when the structure was stationary, the 3D positioning errors of the checkpoints of different methods after each seismic wave load were calculated and are depicted in Figure 7.
The overall checkpoint positioning error of the separated BA, the centralized global BA, and the proposed method was determined by averaging the positioning errors after each seismic wave load, yielding 1.67 mm (X direction: 0.66 mm; Y direction: 0.83 mm; Z direction: 0.74 mm), 1.04 mm (X direction: 0.54 mm; Y direction: 0.63 mm; Z direction: 0.59 mm), and 1.01 mm (X direction: 0.57 mm; Y direction: 0.64 mm; Z direction: 0.52 mm), respectively. The separated BA shows a relatively high overall positioning error because it neglects the motion consistency of the entire structure and forgoes the accuracy improvement afforded by control points in the overlapping regions between stereo FOVs. The overall positioning error of the proposed method is comparable to that of the conventional global BA, demonstrating its effectiveness and precision in time-sequence 3D reconstruction.
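The time-sequence reconstruction in all three variants starts from triangulated initial 3D points. A minimal linear (DLT) triangulation of one point from a stereo pair might look like the following sketch (synthetic projection matrices and point; a simplified illustration, not the multi-camera BA itself):

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from its pixel
    coordinates x1, x2 in two views with 3x4 projection matrices."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                     # null vector = homogeneous 3D point
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 projection matrix to pixels."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# synthetic stereo pair: identical intrinsics, 0.5 m baseline along X
K = np.array([[1500.0, 0.0, 640.0],
              [0.0, 1500.0, 512.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
X_true = np.array([0.3, -0.2, 5.0])
X_hat = triangulate_dlt(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noiseless projections the DLT solution recovers the point exactly; in practice the triangulated point only initializes the subsequent bundle adjustment.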

3.2.3. Accuracy of Displacement and Acceleration Responses

Figure 8 compares the 3D displacement and acceleration response histories obtained from the contact sensor and videogrammetry under different seismic excitations. The RMSEs of displacement measurements in the X direction between the videogrammetry and the displacement sensor during experiments 1, 3, and 5 were 0.72 mm, 0.95 mm, and 1.22 mm, respectively. The range of values and movement trends for the acceleration response curves acquired by the accelerometer closely align with those from videogrammetric measurements. These results further demonstrate the efficacy of the proposed videogrammetric method for measuring the 3D dynamic response of the structure.

3.3. Efficiency Verification and Analysis

3.3.1. Efficiency of Fast Calibration

The initial extrinsic parameters of a camera can be calculated from at least four 3D points in the world and their corresponding 2D projections in the image (2D-3D correspondences). Ideally, the more control points available within the camera FOV, the higher the calibration precision. For the six-camera system, at least 24 2D-3D correspondences are therefore necessary for extrinsic calibration. Additionally, obtaining 3D measurements of the 52 structural interest points requires establishing image correspondences from at least two camera views for each point, so the time consumed by this procedure was included in the extrinsic calibration step. In the conventional extrinsic calibration method (PhotoModeler 2024), these correspondences are acquired manually. We recruited five testers, recorded the time each took to determine these correspondences for the four camera pairs, and calculated the average time.
Table 4 presents the time consumption of the conventional and proposed extrinsic calibration methods to calibrate the multi-camera system. The runtime for circular target detection and PnP is not counted, as they only take milliseconds. The fast calibration method is approximately sixteen times faster than the conventional method, with the reduction in time becoming increasingly significant as the number of cameras increases. In actual shaking table tests, the camera needs to be recalibrated after each ground motion to avoid possible slight changes in the camera’s extrinsic parameters due to the effects of vibration from a large shaking table on the surrounding environment.

3.3.2. Efficiency of Distributed Computation and Reconstruction

In addition to camera parameter estimation, image transmission, target tracking, and trajectory reconstruction are time-consuming steps in actual videogrammetric measurement tasks. In the proposed computation strategy, the time consumption of data transmission in the local network can be ignored. The runtime of target tracking is inversely proportional to the number of workstations in the primary replica system. The shaking table dataset, comprising 1,747,200 world points and 4,166,400 observations, was obtained from the six ground motion experiments to evaluate the efficiency of time sequence 3D reconstruction. This dataset underwent reconstruction using the sparse global BA (serial solution) [47], centralized global BA (parallel solution) [28], and the proposed distributed method, respectively. Each workstation utilized only CPUs, and the communication time was considered.
As depicted in Figure 9, the time consumed was 1339.8 s for the sparse global BA, 268.1 s for the centralized global BA, and 76.4 s for the distributed BA. The distributed BA was approximately 3.5 times faster than the centralized BA and about 17.5 times faster than the sparse BA. Post-optimization, the mean reprojection errors of the different methods were all around 0.04 pixels, indicating that the three methods converged and produced accurate optimization results in this case. These results demonstrate the effectiveness and efficiency of the proposed distributed computation and reconstruction strategy.
Additionally, we varied the number of workstations to investigate the relationship between the number of CPU cores and the time consumed for dataset reconstruction using the proposed method. Table 5 illustrates that the time consumption for reconstructing the dataset decreases with increased CPU cores while the mean reprojection error remains stable. Although the increase in computational efficiency is not significant when a certain number of CPU cores is reached, the distributed reconstruction strategy fully utilizes the computational power of the different devices of the videogrammetric network. This significantly improves the onsite computation efficiency for the 3D dynamic measurement of large shaking table structures.

3.3.3. Efficiency of Videogrammetric Method

To further demonstrate the efficiency of the proposed videogrammetric method, we compared the time consumed for each procedure in reconstructing the same shaking table dataset with that of PhotoModeler, an advanced commercial videogrammetry software package. PhotoModeler supports 3D reconstruction of time-sequence interest points within only a single stereo field of view. For a fair comparison, PhotoModeler was installed on each workstation, and four professionals simultaneously reconstructed the time-sequence 3D coordinates of the interest points in each measurement area. Since the target tracking procedure (least squares matching) in PhotoModeler is time-consuming, we assumed its runtime to be the same as that of the method employed in this paper. Additionally, only the necessary runtime was included in the time consumed by PhotoModeler. The overall 3D positioning error for the checkpoints measured by PhotoModeler against the total station was approximately 1.61 mm, which is higher than the 1.01 mm obtained using the proposed videogrammetric method.
Figure 10 compares the time consumption between the proposed videogrammetric method and PhotoModeler in reconstructing the shaking table dataset. The intrinsic parameters of the industrial camera are pre-determined and therefore not included in the total time consumption. PhotoModeler took approximately 1128.1 s, while the proposed videogrammetric method took only 161.1 s. This result includes all time-consuming processes except for the preparatory work during camera calibration, which can be ignored as it can be performed before the shaking table work. The duration of the six seismic excitations was approximately 112 s. These results demonstrate the potential for the real-time 3D displacement response measurement of large-scale shaking table structures at 300 Hz, corresponding to the camera capture speed and system sampling frequency. Furthermore, by reducing the sequence image frame rate from 300 to 200 fps through frame extraction, the overall consumed time was reduced to 102.2 s, indicating that the method achieves high-frequency real-time measurement.
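The frame-rate reduction from 300 to 200 fps by frame extraction amounts to keeping two of every three frames; a hypothetical sketch:

```python
def decimate_300_to_200(frame_indices):
    """Reduce a 300 fps sequence to 200 fps by dropping every third
    frame (keeping 2 of every 3), preserving temporal order."""
    return [i for k, i in enumerate(frame_indices) if k % 3 != 2]

frames = list(range(300))           # one second of capture at 300 fps
kept = decimate_300_to_200(frames)  # 200 frames remain
```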
The time statistics for the proposed method were from an experiment conducted in a laboratory environment where computing resources were idle. In our laboratory setup, we used Cat8 Ethernet cables to support a maximum transmission rate of 40 Gbps, which meets the real-time processing requirements for this study. To enhance robustness in real-world applications, we recommend using optical fiber cables to establish wired connections across distributed devices whenever possible. If hardware upgrades alone cannot meet data and signal exchange demands, adopting edge computing solutions to decentralize processing tasks may reduce network dependency and improve robustness.
During field measurements, other programs (e.g., data acquisition and control) may use part of the computational resources of the distributed computation network, potentially preventing the real-time measurement frequency of the 3D displacement response from reaching 200 Hz. However, the above analysis and discussion demonstrate the effectiveness and efficiency of the proposed method. The method provides a non-contact, high-frequency, real-time measurement tool for shaking table tests. The real-time dynamic monitoring of large shaking table structures can be achieved using standard computing equipment and configuration strategies without the need for high-performance clusters.

3.4. Measurement Results

After evaluating the performance of the proposed method, we calculated the 3D dynamic response of all measurement points under the different seismic wave loads using the proposed method. Figure 11 displays the 3D displacement response histories of measurement points distributed along the coupling beams in experiments 1, 3, and 5. Since the input peak ground acceleration (PGA) is greater in the X direction than in the Y direction, the displacement response of the measurement points in the X direction exceeds that in the Y direction, which is consistent with structural vibration principles. The maximum displacements under the different wave loads are as follows: for SHW2, X direction: −91.2 mm at B_19; Y direction: −18.3 mm at B_19; Z direction: 5.6 mm at B_22. For El Centro, X direction: 70.6 mm at B_1; Y direction: 65.5 mm at B_19; Z direction: 4.6 mm at B_21. For the Northridge wave load, X direction: −49.3 mm at B_19; Y direction: −44.4 mm at B_13; Z direction: 4.1 mm at B_3.
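Peak responses such as those listed above can be extracted from the reconstructed histories by taking, per axis, the signed value of largest magnitude over all measurement points; the sketch below uses synthetic data with one injected extreme (the helper and the data are hypothetical, not the paper's code):

```python
import numpy as np

def peak_response(histories):
    """Per axis, return the signed extreme displacement and the
    measurement point at which it occurs.
    histories: {point_name: array of shape (n_frames, 3)}."""
    peaks = {}
    for axis, name in enumerate("XYZ"):
        best_point, best_val = None, 0.0
        for point, series in histories.items():
            idx = int(np.argmax(np.abs(series[:, axis])))
            val = float(series[idx, axis])
            if abs(val) > abs(best_val):
                best_point, best_val = point, val
        peaks[name] = (best_point, best_val)
    return peaks

# synthetic histories for three top-story points, with one injected
# extreme mimicking the SHW2 peak at B_19
rng = np.random.default_rng(1)
histories = {f"B{i}": 2.0 * rng.standard_normal((1000, 3)) for i in (1, 13, 19)}
histories["B19"][400] = np.array([-91.2, -18.3, 2.0])
peaks = peak_response(histories)
```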
The highest displacements in the X and Y directions are observed at measurement points B_1, B_13, and B_19, which are located on the top story of the structure. In addition, the structural vibration amplitude gradually decreases as the height of the measurement point decreases, in line with the physical characteristics of high-rise structures. These findings further demonstrate the accuracy of the 3D dynamic response measured using the proposed videogrammetric method.
A structure damaged by an earthquake can partially restore its original integrity when its damaged replaceable components are substituted. Analyzing the amplitude of the deformation response and the permanent deformation from previous experiments can effectively determine the current state of these components. If the 3D dynamic response data of the structural area of interest from prior experiments are not obtained promptly, it becomes challenging to assess whether the hybrid device needs replacement or whether the PGA for the next seismic load should be increased. This highlights the efficiency advantage in 3D dynamic measurement afforded by the proposed videogrammetric method.

3.5. Discussion

The proposed distributed high-speed videogrammetric method successfully achieved the real-time 3D displacement measurement of replaceable coupling beam structures on a large shaking table, validating its effectiveness and accuracy for dynamic monitoring applications. The study demonstrated significant improvements in the calibration speed without compromising accuracy, underscoring the utility of non-coded circular targets for rapid extrinsic parameter estimation in large field-of-view setups. Additionally, the adoption of a distributed computation strategy based on the ADMM enabled efficient resource utilization within a local network, reducing dependence on high-performance computing clusters.
Despite these advantages, certain limitations remain. Although this method reduces reliance on high-performance computing clusters, it still requires multiple computing devices and network communication support. The high-frequency data acquisition required for 3D dynamic response reconstruction necessitates substantial data storage capacity and transmission bandwidth, particularly for real-world applications. Future research could investigate adaptive algorithms to manage fluctuating network conditions or optimize data transmission protocols to reduce latency. Additionally, while the proposed method enables the high-frequency, dynamic 3D displacement measurement of key points on large shaking table structures, it does not currently support full-field real-time displacement measurement. Expanding the system’s scalability to accommodate more complex structural configurations or environments may be a focus of future work. This could involve integrating hybrid systems that utilize additional data sources, such as satellite imagery or ground-based sensors, to provide a more comprehensive monitoring solution.
As demand for the real-time, high-frequency 3D displacement measurement of large-scale civil structures continues to grow, the successful application of this videogrammetric method in large replaceable coupling beam shaking table tests highlights its potential as a practical and real-time displacement monitoring solution for large-scale civil structures in real-world scenarios.

4. Conclusions

To overcome the limitation of conventional multi-vision measurement methods, which cannot promptly acquire high-frequency 3D dynamic responses of large-scale structures onsite, we propose a distributed high-speed videogrammetric method with rapid calibration. This method achieves high-frequency 3D displacement response monitoring of large shaking table structures using a general system of multiple high-speed cameras to rapidly evaluate structural health under seismic excitation. The efficiency and accuracy of the proposed method were verified through large shaking table tests on a seven-story structure with RCBs. The main conclusions drawn from the experimental results are as follows:
(1)
The 3D displacement responses of all points of interest on a large structure with RCBs (1.8 m in length, 1.3 m in width, and 6.3 m in height) were measured using the proposed videogrammetric method in real time. The RMSE of the proposed method in the X, Y, and Z directions was 0.57 mm, 0.64 mm, and 0.52 mm, respectively, compared to the measurements of high-precision total stations. The RMSE of displacement responses in a single direction was approximately 0.96 mm compared to the contact displacement sensor, and the acceleration response range and trend were consistent with accelerometer measurements.
(2)
The proposed automatic correspondence method for non-coded circular targets facilitates the extrinsic calibration of multiple cameras with large FOVs. This method provides an alternative to coded targets for calibrating multi-camera systems with large FOVs in SHM. In a scene with four large stereo FOVs comprising six cameras, the calibration time was reduced from 85.9 s to 2.8 s compared with the conventional calibration method.
(3)
The distributed computation and reconstruction strategy fully exploits the computing resources of different devices within the videogrammetric network. In laboratory conditions (computing resources idle), the method achieves the real-time high-frequency (200 Hz) 3D displacement response measurement of all points of interest on the large shaking table structure using only standard computing equipment and configurations without requiring a costly HPC cluster.
This paper provides a practical and efficient vision-based 3D monitoring solution for large shaking table structures, significantly facilitating the study of the seismic performance of shaking table structures. Future research will focus on full-field deformation measurements for large structures with texture features at high sampling frequencies.

Author Contributions

Conceptualization, X.T. and P.C.; methodology, X.T., P.C. and H.S.; software, H.S., Z.L. and Y.G.; validation, H.S., P.C., Z.L. and Y.G.; formal analysis, X.L. and P.C.; investigation, H.S. and P.C.; resources, P.C. and X.T.; data curation, H.S., P.C., Z.L. and Y.G.; writing—original draft preparation, H.S.; writing—review and editing, P.C.; visualization, Z.H. and Z.Y.; supervision, X.L., Z.H. and Z.Y.; project administration, P.C.; funding acquisition, X.T. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported in part by the National Natural Science Foundation of China under Grant 42221002, in part by the National Youth Talent Support Program, Grant Number SQ2022QB01546, and in part by the Joint Project of the Beijing Municipal Commission of Education and the Beijing Natural Science Foundation, Grant Number KZ202210016022.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ji, X.; Liu, D.; Sun, Y.; Molina Hutt, C. Seismic performance assessment of a hybrid coupled wall system with replaceable steel coupling beams versus traditional RC coupling beams. Earthq. Eng. Struct. Dyn. 2017, 46, 517–535. [Google Scholar] [CrossRef]
  2. Xia, X.; Zhang, X.; Wang, J. Shaking table test of a novel railway bridge pier with replaceable components. Eng. Struct. 2021, 232, 111808. [Google Scholar] [CrossRef]
  3. Alam, Z.; Sun, L.; Zhang, C.; Su, Z.; Samali, B. Experimental and numerical investigation on the complex behaviour of the localised seismic response in a multi-storey plan-asymmetric structure. Struct. Infrastruct. Eng. 2021, 17, 86–102. [Google Scholar] [CrossRef]
  4. Entezami, A.; Arslan, A.N.; De Michele, C.; Behkamal, B. Online hybrid learning methods for real-time structural health monitoring using remote sensing and small displacement data. Remote Sens. 2022, 14, 3357. [Google Scholar] [CrossRef]
  5. Gao, X.; Ji, X.; Zhang, Y.; Zhuang, Y.; Cai, E. Structural displacement estimation by a hybrid computer vision approach. Mech. Syst. Signal Process. 2023, 204, 110754. [Google Scholar] [CrossRef]
  6. Cataldo, A.; Roselli, I.; Fioriti, V.; Saitta, F.; Colucci, A.; Tatì, A.; Ponzo, F.C.; Ditommaso, R.; Mennuti, C.; Marzani, A. Advanced video-based processing for low-cost damage assessment of buildings under seismic loading in shaking table tests. Sensors 2023, 23, 5303. [Google Scholar] [CrossRef] [PubMed]
  7. Ri, S.; Ye, J.; Toyama, N.; Ogura, N. Drone-based displacement measurement of infrastructures utilizing phase information. Nat. Commun. 2024, 15, 395. [Google Scholar] [CrossRef] [PubMed]
  8. Wen, H.; Dong, R.; Dong, P. Structural displacement measurement using deep optical flow and uncertainty analysis. Opt. Lasers Eng. 2024, 181, 108364. [Google Scholar] [CrossRef]
  9. Weng, Y.; Quek, S.T.; Yeoh, J.K.W. Robust vision-based sub-pixel level displacement measurement using a complementary strategy. Mech. Syst. Signal Process. 2025, 223, 111898. [Google Scholar] [CrossRef]
  10. Shi, H.; Liu, X.; Tong, X.; Chen, P.; Gao, Y.; Liu, Z.; Xu, Z.; Hong, Z.; Ye, Z.; Xie, H. Three-dimensional deformation monitoring of internal nodes of large-span suspended dome structure using videogrammetry under camera instability. Measurement 2025, 242, 116009. [Google Scholar] [CrossRef]
  11. Zhang, D.; Yu, Z.; Xu, Y.; Ding, L.; Ding, H.; Yu, Q.; Su, Z. GNSS aided long-range 3D displacement sensing for high-rise structures with two non-overlapping cameras. Remote Sens. 2022, 14, 379. [Google Scholar] [CrossRef]
  12. Gong, N.; Freddi, F.; Li, P. Shaking table tests and numerical analysis of RC coupled shear wall structure with hybrid replaceable coupling beams. Earthq. Eng. Struct. Dyn. 2024, 53, 1742–1766. [Google Scholar] [CrossRef]
  13. Gong, N.; Li, P.; Shan, J. Aftershock performance evaluation of shear wall structures with replaceable coupling beam including low-cycle degradation. Structures 2022, 44, 713–727. [Google Scholar] [CrossRef]
  14. Hu, B.; Chen, W.; Zhang, Y.; Yin, Y.; Yu, Q.; Liu, X.; Ding, X. Vision-based multi-point real-time monitoring of dynamic displacement of large-span cable-stayed bridges. Mech. Syst. Signal Process. 2023, 204, 110790. [Google Scholar] [CrossRef]
  15. Zhu, Z.; Bao, T.; Hu, Y.; Gong, J. A novel method for fast positioning of non-standardized ground control points in drone images. Remote Sens. 2021, 13, 2849. [Google Scholar] [CrossRef]
  16. Ahn, S.J.; Rauh, W.; Kim, S.I. Circular coded target for automation of optical 3d-measurement and camera calibration. Int. J. Pattern Recognit. Artif. Intell. 2001, 15, 905–919. [Google Scholar] [CrossRef]
  17. Wei, K.; Yuan, F.; Shao, X.; Chen, Z.; Wu, G.; He, X. High-speed multi-camera 3D DIC measurement of the deformation of cassette structure with large shaking table. Mech. Syst. Signal Process. 2022, 177, 109273. [Google Scholar] [CrossRef]
  18. Wang, Q.; Liu, Y.; Guo, Y.; Wang, S.; Zhang, Z.; Cui, X.; Zhang, H. A robust and effective identification method for point-distributed coded targets in digital close-range photogrammetry. Remote Sens. 2022, 14, 5377. [Google Scholar] [CrossRef]
  19. Zhang, Y.; Liu, W.; Wang, F.; Lu, Y.; Wang, W.; Yang, F.; Jia, Z. Improved separated-parameter calibration method for binocular vision measurements with a large field of view. Opt. Express 2020, 28, 2956–2974. [Google Scholar] [CrossRef]
  20. Tong, X.; Luan, K.; Liu, X.; Liu, S.; Chen, P.; Jin, Y.; Lu, W.; Huang, B. Tri-camera high-speed videogrammetry for three-dimensional measurement of laminated rubber bearings based on the large-scale shaking table. Remote Sens. 2018, 10, 1902. [Google Scholar] [CrossRef]
  21. Shao, Y.; Li, L.; Li, J.; An, S.; Hao, H. Computer vision based target-free 3D vibration displacement measurement of structures. Eng. Struct. 2021, 246, 113040. [Google Scholar] [CrossRef]
  22. Liu, F.; Gong, C.; Huang, X.; Zhou, T.; Yang, J.; Tao, D. Robust visual tracking revisited: From correlation filter to template matching. IEEE Trans. Image Process. 2018, 27, 2777–2790. [Google Scholar] [CrossRef] [PubMed]
  23. Luo, J.; Konofagou, E.E. A fast normalized cross-correlation calculation method for motion estimation. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 2010, 57, 1347–1357. [Google Scholar]
  24. Ma, K.F.; Huang, G.P.; Xu, H.J.; Wang, W.F. Research on a precision and accuracy estimation method for close-range photogrammetry. Int. J. Pattern Recognit. Artif. Intell. 2019, 33, 1955002. [Google Scholar] [CrossRef]
  25. Ran, Q.; Zhou, K.; Yang, Y.; Kang, J.; Zhu, L.; Tang, Y.; Feng, J. High-precision human body acquisition via multi-view binocular stereopsis. Comput. Graph. 2020, 87, 43–61. [Google Scholar] [CrossRef]
  26. Tong, X.; Gao, Y.; Ye, Z.; Xie, H.; Chen, P.; Shi, H.; Liu, Z.; Liu, X.; Xu, Y.; Huang, R.; et al. Dynamic measurement of a long-distance moving object using multi-binocular high-speed videogrammetry with adaptive-weighting bundle adjustment. Photogramm. Rec. 2024, 39, 294–319. [Google Scholar] [CrossRef]
  27. Agarwal, S.; Snavely, N.; Seitz, S.M.; Szeliski, R. Bundle adjustment in the large. In Proceedings of the 11th European Conference on Computer Vision, Heraklion, Crete, Greece, 5–11 September 2010; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2010; pp. 29–42. [Google Scholar]
  28. Wu, C.; Agarwal, S.; Curless, B.; Seitz, S.M. Multicore bundle adjustment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, USA, 20–25 June 2011. [Google Scholar]
  29. Mongelli, M.; Roselli, I.; De Canio, G.; Ambrosino, F. Quasi real-time FEM calibration by 3D displacement measurements of large shaking table tests using HPC resources. Adv. Eng. Softw. 2018, 120, 14–25. [Google Scholar] [CrossRef]
  30. Majchrowicz, M.; Kapusta, P.; Jackowska-Strumiłło, L.; Banasiak, R.; Sankowski, D. Multi-GPU, multi-node algorithms for acceleration of image reconstruction in 3D Electrical Capacitance Tomography in heterogeneous distributed system. Sensors 2020, 20, 391. [Google Scholar] [CrossRef]
  31. Xu, Y.; Zhang, J. UAV-based bridge geometric shape measurement using automatic bridge component detection and distributed multi-view reconstruction. Autom. Constr. 2022, 140, 104376. [Google Scholar] [CrossRef]
  32. Hillebrand, M.; Stevanovic, N.; Hosticka, B.J.; Conde, J.S.; Teuner, A.; Schwarz, M. High speed camera system using a CMOS image sensor. In Proceedings of the IEEE Intelligent Vehicles Symposium 2000 (Cat. No.00TH8511), Dearborn, MI, USA, 5 October 2000. [Google Scholar]
  33. Tong, X.; Shi, H.; Ye, Z.; Chen, P.; Liu, Z.; Gao, Y.; Li, Y.; Xu, Y.; Xie, H. Liquid-level response measurement using high-speed videogrammetry with robust multiple sphere tracking. Measurement 2024, 228, 114290. [Google Scholar] [CrossRef]
  34. Baqersad, J.; Poozesh, P.; Niezrecki, C.; Avitabile, P. Photogrammetry and optical methods in structural dynamics—A review. Mech. Syst. Signal Process. 2017, 86, 17–34. [Google Scholar] [CrossRef]
  35. Hong, Z.; Li, Z.; Tong, X.; Pan, H.; Zhou, R.; Zhang, Y.; Han, Y.; Wang, J.; Yang, S.; Ma, Z. A high-precision recognition method of circular marks based on CMNet within complex scenes. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 7431–7443. [Google Scholar] [CrossRef]
  36. Rodríguez, M.; Delon, J.; Morel, J.M. Fast affine invariant image matching. Image Process. Line 2018, 8, 251–281. [Google Scholar] [CrossRef]
  37. Arandjelović, R.; Zisserman, A. Three things everyone should know to improve object retrieval. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012. [Google Scholar]
  38. Barath, D.; Noskova, J.; Ivashechkin, M.; Matas, J. MAGSAC++, a fast, reliable and accurate robust estimator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
  39. Korman, S.; Reichman, D.; Tsur, G.; Avidan, S. FasT-Match: Fast affine template matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013. [Google Scholar]
  40. Nistér, D. An efficient solution to the five-point relative pose problem. IEEE Trans. Pattern. Anal. Mach. Intell. 2004, 26, 756–770. [Google Scholar] [CrossRef]
  41. Aiger, D.; Mitra, N.J.; Cohen-Or, D. 4-Points congruent sets for robust pairwise surface registration. ACM Siggraph 2008, 27, 1–10. [Google Scholar] [CrossRef]
  42. Wang, Y.Q.; Sutton, M.A.; Bruck, H.A.; Schreier, H.W. Quantitative error assessment in pattern matching: Effects of intensity pattern noise, interpolation, strain and image contrast on motion measurements. Strain 2009, 45, 160–178. [Google Scholar] [CrossRef]
  43. Ngeljaratan, L.; Moustafa, M.A. Implementation and evaluation of vision-based sensor image compression for close-range photogrammetry and structural health monitoring. Sensors 2020, 20, 6844. [Google Scholar] [CrossRef]
  44. Westoby, M.J.; Brasington, J.; Glasser, N.F.; Hambrey, M.J.; Reynolds, J.M. ‘Structure-from-Motion’ photogrammetry: A low-cost, effective tool for geoscience applications. Geomorphology 2012, 179, 300–314. [Google Scholar] [CrossRef]
  45. Zhang, R.; Zhu, S.; Fang, T.; Quan, L. Distributed very large scale bundle adjustment by global camera consensus. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017. [Google Scholar]
  46. Havaran, A.; Mahmoudi, M. Extracting structural dynamic properties utilizing close photogrammetry method. Measurement 2020, 150, 107092. [Google Scholar] [CrossRef]
  47. Lourakis, M.I.; Argyros, A.A. SBA: A software package for generic sparse bundle adjustment. ACM Trans. Math. Softw. 2009, 36, 1–30. [Google Scholar] [CrossRef]
Figure 1. Framework of the proposed videogrammetric method.
Figure 2. General distributed videogrammetric network.
Figure 3. Stereo-matching method of circular targets in large FOV (red dots indicate SIFT feature points of stereo images).
Figure 4. Distributed computation and reconstruction strategy.
Figure 5. (a) Real structure model. (b) Camera layout and spatial coordinate system. (c) Measurement point distribution.
Figure 6. Measurement errors between the videogrammetry and the total station at each checkpoint in the X, Y, and Z directions.
Figure 7. Three-dimensional positioning errors of the checkpoint calculated using different methods after each seismic wave load.
Figure 8. Comparison of displacement and acceleration response histories obtained by the proposed videogrammetry and contact sensors at points R3 and R18 subjected to different seismic excitations: (a) Experiment No. 1; (b) Experiment No. 3; (c) Experiment No. 5.
Figure 9. Time consumption and mean reprojection error of different methods for reconstructing the shaking table dataset.
Figure 10. Time consumption of different methods for reconstructing the shaking table dataset.
Figure 11. Three-dimensional displacement response histories of measurement points distributed across the coupling beams during (a) Experiment No. 1, (b) Experiment No. 3, and (c) Experiment No. 5.
Table 1. Main specifications of the simulated earthquake shaking table.

Performance               Specification
Maximum specimen mass     25 tons
Table size                4 m × 4 m
Vibration directions      X, Y, Z axes
Degrees of freedom        Six
Frequency range           0.1–100 Hz
Table 2. Selected six ground motion experiments on the shaking table.

Seismic Wave    Experiment No.    PGA in X (g)    PGA in Y (g)
SHW2            1                 1.02            0
SHW2            2                 0               1.02
El Centro       3                 1.02            0.867
El Centro       4                 0.867           1.02
Northridge      5                 1.02            0.867
Northridge      6                 0.867           1.02
Table 3. Critical parameters of the high-speed camera.

Parameter             Configuration
Sensor resolution     1280 × 1024 pixels
Capture frame rate    300 fps
Image sensor          LUPA1300 global-shutter CMOS
Pixel size            14 μm
Lens focal length     20 mm
Table 4. Runtime of different calibration methods for calibrating the multi-camera system.

Extrinsic Calibration    2D-to-3D Correspondence (s)    Stereo Correspondence (s)
Proposed                 1.2                            1.6
Conventional             30.7                           55.2
Table 5. Time consumption, mean reprojection error, and number of iterations for shaking table dataset reconstruction using different numbers of workstations.

Workstations    CPU Cores    Time (s)    Mean Reprojection Error (pixels)    Iterations
1               6            272.7       0.036                               6
2               10           198.1       0.036                               7
3               14           154.6       0.035                               7
4               18           101.3       0.037                               8
5               22           81.9        0.038                               8
6               26           76.4        0.040                               8
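The scaling behavior reported in Table 5 can be summarized as speedup and parallel efficiency relative to the single-workstation run. The short sketch below (an illustrative computation using only the timings reported above, not part of the original experiments) makes the diminishing returns beyond four workstations explicit:

```python
# Reconstruction times from Table 5.
# Keys: number of workstations; values: time in seconds.
times = {1: 272.7, 2: 198.1, 3: 154.6, 4: 101.3, 5: 81.9, 6: 76.4}

base = times[1]
for n in sorted(times):
    speedup = base / times[n]   # ratio to the single-workstation run
    efficiency = speedup / n    # speedup per workstation
    print(f"{n} workstation(s): speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

For example, six workstations give roughly a 3.6-fold speedup over one, i.e., a parallel efficiency of about 0.6, consistent with the communication overhead of the consensus step growing with the number of devices.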
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Shi, H.; Chen, P.; Liu, X.; Hong, Z.; Ye, Z.; Gao, Y.; Liu, Z.; Tong, X. Distributed High-Speed Videogrammetry for Real-Time 3D Displacement Monitoring of Large Structure on Shaking Table. Remote Sens. 2024, 16, 4345. https://doi.org/10.3390/rs16234345