1. Introduction
Civil engineering structures are the main load-bearing bodies. During their service life, they are exposed to various external loads, such as traffic, wind gusts, and seismic loads, which are the main cause of structural degradation. Health monitoring of major civil engineering structures has therefore become an important research topic. At present, structural health monitoring (SHM) is carried out by installing contact sensors and their corresponding data acquisition systems. Such an approach, however, has several limitations: installation is difficult, time-consuming, and costly. In particular, installing contact sensors often interrupts the normal operation of the structure. It is therefore necessary to develop a more effective and practical SHM method. The first bus rapid transit (BRT) line was built in Curitiba (Brazil) in 1974. Since then, this mode of transportation has spread rapidly around the world and has become an indispensable part of urban traffic. In big cities such as Chengdu (China), where BRT is one of the main modes of transportation, the use of the BRT viaduct cannot be interrupted because of heavy traffic and safety concerns. In this case, traditional sensing systems are not easily implemented.
At present, structural vibration response can be applied to the operational state analysis [
1,
2,
3] of existing bridges. Furthermore, damage identification [
4,
5,
6] and life prediction calculations [
7,
8] can also be carried out. Such a method has become an active research field owing to its excellent performance and the fact that it requires very few parameters. Therefore, it is of great significance to measure the vibrations precisely, rapidly, and economically. Currently available sensors for measuring structural vibrations can be classified into contact and non-contact sensors. Contact sensors, such as accelerometers [
9,
10], linear variable differential transformers (LVDT) [
11], and strain-type displacement sensors (STDS) [
12] are widely used in monitoring systems to obtain valuable structural vibration information. Non-contact sensors such as the global positioning system (GPS) [
13], laser Doppler vibrometers [
14], and radar interferometry systems [
15] are less used because they are expensive, complex, and not very accurate [
16,
17,
18].
Vision-based vibration measurement systems are burgeoning. They are gradually replacing conventional vibration measurement sensors owing to their relatively low cost as well as flexible and convenient installation, especially for target-free vision-based sensor approaches. Various techniques have been implemented for moving object tracking and displacement measurement, such as template matching [
19,
20], optical flow field [
21,
22], frame differential method [
23], and digital image correlation method (DICM) [
24,
25]. Optical flow is strongly affected by variations in light intensity, which limits its applicability in the field. The frame differential method only determines whether an object is moving within an area and cannot extract the full image of the moving object. The digital image correlation method measures full-field displacement and strain; however, it cannot measure local vibrations.
The most frequently used method is template matching, which can be categorized into three types according to the template style, namely, global template matching, local template matching, and keypoint matching. The first two methods have good precision; however, their efficiency is low because they consume considerable time and random-access memory (RAM). The keypoint matching method overcomes this deficiency and has therefore been widely studied. A variety of keypoint detector and descriptor algorithms have been proposed, such as scale-invariant feature transform (SIFT) [
26], speeded-up robust features (SURF) [
27], features from accelerated segment test (FAST) [
28,
29], adaptive and generic accelerated segment test (AGAST) [
30], binary robust invariant scalable keypoints (BRISK) [
31] and oriented FAST and rotated BRIEF (ORB) [
32]. Among these algorithms, ORB is very popular because it offers the best efficiency and rotational invariance while retaining scale invariance. It consists of two components, oFAST and rBRIEF, which improve on the FAST keypoint detector and the binary robust independent elementary features (BRIEF) [
33] descriptor, respectively. The ORB algorithm is nearly two orders of magnitude faster than SIFT [
34] and one order of magnitude faster than SURF [
33]. In addition, a number of object tracking algorithms have been proposed, such as tracking learning detection (TLD) [
35], visual tracking decomposition (VTD) [
36], incremental visual tracking (IVT) [
37], multi-task tracking (MTT) [
38], visual tracker sampler (VTS) [
39], and consensus-based matching and tracking (CMT) [
40,
41]. The CMT algorithm was proposed by Nebehay et al. in 2014. It employs a consensus-based voting scheme to detect outliers and eliminate erroneous keypoints. This reduces the number of keypoints and makes the process more efficient.
Although computer vision measurement technology is still in its infancy, some achievements have been recorded and it has great prospects. In this study, we propose a novel vision-based sensor for BRT viaduct vibration measurement that combines the CMT and ORB algorithms. In practical applications, the primary concern for a vision-based sensor is measurement efficiency, which mainly refers to accuracy and operating speed. To improve the accuracy of object positioning, keypoint matching was employed to locate candidate object points, while voting and consensus were applied to remove outliers. On this basis, a more efficient detector and descriptor pair was tested to improve the execution speed of the algorithm. In general, the proposed vision sensor has the following properties: (1) it is easy to install and set up and requires no pre-installed artificial targets; (2) its measurement efficiency is higher than that of existing algorithms, which makes the sensor well suited to high-speed monitoring systems; (3) its precision remains at a good level.
This study aims to solve vibration sensing and measurement problems with a vision sensor method. Three key steps were employed, namely, preprocessing, object tracking, and vibration analysis. Homomorphic filtering was introduced for preprocessing; object tracking was realized with an improved CMT object recognition and tracking algorithm; and an improved method for calculating the scaling factor was used to obtain a more accurate vibration analysis. A series of laboratory tests were conducted to evaluate the reliability of this method. Furthermore, the vibration measurement of a BRT viaduct in Chengdu (China) was selected as a case study to illustrate the specific process of the vision sensor method. Finally, field test results were used to validate this method.
4. Laboratory Tests
4.1. Moving Platform Tests
The moving platform tests were carried out to evaluate the performance of the vision-based sensor in a laboratory environment. The mechanical testing and simulation (MTS) electronic servo TestSuite was used as the vibration source, and the motion was captured by both the vision-based sensor and strain-type displacement sensors (STDS). At present, STDS are widely used for displacement measurement of civil engineering structures. These devices work on the strain bridge principle: a small deformation is measured by the strain bridge, so that the mechanical quantity is converted into an electrical quantity. Compared with traditional displacement sensors, the STDS offers many advantages, for example, higher accuracy, a wider measurement range, a longer service life, a faster response, a better frequency response, no environmental restrictions, and cost-effectiveness. Because it performs well in small displacement measurements, the STDS is a suitable choice for this experiment. In addition, the selection of vision sensor equipment should take into account the vibration parameters and the working environment.
Figure 9 shows the setup for the moving platform experiment. The target plate was installed on the MTS electronic servo TestSuite. The displacement sensor was installed on the target plate, with its magnetic stand fixed in place. The measuring head of the sensor maintained contact with the target plate. The camera was mounted on a tripod for steady output and fixed in a position that ensured the target could be captured throughout the test.
Commissioning tests were carried out after equipment installation to determine the appropriate distance between the camera and the target plate. Two different types of target were designed to verify the stability of the proposed algorithm, as shown in
Figure 10. Firstly, the artificial target was designed with distinctive features, which are conducive to continuous target tracking, whereas the colors, sizes, and positions of the free targets were assigned randomly. In this way, the effectiveness and stability of tracking arbitrary targets can be confirmed. Secondly, the free target plate was designed with two different targets, which can be used to verify whether errors caused by human selection are avoided. Lastly, targets with different colors were employed to verify the color sensitivity of the algorithm.
Because many vibration modes exist in real environments, various frequencies, amplitudes, and operating modes were applied in a series of experiments to simulate field conditions.
Table 5 and
Table 6 list the parameter values for a series of low-frequency vibration tests. In addition, higher-frequency vibration tests were designed to validate the performance; their parameter values are listed in
Table 7.
In the laboratory experiment, the video camera was aimed at the target center at an angle of
θ = 0. The rest of the parameters are summarized in
Table 8. Using the parameters and Equation (33), the scaling factor was obtained as
SF = 0.138858.
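As a minimal sketch of how the scaling factor is applied, the snippet below converts tracked image displacements from pixels to millimeters. It assumes that SF is expressed in mm/pixel; the function name `pixels_to_mm` and the pixel values are hypothetical and serve only as an illustration. The calculation of SF itself (Equation (33)) is not reproduced here.

```python
# Scaling factor reported for the laboratory setup, assumed to be in mm/pixel.
SF_LAB = 0.138858


def pixels_to_mm(pixel_displacements, scaling_factor=SF_LAB):
    """Convert image-plane displacements (pixels) to physical displacements (mm)."""
    return [scaling_factor * p for p in pixel_displacements]


# Hypothetical tracked displacements of the target center, in pixels.
tracked_px = [0.0, 7.2, 14.4, 7.2, 0.0, -7.2, -14.4]
displacement_mm = pixels_to_mm(tracked_px)
print([round(d, 3) for d in displacement_mm])
```

Because the conversion is linear, any sub-pixel error in the tracked position maps directly to a displacement error scaled by SF.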
To further evaluate the error performance and verify the precision and accuracy of the developed vision-based sensor, the normalized root mean squared error (NRMSE) was introduced as follows:
$$\mathrm{NRMSE} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2}}{y_{\max} - y_{\min}} \times 100\%$$
where $n$ is the number of measurement data, $x_i$ and $y_i$ denote the $i$th displacement data at time $t_i$ measured by the vision sensor and the STDS, respectively, and $y_{\max} = \max(y)$, $y_{\min} = \min(y)$.
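The NRMSE can be computed along the following lines. This is a sketch that assumes the common definition of the NRMSE, namely the root mean squared error divided by the range of the reference signal and expressed in percent. The function name `nrmse_percent` and the synthetic signals are hypothetical.

```python
import math


def nrmse_percent(x, y):
    """NRMSE (percent) of measurements x against reference y, normalized by the range of y."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    span = max(y) - min(y)
    if span == 0:
        raise ValueError("reference signal has zero range")
    rmse = math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x))
    return 100.0 * rmse / span


# Synthetic example: a 1 Hz, 5 mm amplitude reference sampled at 100 Hz,
# and a "measured" signal with a constant 0.1 mm offset.
n = 200
y = [5.0 * math.sin(2 * math.pi * 1.0 * i / 100.0) for i in range(n)]
x = [yi + 0.1 for yi in y]
print(round(nrmse_percent(x, y), 3))
```

In this synthetic case the RMSE is 0.1 mm and the reference range is 10 mm, so the result is about 1%.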
Figure 11 shows a set of experimental results from artificial target measurement test Ι-5. The NRMSE was used to analyze the experimental data. The results are shown in
Table 9, where the average NRMSE of the vision sensor measurement was 1.822%, and the maximum value was 3.041%. The average NRMSE of the displacement sensor measurement was 1.442%, and the maximum value was 3.433%.
From
Table 9, it can be noted that two sets of tests have relatively large errors, 3.041% and 2.757%, respectively. The reasons can be identified by analyzing the corresponding test data.
Figure 12 shows the experimental results of artificial target measurement tests Ι-2 and Ι-4. As shown in
Figure 12, anomalous data are present in some periods of the test, which explains why the average NRMSE of the vision sensor measurement is larger than that of the displacement sensor measurement. The most likely cause of this anomaly is unavoidable movement of the camera stand. After removing the abnormal results, the average NRMSE of the vision sensor measurement was 1.315%, which implies that the improved vision-based sensor is consistent with the traditional displacement sensor and is therefore suitable for actual measurements.
Figure 13 shows a set of experimental results from free target measurement test II-6.
Table 10 presents the NRMSE analysis results. As presented in
Table 10, the average NRMSE of the vision-based sensor measurement was 1.805%, and the average NRMSE of the displacement sensor measurement was 1.471%. The vision-based sensor using a free target thus achieved an accuracy comparable to that of traditional contact sensors.
In addition, the average NRMSE of the measurement for target 1 was 1.753%, and that for target 2 was 1.856%. It can be concluded that the accuracy of the vision sensor measurement is independent of the selected target points. This means that the improved vision sensor can avoid errors caused by human selection. Furthermore, motion tests with higher frequency were conducted.
Figure 14 shows the measurement results and
Table 11 lists the NRMSE analysis results. The maximum NRMSE of the vision sensor measurements was 3.922%. This measurement accuracy is consistent with the low-frequency experimental results, which indicates that the improved vision-based sensor can be applied to track higher-frequency motion. It is noteworthy that the performance of the vision sensor in higher-frequency measurements depends on the capability of the imaging equipment.
4.2. Shaking Table Tests
To better characterize the performance of the vision-based sensor, a series of higher-frequency, lower-amplitude vibration experiments were carried out to verify the efficiency of the algorithm. A shaking table was used as the vibration source, and the motion was captured by the vision-based sensor and the STDS, as in the moving platform tests described in
Section 4.1.
Figure 15 shows the setup for the shaking table experiment, which includes four components: a video acquisition system, a vibration control system, a target system, and a strain acquisition system. The video acquisition system captures the motion of the target; the vibration control system controls the vibration frequency and amplitude during testing; the target system provides an identifiable target for stable object tracking; and the strain acquisition system collects the displacement data obtained by the STDS.
The experimental parameters of shaking table tests are listed in
Table 12, and
Figure 16 shows the experimental results of ΙV-9. The NRMSE was used to analyze the experimental data; the results are shown in
Table 13. From these results, two conclusions can be drawn: (1) the average measurement error of the vision-based sensor is 2.092%, which is lower than that of the STDS; in other words, the vision sensor outperforms the STDS in these tests. (2) The error generally increases with frequency, although there are some outliers.
4.3. Measuring Distance Tests
As a non-contact remote measurement technique, the developed vision-based sensor should be analyzed in detail at different measuring distances. Tests at different measuring distances were designed using the shaking table equipment to evaluate this effect; the test parameters are listed in
Table 14.
Figure 17 shows the test results of V-5. The error analysis results are listed in
Table 15. The STDS is a contact displacement sensor, and its measuring precision is affected only by frequency and amplitude. In contrast, the measuring error of the vision sensor increases with distance. The farther the target is from the camera, the smaller it appears in the image; in other words, the number of pixels covering the target decreases with distance from the imaging device, provided that the focal length is unchanged. This leads to a marked drop in positioning precision and results in a large measuring error. It is worth mentioning that the performance of remote measurement depends on the parameters of the imaging equipment, especially the focal length and the imaging resolution.
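The dependence on distance can be illustrated with a simple pinhole camera model. The sketch below is not the scaling factor formula of this study. It assumes a target plane perpendicular to the optical axis and a measuring distance much larger than the focal length. All parameter values (focal length, pixel pitch, target size) are hypothetical.

```python
def target_size_in_pixels(target_mm, distance_mm, focal_mm, pixel_pitch_mm):
    """Approximate image size (pixels) of a target under the pinhole model."""
    return target_mm * focal_mm / (distance_mm * pixel_pitch_mm)


def mm_per_pixel(distance_mm, focal_mm, pixel_pitch_mm):
    """Approximate physical length (mm) covered by one pixel at the target plane."""
    return distance_mm * pixel_pitch_mm / focal_mm


# Hypothetical imaging parameters, chosen only for illustration.
FOCAL_MM = 50.0
PIXEL_PITCH_MM = 0.005
TARGET_MM = 100.0

for distance_m in (2, 4, 8):
    d_mm = distance_m * 1000.0
    size_px = target_size_in_pixels(TARGET_MM, d_mm, FOCAL_MM, PIXEL_PITCH_MM)
    res_mm = mm_per_pixel(d_mm, FOCAL_MM, PIXEL_PITCH_MM)
    print(distance_m, "m:", round(size_px, 1), "px,", round(res_mm, 3), "mm/pixel")
```

Under these assumptions, doubling the distance halves the number of pixels across the target and doubles the length represented by each pixel, so a fixed sub-pixel tracking error corresponds to a larger displacement error.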
4.4. Discussion
From the analysis of the experiments above, it can be seen that the improved object tracking approach achieves a measurement accuracy comparable to that of traditional displacement sensors. Compared with the original CMT algorithm, the modified CMT algorithm provides a more efficient alternative for vision-based displacement measurement.
First of all, moving platform tests were designed to verify the tracking stability for free targets. The precision of the free target measurement system was verified by comparison with the artificial target measurement data acquired in the laboratory. The average NRMSE values from the free target measurement and the artificial target measurement were 1.805% and 1.822%, respectively, which indicates that the improved CMT vision measurement algorithm maintains its accuracy with free targets. Two different free targets were designed to check whether human-induced errors arise from the choice of the initialization region. The NRMSE between the values measured with target 1 and target 2 was 0.458%, which indicates that no such errors appear in this method. From these two conclusions, it can be seen that the improved CMT algorithm performs well in tracking free targets.
Secondly, the moving platform experiments cannot indicate whether this system has a high precision for low-amplitude, high-frequency vibrations. Therefore, shaking table tests were employed to address this question. Compared with the mechanical testing and simulation (MTS) electronic servo TestSuite, the shaking table can achieve a higher frequency. A series of high-frequency, low-amplitude vibration tests were designed to further evaluate the performance of the vision sensor. The test frequencies ranged from 8 Hz to 20 Hz, and the amplitudes ranged from 1 mm to 3 mm. The average NRMSE of the vision sensor was 2.092%, which is within an acceptable level. This demonstrates the reliability of the proposed vision sensor in high-frequency, low-amplitude vibration measurements.
Finally, because the sensor is a non-contact remote measurement technique, the measuring distance was used as a key indicator for judging its performance. Experimental results show that the errors increased with increasing measuring distance, and the theoretical analysis indicates that the decisive factor for the measuring distance is the characteristics of the imaging device, which are independent of the algorithm.
5. Field Test
Field tests were carried out on the Yingmenkou flyover in Chengdu (China), an important BRT transport hub, to evaluate the validity of the vision sensor. The motion was recorded in the time domain by the vision-based sensor and the STDS, respectively. As shown in
Figure 18, the vision sensor, limited by the optical lens of the camera, was installed at a location near the bridge.
Table 16 lists the parameters of the field test and the laboratory test. According to the parameters, the scaling factor was obtained as
SF = 0.186673.
Figure 19 plots the displacement measurement from the vision sensor. It can be seen that the measurement results include significant noise signals possibly caused by the movements of the camera stand [
46,
47], illumination [
48] and vapor [
48], among other factors. According to the related literature, airflow speed has a significant influence on the movement of the stand. The field tests were carried out when the wind speed was low, so the errors caused by camera movement can be treated approximately as weak random noise that can be removed by filtering. Similarly, it is shown in [
48] that illumination and vapor strongly affect the measurement accuracy of a vision-based system; however, these effects can be largely avoided by scheduling the test appropriately, and the remaining minor impact can be further reduced by filtering.
In order to obtain the most reliable results, a Butterworth low-pass filter [
49] was implemented for noise reduction and the filtering results, plotted in
Figure 20a, show that this approach is efficient and useful. The basic displacement characteristics were preserved, while much of the noise was filtered out. The corresponding Fourier spectrum results are plotted in
Figure 20b. The displacement measurement results from the STDS are also plotted in
Figure 20a, and the Fourier spectrum results are plotted in
Figure 20c. The spectral peaks of the vision-based sensor measurements were consistent with those of the STDS measurements. The discrepancy between the displacements measured by the vision sensor and the STDS is largely due to residual noise. In the field test, the scaling factor is about 0.14 mm/pixel, so the measurement resolution is ±0.07 mm, i.e., half a pixel, and measured data between −0.07 mm and 0.07 mm are dominated by noise. This residual noise affects the performance and leads to the difference between the curves. Furthermore, two obvious spectral peaks, at 79 and 92 Hz, were observed in the Fourier spectrum. Therefore, it can be concluded that the same spectral information can be obtained from the vision sensor.
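The filtering and spectral analysis steps can be sketched as follows. The filter order and cutoff frequency used in the field test are not stated here, so the sketch assumes a second-order Butterworth low-pass filter, designed with the bilinear transform and applied in a single forward pass (which introduces some phase lag). The function names, the sampling rate, and the synthetic signal are all hypothetical, and a plain discrete Fourier transform stands in for whichever spectrum routine was actually used.

```python
import cmath
import math


def butter2_lowpass(signal, cutoff_hz, fs_hz):
    """Second-order Butterworth low-pass filter (bilinear transform), forward pass only."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b1 = 2.0 * b0
    b2 = b0
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        # Direct-form difference equation of the biquad section.
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def dominant_frequency(signal, fs_hz):
    """Frequency (Hz) of the largest non-DC bin of a plain DFT."""
    n = len(signal)
    mean = sum(signal) / n
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2):
        s = sum((signal[i] - mean) * cmath.exp(-2j * math.pi * k * i / n)
                for i in range(n))
        if abs(s) > best_mag:
            best_k, best_mag = k, abs(s)
    return best_k * fs_hz / n


# Synthetic record (hypothetical values): a 2 Hz component plus a weaker
# 30 Hz disturbance, sampled at 100 Hz for 4 s.
FS_HZ = 100.0
N = 400
raw = [math.sin(2 * math.pi * 2.0 * i / FS_HZ)
       + 0.5 * math.sin(2 * math.pi * 30.0 * i / FS_HZ) for i in range(N)]

filtered = butter2_lowpass(raw, cutoff_hz=10.0, fs_hz=FS_HZ)
print("dominant frequency after filtering (Hz):", dominant_frequency(filtered, FS_HZ))
```

With a record of this length the frequency resolution is fs/N = 0.25 Hz, so spectral peaks from two sensors can only be compared to within that bin width.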