1. Introduction
Information about structural stiffness, which is crucial in structural health monitoring (SHM), is better reflected by dynamic displacements than by accelerations, as the latter are strongly affected by other parameters, e.g., mass [
1]. Nowadays, simple cameras mounted on stands or on unmanned aerial vehicles (UAVs) have potential in both monitoring and inspection of structural conditions [
2,
3]. The application range of computer vision in infrastructure assessment is very wide, and this paper focuses on estimation of dynamic displacements for parametric identification of structures.
Feng and Feng indicated several advantages of CV displacement measurement over techniques employing traditional sensors [
4]:
A camera can be installed without physical access to the structure. Structures that are difficult to access can be observed remotely using zoom lenses [
5,
6]. In comparison to contact sensors, this significantly reduces the monitoring costs of many critical parts of structures such as cables, as indicated in [
7].
Compared to GPS technology, which also does not require a physical reference point near the monitored structure, CV measurements are much more accurate [
8].
As opposed to traditional point-wise sensors, a camera is able to simultaneously capture the motion of multiple points [
9]. Moreover, it is possible to select these points in the recorded video after the measurement session.
Apart from the advantages of CV measurements, some weaknesses of this technique should also be noted. Measured higher-frequency components of motion often become highly contaminated by noise due to equipment limitations such as insufficient camera resolution or a relatively low sampling frequency [
10]. Thus, sensing dynamic displacement using a camera requires an appropriate video-processing algorithm to reduce the influence of these limitations.
Template matching methods are the most popular displacement estimation techniques due to the possibility of achieving subpixel-level accuracy. Generally, in template matching, the tracked object is represented by a preselected image region, namely, a template, of a selected frame of the video (usually the first one). The displacement of this template is determined in the subsequent video frames by searching for and matching this template with the most similar region within a search area in the current video frame, called the region of interest (ROI). The template matching process is illustrated in
Figure 1.
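The matching step described above can be sketched as follows. This is a minimal illustration with integer-pixel accuracy only, assuming a zero-mean normalised cross-correlation (ZNCC) score and an exhaustive search over the ROI; the function names `zncc` and `match_template` and the synthetic frames are hypothetical and are not taken from the study.

```python
import numpy as np

def zncc(a, b):
    """Zero-mean normalised cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def match_template(roi, template):
    """Return (row, col, score) of the best integer-pixel match inside the ROI."""
    th, tw = template.shape
    best = (0, 0, -np.inf)
    for r in range(roi.shape[0] - th + 1):
        for c in range(roi.shape[1] - tw + 1):
            score = zncc(roi[r:r + th, c:c + tw], template)
            if score > best[2]:
                best = (r, c, score)
    return best

# Synthetic check: cut a template from the first frame, shift the frame, find it again.
rng = np.random.default_rng(0)
frame0 = rng.random((40, 40))
template = frame0[10:18, 12:20]                        # 8 x 8 template at (10, 12)
frame1 = np.roll(frame0, shift=(3, -2), axis=(0, 1))   # content moves 3 px down, 2 px left
row, col, score = match_template(frame1, template)
displacement = (row - 10, col - 12)                    # displacement in pixels
```

The exhaustive double loop is chosen for readability; practical implementations usually restrict the ROI to a small neighbourhood of the previous position or evaluate the correlation in the frequency domain.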
Based on the matching technique, template matching methods can be classified into area-based and feature-based template matching [
11]. In area-based template matching, both the template and the ROI are represented by their pixel intensities. Matching of the template within the ROI is performed by maximization of the cross-correlation function or by minimization of the error function, e.g., sum of squared differences [
12]. In feature-based template matching, both the template and the ROI are usually represented by a set of characteristic points, called the keypoints. They are matched according to the information about their vicinity encoded in descriptors, such as the fast retina keypoint (FREAK) [
13]. Area-based template matching usually provides more accurate displacement estimation if good illumination is provided, whereas feature-based template matching is more robust with respect to changes in illumination, scale, rotation, etc. [
14]. It can thus be more suitable for outdoor field measurements.
Errors caused by the limited resolution of the camera can be reduced by using subpixel techniques [
15]. Feng et al. showed that the displacement estimation error is unacceptable when a template is matched with the accuracy of the pixel size in a video [
16]. They demonstrated that their proposed upsampling technique reduces the quantization error as the subpixel size decreases. In practice, a displacement estimation precision close to 0.01 pixels can be achieved [
17] with methods involving a subpixel precision search. In [
18], the fundamental natural frequency of the monitored structure was identified even from vibrations of 0.21 mm amplitude recorded from a distance of over 175 m, which amounts to a precision of 1/175 pixel. Interpolation of the cross-correlation function in the vicinity of its maximum provides accurate results, provided that the interpolation function is guaranteed to have a maximum [
19].
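One common way to interpolate the correlation function near its peak is a three-point parabolic fit, sketched below. It is only one of several possible subpixel techniques and is not necessarily the one used in the cited works; the function name `parabolic_peak_offset` is hypothetical. The curvature check corresponds to the requirement that the interpolation function has a maximum.

```python
def parabolic_peak_offset(c_left, c_peak, c_right):
    """Subpixel offset of the vertex of a parabola fitted through three
    correlation samples located at -1, 0 and +1 pixel around the integer peak."""
    curvature = c_left - 2.0 * c_peak + c_right
    if curvature >= 0.0:
        # The fitted parabola is flat or opens upwards, so it has no maximum.
        raise ValueError("interpolating parabola has no maximum")
    return 0.5 * (c_left - c_right) / curvature

# Samples of 1 - (x - 0.3)**2 at x = -1, 0, 1; the true peak is at x = 0.3.
offset = parabolic_peak_offset(-0.69, 0.91, 0.51)
```

The returned offset is added to the integer-pixel peak position along each image axis separately.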
Camera calibration is required to transform the displacements in the video (expressed in pixels) into physically meaningful information. In the simplest case, if the direction of the camera view is perpendicular to the filmed surface, then only a scaling factor is required. This can be determined from the known physical dimensions of the filmed structure and its pixel size in the video frames [
20]. When the camera axis cannot be positioned perpendicularly to the filmed object, then the tilt angle needs to be included in calculations, e.g., as proposed by Pen et al., where a laser rangefinder is additionally used in the camera calibration procedure [
21]. If three-dimensional (3D) displacement is to be measured, then a more sophisticated camera calibration procedure may be required. Park et al. proposed a methodology for the calculation of 3D displacements from two-dimensional displacements obtained with multiple cameras [
22]. The methodology requires a “T”- or “L”-shaped wand with attached markers to be placed in the field of view near the monitored structure in order to calibrate the cameras. Narazaki et al. proposed a model-informed approach for 3D displacement estimation with the use of a single camera. Instead of the wand or a calibration panel, it employs the known dimension of the structure associated with the selected points in the video frame [
9].
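For the simplest case mentioned above (camera axis perpendicular to the filmed surface), the calibration reduces to a single scaling factor. The sketch below assumes a known physical length and its measured length in pixels; the function names and numerical values are illustrative only.

```python
def scale_factor_mm_per_px(known_length_mm, length_in_pixels):
    """Scaling factor for a camera axis perpendicular to the filmed surface."""
    if length_in_pixels <= 0:
        raise ValueError("pixel length must be positive")
    return known_length_mm / length_in_pixels

def to_physical(displacements_px, scale):
    """Convert pixel displacements to millimetres."""
    return [d * scale for d in displacements_px]

# Hypothetical example: a 500 mm structural member spans 250 px in the frame.
scale = scale_factor_mm_per_px(500.0, 250.0)
displacements_mm = to_physical([0.5, -0.1], scale)
```

A tilted camera axis or 3D measurement requires the more elaborate procedures cited above and is not covered by this sketch.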
Another important issue in CV measurement is the camera motion caused by ground vibration. This problem is not always significant, as shown by Feng and Feng, where the error resulting from camera motion was negligible [
23]. However, camera motion can, in general, significantly decrease the measurement accuracy. Usually, reference targets that do not belong to the measured object are employed to estimate the camera motion, e.g., as shown by Yoneyama and Ueda [
24]. More than one reference target allows estimation of both translation and rotation of the camera, and thus more effectively reduces the related errors [
25]. Another kind of camera motion is present when the camera is mounted on a UAV, as shown in [
26]. In this work, a methodology based on the Fourier spectrum of the relative displacement of two adjacent points on the cable is proposed. This allows the influence of the UAV motion to be effectively reduced.
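The reference-target idea can be illustrated for the translation-only case. The sketch below assumes that the apparent displacements of stationary reference targets equal the camera motion and simply subtracts their average; compensating camera rotation would require fitting a rigid-body motion to several targets, which is not shown. The function name and synthetic data are hypothetical.

```python
import numpy as np

def compensate_camera_translation(target_px, reference_px):
    """Subtract the camera translation estimated from stationary reference targets.

    target_px:    (n_frames, 2) apparent displacement of the monitored point
    reference_px: (n_refs, n_frames, 2) apparent displacements of the references
    """
    target_px = np.asarray(target_px, dtype=float)
    reference_px = np.asarray(reference_px, dtype=float)
    camera_motion = reference_px.mean(axis=0)   # average over the reference targets
    return target_px - camera_motion

# Synthetic data: true structural motion plus a common camera shake.
n_frames = 5
true_motion = np.column_stack([np.linspace(0.0, 0.4, n_frames), np.zeros(n_frames)])
camera_shake = np.column_stack([np.full(n_frames, 0.3), np.linspace(-0.2, 0.2, n_frames)])
measured = true_motion + camera_shake
references = np.stack([camera_shake, camera_shake])
corrected = compensate_camera_translation(measured, references)
```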
There are also other sources of errors. In remote measurements from a significant distance, heat haze can distort the refractive index of the air and affect the measurement accuracy [
27]. The widely used complementary metal-oxide semiconductor (CMOS) sensors do not register the whole image at once, as opposed to charge-coupled device (CCD) sensors, but they usually scan the image row-by-row or column-by-column. This can result in the rolling shutter effect if relatively high-speed motion is recorded. Lee proposed a real-time algorithm for video stabilization and compensation of the rolling shutter effect to be used with low-cost consumer cameras [
28]. If the monitored structure vibrates with a frequency higher than half of the camera sampling frequency, temporal aliasing occurs. This can often be noticed with consumer cameras, which usually have a lower sampling frequency than professional high-speed cameras. Moreover, consumer cameras can have an inaccurate sampling frequency, i.e., different from the one declared in the camera specifications. These two issues appeared in a low-cost CV identification system proposed by Yoon et al., as discussed in [
29]. The small focal length common in consumer-grade smartphone cameras causes bigger displacement estimation errors for a fixed distance [
16]. However, a smaller focal length also extends the field of view of the camera; thus, the camera can be moved closer to the recorded object while still registering all the measurement points [
30]. This allows a more favourable trade-off between the resolution and the field of view. On the other hand, the smaller pixel size of low-cost cameras makes them more sensitive to changes in illumination in field measurements.
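The temporal aliasing mentioned above can be quantified with a short calculation: a sinusoidal component above half of the sampling frequency is folded back into the band below it. The function name and the example frequencies are illustrative assumptions.

```python
def aliased_frequency(f_signal, f_sampling):
    """Apparent frequency of a sinusoid of frequency f_signal after sampling at f_sampling."""
    f = f_signal % f_sampling
    return min(f, f_sampling - f)

# A 50 Hz vibration recorded at 60 frames per second appears at 10 Hz,
# whereas a 20 Hz vibration (below the 30 Hz Nyquist frequency) is unaffected.
apparent_high = aliased_frequency(50.0, 60.0)
apparent_low = aliased_frequency(20.0, 60.0)
```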
Owing to the above-mentioned advantages of low-cost cameras, especially smartphone cameras, they are frequently used for estimation of displacements and identification of structural dynamic properties [
31,
32,
33,
34]. Min et al. showed that CV measurements with a smartphone camera can have accuracy comparable to that of a laser displacement sensor (LDS) [
31]. Li et al. indicated that a smartphone camera allows interstory drift of buildings to be accurately measured during an earthquake [
33]. The results obtained with the tested smartphones were comparable with those of an LDS, too. Zhu et al. compared the natural frequencies identified with the aid of a smartphone to those obtained using an accelerometer and simulations [
34].
The dynamic displacement estimation techniques discussed above are often used for identification of modal parameters of structures, which can subsequently be used for model updating. Among the variety of methods for experimental modal analysis, stochastic subspace identification using time-domain data (SSI-DATA) is widely recognised as accurate and robust [
35]. It is designed for system identification when excitations are unknown; hence, it is suitable for CV measurements.
Experimentally identified modal parameters are often used in the model updating procedure. Model updating generally has one of two aims: (1) calibration of the model, which aims to provide accurate reproduction of the output of the real system, or (2) identification of unknown structural parameters [
36].
It is known that the results of a model updating procedure are affected not only by the measurement noise and modelling errors but also by ill-conditioning of the problem, and thus numerical regularisation of the solution (or imposing additional constraints) enhances the quality of the solution. In the widely used method based on modal sensitivity, which is also called mode-matching, regularisation is ensured by an additional term in the objective function that involves prior knowledge about the structural parameters being identified [
37,
38]. Conditioning of the problem can be enhanced by proper selection of the measurement locations as well as of the number of sensors and measured vibration modes [
39,
40]. In the case of CV measurement, the measured locations can be selected even after the measurement session. Blachowski also noted that the accuracy of the results and the sparsity of the solution were improved by using an additional constraint, which ensured that the parameters representing stiffness (and thus representing the unknown structural damage) could only decrease during the optimization process based on modal sensitivities [
39]. Additionally, the unknown parameters are not identifiable if the corresponding parameterized structural members are not involved in the structural response. This happens, for example, if the vibration modes that participate in structural vibration do not involve strains of the elements that are to be identified [
41].
The literature review shows that significant effort has been devoted to enhancing the performance of CV measurements. A similar trend can be observed in SHM and CV-SHM. Similarly, an increase in the number of studies that employ low-cost cameras in smartphones can be observed due to the growing potential of cost-effective measurements. However, to the knowledge of the authors, there are still only a limited number of works that discuss the influence of limitations and measurement errors of computer vision equipment on the accuracy of structural identification procedures. In particular, low-cost consumer cameras and their use in ill-conditioned identification problems with a limited amount of measurement data have not yet been fully investigated.
The present study investigates the influence of CV measurement uncertainties and equipment (a CMOS smartphone camera) limitations on the accuracy of: dynamic displacement estimation, identification of modal parameters and the model updating procedure. The paper is structured as follows. In
Section 2, the investigated frame structure and the corresponding parameterized finite element (FE) model are described.
Section 3 describes the methodology: the employed CV displacement estimation with the aid of a smartphone camera, identification of modal parameters, and modal-sensitivity-based identification of the unknown parameters that describe the stiffness of structural bolted connections. In
Section 4, benchmark accelerometer-based modal parameters are described and compared with the CV measured data and the identified parameters, including the identification uncertainties.
Section 5 discusses the results and applicability of the investigated methodology. The conclusions are summarised in
Section 6.
4. Comparison of Dynamic Displacements and Identified Parameters
In this section, the results of the methodology described in previous sections based on CV measurement with the aid of a smartphone camera are elaborated. First, the benchmark data based on accelerometers are described. Next, the accuracy of CV measurement with the aid of a smartphone camera is demonstrated. Subsequently, the identified modal data obtained with the smartphone camera and accelerometers are compared. Finally, model updating based on these two data sets is performed, and the results are compared.
During the tests, Nodes 13, 16 and 19 were rejected because the retraction of the rubber hammer after the hit contaminates the measurement with the hammer’s shadow. This phenomenon is demonstrated in
Figure 7. The other tested feature-based template matching methods, e.g., orientation code matching (OCM) and KLT, exhibit a lack of robustness with respect to this perturbation similar to that of the ZNCC-based method. Thus, 26 nodes are finally available for CV measurement, and the identified state space has 52 DOFs.
In this section, accelerometer-based identified modal parameters are marked with the subscript “acc”, e.g., and , whereas the CV parameters are marked with the subscript “cv”, e.g., and .
4.1. Reference Data Obtained with Accelerometers
Accelerometer-based modal parameters were identified using the SSI method with the aid of an LMS-SCADAS system and the LMS Test.Lab software. All 26 locations of bidirectional accelerometers were selected to provide the results for comparison with the CV measurements. Accelerometers B&K 4507 B 004 were used in this study. Five experimental sessions were conducted, resulting in five data sets. The natural frequencies of these data sets are shown in
Table 2. Data Set #1 was obtained using the impact testing technique, and only the natural frequencies were obtained. Data Sets #2 and #3 were obtained using the impact testing–roving hammer technique. Data Set #4 was obtained using the impact testing–roving accelerometer technique. Data Set #5 was obtained using modal shakers with the roving accelerometer technique. The modal shakers cannot work at frequencies below 60 Hz, and hence, the first two modes in the last data set were not identified.
The modal parameters used later for model updating are calculated by averaging over all data sets; see
Table 2 and
Figure 8.
The normalised standard deviation
presented in
Table 2 is a metric of uncertainty of the
mth mode shape, and it is calculated as follows:
where
is the
ith element of the
mth mode shape, and
is the mean standard deviation:
while the standard deviation of the
ith measured DOF of the
mth mode shape
is estimated as follows:
The physical sense of
is that a substitution of
in Equation (
12) with
results in the diagonal elements associated with the DOFs of the
mth mode being equal to
.
4.2. CV Measurement of Dynamic Displacements
In this subsection, the accuracy of CV measurement of dynamic displacement is demonstrated. The CV measurement of transversal displacement of Node 2 (
Figure 5) is compared with the data obtained with the LDS, the accelerometer and their fusion. To this end, a Baumer 2016160/S14F LDS and a Tektronix TDS 2004C digital oscilloscope are also used to record the time-domain data.
The vector containing the displacement time series
resulting from the data fusion is found by solving the optimization problem
[
48]:
where the objective function is defined as follows:
In Equation (
20),
is a diagonal weighting matrix whose diagonal contains only ones and
in the first and last element;
is the second order differential operator matrix;
is the time step, equal here to 2 ms;
is the time series of acceleration measured by the accelerometer;
is a weighting coefficient, selected here to be equal to 0.4; and
is the displacement time series measured by the LDS. The problem
is solved directly:
The result is visualised in
Figure 9 along with the results obtained from the accelerometer, the LDS and the smartphone camera. It is evident that the differentiated LDS-based data are much noisier than the accelerations measured directly by the accelerometer. On the other hand, the displacement obtained from the data fusion stays in good agreement with both the LDS-based data and the accelerometer-based data; see
Figure 9a. All the measured displacements compared in
Figure 9b are in good agreement. The displacement estimated using the smartphone camera seems to be slightly more distorted than the other results, which are almost identical; however, the error remains at a very satisfactory level. The error of the CV measurement with respect to the fused results can be expressed as:
where
means the standard deviation, and
is the time series of the CV displacement. The analogously calculated error for the LDS-based data is
The errors
and
are calculated for a time period of 1 s duration. The time series
and
are interpolated onto the time steps of
with the use of spline functions to allow appropriate calculations in Equation (
22). The error of the CV measurement is two times greater than that of the LDS.
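The fusion of acceleration and LDS displacement described above can be sketched as a regularised least-squares problem with a direct solution. The sketch is written under stated assumptions and is not the authors' implementation: the objective is taken as 0.5·||W(D u − Δt² a)||² + 0.5·λ²·||u − u_LDS||², the second-order difference operator D acts on interior samples only, and the first and last diagonal entries of W are assumed to be 1/√2. The names `fuse_displacement`, `lam` and the synthetic signals are hypothetical.

```python
import numpy as np

def fuse_displacement(accel, disp_lds, dt, lam):
    """Fuse acceleration and displacement time series by regularised least squares."""
    accel = np.asarray(accel, dtype=float)
    disp_lds = np.asarray(disp_lds, dtype=float)
    n = disp_lds.size
    # Second-order central-difference operator on interior samples, shape (n-2, n).
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = (1.0, -2.0, 1.0)
    # Diagonal weighting: ones, with 1/sqrt(2) at the first and last entry (assumed).
    w = np.ones(n - 2)
    w[0] = w[-1] = 1.0 / np.sqrt(2.0)
    L = w[:, None] * D
    # Normal equations of the objective, solved directly.
    rhs = L.T @ (w * accel[1:-1]) * dt**2 + lam**2 * disp_lds
    return np.linalg.solve(L.T @ L + lam**2 * np.eye(n), rhs)

# Synthetic check: a 10 Hz vibration, exact acceleration and a noisy displacement record.
rng = np.random.default_rng(1)
dt = 0.002                                   # 2 ms time step, as in the text
t = np.arange(500) * dt
omega = 2.0 * np.pi * 10.0
u_true = np.sin(omega * t)
accel = -omega**2 * u_true
u_lds = u_true + 0.05 * rng.standard_normal(t.size)
u_fused = fuse_displacement(accel, u_lds, dt, lam=0.4)

def rms(x):
    return float(np.sqrt(np.mean(x**2)))

error_lds = rms(u_lds - u_true)
error_fused = rms(u_fused - u_true)
```

In this synthetic case the fused displacement follows the displacement record at low frequencies and the acceleration record at high frequencies, so broadband noise in the displacement record is attenuated.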
Amplitude spectra for accelerations and displacements of Node 2, as obtained with various sensing techniques, are shown in
Figure 10a,b, respectively. For calculation of the amplitude spectra, signals of 4.5 s duration are taken into account with their original sampling frequencies, i.e., 500 Hz for the accelerometer, LDS and their fusion, and 240 Hz for the smartphone camera. It is evident that both the acceleration and the displacement amplitude spectra are in good agreement in the vicinity of the first two natural frequencies. The LDS and CV accelerations become noisier above 40 Hz because they are calculated from displacements that are very small in this frequency range (higher-order modes need more energy to be excited, and they are usually characterised by significantly higher damping factors than lower-order modes). The fused acceleration data exhibit a trade-off expressed by the weighting coefficient
; see Equation (
20). In the frequency range 40–65 Hz, the fused data have values between those of the accelerometer and the LDS-based (and CV) data, whereas above this range they are in good agreement with the accelerometer-based data, which are less noisy.
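A one-sided amplitude spectrum of signals recorded with different sampling frequencies can be computed as sketched below. The scaling convention (amplitude of a sinusoid recovered at its spectral line) and the function name `amplitude_spectrum` are assumptions; the 10 Hz test signal is synthetic and only mimics the 4.5 s duration and the two sampling frequencies mentioned above.

```python
import numpy as np

def amplitude_spectrum(x, fs):
    """One-sided amplitude spectrum of a real signal sampled at fs (in Hz)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    amp = np.abs(np.fft.rfft(x)) / n
    amp[1:] *= 2.0            # fold negative frequencies into the one-sided spectrum
    if n % 2 == 0:
        amp[-1] /= 2.0        # the Nyquist bin has no negative-frequency counterpart
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amp

peaks = {}
for fs in (500, 240):
    t = np.arange(int(4.5 * fs)) / fs
    x = np.sin(2.0 * np.pi * 10.0 * t)
    freqs, amp = amplitude_spectrum(x, fs)
    peaks[fs] = (float(freqs[np.argmax(amp)]), float(amp.max()))
```

Both records have the same duration, hence the same frequency resolution, which makes the spectra directly comparable up to the lower of the two Nyquist frequencies.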
The data shown in
Figure 10 confirm that only three in-plane vibration modes can be identified below the Nyquist frequency of the CV measurement (120 Hz) because no further mode appears in the acceleration spectra. The next in-plane mode of the investigated structure is present at 226 Hz [
45].
The data fusion results are close to both input accelerations as measured by the accelerometer and as obtained from the LDS-based displacements. This suggests that the accelerometer-based measurement provides reliable benchmark data for identification of modal parameters (
Section 4.3) and, thereupon, for model updating (
Section 4.4).
4.3. Identified Natural Frequencies and Mode Shapes
In this subsection, modal parameters are described and discussed, as identified from the CV measurement data using the SSI-DATA method and the stabilization diagrams. Four videos have been recorded. A recording duration of 8.5 s is selected for each video, and it includes nearly 80 periods of vibration of the first mode.
The stabilization diagram is constructed as described in
Section 3.3 with the model orders ranging from 1 to 104, which is twice the number of the measured outputs. The coefficient values
,
,
,
,
and
are used. Among the four recorded videos, only one allows three vibration modes to be identified. The other three videos allow only the first two vibration modes to be identified.
An example of the stabilization diagram for the fourth video is shown in
Figure 11. The stabilization diagrams also include out-of-plane modes, which are rejected because only the in-plane modes are considered in this study, as shown in
Figure 11. Since the displacements in the third dimension are not measured by the smartphone camera, the in-plane modes are selected according to the MAC between the CV modes and the accelerometer-based modes, which is required to exceed 0.7. In general, numerically calculated modes can also be used as the reference if accelerometer-based modal parameters are not available.
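The MAC-based selection can be sketched as follows, using the standard definition of the modal assurance criterion and the threshold of 0.7 quoted above. The function names and the three-element mode shapes are hypothetical and serve only as an illustration.

```python
import numpy as np

def mac(phi_a, phi_b):
    """Modal assurance criterion between two (real or complex) mode shapes."""
    phi_a = np.asarray(phi_a)
    phi_b = np.asarray(phi_b)
    num = abs(np.vdot(phi_a, phi_b)) ** 2
    den = np.vdot(phi_a, phi_a).real * np.vdot(phi_b, phi_b).real
    return float(num / den)

def select_modes(candidate_modes, reference_modes, threshold=0.7):
    """Indices of candidate modes whose best MAC against any reference exceeds the threshold."""
    kept = []
    for i, phi in enumerate(candidate_modes):
        if max(mac(phi, ref) for ref in reference_modes) > threshold:
            kept.append(i)
    return kept

reference = [[1.0, 2.0, 3.0], [1.0, 0.0, -1.0]]        # reference in-plane mode shapes
candidates = [[1.1, 1.9, 3.0], [1.0, -2.0, 1.0], [-1.0, 0.1, 1.0]]
kept = select_modes(candidates, reference)
```

The MAC is insensitive to the scaling and sign of the mode shapes, so the third candidate is matched to the second reference mode despite its opposite sign.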
It is evident that the use of methods intended to cluster stable poles and to reject spurious modes, as described in
Section 3.3, yields clearly identifiable modal data, as shown in
Figure 11.
The CV natural frequencies and the calculated uncertainties are listed in
Table 3. The final values of the identified modal parameters
and
,
are calculated as the mean of the data available in all four videos. The uncertainties are calculated analogously to the accelerometer-based modal parameters; see
Table 2 and Equations (
17)–(
19).
The uncertainties in the first two CV modal parameters are smaller than those of the accelerometer-based modal parameters. Due to a possible systematic error, this does not necessarily mean that the CV measurement is more accurate, but it demonstrates that the CV data are more consistent. In fact, this error can be characterised as the bias error: the natural frequencies of the CV-identified modes seem to be underestimated, especially the third one. This is clearly demonstrated in
Table 4. The underestimation error increases nearly proportionally to the identified natural frequency. Although the MAC is lowest for the third mode (0.93), it remains at a satisfactory level above 0.9 for all the vibration modes.
A comparison of the mode shapes obtained with the accelerometers and the smartphone camera is shown in
Figure 12. The high MAC values are reflected in the high similarity between the mode shapes identified using both methods. The third CV mode shape seems the noisiest; however, the mode shape is still properly reflected and well-correlated with the accelerometer-based result.
4.4. Identified Stiffness Parameters
This subsection presents and compares the results of model updating for accelerometers and CV data. The weighting matrix
is selected in accordance with Equation (
12) with the parameters
and
both for CV- and accelerometer-based identified modal parameters. These values correspond with the COVs and NSDs shown in
Table 2 and
Table 3, and they are not far from typical values. Using the COVs and NSDs estimated directly from the available data sets would make comparison of the results difficult, since the COV and NSD for the third mode obtained with the smartphone camera are not available.
The initial values of the unknown parameters in are assumed to be equal to one. The prior covariance matrix is assumed to be diagonal, with all elements on the diagonal equal to the prior variance . Consequently, assuming Gaussian distribution, each unknown parameter is within the interval with a probability of 95%.
The scaling factor is selected with the trial-and-error method. The model updating procedure is stopped when all unknown parameters differ from the corresponding values in the previous iteration by less than one percent. The model updating procedure is performed for three cases: when only the first mode (natural frequency and mode shape) is identified and available, when the first two identified modes are available, and when all three modes are available.
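The iterative procedure outlined above can be illustrated on a toy problem. The sketch below is not the FE model or the exact objective of this study: it assumes a two-degree-of-freedom spring-mass chain with two stiffness multipliers, uses natural frequencies only, approximates the modal sensitivities by finite differences, and applies identity weighting to the relative residuals. The names `natural_frequencies`, `update_model`, `alpha` (scaling factor of the prior term) and `prior_var` are hypothetical. The initial values of one and the one-percent stopping criterion follow the text.

```python
import numpy as np

def natural_frequencies(theta, k0=1.0e4, m=1.0):
    """Natural frequencies (Hz) of a 2-DOF chain with stiffnesses theta[0]*k0 and theta[1]*k0."""
    k1, k2 = theta[0] * k0, theta[1] * k0
    K = np.array([[k1 + k2, -k2], [-k2, k2]])
    return np.sqrt(np.linalg.eigvalsh(K / m)) / (2.0 * np.pi)

def update_model(f_measured, theta0, prior_var=0.25, alpha=1e-4, tol=0.01, max_iter=50):
    """Regularised Gauss-Newton updating of the stiffness multipliers."""
    theta0 = np.asarray(theta0, dtype=float)
    theta = theta0.copy()
    prior_inv = np.eye(theta.size) / prior_var        # inverse of the diagonal prior covariance
    h = 1e-6                                          # finite-difference step
    for _ in range(max_iter):
        f = natural_frequencies(theta)
        residual = (f_measured - f) / f_measured      # relative frequency residuals
        S = np.empty((f.size, theta.size))            # sensitivity of the relative frequencies
        for j in range(theta.size):
            step = np.zeros_like(theta)
            step[j] = h
            S[:, j] = (natural_frequencies(theta + step) - f) / (h * f_measured)
        A = S.T @ S + alpha * prior_inv
        b = S.T @ residual - alpha * prior_inv @ (theta - theta0)
        theta_prev = theta
        theta = theta + np.linalg.solve(A, b)
        # Stop when every parameter changes by less than one percent.
        if np.all(np.abs(theta - theta_prev) < tol * np.abs(theta_prev)):
            break
    return theta

theta_true = np.array([0.8, 0.6])
f_measured = natural_frequencies(theta_true)          # noise-free synthetic "measurement"
theta_identified = update_model(f_measured, [1.0, 1.0])
```

The prior term keeps the normal equations well conditioned; with a larger `alpha`, the identified parameters are pulled more strongly towards their initial values, which is the trade-off that motivates a trial-and-error selection of the scaling factor.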
Comparison of the convergences for the CV- and accelerometer-based data when all three identified modes are available and two stiffness parameters are used (see
Figure 3a) is shown in
Figure 13. For a single stiffness parameter (see
Figure 3b), the corresponding convergences are shown in
Figure 14.
For both CV- and accelerometer-based data, the values of the unknown parameters converge without any numerical difficulties. The errors between the identified and numerical modal parameters decrease for both the CV- and accelerometer-based data, as well as for both parameterizations (two unknown parameters and a single unknown parameter).
The errors in the numerical modal parameters obtained for the CV measurement data are greater than the errors obtained with the accelerometer-based data, both for the single unknown parameter and for the two unknown parameters used in model updating. There are two reasons for this. The first is that the initial FE model overestimated natural frequencies, and thus, the lower natural frequencies identified with the smartphone camera increase this error. Hence, the discrepancy between the numerical and the CV-identified frequencies is greater than the corresponding discrepancy between the numerical and the accelerometer-based identified frequencies, as demonstrated in
Figure 13b and
Figure 14b. The second reason is that the CV measurement data are generally expected to have greater errors, as visible especially in
Figure 12c. Thus, even for the updated FE model, the MAC values between the numerical and CV-identified mode shapes are lower than the corresponding values for the accelerometer-based mode shapes (
Figure 13c and
Figure 14c). Nevertheless, all MAC values remain at a satisfactory level above 0.9. Additionally, in the case of the CV data, the lower identified natural frequencies result in a slightly lower level of the unknown parameters (
Figure 13a and
Figure 14a).
A comparison of the unknown parameters
and
for all three cases of the available measurement data, estimated based on the CV and accelerometer measurement data, together with the corresponding standard deviations, is shown in
Figure 15. The analogous comparison for the reduced parameterization with a single parameter
is shown in
Figure 16. The estimated standard deviation
of the unknown parameter is calculated as the square root of the corresponding diagonal element of the matrix
; see Equation (
16). To this end, the matrix
is employed, which is calculated as shown in Equation (
12) with
and
, since Equation (
16) is true only if the weighting matrices in the model updating procedure are equal to the reciprocals of the corresponding covariance matrices (see Equation (
8)). Such a covariance matrix gives information about parameter uncertainty for typical measurement variances.
It is evident that the results depend on the number of available measured modes. For both the CV- and accelerometer-based data, the unknown parameters have different values in each measurement data case. At the same time, the standard deviations of the unknown parameters decrease as the amount of available measurement data increases. These observations hold for both considered parameterizations (two unknown parameters and a single unknown parameter). However, the single unknown parameter seems to be less sensitive to the amount of the measurement data, and it has a lower standard deviation than the corresponding results obtained for the two parameters
and
. In particular, as shown in
Figure 15a, the unknown parameter
, estimated when only the first identified mode is available, has a value significantly different from that in the other data cases. This is because a single mode provides insufficient information to precisely estimate two unknown parameters. In other words, the greater complexity of the model and a smaller amount of the available measurement data tilt the bias–variance trade-off towards increased variance. This is also visible in the corresponding significantly higher variance of this parameter (
Figure 15b).
The parameter
tends to be lower than
(
Figure 15a,e) due to the different curvature of the adjacent conical surfaces involved in the vertical and horizontal bolted connections (
Figure 2b). However, for both parameters, the estimation accuracy strongly depends on the available measurement data. The differences between the individual results of the model updating procedure are larger than suggested by the calculated standard deviations of these parameters.
The considerations above refer both to the CV- and accelerometer-based measurement data. Moreover, the number of available identified modes has a greater influence on the values of the unknown parameters than the errors in the CV identification of the modal parameters described in
Section 4.3, including the significant bias error of the natural frequencies.
A comparison of the error metrics between the numerical and accelerometer-based identified modal parameters with the corresponding error metrics obtained for the CV-identified parameters for various available identified modes when the FE model is updated with the two unknown parameters
and
is shown in
Figure 17. An analogous comparison for the FE model parameterized with one parameter
is shown in
Figure 18.
It is evident that the errors between the numerical and identified modal parameters for the simplified parameterization of the FE model (
Figure 18) are only slightly higher than for the parameterization with two parameters (
Figure 17), whereas they provide significantly lower variances of unknown parameters and lower sensitivity to the availability of the modal parameters. For both parameterizations, all obtained relative errors of natural frequencies, except one (Mode 1 shown in
Figure 17e), are below the level of 10%. Similarly, all MAC values are well above 0.9, which is a satisfactory result. For both parameterizations, the MAC calculated for the third CV-identified mode shape has the worst value due to the significant noise affecting this mode shape (
Figure 12c).
5. Discussion of the Results
In this subsection, the obtained results are discussed in the context of their applicability both for real-world structures and laboratory experiments.
The third CV-identified mode has a natural frequency above 1/2 of the Nyquist frequency due to the limited sampling frequency of the smartphone camera. This results in considerable systematic error. The third mode shape also becomes noisy. However, real-world structures are often characterised by natural frequencies of much lower values; hence, they are less subject to estimation error when a camera with a limited sampling frequency is used. In [
26], cameras mounted on a UAV and on a tripod recorded cable vibration with a sampling frequency of 60 Hz and a resolution of 3096 × 2160 px, and with a sampling frequency of 25 Hz and a resolution of 2048 × 2048 px, respectively. The natural frequencies of 1.03, 3.05 and 3.17 Hz obtained with the tripod-mounted camera were in satisfactory agreement with accelerometer-based data obtained with a sampling frequency of 50 Hz. The UAV-mounted camera measured the natural frequencies of higher-order modes as 9.41, 10.43 and 11.43 Hz. In this case as well, satisfactory agreement with accelerometer-based data was obtained, although the lower-order modes were omitted because they were affected by the low-frequency motion of the hovering UAV. Both cameras and the accelerometers provided consistent estimates of the cable forces. The smartphone used in the present research can be set to the normal recording mode, which allows recording in 4K resolution with a sampling frequency of 60 Hz, i.e., the same as the UAV camera in [
26]. The capability of the smartphone camera to measure low-frequency oscillations up to 2 Hz with a sampling frequency of 30 Hz was also confirmed in [
33], where a smartphone was proposed as a low-cost device to measure the interstory drift of buildings subjected to earthquakes. In the present study, the first two CV-identified vibration modes of the investigated structure, whose natural frequencies are far below the Nyquist frequency of 120 Hz, are also in good agreement with the accelerometer-based modal parameters. It follows that consumer-grade electronics are suitable for measurements of flexible structures whose natural frequencies of interest are much lower than the Nyquist frequency. If this requirement is satisfied, the smartphone camera can also be used for 3D displacement measurement with the method proposed by Narazaki, since the projection of 2D displacements to 3D is a postprocessing procedure [
9].
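The frequency requirement discussed above can be checked with a short Python sketch that relates each reported natural frequency to the Nyquist frequency of the corresponding camera. The sampling and natural frequencies are those quoted in this section; the limit of 0.5 is an assumption based only on the observation made here for the third mode, not a general rule, and the names are illustrative.

```python
def nyquist_ratio(natural_frequency_hz, sampling_frequency_hz):
    """Ratio of a natural frequency to the Nyquist frequency (half the sampling rate)."""
    nyquist_hz = sampling_frequency_hz / 2.0
    return natural_frequency_hz / nyquist_hz


# Highest natural frequency reported for each setup, with its sampling frequency.
cases = {
    "tripod camera, 25 Hz": (3.17, 25.0),
    "UAV camera, 60 Hz": (11.43, 60.0),
    "smartphone, 30 Hz": (2.0, 30.0),
}

# Assumed limit: modes above half of the Nyquist frequency are treated as unreliable.
RATIO_LIMIT = 0.5

for label, (f_n, f_s) in cases.items():
    ratio = nyquist_ratio(f_n, f_s)
    status = "acceptable" if ratio < RATIO_LIMIT else "likely affected by error"
    print(f"{label}: ratio = {ratio:.2f} ({status})")
```

All three reported cases fall below the assumed limit, which is consistent with the satisfactory agreement described for them.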
Regarding the parametric identification of the structural properties, in the investigated case, the number of available modal parameters used in the optimization procedure affects the results more than the uncertainties resulting from CV measurement. A similar observation was made by Blachowski, who indicated that an increase in the number of measured modes is more beneficial than an increase in the number of measurement locations [
39]. This is one more argument in support of consumer-grade cameras being suitable for large-scale or flexible structures, whose lower-order natural frequencies are usually low, so that more modes can be identified within the limited sampling frequency. The model updating method adopted in this study is widely accepted, and it can be used for various types of structures if the FE model of the structure is available. For example, considering again the work in [
26], the methodology proposed in the present paper could be implemented to monitor the cable condition. If only natural frequencies are to be identified, then the modal sensitivity matrix contains only the related rows, without the entries related to the mode shapes; see Equation (
14). A single unknown parameter scaling the stiffness of a particular cable is sufficient to determine the cable condition. Hence, the problem would be strongly overdetermined. The methodology for monitoring the cable condition proposed in [
26] does not require an FE model, but a methodology based on model updating would enable monitoring not only of several cables at once but also of all parts of the structure visible to the camera. As shown by Blachowski et al., modal-sensitivity-based model updating can be used with a relatively large number of unknown parameters. If still required, the number of updated FEs (corresponding to the monitored structural members) can be reduced by investigating their influence on the modal parameters after the modal sensitivity matrix has been calculated. Columns of this matrix that indicate a low influence on the natural frequencies (small eigenvalue derivatives) can be rejected, because the corresponding elements do not transfer significant structural loads and their monitoring is therefore of lower importance. Including these aspects in the presented methodology may facilitate CV-SHM of large-scale structures based on consumer-grade, low-cost devices.
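The rejection of low-influence columns described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the rows of the matrix are taken to hold eigenvalue derivatives for the identified modes, the columns correspond to the updated parameters, and the column norm and the relative threshold of 5% are hypothetical choices that are not specified in the paper.

```python
import numpy as np


def select_influential_parameters(eigenvalue_sensitivity, relative_threshold=0.05):
    """Keep parameters whose sensitivity column is not negligible.

    eigenvalue_sensitivity: array of shape (n_modes, n_parameters) holding
        the derivatives of the eigenvalues with respect to each parameter.
    relative_threshold: assumed cut-off, relative to the largest column norm.
    Returns the indices of the kept parameters and the reduced matrix.
    """
    S = np.asarray(eigenvalue_sensitivity, dtype=float)
    column_influence = np.linalg.norm(S, axis=0)  # one value per parameter
    keep = column_influence >= relative_threshold * column_influence.max()
    return np.flatnonzero(keep), S[:, keep]


# Illustrative matrix: two modes, three parameters; the second parameter
# barely affects the eigenvalues and is therefore rejected.
S_example = np.array([[4.0, 0.01, 2.0],
                      [3.0, 0.02, 1.0]])
kept_indices, S_reduced = select_influential_parameters(S_example)
```

In practice, the cut-off would have to be chosen with regard to the structure at hand, since a rejected parameter is no longer monitored.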
Recently, machine learning and deep learning approaches have become increasingly popular in SHM. Artificial neural networks can enhance the performance of SHM techniques, especially when the monitored structure is large-scale and exhibits a nonlinear relation between the damage and the measured output [
49]. The methods employed in the present study are suitable and efficient for linear problems. Large-scale FE models can be reduced with, e.g., the dynamic reduction method or the system equivalent reduction expansion process (SEREP) [
37].