1. Introduction
Bolted steel joints are among the most common types of steel connections and have been widely applied in various civil structures such as buildings and bridges. Over long service periods, these joints are prone to structural damage due to self-loosening of the bolts, which is mainly caused by repetitive loads and/or vibrations. Bolt loosening leads to a loss of clamping force acting on the joint, further causing stiffness degradation and potentially structural failure [1]. Therefore, monitoring the health condition of bolts in a timely fashion is essential for structural integrity, as appropriate replacements or retrofits can then be applied before the steel joints reach critical conditions.
Human visual inspection has been commonly applied for detecting bolt loosening in civil structures. For example, the Federal Highway Administration (FHWA) [2] requires routine inspections at two-year intervals for highway bridges in the United States. Trained bridge inspectors visually detect and record various structural defects in bridges, including bolt loosening, during these inspections. However, human inspection is labor intensive and less effective, as bolts may become loose between inspections. More importantly, inspection results may contain errors due to inconsistencies among inspectors in inspection skill and in the ability to interpret data. For example, Graybeal et al. [3] performed an experimental study to evaluate the inspection skills of bridge inspectors using bridges in South Central Pennsylvania and Northern Virginia in the United States. For a particular bolt loosening defect, only 19 out of 42 inspectors successfully identified the deficiency.
Advanced bolt loosening detection technologies have been developed in the fields of both structural health monitoring (SHM) and nondestructive testing (NDT). Yang and Chang [4] proposed an attenuation-based diagnostic method through an ultrasonic technique to detect bolt loosening in space operation vehicles. Zhao et al. [5] adopted a piezoelectric-based sensor network for identifying damage of a riveted panel in an aircraft wing. A similar piezoelectric-based approach was applied by Okugawa [6] through a smart washer. In addition, Wu et al. [7] developed a radio-frequency identification (RFID)-based sensing method to detect bolt loosening in coal mine structures. Nevertheless, the success of these methods relies on extensive human operations and/or sensor deployments, which can be costly and inflexible for rapid inspections of bolted steel joints in civil structures. Reference [8] offers a comprehensive literature review of various technologies for bolt loosening detection.
Computer vision-based technologies have received significant attention in the SHM community due to the benefits of being low cost, easy to deploy, and contactless. Several vision-based approaches have been reported for monitoring health conditions of civil structures on both global and local scales. Some of the recent applications of vision-based SHM include structural system identification [9,10,11], displacement monitoring [12,13,14], post-disaster structural classification [15], damage detection [16], and fatigue crack identification [17,18]. Furthermore, when equipped with autonomous platforms such as unmanned aerial vehicles (UAVs), vision-based SHM could bring higher flexibility and cost effectiveness to structural inspections. Recently, UAVs have been applied in structural system identification [19], structural inspection of buildings [20], water treatment plants [21], bridges [22], and so forth. A state-of-the-art review of vision-based SHM in civil structures can be found in [23].
Despite the recent successes of vision-based SHM, limited work is found in the context of vision-based bolt loosening detection. The early related work reported in the literature was performed by Park et al. [24] and Park et al. [25]. In these two studies, a Hough transform-based edge detection technique was adopted to extract the boundaries of the bolt nuts in multiple images. A loosened bolt can then be detected by identifying the rotation of these boundaries. However, the described approaches may rely on extensive operations for comparing the nut boundaries (i.e., edges) before and after the nut rotation, which would limit their flexibility for automatically processing a large volume of images. Cha et al. [26,27,28] integrated machine learning with vision-based bolt dimension extraction, introducing a robust method for detecting loosened bolts. The study in [26] reported a detection accuracy of 87.5% based on a small set of training images, including four loosened and four tightened bolts. Nevertheless, prior knowledge about the damage state of the bolt (i.e., classifications of loosened bolts and tightened bolts) is needed to train the classifier, and the training procedure would have to be repeated for new types of bolts with different dimensions or shapes.
This paper presents a new vision-based bolt loosening detection method that uses image registrations. Instead of finding the rotation of the nut’s boundaries [24] or building classifiers based on the dimensions of the bolt head [26], directly mapping images at different inspection periods into the same coordinate system and uncovering differential features caused by the loosened bolt could form a more straightforward solution. Compared with previous vision-based bolt loosening detection methods, our approach does not require extensive operations for finding the rotation of the nut’s boundaries and does not require prior knowledge about the monitored structure (such as bolt types) or damage states of the bolt. In these regards, our approach would be more flexible and cost effective for engineering applications. The detection results of this approach are also presented for easy interpretation such that direct actionable decisions can be made to conduct condition-based maintenance procedures.
The rest of this paper is organized as follows: Section 2 presents the proposed methodology along with its technical details; Section 3 validates the approach through three experimental tests; Section 4 further investigates the robustness of the proposed approach; Section 5 discusses the computational cost and limitations; Section 6 concludes this study.
2. Methodology
Figure 1 demonstrates the overall methodology of our approach, with detailed discussions provided in the rest of this section. As an illustration, suppose a bolted steel joint, termed the monitored structure in Figure 1a, is evaluated at two inspection periods and Bolt 2 is loosened during the inspection interval. Two input images, denoted as Image 1 and 2, are collected by a digital camera at the two inspection periods. Since the camera poses of the two images would not necessarily be identical, directly identifying the rotated bolt by overlapping the two input images would be challenging. This is confirmed by the intensity comparison between Image 1 and 2 shown in Figure 1d, in which intensities of exactly matched pixels are 0 (black) and intensities of unmatched pixels lie in the range of 1 to 255 (grey to white), depending on the level of their discrepancies.
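For illustration, the following MATLAB snippet is a minimal sketch of such an intensity comparison using the Image Processing Toolbox; the filenames are hypothetical placeholders for the two input images, which are assumed to share the same resolution.
% Minimal sketch of the intensity comparison in Figure 1d (assumed filenames).
img1 = rgb2gray(imread('image1.jpg'));   % Image 1, first inspection period
img2 = rgb2gray(imread('image2.jpg'));   % Image 2, second inspection period
% Per-pixel absolute difference: 0 (black) for exactly matched pixels,
% up to 255 (white) for strongly mismatched pixels.
err12 = imabsdiff(img1, img2);
figure, imshow(err12);                   % before registration, camera-pose differences dominate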
In order to align the two input images, we first adopt a feature-based image registration method (Figure 1b) to transform Image 2 into a new image, denoted as Image 3, so that Image 3 and Image 1 share the same coordinate system. A region of interest (ROI) should be assigned prior to this procedure, as shown by the red block in Image 1. The purpose of defining the ROI is to specify a region in Image 1 as the target region that Image 2 should match. Generally, the ROI should only cover a group of bolts and their adjacent structural surface and exclude unnecessary elements in the scene (e.g., the wall in the background of Image 1). After the feature-based image registration, the matching performance is improved, as shown in the intensity comparison between Image 1 and 3 (Figure 1d). Nevertheless, misalignments (i.e., registration errors) still exist, especially around the areas of Bolts 1 and 3, even though they remained intact during the inspection interval. Such errors commonly exist in feature-based image registration, and a detailed discussion about these errors can be found in [29].
To further reduce registration errors, an intensity-based image registration method is applied to Image 3 (Figure 1c), allowing Image 3 to be non-rigidly matched to Image 1. The newly transformed image is denoted as Image 4. The intensity comparison between Image 1 and 4 is shown in Figure 1d. Registration errors are significantly reduced around Bolts 1 and 3, while they still exist around the loosened bolt (Bolt 2) due to the bolt rotation. We treat these errors as the bolt loosening features, which are introduced by multiple sources during the bolt rotation, such as the hexagonal boundaries of the bolt head, the A325 mark, and other surface textures on the bolt head. Next, we further enhance these bolt loosening features by filtering out adjacent noisy content in the registration errors (Figure 1e). Finally, we map the enhanced bolt loosening features to the original input image (Image 1) so that the loosened bolt can be directly visualized (Figure 1f). This allows informed actionable decisions to be made regarding appropriate rehabilitation and/or retrofitting of the monitored steel joint.
Two image registration processes serve as the key components in the proposed approach. The feature-based image registration effectively aligns two input images into the same coordinate system based on a predefined ROI. However, small misalignments are usually associated with feature-based image registration. The intensity-based image registration, on the other hand, is able to adjust small misalignments but may have difficulties handling significant misalignments if the input images are taken from very different camera poses. By adopting these two image registration processes in a successive manner, the misalignments between two input images can be gradually reduced through each registration process. It should be noted that the algorithm for feature-based image registration is not tied to a particular intensity-based image registration method and vice versa.
Section 4.3 demonstrates several other approaches for performing feature-based image registration. For intensity-based image registration, besides the method adopted in this study, other well-established approaches can be found in [30].
2.1. Image Acquisition
A consumer-grade digital camera is required for image acquisition. In this study, we adopted a Nikon D7100 camera with a Sigma 17–50 mm lens in auto-shooting mode unless stated otherwise. The distance between the camera and the monitored structure depends on the resolution of the camera; a typical distance of 20 to 50 cm was adopted in this study. The camera can be held by hand during image acquisition, and the images should directly capture the inspected bolts and their adjacent structural surface without any obstructions. Ambient lighting conditions are generally acceptable. The image plane can be either parallel or skewed to the monitored structural surface (see Test 3 in Section 3 for a detailed discussion of the skewed case). When collecting the images at different inspection periods, the lighting conditions and camera pose should be kept similar in order to produce the optimal result. Camera calibration is not required in this study.
2.2. Feature-Based Image Registration
The purpose of feature-based image registration is to align two images into the same coordinate system using matched features (i.e., correspondences). For this approach to be viable, features (also known as feature points, corner points, or key points) are first detected in both input images. Then, a matching algorithm is adopted to find matched features between the two images, based on which a geometric transformation matrix can be estimated to transform the second image to the coordinate system of the first image.
To better demonstrate this procedure, an example is illustrated using two input images of a concrete column taken by the aforementioned digital camera at a resolution of 6000 pixels × 4000 pixels. Suppose we want to match the front face of the column in the two input images. First, denote the first input image as Image 1 (Figure 2a); an ROI (3500 pixels × 3500 pixels) in Image 1 is selected to cover the front face of the column. Next, the Shi–Tomasi algorithm [31] is adopted to extract features, and the detected features are denoted as Feature set 1. This feature extraction procedure is flexible and can be achieved with many other feature types as well; a detailed comparison of different features for feature-based image registration can be found in Section 4.3. As can be seen in Figure 2d, Shi–Tomasi features (highlighted by red circles) are based on the unique intensity changes of a localized region in both the horizontal and vertical directions, an intrinsic property of most images.
Figure 3a illustrates the second input image of the concrete column taken from a different camera pose, denoted as Image 2. Similarly, Shi–Tomasi features are extracted over the entire region of Image 2 and denoted as Feature set 2 in Figure 3b. Next, we adopt the Kanade–Lucas–Tomasi (KLT) tracker [32,33] to match each point in Feature set 1 to a potential point in Feature set 2. As a result, 1370 matched features are found in Figure 3c, where red circles are features in Image 1 and green crosses represent features in Image 2. Among all the matched features, some outliers can be found (Figure 3d), indicating matching failures. These outliers are further eliminated using the maximum likelihood estimation sample consensus (MLESAC) algorithm proposed by Torr and Zisserman [34], and the new matched results (i.e., inliers) are shown in Figure 3e,f. In total, 1175 matched features remain, based on which a projective geometric transformation matrix can be estimated so that Image 2 can be registered to the coordinate system of Image 1. The projective geometric transformation removes the projective distortion between Image 1 and 2 taken under different camera poses. Matched feature points after image registration can be found in Figure 3g,h, where the red circles match the green crosses.
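As a minimal illustration of this procedure, the following MATLAB sketch uses Computer Vision System Toolbox functions for Shi–Tomasi feature detection, KLT tracking, robust transform estimation, and warping. The filenames and the ROI rectangle are hypothetical assumptions, and note that the toolbox function estimateGeometricTransform employs the MSAC variant of the sample consensus estimator rather than MLESAC itself.
% Feature-based image registration sketch (Section 2.2); filenames and ROI are assumed.
img1 = rgb2gray(imread('image1.jpg'));                 % fixed image containing the ROI
img2 = rgb2gray(imread('image2.jpg'));                 % moving image to be registered
roi  = [500 300 3500 3500];                            % [x y width height] around the target region
% Detect Shi-Tomasi (minimum eigenvalue) features inside the ROI of Image 1.
pts1 = detectMinEigenFeatures(img1, 'ROI', roi);
% Match the ROI features into Image 2 with the KLT tracker.
tracker = vision.PointTracker('MaxBidirectionalError', 2);
initialize(tracker, pts1.Location, img1);
[pts2, valid] = tracker(img2);
% Estimate a projective transformation from the matched points while rejecting outliers.
tform = estimateGeometricTransform(pts2(valid, :), pts1.Location(valid, :), 'projective');
% Warp Image 2 into the coordinate system of Image 1, yielding "Image 3".
img3 = imwarp(img2, tform, 'OutputView', imref2d(size(img1)));
figure, imshowpair(img1, img3, 'diff');                % remaining registration errors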
2.3. Intensity-Based Image Registration
The purpose of intensity-based image registration is to further align the two images based on their intensity distributions. Instead of applying a geometric transformation as in feature-based image registration, intensity-based image registration is a non-rigid transformation process that has been widely applied in medical imaging [35] and remote sensing [36]. Here, an example is presented using two images of the same hand under different poses to illustrate the principle. Figure 4a,b are two images (denoted as Image 1 and 2) taken by a smartphone camera (4th generation Moto G Play) with a resolution of 3264 pixels × 2448 pixels; both images were then downsized to a lower resolution of 327 pixels × 245 pixels. Due to the different hand poses in these two images, feature-based image registration would face difficulties in aligning them. Here, we adopt the algorithm proposed by Thirion [37] to non-rigidly register Image 2 to Image 1; the registered result, denoted as Image 3, is shown in Figure 4c. A typical three-level pyramid with 500, 400, and 200 iterations is adopted during this procedure. Figure 4d,e further evaluate the registration errors through intensity comparisons. While the unregistered images (Image 1 and 2) show clear misalignment, the two images are well aligned after the registration (Figure 4e).
Despite the great performance of intensity-based image registration, registration errors may still occur where abrupt intensity changes exist, as shown in Image 2. As an illustration, we intentionally changed the location of the ring on the ring finger between the two image acquisitions, such that the ring in Image 2 is closer to the fingertip. Such an action induces abrupt intensity changes in a localized region, leading to the registration errors shown in Figure 4f. However, from the perspective of detecting bolt loosening, such registration errors can be utilized for identifying discrepancies between two images, serving as good features for bolt loosening detection.
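Continuing the sketch above, the non-rigid registration step could be realized with imregdemons from the MATLAB Image Processing Toolbox, which implements a variant of Thirion's demons algorithm; the pyramid setting follows the 500, 400, and 200 iterations used in this study, while the smoothing parameter is an illustrative assumption.
% Intensity-based (non-rigid) registration sketch (Section 2.3).
fixed  = rgb2gray(imread('image1.jpg'));               % Image 1
moving = img3;                                         % Image 3 from the feature-based step above
% Three-level pyramid with 500, 400, and 200 iterations per level.
[dispField, img4] = imregdemons(moving, fixed, [500 400 200], ...
    'PyramidLevels', 3, 'AccumulatedFieldSmoothing', 1.0);
% The remaining differences form the registration errors used as bolt loosening features.
regErr = imabsdiff(fixed, img4);
figure, imshow(regErr, []);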
2.4. Feature Enhancement
Once the two image registration processes are completed successively, the loosened bolt can be identified through the registration errors, as shown in Figure 5a. Nevertheless, directly identifying the loosened bolt still requires human intervention, as the loosened bolt is surrounded by noise content (Figure 5a). Our focus is now placed on removing the noise content so that the bolt loosening features around Bolt 2 can be enhanced. A number of image processing techniques are adopted in this procedure. First, a rectangular window is applied to the registration errors (Figure 5a) so that unrelated results can be filtered out by assigning 0 intensity to the pixels outside the window. The dimensions and location of the window are the same as those of the ROI defined prior to feature-based image registration.
Next, an image segmentation method [38] is performed to segment the registration errors (Figure 5b) into a series of localized regions, termed superpixels, as shown in Figure 5c. For each superpixel i, the coefficient of variation of the intensities of all pixels within this superpixel is computed and denoted as CVi. Then, by applying a cutoff threshold T, the noise content can be eliminated from the registration errors so that the bolt loosening features are preserved (Figure 5f). To explain, suppose two typical superpixels are selected in Figure 5c, where Superpixel 1 is from the loosened bolt and Superpixel 2 represents the noise content. As shown in Figure 5d,e, the magnitudes of the intensities change dramatically around the loosened bolt, as in Superpixel 1, while they transition smoothly in other regions, as in Superpixel 2. In this regard, extracting the CVs of the superpixels can efficiently separate the loosened bolt from its background noise. Hence, a feature enhancement algorithm is proposed in which 0 intensity is assigned to superpixels whose CVs are less than the predefined threshold T; for superpixels with CVs larger than T, no action is required. Using this algorithm, the noise content can be removed, and the final result is shown in Figure 5f.
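A minimal MATLAB sketch of this feature enhancement step is given below, continuing from the registration errors computed above. The superpixel count is an assumed value, and, since the text does not state whether the coefficient of variation is expressed as a ratio or a percentage, it is computed here in percent so that the reported thresholds (e.g., T = 50) are on a comparable scale; this convention is an assumption.
% Feature enhancement sketch (Section 2.4): windowing, superpixels, CV thresholding.
T      = 50;                                           % cutoff threshold (50 in Tests 1 and 3)
numSup = 500;                                          % requested number of superpixels (assumed)
% Windowing: zero out all pixels outside the ROI used for feature-based registration.
mask = false(size(regErr));
mask(roi(2):roi(2)+roi(4)-1, roi(1):roi(1)+roi(3)-1) = true;
winErr = regErr;
winErr(~mask) = 0;
% Segment the windowed registration errors into superpixels.
[labels, numLabels] = superpixels(winErr, numSup);
% Assign 0 intensity to superpixels whose CV of intensity is below the threshold T.
enhanced = winErr;
for k = 1:numLabels
    px = double(winErr(labels == k));
    cv = 100 * std(px) / max(mean(px), eps);           % CV in percent (assumed convention)
    if cv < T
        enhanced(labels == k) = 0;                     % treated as noise content
    end
end
figure, imshow(enhanced, []);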
2.5. Result Implementation
The purpose of result implementation is to map the bolt loosening features (Figure 6a) to the original input image so that the loosened bolt can be easily visualized. To achieve this goal, a two-dimensional Gaussian filter is first applied to Figure 6a to blur the bolt loosening features (Figure 6b). Then, the filtered bolt loosening features are converted to RGB channels using the following rules: (1) black in Figure 6b is converted into white; and (2) white in Figure 6b is converted into red. Finally, by setting appropriate transparency levels and overlapping the RGB channels onto the original input image, the loosened bolt can be successfully identified (Figure 6d).
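The following MATLAB sketch shows one way to realize this visualization, continuing from the enhanced features above: the features are blurred with a two-dimensional Gaussian filter and then alpha-blended as a red layer over the original input image. The filter width and transparency level are illustrative assumptions rather than values from this study.
% Result implementation sketch (Section 2.5): blur, colorize, and overlay.
img1    = im2double(imread('image1.jpg'));             % original input image (RGB)
blurred = imgaussfilt(enhanced, 8);                    % 2-D Gaussian blur (sigma assumed)
alpha   = im2double(blurred);                          % 0 where no features, up to 1 at strong features
% Build a solid red layer and blend it over the original image with 60% maximum opacity.
redLayer = cat(3, ones(size(alpha)), zeros(size(alpha)), zeros(size(alpha)));
overlay  = img1 .* (1 - 0.6 * alpha) + redLayer .* (0.6 * alpha);   % implicit expansion (R2016b+)
figure, imshow(overlay);                               % loosened bolt highlighted in red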
3. Validations
To validate the proposed approach, three experimental tests were conducted in the laboratory. The digital camera described in Section 2.1 was adopted for image acquisition, and the resolution of the collected input images was 6000 pixels × 4000 pixels. Ambient lighting conditions were applied to all the tests during image acquisition. The bolts in the tests were made of ASTM A325 steel with a diameter of 19.05 mm (3/4 in.), a common type of high-strength bolt used in steel construction in the United States. Shi–Tomasi features and the KLT tracker were adopted for feature-based image registration. Figure 7 shows the tested steel joints: the steel joint in Test 1 was from a gusset plate in a cross frame; the steel joint in Test 2 was a steel column flange; and the steel joint in Test 3 was a web region of a steel girder. Table 1 summarizes the testing parameters of the three experiments, in which the total number of bolts, number of loosened bolts, surface textures, and camera orientations vary in order to validate the performance of our approach. The MATLAB Computer Vision System Toolbox [39] was adopted for implementing all the algorithms mentioned in Section 2.
Figure 8 summarizes the experimental results of Test 1, where the three bolts in the gusset plate in Image 1 are denoted as Bolt 1, 2, and 3 (Figure 8a). During the inspection interval, Bolt 2 was rotated and then Image 2 was collected, as shown in Figure 8b. Figure 8c shows the initial intensity comparison of the two images, where significant errors can be found due to the different camera poses. To improve the matching performance, the feature-based and intensity-based image registrations were applied successively and their registration errors are shown in Figure 8d,e. The feature-based image registration is based on the ROI defined near the group of bolts (see the red block in Figure 8a). Then a number of image processing techniques were further applied in order to enhance the bolt loosening features and visualize the loosened bolt, as discussed in Section 2.4 and Section 2.5. These techniques included windowing (Figure 8f), superpixel segmentation (Figure 8g), feature enhancement (Figure 8h), Gaussian filtering (Figure 8i), and result overlapping (Figure 8j).
Figure 9 illustrates the experimental results of Test 2. Instead of three bolts, a total of eight bolts were adopted in Test 2 and two of them (i.e., Bolts 3 and 6) experienced rotations during the inspection, as shown in Figure 9b. Nevertheless, our approach is still able to identify the loosened bolts, as illustrated in Figure 9j.
In Test 3, multiple conditions were varied in order to validate the performance of our approach. In particular, the orientation of the camera was skewed to the monitored surface instead of parallel as in Tests 1 and 2. The structural surface, on the other hand, had a combination of painted and unpainted treatments, as can be seen in Figure 7c. Prior to processing the images, the ROI (red block in Figure 10a) should be selected to cover only the inspected bolts and their adjacent structural surface, while excluding any background that is far from the monitored surface. The benefit of such a selection is twofold: (1) the ROI facilitates the feature-based image registration process by specifying a localized region for matching potential correspondences; and (2) the ROI excludes unnecessary registration errors during the feature enhancement procedure (see Figure 10f). As shown in Figure 10j, the loosened bolt (i.e., Bolt 2) is correctly detected.
As a summary of these experimental results, our approach can successfully detect and localize single or multiple loosened bolts from a group of bolts, regardless of the total number of bolts, structural surface textures, or camera orientation. The success of our approach, however, does rely on tuning the cutoff threshold T, a parameter in the feature enhancement algorithm introduced in Section 2.4. As shown in Table 1, T was 50 in both Test 1 and Test 3 but increased to 200 in Test 2. This is because more noise content occurred in the registration errors in Test 2, as demonstrated in Figure 9f. A detailed discussion about the effect of T will be presented in Section 4.4.
6. Conclusions
In this study, we have proposed a computer vision-based bolt loosening detection method through image registrations. Our approach starts with the collection of two input images of the monitored steel joint during different inspection periods, followed by identification of the loosened bolt through comparison, enhancement, and visualization of differential features caused by bolt loosening. Specifically, we first adopted feature-based and intensity-based image registration processes to eliminate misalignments around a group of bolts between the two input images. Then, we established a feature enhancement method to remove noise content in the registration errors. Finally, a damage visualization approach was applied to localize the bolt loosening features in the original input image.
To validate the performance of our approach, three experimental tests were performed in the laboratory by utilizing a gusset plate on a cross frame, a column flange, and a girder web. The test results have verified that our approach can detect single or multiple loosened bolts from a group of bolts, regardless of the total number of bolts, structural surface textures, and camera orientations. Furthermore, the robustness of our approach has been investigated in the contexts of input image resolutions, rotation angles of the bolt head, cutoff threshold T, lighting conditions, nut loosening, different bolt types, and the gap caused by nut loosening. In addition, registration performances under different types of features have been compared to demonstrate the flexibility of our approach.
Finally, the computational cost and the sensitivity of the proposed approach to lighting conditions, camera poses, and bolt rotation angles have been discussed. The results of the proposed method are presented in a way that allows easy interpretation, such that direct actionable decisions can be made for conducting condition-based maintenance procedures, such as tightening or replacing the loosened bolts, to ensure structural integrity. Our future work will focus on enhancing the robustness of our approach against lighting conditions and camera poses through quantitative investigations and on combining our approach with UAV platforms to achieve autonomous field inspections of civil structures.