Article

Construction Jobsite Image Classification Using an Edge Computing Framework

1 Department of Civil, Construction and Environmental Engineering, North Carolina State University, Raleigh, NC 27695, USA
2 Civil Engineering Department, King Saud University, Riyadh 11451, Saudi Arabia
* Author to whom correspondence should be addressed.
Sensors 2024, 24(20), 6603; https://doi.org/10.3390/s24206603
Submission received: 28 August 2024 / Revised: 1 October 2024 / Accepted: 11 October 2024 / Published: 13 October 2024
(This article belongs to the Special Issue Sensing and Mobile Edge Computing)

Abstract:
Image classification is increasingly being utilized on construction sites to automate project monitoring, driven by advancements in reality-capture technologies and artificial intelligence (AI). Deploying real-time applications remains a challenge due to the limited computing resources available on-site, particularly on remote construction sites with limited telecommunication support or with high signal attenuation within a structure. To address this issue, this research proposes an efficient edge-computing-enabled image classification framework to support real-time construction AI applications. A lightweight binary image classifier was developed using MobileNet transfer learning, followed by a quantization process to reduce model size while maintaining accuracy. A complete edge computing hardware module, including components such as a Raspberry Pi, Edge TPU, and battery, was assembled, and a multimodal software module (incorporating visual, textual, and audio data) was integrated into the edge computing environment to enable an intelligent image classification system. Two practical case studies involving material classification and safety detection were deployed to demonstrate the effectiveness of the proposed framework. The results showed that the developed prototype successfully synchronized multimodal mechanisms and achieved zero latency in differentiating materials and identifying hazardous nails without any internet connectivity. Construction managers can leverage the developed prototype to facilitate centralized management efforts without compromising accuracy or requiring extra investment in computing resources. This research paves the way for edge “intelligence” on future construction job sites and promotes real-time human-technology interaction without the need for high-speed internet.

1. Introduction

With rapid digital transformation in the construction industry, various reality-capture technologies are being implemented on construction sites for data collection [1]. Prominently, construction images are frequently captured using smartphones [2,3], drones [4,5], surveillance cameras [6,7], and unmanned ground vehicles (UGV) systems [8]. These images provide valuable information that can facilitate automated monitoring, inspection, and tracking processes. Researchers often utilize these images to identify safety hazards [9,10,11], monitor workers’ productivity [12,13], track resources [14,15,16], and facilitate progress monitoring [17].
One of the most widely used methods in image analysis involves applying artificial intelligence (AI) models for image classification. To create a customized classification system, these images are often labeled based on precise criteria tailored to specific needs. Deep learning models, especially convolutional neural networks (CNNs), are commonly used to identify patterns and extract features from these images [18]. Through iterative training, these models enhance their ability to accurately classify new, unseen images. For real-time applications, well-trained models can be embedded into on-site devices such as surveillance cameras [9], mobile phones [19], and drones [11,20] to run inference with the support of the internet. However, these devices are generally battery-powered and constrained by limited computational capabilities [21,22]. Construction structures with heavy reinforcement materials, such as steel or reinforced concrete, can also significantly degrade internet connectivity by causing signal attenuation and interference [23]. As a result, obtaining sufficient computing resources to deploy such real-time applications is particularly challenging, and real-time implementation remains far from reality, especially in rural construction areas where such resources are often scarce.
Recently, edge computing has become popular as a distributed computing paradigm in which data processing and analysis take place in proximity to the data source, at the “edge” of the network [24,25]. This computing framework supports real-time image capture and processing, allowing model inference to be performed locally using the computing power of edge devices. As the demand for real-time analytics and process intelligence grows, this approach significantly reduces latency in real-time responses and optimizes bandwidth usage [24,25]. With advancements in microchip technology, both training and running inference directly on edge devices is becoming increasingly feasible [26]. Currently, image-based edge computing solutions involve training models in the cloud and then deploying the trained models to edge devices such as the Raspberry Pi [27], NVIDIA’s Jetson series [28], and Arduino [29], supported by battery modules [22] and edge servers [30,31]. The NVIDIA Jetson series is specifically designed for machine learning (ML) and computer vision tasks [32]. However, these methods depend primarily on the computing power of the edge devices themselves. As engineering problems grow in complexity, it is necessary to develop a generalized framework that optimizes the utilization of computing resources regardless of the sophistication of image classification models. Researchers have emphasized that construction companies should focus on deployment strategies rather than the specific tools themselves when considering investment in computing resources [33].
This research follows the fundamental engineering research design methodology elaborated by Hazelrigg [34] to develop an efficient edge-computing-enabled image classification framework for real-time construction job site applications. The specific objectives are to (1) create a lightweight binary image classification model tailored for practical construction applications; (2) investigate and optimize the deployment of construction image classification tasks on the edge; and (3) integrate a complete edge computing environment to facilitate real-time implementation of image classification in construction settings. To achieve these goals, the essential steps are as follows: (1) This research employed transfer learning (TL) to develop a lightweight image classification model based on MobileNet. (2) The model was then optimized through quantization, reducing its size while preserving inference accuracy using the TensorFlow Lite (TFLite) framework. This process ensured efficient deployment on edge devices. (3) A real-time implementation system was assembled, consisting of a Raspberry Pi (single-board computer), PiCamera, Edge TPU (accelerator), and other necessary hardware and software components. (4) Two case studies, covering construction material classification and construction nail detection, were implemented in real time to validate the research framework.

2. Literature Review

2.1. Edge Computing

Edge computing processes data closer to its sources by leveraging local computing resources, such as Internet of Things (IoT) devices, leading to faster data insights, improved response times, and more efficient bandwidth utilization [24]. This method improves computational efficiency, reduces the use of signal transmission channels, eases the storage and processing demands on cloud servers, and facilitates real-time data analysis and decision-making close to the data source [24,35]. Cloud computing encounters several significant challenges due to its centralized nature; as a solution, edge computing has been introduced to enhance performance and address cloud-related issues by enabling local data processing and storage at end devices [36]. The rapid evolution of edge computing is powered by the increasing computing capability of microchips: per Moore’s Law, the number of transistors on a microchip doubles roughly every 18 months while costs halve [37]. Edge computing began in the 1990s with the concept of the content delivery network (CDN), a network model that placed computing nodes closer to users for faster delivery of cached content (e.g., images, videos) [38]. In 1997, Noble et al. explored how applications (web browsers, video, and speech recognition) on mobile devices could shift some processing tasks to powerful local servers, reducing the burden on resource-constrained devices and enhancing mobile battery life [39]. This concept was later generalized as “pervasive computing” in 2001 [40]. At the same time, Rowstron and Druschel introduced distributed hash tables that exploited the proximity of physical internet connections for efficient data routing and load balancing, improving communication latency and reducing network load [41]. As demand for computing and storage resources grew, cloud computing emerged in 2006 and became a major influence on edge computing [42]. Researchers also designed the “cloudlet” to provide nearby mobile devices with computing and storage resources [43]. By 2012, fog computing emerged as a solution to the growing demand for scalable IoT infrastructures, enabling the management of vast numbers of devices and the processing of large volumes of data [25,44].
The earliest work on edge computing concepts in construction dates back to 1985, when researchers aimed to leverage small computers for various construction planning and execution tasks [45]. In the past few years, mobile computing has been widely adopted to enable real-time applications in the construction field [19,46,47,48]. It involves bringing mobile devices with wireless connectivity to construction sites for real-time information sharing [49]. For example, researchers have combined mobile computing with Augmented Reality (AR) and wireless communication technologies for real-time site monitoring, task management, and information dissemination [19,46,50]. With the advent of building information modeling (BIM), various applications built on mobile computing have been developed to streamline BIM data collection and communication both on and off construction sites [51,52,53]. However, mobile computing relies heavily on centralized cloud services for network connectivity and data transmission [51,54], which may be problematic for construction scenarios in regions with limited or no network coverage. The term “edge computing” was first coined in the construction industry in 2018, when Kochovski et al. [21] developed an edge computing infrastructure designed to deliver high Quality of Service (QoS) for video communications and construction process documentation, addressing the growing demand for computing resources on construction sites. As data volumes continue to grow exponentially, researchers have started using AI models to rapidly process different data sources and extract insights that drive decision-making. The increasing analytical demands of these models have prompted researchers to explore solutions that offload specific computational tasks to edge servers [30,31]. Consequently, AI models are being embedded into edge devices for real-time applications such as safety detection [55], resource tracking [28], and structural health monitoring [56,57]. The integration of edge computing into construction represents a significant opportunity to reduce latency in data communications for time-sensitive tasks [55], improve data privacy and security through local processing [58,59], and optimize network bandwidth usage by transmitting only relevant information [22,60]. Overall, the integration of IoT and edge computing on construction sites offers powerful solutions to enhance productivity, improve quality, and boost safety, making these technologies essential for advancing efficiency and innovation in the construction industry [61].
To gain a high-level understanding of mobile edge computing research trends in the architecture, engineering, and construction (AEC) industry, 106 relevant research articles published between 2010 and 2023 were pulled from the Web of Science (WoS) database to draw two co-occurrence diagrams: one focused on mobile/edge computing keywords and one on the devices/hardware used for mobile edge computing, as shown in Figure 1 and Figure 2. Figure 1 indicates that mobile computing research is often related to augmented reality and BIM, spanning research topics of localization [62], facility management [63], context-aware computing [64,65], and visualization [66,67]. Edge computing is closely associated with IoT, ML, and deep learning, highlighting emerging trends in developing analytic capabilities at the edge. The integration of edge computing and AI techniques significantly contributes to advancements in research areas including smart buildings [68], smart cities [69,70], anomaly detection [56,71], and energy management [60]. Figure 2 highlights sensors, which are the core technology for mobile/edge computing. They are connected with various mobile computing devices, such as phones and cameras. These devices are extensively connected with mobile connectivity technologies (Wi-Fi, Bluetooth) [22,72], AR [66,73,74], and location systems (GPS, RFID) [51,75], highlighting potential integrations in practices such as safety, navigation, and user interfaces. Edge computing mainly relies on single-board computers, such as the Raspberry Pi [27,28] and Arduino [29], for edge computing capabilities. Batteries are another key component, allowing edge devices to operate independently and continuously [22]. Actuators are also crucial because they enable edge devices to interact directly with and control physical systems in real time for immediate response and automation [21]. Additionally, the 106 reviewed documents were organized into a shared spreadsheet, which can be found in the Supplementary Materials of this paper.

2.2. Image Classification

Image classification has been a fundamental task in computer vision, aiming to categorize images into predefined classes using supervised ML. The early approaches to image classification relied heavily on hand-crafted features such as Scale-Invariant Feature Transform (SIFT) and Histogram of Oriented Gradients (HOG), combined with traditional ML models like Support Vector Machines (SVM) and Random Forests [76,77]. However, the advent of deep learning, particularly CNNs, revolutionized the field [78]. CNNs are designed to automatically learn hierarchical feature representations from raw images, which significantly reduces the need for manual feature extraction. Pioneering models like AlexNet [79], VGGNet [80], and ResNet [81] demonstrated remarkable improvements in image classification accuracy on large-scale datasets such as ImageNet [82]. These models leverage multiple layers of convolutional filters to capture spatial hierarchies in images, thereby achieving state-of-the-art performance. Recent advances in image classification have focused on enhancing the efficiency and accuracy of CNNs through various approaches. Techniques such as TL and data augmentation have become standard practices to improve model generalization, particularly in scenarios with limited labeled data [83]. TL involves pre-training a CNN on a large dataset and fine-tuning it on a target task, allowing for faster convergence and better performance. Additionally, the development of attention mechanisms, such as the Transformer-based Vision Transformer (ViT), has further advanced image classification by enabling models to focus on relevant parts of an image, mimicking human visual attention [84]. These innovations continue to push the boundaries of image classification within the field of computer vision.
In recent years, there has been a shift towards optimizing model architecture for deployment in resource-constrained environments, such as edge devices. This has led to the development of lightweight models like MobileNet [85], ShuffleNet [86], and EfficientNet [87], which prioritize a balance between accuracy and efficiency. MobileNet and ShuffleNet utilize depth-wise separable convolutions and group convolutions, respectively, to reduce the model size and computational load, allowing for faster inference speeds [88]. EfficientNet introduces a compound scaling approach that systematically scales the depth, width, and resolution of the network, achieving state-of-the-art performance with fewer parameters [87]. These models have been instrumental in enabling real-time image classification on devices with limited processing power and memory. For example, Raza et al. [89] trained an EfficientNet model that utilized less computational resources and training time while maintaining high accuracy in classifying lung cancer using CT scan images with a performance score between 0.97 and 0.99 on the test set. As the demand for efficient and accurate models continues to grow, further research is focused on developing architectures that maintain high performance while minimizing complexity and resource consumption.
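To see why depth-wise separable convolutions reduce computational load, consider a brief worked comparison in the notation of the MobileNet paper [85] (an illustration added here, not reproduced from this article): a standard convolution with a D_K × D_K kernel, M input channels, N output channels, and a D_F × D_F feature map requires D_K · D_K · M · N · D_F · D_F multiply-accumulate operations, whereas the depth-wise separable factorization requires D_K · D_K · M · D_F · D_F + M · N · D_F · D_F. The ratio of the two costs is

(D_K · D_K · M · D_F · D_F + M · N · D_F · D_F) / (D_K · D_K · M · N · D_F · D_F) = 1/N + 1/(D_K · D_K),

so with 3 × 3 kernels the factorized form uses roughly 8 to 9 times less computation than a standard convolution of the same shape.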

2.3. Construction Site Image Classification Applications

The use of image classification in construction has seen significant growth, particularly in automating the monitoring and management of construction sites [90,91]. By leveraging computer vision technologies, construction companies can analyze real-time images captured from various sources, such as drones [4,5], fixed cameras [17,92], and mobile devices [2,3]. These images are utilized to track progress [93,94], ensure safety compliance [9,95], defect detection [96,97], and monitor the usage of materials and equipment on-site [98,99]. It has been identified that CNNs are the most widely used models to perform classification tasks [18]. These image classification applications can significantly reduce the management burden on central managers, allowing them to focus more on strategic problem-solving rather than time-consuming manual inspections [94].
Despite the promising potential of image classification in construction, several limitations hinder its widespread adoption. The primary challenge is the scarcity of annotated datasets tailored specifically to construction environments, which hampers effective model development [18]. One promising solution is TL, in which models pre-trained on large datasets are fine-tuned on smaller, domain-specific datasets, thereby reducing the need for extensive annotations. Researchers have shown that TL can improve the ability of image classification models to learn fundamental data patterns, often resulting in better outcomes [3]. Another major limitation is the computational cost and the demand for high-performance hardware to process images in real time. As engineering problems grow more complex, image classification models tend to scale up with more parameters, requiring additional computing resources for efficient inference [85]. Running these models on construction sites using devices such as smartphones, AR/VR tools (e.g., HoloLens, Daydream), and drones is challenging due to the limited memory and computing power of these devices [100], making real-time implementation of these models still far from reality.
The integration of edge computing and image classification models presents significant potential for real-time applications on construction sites. For example, Tan et al. [20] embedded a crack detection model into an unmanned aerial vehicle (UAV), enabling real-time synchronization of crack data collection and analysis on building surfaces. The model was executed directly on DJI’s Onboard SDK (OSDK), a development kit designed for edge computing. Similarly, Chen et al. [28] integrated a construction hardhat detection model into a Raspberry Pi, facilitating real-time video data processing for improved worker management. Another study evaluated a worker detection model on a local computer and three edge computing devices (Jetson Nano, Raspberry Pi 4B, and Jetson Xavier NX). The results demonstrated that while all devices achieved comparable accuracy, the local computer exhibited the fastest processing speed, followed by the Jetson Xavier NX, with the Raspberry Pi 4B being the slowest [101]. Additionally, the Raspberry Pi 4B consumed the most CPU resources due to its lack of a GPU. These studies mainly depend on the inherent computing capabilities of edge devices without optimizing the implementation frameworks. As more advanced image classification models are deployed, issues like inference speed and memory usage become critical, making this approach difficult to generalize. For instance, the simplest ViT-Base model [84], which has 86 million parameters, is significantly larger than widely used CNN architectures like ResNet-50 [81], which has 25 million parameters. This disparity highlights the necessity of a framework that optimizes computing resource utilization for more generalized real-time image classification on construction sites, especially in resource-constrained environments like rural construction sites.

3. Methodology

3.1. Overview

This study employs the fundamental engineering research design methodology delineated by Hazelrigg [34]. Rather than aiming to optimize any existing models, the focus is on implementing an integrated edge computing framework for image classification tasks within construction job site environments. Figure 3 illustrates the research framework and the developed hardware module. The primary aim of this research is to explore binary classification techniques for various real-time construction applications. After determining the specific use case and collecting the necessary data, this research began with data augmentation, applying flipping, rotating, and shearing to the original images to increase the dataset size. We then employed MobileNet transfer learning to train a lightweight binary classifier, enabling it to distinguish between two distinct classes, and evaluated its performance using various model performance metrics. This lightweight model substantially reduces the computing resources needed for inference. The resulting model was quantized using TFLite to further reduce its size for deployment on edge devices. We conducted a detailed analysis of the trade-offs between accuracy and model size across different models. Additionally, this research developed a custom Raspberry Pi hardware module equipped with a screen, PiCamera, power bank, Bluetooth speaker, wireless keyboard, touchpad mouse, and accelerator coprocessor. Finally, we integrated visual, textual, and audio modules into the edge device to facilitate a synchronized real-time interactive multimodal detection system.

3.2. Classification Model Development

As a preliminary examination, this study evaluated traditional convolutional neural network (CNN) models and various fine-tuning methods (e.g., changing activation functions, batch normalization, and data preprocessing). However, the initial results indicated that CNN models built from scratch underperformed and exhibited suboptimal model sizes. In contrast, when testing pre-trained models, particularly lightweight MobileNet architectures, we found a significant increase in accuracy without compromising model size. This initial outcome directed the research efforts toward the development and utilization of these lightweight models.
MobileNet is a lightweight and efficient CNN architecture built on depth-wise separable convolutions, comprising depth-wise convolutions and point-wise convolutions, allowing it to achieve a good balance between model size and performance [102,103]. The model architecture and development process are shown in Figure 4. In this study, we leveraged TL to deploy the MobileNet model and compared different versions of MobileNet to select the optimal one. Note that, within the current edge computing framework, model training still takes place in the cloud. This study explores binary classification problems as pilots for real-time edge computing implementations. Images are resized to 128 × 128 × 3 pixels to facilitate the training process. To create a binary classifier with MobileNet, we replaced the original classification layer with a customized dense layer, using a ‘softmax’ activation function over two one-hot-encoded outputs. Additionally, we fine-tuned the model by experimenting with different parameter combinations. The data were split into training and validation sets using an 80-20% ratio. We assessed the trained model’s performance using accuracy, precision, recall, and F1 score metrics. Each model underwent training for 50 epochs. A minimal sketch of this setup appears below.
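The following sketch illustrates the transfer learning setup described above, assuming TensorFlow/Keras; beyond the stated input size, softmax binary head, 80-20 split, and 50 epochs, details such as the optimizer, pooling layer, and dataset objects are illustrative assumptions rather than the authors’ exact configuration:

```python
import tensorflow as tf

IMG_SHAPE = (128, 128, 3)  # images are resized to 128 x 128 x 3 as stated above

# Load MobileNetV2 pre-trained on ImageNet, without its original classifier head.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SHAPE, include_top=False, weights="imagenet")
base.trainable = False  # freeze the convolutional backbone for transfer learning

# Replace the classification layer with a customized dense layer,
# using softmax over two one-hot-encoded classes.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# 80-20 train/validation split and 50 epochs, as described in the text.
# `train_ds` and `val_ds` would be tf.data.Dataset objects yielding
# (image, one_hot_label) batches; their construction is omitted here.
# model.fit(train_ds, validation_data=val_ds, epochs=50)
```

Whether to keep the backbone frozen or unfreeze some layers is one of the fine-tuning knobs the text alludes to; toggling base.trainable (with a lower learning rate) is the usual way to experiment with that choice.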

3.3. Edge Environment Setup

The products used to assemble the edge device and their corresponding functions in this case study are shown in Table 1. This setup mainly aims to incorporate a multimodal system, including visual, textual, and audio channels, for on-site interactions. The edge device setup includes the following modules: Raspberry Pi 4 Model B with a 1.5 GHz 64-bit quad-core CPU and 8 GB RAM; 32 GB Samsung EVO+ Micro SD card (Class 10) preloaded with NOOBS; Arducam for Raspberry Pi Camera; Vilros 8-inch 1024 × 768 screen; Vilros 2.4 GHz mini wireless keyboard with touchpad mouse and USB receiver; Vilros mini Bluetooth speaker; Power Ridge portable power bank with AC outlet (100 W, 26,270 mAh); and Coral USB Accelerator. The total price for this edge device module is around USD 430.84.
Note that the Edge TPU (Tensor Processing Unit) is a specialized application-specific integrated circuit (ASIC) designed to accelerate ML tasks, particularly deep learning inference, at the network’s edge [104]. The Edge TPU is optimized for efficiently running neural network models, with an architecture streamlined for operations common in ML, such as matrix multiplications and convolutions. In this research, deploying the quantized model on the Edge TPU involves several straightforward steps. First, the quantized TFLite model (.tflite file) is converted into an Edge TPU-compatible version using Google’s Edge TPU Compiler, which generates a compiled model optimized for the Edge TPU (e.g., model_edgetpu.tflite). Second, the Edge TPU runtime library is installed on the Raspberry Pi’s operating system (Raspbian OS). Third, the Google Coral USB Accelerator, which contains the Edge TPU, is connected to the Raspberry Pi via a USB port. Finally, the compiled model is used to run inference, delegating the inference tasks to the Edge TPU hardware through Google Coral’s coding framework. In this configuration, the Raspberry Pi acts as a coordinator, preparing data and handling peripheral tasks, while the actual inference computation is offloaded to the Edge TPU. This approach leverages the specialized hardware acceleration of the Edge TPU, yielding significant performance improvements for ML inference compared to executing solely on the Raspberry Pi’s CPU. This pipeline is sketched below.
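The quantization and compilation pipeline can be sketched as follows; this is a minimal example assuming TensorFlow’s post-training full-integer quantization (which the Edge TPU Compiler requires), and the representative dataset, file names, and calibration sample count are assumptions rather than details reported by the authors:

```python
import numpy as np
import tensorflow as tf

# `model` is assumed to be the trained Keras classifier from the previous sketch.

def representative_data_gen():
    # Yield calibration samples so the converter can estimate activation
    # ranges; real training images should be used in practice.
    for _ in range(100):
        yield [np.random.rand(1, 128, 128, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Restrict ops to int8 so the Edge TPU Compiler can map them onto the TPU.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open("model_quant.tflite", "wb") as f:
    f.write(converter.convert())

# The quantized model is then compiled on the command line, producing
# model_quant_edgetpu.tflite for deployment:
#   edgetpu_compiler model_quant.tflite
```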
The required software modules in the Raspberry Pi environment include Raspbian OS (version 2.9.3.1) (operating system), TFLite (version 2.10.0), OpenCV (version 4.9.0) (vision), and pyttsx3 (version 2.90) (text-to-speech). We prepared a Python script that uses the PiCamera to capture streaming images and resizes every frame to 128 × 128 × 3 pixels. The quantized model is copied to the Raspberry Pi via a USB drive. The script runs the quantized classification model to infer on real-time images, and the classification results are output in textual and audio formats. Audio broadcasts are generated based on aggregated results from 15 consecutive frame classifications. The final product can be implemented offline with full portability and no latency. A sketch of this inference loop follows.
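The following is a minimal sketch of such an inference loop, assuming the tflite_runtime package with the Edge TPU delegate, OpenCV for capture and display, and pyttsx3 for speech; the model file name, label names, and the exact voting rule are illustrative assumptions:

```python
import collections

import cv2
import numpy as np
import pyttsx3
from tflite_runtime.interpreter import Interpreter, load_delegate

LABELS = ["Plywood", "OSB"]              # illustrative class names (material case)
MODEL = "model_quant_edgetpu.tflite"     # assumed file name from the compile step

# Delegate inference to the Coral USB Accelerator (Edge TPU).
interpreter = Interpreter(
    model_path=MODEL,
    experimental_delegates=[load_delegate("libedgetpu.so.1")])
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

engine = pyttsx3.init()                  # offline text-to-speech
votes = collections.deque(maxlen=15)     # aggregate 15 consecutive frames

cap = cv2.VideoCapture(0)                # PiCamera exposed as a camera device
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Resize every frame to the model's 128 x 128 x 3 input
    # (a BGR-to-RGB conversion may be needed to match the training data).
    img = cv2.resize(frame, (128, 128)).astype(np.uint8)
    interpreter.set_tensor(inp["index"], np.expand_dims(img, 0))
    interpreter.invoke()
    pred = int(np.argmax(interpreter.get_tensor(out["index"])[0]))
    votes.append(pred)

    label = LABELS[pred]
    cv2.putText(frame, label, (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("edge classifier", frame)

    # Broadcast audio once 15 consecutive frames agree on one class.
    if len(votes) == votes.maxlen and len(set(votes)) == 1:
        engine.say(label)
        engine.runAndWait()
        votes.clear()

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```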

4. Case Study

4.1. Material Classification

Automatic, real-time material classification is crucial for construction projects, as it ensures accurate material usage, minimizes errors, and enables immediate adjustments, leading to cost savings and timely project completion. In this section, the research explores the deployment of a TFLite binary classification model on a Raspberry Pi to distinguish between plywood and oriented strand board (OSB).
The data collection process was conducted in a controlled laboratory environment, where images of plywood and OSB were randomly captured from various angles. The raw video recordings were dissected into individual frames, resulting in a sizeable dataset. To enhance the diversity of the training set, data augmentation techniques such as flipping, shearing, and rotating were applied to each frame. As a result, the dataset was expanded to a total of 7879 training images for plywood and 7900 for OSB, ensuring a rich and varied source of visual information for the model. For testing, a separate collection of images was obtained at different times using a different smartphone, yielding 259 test images for plywood and 322 for OSB; these were kept distinct from the training data to effectively assess the model’s generalization capabilities. Future work can expand into more material categories to develop a multi-class classification system. A sketch of the augmentation step is shown below.
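As an illustration of this augmentation step, a minimal sketch using Keras’ ImageDataGenerator follows; the specific transformation ranges, directory layout, and batch size are assumptions, not values reported by the authors:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Flip, shear, and rotate the extracted frames, as described above.
augmenter = ImageDataGenerator(
    horizontal_flip=True,
    vertical_flip=True,
    shear_range=0.2,        # shear intensity (illustrative value)
    rotation_range=30,      # degrees (illustrative value)
    rescale=1.0 / 255)

# Stream augmented 128 x 128 batches from class subfolders,
# e.g., data/train/plywood and data/train/osb (paths are assumptions).
train_gen = augmenter.flow_from_directory(
    "data/train", target_size=(128, 128),
    class_mode="categorical", batch_size=32)
```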
Table 2 displays the model performance statistics for the binary material classification task. Our comparison of model versions revealed that both MobileNetV1 and MobileNetV2 achieved perfect accuracy; however, MobileNetV2 had a smaller model size. MobileNetV3Small, characterized by fewer parameters and the smallest model size among the tested models, achieved an accuracy of 0.91. Conversely, MobileNetV3Large, despite its larger model size, did not enhance overall performance. Considering the trade-offs between model size and accuracy, this study selected MobileNetV2 for further quantization. Figure 5 shows the confusion matrix for MobileNetV2 on the material classification task: among images the model predicted as ‘Plywood’, only one was actually ‘OSB’, demonstrating promising results.
As a result of the quantization process, the original MobileNetV2 model, which had a size of 14.309 megabytes (MB), was compressed to 2.708 MB, approximately 19% of the original size. Remarkably, the accuracy, precision, recall, and F1 score all remain close to 1. Figure 6 demonstrates the final implementation scenario. The developed prototype is entirely portable, exhibits zero latency, operates offline, and offers real-time functionality, validating the effectiveness and robustness of the proposed framework. The identified materials are displayed on the screen and broadcast through the Bluetooth speaker.

4.2. Safety Detection: Identifying Boards with Nails

Poor construction housekeeping can jeopardize the safety of construction workers [10] and accounts for 15% of workplace deaths [105]. The Occupational Safety and Health Administration (OSHA) specifically states that any work areas, passageways, and stairs should be free of boards with protruding nails [106]. Therefore, this section will deploy an edge computing prototype to enable real-time safety detection to identify the construction boards with nails.
The data collected in the previous case formed the ‘boards’ dataset. In this round, we manually hammered nails into these boards in accordance with American Plywood Association [107] guidelines (regarding nail locations, spacing, etc.) to form a ‘boards with nails’ dataset. Similarly, we video-recorded these boards from different angles and used another mobile device to collect the test dataset. After data augmentation, the training set comprised 32,237 images: 16,779 in the ‘boards’ dataset and 15,458 in the ‘boards with nails’ dataset. The test set comprised 1053 images: 581 ‘boards’ and 472 ‘boards with nails’. These images were resized to 128 × 128 × 3 pixels for deployment efficiency. Note that during data collection, boards containing even one or two nails were classified as “boards with nails”. The researchers manually hammered each nail into the boards, capturing photographs progressively. Consequently, the “boards with nails” dataset comprises images with varying numbers of nails, documenting the entire process from the first hammered nail to fully nailed boards.
Table 3 presents the performance statistics of different models on the nail safety detection task. Comparing model versions, MobileNetV1 returned the highest accuracy, around 0.90, with a 17.4 MB model size. MobileNetV2 showed the second highest accuracy, around 0.87, with a 15.1 MB model size. MobileNetV3Small, with fewer parameters, returned the lowest accuracy, and MobileNetV3Large, with the most parameters, did not improve accuracy either. Moreover, the model must be sensitive to the presence of nails in the image. As a result, MobileNetV1 was selected for further processing.
Figure 7 shows the confusion matrix for MobileNetV1. When the model predicts ‘boards’, only eight images are misclassified; when it predicts ‘boards with nails’, 98 images are actually ‘boards’. In other words, because the model is sensitive to the possible presence of nails, some suspicious board images are flagged as ‘boards with nails’. Overall, however, MobileNetV1 demonstrates promising results, which we leveraged for further development.
After quantization, the MobileNetV1 model was reduced to 3.48 MB, approximately 20% of its original size. Remarkably, the quantized model maintained an accuracy of 0.9 in detecting nails in boards, closely matching the performance of the original. Other quantized models also showed accuracy similar to their original versions, with discrepancies appearing only from the third decimal place onward. This outcome aligns with the goals of TFLite quantization [108]. Overall, the 3.48 MB quantized MobileNetV1 model is notably compact for an ML model, making it especially suitable for edge devices with limited storage capacity. This model was deployed for real-time inference, where a ‘boards’ result does not broadcast any sound and a ‘boards with nails’ result broadcasts “Pay attention! There are nails in the boards”.
Figure 8 shows a screenshot of the real-time implementation in the lab. The prototype was not connected to any cloud or internet services. In Scenario 1, the screen shows “There are nails” when the prototype directly faces the “board with nails” object, and the prototype issues the alert “Pay attention, there are nails in the boards”. In Scenario 2, the screen shows “No nails, safe” when the prototype targets the “board” object, and no safety alert is issued. The prototype demonstrated zero latency when predicting every frame of a real-time video stream. Furthermore, the prototype is fully portable and offline and can process data in a fully decentralized way without sending data to a central server, thereby offering real-time decision-making advantages and cost-effective bandwidth usage. A video showing the implementation of the edge computing prototype can be found at the end of the paper.

5. Limitations

To test the scalability and adaptability of the prototype in a real construction site scenario, we took the nail safety detection prototype to a construction site in Raleigh, North Carolina, where the internet was not accessible. As illustrated in Figure 9, we placed numerous boards and beam timbers with protruding nails at five different locations in the building under construction and tested the prototype’s adaptability. The materials used in this stage of validation were not included in any of the model training. We found that the prototype performed adequately at these actual construction site locations but also showed extensive room for improvement. Robust performance can be impeded by several factors, such as unseen data, insufficient lighting, and a chaotic construction environment. Furthermore, Figure 9 also shows that multiple locations (Locations 3, 4, and 5) were full of scattered building materials, which added noise to the model predictions. In future research, we will consider improving the prototype’s scalability and adaptability based on these observations.

6. Recommendations for Future Research

The case study involved training the model in cloud environments and performing model inference entirely on a local device. This corresponds to Level 4 edge intelligence (EI), according to Zhou et al. [26], who identified a six-level rating for EI, with Level 6 representing scenarios where both training and inference occur entirely on the device and Level 1 representing scenarios entirely in the cloud. Currently, training small-scale or specialized models on edge devices is feasible if optimized algorithms and model compression techniques are seamlessly integrated, enabling more adaptive real-time model parameter adjustments. However, training large or complex models directly on edge devices is impractical due to constraints in computational resources, memory, and energy. Continuous advancements in microchips provide potential opportunities to fully create data, train models, and run inference solely on the edge. However, this does not necessarily mean all devices should be at the “best level”. Instead, future cloud-edge coordination should consider multiple criteria, such as latency, data security, energy efficiency, and hardware cost. One possibility is to embed microchips in edge devices and integrate federated learning, allowing each edge node to collect, process, and train on its own data while preserving data privacy [109]. This enables parallel data computing to adjust model parameters, resulting in more reliable decision-making.
Integrating the Edge TPU can further enhance performance by offering high efficiency, low latency, and low power consumption. This is especially advantageous for running modern open-source large language models (LLMs, e.g., LLaMA, Mistral) on edge devices, which require significant computational resources. The rise of these large-scale models indicates a shift of computational load from the training phase to inference, where inference may emerge as a new bottleneck if the deployment process remains outdated. Integrating edge computing and LLMs can enable an edge “brain” to perform more complicated tasks. Future research could explore developing a framework that enables an intelligent, knowledge-driven edge “assistant” powered by LLMs on edge devices, allowing local utilization to better support engineers in their daily tasks.
Lastly, the integration of edge computing and robotics in the construction industry can potentially revolutionize traditional practices. Edge computing’s ability to process data locally and in real time enhances the responsiveness and autonomy of robotics. Conversely, robotics provides autonomous mobility that can enable remote project operation, and robots generate valuable data that can be processed locally, providing insights for efficient project management. Robots benefit from immediate access to processed data, simulating various “what-if” cases, enabling quicker decision-making, improved navigation, and the ability to adapt to changing conditions in real time.

7. Conclusions

As the construction industry increasingly harnesses the power of image-based AI models for real-time monitoring, inspection, and tracking, ensuring seamless deployment of these applications on construction sites has become essential, especially in on-site locations with limited access to high-speed internet. This research introduces a comprehensive edge computing framework tailored for two practical applications: material classification and safety detection. By leveraging the TL MobileNet model, a binary image classification system was developed and subsequently quantized using TFLite to optimize performance. To facilitate real-time ML tasks, the researchers equipped Raspberry Pi edge devices with a battery, camera, audio module, and Edge TPU hardware. The Raspberry Pi environment was further enhanced with vision and text-to-audio modules, streamlining the deployment process. The results showed that (1) both case studies demonstrated high accuracy in classification tasks; (2) the quantized models were approximately 20% of their original sizes while maintaining the same accuracy; (3) the prototype achieved zero latency in differentiating materials and identifying hazardous nails without any internet connectivity; and (4) the system effectively integrated vision detection results with audio and textual outputs, promoting a multimodal synchronization application.
The contributions of this research are threefold. First, this study developed an efficient and generalized edge computing framework, with a focus on TL and quantization processes, to optimize real-time image classification on construction sites. This framework allows construction companies to perform sophisticated image classification on the edge without compromising model accuracy or requiring additional investment in computing resources, paving the way for future AI applications on construction sites. Second, a multimodal synchronization mechanism was introduced that processes visual data inputs and outputs textual and audio data within an edge computing environment. This mechanism not only facilitates interactive communication among on-site collaborators but also enhances situational awareness, enabling more immediate decision-making. Lastly, this research provides practical examples that demonstrate the effectiveness and efficiency of the proposed framework, addressing a common gap in the industry. The study presents a comprehensive set of hardware and software modules, integrating all components seamlessly to demonstrate real-time use cases and their benefits for construction management. By adopting this prototype, project managers can reduce centralized management efforts and promote a human-technology interaction environment for future construction. Project managers can also envision a more intelligent edge “assistant” by adding additional modules built on the developed framework.

Supplementary Materials

The reviewed 106 papers can be found at: https://docs.google.com/spreadsheets/d/e/2PACX-1vTklHOvdLNyAyXTM62zXn85SByvPAoVrORclSvjNOBwmjqreDNZ4ehXvVBfCfTFrQ/pubhtml (accessed on 24 November 2023). The real-time edge computing-enabled safety detection demo can be found at: https://www.youtube.com/watch?v=jtZXUE13TmE (accessed on 24 November 2023).

Author Contributions

G.C., A.A. and E.J. conceived the ideas. G.C. led the research design, model development, evaluation, prototype deployment, and paper writing. A.A. assisted in data collection and analysis, as well as paper revisions. E.J. supervised the research, acquired the necessary resources and funding, and contributed to paper revisions. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Hamledari, H.; Fischer, M. Construction Payment Automation Using Blockchain-Enabled Smart Contracts and Robotic Reality Capture Technologies. Autom. Constr. 2021, 132, 103926. [Google Scholar] [CrossRef]
  2. Akhavian, R.; Behzadan, A.H. Smartphone-Based Construction Workers’ Activity Recognition and Classification. Autom. Constr. 2016, 71, 198–209. [Google Scholar] [CrossRef]
  3. Nath, N.D.; Chaspari, T.; Behzadan, A.H. Single- and Multi-Label Classification of Construction Objects Using Deep Transfer Learning Methods. J. Inf. Technol. Constr. 2019, 24, 511–526. [Google Scholar] [CrossRef]
  4. Jiang, Y.; Bai, Y.; Han, S. Determining Ground Elevations Covered by Vegetation on Construction Sites Using Drone-Based Orthoimage and Convolutional Neural Network. J. Comput. Civ. Eng. 2020, 34, 04020049. [Google Scholar] [CrossRef]
  5. Ham, Y.; Kamari, M. Automated Content-Based Filtering for Enhanced Vision-Based Documentation in Construction toward Exploiting Big Visual Data from Drones. Autom. Constr. 2019, 105, 102831. [Google Scholar] [CrossRef]
  6. Luo, H.; Liu, J.; Fang, W.; Love, P.E.D.; Yu, Q.; Lu, Z. Real-Time Smart Video Surveillance to Manage Safety: A Case Study of a Transport Mega-Project. Adv. Eng. Inform. 2020, 45, 101100. [Google Scholar] [CrossRef]
  7. Chen, C.; Zhu, Z.; Hammad, A. Automated Excavators Activity Recognition and Productivity Analysis from Construction Site Surveillance Videos. Autom. Constr. 2020, 110, 103045. [Google Scholar] [CrossRef]
  8. Asadi, K.; Kalkunte Suresh, A.; Ender, A.; Gotad, S.; Maniyar, S.; Anand, S.; Noghabaei, M.; Han, K.; Lobaton, E.; Wu, T. An Integrated UGV-UAV System for Construction Site Data Collection. Autom. Constr. 2020, 112, 103068. [Google Scholar] [CrossRef]
  9. Kim, H.; Seong, J.; Jung, H.-J. Real-Time Struck-By Hazards Detection System for Small- and Medium-Sized Construction Sites Based on Computer Vision Using Far-Field Surveillance Videos. J. Comput. Civ. Eng. 2023, 37, 04023028. [Google Scholar] [CrossRef]
  10. Lim, Y.G.; Wu, J.; Goh, Y.M.; Tian, J.; Gan, V. Automated Classification of “Cluttered” Construction Housekeeping Images through Supervised and Self-Supervised Feature Representation Learning. Autom. Constr. 2023, 156, 105095. [Google Scholar] [CrossRef]
  11. Rabbi, A.B.K.; Jeelani, I. AI Integration in Construction Safety: Current State, Challenges, and Future Opportunities in Text, Vision, and Audio Based Applications. Autom. Constr. 2024, 164, 105443. [Google Scholar] [CrossRef]
  12. Kim, J.; Hwang, J.; Jeong, I.; Chi, S.; Seo, J.O.; Kim, J. Generalized Vision-Based Framework for Construction Productivity Analysis Using a Standard Classification System. Autom. Constr. 2024, 165, 105504. [Google Scholar] [CrossRef]
  13. Sherafat, B.; Ahn, C.R.; Akhavian, R.; Behzadan, A.H.; Golparvar-Fard, M.; Kim, H.; Lee, Y.-C.; Rashidi, A.; Azar, E.R. Automated Methods for Activity Recognition of Construction Workers and Equipment: State-of-the-Art Review. J. Constr. Eng. Manag. 2020, 146, 03120002. [Google Scholar] [CrossRef]
  14. Song, L.; Mohammed, T.; Stayshich, D.; Eldin, N. A Cost Effective Material Tracking and Locating Solution for Material Laydown Yard. Procedia Eng. 2015, 123, 538–545. [Google Scholar] [CrossRef]
  15. Teizer, J. Status Quo and Open Challenges in Vision-Based Sensing and Tracking of Temporary Resources on Infrastructure Construction Sites. Adv. Eng. Inform. 2015, 29, 225–238. [Google Scholar] [CrossRef]
  16. Lee, D.; Lee, S. Digital Twin for Supply Chain Coordination in Modular Construction. Appl. Sci. 2021, 11, 5909. [Google Scholar] [CrossRef]
  17. Han, K.K.; Golparvar-Fard, M. Appearance-Based Material Classification for Monitoring of Operation-Level Construction Progress Using 4D BIM and Site Photologs. Autom. Constr. 2015, 53, 44–57. [Google Scholar] [CrossRef]
  18. Paneru, S.; Jeelani, I. Computer Vision Applications in Construction: Current State, Opportunities & Challenges. Autom. Constr. 2021, 132, 103940. [Google Scholar]
  19. Chen, Z.; Chen, J.; Shen, F.; Lee, Y. Collaborative Mobile-Cloud Computing for Civil Infrastructure Condition Inspection. J. Comput. Civ. Eng. 2015, 29, 04014066. [Google Scholar] [CrossRef]
  20. Tan, Y.; Yi, W.; Chen, P.; Zou, Y. An Adaptive Crack Inspection Method for Building Surface Based on BIM, UAV and Edge Computing. Autom. Constr. 2024, 157, 105161. [Google Scholar] [CrossRef]
  21. Kochovski, P.; Stankovski, V. Supporting Smart Construction with Dependable Edge Computing Infrastructures and Applications. Autom. Constr. 2018, 85, 182–192. [Google Scholar] [CrossRef]
  22. Abner, M.; Wong, P.K.Y.; Cheng, J.C.P. Battery Lifespan Enhancement Strategies for Edge Computing-Enabled Wireless Bluetooth Mesh Sensor Network for Structural Health Monitoring. Autom. Constr. 2022, 140, 104355. [Google Scholar] [CrossRef]
  23. Asp, A.; Sydorov, Y.; Keskikastari, M.; Valkama, M.; Niemelä, J. Impact of Modern Construction Materials on Radio Signal Propagation: Practical Measurements and Network Planning Aspects. In Proceedings of the 2014 IEEE 79th Vehicular Technology Conference (VTC Spring), Seoul, Republic of Korea, 18–21 May 2014; Institute of Electrical and Electronics Engineers: Piscataway, NJ, USA, 2014; pp. 1–7. [Google Scholar]
  24. Satyanarayanan, M. The Emergence of Edge Computing. Computer 2017, 50, 30–39. [Google Scholar] [CrossRef]
  25. Ali, O.; Ishak, M.K.; Bhatti, M.K.L.; Khan, I.; Kim, K.-I. A Comprehensive Review of Internet of Things: Technology Stack, Middlewares, and Fog/Edge Computing Interface. Sensors 2022, 22, 995. [Google Scholar] [CrossRef] [PubMed]
  26. Zhou, Z.; Chen, X.; Li, E.; Zeng, L.; Luo, K.; Zhang, J. Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing. Proc. IEEE 2019, 107, 1738–1762. [Google Scholar]
  27. Russo, T.; Ribeiro, R.R.; Araghi, A.; Lameiras, R.d.M.; Granja, J.; Azenha, M. Continuous Monitoring of Elastic Modulus of Mortars Using a Single-Board Computer and Cost-Effective Components. Buildings 2023, 13, 1117. [Google Scholar] [CrossRef]
  28. Chen, C.; Gu, H.; Lian, S.; Zhao, Y.; Xiao, B. Investigation of Edge Computing in Computer Vision-Based Construction Resource Detection. Buildings 2022, 12, 2167. [Google Scholar] [CrossRef]
  29. Salim, D.D.; Mulder, H.M.; Burry, J.R. Form Fostering: A Novel Design Approach for Interacting with Parametric Models in the Embodied Virtuality. J. Inf. Technol. Constr. (ITcon) 2011, 16, 135–150. [Google Scholar]
  30. Li, K.; Zhao, J.; Hu, J.; Chen, Y. Dynamic Energy Efficient Task Offloading and Resource Allocation for NOMA-Enabled IoT in Smart Buildings and Environment. Build. Environ. 2022, 226, 109513. [Google Scholar] [CrossRef]
  31. Li, D.; Li, B.; Qin, N.; Jing, X.; Du, C.; Wan, C. The Research of NOMA-MEC Network Based on Untrusted Relay-Assisted Transmission in Power Internet of Things. In Proceedings of the 2020 2nd International Conference on Civil Engineering, Environment Resources and Energy Materials, Changsha, China, 18–20 September 2020; IOP Conference Series: Earth and Environmental Science. IOP Publishing Ltd.: Bristol, UK, 2021; Volume 634. [Google Scholar]
  32. Nvidia NVIDIA Jetson Nano Product Design Guide. 2020. Available online: https://docs-download.oss-cn-shanghai.aliyuncs.com/%E5%8E%82%E5%95%86%E8%B5%84%E6%96%99/%E8%8B%B1%E4%BC%9F%E8%BE%BE/%E8%8B%B1%E4%BC%9F%E8%BE%BE%E7%A1%AC%E4%BB%B6/Jetson%20Nano-Z/Jetson%20Nano-Z%20Product%20Design%20Guide%20DG-09502-001v2.1.pdf (accessed on 20 September 2024).
  33. Blanco, J.L.; Mullin, A.; Pandya, K.; Sridhar, M. The New Age of Engineering and Construction Technology; McKinsey & Company: Chicago, IL, USA, 2017. [Google Scholar]
  34. Hazelrigg, G.A. Honing Your Proposal Writing Skills. Available online: https://www.soest.hawaii.edu/GG/FACULTY/smithkonter/GG_711/proposal_tips/NSF_ProposalWriting_Tips.pdf (accessed on 19 September 2024).
  35. Lu, S.; Lu, J.; An, K.; Wang, X.; He, Q. Edge Computing on IoT for Machine Signal Processing and Fault Diagnosis: A Review. IEEE Internet Things J. 2023, 10, 11093–11116. [Google Scholar] [CrossRef]
  36. Kalyani, Y.; Collier, R. A Systematic Survey on the Role of Cloud, Fog, and Edge Computing Combination in Smart Agriculture. Sensors 2021, 21, 5922. [Google Scholar] [CrossRef]
  37. Moore, G.E. Cramming More Components onto Integrated Circuits. Proc. IEEE 1998, 86, 82–85. [Google Scholar]
  38. Dilley, J.; Maggs, B.; Parikh, J.; Prokop, H.; Sitaraman, R.; Weihl, B. Globally Distributed Content Delivery. IEEE Internet Comput. 2002, 6, 50–58. [Google Scholar] [CrossRef]
  39. Noble, B.D.; Satyanarayanan, M.; Narayanan, D.; Tilton, J.E.; Flinn, J.; Walker, K.R. Agile Application-Aware Adaptation for Mobility. In Proceedings of the 16th ACM Symposium on Operating System Principles, St. Malo, France, 5–8 October 1997; pp. 276–287. [Google Scholar]
  40. Satyanarayanan, M. Pervasive Computing: Vision and Challenges. IEEE Pers. Commun. 2001, 8, 10–17. [Google Scholar] [CrossRef]
  41. Rowstron, A.; Druschel, P. Pastry: Scalable, Decentralized Object Location and Routing for Large-Scale Peer-to-Peer Systems. In Proceedings of the 18th IFIP/ACM International Conference on Distributed Systems Platforms (Middleware 2001), Heidelberg, Germany, 12–16 November 2001; pp. 329–350. [Google Scholar]
  42. Amazon Elastic Compute Cloud 2006. Available online: https://aws.amazon.com/about-aws/whats-new/2006/08/24/announcing-amazon-elastic-compute-cloud-amazon-ec2---beta/ (accessed on 20 September 2024).
  43. Satyanarayanan, M.; Bahl, P.; Cáceres, R.; Davies, N. The Case for VM-Based Cloudlets in Mobile Computing. IEEE Pervasive Comput. 2009, 8, 14–23. [Google Scholar] [CrossRef]
  44. Bonomi, F.; Milito, R.; Zhu, J.; Addepalli, S. Fog Computing and Its Role in the Internet of Things. In Proceedings of the MCC ’12: First Edition of the MCC Workshop on Mobile Cloud Computing, Helsinki, Finland, 17 August 2012; pp. 13–16. [Google Scholar]
  45. Task Committee on Application of Small Computers in Construction of the Construction Division. Application of Small Computers in Construction. J. Constr. Eng. Manag. 1985, 111, 173–189. [Google Scholar] [CrossRef]
  46. Kim, C.; Park, T.; Lim, H.; Kim, H. On-Site Construction Management Using Mobile Computing Technology. Autom. Constr. 2013, 35, 415–423. [Google Scholar] [CrossRef]
  47. Chen, Z.; Chen, J. Mobile Imaging and Computing for Intelligent Structural Damage Inspection. Adv. Civ. Eng. 2014, 2014, 483729. [Google Scholar] [CrossRef]
  48. Zhang, H.; Chi, S.; Yang, J.; Nepal, M.; Moon, S. Development of a Safety Inspection Framework on Construction Sites Using Mobile Computing. J. Manag. Eng. 2017, 33, 04016048. [Google Scholar] [CrossRef]
  49. B’Far, R.; Fielding, R.T. Mobile Computing Principles; Cambridge University Press: Cambridge, UK, 2004; ISBN 9780521817332. [Google Scholar]
  50. Chen, Y.; Kamara, J.M. A Framework for Using Mobile Computing for Information Management on Construction Sites. Autom. Constr. 2011, 20, 776–788. [Google Scholar] [CrossRef]
  51. Fang, Y.; Cho, Y.K.; Zhang, S.; Perez, E. Case Study of BIM and Cloud–Enabled Real-Time RFID Indoor Localization for Construction Management Applications. J. Constr. Eng. Manag. 2016, 142, 05016003. [Google Scholar] [CrossRef]
  52. Zhao, X. A Scientometric Review of Global BIM Research: Analysis and Visualization. Autom. Constr. 2017, 80, 37–47. [Google Scholar] [CrossRef]
  53. Chen, Y. A Conceptual Framework for a BIM-Based Collaboration Platform Supported by Mobile Computing. Appl. Mech. Mater. 2011, 94–96, 2144–2148. [Google Scholar] [CrossRef]
  54. Chi, H.L.; Kang, S.C.; Wang, X. Research Trends and Opportunities of Augmented Reality Applications in Architecture, Engineering, and Construction. Autom. Constr. 2013, 33, 116–122. [Google Scholar] [CrossRef]
  55. Chen, K. Enhancing Construction Safety Management through Edge Computing: Framework and Scenarios. J. Inf. Technol. Constr. 2020, 25, 438–451. [Google Scholar] [CrossRef]
  56. Ye, X.W.; Li, Z.X.; Jin, T. Smartphone-Based Structural Crack Detection Using Pruned Fully Convolutional Networks and Edge Computing. Smart Struct. Syst. 2022, 29, 141–151. [Google Scholar] [CrossRef]
  57. Żarski, M.; Wójcik, B.; Książek, K.; Miszczak, J.A. Finicky Transfer Learning—A Method of Pruning Convolutional Neural Networks for Cracks Classification on Edge Devices. Comput.-Aided Civ. Infrastruct. Eng. 2022, 37, 500–515. [Google Scholar] [CrossRef]
  58. Kochovski, P.; Stankovski, V. Building Applications for Smart and Safe Construction with the DECENTER Fog Computing and Brokerage Platform. Autom. Constr. 2021, 124, 103562. [Google Scholar] [CrossRef]
  59. Proietti, M.; Bianchi, F.; Marini, A.; Menculini, L.; Termite, L.F.; Garinei, A.; Biondi, L.; Marconi, M. Edge Intelligence with Deep Learning in Greenhouse Management. In Proceedings of the International Conference on Smart Cities and Green ICT Systems, SMARTGREENS—Proceedings, Online, 28–30 April 2021; Science and Technology Publications, Lda: Setúbal, Portugal, 2021; pp. 180–187. [Google Scholar]
  60. Alsalemi, A.; Himeur, Y.; Bensaali, F.; Amira, A. An Innovative Edge-Based Internet of Energy Solution for Promoting Energy Saving in Buildings. Sustain. Cities Soc. 2022, 78, 103571. [Google Scholar] [CrossRef]
  61. Costin, A.; McNair, J. IoT and Edge Computing in the Construction Site. In Buildings and Semantics; CRC Press: Boca Raton, FL, USA, 2022; pp. 223–237. [Google Scholar]
  62. Bae, H.; Golparvar-Fard, M.; White, J. Image-Based Localization and Content Authoring in Structure-from-Motion Point Cloud Models for Real-Time Field Reporting Applications. J. Comput. Civ. Eng. 2014, 29, B4014008. [Google Scholar] [CrossRef]
  63. Wei, Y.; Akinci, B. A Vision and Learning-Based Indoor Localization and Semantic Mapping Framework for Facility Operations and Management. Autom. Constr. 2019, 107, 102915. [Google Scholar] [CrossRef]
  64. Feng, C.; Kamat, V.R. Plane Registration Leveraged by Global Constraints for Context-Aware AEC Applications. Comput.-Aided Civ. Infrastruct. Eng. 2013, 28, 325–343. [Google Scholar] [CrossRef]
  65. Zakiyudin, M.Z.; Fathi, M.S.; Rambat, S.; Mohd Tobi, S.U.; Kasim, N.; Ahmad Latiffi, A. The Potential of Context-Aware Computing for Building Maintenance Management Systems. Appl. Mech. Mater. 2013, 405–408, 3505–3508. [Google Scholar] [CrossRef]
  66. Li, M.; Feng, X.; Han, Y.; Liu, X. Mobile Augmented Reality-Based Visualization Framework for Lifecycle O&M Support of Urban Underground Pipe Networks. Tunn. Undergr. Space Technol. 2023, 136, 105069. [Google Scholar] [CrossRef]
  67. Hayden, S.; Ames, D.P.; Turner, D.; Keene, T.; Andrus, D. Mobile, Low-Cost, and Large-Scale Immersive Data Visualization Environment for Civil Engineering Applications. J. Comput. Civ. Eng. 2015, 29, 05014011. [Google Scholar] [CrossRef]
  68. Li, Q.; Wang, X.; Wang, P.; Zhang, W.; Yin, J. FARDA: A Fog-Based Anonymous Reward Data Aggregation Security Scheme in Smart Buildings. Build. Environ. 2022, 225, 109578. [Google Scholar] [CrossRef]
69. Jararweh, Y.; Otoum, S.; Al Ridhawi, I. Trustworthy and Sustainable Smart City Services at the Edge. Sustain. Cities Soc. 2020, 62, 102394. [Google Scholar] [CrossRef]
  70. Sun, Y.; Liu, J.; Bashir, A.K.; Tariq, U.; Liu, W.; Chen, K.; Alshehri, M.D. E-CIS: Edge-Based Classifier Identification Scheme in Green & Sustainable IoT Smart City. Sustain. Cities Soc. 2021, 75, 103312. [Google Scholar] [CrossRef]
  71. Xiong, F.; Xu, C.; Ren, W.; Zheng, R.; Gong, P.; Ren, Y. A Blockchain-Based Edge Collaborative Detection Scheme for Construction Internet of Things. Autom. Constr. 2022, 134, 104066. [Google Scholar] [CrossRef]
72. Zou, H.; Zhou, Y.; Jiang, H.; Chien, S.C.; Xie, L.; Spanos, C.J. WinLight: A WiFi-Based Occupancy-Driven Lighting Control System for Smart Building. Energy Build. 2018, 158, 924–938. [Google Scholar] [CrossRef]
  73. Williams, G.; Gheisari, M.; Chen, P.-J.; Irizarry, J. BIM2MAR: An Efficient BIM Translation to Mobile Augmented Reality Applications. J. Manag. Eng. 2015, 31, A4014009. [Google Scholar] [CrossRef]
  74. Fenais, A.; Ariaratnam, S.T.; Ayer, S.K.; Smilovsky, N. Integrating Geographic Information Systems and Augmented Reality for Mapping Underground Utilities. Infrastructures 2019, 4, 60. [Google Scholar] [CrossRef]
  75. Behzadan, A.H.; Kamat, V.R. Integrated Information Modeling and Visual Simulation of Engineering Operations Using Dynamic Augmented Reality Scene Graphs. J. Inf. Technol. Constr. (ITcon) 2011, 16, 259–278. [Google Scholar]
  76. Lowe, D.G. Object Recognition from Local Scale-Invariant Features. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999; IEEE: Piscataway, NJ, USA, 1999; pp. 1150–1157. [Google Scholar]
  77. Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; pp. 886–893. [Google Scholar]
78. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  79. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
80. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar] [CrossRef]
  81. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; IEEE Computer Society: Washington, DC, USA, 2016; pp. 770–778. [Google Scholar]
  82. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Li, F.-F. ImageNet: A Large-Scale Hierarchical Image Database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; IEEE: Piscataway, NJ, USA, 2009. [Google Scholar]
  83. Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
  84. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  85. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  86. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; IEEE Computer Society: Washington, DC, USA, 2018; pp. 6848–6856. [Google Scholar]
  87. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2019, arXiv:1905.11946. [Google Scholar]
  88. Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258. [Google Scholar]
  89. Raza, R.; Zulfiqar, F.; Khan, M.O.; Arif, M.; Alvi, A.; Iftikhar, M.A.; Alam, T. Lung-EffNet: Lung Cancer Classification Using EfficientNet from CT-Scan Images. Eng. Appl. Artif. Intell. 2023, 126, 106902. [Google Scholar] [CrossRef]
  90. Dimitrov, A.; Golparvar-Fard, M. Vision-Based Material Recognition for Automated Monitoring of Construction Progress and Generating Building Information Modeling from Unordered Site Image Collections. Adv. Eng. Inform. 2014, 28, 37–49. [Google Scholar] [CrossRef]
  91. Gong, J.; Caldas, C.H. Computer Vision-Based Video Interpretation Model for Automated Productivity Analysis of Construction Operations. J. Comput. Civ. Eng. 2010, 24, 252–263. [Google Scholar] [CrossRef]
  92. Xiahou, X.; Li, Z.; Xia, J.; Zhou, Z.; Li, Q. A Feature-Level Fusion-Based Multimodal Analysis of Recognition and Classification of Awkward Working Postures in Construction. J. Constr. Eng. Manag. 2023, 149, 04023138. [Google Scholar] [CrossRef]
  93. Chen, G.; Liu, M.; Zhang, Y.; Wang, Z.; Hsiang, S.M.; He, C. Using Images to Detect, Plan, Analyze, and Coordinate a Smart Contract in Construction. J. Manag. Eng. 2023, 39, 04023002. [Google Scholar] [CrossRef]
  94. Luo, X.; Li, H.; Cao, D.; Dai, F.; Seo, J.; Lee, S. Recognizing Diverse Construction Activities in Site Images via Relevance Networks of Construction-Related Objects Detected by Convolutional Neural Networks. J. Comput. Civ. Eng. 2018, 32, 04018012. [Google Scholar] [CrossRef]
  95. Mneymneh, B.E.; Abbas, M.; Khoury, H. Vision-Based Framework for Intelligent Monitoring of Hardhat Wearing on Construction Sites. J. Comput. Civ. Eng. 2019, 33, 04018066. [Google Scholar] [CrossRef]
  96. Chen, J.; Lu, W.; Lou, J. Automatic Concrete Defect Detection and Reconstruction by Aligning Aerial Images onto Semantic-Rich Building Information Model. Comput.-Aided Civ. Infrastruct. Eng. 2023, 38, 1079–1098. [Google Scholar] [CrossRef]
  97. Alsakka, F.; Assaf, S.; El-Chami, I.; Al-Hussein, M. Computer Vision Applications in Offsite Construction. Autom. Constr. 2023, 154, 104980. [Google Scholar] [CrossRef]
  98. Shin, Y.; Choi, Y.; Won, J.; Hong, T.; Koo, C. A New Benchmark Model for the Automated Detection and Classification of a Wide Range of Heavy Construction Equipment. J. Manag. Eng. 2024, 40, 04023069. [Google Scholar] [CrossRef]
  99. Akhavian, R.; Behzadan, A.H. Construction Equipment Activity Recognition for Simulation Input Modeling Using Mobile Sensors and Machine Learning Classifiers. Adv. Eng. Inform. 2015, 29, 867–877. [Google Scholar] [CrossRef]
100. Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. arXiv 2017, arXiv:1712.05877. [Google Scholar] [CrossRef]
  101. Xiao, X.; Chen, C.; Skitmore, M.; Li, H.; Deng, Y. Exploring Edge Computing for Sustainable CV-Based Worker Detection in Construction Site Monitoring: Performance and Feasibility Analysis. Buildings 2024, 14, 2299. [Google Scholar] [CrossRef]
  102. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
103. Chen, G.; Alsharef, A.; Pedraza, M.; Albert, A.; Jaselskis, E. A Novel Edge Computing Framework for Construction Nail Detection under Conditions of Constrained Computing Resources. In Proceedings of the 2024 ASCE International Conference on Computing in Civil Engineering, Pittsburgh, PA, USA, 28–31 July 2024; ASCE: Reston, VA, USA, 2024. [Google Scholar]
104. Google Coral. Coral USB Accelerator Datasheet. 2019. Available online: https://cdn-reichelt.de/documents/datenblatt/A300/CORAL-USB-ACCELERATOR-DATASHEET.pdf (accessed on 20 September 2024).
105. U.S. Department of Labor. National Census of Fatal Occupational Injuries in 2022. 2022. Available online: https://www.bls.gov/news.release/pdf/cfoi.pdf (accessed on 21 March 2024).
  106. OSHA. Safety and Health Regulations for Construction—General Safety and Health Provisions. Available online: https://www.osha.gov/laws-regs/regulations/standardnumber/1926/1926.25 (accessed on 7 January 2024).
107. American Plywood Association. Building for High-Wind Resilience in Light-Frame Wood Construction (Design Guidance). 2021. Available online: https://osb.westfraser.com/wp-content/uploads/2018/08/M310E-2018-Building-in-High-Wind.pdf (accessed on 20 September 2024).
108. Google. Post-Training Quantization | TensorFlow Lite. Available online: https://www.tensorflow.org/lite/performance/post_training_quantization (accessed on 15 May 2024).
109. Chen, G.; Alsharef, A.; Jaselskis, E. Integrating Sensor-Empowered Federated Learning and Smart Contracts for Automatic Flood Risk Management. In Proceedings of the 2024 ASCE International Conference on Computing in Civil Engineering, Pittsburgh, PA, USA, 28–31 July 2024; ASCE: Reston, VA, USA, 2024. [Google Scholar]
Figure 1. Co-occurrence of trending research topics.
Figure 2. Co-occurrence map of the implemented device/hardware.
Figure 3. Edge computing implementation framework.
Figure 4. MobileNet architecture and model development process.
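Figure 4 summarizes the transfer-learning workflow used to develop the lightweight binary classifiers. For illustration only, the following minimal sketch (assuming TensorFlow/Keras; the dataset path, class labels, and hyperparameters are hypothetical) shows how an ImageNet-pre-trained MobileNetV2 backbone can be frozen and topped with a new binary classification head:

```python
# Minimal transfer-learning sketch: freeze an ImageNet-pre-trained MobileNetV2
# backbone and train only a new binary head (e.g., plywood vs. OSB).
import tensorflow as tf

IMG_SIZE = (224, 224)

# Hypothetical directory of labeled jobsite images, one subfolder per class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "jobsite_images/train", image_size=IMG_SIZE, batch_size=32)

base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False  # keep the pre-trained feature extractor fixed

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNet expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```

Freezing the backbone keeps training inexpensive on small, site-specific datasets; unfreezing and fine-tuning the top backbone layers at a lower learning rate is a common subsequent refinement.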
Figure 5. Confusion matrix of trained MobileNetV2 on the material classification task.
Figure 6. Material classification prototype implementation in the lab.
Figure 7. Confusion matrix of trained MobileNetV1 on the nail detection task.
Figure 8. Real-time edge computing prototype implementation in the lab environment: Scenario 1 is the detection of a “board with nails”; Scenario 2 is the detection of a “board”.
Figure 9. Experimental setup at a real construction site: Location 1 is an image taken from inside the building under construction showing downtown Raleigh, NC; Location 2 shows an interior room without scattered materials; Location 3 shows scattered building materials; Location 4 shows cluttered construction materials and debris; and Location 5 shows grid and buffer materials on the grounds of the building site.
Table 1. Products comprising the developed edge device.

| Product | Function | Price (USD) |
| --- | --- | --- |
| CanaKit Raspberry Pi 4 8 GB Starter Kit—8 GB RAM | Single-board computer | 159.99 |
| Arducam for Raspberry Pi Camera Module | Camera | 12.99 |
| Vilros 8-Inch 1024 × 768 Screen | Screen | 79.99 |
| Vilros 2.4 GHz Mini Wireless Keyboard with Touchpad Mouse—USB Receiver | Keyboard with touchpad mouse | 14.99 |
| Vilros Mini Bluetooth Speaker | Speaker | 8.99 |
| Power Ridge Portable Power Bank with AC Outlet, 100 W 26,270 mAh | Battery | 69.99 |
| Coral USB Accelerator | Google Edge TPU ML accelerator coprocessor | 83.90 |
| Total | | 430.84 |
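The Coral USB Accelerator listed in Table 1 executes compiled TensorFlow Lite models on the Edge TPU [104]. A minimal inference sketch, assuming the pycoral library and a compiled *_edgetpu.tflite classifier (the file names and label map below are hypothetical), might look as follows:

```python
# Illustrative on-device classification with the Coral USB Accelerator.
from PIL import Image
from pycoral.adapters import classify, common
from pycoral.utils.edgetpu import make_interpreter

# Hypothetical compiled model produced by the Edge TPU compiler.
interpreter = make_interpreter("material_classifier_edgetpu.tflite")
interpreter.allocate_tensors()

# Resize the captured frame to the model's expected input size.
image = Image.open("site_photo.jpg").convert("RGB").resize(
    common.input_size(interpreter))
common.set_input(interpreter, image)

interpreter.invoke()
top = classify.get_classes(interpreter, top_k=1)[0]
labels = {0: "OSB", 1: "plywood"}  # hypothetical label map
print(f"{labels[top.id]}: {top.score:.2f}")
```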
Table 2. Performance statistics of different models in binary material classification.

| Model | Accuracy | Class | Precision | Recall | F1 Score | Support | Model Size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNetV1 | 1.00 | Plywood | 1.00 | 1.00 | 1.00 | 259 | 27.5 MB |
| | | OSB | 1.00 | 1.00 | 1.00 | 322 | |
| MobileNetV2 | 1.00 | Plywood | 1.00 | 1.00 | 1.00 | 259 | 14.3 MB |
| | | OSB | 1.00 | 1.00 | 1.00 | 322 | |
| MobileNetV3Small | 0.91 | Plywood | 0.89 | 0.91 | 0.90 | 259 | 9.7 MB |
| | | OSB | 0.93 | 0.91 | 0.92 | 322 | |
| MobileNetV3Large | 0.89 | Plywood | 0.81 | 1.00 | 0.90 | 259 | 26.5 MB |
| | | OSB | 1.00 | 0.82 | 0.90 | 322 | |
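The per-class precision, recall, and F1 scores reported in Tables 2 and 3 follow directly from confusion matrices such as those in Figures 5 and 7. For reference, a brief sketch (assuming scikit-learn; the label arrays below are hypothetical placeholders, not the study’s data) of how such statistics can be computed:

```python
# Derive a confusion matrix and per-class precision/recall/F1 from predictions.
from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]  # hypothetical ground-truth labels (0 = plywood, 1 = OSB)
y_pred = [0, 0, 1, 0, 1, 0]  # hypothetical model predictions

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=["Plywood", "OSB"]))
```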
Table 3. Performance statistics of different models in nail safety detection.

| Model | Accuracy | Class | Precision | Recall | F1 Score | Support | Model Size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNetV1 | 0.90 | Boards | 0.98 | 0.83 | 0.90 | 581 | 17.4 MB |
| | | Boards with nails | 0.83 | 0.98 | 0.90 | 472 | |
| MobileNetV2 | 0.87 | Boards | 0.97 | 0.79 | 0.87 | 581 | 15.1 MB |
| | | Boards with nails | 0.79 | 0.97 | 0.87 | 472 | |
| MobileNetV3Small | 0.73 | Boards | 0.92 | 0.57 | 0.70 | 581 | 9.4 MB |
| | | Boards with nails | 0.64 | 0.94 | 0.76 | 472 | |
| MobileNetV3Large | 0.83 | Boards | 0.80 | 0.92 | 0.85 | 581 | 19.2 MB |
| | | Boards with nails | 0.88 | 0.71 | 0.79 | 472 | |
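Model size, the final column in Tables 2 and 3, can be reduced further through post-training quantization [108] before compilation for the Edge TPU. Below is a minimal sketch of the TensorFlow Lite full-integer conversion workflow, reusing the hypothetical model and train_ds from the transfer-learning sketch above:

```python
# Post-training full-integer quantization sketch (TensorFlow Lite workflow).
import tensorflow as tf

def representative_data_gen():
    # A small calibration sample; train_ds is the hypothetical dataset above.
    for images, _ in train_ds.take(100):
        yield [tf.cast(images, tf.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)  # model defined above
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # full-integer I/O for the Edge TPU
converter.inference_output_type = tf.uint8

with open("material_classifier_int8.tflite", "wb") as f:
    f.write(converter.convert())
```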