Big Data in Construction: Current Applications and Future Opportunities

Munawar, Hafiz Suliman; Ullah, Fahim; Qayyum, Siddra; Shahzad, Danish

doi:10.3390/bdcc6010018

Open Access | Editor’s Choice | Review

¹ School of the Built Environment, University of New South Wales, Sydney, NSW 2052, Australia

² School of Surveying and Built Environment, University of Southern Queensland, Springfield Central, QLD 4300, Australia

³ Department of Visual Computing, University of Saarland, 66123 Saarbrücken, Germany

* Author to whom correspondence should be addressed.

Big Data Cogn. Comput. 2022, 6(1), 18; https://doi.org/10.3390/bdcc6010018

Submission received: 6 December 2021 / Revised: 29 January 2022 / Accepted: 3 February 2022 / Published: 6 February 2022

(This article belongs to the Special Issue Review Papers in Big Data, Cloud-Based Data Analysis and Learning Systems)

Abstract

Big data have become an integral part of various research fields due to the rapid advancements in the digital technologies available for dealing with data. The construction industry is no exception and has seen a spike in the data being generated due to the introduction of various digital disruptive technologies. However, despite the availability of data and the introduction of such technologies, the construction industry is lagging in harnessing big data. This paper critically explores literature published since 2010 to identify the data trends and how the construction industry can benefit from big data. The presence of tools such as computer-aided drawing (CAD) and building information modelling (BIM) provides a great opportunity for researchers in the construction industry to further improve how infrastructure can be developed, monitored, or improved in the future. The gaps in the existing research data have been explored and a detailed analysis was carried out to identify the different ways in which big data analysis and storage work in relation to the construction industry. Big data engineering (BDE) and statistics are among the most crucial steps for integrating big data technology in construction. The results of this study suggest that while the existing research studies have set the stage for improving big data research, the integration of the associated digital technologies into the construction industry is not very clear. Among the future opportunities, big data research into construction safety, site management, heritage conservation, and project waste minimization and quality improvements are key areas.

Keywords: big data; big data engineering; construction big data; digital technologies; construction industry

1. Introduction

Big data are increasingly becoming an integral part of almost all fields. The rapidity with which data are generated and piled up in the era of disruptive digital technologies is astounding [1]. Such big data have necessitated efficient data management tools and techniques to deal with the bulk of data. Recently, a great deal of focus has been dedicated to using, storing, and managing big data in various fields [2]. The rise of interest in big data is associated with the easy availability of technology such as smartphones and computers across the globe [3]. The bulk of data generated daily through these technologies has made various researchers interested in using the data for innovative purposes and in moving away from traditional, time-consuming questionnaire-based approaches for data collection to more digital data management. Algorithm development, machine learning (ML), statistical analysis, and computational model development are among the various techniques that depend on data that can be easily gathered by everyday gadgets [4,5]. The presence of bulks of data makes it possible for researchers to make informed decisions and conduct relevant analyses for their field of study.

Construction is a data-intensive sector where the bulk of data is generated and not capitalized on adequately due to slow technology adoption [6]. Accordingly, it is not surprising to see the construction sector lagging behind the technology curve by more than five years which is rather slow considering the day-to-day innovations and disruptions brought about by the booming information technology industry [7]. Moreover, big data, a relatively new technology, are not properly adopted by construction. In fact, construction big data management is in its nascency and has a long way to go to mature. However, multiple studies [6,8] show that the potential is enormous if construction big data are fully utilized.

There are various steps involved in using big data, including data acquisition, storage, classification, and refining [8]. These steps are handled through various software programs to refine the associated big data and make them usable for research and practical purposes [9,10,11]. The biggest challenge in big data management is identifying, through data refinement, which data are useful and which are not [12,13]. The immense amounts of data easily available make it hard to identify the datasets suited to a particular purpose. Moreover, the available data format may not be ready for use or easily readable for the intended purpose [14,15]. These barriers to accessing, understanding, and utilizing big data make it important to develop systems for extracting key information and analyzing it [16]. In addition, the strategic sorting and analysis of big data have opened up new avenues of research by widening the need to use data appropriately [17]. In the case of construction, some barriers to big data adoption include latency, data privacy, data availability, data governance, poor broadband connectivity at construction sites, and cost implications for long-term use. For instance, big data adoption in construction may suffer from latency, that is, lower transfer rates and longer response times caused by software issues or network problems, which may be a hurdle for some time-sensitive construction applications [18].

Furthermore, there is an increase in vulnerability in technology adoption due to the fluidity of security parameters. Storing construction design and financial information in shared resources is a concern for the construction industry [19]. Afolabi et al. [20] assessed the economies of big data in project delivery and included poor network connectivity among the threats to adoption by the construction industry.

Sorting big data requires developing database designs that would automate picking the most useful data for a given purpose [21]. Identifying a design that works best for data sorting is an entire research area on its own and has helped expand big data research by a great deal [22]. Currently, the biggest question concerning researchers in the field of big data is to find a way that creates seamless coordination between database systems such that they can hold big data, help process it, and possibly lead to an error-free statistical analysis [23]. Removing the current limitations in understanding big data will enable scientists to utilize the readily available data and make better decisions.

The construction industry is also benefiting from big data in a way that has shifted its traditional operational methods towards more automated processes. The presence of digital tools and technologies for designing and executing construction projects has enabled the construction industry to take enormous leaps in the last two decades. The possibility of modeling building structures and identifying the functionality of those structures before they are built has led to industrial investments in big data and related technologies [24,25]. Computer-aided design (CAD), together with building information modelling (BIM), is now synonymous with the construction industry [26]. The three-dimensional modeling of buildings and other construction infrastructures leads to the generation of digital files which can be stored in various formats, leading to a bulk of data generation [27]. Other digital innovations such as digital twins, 3D laser scanning, and advanced wearable gadgets incorporated in hats, shoes, gloves, and other sensor-based tools have revolutionized the construction industry and helped generate useful big data.

Big data in the construction industry can accumulate quickly and become storage heavy due to the large size of the 3D modeling files and a huge amount of daily data generated by wearable gadgets [28]. Management of such big data is a hectic but essential task as the usefulness of the models lies in ensuring that they are available for viewing and leveraging as and when needed. Apart from providing the ease of modeling infrastructure, big data also provide the opportunity to develop sustainable structures by using test models before actual constructions. These are made possible by using digital twins, geographical information systems (GIS)-based 3D point cloud structures, and other cloud-based scanning systems. Furthermore, the software that enables CAD and BIM further feeds into the databases and contributes to big data. All these variables lead to the possibility of utilizing technology for sustainable construction and associated development in line with the United Nations sustainable development goals and other local development initiatives.

The applications of big data in the construction industry are immense. Identifying how big data can be applied to the construction industry remains the real challenge. Since each construction project leads to more data generation, it is crucial to analyze and sort the data accordingly. Some of the key features within the construction industry that can benefit from big data include construction safety, efficiency, waste minimization, productivity, competitive advantage, and pollution management [29]. The strategic and operational benefits of big data in the construction industry have further been explored by Atuahene et al. [30]. The major benefits of big data were found to be project management, management of claims, and procurement. These aspects of big data application are crucial for managing construction projects. However, many other aspects and applications of big data within the construction industry still need to be explored. While these different aspects of construction projects benefit from big data, it is important to understand how big data can be analyzed and utilized for different projects. Furthermore, the algorithms and frameworks that can integrate big data in the construction industry remain largely unexplored.

Today, studies on construction and its management in relation to big data are scarce, presenting a gap in research. This provides opportunities for further research that can greatly benefit the construction industry in the long run. This gap is targeted in the current study, where the papers published in construction fields focused on big data since 2010 are studied. The key takeaways of these studies are presented here to help the construction researchers build upon these studies and advance the state of research related to big data in construction.

In terms of implications, this study will help both the construction researchers and practitioners, where the former will have the current state of research on big data and can see opportunities for further research. Similarly, the practitioners can ascertain the software and hardware requirements for incorporating big-data-based opportunities in construction and create implementation models and gadgets. This paper is divided into sections exploring big data engineering (BDE), databases, use of big data in construction, the application of big-data-based statistics in construction, and future opportunities for big data in construction.

Research Questions

This study aims to identify ways in which big data can be used for construction and its management based on the review of existing literature. The existing literature on big data does not provide detailed solutions for construction management, which creates a gap in the literature concerning the use of big data in the construction industry. The research questions set for this study are as follows:

How can we use big data for research in construction engineering and management?
How is construction big data managed and stored?
How can big data be used for planning construction projects in a futuristic way?

The rest of the paper is organized as follows. Section 2 presents the method and materials used in the study. Section 3 presents the preliminary analyses conducted in the study. This is followed by Section 4, where the BDE and its subcomponents, including big data processing, big data storage, and big data analytics (BDA), are presented and discussed. Similarly, the 10 Vs of big data and ML techniques are also presented in this section. Section 5 presents the future opportunities for big data in construction. Finally, Section 6 concludes the study and presents the key takeaways, limitations, and future expansion directions based on the current study.

2. Materials and Methods

This study follows a multi-stepped approach for reviewing the studies on big data in construction. First, a comprehensive literature retrieval mechanism is adopted from published literature and modified accordingly to retrieve pertinent literature on big data in construction. This is followed by analyses of the retrieved articles in the shape of preliminary analyses, BDE, processing, storage, analytics, and statistical and data mining approaches in relation to the construction industry. These steps are subsequently explained.

An extensive literature search was carried out to identify peer-reviewed papers related to big data and construction since 2010, following the approaches adopted in recent studies [31,32]. This was conducted in order to keep a recent focus and study current articles on big data in construction. Some preliminary analyses, as subsequently discussed, highlighted that big data in construction received more attention from 2010 onwards; hence, the review period of 2010 and onwards is justified. A number of scholarly research platforms, including Google Scholar, Scopus, ScienceDirect, Springer, Elsevier, and IEEE Xplore, were consulted for the literature search based on the high volume of high-quality research papers available on these platforms, following recent studies [33,34,35]. Once the search engines were selected, a combination of different keywords was developed in the next step to identify the most useful publications for this study. The keyword combinations were developed in a tier-based approach, such that terms related to big data, such as “big data”, “big data analysis”, “big data volume”, and “big data analysis tools”, fell into category 1 (S1).

Similarly, all keywords pertaining to construction, such as “construction”, “construction management”, and “construction industry”, were classified into category 2 (S2). Different combinations of keywords from both categories were used to retrieve the most relevant publications. Examples of keyword combinations include “big data in construction”, “big data for construction management”, and “construction management and big data”.
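The tier-based pairing of S1 and S2 terms can be illustrated with a minimal sketch. The term lists are the examples quoted above; the quoted `AND` query syntax and the function name `build_queries` are assumptions made for illustration, since the exact query strings submitted to each platform are not reported.

```python
from itertools import product

# Category 1 (S1): big data terms quoted in the text.
S1 = ["big data", "big data analysis", "big data volume", "big data analysis tools"]
# Category 2 (S2): construction terms quoted in the text.
S2 = ["construction", "construction management", "construction industry"]


def build_queries(s1_terms, s2_terms):
    """Pair every S1 term with every S2 term as a quoted boolean query.

    The AND syntax is an assumption; each database has its own query grammar.
    """
    return [f'"{a}" AND "{b}"' for a, b in product(s1_terms, s2_terms)]


queries = build_queries(S1, S2)
for query in queries:
    print(query)
```

With four S1 terms and three S2 terms, this produces twelve candidate search strings, which would then be adapted to the syntax of each platform.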

The search was further restricted by including only those papers that were published in 2010 or later. Since big data technology came into robust use in the last decade, research publications prior to 2010 were left out. Concept papers, editorials, notes, perspectives, closures, discussions, conference papers, and others were also excluded from the search to ensure the inclusion of original research papers only. Other publications dealing with classical definitions were also excluded.

Using different combinations of the keywords to identify papers published from 2010 onwards led to a total of more than 10,000 papers being retrieved from the mentioned search engines. The list of articles was narrowed down using the detailed inclusion criteria set for this study. This included removing duplicates and other exclusions, as previously mentioned, which brought the search results down to around 4000 papers. This was further narrowed down in a stepwise manner to ensure that only those papers were included that fit the scope of the current study. In the final step, the content of the papers was analyzed to determine their suitability for this study, resulting in a total of 156 papers.

Figure 1 shows an overview of the different ways in which research studies have addressed the use of big data in construction. There has been a rise in the interest in big data usage for the construction industry since 2016. However, the interest has been limited in terms of analyses scope as the trends have remained steady. As shown in Figure 1, the publications on this topic have followed similar terms and research themes over the last few years, leading to gradual evolution. For example, in 2016, most papers related to big data and construction focused on the use of cloud computing, while 2017 saw a trend of developing models and frameworks for implementing big data in the construction industry. Similarly, in 2018 and 2019, researchers have mainly explored how different big data models could be implemented within the construction industry. Recently, the research focus has shifted to using big data in real-time construction projects and identifying how these technologies could be harnessed for developing futuristic construction projects.

In addition to big data, some other technologies and methods have been researched in the last couple of years for improving the construction industry. There is a great overlap in the types of technologies studied simultaneously for developing models that could guide future research in the construction industry. Figure 2 shows the overlapping tools and technologies identified from recent literature. It can be observed that big data is not standalone; rather, it depends on other tools and methods, including data analytics, ML, pattern recognition, statistics, deep learning, and artificial intelligence (AI). All these tools and technologies are used in different combinations for developing models that could be used in real time for construction projects. The reliance of all these tools on each other is an important factor to consider when developing construction projects, as the computational aspects of a project can only be as good as the depth of research performed in developing and testing the algorithms and frameworks. The construction industry greatly benefits from the overlapping fields of big data technologies. The use of big data requires data mining, which generates enormous datasets. The bulk of construction-related data makes the use of statistics inevitable.

Along with data management, statistical analysis, and big data analytics, several different techniques and resources come into use. For example, machine learning tools and artificial intelligence play a crucial role in the construction industry in conjunction with big data. The overlap of all the different fields shown in Figure 2 shows how the field of construction is laden with the use of different technologies, each of which is somehow associated with big data. The use of computational models, databases, deep learning, pattern recognition, virtual reality, bots, and augmented reality contributes to the application of big data in the construction industry. An in-depth analysis of the big data applications and the use of technology in the construction industry results in a much more complex overlap than shown here. However, the core aim of using different technologies is to simplify how datasets can be used to guide future construction projects. Recognizing data patterns and understanding how each dataset fits the needs of a construction project is only possible if the dataset has been analyzed, critically appraised, and classified for its specific usage. The guiding principle here is to use modern technology to upgrade and update the ways in which information could be streamlined for the benefit of different projects. For example, identifying the materials that best suit a particular structure, developing project timelines, and streamlining the resources can become much more straightforward if the construction projects are developed with the help of big data technologies.

As shown in Figure 2, different technologies in the construction industry overlap in different ways. Integrating big data in the construction industry is possible through the combined use of other technologies such as machine learning, AI, VR, AR, pattern recognition, and other such methods.

3. Preliminary Analyses

As mentioned in the method, some preliminary analyses were conducted on the retrieved articles, including the keywords analysis and the countries of origin of the articles, following recently published articles [31,35]. Before this, a basic Google Trends® search was conducted using trends.google.com (accessed on 20 November 2021). A comparison was made for three iterations of the keywords previously mentioned. These included construction big data (keyword 1), big data in construction (keyword 2), and big data for construction management (keyword 3). As shown in Figure 3, the earliest attention paid to big data in construction was reported in 2010. This was reported for keyword 1, followed by keyword 2 in 2013 and keyword 3 in 2014. Two clusters are clearly visible in Figure 3: an initial-interest cluster, when big data first received attention in construction, and a spike-in-interest cluster. The first cluster is evident in 2010–2014, whereas the spike-in-interest cluster started in 2016. This shows the growing relevance of the topic under investigation in the current study.

After the Google Trends analyses, the retrieved articles were analyzed using the VOSviewer® tool. The first analysis was that of keywords. The natural distribution of keywords retrieved from the articles shows five distinct clusters: education, city and region, disaster and human interactions, knowledge management, and technology management in relation to construction, as given in Figure 4. The overall top keywords in order of priority retrieved from these articles included big data, information management, AI, data mining, internet of things, ML, advanced analytics, data technologies, students, data handling, digital storage, colleges and universities, smart city, decision making, cloud computing, construction industry, and others. These are based on the appearance of the keywords in the titles, abstracts, and keywords of a minimum of 30 papers. These keywords are in line with the natural clusters highlighted in Figure 4.

In another analysis, the top 10 contributing countries to big data research in construction were investigated. These are China, United States, United Kingdom, Russian Federation, Australia, India, South Korea, Germany, Spain, and Italy in terms of the number of contributions as shown in Figure 5. The colors in the country box show the countries with the strongest collaborations, whereas the size of the box refers to the number of papers. For example, most of the papers authored by Chinese authors are in collaboration with authors from Australia, New Zealand, and Indonesia.

4. Big Data Engineering (BDE)

Big data analytics (BDA) is supported by BDE that provides a framework to conduct it. BDE has tremendous applications in construction. It has been used for BIM to improve project management [36]. It has also been used to improve building design and for effective performance monitoring [37], project management, safety, energy management, decision-making design frameworks, resource management [38], quality management, waste management, and others [24].

To understand BDE, it is important to discuss big data platforms. These platforms are divided into two groups based on variations in their inherent characteristics. These include horizontal scaling platforms (HSP) and vertical scaling platforms (VSP). HSP utilizes multiple servers by distributing processing across them and bringing new machines into the cluster. VSPs are single-server-based configurations that achieve the scaling by upgrading the hardware of the related server. In construction, HSPs have been used for waste management [25], profitability performance [39], smart road construction, and others [40]. Similarly, VSPs have been reported in one-off construction projects [41], transportation [42], and others. This paper focuses on HSPs, particularly Berkeley Data Analytics Stack (BDAS) and Hadoop.
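The idea of horizontal scaling, distributing work across servers and adding machines to the cluster, can be sketched with a simple hash partitioner. This is an illustrative sketch only and not the partitioning scheme of any particular platform; the sensor record keys and the helper names `assign_node` and `partition` are hypothetical. Vertical scaling has no counterpart in the code: it would keep a single node and upgrade its hardware.

```python
import hashlib


def assign_node(record_key: str, n_nodes: int) -> int:
    """Map a record key to one of n_nodes servers using a stable hash."""
    digest = hashlib.sha256(record_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_nodes


def partition(keys, n_nodes):
    """Group record keys by the node that would store and process them."""
    shards = {node: [] for node in range(n_nodes)}
    for key in keys:
        shards[assign_node(key, n_nodes)].append(key)
    return shards


# Hypothetical sensor record identifiers from a construction site.
keys = [f"sensor-{i:03d}" for i in range(12)]

three_nodes = partition(keys, 3)  # initial cluster of three machines
four_nodes = partition(keys, 4)   # horizontal scaling: one machine added

for node, shard in four_nodes.items():
    print(node, shard)
```

Simple modulo hashing moves many keys when a node is added; production systems typically use more careful schemes to limit that reshuffling.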

Recently, BDAS has been in the limelight since it has greater performance gains over Hadoop. However, as it is quite recent, it suffers the drawback of limitation in available supporting tools. On the other side, Hadoop has been widely utilized in big data applications. The tools offered by these platforms are useful in the storage and processing of big data. For instance, Bilal et al. [39] investigated the profitability performance of construction projects using big data and used Hadoop Distributed File System (HDFS) for managing the data within the staging area while employing Resource Description Framework (RDF)-enabled Network Data Model (NDM) for storing the persistent data. Similarly, Jun Ying et al. [43] investigated the development and implementation of BDAS by the relevant building authorities in Singapore, which has enhanced knowledge and expertise in buildability. An overview of big data classification into BDE and BDA is shown in Figure 6 and subsequently explained.

Big data are classified into two major domains: BDE and BDA. These two main domains are further divided into many classes and subclasses. A third domain that comes under the canopy of big data is ML. The use of ML is inevitable in big data as the data need to be organized, analyzed, and used through ML tools and models such as deep learning and neural networks. Some of the key ML tools and models associated with big data directly or indirectly include regression analysis, clustering, classification, information retrieval (IR), and natural language processing (NLP). Some examples of ML in construction include deep-learning-based flood detection and damage assessment [44], project delay risk prediction [45], construction site safety [46], construction site monitoring [47], neural network models to predict concrete properties [48], and others.

The various algorithms and methods shown in Figure 6 all contribute towards big data in some way. The choice between supervised and unsupervised learning approaches depends on the type of datasets available. The major difference between the two is that supervised learning algorithms utilize labeled datasets, while unsupervised learning algorithms do not use labeled data. The supervised and unsupervised algorithms further have different methods and examples. For instance, regression, linear regression, neural network regression, random forest, Naïve Bayes, and lasso regression are examples of supervised learning.

Similarly, clustering, Natural Language Processing (NLP), and KNN are examples of unsupervised learning. The applications of each of these algorithms can differ and hence their integration in the construction industry can vary. The regression models are used in engineering for analyzing trends and correlations between different variables. In the construction industry, these models play a crucial role as the statistical analysis and correlation development between different variables are made easy through linear regression and other similar algorithms. Similarly, machine learning models have made it possible to ensure that construction projects are developed considering safety, time management, and quality.
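The distinction between learning from labeled and unlabeled data can be made concrete with a small standard-library sketch. It fits a least-squares line to labeled pairs (supervised) and runs a one-dimensional k-means on unlabeled values (unsupervised). The crew-size and task-duration figures are invented for illustration, and k-means is chosen here only as one common clustering method.

```python
def fit_line(xs, ys):
    """Supervised: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kmeans_1d(values, centers, iterations=10):
    """Unsupervised: 1-D k-means; no labels are given, groups are inferred."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Hypothetical labeled data: crew size (x) and daily output in m^2 (y).
crew = [2, 4, 6, 8]
output = [21, 41, 61, 81]  # generated from y = 10 * x + 1
slope, intercept = fit_line(crew, output)

# Hypothetical unlabeled data: task durations in days, two visible groups.
durations = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers = kmeans_1d(durations, centers=[0.0, 5.0])

print(slope, intercept)
print(centers)
```

The regression recovers the relationship because each input comes with a known output, whereas the clustering step only discovers that the durations fall into a short group and a long group.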

As shown in Figure 6, the two major big data domains rely on statistics, data processing, and data management. All these features, in turn, are heavily dependent on ML tools and methods. For example, BDE requires data processing and storage, which in turn require regression models, NoSQL, and MapReduce, all of which are computational tools that enable the different applications of big data management. Similarly, BDA heavily depends on ML tools that can use data and statistics to provide organized data solutions. Tree-based analysis, Bayesian analysis, and shrinkage are all examples of ML integration in the field of BDA. A wide variety of ML tools have been explored over the years and have been directly or indirectly associated with big data management and analysis. Tools such as linear regression, support vector machines, KNN models, clustering, and decision tree regression are among the examples that enable the coherent use of big data. Furthermore, the classification tree of big data is likely to expand as ML algorithms are further developed and more analysis methods are added to the list. Therefore, the constant expansion of big data analysis tools can enable their use in the construction industry for improving construction projects in the future. Yang and Yu [49] investigated the application of heterogeneous networks oriented to NoSQL databases in optimal post-evaluation indexes of construction projects. NoSQL databases are scalable, offer a powerful and flexible data model, can handle large amounts of data, and have increasing application potential in the memory field. Sanni-Anibire et al. [50] investigated the increase in delays and abandonment of tall buildings and developed a machine learning model for delay risk assessment. Methods such as K-Nearest Neighbors (KNN), Artificial Neural Networks (ANN), Support Vector Machines (SVM), and Ensemble methods were considered. The model developed for predicting the risk of delay was based on ANN, with a classification accuracy of 93.75%. The key components of big data from Figure 6 for its management are discussed below.
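To give a feel for how one of the candidate methods named above works, the following is a minimal K-Nearest Neighbors sketch. It is not the model of Sanni-Anibire et al. [50], whose final model was ANN-based; the two features (design changes per month and share of late deliveries), the labels, and all values are hypothetical and chosen only to illustrate the technique.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (design changes per month, share of late
# deliveries) -> delay risk label. Values are invented for illustration.
train = [
    ((1.0, 0.05), "low"),
    ((2.0, 0.10), "low"),
    ((1.5, 0.08), "low"),
    ((8.0, 0.40), "high"),
    ((9.0, 0.55), "high"),
    ((7.5, 0.45), "high"),
]

calm_project = knn_predict(train, (2.2, 0.12))
troubled_project = knn_predict(train, (8.5, 0.50))
print(calm_project, troubled_project)
```

In practice the features would usually be scaled to comparable ranges before distances are computed, since here the first feature dominates the Euclidean distance.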

4.1. Big Data Processing

Distributed and parallel computation is present in the core of BDE. In construction, big data processing has been utilized for waste management [51], prefabricated construction project management [52], profitability analyses, and other construction management applications [39]. For processing information, a considerable number of models are developed. Some of the key big data models are discussed below.

4.1.1. MapReduce (MR)

MapReduce was developed for the handling of big data. It utilizes a distributed processing model in which two functions, as indicated by the name itself, map and reduce, are employed to write analytical tasks. Mappers and reducers are the processes that collect the data from these functions for further processing. Initially, mappers collect and read the input information to process it for subsequent results generation. The output of mappers is used by reducers which give the results that are ultimately stored in the file system. MR has been used by Jiao et al. [53] to develop an augmented framework for BIM. Similarly, it has also been used in construction knowledge maps [54] and other big data applications [54].
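The map, group, and reduce flow described above can be sketched in a few lines of single-process Python. This is a conceptual illustration, not Hadoop code: the waste records, the chunking, and the function names are hypothetical, and a real cluster would run the mappers and reducers on separate machines.

```python
from collections import defaultdict


def mapper(record):
    """Map step: emit one (key, value) pair per input record."""
    material, tonnes = record
    yield material, tonnes


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(chunks):
    """Run mappers over every chunk, then reduce the grouped output."""
    intermediate = []
    for chunk in chunks:  # each chunk could be handled by its own mapper
        for record in chunk:
            intermediate.extend(mapper(record))
    grouped = shuffle(intermediate)
    return dict(reducer(key, values) for key, values in grouped.items())


# Hypothetical waste records (material, tonnes), pre-split into two chunks.
chunks = [
    [("concrete", 4), ("timber", 1), ("steel", 2)],
    [("concrete", 3), ("timber", 2)],
]
totals = map_reduce(chunks)
print(totals)
```

Because each mapper sees only its own chunk and each reducer sees only one key, both steps can be parallelized without the workers needing to coordinate.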

The use of MapReduce in the construction industry is inevitable due to the big data applications within the construction industry. The usability of the MapReduce framework in the construction industry relies on the management of big data in a particular way. Accordingly, the datasets are analyzed and divided into categories to reduce clutter and present an easy-to-understand data output. The basic framework of MapReduce includes data input, data chunks, decomposition mappers, decomposed output, linear mappers, linear reducers, and combined output. The exact series and number of components in the framework can vary depending on the version used. However, the overall features and application of MapReduce remain the same, i.e., reduction of data into manageable chunks. The use of MapReduce not only distributes data into smaller chunks but also helps develop datasets that present a more analytic view of big data. Having organized datasets within the construction industry is of key importance as it can greatly increase the efficiency of data management and decision making based on data analysis.

Hadoop was the first popular big data platform to make MR widely accessible by providing a reliable way to execute MR programs. MR has proved effective for tasks requiring batch processing, as a typical cluster contains many mappers and reducers that run MR programs in parallel. However, it also has drawbacks: it is poorly suited to some graph-based applications and to real-time and iterative processing. The latest versions of Hadoop have tried to address this problem by decoupling MR processing from the rest of the ecosystem. Yet Another Resource Negotiator (YARN) has also been introduced to take over the resource management and scheduling functions of MR, which has made it easier to implement innovative applications on Hadoop.

Hadoop models have been used in construction for smart buildings and disaster management [55], failure prediction of construction firms [56], workers’ safe behaviors in a metro construction project [57], and other relevant applications. The overall platform design architecture of Hadoop offers high reliability: it adopts cluster technology, multi-copy technology, independent backup technology, and other means to reduce the data failure rate effectively and build a reliable data application service platform. First, big data are processed in batches and simultaneously reduced and refined using MR. Next, the data are batched into similar items to streamline the analyses. This step further removes noise and datasets that do not align with a particular batch of data. Finally, a dataset is obtained that is refined and aligned with the original search purpose.

4.1.2. Directed Acyclic Graph

Big data platforms also use the Directed Acyclic Graph (DAG), which is an alternative processing model. Compared with MR, the DAG model relaxes the rigid map-then-reduce style of processing and is supported by Spark. Spark is widely accepted for reactive and iterative applications due to its advantages over MR in expressiveness and in-memory computation. Disk-resident and memory-resident tasks are reported to run up to ten and one hundred times faster, respectively, using Spark than MR. DAGs show relationships among variables, making them easier to understand. DAGs provide major advantages that enable experts and researchers to construct complex causal relationships in which nodes represent stochastic variables, and directed edges (arrows) indicate direct probabilistic dependencies among the relevant variables. DAGs are also able to encode deterministic as well as probabilistic relationships among the variables. The usage of Spark and associated DAGs has been reported for construction profitability analysis [39], waste management [25], energy monitoring service on smart campuses [58], and others.
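In the processing sense, a DAG lets a job be expressed as any number of dependent stages rather than a fixed map-then-reduce pair. The sketch below is a simplified illustration of that idea, not Spark's scheduler: the stage names are hypothetical, and the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each stage maps to the set of stages it depends on.
dag = {
    "load_costs": set(),
    "load_schedule": set(),
    "clean_costs": {"load_costs"},
    "join": {"clean_costs", "load_schedule"},
    "profit_report": {"join"},
}


def run(stage, results):
    """Placeholder for real work; records that the stage has completed."""
    results[stage] = f"{stage} done"


# A topological order guarantees every stage runs after its dependencies.
order = list(TopologicalSorter(dag).static_order())

results = {}
for stage in order:
    missing = [dep for dep in dag[stage] if dep not in results]
    if missing:
        raise RuntimeError(f"{stage} scheduled before {missing}")
    run(stage, results)

print(order)
```

Stages with no path between them, such as `load_costs` and `load_schedule` here, are independent and could be executed concurrently by a scheduler.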

Spark and Hadoop are among the big data tools with enormous potential in construction engineering and management. Figure 7 compares the two tools that can inform research in construction. The speed of both these systems is better than that of other algorithms and ML tools currently in use in the construction industry. Moreover, both systems offer high fault tolerance and greater scalability than existing models. The data storage in these systems is slightly different in that Spark keeps data in memory while Hadoop uses disk for data storage. The implementation language of the two tools also differs: Spark is written in Scala, while Hadoop has been developed in Java. Despite the slight differences, both these tools provide the opportunity to process data in batches and at a higher speed than previously existing models, making them potential tools for futuristic model developments in construction engineering and management. JavaScript has been used in construction to anticipate building material reuse [59], automated progress control coupled with laser scanning [60], shared virtual reality for design and management [61], construction information mining [62], and others. Similarly, Scala has been used for the process information modeling concept for on-site construction management [63].

4.1.3. Big Data Processing in Construction

Big data processing has been effectively utilized in the construction industry for failure prediction data [56], construction waste analytics [25], profitability data [39], modular and prefabricated construction [52], fire incident management [64], smart campus energy monitoring [58], healthier cities management [28], smart road management [40], and others.

Though MR and Spark have their own significance, they are less frequently employed in the construction industry to process big data such as BIM-associated data. The retrieval of partial BIM models using MR was optimized by Bilal et al. [65] and Chang and Tsai [66]. The authors found a loop in the Hadoop MR logic of data distribution. To overcome the query problem, a few prepartitioning and processing steps are introduced for relevant BIM data parts that are later stored in Hadoop clusters. Node multi-threading during data analysis helped maximize CPU utilization. This helped in customizing Hadoop for BIM data, while the querying components were implemented as a YARN application. YARN applications are further utilized to develop a BIM system for quantity estimation and clash detection that can execute the required tasks with many-fold improved performance.

Another research group developed a system for BIM data storage and retrieval aimed at both naive and expert BIM users [67]. The authors developed a cloud BIM system to retrieve and represent big data intelligently. This system provides an interactive interface to maximize the usability and utility of construction big data. Complex BIM data are retrieved through natural language processing after the user queries are reformulated. These data are then visualized by mapping them onto various visualizations. Before query evaluation, two BIM collections are merged to optimize the process of query execution. Using this technology, a 40% reduction in response time has been reported compared to traditional technologies. Currently, the utilization of BIM is limited across the construction and facilities management stages. The real intent of BIM can only be achieved once it is applied at each stage of the building lifecycle.

4.2. Big Data Storage

Big data storage is also an important aspect of BDE. In construction, big data storage has been explored for forecasting the success of construction projects [68], smart buildings data storage [69], tender price evaluation [70], and others. Despite the availability of BIM data storage, the current applications in construction still require successful implementation. Social BIM, proposed by Das et al. [71], captures building models and the social interactions among the users. The authors developed BIMCloud based on the distributed BIM framework.

Similarly, a two-tiered hybrid data infrastructure was proposed by Jeong et al. [72] for data management and monitoring of bridges. In this model, the client tier efficiently completes some analytical tasks by storing structured data momentarily using MongoDB, while the central tier stores sensor data permanently using Apache Cassandra. Lin et al. [67] also used MongoDB to store BIM data obtained through building models.

Overall big data storage is provided by either emerging NoSQL databases or distributed file systems, as explained subsequently.

4.2.1. Distributed File Systems

The distributed file systems used for big data storage include the Hadoop Distributed File System (HDFS) and Tachyon. HDFS is designed to deal with large and complex datasets such as those related to BIM, waste, and other construction big data sources. It operates on commodity servers grouped together in a cluster. As it utilizes many servers, the probability of a hardware failure somewhere in the cluster also increases. To overcome this problem, HDFS provides fault tolerance through the distribution and replication of data. However, HDFS has limitations. It performs poorly in situations where low-latency data access is required, it is troublesome for saving many small files due to issues in managing metadata, and it is not useful if modifications must be made concurrently at random locations in the data. Nevertheless, HDFS has been utilized by construction researchers for observing construction workers’ behavior [73], improving road performance [39], and investigating profitability performance [39]. Furthermore, distributed input from HDFS facilitates building predictive models for conducting building simulations that give output in the Predictive Model Markup Language.
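The fault tolerance obtained from distribution and replication can be illustrated with a small sketch. It is a simplification under stated assumptions: the round-robin placement, the node names, and the helper functions are hypothetical, and real HDFS uses a rack-aware placement policy and much larger blocks (a replication factor of three is the HDFS default).

```python
def split_into_blocks(data: bytes, block_size: int) -> list:
    """Cut a file into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list, replication: int = 3) -> dict:
    """Assign each block to `replication` distinct nodes (round-robin assumption)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement: dict, failed: set) -> list:
    """Blocks that still have at least one replica on a healthy node."""
    return [
        block for block, holders in placement.items()
        if any(node not in failed for node in holders)
    ]


data = bytes(range(256)) * 4                      # 1024 bytes of sample content
blocks = split_into_blocks(data, block_size=256)  # tiny blocks for illustration
nodes = ["node1", "node2", "node3", "node4"]
placement = place_replicas(len(blocks), nodes)

# With three replicas per block, losing any two nodes leaves every block readable.
survivors = readable_blocks(placement, failed={"node1", "node2"})
print(placement)
print(survivors)
```

The same sketch also hints at why many small files are costly: every block, however small, needs its own placement entry in the metadata.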

Tachyon is a distributed file system designed to extend the benefits of HDFS by providing access to distributed data across the cluster at memory speed. It achieves better performance than HDFS through in-memory data caching, and its backward compatibility allows MR and Spark jobs to run without any modification to their code. Tachyon has been utilized in construction for handling unstructured documents [65] and file storage [74].

4.2.2. NoSQL Databases

Relational databases have been the common choice for data management in past decades. However, as technology evolved, new applications demanded better performance, scalability, and flexibility. Relational databases lag in meeting these special processing and storage needs. As a result, new systems were devised to fill this technology gap. One such family is the “Not only SQL” (NoSQL) systems, which have optimized data management in several ways. To achieve flexibility, they support schemaless storage rather than schema-oriented storage and keep the data in a non-relational model. NoSQL has been widely used in different industries, including construction, due to its fragmented nature. Some examples of NoSQL in construction include integration of lessons learned knowledge in BIM [75], web service framework for construction supply chain collaboration and management [76], and Social BIMCloud implementation [71]. A key-value database, for instance, places few restrictions on the stored values and does not evaluate them; it is mainly tailored to workloads in which data are accessed through the primary key. NoSQL systems have four data models that are briefly discussed below.

Key-value

This is the simplest data model used for unstructured data storage. However, the data lack self-description. It has been used for knowledge management in construction [77] and integration of lessons learned knowledge in BIM [75]. BIM provides positive outcomes on project success, such as cost and time reduction, communication and coordination improvement, and increased quality. Big data utilization in BIM can be beneficial to discover root causes of poor building performance, perform real-time data queries, improve the decision-making process, improve productivity, and reveal new designs and services in the construction industry, as is the case in every industry.

Document

This model can store self-describing data. However, this model can lag in terms of efficiency. It has been used for unified lifecycle data management in architecture, engineering, construction, and facilities management through BIM integration [78].

Columnar

Aggregated columns, grouped sub-columns, and sparse data can be stored by using this model. It has been used for integrating digital construction through the internet of things [79] and smart archiving of energy and petroleum construction projects [80].

Graph

This model works well for property-graph-based huge datasets in relationship traversal. It has been used for the 4D construction management information model of prefabricated buildings [81] and the development of a BIM-enabled software tool for facility management [82].
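The four data models can be contrasted with a small in-memory sketch that uses plain Python structures in place of real database engines. All identifiers, element names, and readings below are hypothetical.

```python
import json

# Key-value: an opaque value looked up by key; the store cannot interpret it.
kv_store = {"wall:W-101": json.dumps({"material": "concrete", "height_m": 3.0})}
wall = json.loads(kv_store["wall:W-101"])  # the client must decode the value

# Document: self-describing records that can be queried by field.
doc_store = [
    {"_id": "W-101", "type": "wall", "material": "concrete", "dims": {"height_m": 3.0}},
    {"_id": "D-007", "type": "door", "material": "timber"},
]
concrete_ids = [d["_id"] for d in doc_store if d.get("material") == "concrete"]

# Columnar (wide column): row key -> column family -> columns; rows may be sparse.
wide_rows = {
    "sensor-1": {"readings": {"2021-11-01": 21.5, "2021-11-02": 22.1}},
    "sensor-2": {"readings": {"2021-11-02": 19.8}},
}
columns_per_row = {row: len(families["readings"]) for row, families in wide_rows.items()}

# Graph: entities as nodes and typed relationships as edges, queried by traversal.
edges = [("L-01", "contains", "W-101"), ("W-101", "hosts", "D-007")]


def reachable(start):
    """Follow outgoing edges from `start` and return every node reached."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        for src, _relation, dst in edges:
            if src == node and dst not in seen:
                seen.append(dst)
                stack.append(dst)
    return seen


print(wall, concrete_ids, columns_per_row, reachable("L-01"))
```

The sketch reflects the trade-offs noted above: the key-value entry lacks self-description, the document records carry their own field names, the wide-column rows tolerate missing columns, and the graph answers relationship questions by traversal.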

Databases for big data storage and management are widely used worldwide for research on various topics. The construction industry has also relied on big data sources and databases over the last five to ten years. As shown in Figure 8, search engines have been the most widely searched database category in the last five years, followed by relational and graph DBMS. Up to the time of analyzing data for this review, i.e., November 2021, other heavily used databases for extracting and using big data for the construction industry include document stores, native XML, key-value stores, and wide column stores. Searches for object-oriented DBMS and multivalued DBMS are considerably lower than those for relational DBMS and graph DBMS, whereas search engines outperform all other DBMS. These different databases provide data sources for BIM and computational sources for developing structures that could guide larger construction projects. The rising trend in using big data sources shows the increasing interest of the construction industry in big data. For example, exchanging and reusing information is critical for engineering and construction project management. The issues pertaining to data exchange have been minimized with the application of the Extensible Markup Language (XML). For instance, an XML-based Distributed Construction Estimating System (XDCES) has been helpful in reducing the overload of cost-estimating information exchange. Similarly, a construction-based DBMS enables construction companies to build and maintain a database easily. It allows supervisors and workers to capture information using a mobile or tablet device; all of that information is then stored in the cloud and is accessible via a desktop version.

4.3. Big Data Analytics (BDA)

BDA draws on a variety of disciplines that share one goal: finding patterns in data. These related disciplines include data mining, statistics, business analytics, predictive analytics, data analytics, knowledge discovery from data, and, most recently, big data. Big data build on these earlier techniques to broaden the field of data analytics. Several ML-based tools have been developed for BDA. In construction projects, BDA has been used for improving building design and effective performance monitoring [37], project safety, energy, resource, overall management and decision-making frameworks [38], and quality and waste management [24]. BDA has been taken a step further by developing predictive analysis techniques. Ngo et al. [83] used a factor-based big data predictive analytic tool for analyzing the capacity of construction industries to deal with big data. This tool was tested and validated on four different construction organizations to ensure that the predictive analytic method could improve how the construction industry uses big data. The integration of big data in the construction industry remains an avenue that requires further research in terms of BDA. The gaps in this area were explored by Atuahene et al. [30] and Atuahene et al. [84]. It was identified that the management and processing of data by firms led to the generation of more data, which made data analysis an uphill task. Developing an integrated framework for managing big data and sorting the useful datasets can greatly increase the usability and application of big data in the construction industry. Overall, data analytics is conducted through statistical, data mining, and regression techniques, as explained below.

4.3.1. Statistics

Statistics has wide applications in the construction industry. Statistical techniques including Monte Carlo simulation, Gaussian distribution, non-Bayesian methods, correlation analysis, factor analysis, decision trees, Naïve Bayes, and others have been reported by various studies in construction [85,86]. Some of the areas that have benefitted from statistics include learning from post-project reviews, identifying causes of construction delays, analyzing buildings for structural damages, construction litigation, and identifying and recognizing heavy machinery and workers. Other examples of statistics in construction are bidding statistics to predict completed construction cost [87], accident statistics [88], quality control [89], and six sigma for project success [90,91]. From the bid-to-win ratio to budget and schedule overruns and other key performance indicators (KPIs), quantifying project performance is valuable. Data not only allow for more visibility into the state of a particular project, but relevant industry statistics and facts can also provide valuable information needed to make important future decisions regarding preconstruction and planning, productivity tools, risk assessment, and workforce and operational efficiency. Table 1 presents some uses of statistical models in construction.
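As an illustration of one of the techniques listed above, the following sketch applies Monte Carlo simulation to a project cost estimate. The work packages, their three-point estimates, the triangular distribution, and the function name `simulate_total_cost` are assumptions made for this example and are not taken from the cited studies.

```python
import random
import statistics

# Hypothetical three-point cost estimates per work package: (low, most likely, high).
work_packages = {
    "foundations": (90.0, 100.0, 130.0),
    "structure": (180.0, 200.0, 260.0),
    "services": (70.0, 80.0, 110.0),
}


def simulate_total_cost(packages, runs=10_000, seed=42):
    """Sample each package from a triangular distribution and sum per run."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(runs):
        total = sum(
            rng.triangular(low, high, mode)  # signature is (low, high, mode)
            for low, mode, high in packages.values()
        )
        totals.append(total)
    return sorted(totals)


totals = simulate_total_cost(work_packages)
mean_cost = statistics.mean(totals)
p90_cost = totals[int(0.9 * len(totals)) - 1]  # cost not exceeded in about 90% of runs
print(f"mean = {mean_cost:.1f}, P90 = {p90_cost:.1f}")
```

Comparing the mean with a high percentile such as P90 is one way to turn the simulated distribution into a contingency allowance.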

4.3.2. Data Mining

Data mining is used to extract meaningful patterns in the data. It has been an integral part of all big data management systems. It employs the techniques used in pattern recognition, ML, and statistics. Several models are assessed, and the ones with the best tolerance and high accuracy are selected and used for obtaining predictive results. In construction, data mining has been reported in waste management [97], BIM-based construction engineering quality management [98], and other relevant areas. Data mining detects useful regularities and information necessary for decision making for construction management projects. A data mining method such as cluster analysis is important for the construction industry, as it combines different construction objects into homogeneous groups and investigates them.
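Cluster analysis, mentioned above, can be sketched with a one-dimensional k-means routine. The delay values, the starting centroids, and the function name `kmeans_1d` are hypothetical; practical studies would typically rely on an established library and on multi-dimensional features.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group values around the nearest centroid, then move each centroid to its group mean."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical activity delays in days.
delays = [2, 3, 4, 30, 32, 35]
centroids, clusters = kmeans_1d(delays, centroids=[0.0, 10.0])
print(centroids, clusters)
```

Here the routine separates minor delays from major ones, giving two homogeneous groups that can be investigated separately.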

Data mining is supported by data warehousing. A data warehouse stores specially structured data for querying and analysis. Extract, transform, and load (ETL) is the process that collates transactional and operational data into the warehouse. Warehouse data analysis is conducted using Online Analytical Processing (OLAP), which performs better than SQL in computing breakdowns and data summaries. OLAP has been used for cost data management in construction cost estimates by Moon et al. [99]. OLAP technology deals with the operational data and data obtained using big data technology. OLAP presents the data as a multidimensional cube that allows datasets to be processed rapidly.
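The kind of breakdown and summary that OLAP computes over a multidimensional cube can be sketched as a roll-up over chosen dimensions. The fact records, the dimension names, and the `roll_up` helper are hypothetical and stand in for a real warehouse and OLAP engine.

```python
from collections import defaultdict

# Hypothetical cost facts with three dimensions (project, trade, year) and one measure.
facts = [
    {"project": "P1", "trade": "concrete", "year": 2020, "cost": 120.0},
    {"project": "P1", "trade": "steel", "year": 2020, "cost": 80.0},
    {"project": "P2", "trade": "concrete", "year": 2021, "cost": 150.0},
    {"project": "P2", "trade": "steel", "year": 2021, "cost": 60.0},
]


def roll_up(rows, dimensions, measure="cost"):
    """Aggregate the measure over every combination of the chosen dimensions."""
    cube = defaultdict(float)
    for row in rows:
        key = tuple(row[dimension] for dimension in dimensions)
        cube[key] += row[measure]
    return dict(cube)


by_trade = roll_up(facts, ["trade"])
by_project_year = roll_up(facts, ["project", "year"])
grand_total = roll_up(facts, [])  # no dimensions: a single overall summary
print(by_trade, by_project_year, grand_total)
```

Choosing fewer dimensions corresponds to rolling up the cube, and choosing more corresponds to drilling down into it.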

Similarly, different data mining techniques have been used to identify construction delays. For analyzing construction datasets, Kim et al. [12] presented a framework of knowledge discovery in databases (KDD). In KDD, the most time-consuming and challenging step is data preprocessing. Nevertheless, KDD is a powerful tool for identifying causal relationships in construction projects and reducing construction variability by identifying and eliminating causes for possible deviations. With the application of KDD, the randomness of construction projects and novel patterns can be determined. Other techniques include dimensional matrix analysis, link analysis, and text analysis [100]. Other datasets with information related to delay causes, BIM-based knowledge discovery, intelligent learning, and the prevention of occupational injuries can be easily extended in the domain of data mining.

4.3.3. Regression Techniques

Regression predicts the value of a target variable based on one or more input variables. It is a supervised ML method. Based on the number of explanatory variables, regression is categorized into simple linear and multiple linear regression. In simple linear regression, the relationship between two variables (an explanatory variable x and a dependent variable y) is modeled. In multiple linear regression, two or more explanatory variables are used and their relationship with the dependent variable is modeled. Multiple linear regression is the more common of the two techniques.
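Simple linear regression can be sketched with the closed-form least-squares solution. The function name `fit_simple_linear` and the floor-area and cost figures are hypothetical; multiple linear regression follows the same principle with more than one explanatory variable.

```python
def fit_simple_linear(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("x values must not all be equal")
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: floor area (explanatory variable x) and cost (dependent variable y).
floor_area = [100.0, 150.0, 200.0, 250.0]
cost = [210.0, 290.0, 405.0, 500.0]

slope, intercept = fit_simple_linear(floor_area, cost)
predicted_cost = slope * 180.0 + intercept  # prediction for an unseen floor area
print(slope, intercept, predicted_cost)
```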

Regression has been extensively used in construction research. For example, it has been used for predicting properties of concrete cured under hot weather [48], predicting final cost for competitive bids on construction projects [101], determining contingency in international construction projects [102], estimating performance time for construction projects [103], and others. Moreover, regression has been used for cost estimation, which is a difficult task in the early stages of a project. Parametric methods such as simple and multiple regression can be applied as both analytical and predictive techniques to estimate the overall reliability of the cost estimation.

4.4. The 10 Vs of Big Data

The bulk and variety of big data grow enormously each day, making it virtually impossible to deal with the data sources seamlessly. On the other hand, the enormity of big data gives it many characteristics that further expand its potential and its applications in different research fields. Figure 9 provides an overview of some of the crucial characteristics of big data. Understanding these characteristics enables the identification of opportunities and challenges. The most crucial properties of big data include their value, volume, velocity, variety, veracity, volatility, validity, variability, vulnerability, and visualization, also known as the 10 Vs of big data [104]. These Vs are used to guide research in different areas and fields.

In terms of the use of big data in the field of construction, analyzing the Vs can help explore how big data can be used for developing better construction models in the future. Firstly, big data provide great value through various databases and sources that inform the research studies and algorithm developments related to computational models of different building structures. In addition to the value of research, big data also provide the bulk of information needed for research simply through the ever-increasing volume of data that becomes available each day. Furthermore, the velocity with which databases expand each day adds variety to the sort of data available for utilization in fields like construction. The data vary not just in terms of their sources but also in their types. For example, big data can be present in the form of written text, graphs, pictures, and various other formats to help manage construction project schedules and progress reporting. The increasing amounts of data make the visualization process quite complex. Therefore, it is crucial to develop new ways for data visualization and analysis to keep up with the volatility of big data.

The 10 Vs of big data are among the crucial characteristics representing the true picture of big data as a field of research. The applications of big data in the construction industry are innumerable and they can all be categorized and managed through understanding the characteristic features (or Vs) of big data. The construction industry benefits immensely as a business by integrating big data technologies. The correlation with the business side of the construction industry has been explored in light of the 10 Vs of big data and it has been found that these characteristics provide an immense business growth potential. Starting from the core attributes of volume, variety and velocity, big data have come a long way in terms of their applications and trends. Today, there are 10 characteristics that define big data and are also crucial for implementing big data into different fields. It is crucial to understand that these 10 Vs of big data can be explained in a context-dependent manner considering the field of research. As for the construction industry, the variety and volume of big data are immense, but there is also a great deal of variability in the data present. For example, the choice of building materials and the suitability of the selected materials in different projects depend on several different factors. In this case, analyzing the applicability of big data is possible through data-visualizing techniques that can help deal with the volatility and variability of big data. Similarly, the validity and veracity of big data in construction can be judged only after analyzing the value that the data sources bring and the authenticity that these sources present. Therefore, the increasing velocity of big data is not useful as an independent factor. Instead, the application of big data in the construction industry depends on the 10 different characteristics (Vs) which are associated with big data and are explained in Figure 9.

Similarly, these data types can be refined and unstructured, further adding variety to the type of data present for various reporting and research purposes. Veracity refers to the reliability of big data. This is guided by statistics as the enormity of big data makes it hard to identify reliable data sources. Therefore, validating data sources and ensuring that they can be reliably used to guide construction project developments is crucial for research. The veracity of data sources leads to another important characteristic of big data: variability. It is crucial to understand that big data can be highly variable depending on the sources used for extracting the datasets. Understanding these characteristics of big data and analyzing these characteristics given the use of big data in the construction industry can greatly enhance the potential of future construction projects.

Overall, multiple construction-related studies have reported the usage of the Vs of big data. For example, velocity has been reported for high-speed construction data processing [105]. Value has been reported for smarter universities and campuses [106]. Volume has been reported for mass level offsite construction material and component production [107]. Variety has been reported for investigating the profitability performance of construction projects [39]. Veracity has been reported for forecasting the success of construction projects [68]. Similarly, variability has been reported for modeling occupational accidents in construction projects [108].

Big data necessitate cost-effective, innovative forms of information processing for enhanced insights and decision making. Construction companies can analyze historical datasets and carry out predictive analytics to forecast future events. Data-driven decision making has the potential to reshape the entire business. Together, the 10 attributes or 10 Vs of big data play a crucial role in the construction industry. The volume of data and the velocity at which data are produced make it possible to validate information related to construction projects. The ability to visualize big data, keep up with the variety of data, and accept the volatility, vulnerability, and variability that come with the veracity of data helps ensure that big data can be truly applicable in the construction industry. Therefore, the value of big data in the construction industry is high and it helps guide future projects.

4.5. Machine Learning Techniques

ML is a subdomain of AI in which computational systems learn from data. The tools used for big data ML are presented in Table 2. ML is further categorized into: (i) supervised learning; (ii) unsupervised learning; (iii) association; and (iv) numeric prediction. ML has several applications in the construction industry. It uses different approaches, including rule-based learning approaches, case-based reasoning techniques, artificial neural networks, and hybrid methodologies.

ML has immense potential as a tool in the field of construction. Over the last two decades, several ML algorithms have been proposed to aid and improve the overall process of construction. For example, ML has been used to predict properties of concrete [48], contract management [109], site safety and injury prediction [46], delay risks management [45], BIM integrated on-demand site monitoring [47], and other areas of construction engineering and management.

Various ML tools are integrated at different steps along with the construction management processes. Different ML interfaces such as PyTorch and Keras.io help develop computational models based on existing data for building futuristic construction models. BIM can also be improved by using big data and ML tools, as these technologies allow the opportunity to explore how technology could be applied to the construction industry [110]. Over the last few years, different algorithms have been explored to predict various project phases and guide construction projects from inception to closure [111]. Firstly, decision trees and similar tools are used for developing an overall project timeline to predict or determine construction project performance in various phases. Secondly, statistical analysis tools are used for analyzing previous projects and choosing guiding principles for future projects [112]. Finally, design tools are integrated with ML algorithms to build 3D construction models and graphics for building models. These computational models enable analyzing construction projects by planning through look-up schedules and looking for ways to improve buildings and other structures [113].

The combined use of big data, ML, and AI holds the potential to develop seamless construction projects and enable the development of structures that can withstand severe weather conditions and disasters. For example, one of the key uses of ML tools in futuristic construction projects can be the development of structures that can stand through natural disasters and provide safety nets to communities during floods and other disasters [114]. Similarly, post-disaster evacuation and rescue of individuals can also be carried out more easily if the area contains structures such as roads and buildings built through the use of statistical modeling, thus providing safe routes for people [115]. Although the automation of construction projects remains a future goal, the integration of different ML algorithms is already underway. Managing costs, timelines, and human resources on a construction project are areas guided by various algorithms and computational models [116]. The ML approach can also be applied to develop leading indicators to classify sites according to their safety risk in construction projects.

Table 2. Machine learning tools used for big data.

No.	Tool	Description	Supported Algorithms	Languages	Applications in Construction	Ref.
1	PyTorch	PyTorch is a free tool available for Windows, Mac OS, and Linux for developing ML programs	Regression, classification, clustering, dimensionality reduction, preprocessing	C, C++, Python	Object detection, analyzing buildings and other structures to develop better models	[117]
2	Apache Mahout	An open-source tool that allows high-performing and scalable applications using ML	Distributed linear algebra, clustering, regression, preprocessing	Java, Scala	Processing big data for the development of building models and appropriate algorithms	[118,119]
3	Shogun	A diverse ML platform supporting various languages and platforms. Works well with Windows, Linux, and Mac OS	Classification, regression, dimensionality reduction, online learning, support vector machines	C++	Provides a platform for analyzing data and developing strategies for construction projects using available information in the form of big data	[120]
4	SciKit Learn	A free machine-learning tool that supports Windows, Mac OS, and Linux	Regression, classification, preprocessing, clustering, model selection	C, C++, Cython, Python	Enables statistical analysis for construction projects, particularly using existing data for developing suitable construction models	[121]
5	Keras.io	An ML software that can be used across different platforms	API for neural networks	Python	Provides training models which can be harnessed for improving BIM and creating confident models for construction projects	[122]

5. Future Opportunities of Big Data in Construction

There is immense potential for the use of big data in the construction industry. The use of big data and ML can enable construction automation. These tools can also enhance the overall project by removing various hurdles and roadblocks that tend to slow down different projects. The construction industry is quite dynamic and demanding, with the need for labor strength and human resources to ensure the smooth running of projects. The constant challenge of keeping projects on track and ensuring that new buildings and structures are made up to modern standards puts much strain on the project management teams. These roadblocks can greatly be reduced with the use of big data and ML. The core aim of using big data in the construction industry is to enhance the project planning phases and speed up the overall construction process by predicting the possible timelines for particular projects and identifying what factors can be worked on to improve the overall process [123].
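One simple way to picture the timeline prediction mentioned above is an ordinary least-squares line fitted to past projects. The sketch below is a minimal example under stated assumptions: the single predictor (gross floor area), the sample figures, and the helper `fit_line` are all hypothetical, and real schedules depend on many more factors than one variable.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical past projects: gross floor area (thousand m^2) and
# observed duration (months). Values are illustrative only.
floor_area = [2.0, 4.0, 6.0, 8.0]
duration = [7.0, 11.0, 15.0, 19.0]

intercept, slope = fit_line(floor_area, duration)

# Estimated duration for a planned project of 5 thousand m^2.
predicted = intercept + slope * 5.0
print(round(predicted, 1))
```

With historical records from many projects, the same idea extends to multiple regression or to the ML tools in Table 2, where factors such as crew size or weather delays can be added as further predictors.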

The automation of construction projects will require the combined use of big data, deep learning, and ML tools. One of the major concerns with such projects is ensuring workers’ safety and developing strategies for overcoming potential threats to the overall process. Safety of the workers and the structures is essential for the smoother development of construction projects. The use of big data and related tools can ensure that existing data and information are used for drafting guiding principles and then building computational models accordingly. For example, sensor-based wearable personal protective equipment can generate big data on near misses, onsite accidents, hazards, and other issues, which can then inform safety plans and management techniques. Similarly, big data, BIM, and cloud-powered simulations can help minimize project waste and help produce superior quality constructed facilities. Further, big data artifacts generated by 3D scanners for as-built drawing development offer another key advantage: they can support the development of rehabilitation plans for ancient heritage sites.
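To make the wearable-sensor example concrete, the sketch below counts near-miss events per site zone so that the most hazardous zones surface first. The record layout (`worker`, `zone`, `type`), the sample events, and the function `rank_zones` are hypothetical; actual wearable devices report data in their own formats and at far larger volumes.

```python
from collections import Counter

# Hypothetical event records as a wearable device might report them.
events = [
    {"worker": "W01", "zone": "scaffold", "type": "near_miss"},
    {"worker": "W02", "zone": "crane", "type": "near_miss"},
    {"worker": "W01", "zone": "scaffold", "type": "fall_alert"},
    {"worker": "W03", "zone": "scaffold", "type": "near_miss"},
    {"worker": "W04", "zone": "excavation", "type": "near_miss"},
]


def rank_zones(records, event_type="near_miss"):
    """Count events of one type per zone, most frequent zone first."""
    counts = Counter(r["zone"] for r in records if r["type"] == event_type)
    return counts.most_common()


print(rank_zones(events))
```

A ranking of this kind could feed a safety plan directly, for instance by scheduling extra inspections in the zones with the most near misses.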

The future holds great potential for the construction industry through big data integration. Some of the key opportunities for the construction industry lie in using big data for business and environmental sustainability. The current roadblocks faced by the construction industry can be overcome in the future through the integration of information extracted through big data. The use of information gathered from past and present projects can help develop sustainable infrastructure in the long term. It is possible to avoid past mistakes and use better quality products guided by the information found through big data in construction. Future research directions in the field of construction rely heavily on big data, as the presence of information sources can help in building better infrastructure and greatly improve building designs and the overall construction business. The construction industry must move towards automation and build upon the integration of technology to make the future use of big data seamless and hassle-free. The use of big data tools, BIM, and CAD is only possible if the relevant support and integration systems are present [107]. Hence, the future of the construction industry depends on gradually upgrading the present environment.

Overall, the role of big data in enabling the entire process of futuristic construction projects is undeniable. Data play a crucial role in developing training models and smoothly enabling the process of construction. Future developments in this field will also include the generation and use of more algorithms and models that rely on big data, owing to the need to train the models reliably.

6. Conclusions

The construction industry is yet to reap the true benefits of big data. Over the last two decades, the rapid growth of big data technologies has caused a spike in the number of models and platforms developed for increasing digitalization across different fields. However, the same level of digitalization has not truly been harnessed or integrated by the construction industry. A critical overview of the existing literature points towards a bulk of existing resources and platforms that can readily be applied to construction management. However, the state of implementation and adoption in construction is below par. Therefore, the utilization and commercialization of big data to benefit the construction industry are crucial. An extensive literature review enabled us to identify the potential of big data in construction, as the industry generates huge amounts of data daily and can greatly improve by using the latest technologies. The development of online tools and software that enable infrastructure modeling and CAD is a crucial step in the right direction for future construction. Having explored the existing ML tools, we found that these tools, coupled with big data, can be applied in the construction industry. In this paper, we have discussed the existing tools used in big data, the use of statistics, big data storage, and BDE. Overlap between these areas creates further complications, in that more data are present and the field of big data is ever-expanding.

The current study contributes to the body of knowledge by providing a state-of-the-art review of relevant articles focused on big data applications in construction published between 2010 and 2021. It further provides various current applications and future opportunities of big data in the construction industry for practitioners and researchers to ponder upon and initiates the necessary debate around practical implementation and adoption of big data applications in construction.

There are currently various gaps and pitfalls that act as barriers to using big data to its full potential. Firstly, data generation is much faster than the tools available for processing it. Moreover, big data integration into the construction industry is quite an uphill task even with the existing data processing tools.

The current study is limited to the literature published in the last decade and may not include all the available papers due to the specific selection criteria developed in this study. Similarly, the search terms may not be exhaustive; a future study conducted with slightly different search strings may produce different results. In the future, researchers can expand upon and explore the five clusters identified in Figure 4. The individual relations and adoption frameworks for big data in these clusters can also be explored.

Author Contributions

Conceptualization, H.S.M. and F.U.; methodology, H.S.M., F.U. and S.Q.; software, H.S.M. and F.U.; validation, H.S.M., F.U., S.Q. and D.S.; formal analysis, H.S.M. and F.U.; investigation, H.S.M., F.U. and S.Q.; resources, H.S.M. and F.U.; data curation, H.S.M., F.U., S.Q. and D.S.; writing—original draft preparation, H.S.M. and F.U.; writing—review and editing, H.S.M., F.U., S.Q. and D.S.; visualization, H.S.M. and F.U.; supervision, F.U.; project administration, H.S.M. and F.U.; funding acquisition, H.S.M. and F.U. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available with the first author and can be shared upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Villars, R.L.; Olofson, C.W.; Eastwood, M. Big data: What it is and why you should care. White Pap. IDC 2011, 14, 1–14.
Siddiqa, A.; Karim, A.; Gani, A. Big data storage technologies: A survey. Front. Inf. Technol. Electron. Eng. 2017, 18, 1040–1070.
Phaneendra, S.V.; Reddy, E.M. Big Data-solutions for RDBMS problems-A survey. In Proceedings of the 12th IEEE/IFIP Network Operations & Management Symposium (NOMS 2010), Osaka, Japan, 19–23 April 2013.
Henry, R.; Venkatraman, S. Big Data Analytics the Next Big Learning Opportunity. J. Manag. Inf. Decis. Sci. 2015, 18, 17–29.
Xu, W.; Sun, J.; Ma, J.; Du, W. A personalized information recommendation system for R&D project opportunity finding in big data contexts. J. Netw. Comput. Appl. 2016, 59, 362–369.
Sepasgozar, S.M.; Davis, S. Construction technology adoption cube: An investigation on process, factors, barriers, drivers and decision makers using NVivo and AHP analysis. Buildings 2018, 8, 74.
Ullah, F.; Sepasgozar, S.M.; Wang, C. A systematic review of smart real estate technology: Drivers of, and barriers to, the use of digital disruptive technologies and online platforms. Sustainability 2018, 10, 3142.
Kwon, O.; Lee, N.; Shin, B. Data quality management, data usage experience and acquisition intention of big data analytics. Int. J. Inf. Manag. 2014, 34, 387–394.
Cui, L.; Yu, F.R.; Yan, Q. When big data meets software-defined networking: SDN for big data and big data for SDN. IEEE Netw. 2016, 30, 58–65.
Chaudhary, R.; Aujla, G.S.; Kumar, N.; Rodrigues, J.J. Optimized big data management across multi-cloud data centers: Software-defined-network-based analysis. IEEE Commun. Mag. 2018, 56, 118–126.
Simmhan, Y.; Aman, S.; Kumbhare, A.; Liu, R.; Stevens, S.; Zhou, Q.; Prasanna, V. Cloud-based software platform for big data analytics in smart grids. Comput. Sci. Eng. 2013, 15, 38–47.
Kim, K.Y. Business intelligence and marketing insights in an era of big data: The q-sorting approach. KSII Trans. Internet Inf. Syst. (TIIS) 2014, 8, 567–582.
Hu, X. Sorting big data by revealed preference with application to college ranking. J. Big Data 2020, 7, 1–26.
Custers, B.; Uršič, H. Big data and data reuse: A taxonomy of data reuse for balancing big data benefits and personal data protection. Int. Data Priv. Law 2016, 6, 4–15.
Majumdar, J.; Naraseeyappa, S.; Ankalaki, S. Analysis of agriculture data using data mining techniques: Application of big data. J. Big Data 2017, 4, 1–15.
Shadroo, S.; Rahmani, A.M. Systematic survey of big data and data mining in internet of things. Comput. Netw. 2018, 139, 19–47.
Zhou, R.; Liu, M.; Li, T. Characterizing the efficiency of data deduplication for big data storage management. In Proceedings of the 2013 IEEE international symposium on workload characterization (IISWC), Portland, OR, USA, 22–24 September 2013; pp. 98–108.
Petri, I.; Rana, O.; Beach, T.; Rezgui, Y.; Sutton, A. Clouds4Coordination: Managing project collaboration in federated clouds. In Proceedings of the 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC), Limassol, Cyprus, 7–10 December 2015; pp. 494–499.
Hay, B.; Nance, K.; Bishop, M. Storm clouds rising: Security challenges for IaaS cloud computing. In Proceedings of the 2011 44th Hawaii International Conference on System Sciences, Washington, DC, USA, 4–7 January 2011; pp. 1–7.
Afolabi, A.; Ojelabi, R.A.; Fagbenle, O.I.; Mosaku, T. The economics of cloud-based computing technologies in construction project delivery. Int. J. Civ. Eng. Technol. (IJCIET) 2017, 8, 232–242.
Moniruzzaman, A.; Hossain, S.A. Nosql database: New era of databases for big data analytics-classification, characteristics and comparison. arXiv 2013, arXiv:1307.0191.
Kouanou, A.T.; Tchiotsop, D.; Kengne, R.; Zephirin, D.T.; Armele, N.M.A.; Tchinda, R. An optimal big data workflow for biomedical image analysis. Inform. Med. Unlocked 2018, 11, 68–74.
Rodrigues, M.; Santos, M.Y.; Bernardino, J. Big data processing tools: An experimental performance evaluation. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1297.
Wang, D.; Fan, J.; Fu, H.; Zhang, B. Research on optimization of big data construction engineering quality management based on RNN-LSTM. Complexity 2018, 2018, 9691868.
Bilal, M.; Oyedele, L.O.; Akinade, O.O.; Ajayi, S.O.; Alaka, H.A.; Owolabi, H.A.; Qadir, J.; Pasha, M.; Bello, S.A. Big data architecture for construction waste analytics (CWA): A conceptual framework. J. Build. Eng. 2016, 6, 144–156.
Munawar, H.S.; Qayyum, S.; Ullah, F.; Sepasgozar, S. Big data and its applications in smart real estate and the disaster management life cycle: A systematic analysis. Big Data Cogn. Comput. 2020, 4, 4.
Qadir, Z.; Khan, S.I.; Khalaji, E.; Munawar, H.S.; Al-Turjman, F.; Mahmud, M.P.; Kouzani, A.Z.; Le, K. Predicting the energy output of hybrid PV–wind renewable energy system using feature selection technique for smart grids. Energy Rep. 2021, 7, 8465–8475.
Miller, H.J.; Tolle, K. Big data for healthy cities: Using location-aware technologies, open data and 3D urban models to design healthier built environments. Built Environ. 2016, 42, 441–456.
Chen, X.; Lu, W. Scenarios for Applying Big Data in Boosting Construction: A Review. In Proceedings of the 21st International Symposium on Advancement of Construction Management and Real Estate, Guiyang, China, 24–27 August 2018; pp. 1299–1306.
Atuahene, B.T.; Kanjanabootra, S.; Gajendran, T. Towards an integrated framework of big data capabilities in the construction industry: A systematic literature review. In Proceedings of the 34th Association of Researchers in Construction Management (ARCOM), Belfast, UK, 3–5 September 2018; p. 547.
Ullah, F. A beginner’s guide to developing review-based conceptual frameworks in the built environment. Architecture 2021, 1, 5–24.
Ullah, F.; Al-Turjman, F. A conceptual framework for blockchain smart contract adoption to manage real estate deals in smart cities. Neural Comput. Appl. 2021, 1–22.
Ullah, F. Developing a Novel Technology Adoption Framework for Real Estate Online Platforms: Users’ Perception and Adoption Barriers; University of New South Wales: Sydney, Australia, 2021.
Ullah, F.; Qayyum, S.; Thaheem, M.J.; Al-Turjman, F.; Sepasgozar, S.M. Risk management in sustainable smart cities governance: A TOE framework. Technol. Forecast. Soc. Change 2021, 167, 120743.
Qayyum, S.; Ullah, F.; Al-Turjman, F.; Mojtahedi, M. Managing smart cities through six sigma DMADICV method: A review-based conceptual framework. Sustain. Cities Soc. 2021, 72, 103022.
Huang, X. Application of BIM Big Data in Construction Engineering Cost. J. Phys. Conf. Ser. 2021, 1865, 032016.
Loyola, M. Big data in building design: A review. J. Inf. Technol. Constr. 2018, 23, 259–284.
Ismail, S.A.; Bandi, S.; Maaz, Z.N. An appraisal into the potential application of big data in the construction industry. Int. J. Built Environ. Sustain. 2018, 5, 145–154.
Bilal, M.; Oyedele, L.O.; Kusimo, H.O.; Owolabi, H.A.; Akanbi, L.A.; Ajayi, A.O.; Akinade, O.O.; Delgado, J.M.D. Investigating profitability performance of construction projects using big data: A project analytics approach. J. Build. Eng. 2019, 26, 100850.
Sharif, M.; Mercelis, S.; Van Den Bergh, W.; Hellinckx, P. Towards real-time smart road construction: Efficient process management through the implementation of internet of things. In Proceedings of the International Conference on Big Data and Internet of Thing, London, UK, 20–22 December 2017; pp. 174–180.
Curtis, C. Architecture at Scale: Reimagining One-Off Projects as Building Platforms. Archit. Des. 2020, 90, 96–103.
Shtern, M.; Mian, R.; Litoiu, M.; Zareian, S.; Abdelgawad, H.; Tizghadam, A. Towards a multi-cluster analytical engine for transportation data. In Proceedings of the 2014 International Conference on Cloud and Autonomic Computing, London, UK, 8–12 September 2014; pp. 249–257.
Ying, L.J.; Pheng, L.S. Enhancing buildability in China’s construction industry using Singapore’s buildable design appraisal system. J. Technol. Manag. China 2007, 2, 264–278.
Munawar, H.S.; Ullah, F.; Qayyum, S.; Heravi, A. Application of Deep Learning on UAV-Based Aerial Images for Flood Detection. Smart Cities 2021, 4, 1220–1242.
Gondia, A.; Siam, A.; El-Dakhakhni, W.; Nassar, A.H. Machine learning algorithms for construction projects delay risk prediction. J. Constr. Eng. Manag. 2020, 146, 04019085.
Tixier, A.J.-P.; Hallowell, M.R.; Rajagopalan, B.; Bowman, D. Application of machine learning to construction injury prediction. Autom. Constr. 2016, 69, 102–114.
Rahimian, F.P.; Seyedzadeh, S.; Oliver, S.; Rodriguez, S.; Dawood, N. On-demand monitoring of construction projects through a game-like hybrid application of BIM and machine learning. Autom. Constr. 2020, 110, 103012.
Maqsoom, A.; Aslam, B.; Gul, M.E.; Ullah, F.; Kouzani, A.Z.; Mahmud, M.; Nawaz, A. Using Multivariate Regression and ANN Models to Predict Properties of Concrete Cured under Hot Weather. Sustainability 2021, 13, 10164.
Yang, A.; Yu, G. Application of Heterogeneous Network Oriented to NoSQL Database in Optimal Postevaluation Indexes of Construction Projects. Discret. Dyn. Nat. Soc. 2022, 2022, 4817300.
Sanni-Anibire, M.O.; Zin, R.M.; Olatunji, S.O. Machine learning model for delay risk assessment in tall building projects. Int. J. Constr. Manag. 2020, 1–10.
Lu, W.; Chen, X.; Ho, D.C.; Wang, H. Analysis of the construction waste management performance in Hong Kong: The public and private sectors compared using big data. J. Clean. Prod. 2016, 112, 521–531.
Han, Z.; Wang, Y. The applied exploration of big data technology in prefabricated construction project management. ICCREM 2017, 2017, 71–78.
Jiao, Y.; Zhang, S.; Li, Y.; Wang, Y.; Yang, B.; Wang, L. An augmented MapReduce framework for building information modeling applications. In Proceedings of the 2014 IEEE 18th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Hsinchu, Taiwan, 21–23 May 2014; pp. 283–288.
Yu, T.; Liang, X.; Wang, Y. Factors affecting the utilization of big data in construction projects. J. Constr. Eng. Manag. 2020, 146, 04020032.
Qadir, Z.; Ullah, F.; Munawar, H.S.; Al-Turjman, F. Addressing disasters in smart cities through UAVs path planning and 5G communications: A systematic review. Comput. Commun. 2021, 168, 114–135.
Alaka, H.A.; Oyedele, L.O.; Owolabi, H.A.; Bilal, M.; Ajayi, S.O.; Akinade, O.O. A framework for big data analytics approach to failure prediction of construction firms. Appl. Comput. Inform. 2018, 16, 207–222.
Asadianfam, S.; Shamsi, M.; Kenari, A.R. Hadoop Deep Neural Network for offending drivers. J. Ambient. Intell. Humaniz. Comput. 2021, 13, 659–671.
Liu, R.-H.; Kuo, C.-F.; Yang, C.-T.; Chen, S.-T.; Liu, J.-C. On construction of an energy monitoring service using big data technology for smart campus. In Proceedings of the 2016 7th International Conference on Cloud Computing and Big Data (CCBD), Macau, China, 16–18 November 2016; pp. 81–86.
Song, Y.; Clayton, M.J.; Johnson, R.E. Anticipating reuse: Documenting buildings for operations using web technology. Autom. Constr. 2002, 11, 185–197.
Zhang, C.; Arditi, D. Automated progress control using laser scanning technology. Autom. Constr. 2013, 36, 108–116.
Caneparo, L. Shared virtual reality for design and management: The Porta Susa project. Autom. Constr. 2001, 10, 217–228.
Palaneeswaran, E.; Kumaraswamy, M.M. Knowledge mining of information sources for research in construction management. J. Constr. Eng. Manag. 2003, 129, 182–191.
Pan, W.; Ilhan, B.; Bock, T. Process information modelling (PIM) concept for on-site construction management: Hong Kong case. Period. Polytech. Archit. 2018, 49, 165–175.
Kim, J.-S.; Kim, B.-S. Analysis of fire-accident factors using big-data analysis method for construction areas. KSCE J. Civil Eng. 2018, 22, 1535–1543.
Bilal, M.; Oyedele, L.O.; Qadir, J.; Munir, K.; Ajayi, S.O.; Akinade, O.O.; Owolabi, H.A.; Alaka, H.A.; Pasha, M. Big Data in the construction industry: A review of present status, opportunities, and future trends. Adv. Eng. Inform. 2016, 30, 500–521.
Chang, C.-Y.; Tsai, M.-D. Knowledge-based navigation system for building health diagnosis. Adv. Eng. Inform. 2013, 27, 246–260.
Lin, J.R.; Hu, Z.Z.; Zhang, J.P.; Yu, F.Q. A natural-language-based approach to intelligent data retrieval and representation for cloud BIM. Comput.-Aided Civ. Infrastruct. Eng. 2016, 31, 18–33.
Narayan, S.; Tan, H.C. Adopting big data to forecast success of construction projects: A review. Malays. Constr. Res. J. 2019, 6, 132–143.
Linder, L.; Vionnet, D.; Bacher, J.-P.; Hennebert, J. Big building data—A big data platform for smart buildings. Energy Procedia 2017, 122, 589–594.
Zhang, Y.; Luo, H.; He, Y. A system for tender price evaluation of construction project based on big data. Procedia Eng. 2015, 123, 606–614.
Das, M.; Cheng, J.C.; Kumar, S.S. Social BIMCloud: A distributed cloud-based BIM platform for object-based lifecycle information exchange. Vis. Eng. 2015, 3, 1–20.
Jeong, S.; Byun, J.; Kim, D.; Sohn, H.; Bae, I.H.; Law, K.H. A data management infrastructure for bridge monitoring. In Proceedings of the Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems 2015, San Diego, CA, USA, 9–12 March 2015; p. 94350.
Guo, S.; Ding, L.; Luo, H.; Jiang, X. A Big-Data-based platform of workers’ behavior: Observations from the field. Accid. Anal. Prev. 2016, 93, 299–309.
Ram, J.; Afridi, N.K.; Khan, K.A. Adoption of Big Data analytics in construction: Development of a conceptual model. Built Environ. Proj. Asset Manag. 2019, 9, 564–579.
Oti, A.; Tah, J.; Abanda, F. Integration of lessons learned knowledge in building information modeling. J. Constr. Eng. Manag. 2018, 144, 04018081.
Das, M.; Cheng, J.C.; Law, K.H. An ontology-based web service framework for construction supply chain collaboration and management. Eng. Constr. Archit. Manag. 2015, 22, 551–572.
Jing, Y.; Wang, Y.-C.; Wang, Z. Knowledge management in construction—The framework of high value density knowledge discovery with graph database. In Civil, Architecture and Environmental Engineering; CRC Press: Boca Raton, FL, USA, 2017; pp. 712–715.
Jiao, Y.; Wang, Y.; Zhang, S.; Li, Y.; Yang, B.; Yuan, L. A cloud approach to unified lifecycle data management in architecture, engineering, construction and facilities management: Integrating BIMs and SNS. Adv. Eng. Inform. 2013, 27, 173–188.
Woodhead, R.; Stephenson, P.; Morrey, D. Digital construction: From point solutions to IoT ecosystem. Autom. Constr. 2018, 93, 35–46.
ElZahed, M.; Marzouk, M. Smart archiving of energy and petroleum projects utilizing big data analytics. Autom. Constr. 2022, 133, 104005.
Yang, B.; Dong, M.; Wang, C.; Liu, B.; Wang, Z.; Zhang, B. IFC-based 4D construction management information model of prefabricated buildings and its application in graph database. Appl. Sci. 2021, 11, 7270.
Zibion, D. Development of a BIM-Enabled Software Tool for Facility Management Using Interactive Floor Plans, Graph-Based Data Management and Granular Information Retrieval. Master’s Thesis, Aalto University, Espoo, Finland, 2018.
Ngo, J.; Hwang, B.-G.; Zhang, C. Factor-based big data and predictive analytics capability assessment tool for the construction industry. Autom. Constr. 2020, 110, 103042.
Atuahene, B.T.; Kanjanabootra, S.; Gajendran, T. Benefits of Big Data Application Experienced in the Construction Industry: A Case of an Australian Construction Company. In Proceedings of the 36th Annual Association of Researchers in Construction Management (ARCOM) Conference, Virtual Conference, Leeds, UK, 7–8 September 2020.
Lam, H.F.; Yang, J.H.; Au, S.K. Markov chain Monte Carlo-based Bayesian method for structural model updating and damage detection. Struct. Control Health Monit. 2018, 25, e2140.
Ara, J.; Ali, S.; Shah, I. Monitoring schedule time using exponentially modified Gaussian distribution. Qual. Technol. Quant. Manag. 2020, 17, 448–469.
Wright, M.G.; Williams, T.P. Using bidding statistics to predict completed construction cost. Eng. Econ. 2001, 46, 114–128.
Abdullah, D.; Wern, G.C.M. An analysis of accidents statistics in Malaysian construction sector. In Proceedings of the International Conference on E-business, Management and Economics, Dubai, United Arab Emirates, 28–30 December 2011; pp. 1–4.
Munawar, H.S.; Ullah, F.; Heravi, A.; Thaheem, M.J.; Maqsoom, A. Inspecting Buildings Using Drones and Computer Vision: A Machine Learning Approach to Detect Cracks and Damages. Drones 2022, 6, 5.
Siddiqui, S.Q.; Ullah, F.; Thaheem, M.J.; Gabriel, H.F. Six Sigma in construction: A review of critical success factors. Int. J. Lean Six Sigma 2016, 7, 171–186.
Ullah, F.; Thaheem, M.J.; Siddiqui, S.Q.; Khurshid, M.B. Influence of Six Sigma on project success in construction industry of Pakistan. TQM J. 2017, 29, 276–309.
Shirowzhan, S.; Lim, S. Autocorrelation statistics-based algorithms for automatic ground and non-ground classification of Lidar data. In Proceedings of the ISARC, International Symposium on Automation and Robotics in Construction, Sydney, Australia, 9–11 July 2014; p. 1.
Sepasgozar, S.M.; Karimi, R.; Shirowzhan, S.; Mojtahedi, M.; Ebrahimzadeh, S.; McCarthy, D. Delay causes and emerging digital tools: A novel model of delay analysis, including integrated project delivery and PMBOK. Buildings 2019, 9, 191.
Doloi, H.; Sawhney, A.; Iyer, K. Structural equation model for investigating factors affecting delay in Indian construction projects. Constr. Manag. Econ. 2012, 30, 869–884.
Baker, H.R.; Smith, S.D.; Masterton, G.; Hewlett, B. Failures in construction: Learning from everyday forensic engineering. In Forensic Engineering 2018: Forging Forensic Frontiers; American Society of Civil Engineers: Reston, VA, USA, 2018; pp. 648–658.
Alipour, M.; Harris, D.K.; Barnes, L.E.; Ozbulut, O.E.; Carroll, J. Load-capacity rating of bridge populations through machine learning: Application of decision trees and random forests. J. Bridge Eng. 2017, 22, 04017076.
Lu, W.; Chen, X.; Peng, Y.; Shen, L. Benchmarking construction waste management performance using big data. Resour. Conserv. Recycl. 2015, 105, 49–58.
Sun, H.; Wang, L.; Yang, Z.; Xie, J. Research on Construction Engineering Quality Management Based on Building Information Model and Computer Big Data Mining. Arab. J. Sci. Eng. 2021, 1–11.
Moon, S.; Kim, J.; Kwon, K. Effectiveness of OLAP-based cost data management in construction cost estimate. Autom. Constr. 2007, 16, 336–344.
Carrillo, P.; Harding, J.; Choudhary, A. Knowledge discovery from post-project reviews. Constr. Manag. Econ. 2011, 29, 713–723.
Williams, T.P. Predicting final cost for competitively bid construction projects using regression models. Int. J. Proj. Manag. 2003, 21, 593–599.
Polat, G.; Bingol, B.N. A comparison of fuzzy logic and multiple regression analysis models in determining contingency in international construction projects. Constr. Innov. 2013, 13, 445–462.
Hoffman, G.J.; Thal, A.E., Jr.; Webb, T.S.; Weir, J.D. Estimating performance time for construction projects. J. Manag. Eng. 2007, 23, 193–199.
Bukowski, L. Reliable, Secure and Resilient Logistics Networks; Springer: Cham, Switzerland, 2019.
Konikov, A.; Konikov, G. Big Data is a powerful tool for environmental improvements in the construction business. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2017; p. 012184.
Williamson, B. The hidden architecture of higher education: Building a big data infrastructure for the ‘smarter university’. Int. J. Educ. Technol. High. Educ. 2018, 15, 1–26.
Gbadamosi, A.-Q.; Oyedele, L.; Mahamadu, A.-M.; Kusimo, H.; Bilal, M.; Delgado, J.M.D.; Muhammed-Yakubu, N. Big data for Design Options Repository: Towards a DFMA approach for offsite construction. Autom. Constr. 2020, 120, 103388.
Ajayi, A.; Oyedele, L.; Akinade, O.; Bilal, M.; Owolabi, H.; Akanbi, L.; Delgado, J.M.D. Optimised big data analytics for health and safety hazards prediction in power infrastructure operations. Saf. Sci. 2020, 125, 104656.
Valpeters, M.; Kireev, I.; Ivanov, N. Application of machine learning methods in big data analytics at management of contracts in the construction industry. In Proceedings of the MATEC Web of Conferences, St. Petersburg, Russia, 20–22 December 2017; p. 01106.
Braun, A.; Borrmann, A. Combining inverse photogrammetry and BIM for automated labeling of construction site images for machine learning. Autom. Constr. 2019, 106, 102879.
Huang, M.; Ninić, J.; Zhang, Q. BIM, machine learning and computer vision techniques in underground construction: Current status and future perspectives. Tunn. Undergr. Space Technol. 2021, 108, 103677.
Cheng, J.C.; Chen, W.; Chen, K.; Wang, Q. Data-driven predictive maintenance planning framework for MEP components based on BIM and IoT using machine learning algorithms. Autom. Constr. 2020, 112, 103087.
Bloch, T.; Sacks, R. Comparing machine learning and rule-based inferencing for semantic enrichment of BIM models. Autom. Constr. 2018, 91, 256–272.
Munawar, H.S.; Hammad, A.W.; Waller, S.T. A review on flood management technologies related to image processing and machine learning. Autom. Constr. 2021, 132, 103916.
Munawar, H.S.; Hammad, A.; Ullah, F.; Ali, T.H. After the flood: A novel application of image processing and machine learning for post-flood disaster management. In Proceedings of the 2nd International Conference on Sustainable Development in Civil Engineering (ICSDC 2019), Jamshoro, Pakistan, 5–7 December 2019; pp. 5–7.
Qureshi, A.H.; Alaloul, W.S.; Manzoor, B.; Musarat, M.A.; Saad, S.; Ammad, S. Implications of machine learning integrated technologies for construction progress detection under industry 4.0 (IR 4.0). In Proceedings of the 2020 Second International Sustainability and Resilience Conference: Technology and Innovation in Building Designs (51154), Sakheer, Bahrain, 11–12 November 2020; pp. 1–6.
Rozemberczki, B.; Scherer, P.; He, Y.; Panagopoulos, G.; Riedel, A.; Astefanoaei, M.; Kiss, O.; Beres, F.; López, G.; Collignon, N. Pytorch geometric temporal: Spatiotemporal signal processing with neural machine learning models. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Gold Coast, Australia, 1–5 November 2021; pp. 4564–4573.
Eluri, V.R.; Ramesh, M.; Al-Jabri, A.S.M.; Jane, M. A comparative study of various clustering techniques on big data sets using Apache Mahout. In Proceedings of the 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC), Muscat, Oman, 15–16 March 2016; pp. 1–4.
Solanki, R.; Ravilla, S.H.; Bein, D. Study of distributed framework hadoop and overview of machine learning using apache mahout. In Proceedings of the 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 7–9 January 2019; pp. 0252–0257.
Sonnenburg, S.; Rätsch, G.; Henschel, S.; Widmer, C.; Behr, J.; Zien, A.; Bona, F.d.; Binder, A.; Gehl, C.; Franc, V. The SHOGUN machine learning toolbox. J. Mach. Learn. Res. 2010, 11, 1799–1802.
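Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.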
Majumder, G.; Jain, R. A Comparative Study and Analysis of Classification Methods in Machine Learning. Think India J. 2019, 22, 709–718.
Liu, H.; Lang, B. Machine learning and deep learning methods for intrusion detection systems: A survey. Appl. Sci. 2019, 9, 4396.

Figure 1. Funnel diagrams showing trends in big data research in construction since 2016.

Figure 2. Overlapping fields of research contributing to big data.

Figure 3. Google Trends analyses for big data in construction, showing interest development and spike in interest.

Figure 4. Five key clusters of big data in construction based on keywords reported in the reviewed literature.

Figure 5. Countries conducting big data research in construction based on reviewed literature.

Figure 6. Classification of big data into its key domains.

Figure 7. Components of Spark and Hadoop. A side-by-side comparison of Spark and Hadoop provides insights about the usability and applications of each.

Figure 8. Database popularity in 2016–2021 based on search trends.

Figure 9. The 10 Vs of big data.

Table 1. Use of statistical models in construction.

Purpose	Techniques	References
Damage detection in buildings	Monte Carlo simulation; Gaussian distribution	[85]
Construction time scheduling	Gaussian distribution	[86]
Predicting project delays	Monte Carlo simulation; Non-Bayesian methods; Correlation analysis; Factor analysis	[92,93,94]
Decision making	Decision trees; Naïve Bayes	[95,96]
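
As an illustration of how two of the techniques in Table 1 can be combined, the following minimal Python sketch uses Monte Carlo simulation with Gaussian-distributed activity durations to estimate the chance of a project delay. It is not taken from any of the cited studies. The activity names, means, standard deviations, the 95-day deadline, and the function names are all hypothetical values chosen for the example, and the activities are assumed to run strictly one after another.

```python
import random
import statistics

# Hypothetical activities: (name, mean duration in days, standard deviation in days).
# These figures are illustrative assumptions, not data from the reviewed literature.
ACTIVITIES = [
    ("excavation", 10.0, 2.0),
    ("foundation", 15.0, 3.0),
    ("structure", 40.0, 6.0),
    ("fit-out", 25.0, 4.0),
]


def simulate_total_duration(rng):
    """Draw one possible project duration by sampling every activity once."""
    total = 0.0
    for _name, mean, std in ACTIVITIES:
        # Gaussian draw, clipped at zero so a duration is never negative.
        total += max(0.0, rng.gauss(mean, std))
    return total


def estimate_delay_probability(deadline_days, n_runs=10_000, seed=42):
    """Monte Carlo estimate of P(total duration > deadline) and the mean duration."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    totals = [simulate_total_duration(rng) for _ in range(n_runs)]
    late_runs = sum(1 for t in totals if t > deadline_days)
    return late_runs / n_runs, statistics.mean(totals)


if __name__ == "__main__":
    p_late, mean_total = estimate_delay_probability(deadline_days=95.0)
    print(f"Mean simulated duration: {mean_total:.1f} days")
    print(f"Estimated probability of missing the deadline: {p_late:.2%}")
```

In practice, the distribution parameters would be fitted to historical project records rather than assumed, and a schedule with parallel activities would require a critical-path calculation inside each simulation run instead of a simple sum.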

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

