Article

Multimodal Framework for Smart Building Occupancy Detection

by Mohammed Awad Abuhussain 1, Badr Saad Alotaibi 1,*, Yakubu Aminu Dodo 1, Ammar Maghrabi 2 and Muhammad Saidu Aliero 3
1 Architectural Engineering Department, College of Engineering, Najran University, Najran 66426, Saudi Arabia
2 Urban and Engineering Research Department, The Custodian of the Two Holy Mosques Institute for Hajj and Umrah Research, Umm Al-Qura University, Makkah 24236, Saudi Arabia
3 Department of Information Technology, Kebbi State University Science and Technology, Aliero 863104, Nigeria
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(10), 4171; https://doi.org/10.3390/su16104171
Submission received: 28 March 2024 / Revised: 30 April 2024 / Accepted: 13 May 2024 / Published: 16 May 2024
(This article belongs to the Special Issue Emergency Plans and Disaster Management in the Era of Smart Cities)

Abstract

Over the years, building appliances have become major energy consumers as they are used to improve indoor air quality and occupants’ lifestyles. Primary energy usage in the building sector, particularly by lighting and heating, ventilation, and air conditioning (HVAC) equipment, is expected to double in the upcoming years due to inappropriate control operation. Recently, several researchers have provided automated solutions that turn HVAC and lighting on when a space is occupied and off when it becomes vacant. Previous studies indicate a lack of publicly accessible datasets for environmental sensing and suggest developing holistic models that detect building occupancy. Additionally, the reliability of their solutions tends to decrease as the occupancy of a building grows. Therefore, this study proposes a machine learning-based framework for smart building occupancy detection that considers the lighting parameter in addition to the HVAC parameter used in existing studies. We employ a parametric classifier to ensure a strong correlation between the predicting parameters and the occupancy prediction model. The framework combines direct and environmental sensing techniques to obtain high-quality training data. The experimental results show high accuracy, precision, recall, and F1-score for the applied random forest (RF) model (0.86, 0.99, 1.0, and 0.88, respectively) for occupancy prediction, as well as substantial energy savings.

1. Introduction

Energy is an essential attribute that contributes to the sustainability of building sectors [1]. Currently, buildings account for a high percentage of global energy consumption with an expectation to double in upcoming years due to the increase in the deployment of electrical appliances [1]. The bulk of energy used in buildings to produce a healthy and comfortable environment requires more energy generation, which has a great impact on the well-being of our environment. Authors in [2] showed a major turning point that is intimately tied to the Asian region’s continuous population growth as well as the increasing building sector. According to a similar study, attempts made by the government to enhance the psychological well-being and lifestyle of its people have been somewhat countered by a significant increase in energy magnitude, which has increased the use of energy. The efforts to improve the supply that meets energy requirements for more than 1.3 billion individuals and numerous industries have been commended, according to a recent report [1]. The continuous installation of energy-intensive appliances such as HVAC (Heating, Ventilation, and Air Conditioning) systems, televisions, ovens, hair dryers, etc., has increased abnormally in the last few years. Through the years, several researchers have improved energy conservation via different methodologies by utilizing historic energy usage data. More energy savings are possible due to emerging smart grid approaches, which help balance energy usage, demand, and production throughout various sectors. As a result, the overall growth of energy usage in the building and transportation sectors has slowed to less than half that of the past twenty years. In the same way, the industry’s growth in energy usage has stalled. The requirement for non-combusted energy has not decreased despite the current advancements in technologies, especially in the industry as a source of energy for petrochemicals.
Building occupancy prediction plays a major role in various building applications and infrastructures such as smart buildings, indoor intrusion detection, evacuation, building operation, and demand control applications [3]. Currently, different technologies such as passive infrared, cameras, illumination, Wi-Fi, Bluetooth, and environmental sensing (such as temperature, relative humidity, and CO2) are utilized in buildings to count the number of occupants [4]. Studies show that occupancy prediction has the potential to reduce more than 60% of unnecessary building energy consumption. With the rapid improvement of machine learning (ML) and the advancement of computer vision, research on building occupancy prediction via image/video is gaining momentum as reported in several studies [3,5,6,7,8,9]. The research techniques mainly contain two aspects: cameras at the room entrance and installed indoor cameras. The study in [3] configured cameras at the indoor entrance for overhead video recording. The study utilized the ML background subtraction (BS) technique to remove the noise in the frame or area of interest and estimate the number of occupants based on headcount. The study in [10] uses surveillance cameras as boundary sensors alongside the ML histogram of oriented gradient (HOG) classifier based on the ML support vector machine (SVM) to detect occupants. In addition, the study utilized the event-based optimization (EBO) theory to refine the estimation results. This type of method usually uses motion detection and tracking to decide the direction of entering or leaving rooms.
An occupancy localization study that used Plug-Mate alongside IoT technology for occupancy-driven plug load management that employed intelligent plug load automation to lower plug load energy usage and user strain was proposed [11]. The suggested system infers plug load type data via a sophisticated plug load recognition feature, performs plug load automation based on the users’ excellent-quality occupancy details gained via a non-intrusive indoor localization system, and accommodates different patterns and customized interfaces [11]. A similar approach employed Bluetooth Low Energy (BLE) technology that is potentially quite helpful to pinpoint an occupant’s location by using data from beacons placed throughout a facility [12]. The proposed system was implemented using a prototype system consisting of BLE beacons, a mobile application, and a central server to assess BLE as the main method of occupancy estimation in an indoor setting.
Different ML methods are widely used in building occupancy detection [13]. However, the random forest (RF) algorithm’s broad appeal can be attributed to its versatility and ease of use, which allow it to efficiently address challenges related to both regression and classification which serve as a useful tool for a variety of machine learning predictions. Additionally, its proficiency in handling complicated datasets and mitigating overfitting is reported with solid performance even in a residential building where occupants do not have a fixed schedule [14]. The research in [2,15] provides a comparative analysis of building occupancy prediction using different ML methods. The results indicate that the RF method achieves higher prediction accuracy with minimal false results. The studies in [16,17] chose RF as a candidate model over another ML model due to its flexibility on random data to predict room occupancy based on carbon dioxide.
Currently, research on building occupancy prediction has given more emphasis to environmental sensing, using occupancy data and indoor meteorological conditions to formulate a model that triggers an event to control appliance operation [18]. Indoor environmental quality monitoring is a holistic concept that deals with indoor air quality or thermal comfort and is further described in [19]. Another study proposed multiplicative manufacturing methods using a microcontroller to incorporate temperature, CO2, and relative humidity sensors as well as some other modules to automatically control an air conditioning system and lighting in a residential building occupied by four people [20]. A similar study [21] indicated the impact of energy costs on user comfort, health, and safety. Research has shown the fast acceptance of smart interior lighting in recent years for improving human life [16]. Authors in [22] suggested the utilization of dimmable light-emitting diodes, occupancy sensors, and photodetectors that use microcontroller units to control indoor lighting requirements. These appliances have been primary research areas in recent years [23]. Incorporating occupancy parameters in the control system can enable appliances to optimize energy consumption and reduce energy waste [1]. Direct sensing techniques using technologies such as wearable devices, cameras, or passive infrared have become popular recently for smart energy optimization [24,25]. However, the majority of camera-based [3,15,24] and wearable [26,27] solutions require high interaction with occupants. Even though direct sensing has proven to be a reliable solution for occupancy prediction, it still presents serious challenges, including complex processing, hardware expenditure, installation feasibility, and privacy [3].
Many smart buildings today employ environmental sensing as an alternative solution to deal with the privacy and other challenges of previous occupancy prediction solutions. Environmental sensing uses environmental parameters such as changes in CO2, temperature, or humidity to predict building occupancy. A previous study [2] indicated the lack of a publicly accessible dataset for environmental sensing and suggested a technique by which a high-quality dataset can be generated to develop a holistic model that performs solid occupancy prediction, which in turn can be used to reduce HVAC energy consumption. Furthermore, previous experimental studies limited their solutions to HVAC systems only, even though lighting appliances are among the major building energy consumers. Additionally, the reliability of their solutions tends to weaken when more than seven occupants are present. Therefore, this study explored and analyzed current approaches for smart building occupancy prediction to improve the quality of the occupancy-related datasets used for training through a multimodal data fusion approach within the smart home ecosystem. Previous research employed interactive learning to label occupant numbers during data collection [2,14,28,29]. In contrast, the proposed data fusion approach uses a parametric classifier to ensure a strong correlation between the predicted variable (room occupancy) and the predicting parameters (occupancy-related data in the room), verify the quality of occupancy-related data, and filter noisy sensor measurements using rule-based decisions without being overly invasive. In addition, a lighting control parameter is incorporated together with HVAC system control, unlike current studies [29,30]. A random forest was used to handle the occupancy prediction tasks (other ML algorithms could be integrated) to balance HVAC energy usage with thermal comfort.
The occupancy prediction performance results were measured using different evaluation metrics against the baseline design [31]. The thermal perception of the occupancy was analyzed based on on-site interviews with occupants on their experiences of thermal comfort in different temperature settings based on designed questionnaires for participants of different nationalities. The findings demonstrate that a satisfactory comfort level and up to 45% energy savings can be achieved when the proposed controller is deployed in comparison with the previous approach.
The major contribution of this study is to predict indoor occupancy to minimize energy consumption and improve occupant thermal comfort with a minimally intrusive nature. The sub-contributions of this work are as follows:
  • A proposed novel multimodal framework to predict indoor occupancy with minimal intrusion;
  • Feature selection for occupancy prediction based on various occupancy-related data collected;
  • Evaluation of the prediction performance of the proposed approach using a prototype system;
  • Simulation and evaluation of the proposed smart controller based on the indoor thermal comfort of the occupants and energy consumption.
This paper is organized as follows: Section 2 discusses the previous research on interior occupancy recognition and prediction. The materials and methods are presented in Section 3. Section 4 presents the findings from the occupancy recognition and assessment models and presents the findings of energy-saving prospective simulations. Section 5 provides a study discussion and Section 6 concludes the study findings and upcoming research.

2. Related Work

This section examines the literature on smart building occupancy prediction and smart home energy management systems. It highlights typical occupancy prediction approaches. Recent advancements in this domain have concentrated on incorporating features that allow occupants to take part in dataset collection for model training, track building energy consumption, schedule or modify building energy consumption profiles, and engage in utility communication with the grid via various allocations such as self-fault reports and demand responses. Previous studies focused on improving and integrating smart building technologies to enhance energy efficiency through methods such as the multimodal strategy or fusion mechanisms. The integration can be accomplished using various techniques, including monitoring indoor occupancy activities and occupancy tracking through wearable devices, voice identification, and facial recognition [24]. Table 1 summarizes previous proposals for occupancy prediction approaches. These proposals can be classified into predictive and dynamic predictive approaches.
The predictive approach is designed to automate the management of HVAC operations through the formulation and computation of occupancy information and indoor meteorological data to estimate the probability of presence in space. The occupancy information can either be statically generated from fixed-schedule occupancy activities or dynamically collected in real time by sensors at the point of occupancy prediction [1].
The dynamic predictive approach uses real-time data from sensors installed in the occupied surroundings to identify occupancy presence or the number of occupants in the building. In some cases, the occupant schedule is considered a desirable parameter to generate a model that detects room occupation [1]. The majority of existing dynamic predictive control approaches have difficulty identifying actual human occupancy, which is needed to reduce false results triggered by pets or stationary objects, and estimating the total number of occupants presently in the space to adjust to a comfortable temperature.
The study in [40] uses a combined CO2 and PIR sensor-based approach to detect occupant presence in a space, with an intelligent controller that signals the HVAC system to adjust the temperature set point, ensuring the thermal comfort of occupants.
PIR and CO2 sensors are widely used technologies for occupancy prediction in indoor environments to facilitate HVAC control. These sensors are highly effective at noticing a sudden change in an occupied environment. However, their major limitation is the lack of additional information to identify human occupancy, which can lead to excess energy consumption in the presence of pets or other stationary objects in the indoor space.
The occupancy detection method in [35] efficiently detects human presence and determines the number of occupants using a Gaussian Mixture Model. The suggested approach attained a reasonable degree of precision despite significant false positives stemming from ambient noise in the vicinity. Similarly, to estimate occupancy for HVAC ventilation management, the method necessitates simultaneous communication from all indoor occupants. Studies [36,42] that focused on improving the accuracy of [35] presented a background cancelation process that ignores the sound frequency level of the chosen ideal sound frequency threshold to cope with noise interference from undesired sources. The concept is based on the background cancelation algorithm’s ability to detect the strength of sound frequencies, which decreases occupancy prediction false positives by 11–12% and reduces HVAC equipment energy usage by 3.54% when compared to [35]. Despite the background cancelation technique, the occupancy estimates in that work remain inaccurate. A study [3] suggested using two cameras in tandem to improve the identification of building occupancy. With the use of open-source human–computer vision libraries, camera-based image and video processing is frequently utilized for occupancy prediction. Headcount or indoor object tracking is also frequently employed to estimate total occupancy numbers [43]. Using a headcount occupancy estimation process, single-camera occupancy detection in [43] was utilized to regulate indoor ventilation in a lecture hall. The research in [32] predicts the number of occupants using a naive Bayesian algorithm and an OpenCV library template. The experimental results reveal a high rate of occupancy detection and estimation, but also a high false positive rate due to the non-linear line of sight experienced by students as they walked or ran around the perimeter of interest.
A significant obstacle encountered by this research is inadequate occupancy identification throughout the overlapped period of entry and exit from the study area. Research in [44,45,46,47] suggested using passive infrared cameras in conjunction with optical cameras to identify human occupancy in space. The purpose of employing a single camera is to guarantee detection reliability while lowering the possibility of false alarms. Potential detection thresholds were generated by extracting and analyzing pixels using a computer vision human template and a support vector machine method.
A Naive Bayesian algorithm classification procedure was applied in both cameras to determine the type of occupancy detected. In this procedure, occupancy detection is only acceptable if the cameras report the same occupancy type; otherwise, the detection is rejected and considered undefined, which makes the approach prone to false results when one camera’s quality is better than that of its counterpart.
To reduce this challenge in [48], a study in [3] suggested a solution that uses the naive Bayesian algorithm to extract image pixels for training purposes based on a desirable threshold that should be used for human occupancy prediction. The idea was that if there is a contradiction in occupancy detection, the threshold of the camera with a negative forecast should be reduced by 30% for a potential match. The experimental result analysis shows human occupancy prediction improvement by 12% and 5% energy saving in comparison with the previous study.
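The two-camera decision rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `fuse_two_cameras`, the use of a scalar confidence score per camera, and the default threshold of 0.5 are hypothetical and not taken from [3] or [48]; only the "both cameras must agree" rule and the 30% threshold reduction come from the text.

```python
def fuse_two_cameras(score_a, score_b, threshold_a=0.5, threshold_b=0.5, relax=0.30):
    """Return True/False when both cameras agree, or None when undefined.

    score_a, score_b: hypothetical per-camera confidence that a human is present.
    relax: fraction by which the disagreeing (negative) camera's threshold is lowered.
    """
    detect_a = score_a >= threshold_a
    detect_b = score_b >= threshold_b
    if detect_a == detect_b:
        return detect_a  # both cameras report the same occupancy type

    # Conflict: lower the threshold of the camera with the negative forecast by 30%.
    if not detect_a:
        detect_a = score_a >= threshold_a * (1 - relax)
    else:
        detect_b = score_b >= threshold_b * (1 - relax)

    # Accept only if the relaxed decision now matches; otherwise stay undefined.
    return detect_a if detect_a == detect_b else None
```

With these assumed defaults, a pair of scores such as (0.8, 0.4) becomes a match after relaxation, whereas (0.8, 0.2) stays undefined.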
Similar studies were proposed in [44,46] that combine occupancy information collected by PIR sensors and cameras to reduce false alarms in conflicting cameras, such as overlapping and straight lines of sight. These techniques use occupancy tagging through binary object tracking to track indoor occupancy activities in order to control HVAC ventilation requirements and maintain occupants’ thermal comfort. Thermal sensors are attached to chairs to detect the electromagnetic radiation and heat frequency produced by occupants.
Research in [20,38,42] uses a sensor fusion approach to collect real-time occupancy data from installed CO2, PIR, and motion detection sensors and parse the data for human activity classification using K-Nearest Neighbor. ML was employed to monitor and classify the occupancy type in the indoor pre-existing trained test data template of the SVM threshold. The PIR and motion sensors are used to provide the controller with knowledge of occupancy movement within the indoor space, and CO2 sensors keep track of occupancy numbers as concentration levels increase to control HVAC ventilation requirements.
The study in [49] combines a light sensor counter and CO2 sensor as a stream of bytes to detect the occupants present in an indoor environment and estimate the occupancy number to manage the ventilation requirement of HVAC equipment to minimize unnecessary energy consumption. The light sensor was placed on the door entrance to count the occupants passing through the entry and exit and to keep track of the indoor occupancy number. The CO2 sensor was placed on the ceiling to track the warm breath of occupants as an indicator of occupant presence status in the indoor space. The experimental result analysis in the laboratory showed significant energy-saving potential for HVAC equipment in comparison with traditional thermostat control.
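A door-mounted counter of the kind described in [49] amounts to a running tally of entries and exits. The sketch below is a simplified illustration, not the implementation from [49]: the event names `"enter"` and `"exit"` and the function `track_occupancy` are assumptions, and the CO2 channel is left out.

```python
def track_occupancy(events, start=0):
    """Return the running occupant count after each door event.

    events: iterable of "enter" / "exit" strings (hypothetical event encoding).
    The count is clamped at zero so a missed entry cannot produce a negative tally.
    """
    count = start
    history = []
    for event in events:
        if event == "enter":
            count += 1
        elif event == "exit":
            count = max(0, count - 1)
        history.append(count)
    return history
```

Clamping at zero is a design choice for this sketch: a beam counter can miss events, and a second signal such as CO2 is then useful to confirm whether the space is really empty.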

3. Materials and Methods

IoT sensor data are key players with the full potential to be a significant contributor to smart building technologies that enable dynamic responsive approaches to optimally manage building appliances. Consequently, the foundation for more intelligent, effective appliance control and energy savings is laid by the incorporation of sensors, such as occupancy sensors, with smart building control systems. Moreover, this study applies an ML technique to construct an occupancy prediction model using sensor data. Figure 1 represents the research methodology and process for the proposed approach.

3.1. Occupancy-Related Raw Data

This study uses a sitting room in a building located in Taman Teratai Johor, Malaysia, as a case study, with an ambient value of 25 °C to 30 °C the whole year and the use of a cassette floor building method and a stick-built timber frame as part of an inventive lightweight structural strategy. Table 2 displays the thicknesses and thermal characteristics of the construction materials. These characteristics help evaluate the dynamic and stable behavior of occupants. The purpose of the sitting room is to host social gatherings for activities like dining, lounging, and watching TV. The sensors were installed to track several interior environmental parameters, including temperature, illumination, relative humidity, and CO2 concentrations. Moreover, the sitting room’s entries and departures were documented to verify that the values corresponded with the sensor readings. More details of the experimental setup can be accessed in [2].
Table 3 provides the descriptions of the used sensors to collect occupancy-related data in the building using communication with occupants about the context of the problem. Due to challenges in the labeling of occupancy numbers through interactive learning data collection [28], our approach employed a camera to address the labeling problem that emerges in the estimation of the number of occupants in supervised learning methods, which are widely utilized in many applications [14,44,50,51,52].

3.2. Data Pre-Processing

The quality of the dataset is one of the major factors that contribute to the performance of the prediction model. Previously, in [28], an interactive technique was applied to provide occupancy numbers during data collection to deal with practical privacy problems and incorrect values used in model training. However, in this study, environmental sensing and camera sensing are deployed to replace the manual method (interactive technique) during data collection, enabling self-estimation or labeling. This approach only used camera processing during data collection, thereby minimizing the privacy challenge when the prediction model is deployed in the building. The following three steps are used during data pre-processing.

3.2.1. Sensor Fusion

A general description of the proposed framework is presented in Figure 2, which includes the candidate record, dataset training, and ML model (random forest) that lead to the fuzzy system. Based on an occupant’s decision or habits, such results are sent to regulate or create the proper set point temperature using a fuzzy inference engine to control HVAC operation. Every parameter in the dataset log stream has an occupant-related characteristic. The total occupancy count is recorded in data logs. However, the content of these dataset streams can be affected by a particular event, such as a door opening when someone enters or departs. When such an event occurs, it takes a while for the indoor condition to return to the real level that corresponds to the building’s occupants.
This study adopted [31] for environmental sensing and [3] for the camera. To ensure the reliability of the dataset record, an interval of 5 min was set between the previously captured record and the next candidate record.
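The 5 min spacing between stored records can be illustrated with a small filter. This is a minimal sketch, assuming each record is a dictionary with a `"time"` key holding a `datetime`; the record layout and the function name `enforce_min_interval` are hypothetical.

```python
from datetime import datetime, timedelta


def enforce_min_interval(records, min_gap=timedelta(minutes=5)):
    """Keep a record only if it is at least min_gap after the last kept record."""
    kept = []
    for rec in sorted(records, key=lambda r: r["time"]):
        if not kept or rec["time"] - kept[-1]["time"] >= min_gap:
            kept.append(rec)
    return kept


# Usage: readings at minutes 0, 2, 5, 9, and 10 reduce to minutes 0, 5, and 10.
base = datetime(2024, 1, 1, 12, 0)
readings = [{"time": base + timedelta(minutes=m), "co2_ppm": 600 + m} for m in (0, 2, 5, 9, 10)]
spaced = enforce_min_interval(readings)
```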
The dynamic assessment of occupancy data is essential for effective modeling. The fusion approach uses camera data to complement environmental sensing data. Dataset collection using this approach has shown good performance for the cross-validation of various ML algorithms compared to the environmental sensing approach alone. Therefore, this study proposed a multimodal occupancy prediction framework (see Figure 2) that incorporates a camera approach with an environmental sensing approach to support dataset collection and quality assessment using a multi-layer perceptron regression algorithm for model training.
Occupancy number is one of the major parameters used by smart thermostats to regulate indoor comfort and energy usage. As described above, environmental sensing and camera sensing replace the manual interactive technique of [28] during data collection, enabling self-estimation or labeling. When the prediction model is deployed, only environmental sensors are used to predict occupancy numbers by reading the indoor environmental values. To ensure the quality and reliability of the training data, each data record read by the sensors is verified before storage, which later supports the model prediction when deployed for testing. Therefore, this problem can be treated as follows.
Let the dataset record measured by the sensors at time $t$ be $\mathrm{record}(t)$, consisting of a set of features, $(f_1, \dots, f_n) = \mathrm{record}(t)$. Thus, $(f_1, \dots, f_n)$ (a record) correlates with the status of occupancy at time $t$, where $f_i(t) \in (\check{f}_i, \hat{f}_i)$. Let the occupancy count represent the label $l(t)$ measured by a thermal camera that tallies with the corresponding indoor data recorded by the environmental sensors. Thus, the comprehensive dataset records measured to train the prediction model are represented as $r(t) = (\mathrm{record}(t), l(t))$. Any record in the dataset that has no corresponding label recorded is considered a candidate record (requiring validation), and any record with a corresponding label recorded is considered a recorded record.
Let recorded records be $p$. The major challenges in this type of interactive learning are (1) how to avoid the recording of a candidate record in the dataset, since it is collected automatically by a camera, and (2) how to assess the quality of the label $l(t)$ assigned by the camera in the dataset.
This process is performed based on a discrete response from the camera-predicted occupancy label, which provides the ground truth. Each time a record from the environment tallies with the data recorded by the cameras, it is considered valid and stored in the dataset records; otherwise, the record is discarded. This process is repeated throughout data collection with the help of a parametric classifier. The parameterized classifier used in this study employs decision trees to analyze the candidate record using (if–then) rules. It has a preset classifier structure with parameters that can be changed based on the data input, so its final structure can be modified by each new record set, depending on how much it differs from the previous one, to address the desired tuning problem. The objective function minimizes the difference between the actual indoor occupancy and the occupancy estimated by the classifier.
In the final stage, the conclusions from each modality are combined, and a mix of weighted possibilities and rule-based decision making is used to verify the quality of the data. During data collection, the decision tree and parameterized classifier are applied to assess the quality of data, and the occupant input value is automated using the occupancy label predicted by the camera instead of manual input provided by the occupants. The validated data were then used to train a random forest model and to analyze the error distribution of the proposed approach.
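The record-validation step can be pictured as a rule-based filter that stores an environmental record only when it carries a camera label and every feature lies inside its plausible range. The sketch below is a deliberately simplified stand-in, not the parametric classifier used in this study: the feature names, the numeric bounds in `FEATURE_RANGES`, and both function names are hypothetical.

```python
# Hypothetical plausibility bounds (lower, upper) per feature; not values from the study.
FEATURE_RANGES = {
    "temperature_c": (15.0, 40.0),
    "humidity_pct": (10.0, 100.0),
    "co2_ppm": (350.0, 5000.0),
    "light_lux": (0.0, 2000.0),
}


def validate_record(features, camera_label):
    """Return a labeled record if it passes the if-then rules, else None."""
    if camera_label is None:
        return None  # candidate record without a label: not stored
    for name, (low, high) in FEATURE_RANGES.items():
        value = features.get(name)
        if value is None or not (low <= value <= high):
            return None  # missing or out-of-range sensor reading: discard
    return {**features, "occupancy": camera_label}


def build_dataset(stream):
    """Filter a stream of (features, camera_label) pairs into training records."""
    dataset = []
    for features, label in stream:
        record = validate_record(features, label)
        if record is not None:
            dataset.append(record)
    return dataset
```

Records that survive this filter would then be passed to the learner (a random forest in this study); fixed bounds are used here only to keep the example short, whereas the text describes parameters that are adjusted as new records arrive.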

3.2.2. Normality of the Data Distribution

Moreover, it is essential to check whether the data are normally distributed. However, the graphical assessment of normality requires a high level of knowledge to avoid erroneous readings. X and Y vectors are commonly used to show data for graphic analysis. According to [53], suppose that $Y$ is the parameter that relies on the regression of $X$. If $X = (x_1, x_2, x_3, \dots, x_n)$ are correlated, then $Y$ is considered to be dependent on $X$, and $\mu = f(X)$ is a scattered vector. Hence, $Y$ and $\mu$ can be presented as
$$Y \mid X \sim N\left(\mu = f(X), \sigma^2\right)$$
$$\mu = f(X) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$
The normality of the recorded dataset is illustrated graphically using Q-Q plots (see Figure 3 and Figure 4). As can be seen in Figure 3 and Figure 4, the blue line indicates the normal distribution of the dataset.
There is some variance in the distribution of the datasets. After a physical inspection of the unmatched plots, it was established that the skew did not come from sensor malfunction but arose naturally, so it was not an issue and was not expected to affect the model prediction accuracy. All variables have some points that do not fit the line, with CO2 and occupancy having the most extreme values. In a normal distribution, roughly one in 370 observations can diverge from the mean by at least three standard deviations [54].
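For readers unfamiliar with Q-Q plots, the points behind such a plot can be computed directly: each sorted, standardized observation is paired with the standard normal quantile at the same rank. The sketch below is a generic illustration using only the Python standard library; the plotting position $(i - 0.5)/n$ is one common convention and is an assumption here, as the study does not state which one was used.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical quantile, standardized sample value) pairs for a normal Q-Q plot."""
    n = len(sample)
    mu, sigma = mean(sample), stdev(sample)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(sorted(sample), start=1):
        p = (i - 0.5) / n  # assumed plotting position for the i-th order statistic
        points.append((standard_normal.inv_cdf(p), (value - mu) / sigma))
    return points
```

If the data are close to normal, these pairs fall near the line y = x; points that bend away from the line at the ends indicate heavy tails or skew, as reported above for CO2 and occupancy.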

3.2.3. Data Correlation

Before feature selection, it is essential to identify predictors that correlate strongly with the predicted variable [14]. In this research, the Pearson product-moment correlation coefficient (PPMC) is used to determine the correlation (see Figure 5). Given a collection of paired (x, y) values, the PPMC measures the degree of linear dependence between the variables x and y on a scale from −1 to +1 [6]. Figure 5 depicts the computed PPMC values for a total of six variables, where 1 indicates a perfect correlation, 0.9 a strong correlation, and 0.8 a moderate correlation, down to 0.00 and −0.00 (green background), which denote a very weak correlation. Parameters that are not linked with the predicted variable at all or have a low correlation index are potentially excluded from the development of the prediction model via variable permutation significance. Furthermore, if two factors are strongly correlated, it is suggested that only one of them should be examined to simplify the model.
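As a minimal sketch of how a single PPMC value is obtained, the coefficient can be computed directly from its definition. The function name `pearson_r` and the CO2 and occupancy readings below are hypothetical and serve only as an illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

# Hypothetical readings: CO2 (ppm) and occupant count over six samples.
co2 = [420, 450, 610, 700, 830, 900]
occupants = [0, 0, 1, 2, 3, 3]
print(round(pearson_r(co2, occupants), 3))
```

Applying the same function to every pair of variables yields a correlation matrix of the kind shown in Figure 5.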

3.3. Feature Selection

Variable feature identification is critical in the development of machine learning models because it requires the removal of features with a low association before model training and evaluation. The variable significance measure is another important element of machine learning model development; it is employed to evaluate and eliminate uncorrelated variables [38]. According to the theory, the predictors $X = (X_1, \ldots, X_p)$ are an array of variables used to forecast the variable $Y$ [38]. In a regression setting, $\hat{f}$, the predictor of $Y$, is a function with values in $\mathbb{R}$. The error of $\hat{f}$ can be calculated as $R(\hat{f}) = E\left[(\hat{f}(X) - Y)^2\right]$, and the goal is to estimate the conditional expectation $f(x) = E[Y \mid X = x]$. Moreover, let $D_n = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}$ be a dataset of $n$ replications of $(X, Y)$, where $X_i = (X_{i1}, \ldots, X_{ip})$. Since the true prediction error of $\hat{f}$ is not known, it is estimated on the test dataset $\bar{D}$ as
$$\hat{R}(\hat{f}, \bar{D}) = \frac{1}{|\bar{D}|} \sum_{i:\,(X_i, Y_i) \in \bar{D}} \left(Y_i - \hat{f}(X_i)\right)^2$$
Variable permutation significance, proposed in [40], has proven effective for many non-linear predictors similar to the proposed model and was thus used in this research. The method treats a predictor $X_j$ as important for predicting $Y$ if disrupting the connection between $X_j$ and $Y$ increases the prediction error. The size of this increase shows the extent to which the model relies on that feature. This technique offers the advantage of being model-independent, and it can be repeated multiple times with different permutations of the feature [40]. To illustrate the method, the values of $X_j$ are permuted at random.
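The permutation importance can be computed as follows. For each tree $t = 1, \ldots, n_{tree}$, let $\bar{D}_n^t = D_n \setminus D_n^t$ be the out-of-bag sample, that is, the observations not included in the bootstrap sample $D_n^t$ used to grow tree $t$. Let $\{\bar{D}_n^{tj},\ t = 1, \ldots, n_{tree}\}$ denote the permuted out-of-bag samples obtained by randomly permuting the values of the $j$-th variable in every out-of-bag sample. The permutation importance of the variable $X_j$ is defined as
$$\hat{I}(X_j) = \frac{1}{n_{tree}} \sum_{t=1}^{n_{tree}} \left[ \hat{R}\left(\hat{f}_t, \bar{D}_n^{tj}\right) - \hat{R}\left(\hat{f}_t, \bar{D}_n^{t}\right) \right]$$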
The quantity $\hat{I}(X_j)$ is the empirical counterpart of the permutation importance measure $I(X_j)$ formalized by Zhu [55]. Let $X_{(j)} = (X_1, \ldots, X'_j, \ldots, X_p)$ be the random vector such that $X'_j$ is an independent replicate of $X_j$ that is also independent of $Y$ and of all other predictors. The measure is then given by
$$I(X_j) = E\left[\left(Y - f(X_{(j)})\right)^2\right] - E\left[\left(Y - f(X)\right)^2\right]$$
Permuting the values of $X_j$ in the expression for $\hat{I}(X_j)$ mimics the independent replicate $X'_j$, with the same distribution as $X_j$, that appears in the definition of $I(X_j)$.
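The idea behind the permutation measure can be sketched in a model-agnostic way. This simplified version averages the error increase over repeated shuffles of one evaluation set rather than over per-tree out-of-bag samples, so it illustrates the principle and is not the exact estimator described in this section. The names `mse` and `permutation_importance`, the toy data, and the stand-in model are assumptions.

```python
import random

def mse(predict, X, y):
    """Mean squared prediction error of `predict` on rows X with targets y."""
    return sum((yi - predict(row)) ** 2 for row, yi in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, j, n_repeats=20, seed=0):
    """Average increase in MSE after shuffling column j of X."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [row[j] for row in X]
        rng.shuffle(column)  # break the link between feature j and y
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        increases.append(mse(predict, X_perm, y) - baseline)
    return sum(increases) / n_repeats

# Toy data: y depends on feature 0 only; feature 1 is noise.
rng = random.Random(42)
X = [[rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(200)]
y = [2.0 * row[0] for row in X]
model = lambda row: 2.0 * row[0]  # stand-in for a fitted predictor

print(permutation_importance(model, X, y, j=0))  # clearly positive
print(permutation_importance(model, X, y, j=1))  # 0.0, the model ignores feature 1
```

A feature whose shuffling leaves the error unchanged contributes nothing to the prediction and is a candidate for removal.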

3.4. Model Development

During model development, datasets are typically divided into training and testing subsets, and ML techniques are applied to obtain solid predictive performance. This procedure is a simple and efficient way to examine the performance of various ML methods and to choose the best one for the prediction task. The procedure involves shuffling the dataset and splitting it in a 70:30 ratio (Figure 6). The model is fitted on the first portion, the training dataset. The second portion, the test dataset, is fed into the model as input to test and evaluate its predictions.
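The shuffle-and-split step can be sketched as follows. The function name `shuffle_split` and the fixed seed are assumptions made for reproducibility of the illustration.

```python
import random

def shuffle_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of `records` and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, test = shuffle_split(range(100))
print(len(train), len(test))  # 70 30
```

Shuffling before the split avoids a test set that covers only the last hours of the recording period.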

3.5. Proposed Flowchart

The IoT thermostat collects the occupancy number, sets the desired set point, and instructs the controller to maintain the desired set point to ensure that thermal comfort is kept within pre-agreed comfort bounds (see Figure 7). Our initial study in [29] proposed a novel adaptive controller for an HVAC system or IoT thermostat that can predict preferred temperature boundaries based on known environmental parameters (such as the level of indoor CO2, temperature, and humidity). Every five minutes, the controller requests the current indoor temperature and occupancy with the IoT thermostat to compare the current indoor temperature with the preferred temperature and decide whether to adjust the HVAC set point temperature or not (see Figure 7).
Usually, smart thermostats enable direct operation to control the on/off state without remote control, which can fairly be achieved by decreasing/increasing set points by 5 degrees from the preferred temperature, forcing it to turn on when the temperature goes below comfort bounds or off when the temperature goes above comfort bounds. The controller then notes the HVAC state to maintain comfort and satisfaction with the desired bound. Similar studies [3,14,56] chose 5 degrees as an adequate margin to deal with the deadband to keep room conditions until the next control interval. The procedure continues until the requested response period is over, at which time the controller returns the routinely desired comfort on the smart thermostat, enabling it to operate in the usual deadband-based set point in the following mode.
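One possible reading of this control step, for a cooling-only unit, is sketched below. The function name `next_set_point`, the 24 °C preferred temperature, and the three-branch rule are illustrative assumptions; the text specifies only the five-minute polling interval and the 5-degree margin.

```python
def next_set_point(indoor_temp, occupied, preferred_temp, margin=5.0):
    """Return the set point to send to a cooling-only thermostat.

    Assumed reading of the flowchart, not the authors' code:
    - room vacant: raise the set point by `margin` so cooling stays off;
    - room occupied and warmer than preferred: lower it by `margin`
      to force cooling on;
    - otherwise: leave the thermostat at the preferred temperature.
    """
    if not occupied:
        return preferred_temp + margin
    if indoor_temp > preferred_temp:
        return preferred_temp - margin
    return preferred_temp

CONTROL_INTERVAL_S = 5 * 60  # the controller polls every five minutes

# Hypothetical polled readings: (indoor temperature in deg C, occupied?)
for temp, occupied in [(29.0, False), (29.0, True), (23.5, True)]:
    print(temp, occupied, next_set_point(temp, occupied, preferred_temp=24.0))
```

In a deployment, this function would be called once per control interval with the latest thermostat reading and the occupancy prediction.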

3.6. Hardware and Software Layout

The adaptive controller proposed in this study is composed of three components, as shown in Figure 8. The first component is made up of sensors, installed within the perimeter of interest, that collect occupancy and environmental data. The readings are transmitted over a serial connection to an Arduino microcontroller, which parses them and sends them to the ThingSpeak IoT cloud via the internet using the Wi-Fi-based microcontroller ESP/011 (Version 1.0). The second component is ThingSpeak integrated with Matlab [57], where the fuzzy rule controllers were designed for the occupancy detection and estimation mechanisms. The third component is an intelligent controller that receives recommendations from ThingSpeak and decides whether to approve or reject the demand to switch lights and HVAC appliances on or off.
As presented in Figure 8, sensors such as ultrasonic sensors were installed at occupancy entry perpendicular to the area of interest to perform occupant body dimension analysis, such as occupant height prediction. Temperature and humidity sensors monitor indoor climate change in addition to CO2 concentrations to implement the control logic of the system. A lamp was connected with a photoresistor positioned about 1.5 m from the floor and equipped with a 10 W dimmer light bulb for energy consumption.

Candidate Model

Random Forest (RF) is considered to be the optimum candidate for the proposed model due to the nature of our data for both regression prediction and classification. The current implementation of this concept and recent advancements have demonstrated strong performance in many domains despite its simplicity. A powerful Python library called scikit-learn was used during model implementation. The summarized details of the library documentation can be found in [58].
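A minimal scikit-learn sketch of such a classifier is given below. The synthetic CO2, temperature, and humidity values, their ranges, and the hyperparameters are assumptions chosen for illustration; they are not the study's dataset or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sensor data (assumed ranges, not the study's dataset).
rng = np.random.default_rng(0)
n = 600
occupied = rng.integers(0, 2, size=n)                           # 0 = vacant, 1 = occupied
co2 = 450 + 300 * occupied + rng.normal(0, 60, size=n)          # ppm
temperature = 24 + 1.5 * occupied + rng.normal(0, 1.0, size=n)  # deg C
humidity = 50 + rng.normal(0, 5, size=n)                        # %, uninformative here
X = np.column_stack([co2, temperature, humidity])

# 70:30 split, stratified so both subsets keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, occupied, test_size=0.3, random_state=0, stratify=occupied
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.2f}")
print("feature importances:", model.feature_importances_.round(2))
```

With real sensor logs, the three synthetic columns would be replaced by the recorded variables and the label by the validated occupancy.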
To forecast the behavior defined by training data, the RF consists of a collection of decision trees, each of which applies a sequence of decision nodes from the root (parent) node down to a terminal (leaf) node [58]. Each node is a simple conditional rule that groups data samples with shared characteristics by comparing a sensor reading to a threshold. Bagging is performed for each decision tree [31]: more than half of the dataset samples are used for training the tree, and the remaining samples are used for assessing the efficiency of the prediction model. This means that although each RF tree learns from a different subset of the training data, all trees work toward the same aim.
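The bagging step can be illustrated with a single bootstrap draw. Sampling n records with replacement leaves each tree with roughly 63% of the distinct records (about 1 − 1/e), and the remainder form that tree's out-of-bag set. The function name `bootstrap_split` and the sample size are assumptions for this sketch.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample; return (in_bag_indices, out_of_bag_indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag

rng = random.Random(0)
n = 10_000
in_bag, oob = bootstrap_split(n, rng)
unique_fraction = len(set(in_bag)) / n
print(f"distinct samples seen by this tree: {unique_fraction:.3f}")
print(f"held out as out-of-bag:             {len(oob) / n:.3f}")
```

Because every tree receives a different draw, the out-of-bag records give each tree its own held-out evaluation set without a separate validation split.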

3.7. Evaluation Metrics

Usually, more than one metric is used to confirm the performance of the model across a range of new data. One such measure is the AUC, the area under the ROC curve, which aggregates performance across all possible classification thresholds. It can be interpreted as the probability that the model scores a randomly chosen positive sample higher than a randomly chosen negative sample.
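The probabilistic reading of the AUC can be checked by counting correctly ordered pairs. The function name `auc_by_pairs` and the scores below are hypothetical; the pairwise count is quadratic in the sample size and is meant as an illustration rather than an efficient implementation.

```python
def auc_by_pairs(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted occupancy probabilities and true labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(auc_by_pairs(scores, labels))  # 8 of 9 pairs are ordered correctly
```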
Precision is a measure for imbalanced two-class classification. It is the number of correct detections, known as true positives (TP), divided by the total number of detections, that is, the correct detections plus the incorrect detections, known as false positives (FP). The equation can be represented as
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall is also a measure for imbalanced two-class classification and is determined by dividing the number of correct detections by the number of correct detections plus the missed detections (false negatives, FN). The equation can be represented as
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
The F1-score is the harmonic mean of precision and recall. The equation can be represented as
$$\text{F1-Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
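The three metrics can be computed directly from confusion-matrix counts, as in the following sketch. The function name `precision_recall_f1` and the counts are hypothetical, and returning 0.0 for an empty denominator is a convention assumed here.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts: 80 correct detections, 20 false alarms, 10 misses.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because the F1-score is a harmonic mean, it always lies between the precision and the recall and is pulled toward the smaller of the two.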

3.8. Results of the RF Model

To obtain the results, the outputs from each tree are combined. The way the trees are combined influences how the model handles bias and uncertainty in its predictions. To determine whether a room is occupied or unoccupied, the binary classifier employs CO2 as a predictive variable.
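Combining the tree outputs can be illustrated with a simple majority vote. This is a simplified sketch: scikit-learn's random forest classifier averages the per-tree class probabilities rather than counting hard votes. The function name `majority_vote` and the tie-breaking rule are assumptions.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Combine per-tree class labels into one forest prediction."""
    counts = Counter(tree_predictions)
    top = max(counts.values())
    # Break ties deterministically by choosing the smallest tied label (an assumption).
    return min(label for label, c in counts.items() if c == top)

# Hypothetical votes from five trees for one sample (1 = occupied, 0 = vacant).
print(majority_vote([1, 0, 1, 1, 0]))  # 1
```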
The efficiency of the RF classifier for room occupancy detection is shown in Table 4. The score bin represents the recursive splitting of the dataset sample, which provides the set of decision rules used to fit the data.
The assessment was carried out to validate the efficacy of the suggested technique on fresh data, which is essential, particularly for systems that implement on/off control, where the model's efficacy can vary between historical and new data. As a result, the dataset recorded in Table 4 is divided into two bins consisting of the training and testing datasets. The F1 score of the classifier ranges from 75% to 88%, the accuracy from 74% to 86%, the precision from 70% to 99.6%, and the recall from 29% to 99%.

3.9. Comparative Analysis

The focus of this section is on the selection of benchmark datasets in the area of building occupancy prediction, as well as its evolution. In this section, we present the analysis of four widely used and published datasets of occupancy prediction alongside machine learning. These datasets were collected based on the performance metrics published by their authors, are easily accessible to researchers, and vary in the number of variables and the number of records. We have identified that several research works have used them to empirically evaluate the proposed approaches (see Table 5), selection, composition, and ranking. Before presenting the performance of our framework for building occupancy prediction, we need to perform a performance analysis of the four published datasets. Selecting the most appropriate dataset can enhance the efficiency of the proposed framework. Pearson correlation coefficient analysis is aimed at determining the relationship between predicting variables and predictors. Similarly, a normality test using a Q-Q plot was used to extract dataset features to reduce the parity in datasets. This section also contributes to the identification of candidate machine learning methods that can perform a solid prediction of building occupancy using random data.
A camera-based occupancy prediction model is proposed that employs passive infrared to predict indoor occupancy once all parameters have been measured [2]. Similarly, a renewed camera sensing approach is proposed to be used in inverse occupancy modeling [14]. These techniques employed background subtraction focusing on the area of interest (entry or exit door) to estimate indoor occupancy once the object pixel analysis is precisely determined. However, these techniques failed to check or validate the accuracy of captured or predicted occupancy, which might affect the quality of the training dataset, particularly when incorrect predictions are recorded in the training dataset. To accommodate for these flaws, the multimodal framework is proposed, which ensures that predicted occupancy is validated through a parametric classifier alongside the machine learning method.
As can be seen in Table 5, both studies [14] implement a similar approach, with the proposed study achieving a prediction accuracy of 89–99% and 76–99%, respectively. After applying the necessary steps to clean up the datasets provided by the authors, we achieved higher accuracy (99.6%) with our model. Furthermore, [29] proposed a similar approach (only sensors without camera), which achieved a prediction of 79–85%. Due to the inconsistency (outlier) in the dataset provided by the authors, our prediction model achieved the same result reported by the author. In addition, the accuracy of [29] is reduced when the occupancy number is higher than seven. In summary, our finding reveals that the recent literature is now giving more emphasis to the multimodal data fusion approach to improve the quality of the dataset, which is essential to verify the efficacy of the prediction model. The proposed approach also offers useful data related to occupancy profiles, which can be further explored to perform occupancy prediction via occupancy schedules regardless of building type (residential or commercial building).

4. Building Occupation and HVAC System Analysis

The primary goal of binary thermostats is to switch the HVAC system off when there is no one in the room. During our observation, when the space was empty, the air conditioner was turned off and the temperature rose. The study recorded the indoor temperature (see Figure 9) from 8:00 AM to 5:00 PM. From 11:00 AM to 2:00 PM, the temperature increased significantly, causing greater discomfort than in any other period.
The existing controller uses a fixed schedule to control HVAC operation, which makes it prone to false alarms, especially when occupants’ schedules or plans conflict with the control policy, which can result in unnecessary energy use. Our strategy uses occupancy prediction in the space to control HVAC operation, meaning that the HVAC system operates only in the event of room occupation. This indicates that the overall energy consumption is influenced by the room temperature and the rate at which the air conditioner operates to keep the room at a satisfactory comfort level.

4.1. Room Temperature Control

As can be seen in Figure 9, from 8:00 AM to 9:00 AM and 4:00 PM to 8:00 PM, the temperature is “Hot”. From 9:00 AM to 11:00 AM and 3:00 PM to 4:00 PM, it is considered to be “Very Hot”. From 11:00 AM to 3:00 PM, the temperature is “Extremely Hot” when the air conditioner is not operating.
The goal of the proposed controller at any given moment is to bring the temperature down as much as possible to ensure that the occupants’ comfort is within a satisfactory level. Figure 10 represents the state of the indoor temperature when the air conditioner is operating. As can be seen in Figure 10, the controller was able to maintain the room temperature between 16.5 °C and 20.5 °C for the entire operation of the air conditioning. This shows that the proposed controller adequately brings the indoor temperature to the desired comfort level except during very hot and extreme temperatures.

4.2. Thermal Comfort Analysis

This study uses a procedure for the temperature measurement of the indoor building using the approach used in [2]. To prevent radiation or emissions from objects or surfaces, the logger is positioned in the middle of the room and left hanging freely. Data loggers were also placed on posts and walls so that the devices were stacked on top of each other. The data logger was fixed at 2 m high on a pillar or wall with double-sided tape to prevent it from falling. The data logger is installed at a sufficient distance from the sun, roofs, and windows to avoid sources of heat, radiation, and cold (see Figure 11). When installed in the building, the loggers are set to record temperatures in 30 s intervals.
This configuration enables our research design to include on-site interviews with occupants on their experiences of thermal comfort in different temperature settings, based on a questionnaire designed for occupants of different nationalities (Malaysian, Saudi Arabian, Indonesian, and Nigerian) using ISO 7730, ISO 10551, and ISO 8996 [59,60] as guidelines. The occupants are adults aged between 24 and 60 years. The designed questionnaire captured subjective variables and the perception of thermal sensation [61]. The seven-point scale of ISO 7730 [29] (very hot, hot, warm, neutral, cool, cold, and very cold) was adopted after the question “How are you feeling now?” and Rayman software (Version 1.2) was utilized to compute PET based on the occupants’ gathered feedback.
For the duration of the study, 184 questionnaires were deemed legitimate. Table 6 shows the average PET values for the Malaysian, Saudi Arabian, Indonesian, and Nigerian participants.
Figure 12 demonstrates that a higher percentage (55%) of the Malaysian and Indonesian participants believe the weather is neutral in the PET within the range of 16 °C, while most of the participants (50%) of Saudi Arabian and Nigerian nationality believe the weather is cold at the same temperature level (see Figure 13). This indicates that Malaysians and Indonesians appear to have a higher tolerance for cold and lower PET readings. The majority of the nationalities report feeling neutral at temperatures between 16 °C and 24 °C (see Figure 12 and Figure 13). Furthermore, thermal perceptions of the temperature are quite comparable in the 20 °C–24 °C PET range for all nationalities, indicating that weather is hot when PET levels are above 24 °C.
In addition, the PET index was calibrated based on the conclusions of the ordinal logistic regression, which demonstrated the range correlation of the various thermal sensations of the participants. This conclusion tells us that once the temperature levels fall very low, the PET values become “Very Cold”, a range that could not be assigned to any nationality. The upper bound of the “Cold” range is the same for all nationalities. However, the Nigerians had a narrower “Cool” range than the Saudi Arabians, while the Malaysians and Indonesians had a broader “Neutral” range. It was difficult to pinpoint the “Warm” range for Malaysians or Indonesians; however, all the participants experienced the same phenomenon, reporting feeling “Hot” and “Very hot” as the temperature rose.

4.3. Energy Consumption

The cost of electricity does not have much impact on energy usage in this setting; therefore, the room temperature is the major factor considered for the controller setting which is required to keep the thermostat operating to keep the area comfortable, or at least closer to the acceptable comfort level. This also applied to peak energy demand hours; the compressor stayed active until the area’s temperature stabilized to the desired level as specified on the thermostat. After reaching the appropriate temperature, the compressor becomes inactive. This process will continue in a loop as long as the room is occupied. This process determines the cycle of the air conditioner.
The cycle time of an air conditioner is the amount of time it takes to run it in order to maintain an area at the intended temperature. As shown in Figure 14, the compressor works for considerably extended amounts of time to reduce the area’s temperature, increasing the cycle time.
The energy usage of the proposed system is 38.708 kW, compared to 39.159 kW for the existing approach. The energy consumption of the HVAC systems of the proposed controller under similar temperature settings in [30] is lower than in the existing study. Because the energy usage of the systems could not be postponed for extended periods to allow consumers to use them during peak hours, the proposed approach optimally adjusts the set point temperature through a feedback loop. Therefore, subtracting the energy consumed by the proposed controller (38.708 kW) from the energy consumed by the existing controller (39.159 kW), the difference is 0.451 kW, as indicated in Figure 14. Therefore, the proposed approach saves more than 45% of energy consumption in comparison to the previous approach.

5. Discussion

In a smart building, occupancy prediction can help with demand control ventilation strategies to trade-off between energy use and thermal comfort. Because of the importance of occupancy privacy, particularly in residential buildings, many of the suggested occupancy prediction techniques that make use of intrusive technology like wearable devices, cameras, and Wi-Fi are not realistically appropriate to deploy in residential settings due to privacy concerns. The recent environmental sensing technique attracted a lot of attention as a result. Unfortunately, as shown in the literature, environmental sensing performs relatively poorly because of the subpar training dataset, weak feature correlation between predictors and predicting variables, and improper ML approach selection in the prediction model.
Occupancy prediction: This study suggested an approach for predicting indoor occupancy by using data from several sensor streams that have a significant correlation with building occupancy. The proposed approach was trained and tested with a prototype system for performance assessment. Even though the proposed prototype can perform well in predictions using a variety of ML methods, the random forest method was selected due to its flexibility and efficiency in handling challenges related to both regression and classification approaches. The result of the assessment indicates that the proposed approach yields solid performance in comparison with the overall performance of previous approaches. Furthermore, the results show that adding more variable parameters with a high correlation with predicting variables can make a significant improvement in reliable occupancy prediction in contrast to employing a single-variable parameter or directly using sensor data. User interaction during data collection to ensure that data from environmental sensors matches with the corresponding data recorded by the camera is one of the major challenges of the proposed approach, which, if not carefully handled, can affect the quality of the dataset and the overall performance of the prediction. This challenge can be further tackled in many ways, such as camera fusion (to make sure that occupancy numbers captured by multiple cameras are equal to each other; otherwise, the record is discarded). It can also be enhanced through a validation classifier that employs machine learning or deep learning to ensure that only matched and valid corresponding records are captured in the training dataset instead of the parametric classifier (if–else classifier) proposed in this study.
Feature selection: Feature selection is one of the critical aspects of the proposed prediction approach because it identifies the best set of features for building an optimized occupancy prediction model. Because the data collection was conducted in a residential building, the time of occupant departure and arrival became irrelevant to occupancy prediction. At this stage of feature selection, this study identified a variable (time) with a weak correlation and eliminated it before the dataset sample was added to the model for assessment. Therefore, the proposed approach cannot predict building occupancy based on schedule. Providing precise occupancy schedules for building energy modeling is crucial for energy conservation. A deep learning approach can be explored to include reference occupancy schedules (including sets that deviate greatly from the real occupancy changes) in establishing occupancy variation patterns, which would help to make reliable indoor occupancy predictions even when occupancy schedules are changed.
Generalize thermal comfort bounds: To manage building energy usage and provide a suitable indoor atmosphere, suitable thermal comfort is an important aspect to consider. Yet since it depends on individual traits and indoor settings, it is quite challenging to determine suitable thermal comfort effectively. Furthermore, gathering datasets of individual and indoor environmental variables might be impractical at times and difficult in terms of resources and effort. For this reason, this study design focused on configurations that enable on-site interviews with occupants based on their experiences of thermal comfort in different temperature settings using the designed questionnaire for occupants of different nationalities. The findings reveal comfort margins or ranges that are suitable to maintain acceptable comfort depending on nationality. To provide a more general approach to ensure that HVAC controllers always maintain an acceptable comfort level with respect to nationality, it is essential to design a prediction model for personal thermal comfort via transfer learning to transfer information from datasets of individuals in various indoor and thermal conditions, even in situations when the target subject’s physiological and environmental data are insufficient. This can be achieved through wearable wristbands and sensors to gather each subject’s physiological data as well as information about the indoor environment. Then, the datasets can be used to create a pre-trained model by fusing machine learning and deep learning methods. The transfer learning approach may be used to address the poor generalization performance that resulted from inadequate datasets for each targeted subject based on the pre-trained model.
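The camera fusion rule mentioned in the occupancy prediction discussion above, in which a record is kept only when all cameras report the same count, can be sketched as follows. The record layout, the field names `camera_counts` and `occupancy`, and the function name `fuse_camera_counts` are hypothetical.

```python
def fuse_camera_counts(records):
    """Keep only records whose camera counts all agree; attach the agreed count."""
    kept = []
    for record in records:
        counts = record["camera_counts"]
        if counts and len(set(counts)) == 1:
            kept.append({**record, "occupancy": counts[0]})
    return kept

# Hypothetical records: one sensor reading plus counts from two cameras.
records = [
    {"co2": 640, "camera_counts": [2, 2]},
    {"co2": 710, "camera_counts": [3, 2]},  # cameras disagree: discarded
    {"co2": 430, "camera_counts": [0, 0]},
]
fused = fuse_camera_counts(records)
print(fused)
```

Discarding disagreeing records trades dataset size for label quality, which matters most when the labels are used to train the prediction model.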

6. Conclusions

Building occupancy modeling and prediction have been primary areas of building energy efficiency research for the past decade. Even though camera-based occupancy and other intrusive technologies have shown solid performance in predicting building occupancy, their adoption in residential buildings has declined recently due to privacy concerns. Occupancy overlapping is one of the critical research obstacles presently faced by most of the previous approaches, which affects the accuracy of the prediction model and, in turn, affects the performance of HVAC operation, thereby creating discomfort. For this reason, this research focused on an environmental sensing (non-intrusive) approach for predicting room occupancy. The output of the prediction is used as one of the input parameters to the smart controller to handle HVAC operation under three different settings depending on the occupants’ thermal comfort requirements. To test the efficiency of the proposed model, a prototype was designed and coupled to collect occupancy-related data in the building for model training. A random forest regressor was used to predict the building occupancy. To test and validate the proposed HVAC controller against energy consumption and overall occupant thermal satisfaction, a questionnaire was designed that would help to determine the ideal thermal perception of participants from different countries.
Even though the response to thermal perception can be influenced by psychological and sociocultural factors, which may vary or fluctuate based on thermal adaptation, variations in the PET calibration across participants from different countries demonstrate how sociocultural traits affect the thermal perception of the individual. For Malaysian and Indonesian nationals, the PET neutral comfort value range starts from 16 °C, which is considered cold for Nigerians and Saudi Arabians. This demonstrates a variation in the comfort zone of around 4 °C when comparing thermal perception. This reveals that an ideal PET calibration is not practical and that the calibration must be tailored to the specific climatic zone. Therefore, it is crucial to comprehend the social and psychological elements that impact how people perceive their surroundings when designing HVAC controllers, assessing comfort levels, and making decisions related to urban development to lessen the effects of discomfort.
Finally, the experimental results show that the proposed controller can operate HVAC systems to maintain satisfactory comfort and save up to 45% energy in comparison with the previous approach. Future works can investigate the integration of several deep learning methods with the proposed framework for enhancing the detection of building occupancy.

Author Contributions

All authors contributed to the research as follows: conceptualization—M.A.A., B.S.A. and Y.A.D.; methodology—B.S.A., M.A.A., A.M. and M.S.A.; formal analysis—Y.A.D., M.S.A. and A.M.; resources—A.M., M.A.A. and B.S.A.; writing—B.S.A., M.S.A. and Y.A.D.; original draft preparation—M.A.A., M.S.A., Y.A.D. and A.M. All authors have read and agreed to the published version of the manuscript.

Funding

The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Distinguished Research Funding program, grant code (NU/DRP/SERC/12/2).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Conflicts of Interest

The authors declare that they have no competing interests.

References

  1. Aliero, M.S.; Qureshi, K.N.; Pasha, M.F.; Jeon, G. Smart Home Energy Management Systems in Internet of Things networks for green cities demands and services. Environ. Technol. Innov. 2021, 22, 101443.
  2. Aliero, M.S.; Pasha, M.F.; Smith, D.T.; Ghani, I.; Asif, M.; Jeong, S.R.; Samuel, M. Non-Intrusive Room Occupancy Prediction Performance Analysis Using Different Machine Learning Techniques. Energies 2022, 15, 9231.
  3. Cao, N.; Ting, J.; Sen, S.; Raychowdhury, A. Smart Sensing for HVAC Control: Collaborative Intelligence in Optical and IR Cameras. IEEE Trans. Ind. Electron. 2018, 65, 9785–9794.
  4. Jin, X.; Wang, G.; Song, Y.; Sun, C. Smart building energy management based on network occupancy sensing. J. Int. Counc. Electr. Eng. 2018, 8, 30–36.
  5. Sadikoglu-Asan, H. ‘User-Home relationship’ regarding user experience of smart home products. Intell. Build. Int. 2020, 14, 114–130.
  6. Oliveira, E.d.L.; Alfaia, R.D.; Souto, A.V.F.; Silva, M.S.; Francês, C.R.L.; Vijaykumar, N.L. SmartCoM: Smart Consumption Management Architecture for Providing a User-Friendly Smart Home based on Metering and Computational Intelligence. J. Microw. Optoelectron. Electromagn. Appl. 2017, 16, 736–755.
  7. Baig, F.; Mahmood, A.; Javaid, N.; Razzaq, S.; Khan, N.; Saleem, Z. Smart home energy management system for monitoring and scheduling of home appliances using zigbee. J. Basic. Appl. Sci. Res. 2013, 3, 880–891.
  8. Shin, J.; Park, Y.; Lee, D. Who will be smart home users? An analysis of adoption and diffusion of smart homes. Technol. Forecast. Soc. Chang. 2018, 134, 246–253.
  9. Moreno, M.V.; Zamora, M.A.; Skarmeta, A.F. User-centric smart buildings for energy sustainable smart cities. Trans. Emerg. Telecommun. Technol. 2014, 25, 41–55.
  10. Na, H.; Choi, J.-H.; Kim, H.; Kim, T. Development of a human metabolic rate prediction model based on the use of Kinect-camera generated visual data-driven approaches. Build. Environ. 2019, 160, 106216.
  11. Tekler, Z.D.; Low, R.; Yuen, C.; Blessing, L. Plug-Mate: An IoT-based Occupancy-Driven Plug Load Management System in Smart Buildings. Build. Environ. 2022, 223, 109472.
  12. Filippoupolitis, A.; Oliff, W.; Loukas, G. Bluetooth Low Energy based Occupancy Detection for Emergency Management. In Proceedings of the 2016 15th International Conference on Ubiquitous Computing and Communications, Granada, Spain, 14–16 December 2016.
  13. Tekler, Z.D.; Chong, A. Occupancy prediction using deep learning approaches across multiple space types: A minimum sensing strategy. Build. Environ. 2022, 226, 109689.
  14. Aliero, M.S.; Pasha, M.F.; Toosi, A.N.; Ghani, I. The COVID-19 impact on air condition usage: A shift towards residential energy saving. Environ. Sci. Pollut. Res. Int. 2022, 29, 85727–85741.
  15. Abade, B.; Perez Abreu, D.; Curado, M. A Non-Intrusive Approach for Indoor Occupancy Detection in Smart Environments. Sensors 2018, 18, 3953.
  16. Brennan, C.; Taylor, G.W.; Spachos, P. Designing Learned CO2-based Occupancy Estimation in Smart Buildings. IET Res. J. 2015, 8, 1–7.
  17. Hänninen, O.; Canha, N.; Kulinkina, A.V.; Dume, I.; Deliu, A.; Mataj, E.; Lusati, A.; Krzyzanowski, M.; Egorov, A.I. Analysis of CO2 monitoring data demonstrates poor ventilation rates in Albanian schools during the cold season. Air Qual. Atmos. Health 2017, 10, 773–782.
  18. Zhai, S.; Wang, Z.; Yan, X.; He, G. Appliance Flexibility Analysis Considering User Behavior in Home Energy Management System Using Smart Plugs. IEEE Trans. Ind. Electron. 2019, 66, 1391–1401.
  19. Thomas, A.M.; Moore, P.; Shah, H.; Evans, C.; Sharma, M.; Xhafa, F.; Mount, S.; Pham, H.V.; Wilcox, A.J.; Patel, A. Smart care spaces: Needs for intelligent at-home care. Int. J. Space-Based Situated Comput. 2013, 3, 35–44.
  20. Zuraimi, M.S.; Pantazaras, A.; Chaturvedi, K.A.; Yang, J.J.; Tham, K.W.; Lee, S.E. Predicting occupancy counts using physical and statistical CO2-based modeling methodologies. Build. Environ. 2017, 123, 517–528.
  21. Wei, P.; Ning, Z.; Ye, S.; Sun, L.; Yang, F.; Wong, K.C.; Westerdahl, D.; Louie, P.K.K. Impact Analysis of Temperature and Humidity Conditions on Electrochemical Sensor Response in Ambient Air Quality Monitoring. Sensors 2018, 18, 59.
  22. Yazici, M.; Ceylan, O.; Shafique, A.; Abbasi, S.; Galioglu, A.; Gurbuz, Y. A new high dynamic range ROIC with smart light intensity control unit. Infrared Phys. Technol. 2017, 82, 161–169.
  23. Aliero, M.S.; Asif, M.; Ghani, I.; Pasha, M.F.; Jeong, S.R. Systematic Review Analysis on Smart Building: Challenges and Opportunities. Sustainability 2022, 14, 3009.
  24. Ahmad, J.; Larijani, H.; Emmanuel, R.; Mannion, M. Occupancy detection in non-residential buildings—A survey and novel privacy preserved occupancy monitoring solution. Appl. Comput. Inform. 2021, 17, 279–295.
  25. Balaji, B.; Xu, J.; Nwokafor, A.; Gupta, R.; Agarwal, Y. Sentinel: Occupancy Based HVAC Actuation using Existing WiFi Infrastructure within Commercial Buildings. In Proceedings of the SenSys 13: The 11th ACM Conference on Embedded Networked Sensor Systems, Rome, Italy, 11–15 November 2013.
  26. Castro, D.; Coral, W.; Rodriguez, C.; Cabra, J.; Colorado, J. Wearable-Based Human Activity Recognition Using an IoT Approach. J. Sens. Actuator Netw. 2017, 6, 28.
  27. Low, R.; Tekler, Z.D.; Cheah, L. An End-to-End Point of Interest (POI) Conflation Framework. ISPRS Int. J. Geo-Inf. 2021, 10, 779.
  28. Amayri, M.; Ploix, S.; Bouguila, N.; Wurtz, F. Database quality assessment for interactive learning: Application to occupancy estimation. Energy Build. 2020, 209, 109578.
  29. Abuhussain, M.A.; Alotaibi, B.S.; Aliero, M.S.; Asif, M.; Alshenaifi, M.A.; Dodo, Y.A. Adaptive HVAC System Based on Fuzzy Controller Approach. Appl. Sci. 2023, 13, 11354.
  30. Talebi, A.; Hatami, A. Online fuzzy control of HVAC systems considering demand response and users’ comfort. Energy Sources Part B Econ. Plan. Policy 2020, 15, 403–422.
  31. Candanedo, L.M.; Feldheim, V. Accurate occupancy detection of an office room from light, temperature, humidity and CO2 measurements using statistical learning models. Energy Build. 2016, 112, 28–39.
  32. Aftab, M.; Chen, C.; Chau, C.-K.; Rahwan, T. Automatic HVAC control with real-time occupancy recognition and simulation-guided model predictive control in low-cost embedded system. Energy Build. 2017, 154, 141–156.
  33. Lu, S.; Wang, W.; Wang, S.; Cochran Hameen, E. Thermal Comfort-Based Personalized Models with Non-Intrusive Sensing Technique in Office Buildings. Appl. Sci. 2019, 9, 1768.
  34. Guo, J.; Amayri, M.; Najar, F.; Fan, W.; Bouguila, N. Occupancy estimation in smart buildings using predictive modeling in imbalanced domains. J. Ambient. Intell. Humaniz. Comput. 2022, 14, 10917–10929.
  35. Huang, C.-C.; Yang, R.; Newman, M.W. The potential and challenges of inferring thermal comfort at home using commodity sensors. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing—UbiComp ’15, Osaka, Japan, 7–11 November 2015; pp. 1089–1100.
  36. Huang, Q. Occupancy-Driven Energy-Efficient Buildings Using Audio Processing with Background Sound Cancellation. Buildings 2018, 8, 78.
  37. Yang, D.; Xu, B.; Rao, K.; Sheng, W. Passive Infrared (PIR)-Based Indoor Position Tracking for Smart Homes Using Accessibility Maps and A-Star Algorithm. Sensors 2018, 18, 332.
  38. O’Neill, Z.D.; Li, Y.; Cheng, H.C.; Zhou, X.; Taylor, S.T. Energy savings and ventilation performance from CO2-based demand controlled ventilation: Simulation results from ASHRAE RP-1747 (ASHRAE RP-1747). Sci. Technol. Built Environ. 2019, 26, 257–281.
  39. Sisco, M.R.; Bosetti, V.; Weber, E.U. When do extreme weather events generate attention to climate change? Clim. Chang. 2017, 143, 227–241.
  40. Elkhoukhi, H.; NaitMalek, Y.; Bakhouya, M.; Berouine, A.; Kharbouch, A.; Lachhab, F.; Hanifi, M.; El Ouadghiri, D.; Essaaidi, M. A platform architecture for occupancy detection using stream processing and machine learning approaches. Concurr. Comput. Pract. Exp. 2019, 32, e5651.
  41. Sun, K.; Qaisar, I.; Khan, M.A.; Xing, T.; Zhao, Q. Building occupancy number prediction: A Transformer approach. Build. Environ. 2023, 244, 110807.
  42. Wang, F.; Feng, Q.; Chen, Z.; Zhao, Q.; Cheng, Z.; Zou, J.; Zhang, Y.; Mai, J.; Li, Y.; Reeve, H. Predictive control of indoor environment using occupant number detected by video data and CO2 concentration. Energy Build. 2017, 145, 155–162.
  43. Jung, W.; Jazizadeh, F. Human-in-the-loop HVAC operations: A quantitative review on occupancy, comfort, and energy-efficiency dimensions. Appl. Energy 2019, 239, 1471–1508.
  44. Fayed, N.S.; Abu-Elkheir, M.; El-Daydamony, E.M.; Atwan, A. Sensor-based occupancy detection using neutrosophic features fusion. Heliyon 2019, 5, e02450.
  45. Maschi, L.F.C.; Pinto, A.S.R.; Meneguette, R.I.; Baldassin, A. Data Summarization in the Node by Parameters (DSNP): Local Data Fusion in an IoT Environment. Sensors 2018, 18, 799.
  46. Roselyn, J.P.; Uthra, R.A.; Raj, A.; Devaraj, D.; Bharadwaj, P.; Krishna Kaki, S.V.D. Development and implementation of novel sensor fusion algorithm for occupancy detection and automation in energy efficient buildings. Sustain. Cities Soc. 2019, 44, 85–98.
  47. Zhang, Z.; Wang, J.; Zhong, H.; Ma, H. Optimal scheduling model for smart home energy management system based on the fusion algorithm of harmony search algorithm and particle swarm optimization algorithm. Sci. Technol. Built Environ. 2019, 26, 42–51.
  48. Choo, Y.-Y.; Ha, Y.-J.; Kim, Y.-B.; Lee, S.-J.; Choi, H.-D. Development of CoAP-based IoT Communication System for Smart Energy Storage System. In Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control—ISCSIC’18, Stockholm, Sweden, 21–23 September 2018; pp. 1–5.
  49. Fan, G.; Xie, J.; Yoshino, H.; Yanagi, U.; Hasegawa, K.; Wang, C.; Zhang, X.; Liu, J. Investigation of indoor thermal environment in the homes with elderly people during heating season in Beijing, China. Build. Environ. 2017, 126, 288–303.
  50. Chen, Z.; Masood, M.K.; Soh, Y.C. A fusion framework for occupancy estimation in office buildings based on environmental sensor data. Energy Build. 2016, 133, 790–798.
  51. Chung, M.; Kim, J. The Internet Information and Technology Research Directions based on the Fourth Industrial Revolution. KSII Trans. Internet Inf. Syst. 2016, 10, 1311–1320.
  52. Taştan, M. Internet of Things based Smart Energy Management for Smart Home. KSII Trans. Internet Inf. Syst. 2019, 13, 18.
  53. Gregorutti, B.; Michel, B.; Saint-Pierre, P. Correlation and variable importance in random forests. Stat. Comput. 2016, 27, 659–667.
  54. Zittis, G. Observed rainfall trends and precipitation uncertainty in the vicinity of the Mediterranean, Middle East and North Africa. Theor. Appl. Climatol. 2017, 134, 1207–1230.
  55. Zhu, R.; Zeng, D.; Kosorok, M.R. Reinforcement Learning Trees. J. Am. Stat. Assoc. 2015, 110, 1770–1784.
  56. Adhikari, R.; Pipattanasomporn, M.; Rahman, S. An algorithm for optimal management of aggregated HVAC power demand using smart thermostats. Appl. Energy 2018, 217, 166–177.
  57. ThingSpeak and Matlab Simulation. Available online: https://www.mathworks.com/products/thingspeak.html (accessed on 10 May 2024).
  58. The Python Standard Library. 2021. Available online: https://docs.python.org/3/library/index.html (accessed on 10 May 2024).
  59. Olesen, B.W.; Parsons, K.C. Introduction to thermal comfort standards and to the proposed new version of EN ISO 7730. Energy Build. 2002, 34, 537–548.
  60. Parsons, K.C. ISO Standards and Thermal Comfort: Recent Developments. In ISO Standards and Thermal Comfort; Taylor & Francis Group: Abingdon, UK, 1995.
  61. Schaudienst, F.; Vogdt, F.U. Fanger’s model of thermal comfort: A model suitable just for men? Energy Procedia 2017, 132, 129–134.
Figure 1. Research methodology.
Figure 2. The proposed framework for occupancy prediction.
Figure 3. Temperature, humidity, and light.
Figure 4. Occupancy, CO2, and humidity ratio.
Figure 5. Variable correlation values.
Figure 6. The ratio of training and testing datasets.
Figure 7. Flowchart of the proposed controller.
Figure 8. Proposed smart controller.
Figure 9. Room temperature with air conditioning off.
Figure 10. Proposed smart controller.
Figure 11. Installation of sensors for indoor temperature measurement.
Figure 12. Thermal perception of Malaysian and Indonesian participants.
Figure 13. Thermal perception of Saudi Arabian and Nigerian participants.
Figure 14. Comparison of energy consumption.
Table 1. Summary of the existing literature.
| Method | Technology Used | Result Reported | Technological Challenge | Research Challenge | Opportunity Offered |
| --- | --- | --- | --- | --- | --- |
| Camera | | | | | |
| [3] | Optical and infrared cameras | 65% prediction accuracy and 40% energy saving | The object should be within a range of 5 m in a straight line; sensitivity; dark/night scene limitation. | Prone to overlap/being covered by an obstacle; no feature extraction or classification. | The model can be simply improved to recognize human occupancy through their indoor behavior or activities. Availability of datasets for research communities. |
| [32] | Camera and door counter | 85% prediction accuracy and 30% energy saving | An object should be within an area of interest (5 m max); poor quality in dark/night scenes. | Losing track of an occupant when exiting in a different entry; occupants overlap partly (pixels features analysis with GNB). | Support sensor fusion mechanism for multimodal data collection. The number of occupants does not affect the reliability of the model. |
| [33] | PIR, camera | 90% prediction accuracy and 10% energy saving | Object should be within the area of interest; poor quality in dark/night space even with light on. | Required −1 min video every 15 min interval; false positive if the object remains idle for both camera and PIR. | Good choice for human detection, availability of the dataset, and support for different ML algorithms. |
| [29] | Camera and environmental sensors | Up to 79–99% prediction accuracy; 50% energy saving | Sensor reading takes up to 15 min on average to stabilize the room before correct prediction. Privacy and computational power challenges. | The approach is effective when occupancy in the building is not more than seven. Sensitive to false positive prediction when door or window is open. | The approach was able to maintain the desire for healthy comfort when occupants chose to balance energy consumption with thermal comfort. Can be simply integrated with several controls. |
| [34] | Camera and environmental sensors | Up to 95% prediction accuracy; 25% energy saving | Sensor reading takes up to 15 min on average to stabilize the room before correct prediction. Privacy and computational power challenges. | Poor prediction performance when deployed in a chemical laboratory. | The number of occupants in the building cannot affect the prediction performance. |
| Audio processing | | | | | |
| [35] | PIR audio sensor | 50% prediction accuracy; 26% energy saving | Affected by external noise; occupants must be close to the microphone; false result in the absence of speech. | High false positive rate when occupant number tends to grow; 25 s of continuous speech is required; false result from PIR when idle for a max of 30 min. | Less computational resources are required compared with the camera approach. Suitable for both residential and commercial buildings. |
| [36] | PIR audio sensor | Prediction accuracy improves by 12% and energy saving by 3.4% | Device background noise cancelation is not effective; the occupant must be close to the microphone, with false results in the absence of speech. | High false positive rate is observed when occupants increase; 25 s of speech is required and background noise is partly addressed. | Provides more accurate prediction through noise cancelation. CO2 sensors can be integrated to easily verify occupancy number during data collection. |
| Passive infrared sensors | | | | | |
| [37] | PIR, IR FPGA, CO2 sensors | 97% prediction accuracy; 30% energy saving | Partially does not support human detection; false result in the absence of motion for a max of 30 min. | It takes time to populate room space; 1000–1500 ppm is maximum concentration. | CO2 partially supports human detection; lack of availability of template or dataset for training, and supports few algorithms. |
| CO2 concentration | | | | | |
| [17] | PIR, CO2 sensors | 80% prediction accuracy; 62% energy saving | Not practical for occupancy prediction; false result in the absence of motion for a max of 30 min. | Error in reporting the number of occupants. | The study supports the ML technique and sensor fusion mechanism to provide more accurate occupant data during data collection to minimize incorrect readings from PIR. |
| [38] | CO2 sensor | 50% prediction accuracy; 33% energy saving | Partly supports human detection. | Error in reporting the number of occupants. | The technique is suitable in spaces with less occupancy turnover such as offices or labs. |
| [39] | CO2 sensor | 21% energy saving | Cannot be used in multipurpose halls such as lecture theaters. | Prone to false prediction. | The proposed approach can be deployed in both commercial and residential building types. |
| [16] | CO2 and camera | 97% prediction accuracy; 30% energy saving | Requires object to be in close range in a straight line. | Prone to false result; no background subtraction; error in reporting the number of occupants in the room. | Approach can support sensor fusion mechanism when ML techniques are used. |
| [40] | CO2 and light sensor | 60% prediction accuracy; 30% energy saving | Does not support human detection. | Prone to false results as a light sensor can be covered by any object. | Approach is suitable in both commercial and residential buildings. ML technique can be integrated to improve the data collection process. |
| Environmental sensing | | | | | |
| [41] | Environmental sensors | Up to 98% prediction accuracy and more than 30% energy saving | Sensor reading takes an average of 15 min to stabilize the room for accurate prediction. | The prediction accuracy reduces when room occupancy grows to larger than seven. | Ensures occupancy prediction throughout the prediction process. Other IoT networks can be simply integrated for sensor fusion. |
Table 2. Room thermo-physical properties.
PropertiesMaterialc (J/Kg·K)(W/m·K)Thickness (cm)
Tuff6501.510
WallBrick10000.1118
Polystyrene16000.0288
Concrete6500.43
Stoneware flooring6501.251.3
Ground FloorIgloo6500.078
Gravel 1.11
Screed: ordinary concrete65015
Hollow-core concrete6500.725
CeilingXPS polystyrene panel6500.48
Brick tuff6500.55
Table 3. Sensors used in this study.
| Sensor | Description | Ambiguity | Unit | Record |
| --- | --- | --- | --- | --- |
| Temperature | Measure indoor temperature | 1 °C | Degree Celsius | 60 s interval |
| Relative Humidity | Measure indoor relative humidity | ±5% | Percentage | 60 s interval |
| CO2 | Measure indoor CO2 concentration level | 300–1000 ppm: ±120 ppm | Parts per million (ppm) | 60 s interval |
| Light | Measure illuminance in the building | 10–2000 lux range | Lux | 60 s interval |
Table 4. RF model prediction using binary classification using CO2 data.
| Score Bin | Cumulative AUC | F1 Score | Precision | Recall | Negative Precision | Negative Recall | Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (0.900, 1.000) | 0.001 | 0.813 | 0.983 | 0.719 | 0.792 | 0.999 | 0.837 |
| (0.800, 0.900) | 0.013 | 0.867 | 0.999 | 0.741 | 0.721 | 0.982 | 0.849 |
| (0.700, 0.800) | 0.027 | 0.821 | 0.965 | 0.756 | 0.732 | 0.964 | 0.855 |
| (0.600, 0.700) | 0.030 | 0.823 | 0.976 | 0.773 | 0.753 | 0.960 | 0.865 |
| (0.500, 0.600) | 0.047 | 0.881 | 0.932 | 0.784 | 0.704 | 0.940 | 0.867 |
| (0.400, 0.500) | 0.073 | 0.861 | 0.926 | 0.788 | 0.778 | 0.908 | 0.860 |
| (0.300, 0.400) | 0.169 | 0.845 | 0.865 | 0.804 | 0.837 | 0.793 | 0.833 |
| (0.200, 0.300) | 0.364 | 0.877 | 0.763 | 0.936 | 0.912 | 0.579 | 0.808 |
| (0.100, 0.200) | 0.644 | 0.789 | 0.665 | 1.000 | 1.000 | 0.297 | 0.706 |
| (0.000, 0.100) | 0.941 | 0.754 | 0.583 | 1.000 | 1.000 | 0.000 | 0.583 |
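The column names in Table 4 correspond to standard binary-classification metrics. The following is a minimal Python sketch of how such metrics can be computed from predicted scores at a chosen decision threshold. It is not the authors' code: the function name binary_metrics, the toy labels and scores, the reading of each score bin as a threshold, and the convention of returning 0.0 for an undefined ratio are all assumptions made for illustration.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], scores: List[float], threshold: float) -> Dict[str, float]:
    """Metrics when every score >= threshold is predicted as occupied (1)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: float, den: float) -> float:
        # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "negative_precision": ratio(tn, tn + fn),  # share of "vacant" predictions that are correct
        "negative_recall": ratio(tn, tn + fp),     # share of truly vacant samples that are found
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical toy data, not taken from the study's dataset.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.95, 0.80, 0.40, 0.60, 0.20, 0.70, 0.10, 0.30]

for threshold in (0.9, 0.5, 0.0):
    print(threshold, binary_metrics(y_true, scores, threshold))
```

In this toy example, lowering the threshold to 0.0 labels every sample as occupied, so recall reaches 1 and negative recall falls to 0, which is the same pattern seen in the last row of Table 4.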
Table 5. Benchmark databases extracted from the literature.
| Approach | Technologies | Technique | Accuracy (%) |
| --- | --- | --- | --- |
| [2] | Camera and sensors | Machine Learning | 89–99 |
| [29] | Sensors | Machine Learning | 79–85 |
| [14] | Camera and sensors | Machine Learning | 76–99 |
| Proposed approach | Camera and sensors | Machine Learning | 89–99.6 |
Table 6. PET values for all participants.
| Nationality | Maximum PET (°C) | Medium PET (°C) | Minimum PET (°C) | Amplitude (°C) |
| --- | --- | --- | --- | --- |
| Malaysian | 41 | 25 | 12 | 29 |
| Saudi Arabian | 56 | 30 | 12 | 44 |
| Indonesian | 40 | 26 | 12 | 28 |
| Nigerian | 52 | 28 | 12 | 24 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Abuhussain, M.A.; Alotaibi, B.S.; Dodo, Y.A.; Maghrabi, A.; Aliero, M.S. Multimodal Framework for Smart Building Occupancy Detection. Sustainability 2024, 16, 4171. https://doi.org/10.3390/su16104171
