1. Introduction
Nowadays, as the Spanish Institute for Diversification and Saving of Energy (IDAE) [
1] of the Spanish Government says, energy is becoming a precious asset of incalculable value, which converted from electricity, heat or fuel, makes the everyday life of people easier and more comfortable. Moreover, it is also a key factor to make the progress of industry and business feasible.
Spanish households consume
of the total energy expenditure of the country [
1]. In the European Union (EU), primary energy consumption in buildings represents about
of the total [
2]. In the whole world, recent studies say that energy in buildings also represents a
rate of the total consumed energy, where more than half is used by heating, ventilation and air conditioning (HVAC) systems [
3].
Energy is a scarce resource in nature, which has an important cost, is finite and must be shared. Hence, there is a need to design and implement new systems at home, which should be able to produce and use energy efficiently and wisely, reaching a balance between consumption and streamlined comfort. A person could realize his activities much easier if his comfort is ensured and there are no negative factors (e.g., cold, heat, low light, noise, low air quality, etc.) to disturb him. With the evolution of technology, new parameters have become more controllable, and the requirements for people’s comfort level have increased.
Systems that let us monitor and control such aspects make it necessary to refer to what in reference [
4] is called “Ambient Intelligence” (AmI). This refers to the set of user-centered applications that integrate ubiquitous and transparent technology to implement intelligent environments with natural interaction. The result is a system that shows an active behavior (intelligent), anticipating possible solutions adapted to the context in which such a system is located. The term, home automation, can be defined as it is mentioned in reference [
5], as the set of services provided by integrated technology systems to meet the basic needs of security, communication, energy management and comfort of a person and his immediate environment. Thus, home automation can be understood as the discipline which studies the development of intelligent infrastructures and information technologies in buildings. In this paper, the concept of smart buildings is used in this way, as constructions that involve this kind of solution.
In this sense, the School of Technical Sciences at the University CEU-UCH has built a solar-powered house, known as the Small Medium Large System (SMLsystem), which integrates a whole range of different technologies to improve energy efficiency, allowing it to be a near-zero energy house. The house has been constructed to participate in the 2012 Solar Decathlon Europe competition. Solar Decathlon Europe [
6] is an international competition among universities, which promotes research in the development of energy-efficient houses. The objective of the participating teams is to design and build houses that consume as few natural resources as possible and produce minimum waste products during their lifecycle. Special emphasis is placed on reducing energy consumption and on obtaining all the needed energy from the sun. The SMLsystem house includes a Computer-Aided Energy Saving System (CAES). The CAES is the system that has been developed for the contest, which aims to improve energy efficiency using home automation devices. This system has different intelligent modules in order to make predictions about energy consumption and production.
To implement such intelligent systems, forecasting techniques in the area of artificial intelligence can be applied. Soft computing is widely used in real-life applications [
7,
8]. In fact, artificial neural networks (ANNs) have been widely used for a range of applications in the area of energy systems modeling [
2,
9,
10,
11]. The literature demonstrates their capabilities to work with time series or regression, over other conventional methods, on non-linear process modeling, such as energy consumption in buildings. Of special interest to this area is the use of ANNs for forecasting the room air temperature as a function of forecasted weather parameters (mainly solar radiation and air temperature) and the actuator (heating, ventilating, cooling) state or manipulated variables, and the subsequent use of these mid-/long-range prediction models for a more efficient temperature control, both in terms of regulation and energy consumption, as can be read in reference [
10].
Depending on the type of building, location and other factors, HVAC systems may represent up to
of the total energy consumption of a building [
2,
3]. The activation/deactivation of such systems depends on the comfort parameters that have been established, one of the most being indoor temperature, directly related to the notion of comfort. Several authors have been working on this idea; in reference [
2], an excellent state-of-the-art system can be found. This is why the development of an ANN to predict such values could help to improve overall energy consumption, balanced with the minimum affordable comfort of a home, in the case that these values are well anticipated in order to define efficient energy control actions.
This paper is focused on the development of an ANN module to predict the behavior of indoor temperature, in order to use its prediction to reduce energy consumption values of an HVAC system. The architecture of the overall system and the variables being monitored and controlled are presented. Next, how to tackle the problem of time series forecasting for the indoor temperature is depicted. Finally, the ANN experimental results are presented and compared to standard statistical techniques. Indoor temperature forecasting is an interesting problem which has been widely studied in the literature, for example, in [
2,
3,
12,
13,
14]. We focus this work in multivariate forecasting using different weather indicators as input features. In addition, two combinations of forecast models have been compared.
In the conclusion, it is studied how the predicted results are integrated with the energy consumption parameters and comfort levels of the SMLsystem.
2. SMLhouse and SMLsystem Environment Setup
The Small Medium Large House (SMLhouse) and SMLsystem [
15] solar houses have been built to participate in the Solar Decathlon 2010 and 2012 [
6], respectively, and aim to serve as prototypes for improving energy efficiency. The competition focus on reproducing the normal behavior of the inhabitants of a house, requiring competitors to maintain comfortable conditions inside the house—to maintain temperature, CO
and humidity within a range, performing common tasks like using the oven cooking, watching television (TV), shower,
etc., while using as little electrical power as possible.
As stated in reference [
16], due to thermal inertia, it is more efficient to maintain a temperature of a room or building than cooling/heating it. Therefore, predicting indoor temperature in the SMLsystem could reduce HVAC system consumption using future values of temperature, and then deciding whether to activate the heat pump or not to maintain the current temperature, regardless of its present value. To build an indoor temperature prediction module, a minimum of several weeks of sensing data are needed. Hence, the prediction module was trained using historical sensing data from the SMLhouse, 2010, in order to be applied in the SMLsystem.
The SMLhouse monitoring database is large enough to estimate forecasting models, therefore its database has been used to tune and analyze forecasting methods for indoor temperature, and to show how they could be improved using different sensing data as covariates for the models. This training data was used for the SMLsystem prediction module.
The SMLsystem is a modular house built basically using wood. It was designed to be an energy self-sufficient house, using passive strategies and water heating systems to reduce the amount of electrical power needed to operate the house.
The energy supply of the SMLsystem is divided into solar power generation and a domestic hot water (DHW) system. The photovoltaic solar system is responsible for generating electric power by using twenty-one solar panels. These panels are installed on the roof and at the east and west facades. The energy generated by this system is managed by a device to inject energy into the house, or in case there is an excess of power, to the grid or a battery system. The thermal power generation is performed using a solar panel that produces DHW for electric energy savings.
The energy demand of the SMLsystem house is divided into three main groups: HVAC, house appliances and lighting and home electronics (HE). The HVAC system consists of a heat pump, which is capable of heating or cooling water, in addition to a rejector fan. Water pipes are installed inside the house, and a fan coil system distributes the heat/cold using ventilation. As shown in reference [
17], the HVAC system is the main contributor to residential energy consumption, using
of total power in U.S. households or
of total power in European residential buildings. In the SMLsystem, the HVAC had a peak consumption of up to
kW when the heat pump was activated and, as shown in
Table 1, it was the highest power consumption element of the SMLsystem in the contest with
of total consumption. This is consistent with data from studies mentioned as the competition was held in Madrid (Spain) at the end of September. The house has several energy-efficient appliances that are used during the competition. Among them, there is a washing machine, refrigerator with freezer, an induction hob/vitroceramic and a conventional oven. Regarding the consumption of the washing machine and dishwasher, they can reduce the SMLsystem energy demand due to the DHW system. The DHW system is capable of heating water to high temperatures. Then, when water enters into these appliances, the resistor must be activated for a short time only to reach the desired temperature. The last energy-demanding group consists of several electrical outlets (e.g., TV, computer, Internet router and others).
Table 1.
Energy consumption per subsystem. HVAC: heating, ventilation and air conditioning; HE: home electronics.
Table 1.
Energy consumption per subsystem. HVAC: heating, ventilation and air conditioning; HE: home electronics.
System | Power peak (kW) | Total power (Wh) | Percentage |
---|
HVAC | | | |
Home appliances | - | | |
Lighting & HE | | | |
Although the energy consumption of the house could be improved, the installed systems let the SMLsystem house be a near-zero energy building, producing almost all the energy at the time the inhabitants need it. This performance won the second place at the energy balance contest of the Solar Decathlon competition [
18].
A sensor and control framework shown in
Figure 1 has been used in the SMLsystem. It is operated by a Master Control Server (MCS) and the European home automation standard protocol known as Konnex (KNX) [
19] has been chosen for monitoring and sensing. KNX modules are grouped by functionality: analog or binary inputs/outputs, gateways between transmission media, weather stations, CO
detectors,
etc. The whole system provides 88 sensor values and 49 actuators. In the proposed system, the immediate execution actions had been programmed to operate without the involvement of the MCS, such as controlling ventilation, the HVAC system and the DHW system. Beyond this basic level, the MCS can read the status of sensors and actuators at any time and can perform actions on them via an Ethernet gateway.
Figure 1.
Small medium large system (SMLsystem) sensors and actuators map.
Figure 1.
Small medium large system (SMLsystem) sensors and actuators map.
A monitoring and control software was developed following a three-layered scheme. In the first layer, data is acquired from the KNX bus using a KNX-IP (Internet Protocol) bridge device. The Open Home Automation Bus (openHAB) [
20] software performs the communication between KNX and our software. In the second layer, it is possible to find a data persistence module that has been developed to collect the values offered by openHAB with a sampling period of 60 s. Finally, the third layer is composed of different software applications that are able to intercommunicate: a mobile application has been developed to let the user watch and control the current state of domotic devices; and different intelligence modules are being developed also, for instance, the ANN-based indoor temperature forecasting module.
The energy power generation systems described previously are monitored by a software controller. It includes multiple measurement sensors, including the voltage and current measurements of photovoltaic panels and batteries. Furthermore, the current, voltage and power of the grid is available. The system power consumption of the house has sensors for measuring power energy values for each group element. The climate system has power consumption sensors for the whole system, and specifically for the heat pump. The HVAC system is composed of several actuators and sensors used for operation. Among them are the inlet and outlet temperatures of the heat rejector and the inlet and outlet temperatures of the HVAC water in the SMLsystem. In addition, there are fourteen switches for internal function valves, for the fan coil system, for the heat pump and the heat rejector. The DHW system uses a valve and a pump to control water temperature. Some appliances have temperature sensors which are also monitored. The lighting system has sixteen binary actuators that can be operated manually by using the wall-mounted switches or by the MCS. The SMLsystem has indoor sensors for temperature, humidity and CO. Outdoor sensors are also available for lighting measurements, wind speed, rain, irradiance and temperature.
3. Time Series Forecasting
Forecasting techniques are useful in terms of energy efficiency, because they help to develop predictive control systems. This section introduces formal aspects and forecasting modeling done for this work. Time series are data series with trend and pattern repetition through time. They can be formalized as a sequence of scalars from a variable
x, obtained as the output of the observed process:
a fragment beginning at position
i and ending at position
j will be denoted by
.
Time series forecasting could be grouped as univariate forecasting when the system forecasts variable x using only past values of x, and multivariate forecasting when the system forecasts variable x using past values of x plus additional values of other variables. Multivariate approaches could perform better than univariate when additional variables cause variations on the predicted variable x, as is shown in the experimental section.
Forecasting models are estimated given different parameters: the number of past values, the size of the future window, and the position in the future of the prediction (future horizon). Depending on the size of the future window and how it is produced [
21], forecasting approaches are denoted as:
single-step-ahead forecasting if the model forecasts only the next time step;
multi-step-ahead iterative forecasting if the model forecasts only the next time step, producing longer windows by an iterative process; and
multi-step-ahead direct forecasting [
22] if the model forecasts in one step a large future window of size
Z. Following this last approach, two different major model types exist:
Pure direct, which uses Z forecasting models, one for each possible future horizon.
Multiple input multiple output (MIMO), which uses one model to compute the full Z future window. This approach has several advantages due to the joint learning of inputs and outputs, which allows the model to learn the stochastic dependency between predicted values. Discriminative models, as ANNs, profit greatly from this input/output mapping. Additionally, ANNs are able to learn non-linear dependencies.
3.1. Forecast Model Formalization
A forecast model could be formalized as a function
F, which receives as inputs the interest variable (
) with its past values until current time
t and a number
C of covariates (
), also with its past values, until current time
t and produces a future window of size
Z for the given
variable:
being the
past values of variable/covariate
x.
The number of past values
is important to ensure good performance of the model, however, it is not easy to estimate this number exactly. In this work, it is proposed to estimate models for several values of
and use the model that achieves better performance, denoted as BEST. It is known in the machine learning community that ensemble methods achieve better generalization [
23,
24,
25]. Several possibilities could be found in the literature, such as vote combination, linear combination (for which a special case is the uniform or mean combination), or in a more complicated way, modular neural networks [
26]. Hence, it is also proposed to combine the outputs of all estimated models for each different value of
, following a linear combination scheme (the linear combination is also known as ensemble averaging), which is a simple, but effective method of combination, greatly extended to the machine learning community. Its major benefit is the reduction of overfitting problems and therefore, it could achieve better performance than a unique ANN. The quality of the combination depends on the correlation of the ANNs, theoretically, as the more decorrelated the models are, the better the combination is. In this way, different input size
ANNs were combined, with the expectation that they will be less correlated between themselves than other kinds of combinations, as modifying hidden layer size or other hyper-parameters.
A linear combination of forecasts models, given a set
of
M forecast models, with the same future window size (
Z), follows this equation:
where
is the combination weight given to the model
; and
is its corresponding Ω function, as described in
Section 3.1. The weights are constrained to sum one,
. This formulation allows one to combine forecast models with different input window sizes for each covariate, but all of them using the same covariate inputs. Each weight
will be estimated following two approaches:
3.2. Evaluation Measures
The performance of forecasting methods over one time series could be assessed by several different evaluation functions, which measure the empirical error of the model. In this work, for a deep analysis of the results, three different error functions are used: MAE, root mean square error (RMSE) and symmetric mean absolute percentage of error (SMAPE). The error is computed comparing target values for the time series
, and its corresponding time series prediction
, using the model
:
The results could be measured over all time series in a given dataset
as:
being the size of the dataset and
, the loss-function defining MAE
, RMSE
, and SMAPE
.
3.3. Forecasting Data Description
One aim of this work is to compare different statistical methods to forecast indoor temperature given previous indoor temperature values. The correlation between different weather signals and indoor temperature will also be analyzed.
In our database, time series are measured with a sampling period of
min. However, in order to compute better forecasting models, each time series is sub-sampled with a period of
min, computing the mean of the last
values (for each hour, this mean is computed at 0 min, 15 min, 30 min and 45 min). The output of this preprocessing is the data series
, where:
One time feature and five sensor signals were taken into consideration:
Indoor temperature in degrees Celsius, denoted by variable . This is the interesting forecasted variable.
Hour feature in Universal Time Coordinated (UTC), extracted from the time-stamp of each pattern, denoted by variable . The hour of the day is important for estimating the Sun’s position.
Sun irradiance in , denoted by variable . It is correlated with temperature, because more irradiance will mean more heat.
Indoor relative humidity percentage, denoted by variable . The humidity modifies the inertia of the temperature.
Indoor air quality in CO ppm (parts per million), denoted by variable . The air quality is related to the number of persons in the house, and a higher number of persons means an increase in temperature.
Raining Boolean status, denoted by variable . The result of sub-sampling this variable is the proportion of minutes in sub-sampling period , where raining sensor was activated with True.
To evaluate the forecasting models’ performance, three partitions of our dataset were prepared: a training partition composed of 2017 time series over 21 days—the model parameters are estimated to reduce the error in this data; a validation partition composed of 672 time series over seven days—this is needed to avoid over-fitting during training, and also to compare and study the models between themselves; training and validation were performed in March 2011; a test partition composed of 672 time series over seven days in June 2011. At the end, the forecasting error in this partition will be provided, evaluating the generalization ability of this methodology. The validation partition is sequential with the training partition. The test partition is one week ahead of the last validation point.
5. Experimental Results
Using the data acquired during the normal functioning of the house, experiments were performed to obtain the best forecasting model for indoor temperature. First, an exhaustive search of model hyper-parameters was done for each covariate combination. Second, different models were trained for different values of past size for indoor temperature , and a comparison among different covariate combinations and ANN vs. standard statistical methods has been performed. A comparison of a combination of forecasting models has also been performed. In all cases, the future window size Z was set to 12, corresponding to a three-hour forecast.
A grid search exploration was done to set the best hyper-parameters of the system and ANN topology, fixing covariates
to a past size,
and
, searching combinations of:
different covariates of the model input;
different values for ANN hidden layer sizes;
learning rate, momentum term and weight decay values.
Table 2 shows the best model parameters found by this grid search. For illustrative purposes,
Figure 3 and
Figure 4 show box-and-whisker plots of the hyper-parameter grid search performed to optimize the ANN model,
.
Table 2.
Training parameters depending on the input covariates combination (η is the learning rate, μ is the momentum term, and ϵ is weight decay).
Table 2.
Training parameters depending on the input covariates combination (η is the learning rate, μ is the momentum term, and ϵ is weight decay).
Covariates | η | μ | ϵ | Hidden layers |
---|
d | | | | 8 tanh–8 tanh |
| | | | 24 tanh–8 tanh |
| | | | 8 tanh |
| | | | 24 tanh–16 tanh |
| | | | 16 tanh |
| | | | 16 logistic–8 logistic |
| | | | 24 logistic |
| | | | 16 tanh |
| | | | 16 logistic–8 logistic |
| | | | 8 tanh–8 tanh |
| | | | 24 tanh–8 tanh |
Figure 3.
Mean absolute error (MAE) box-and-whisker plots for ANNs with one hidden layer and the hyper-parameters of the grid search performed to optimize the ANN model, . The x-axis of the learning rate, momentum and weight decay are log-scaled.
Figure 3.
Mean absolute error (MAE) box-and-whisker plots for ANNs with one hidden layer and the hyper-parameters of the grid search performed to optimize the ANN model, . The x-axis of the learning rate, momentum and weight decay are log-scaled.
Figure 4.
MAE box-and-whisker plots for ANNs with two hidden layers and the hyper-parameters of the grid search performed to optimize the ANN model, . The x-axis of the learning rate, momentum and weight decay are log-scaled.
Figure 4.
MAE box-and-whisker plots for ANNs with two hidden layers and the hyper-parameters of the grid search performed to optimize the ANN model, . The x-axis of the learning rate, momentum and weight decay are log-scaled.
They show big differences between one- and two-hidden layer ANNs, two-layered ANNs being more difficult to train for this particular model. The learning rate shows a big impact in performance, while momentum and weight decay seems to be less important. This grid search was repeated for all the tested covariate combinations, and the hyper-parameters that optimize MAE were selected in the rest of the paper.
5.1. Covariate Analysis and Comparison between Different Forecasting Strategies
For each covariate combination, and using the best model parameters obtained previously, different model comparison has been performed. Note that the input past size of covariates is set to = 5 time steps, that is, 60 min, and to . For forecasted variable , models with sizes were trained.
A comparison between BEST, COMB-EQ and COMB-EXP approaches was performed and shown in
Table 3.
Figure 5 plots the same results for a better confidence interval comparison.
Table 4 shows COMB-EQ weights used in experimentation, obtained following Equation (
4) and using MAE
as the loss-function. From all these results, the superiority of ANNs
vs. standard statistical methods is clear, with clear statistical significance and with a confidence greater than
. Different covariate combinations for ANN models show that the indoor temperature correlates well with the hour (
) and sun irradiance (
), and the combination of these two covariates (
) improves the model in a significant way (
confidence) with input
. The addition of more covariates is slightly better in two cases (
and
), but the differences are not important. With only the hour and sun irradiance, the ANN model has enough information to perform good forecasting. Regarding the combination of models, in some cases, the COMB-EXP approach obtains consistently better results than COMB-EQ and BEST, but the differences are not important.
Table 3.
Symmetric mean absolute percentage of error (SMAPE), MAE and root mean square error (RMSE) results on the validation partition comparing different models, input features and combination schemes with the confidence interval. BEST refers to the best past size ANN, CEQ refers to COMB-EQ ANNs, and CEXP refers to COMB-EXP ANNs. Bolded face numbers are the best results, and the gray marked row is the most significant combination of covariates. ARIMA: auto-regressive integrated moving average models; ARIMAQ: ARIMA with covariate as a quadratic form (ARIMAQ); ARIMAF: ARIMA with covariate as a factor.
Table 3.
Symmetric mean absolute percentage of error (SMAPE), MAE and root mean square error (RMSE) results on the validation partition comparing different models, input features and combination schemes with the confidence interval. BEST refers to the best past size ANN, CEQ refers to COMB-EQ ANNs, and CEXP refers to COMB-EXP ANNs. Bolded face numbers are the best results, and the gray marked row is the most significant combination of covariates. ARIMA: auto-regressive integrated moving average models; ARIMAQ: ARIMA with covariate as a quadratic form (ARIMAQ); ARIMAF: ARIMA with covariate as a factor.
Model | SMAPE | MAE | RMSE |
---|
Standard statistical models |
ARIMA-d | | | | | | | | | |
ARIMAQ- | | | | | | | | | |
ARIMAF- | | | | | | | | | |
ETS-d | | | | | | | | | |
ANN models |
BEST-d | | | | | | | | | |
CEQ-d | | | | | | | | | |
CEXP-d | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
Figure 5.
SMAPE
error plot with
confidence interval for models of
Table 3 on the validation partition.
Figure 5.
SMAPE
error plot with
confidence interval for models of
Table 3 on the validation partition.
Table 4.
Combination weights of every input size of d for the COMB-EXP models given tested covariates combinations. All co-variables have an input size of 5 (75 min). Bold numbers are the best input sizes.
Table 4.
Combination weights of every input size of d for the COMB-EXP models given tested covariates combinations. All co-variables have an input size of 5 (75 min). Bold numbers are the best input sizes.
Input covariates | COMB-EXP combination weights for every d variable input size (min) |
---|
| | | | | | | | | | |
---|
d | | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
A deeper analysis could be done if comparing the SMAPE values for each possible future horizon, as
Figure 6 shows. A clear trend exists: error increases with the enlargement of the future horizon. Furthermore, an enlargement of the confidence interval is observed with the enlargement of the future horizon. In all cases, ANN models outperform statistical methods. For shorter horizons (less than or equal to 90 min), the differences between all ANN models are insignificant. For longer horizons (greater than 90 min), a combination of covariates
achieve a significant result (for a confidence of
) compared with the
combination. As was shown in these results, the addition of covariates is useful when the future horizon increases, probably because the impact of covariates into indoor temperature becomes stronger over time.
Figure 6.
SMAPE error plot with confidence interval of each of the future horizon predicted values (from 15 min forecast to 180 min forecast.)
Figure 6.
SMAPE error plot with confidence interval of each of the future horizon predicted values (from 15 min forecast to 180 min forecast.)
Finally, to compare the generalization abilities of the proposed best models, the error measures for the test partition are shown in
Table 5 and
Figure 7. All error measures show better performance in the test partition, even when this partition is two weeks ahead of training and contains hotter days than the training and validation partitions. The reason for this better performance might be that the test series has increasing/decreasing temperature cycles that are more similar to the training partition than the cycles in the validation partition. The differences between models are similar, and the most significant combination of covariates is time hour and sun irradiance (
) following the COMB-EXP strategy, achieving a SMAPE
, MAE
, and RMSE
.
Table 5.
SMAPE, MAE and RMSE results on test partition comparing the best models with the confidence interval. Bolded face numbers are the best results, and the gray marked row is the most significant combination of covariates.
Table 5.
SMAPE, MAE and RMSE results on test partition comparing the best models with the confidence interval. Bolded face numbers are the best results, and the gray marked row is the most significant combination of covariates.
Model | SMAPE | MAE | RMSE |
---|
ETS-d | | | | | | | | | |
BEST-d | | | | | | | | | |
CEQ-d | | | | | | | | | |
CEXP-d | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
BEST- | | | | | | | | | |
CEQ- | | | | | | | | | |
CEXP- | | | | | | | | | |
Figure 7.
SMAPE
error plot with the
confidence interval for the models of
Table 5 in the test partition.
Figure 7.
SMAPE
error plot with the
confidence interval for the models of
Table 5 in the test partition.
In order to perform a better evaluation, the conclusions above are compared with mutual information (MI), shown in
Table 6. Probability densities have been estimated with histograms, making the assumption of independence between time points, which is not true for time series [
39], but is enough for our contrasting purpose. The behavior of the ANNs is similar to the MI study. Sun irradiance (
W) covariates show high MI with indoor temperature (
d), which is consistent with our results. Humidity (
H) and air quality (
Q) MI with indoor temperature (
d) is higher than sun irradiance, which seems contradictory with our expectations. However, if we compute MI only during the day (removing the night data points), the sun irradiance shows higher MI with indoor temperature than other covariates. Regarding the hour covariate, it shows lower MI than expected, probably due to the cyclical shape of the hour, which breaks abruptly with the jump between 23 and 0, affecting the computation of histograms.
Table 6.
Mutual Information (MI) and normalized MI between considered covariates and the indoor temperature, for the validation set.
Table 6.
Mutual Information (MI) and normalized MI between considered covariates and the indoor temperature, for the validation set.
Data | Algorithm | d | h | W | H | R | Q |
---|
| MI (for d) | | | | | | |
Validation set | Normalized MI (for d) | | | | | | |
Validation set, | MI (for d) | | | | | | |
removing night data points | Normalized MI (for d) | | | | | | |
6. Conclusions
An overview of the monitoring and sensing system developed for the SMLsystem solar powered house has been described. This system was employed during the participation at the Solar Decathlon Europe 2012 competition. The research in this paper has been focused on how to predict the indoor temperature of a house, as this is directly related to HVAC system consumption. HVAC systems represent of the overall power consumption of the SMLsystem house. Furthermore, performing a preliminary exploration of the SMLsystem competition data, the energy used to maintain temperature was found to be – of the energy needed to lower it. Therefore, an accurate forecasting of indoor temperature could yield an energy-efficient control.
An analysis of time series forecasting methods for prediction of indoor temperature has been performed. A multivariate approach was followed, showing encouraging results by using ANN models. Several combinations of covariates, forecasting model combinations, comparison with standard statistical methods and a study of covariate MI has been performed. Significant improvements were found by combining indoor temperature with the hour categorical variable and sun irradiance, achieving a MAE degrees Celsius (SMAPE). The addition of more covariates different from hour and sun irradiance slightly improves the results. The MI study shows that humidity and air quality share important information with indoor temperature, but probably, the addition of these covariates does not add different information from which is indicated by hour and sun irradiance. The combination of ANN models following the softmax approach (COMB-EXP) produce consistently better forecasts, but the differences are not important. The data available for this study was restricted to one month and a week of a Southern Europe house. It might be interesting to perform experiments using several months of data in other houses, as weather conditions may vary among seasons and locations.
As future work, different techniques for the combination of forecasting models could be performed. A deeper MI study to understand the relationship between covariates better would also be interesting. The use of second order methods to train the ANN needs to be studied. In this work, for the ANN models, the hour covariate is encoded using 24 neurons; other encoding methods will be studied, for example, using splines, sinusoidal functions or a neuron with values between 0 and 23.
Following these results, it is intended to design a predictive control based on the data acquired from ANNs, for example, from this one that is devoted to calculating the indoor temperature, extrapolating this methodology to other energy subsystems that can be found in a home.