1. Introduction
Predicting the heating and cooling energy consumption of the building stock is critical to delineate the required renovation strategies according to the Renovation Wave for Europe program [1]. The same applies to understanding the paradox of excess mortality in mild winter climates and its relation to energy poverty [2,3] or accounting for the building sector’s share in regional and national energy and climate plans (NECP) [4]. It is no less important to understand how buildings perform under more extensive heat waves, and the corresponding impact on summer mortality [5]. Therefore, there is an urgent need to promote operational tools to support energy policies and answer specific energy use questions in the residential building stock.
Building stock energy modeling (BSEM), also known as urban building energy modeling (UBEM), was a nascent field in 2016 [6] but has grown extensively in recent years [7]. A recent review [8] cited almost 300 references, describing numerous models and techniques with the common goal of predicting building stock energy consumption, covering spatial scales from a city block to an entire city.
The remarkable development of this research area required a revision of the previous classification [9]. Therefore, Langevin et al. [10] updated it, using a multi-layer quadrant scheme in which the energy model design (top-down or bottom-up) defines the horizontal axis and the degree of transparency defines the vertical axis, distinguishing white-box (meaning physics-based) from black-box approaches. Models from all quadrants aim to address one or more of the following issues: forecasting and prediction, profiling, mapping or benchmarking. The determinant factors for the energy prediction, corresponding to different layers, are the environmental context, the building stock itself and occupants’ energy-related behaviors [10]. In their review, Hong et al. [7] identified two main areas requiring further development: modeling occupant behavior and end-use disaggregation for black-box models.
Energy Performance Certificates (EPC) provide important building-stock-related information not considered in the census, but EPC data have flaws. For example, Ahern and Norton [11] concluded that models overestimate primary energy by up to 70% when using default values for unknown parameters. In addition, Dall’O et al. [12] found that 24% of EPCs contained unreliable information. Nevertheless, EPCs constitute the largest primary source of updated building stock data. Except for a few countries with an operational rating EPC system, such as Sweden [13], the EPC rating uses calculated energy consumption, which may significantly differ from operational energy [14]. This negative performance gap is larger in pre-retrofitted buildings than in retrofitted ones [15]. The uncertainty associated with determinant factors from the three layers certainly explains performance gaps.
Furthermore, the simplified approaches developed for Northern and Central European countries where heating is dominant, usually steady-state methods, are far from appropriate for Mediterranean countries [16], where cooling is not negligible and intermittent heating is the standard practice. Vivian et al. [17] compared the performance of two lumped capacitance models, a first-order (5R1C) and a second-order (7R2C), with an advanced energy simulation tool (TRNSYS), considering intermittent heating and cooling in different climate conditions. They concluded that the second-order model improves the prediction of peak loads and energy consumption compared to the first-order model. Because of the low complexity of resistance-capacitance (RC) models, both in inputs and computational load, they are expected to play a prominent role in the future of white-box bottom-up approaches [18].
Changing the object scale from the building to the urban scale requires a detailed evaluation of the modeling complexity level. Focusing the energy calculations on a few building archetypes significantly decreases the computation load [19]. However, identifying the representative buildings using pre-defined criteria such as typology and construction period requires expert intervention and reduces the building stock heterogeneity. Goy et al. [20] explored semi-supervised or unsupervised building clustering techniques to obtain representative buildings. They showed that those are strictly related to the final indices (in their study, heating demand), supporting that representative buildings are only valid for specified indices. Assuming that descriptive parameters are distributions instead of fixed values is an alternative approach to archetypes because it preserves the building stock diversity [21]. Defining unknown or uncertain archetype parameters as probability distributions also addresses the lack of variety of archetypes [22]. Moreover, Ben and Steemers [23] explored the extension of building archetypes to households to overcome the significant variations found in occupant behavior. Findings from statistical analyses of behavioral patterns resulted in five household archetypes: active spenders, conscious occupiers, average users, conservers and inactive users [24].
We have learned so far that adapting previous energy models is not a straightforward solution because those were developed for a particular context, taking into account the local climate, socio-economic and cultural aspects and building architecture. Because the topic is interdisciplinary, predicting heating and cooling energy consumption at a large scale is far from a deterministic science. Furthermore, energy models should consider the variability that comes from the cultural context and socio-economic conditions.
The remarkably low heating and cooling energy consumption of some Southern European countries emphasizes this critical research challenge. For example, in Portugal, achieving thermal comfort is still a non-priority expense even for high-income households [25]. Space heating and cooling are often not perceived as basic needs. In fact, during interviews, Horta et al. [26] found respondents who consider that feeling cold (or hot) at home in winter (or summer) is acceptable.
Specifically for Portugal, some studies used EPC data for the residential building stock characterization. Magalhães and Leal [27] used EPC energy demand to quantify the heating performance gap. Palma et al. [28] used 176 different typologies to calculate the energy performance gap by comparing the theoretical with actual energy consumption. Silva et al. [29] estimated the energy use for mobility, space heating and space cooling for Porto city using neural networks to study different urban configurations.
These studies consider seasonal steady-state energy demand, calculated using raw data from buildings or EPC indicators. A significant step to support future BSEM consisted of collecting, mapping, cleansing and integrating urban building data, resulting in 18 archetypes of the residential building stock for a case-study of the Lisbon area [30]. Moreover, space heating and cooling energy demand obtained by building simulation for 10 building typologies were inputs for the Évora city energy model [31].
On the other hand, Fonseca and Panão [32] applied the Monte Carlo method to model the residential building stock and calculate the energy performance indicators using input data as probability distributions. Afterward, Panão and Brito [33] applied that building stock characterization to profile the electricity use in Lisbon city, adding stochastic modeling of user behavior, which accurately predicted hourly electricity use profiles in Lisbon dwellings. Figueiredo et al. [34] extended the hourly energy calculation to predict electricity loads in future climates.
The identified barriers that this research intends to tackle are: (1) heating and cooling steady-state energy models (based on degree-days) are not accurate predictive models of intermittent heating and cooling; (2) archetypes or representative buildings narrow the variability found in the building stock; (3) energy performance certificates collect data but are useless in characterizing user profiles. This research is a step forward from the previously developed models [32,33,34] since it explores other statistical techniques for generating the building stock model and shifts to transient energy calculations using RC modeling. Strengths and weaknesses of previous models developed by the authors are summarized in Table 1.
The paper is organized as follows. Section 2 describes the statistical techniques to generate the building stock, the energy modeling approaches, the case-study area and other modeling assumptions. The results and discussion section (Section 3) includes a first validation by comparing EPC data with the calculated energy needs. Furthermore, it compares steady-state energy needs with those calculated by an RC model. Finally, it presents a sensitivity study of how user profiles influence energy needs and electricity consumption for heating and cooling. The paper closes with the conclusion section (Section 4).
2. Materials and Methods
The methodology flow diagram (Figure 1) summarizes the study methods and purposes. Section 2.1 describes the statistical methods that generate the building stock, namely the building form (steps 1 to 3) and fabric (step 4). The method is applied to the case-study area presented in Section 2.2. The energy needs calculated by EPC are used to validate the building stock generation, which constitutes the first study purpose. Afterward, statistical methods are applied in the generation of hourly user profiles for nominal, actual, adequate and minimum conditions described in Section 2.3. The second purpose is to compare the energy calculations using two different approaches: seasonal steady-state and hourly RC (Section 2.4). Hourly profiles are required to test the use of the RC model to compute the electricity consumption for space heating and cooling. The electricity consumption is compared with average data from statistics (third purpose) for actual conditions. Finally, to illustrate the model applicability and potential (fourth purpose), the electricity consumption calculated for adequate and minimum conditions is compared with that calculated for actual conditions to infer the number of buildings without thermal comfort (underheated in winter or overheated in summer). Climate data used for the case-study area are presented in Section 2.5.
2.1. Building-Stock Generation
The computational tool developed in [35] uses raw data from energy performance certificates of residential building units in Portugal, issued from 2008 to 2018. It performs data processing and analysis of the main parameters related to form (e.g., net floor area, walls, roof, ground floor and window areas) and fabric (e.g., envelope U-values, glazing and shading device g-values and air permeability). Raw data are transformed into comparable data, using normalization (e.g., window-to-floor area, opaque-to-floor area) or weighted average values (e.g., mean window U-value and g-value). Evaluations can be performed by context (new, existing, or major renovation), by NUTS III (corresponding to the third territorial statistical subdivision of the Portuguese National Statistics) or by a combination of the two. Probability distributions are the main output of the computational tool, obtained by searching for the best fit according to the likelihood criterion.
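The normalization and weighted-average steps can be illustrated with a minimal sketch. It assumes that the weights are the element areas, which the text does not state explicitly; the function names and the example dwelling are hypothetical and are not taken from the tool in [35].

```python
def window_to_floor_ratio(window_areas, net_floor_area):
    """Total window area divided by the net floor area (dimensionless)."""
    return sum(window_areas) / net_floor_area


def area_weighted_mean(values, areas):
    """Area-weighted mean of a per-element property (e.g., U-value or g-value).

    Assumption: element areas are the weights.
    """
    total_area = sum(areas)
    return sum(v * a for v, a in zip(values, areas)) / total_area


# Hypothetical building unit: two windows in an 80 m2 dwelling.
window_areas = [4.0, 6.0]        # m2
window_u_values = [2.8, 3.8]     # W/(m2.K)

wfr = window_to_floor_ratio(window_areas, 80.0)
u_mean = area_weighted_mean(window_u_values, window_areas)
```

Expressing areas as ratios to the net floor area makes building units of different sizes directly comparable, which is what allows a single probability distribution to be fitted per parameter.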
Modeling building stock requires a dataset of building parameters capable of recreating a theoretical sample of building units. The method tested here consists of applying the Gaussian copula method [21,36]. The evaluation uses multi-correlation among seven main parameters: opaque-to-floor area, window-to-floor area, window, external wall and roof U-values, glazing and shading g-values (Table 2). The remaining parameters are stochastically generated based on independent probability distributions because no strong correlation is expected.
The Gaussian copula method uses a training dataset, the EPC processed data, to compute the covariance matrix used in the sample generation. The training data are form and fabric parameters from the computational tool split into (i) building units with an external roof (e.g., detached and semi-detached houses, top-floor apartments) and (ii) building units without an external roof (e.g., middle-floor apartments). For each parameter, the original vector $x$ is transformed into a vector $u$ formed by the corresponding percentiles $u_k$ obtained for each position $k$. Afterward, the distribution of values within the interval [0,1] is converted into a normal distribution with a mean and standard deviation equal to 0 and 1, respectively, by applying the following transformation:
$$z_k = \Phi^{-1}(u_k)$$
where $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution function.
The covariance matrix results directly from the matrix composed of the transformed vectors, and it is used in the random generation of normal distributions for the seven parameters. The theoretical sample is obtained by reversing the initial transformation.
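The sequence described above (percentiles, normal scores, covariance matrix, correlated normal draws, reverse transformation) can be sketched with the Python standard library only. This is an illustrative sketch under stated assumptions, not the authors' implementation: percentiles are taken as rank/(n + 1), correlated draws use a hand-written Cholesky factorization, the reverse transformation interpolates the empirical quantiles, and the two training columns are synthetic placeholders.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def to_normal_scores(column):
    """Replace each value by the standard-normal score of its percentile."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))  # percentile inside (0, 1)
    return scores


def covariance(columns):
    """Sample covariance matrix of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    return [[sum(p * q for p, q in zip(a, b)) / (n - 1) for b in centered]
            for a in centered]


def cholesky(m):
    """Lower-triangular factor of a positive-definite matrix."""
    size = len(m)
    low = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def empirical_quantile(sorted_column, u):
    """Inverse empirical CDF, interpolating linearly between order statistics."""
    last = len(sorted_column) - 1
    pos = min(max(u * (len(sorted_column) + 1) - 1, 0.0), float(last))
    lo = int(math.floor(pos))
    hi = min(lo + 1, last)
    return sorted_column[lo] + (pos - lo) * (sorted_column[hi] - sorted_column[lo])


def sample_copula(training_columns, n_samples, rng):
    """Generate a theoretical sample preserving the training correlation."""
    low = cholesky(covariance([to_normal_scores(c) for c in training_columns]))
    sorted_cols = [sorted(c) for c in training_columns]
    samples = []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in training_columns]
        z = [sum(low[i][k] * eps[k] for k in range(i + 1)) for i in range(len(low))]
        # Reverse transformation: normal score -> percentile -> original scale.
        samples.append([empirical_quantile(col, STD_NORMAL.cdf(zi))
                        for col, zi in zip(sorted_cols, z)])
    return samples


# Synthetic placeholder training data: two positively correlated U-values.
rng = random.Random(0)
wall_u = [rng.uniform(0.5, 2.0) for _ in range(200)]
roof_u = [0.8 * u + rng.gauss(0.0, 0.1) for u in wall_u]
samples = sample_copula([wall_u, roof_u], 300, random.Random(1))
```

Because the reverse transformation maps back through the empirical quantiles, every generated value stays within the range observed in the training data, while the dependence between parameters is carried by the covariance of the normal scores.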
The primary parameter used to rebuild geometry is the net floor area. The generation of windows and opaque envelope areas comes directly from the net floor area, window-to-floor area and opaque-to-floor area. The secondary parameter is the opaque envelope area, used to determine the thermal bridge length and the envelope area separating the net volume from unheated spaces. Selecting primary and secondary parameters prevents the generation of unrealistic geometries. The overall descriptive parameters required to calculate energy needs are obtained after generating the geometry of a sample of N theoretical building units. Examples of those are the opaque envelope heat conductance and the window heat conductance, directly resulting from multiplying envelope U-values and areas. The same applies to the effective solar collection areas for each orientation j, considering the product of window areas, correction factors and g-values.
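As a minimal sketch of this geometry reconstruction, the function below derives areas and heat conductances from the net floor area and the two area ratios. It assumes a single mean U-value for the whole opaque envelope (a simplification of the separate wall and roof values), and all names and numbers are hypothetical.

```python
def rebuild_envelope(net_floor_area, window_to_floor, opaque_to_floor,
                     u_window, u_opaque):
    """Derive envelope areas (m2) and heat conductances (W/K) from the floor area."""
    window_area = window_to_floor * net_floor_area
    opaque_area = opaque_to_floor * net_floor_area
    return {
        "window_area": window_area,
        "opaque_area": opaque_area,
        "h_window": u_window * window_area,   # conductance = U-value x area
        "h_opaque": u_opaque * opaque_area,   # assumption: one mean opaque U-value
    }


def effective_solar_area(window_area, g_value, correction_factor):
    """Effective solar collection area of the windows on one orientation (m2)."""
    return window_area * correction_factor * g_value


# Hypothetical 90 m2 building unit.
unit = rebuild_envelope(90.0, 0.15, 0.80, u_window=3.0, u_opaque=1.2)
a_sol_south = effective_solar_area(5.0, g_value=0.6, correction_factor=0.7)
```

Deriving every area from the net floor area is what keeps the generated geometry internally consistent: a large window area can only appear together with a correspondingly large floor area.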
2.2. Case-Study Area
The regional building stock energy model is similar to that applied nationally in [33] but downscaled to the Metropolitan Area of Lisbon (Área Metropolitana de Lisboa, hereafter AML), taking into account a total of 242,860 energy performance certificates of new and existing residential building units. AML covers 3000 km² (Figure 2) and has a population of 2.86 million. It includes two NUTS III regions, corresponding to the regions north and south of the Tagus river (Figure 2), hereafter North-AML and South-AML (‘Grande Lisboa’ and ‘Península de Setúbal’, respectively). The total number of building units in this area is about 1.5 million, and 75% of them are regularly occupied (Table 3). EPC data represent 16% of the total building units, a considerably large sample of the building stock.
Building units that are not regularly occupied but have an electricity contract still use electricity (e.g., refrigerator, freezer, other appliances), even if their consumption might be much lower than that of regularly occupied units. Including not regularly occupied units explains why the mean electricity consumption (e.g., 2226 kWh/y in 2015) is lower than values obtained from smart metering (e.g., 3927 kWh/y for Lisbon city in 2015–2016 [33]) and the data reported by the two national surveys on residential energy consumption [40,41] (3674 kWh/y in 2010 and 3360 kWh/y in 2020 for the national territory).
This study focuses on the building stock electricity consumption for space heating and cooling. For that reason, it excludes other heating systems (gas burners, boilers, fireplaces, etc.) still used in about 18% of the regularly occupied units (data from 2011). Since there are no available regional disaggregated data, the best knowledge comes from the 2010 national survey on household energy consumption [40], which estimated a mean electricity consumption of 333 kWh/y for space heating and 59 kWh/y for space cooling. According to the preliminary results of the updated survey for 2020 [41], this figure for heating decreased to 138 kWh/y. No information is available for cooling. It is noteworthy that AML regularly occupied units with resistive heating equipment represented 51% in 2011 [37] and those with heat pumps (including air conditioning units) for heating and cooling represented 16% in 2015 [38].
Distributions in Table 4 are direct outputs of the tool [35] applied to AML and support the generation of some of the building form and fabric parameters. Since no regional differences for other form and fabric parameters are expected, the independent distributions already applied in [33] are kept invariant.
2.3. User Profiles
EPC data do not contain enough information to generate user profiles, but those are critical to computing electricity consumption. To that end, different data sources support the selection of model inputs. The first set of conditions regards occupancy patterns, temperature set-points, heating and cooling time of use and heated and cooled floor area. Four scenarios regarding how users interact with their houses and equipment are explored: (A) nominal, (B) actual, (C) adequate and (D) minimum conditions.
A summary of scenarios (A) to (D) is presented in Table 5. A user profile is randomly generated for each building unit for (B) to (D) conditions. For nominal conditions (A), the user profile is the same for all building units.
Calculating the building stock total electricity consumption under actual conditions (B) assumes no electricity consumption for building units without heating or cooling equipment, which represent 15.3% and 84.4% of the units, respectively. On the other hand, for scenarios (A), (C) and (D), potential electricity consumption is additionally calculated assuming default equipment: resistive heating and a heat pump for cooling. Default EER values are 2.9 (the mean value of the generated distribution) for (C) and (D) and 3.0 (the EPC default) for (A). For all scenarios, no electricity consumption is accounted for in the building units that use other energy sources for heating. For heating and cooling, the heat pump performance is stochastically obtained from the independent distributions obtained for AML building units [35].
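The conversion from energy needs to electricity consumption implied by these rules can be sketched as follows. The sketch assumes that resistive heating converts electricity to heat with an efficiency of 1.0, which the text does not state; the function name and the example energy needs are hypothetical.

```python
RESISTIVE_EFFICIENCY = 1.0   # assumption: resistive heaters convert electricity 1:1
DEFAULT_EER = 2.9            # default for scenarios (C) and (D), as stated in the text


def electricity_use(energy_need_kwh, efficiency, has_equipment=True):
    """Electricity (kWh) needed to meet an energy need with the given efficiency.

    Units without equipment are assigned zero consumption, as in scenario (B).
    """
    if not has_equipment:
        return 0.0
    return energy_need_kwh / efficiency


# Hypothetical annual energy needs of one building unit.
heating_kwh = electricity_use(600.0, RESISTIVE_EFFICIENCY)
cooling_kwh = electricity_use(290.0, DEFAULT_EER)
no_cooling_kwh = electricity_use(290.0, DEFAULT_EER, has_equipment=False)
```

Keeping the equipment flag separate from the efficiency makes it straightforward to switch between actual consumption (B) and the potential consumption computed with default equipment for the other scenarios.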
2.4. Energy Needs Calculation
Two different approaches are used to calculate space heating and cooling energy needs: quasi-steady-state on a seasonal basis (seasonal steady-state) and RC model on an hourly basis (hourly RC). Although the inputs are similar, the approaches differ significantly in how they consider transient heat processes.
The first considers thermal inertia by including an empirical input, the gain utilization factor $\eta$, which is a function of $\gamma$, the ratio between heat transfer ($Q_{ht}$) and heat gains ($Q_{gn}$). Heat transfer and heat gains are energy values integrated over the calculation period (heating or cooling season). The computation of heat transfer and heat gains might include simplifications. Heating and cooling energy needs are obtained, respectively, from:
$$Q_{H,nd} = Q_{ht} - \eta\, Q_{gn}$$
$$Q_{C,nd} = (1 - \eta)\, Q_{gn}$$
The equations to compute heat transfer, heat gains and gain utilization factor follow EN ISO 13790 [45] considering the simplifications and assumptions of the EPC approved method in Portugal [42]. It is noteworthy that energy needs on the EPC database were calculated by the seasonal steady-state method, and that is the reason why EN ISO 13790 is still used, even if it was revised and replaced by ISO 52016 [46]. The latter does not include the seasonal approach but only the monthly approach.
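A minimal sketch of the seasonal heating balance is given below. It assumes the general EN ISO 13790 expression for the gain utilization factor, with the balance ratio defined as gains over heat transfer; the inertia parameter `a` is left as an input because its value depends on the building time constant and on the national method, and the numbers in the example are placeholders.

```python
def gain_utilization_factor(gamma, a):
    """Gain utilization factor for heating (general EN ISO 13790 form).

    gamma: ratio of heat gains to heat transfer over the season.
    a: dimensionless parameter that grows with thermal inertia (assumed input).
    """
    if gamma <= 0.0:
        return 1.0  # simplification: no gains to utilize
    if abs(gamma - 1.0) < 1e-9:
        return a / (a + 1.0)
    return (1.0 - gamma ** a) / (1.0 - gamma ** (a + 1.0))


def heating_need(q_ht, q_gn, a):
    """Seasonal heating energy need: heat transfer minus the utilized gains."""
    eta = gain_utilization_factor(q_gn / q_ht, a)
    return max(0.0, q_ht - eta * q_gn)


# Placeholder seasonal values in kWh and a placeholder inertia parameter.
need = heating_need(q_ht=100.0, q_gn=50.0, a=2.6)
```

The utilization factor decreases as gains grow relative to losses, reflecting that a larger share of the gains arrives when no heating is required; this single empirical factor is how the seasonal method accounts for thermal inertia.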
The second approach accounts for thermal inertia by simplifying the building unit to a single lumped heat capacitance (5R1C) [47]. The thermal grid includes five thermal resistances, expressed as the inverse of the thermal conductances, connecting the thermal nodes (Figure 3): outdoor air, supply ventilation air (if different from outdoor) and indoor air. Heavy and light elements are modeled separately.
The mass temperature connects to a lumped thermal capacitance, C, modeling the energy storage capability of heavy elements. The star temperature is a weighted combination of the indoor, outdoor and mass temperatures. Total heat gains are split among the indoor air, star and mass nodes. Node energy balance equations are solved by a Crank–Nicolson scheme with an hourly time-step. For further details on the equations of both approaches, refer to [45].
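To illustrate the time integration only, the sketch below applies a Crank–Nicolson step to a single-node network (one conductance, one capacitance). This is a deliberate simplification and not the 5R1C grid of Figure 3; the function names and the numerical values are hypothetical.

```python
def crank_nicolson_step(theta_old, h, c, dt, te_old, te_new, phi_old, phi_new):
    """Advance C * dtheta/dt = H * (theta_e - theta) + Phi by one time-step.

    The right-hand side is averaged between the old and new time levels.
    h in W/K, c in J/K, dt in s, temperatures in degC, gains in W.
    """
    lhs = c / dt + 0.5 * h
    rhs = ((c / dt - 0.5 * h) * theta_old
           + 0.5 * h * (te_old + te_new)
           + 0.5 * (phi_old + phi_new))
    return rhs / lhs


def simulate(theta0, h, c, outdoor, gains, dt=3600.0):
    """Hourly mass temperatures for given outdoor temperature and gain series."""
    temps = [theta0]
    for k in range(1, len(outdoor)):
        temps.append(crank_nicolson_step(temps[-1], h, c, dt,
                                         outdoor[k - 1], outdoor[k],
                                         gains[k - 1], gains[k]))
    return temps


# Free-floating cool-down over 24 h with a constant 10 degC outdoor temperature.
temps = simulate(20.0, h=200.0, c=2.0e7, outdoor=[10.0] * 25, gains=[0.0] * 25)
```

In the full 5R1C model the same averaging is applied to the mass node, while the indoor air and star node temperatures follow algebraically from the mass temperature at each step.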
Both approaches are simplified and computationally undemanding, which is critical when computing energy needs for many building units. Direct outputs of the model are distribution functions for space heating and cooling energy needs (per unit of net floor area). However, for evaluating the building stock, integrated values are as relevant as distributions. For simplicity, the mean value per building unit is adopted as the primary parameter to compare scenarios.
2.5. Climate Data
Climate data for the selected region are available on an hourly and seasonal basis for North-AML and South-AML. The energy calculations consider the hourly data of air temperature and façade solar radiation (horizontal, cardinal and ordinal orientations) for the reference altitude of each region [48] (CLM#2). On the other hand, seasonal climate data used for model validation are those defined by EPC [42] (CLM#1), considering heating degree-days and mean air temperature corrected to local altitude. Table 6 and Table 7 compare the seasonal data (CLM#1) with the same parameters calculated from the hourly dataset (CLM#2).
The main differences between climate datasets are found for heating degree-days and winter daily solar radiation (Table 6). The calculated parameters for CLM#2 are 15% to 32% lower than the tabulated CLM#1. The most probable explanation for these differences is that CLM#1 inadvertently refers to an 8-month season (from October to May) instead of considering a variable heating season. The hourly model uses a shorter heating season, agreeing with the heating season length of 159 days (5.3 months) for North-AML and 139 days (4.6 months) for South-AML, both starting on the last days of November. The heating season for CLM#2, in agreement with [49], begins on the first day of the first 15-day period with a daily mean air temperature not above 15 °C and it ends on the last 15-day period with values not below 15 °C. Fewer heat gains during winter may counterbalance fewer heat losses due to the decrease in heating degree-days and daily solar radiation of CLM#2. No significant differences are found for the cooling season (Table 7).
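Deriving a seasonal parameter from an hourly dataset, as done for the CLM#2 columns, can be sketched for heating degree-days. The base temperature of 18 °C is an assumption made for illustration and is not stated in the text; the function name and the two-day temperature series are hypothetical.

```python
def heating_degree_days(hourly_temps, base=18.0):
    """Sum of (base - daily mean temperature) over days colder than the base.

    hourly_temps: air temperatures in degC, 24 values per day.
    base: base temperature in degC (assumed value).
    """
    n_days = len(hourly_temps) // 24
    total = 0.0
    for d in range(n_days):
        daily_mean = sum(hourly_temps[24 * d:24 * (d + 1)]) / 24.0
        total += max(0.0, base - daily_mean)
    return total


# Two hypothetical days: one at a constant 10 degC and one at 20 degC.
hdd = heating_degree_days([10.0] * 24 + [20.0] * 24)
```

Restricting the hourly series to the days of the variable heating season before calling such a function is what makes the resulting degree-days sensitive to the assumed season length.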
4. Conclusions
The developed building stock energy model combines the building form and fabric characterization with energy systems and user profiles. Using a decade-long EPC dataset (2008–2018) and the corresponding probability distribution functions, the model generates a theoretical sample of building units and calculates the electricity consumption for space heating and cooling for the case-study region: the Metropolitan Area of Lisbon (AML).
Despite the uncertainty regarding some missing parameters not collected in the EPC process, the building stock model consistently predicts the distribution of energy needs per unit of net floor area with an error lower than 7% for the first, second and third quartiles. The mean value of heating energy needs per unit of floor area is underestimated by 1%, while the cooling energy needs are overestimated by 3%. This comparison considered the same approach (the seasonal steady-state) and climate dataset (CLM#1).
A further step was to compare the seasonal steady-state with the hourly RC approach, using the climate dataset CLM#2, since there are no hourly data for CLM#1. Relative differences are low for the mean value of energy needs (per building unit): seasonal steady-state overestimates the heating energy needs by 6% and underestimates the cooling energy needs by 3%. The study also points out that EPC energy needs are not appropriate to compute energy demand since they are much higher than those calculated by the RC hourly approach in the same conditions.
The study was extended to other scenarios, taking advantage of the capability of the model to define hourly profiles different from constant assumptions. The scenario defining actual conditions (built from survey data) leads to an electricity consumption for space heating representing 17% of all end-use electricity consumption (3135 GWh in 2015). For cooling, the estimated value is less than 1%. The mean electricity consumption for the group of regularly occupied building units is 444 kWh/y for space heating; by restricting the group to building units with electric heating equipment, a more appropriate parameter, the mean increases to 610 kWh/y. The same procedure applied to space cooling increases the mean from 12 to 108 kWh/y. Since explicit validation is not possible due to the lack of available data at the regional scale, the electricity consumption presented here should be taken as an indicative approximation.
Two other scenarios were evaluated, assuming different user profiles for what might be the minimum energy consumption required to ensure adequate and minimum thermal comfort (controlling only indoor air temperature). The results show that the actual electricity consumption for space heating (610 kWh/y) is close to the minimum (512 kWh/y), indicating a potential number of underheated building units. In addition, with the electricity consumption distributions, it is possible to estimate that 37% of regularly occupied units are underheated. Due to the lack of installed cooling equipment in building units, the electricity consumption is clearly below that required to ensure thermal comfort.
The building stock energy model presented here constitutes an alternative approach to deterministic models. Its stochastic nature helps handle data uncertainty and missing values. Furthermore, it is ready to be fed with more recent EPC datasets (e.g., data collected after 2018) and survey results on user profiling. Future developments should compare model estimates with the electricity consumption for space heating and cooling obtained from statistics and surveys for case-study areas.