The following section presents the outcomes of the complete analysis carried out to answer the research questions.
3.1. The Evolution of the Building Energy Efficiency Field of Knowledge
According to Price’s Law [109], the scientific production in a field of knowledge grows exponentially until it reaches an inflection point and then approaches a threshold value around which it stabilises, meaning that the field has reached maturity. The shape of the publication curve changes from exponential to logistic, signalling that the scientific community’s interest in the field has cooled down.
According to Dabi et al. [113]: “The main hypothesis of Price’s law is that the development of science follows an exponential growth. The growth of a scientific domain goes through four phases”. The first phase is the precursors’ phase, and “during this phase only a small number of researchers begin publishing” [113]. The second phase is that of exponential growth proper: “During this phase, the expansion of the field attracts many researchers as many aspects of the subject still have to be explored” [113]. In the third phase, the body of knowledge is consolidated and the growth of scientific production becomes linear [113]. The fourth phase, according to Dabi et al. [113], “corresponds to the collapse of the domain and is marked by a decrease in the number of the publications”. The shape of the curve changes from exponential to logistic, reaching a ceiling value after passing through an inflection point. Therefore, in order to perform the Price’s Law analysis, the frequency distribution of the publications addressing BEE is presented in Figure 4.
Figure 4a shows the number of publications on a yearly basis, whilst Figure 4b shows the cumulative version, on which compliance with Price’s Law is investigated.
The first phase roughly extends to 2005. The second phase runs from 2005 to 2014, where the number of publications fits an exponential function well, with an R² statistic very close to 1.00. The third phase extends from 2014 to 2018, where the growth of scientific production becomes linear (R² = 0.988). There is no statistical evidence that an inflection point has been reached yet. It is worth mentioning that only articles with 5 or more citations were considered, and older articles have had more time to accumulate citations. Thus, it is likely that the number of articles counted for the later years will increase, reinforcing the linear trend of the plot for the final years even more. Therefore, this field of knowledge has not yet reached maturity, leaving several aspects to be explored.
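The phase-by-phase fits described above can be illustrated with a minimal sketch. It assumes that the exponential fit is obtained by ordinary least squares on the logarithm of the counts, which the text does not state, and it uses hypothetical cumulative counts rather than the data behind Figure 4. The function names are illustrative only.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

def exponential_fit(xs, ys):
    """Fit y = A * exp(k*x) by regressing ln(y) on x.

    The returned R^2 is measured on the logarithmic scale (an assumption).
    """
    ln_a, k, r2 = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), k, r2

# Hypothetical cumulative publication counts, NOT the paper's data.
years = list(range(2005, 2015))
cumulative = [round(50 * math.exp(0.3 * (y - 2005))) for y in years]

t = [y - years[0] for y in years]          # years elapsed since the phase start
A, k, r2_exp = exponential_fit(t, cumulative)
_, slope, r2_lin = linear_fit(t, cumulative)

print(f"exponential fit: A={A:.1f}, k={k:.3f}, R2={r2_exp:.4f}")
print(f"linear fit:      slope={slope:.1f}, R2={r2_lin:.4f}")
```

Comparing the two R² values for each candidate phase is one simple way to decide whether a stretch of the cumulative curve is better described as exponential or linear.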
3.2. Authors’ Keyword Collection, Classification, and Manipulation
From the 9326 keywords collected, only 1728 were useful for the purposes of this study. They were, naturally, classified into 30 categories (or themes): building automation and control (BAC), building energy modelling (BEM), building envelope (BEV), building information modelling (BIM), building integrated photovoltaics (BIP), building management systems (BMS), building retrofitting (BRF), data analysis techniques (DAT), decision making (DMK), energy management systems (EMS), environmental (ENV), energy performance software (EPS), energy storage (EST), green building (GRB), heat pumping systems (HPS), heating-ventilation-air-conditioning (HVAC), life cycle assessment (LCA), lighting (LIG), occupancy behaviour (OCB), regulations (REG), renewable energy sources (RNE), smart buildings (SMB), smart grids (SMG), sustainability (SUS), thermal comfort (THC), thermal storage (THS), types of building (TOB), windows (WIN), water heating (WTH), and near zero/zero energy building (ZEB).
Figure 5 shows a schematic representation of such a classification.
Figure 5 is a pictorial representation of the thirty categories along with the number of keywords classified into each of them.
In the majority of cases, the classification of a keyword into a given category was straightforward, like ‘green building’ (classified under the Green Building theme or group) for instance. However, there were cases in which a keyword could be coded into more than one theme. In such cases, the classification demanded some extra work. It was necessary to read the title and abstract and, in some cases, the introduction of the articles from which the keyword was collected, to decide which theme it fitted best.
Each keyword was classified into a single theme, but a theme could cluster several keywords with similar meanings, so that each theme represents a homogeneous group. It is worth mentioning that an article can have keywords classified into different themes.
3.4. Evolution and Trend of the Themes
Before studying the evolution and trend of the themes it is worth discussing their relevance over the period under investigation.
The relevance of a theme can be derived from the number of articles that address it over the period considered [
111]. Thus,
Table 5 presents the themes ranked according to their relevance.
Table 5 shows the absolute number and percentage of articles addressing each of the thirty themes. It can be seen that the three largest themes are BEM, DAT, and BIM, which are present in more than 54% of the articles captured for this research. Eleven themes are addressed by less than 4% of the articles, meaning that the interest in them is small, so they are excluded from further analysis (grey background). However, it is worth mentioning that some of them are indirectly of interest for the ZEB and BRF themes. The themes BMS, BEV, EPS, EST, LIG, RNE, THS, and WIN could still be addressed in recent research under the umbrella of other themes with increasing.
Table 5 also shows the interdisciplinary character of the research carried out in the BEE field of knowledge. For instance, of the 2000 articles collected for this study, 505 (25.3%) address the theme BEM together with the other 29 themes. According to Sun and Latora [111], such interaction can reflect the exchange of knowledge across themes. It is possible to infer that the strength of such an interaction depends on the number of publications sharing the themes.
Table 5 provides a static view of the BEE field of knowledge. It shows the most relevant themes within the field but it does not show the evolution and trend of each theme. Thus,
Table 6 presents the trend of each theme, allowing investigation as to whether a given theme has a perennial presence or is just incidental in the literature. A theme can be analysed as to when it emerged, if it is still active or vanished, and when its apogee was.
Table 6 presents the annual participation of each theme in the literature, summarising their trend in the last column.
Eight themes are in an upward trend: BAC, EMS, DAT, BEM, BIM, OCB, BRF and ZEB. It can be seen that the themes BAC, EMS and DAT reached a maximum in the early 2000s, while the others peaked in the late 2010s. The development of the internet and of image processing software packages explains the remarkable growth of the theme BIM [112,114]. Since the stock of old buildings far surpasses the stock of new buildings everywhere in the world, the only way to achieve the current energy-saving standards is by retrofitting them, which explains the growing interest of the scientific community in the BRF theme. The rise of the theme OCB can be explained by the scientific community’s realisation that the success of energy-efficient projects is significantly influenced by human factors [115,116].
Since there are many well-established statistical methods that have been waiting for advances in computing to become popular, it is expected that DAT will keep growing for a while, even within other fields of knowledge. According to Cristino et al. [108], the data analysis techniques mentioned by the papers within this field of knowledge can be roughly clustered into the following categories: regression analysis, descriptive statistics, multivariate analysis, computational intelligence, inferential statistics, and design of experiments.
There is no statistical evidence of a particular trend for SMB, TOB, GRB, HVAC, LCA, THC, BIP, REG, ENV and SMG.
The theme SUS shows a downward trend. The themes concerned with environmental issues (ENV, SUS and LCA) reached their maximum in the second half of the 2000s and have decreased since then, showing that interest in these subjects cooled down.
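The text does not name the test behind the "statistical evidence" of a trend. As one possible illustration only, the sketch below applies a Mann-Kendall test, a common non-parametric trend test, to a theme's yearly share of the literature. The choice of test, the function name, and the example series are all assumptions and do not come from Table 6.

```python
import math
from collections import Counter

def mann_kendall(series):
    """Mann-Kendall trend test; returns (S, z, two_sided_p).

    S > 0 suggests an upward trend and S < 0 a downward one. The variance
    includes the usual correction for tied values.
    """
    n = len(series)
    s = sum((series[j] > series[i]) - (series[j] < series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    tie_sizes = Counter(series).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)   # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p

# Hypothetical yearly participation (%) of one theme, NOT taken from Table 6.
shares = [1.0, 2.5, 2.0, 3.5, 4.0, 5.5, 5.0, 6.5, 7.0, 8.5]
s, z, p = mann_kendall(shares)
trend = "upward" if p < 0.05 and s > 0 else "downward" if p < 0.05 and s < 0 else "no evidence"
print(s, round(z, 2), round(p, 4), trend)
```

A theme would be labelled "no evidence of a trend" whenever the p-value exceeds the chosen significance level, here assumed to be 0.05.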
The volume of publications addressing each theme, as well as the interaction between them, defines the evolution of a field of knowledge. As these variables change over time, it is possible to infer that such an evolution is marked by distinct phases. Thus, the next step in this study is to identify such stages.
3.5. Stages of the Evolution of this Field of Knowledge
The evolution of a field of knowledge is marked by a sequence of periods with a similar profile of publications. Thus, reading
Table 6 from the columns’ point of view, it is possible to see the profile of the years according to the themes published and look for a pattern.
One of the ways to identify similarities between multivariate observations is to apply clustering techniques [
112,
117]. Thus, the space of the columns in
Table 6 was submitted to a hierarchical clustering algorithm, leading to the dendrogram presented in
Figure 6.
It is worth mentioning that a dendrogram is a tree diagram that shows hierarchical relationships between similar objects [
118], which, in this case, are the years.
The dendrogram in Figure 6 shows two well-defined clusters. One of these clusters groups the years 2007–2011, at a similarity level of 66.7, and the other the years 2012–2018, at a similarity level greater than 80. The years ranging from 2000 to 2006 are very heterogeneous. This suggests that the period covered by this research could be divided into three phases.
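The clustering of the yearly profiles can be sketched as follows. The text does not specify the distance measure, the linkage rule, or the definition of the similarity level used for the dendrogram, so this sketch assumes Euclidean distance, nearest-neighbour (single) linkage, and a similarity of 100 × (1 − d / d_max), where d_max is the largest pairwise distance. The yearly profiles are hypothetical.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def single_linkage(profiles):
    """Agglomerative clustering with nearest-neighbour (single) linkage.

    profiles: dict mapping a label (e.g. a year) to a list of theme shares.
    Returns the merge history as tuples
    (cluster_a, cluster_b, distance, similarity), where
    similarity = 100 * (1 - distance / largest pairwise distance).
    """
    labels = list(profiles)
    dist = {frozenset((a, b)): euclidean(profiles[a], profiles[b])
            for i, a in enumerate(labels) for b in labels[i + 1:]}
    d_max = max(dist.values())
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d, 100.0 * (1.0 - d / d_max)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical theme shares (%) per year for three themes, NOT from Table 6.
profiles = {
    2001: [100, 0, 0],
    2008: [40, 35, 25],
    2009: [38, 36, 26],
    2016: [20, 30, 50],
    2017: [21, 29, 50],
}
merges = single_linkage(profiles)
for a, b, d, sim in merges:
    print(a, "+", b, f"distance={d:.2f}", f"similarity={sim:.1f}")
```

Years that merge at a high similarity level form a homogeneous period, whereas a year that joins only at a low similarity level, like the hypothetical 2001 profile here, behaves as an outlier.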
Figure 7 shows the profile of each of these phases.
Figure 7 presents the annual participation of the themes in the literature for the three evolutionary periods determined by the cluster analysis.
During the first period (2000–2006), the scientific community’s gaze was scattered over 26 themes, differently distributed over the whole period. In 2000, ten themes were present in the literature; in 2001, only one theme (BEV); in 2002, this number increased to 14; in 2003 and 2004, it decreased to 10; in 2005, it increased to 15; and in 2006, the number of different themes present in the literature reached its maximum, 20.
The participation of the themes in the literature varied over the years. In 2000, ten themes shared the same participation in the literature (10%); in 2002, the theme BEM stood out (24%); in 2003, two themes were highlighted, GRB and HVAC, with 19%; in 2004, two other themes stood out (DAT and REG), but this time with 12% of participation; in 2005, the theme DAT increased its participation to 25%; and in 2006, the theme BEM stood out with 17% of participation in the literature.
The low number of themes in 2001 is due to the fact that only the articles that reached five or more citations were considered, which leads to the conclusion that the production of articles addressing the Building Envelope was the most consistent in 2001.
Therefore, it can be seen that the evolution of this field of knowledge over this period did not exhibit any pattern.
The second period (2007–2011) is the shortest of the three periods (five years). It presented more consolidated themes than the previous one. Twenty-nine themes were explored over this period, and 15 of them were present in all five years. In 2007, 20 different themes were present in the literature; in 2008, 23 themes; in 2009, 26; in 2010, 24; and in 2011, 27. Thus, it is fair to conclude that the scientific community’s interest in this field of knowledge became more consistent.
It was in this period that the themes BIP, ENV, REG and THC reached their greatest participation in the literature. However, the theme BEM was by far the one most present in the literature, closely followed by DAT. The themes BAC, BRF, EMS, OCB, SMG and ZEB had a negligible participation in the literature over this period, while the participation of the themes BEV, BMS, GRB, HVAC, SMB, SUS and TOB shrank.
In the third period (2012–2018), all the thirty themes had been explored, 29 themes in 2012, 2013 and 2015; 28 themes in 2014; 30 themes in 2016, 2017 and 2018. Thus, it can be said that the scientific interest in this field of knowledge increased even more over this period.
The participation of the themes BEM, BIM, BRF, DAT, OCB and ZEB increased and, according to statistical analysis, they are in an upward trend. The participation of other themes such as BAC, EMS and SMG increased as well, but not enough to signal an upward trend. The interest in the themes ENV, GRB, HVAC, REG and SUS decreased. The other themes remained stable.
3.6. Interrelationships between Themes
According to Sun and Latora [
111], the interaction between themes within a field of knowledge reflects the flow of knowledge between the sub-areas of this field. Thus, in order to understand the evolution of this field, it is fundamentally important to define and study the interaction between the themes.
Many articles address multiple themes at once, which indicates an interaction between themes. The interaction between the themes i and j can be assessed by means of Equation (2):
$$I_{ij} = \frac{n_{ij}}{N_k} \times 100 \qquad (2)$$
where $n_{ij}$ is the number of articles that concurrently address the themes i and j, and $N_k$ is the number of articles for the considered period k (N1 = 149, N2 = 342 and N3 = 1509). Thus, $I_{ij}$ is the percentage of the articles produced during the period under investigation that address the themes i and j.
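The interaction measure described above, namely the percentage of a period's articles that address two themes concurrently, can be computed with a short sketch such as the following. The article list and the function name are hypothetical, and each article is assumed to be represented simply by the set of themes its keywords were classified into.

```python
from itertools import combinations

def interaction_matrix(articles, themes):
    """Return a symmetric matrix of pairwise theme interactions.

    Entry [i][j] is the percentage of articles in the period that address
    both themes[i] and themes[j]. The diagonal is left at zero because the
    measure is defined for pairs of distinct themes.
    """
    index = {theme: k for k, theme in enumerate(themes)}
    size = len(themes)
    counts = [[0] * size for _ in range(size)]
    for article_themes in articles:
        present = sorted(set(article_themes) & set(themes), key=index.get)
        for a, b in combinations(present, 2):
            counts[index[a]][index[b]] += 1
            counts[index[b]][index[a]] += 1
    total = len(articles)
    return [[100.0 * c / total for c in row] for row in counts]

# Hypothetical period with four articles, NOT the paper's data.
themes = ["BEM", "DAT", "HVAC", "THC"]
articles = [
    {"BEM", "DAT"},
    {"BEM", "DAT", "HVAC"},
    {"HVAC", "THC"},
    {"BEM"},
]
matrix = interaction_matrix(articles, themes)
for theme, row in zip(themes, matrix):
    print(theme, row)
```

Computing one such matrix per evolutionary period, each with its own article count as the denominator, yields one interrelationship matrix per period.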
Figure 8 presents a graphical representation of the model used to account for the interactions between themes.
Based on Figure 8, it can be seen that the interaction values can be stored in a symmetric matrix, called the interaction or interrelationship matrix. According to Equation (2), such a matrix varies depending on the evolutionary period.
Figure 9 shows the interrelationship matrix for each period.
The darker the fill colour, the greater the interaction between themes i and j.
Observing the matrix for the first period, it can be said that, during this period, this field of knowledge was driven in large part by themes concerned with sustainable development and thermal comfort. Also, it can be noticed that the greatest interaction occurred between HVAC-THC and LCA-SUS. It is possible to observe the emergence of the relationship between the themes BEM-DAT, which would increase until the end of the third period.
During the second period, the interest of the scientific community revolved more around the interaction between BEM-DAT; BEM-HVAC; GRB-SUS; HVAC-TOB and HVAC-THC.
The interaction between BEM-DAT is remarkable; it is by far the largest one, not only over the third period, but over the whole period covered by this research. Therefore, these two themes have been the great engine for developing the research on Building Energy Efficiency.
Since it is difficult to analyse and understand the interaction between the themes only by examining the interrelationship matrices in Figure 9, a visual representation of such matrices is valuable. Such a representation can be obtained by means of a data analysis technique known as multidimensional scaling [119], which allows the interrelationship of the themes to be represented in an abstract, two-dimensional Cartesian plot, as illustrated in Figure 10. Although such a representation is not perfect, it gives some insight into the interaction between themes. For instance, the greater the interaction between themes, the closer they are in the plot, forming clusters of synergetic themes. In other words: the closer the themes, the greater the flow of knowledge between them.
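As an illustration of how an interrelationship matrix can be turned into a two-dimensional plot, the sketch below implements classical (Torgerson) multidimensional scaling with the standard library only. The text does not state which MDS variant was used or how interactions were converted into distances. The conversion used here (largest interaction minus each interaction) and the example matrix are assumptions made purely for illustration.

```python
import math

def interaction_to_dissimilarity(interaction):
    """Assumed conversion: the strongest interaction maps to distance 0."""
    n = len(interaction)
    top = max(interaction[i][j] for i in range(n) for j in range(n) if i != j)
    return [[0.0 if i == j else float(top - interaction[i][j])
             for j in range(n)] for i in range(n)]

def double_centre(d):
    """Return B = -1/2 * J * D^2 * J for a symmetric dissimilarity matrix."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
            for i in range(n)]

def top_eigenpair(b, iterations=2000):
    """Largest algebraic eigenvalue and unit eigenvector of a symmetric matrix.

    Power iteration is run on b + shift*I, where the shift (a Gershgorin
    bound) makes every eigenvalue non-negative without changing their order.
    """
    n = len(b)
    shift = max(sum(abs(x) for x in r) for r in b)
    if shift == 0.0:
        return 0.0, [0.0] * n
    v = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

def classical_mds(d, dims=2):
    """Coordinates whose pairwise distances approximate the matrix d."""
    b = double_centre(d)
    n = len(b)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        lam, v = top_eigenpair(b)
        scale = math.sqrt(max(lam, 0.0))  # negative eigenvalues are dropped
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so that the next pass finds the following eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Hypothetical interaction percentages between four themes, NOT from Figure 9.
themes = ["BEM", "DAT", "HVAC", "THC"]
interaction = [[0, 9, 3, 1],
               [9, 0, 2, 1],
               [3, 2, 0, 6],
               [1, 1, 6, 0]]
coords = classical_mds(interaction_to_dissimilarity(interaction))
for theme, (x, y) in zip(themes, coords):
    print(f"{theme}: ({x:.2f}, {y:.2f})")
```

Because the assumed dissimilarities need not correspond to exact Euclidean distances, the resulting plot is an approximation, which is consistent with the remark that the representation is not perfect.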
The left side of
Figure 10 shows the participation of the themes over the three evolutionary stages. On the right side, three plots represent the interrelationship matrix between the themes for each of the three evolutionary stages.
The clusters shown in Figure 10 only include the themes for which the interaction is greater than 1.0. The distance between clusters and elements was assessed according to the nearest neighbour strategy [118].
Observing the plots for the three evolutionary periods, it should be noted that the themes have clustered around the origin of the plot as this field evolved. In general, although the two-dimensional projection can distort the representation of the themes, the more central a theme, the greater its interaction with the others.
In the first evolutionary period of this field of knowledge, significant interaction between themes related to thermal comfort (THC-HVAC), themes concerned with environmental/sustainability issues (SUS-ENV-LCA-GRB-SMB), and themes addressing modelling and data analysis techniques (BEM-DAT) can be seen.
The number of clusters dropped from the first to the second evolutionary period. The cluster BEM-DAT remained and came closer to the centre of the plot. These are cross-cutting themes. Some articles are devoted to revisiting a given theme and have an interest in comparing the results emerging from different data analysis techniques, in such a way that the modelling and data analysis become the kernel of the paper instead of being tools by means of which better results can be achieved. Such articles give little attention to the aspects concerned with Building Energy Efficiency, which are only the background and data source, while their main purpose is data analysis.
Still, within the second period, the themes LCA and SMB leave the environment/sustainability cluster because of the lack of interaction with the other themes. The interaction of the remaining themes with the thermal comfort cluster increased, resulting in the formation of a new cluster.
The cluster BAC-EMS had disappeared by this stage. Since the participation of both themes in the literature increased over this period, it is fair to assume that both themes developed in isolation, without sharing knowledge.
The number of isolated themes in this period was the largest amongst the evolutionary stages. Thus, it can be said that, during this period, the exchange of knowledge was the smallest.
The third stage is the one with the largest number of clusters and the smallest number of isolated themes. It can be considered the period with the greatest flow of knowledge between sub-areas within this field of knowledge.
The clusters THC-HVAC and BAC-EMS, from the first evolutionary stage, have been re-established, meaning that the themes within each cluster once again triggered knowledge production in each other.
The cluster BEM-DAT is even closer to the centre of the plot in this stage. According to the interaction matrix for the third period, in Figure 9, this cluster interacts with all the themes (interaction > 1.0) except the themes BIP and REG.
The cluster concerned with environment/sustainability in the first period was broken into three small clusters (LCA-ENV, SMB-SMG, and SUS-GRB-BIM), suggesting the exchange of more specialised knowledge. The flow of knowledge between themes related to sustainability and information modelling is noteworthy. As the latter theme shows an upward trend, it is quite possible that its development increases the knowledge production of themes related to green buildings and sustainability.
The themes BIP, TOB, REG, ZEB, BRF, and OCB developed in isolation over all three evolutionary stages. The latter three are in an upward trend, according to the trend analysis previously presented. Thus, a clear relation between trend and evolutionary development of a theme within the Building Energy Efficiency field of knowledge could not be seen.