1. Introduction
The perception of water as a resource is undergoing dramatic change. On the one hand, as a consequence of climate change, drought even in temperate latitudes and the lowering of groundwater levels in many areas are issues that now affect almost everyone [1,2]. On the other hand, the increase in extreme weather events and climate-related natural hazards poses a risk to more and more people [3,4]. It is not only the first United Nations Water Conference in almost 50 years, held in 2023, that is drawing attention to what has become a universal truth: water must be managed.
One source of water which has a significant impact on the sizing of water infrastructure systems is rainwater [5,6,7]. Storm water discharge to protect other infrastructure from damage, such as that seen, for example, in Los Angeles or Tokyo, is only one question for urban planners [8,9,10]. The integration of rainwater into the urban water cycle is a subject of water-sensitive urban design [11,12,13], with concepts such as green roof technology, living walls, and sponge cities as instantiations [14,15,16,17,18].
Rainwater harvesting (RWH) and use is a cornerstone of many of these concepts and contributes to influencing both peak water demands and storm water runoff [19,20]. In addition to simply installing a rainwater tank on a property, understanding neighborhoods or districts as local, decentrally managed water grids offers the possibility of resource balancing, as participants with low consumption or additional municipal buffers can make their harvest available to other participants in the network as needed [21].
Thus, when considering an RWH system as a design object, the questions are, first, how to size it according to the provision and consumption of rainwater [22], second, how to make it robust to changing requirements and premises over a long service life [23], and third, where to set the system boundaries for maximum benefit to all stakeholders. Current design practice and standards such as EN 16941-1 [24] tend to focus on individual aspects of the above. A widely used approach is the simulation of water mass balances based on yield and demand for each time step of the simulation for an isolated RWH system [6].
Considering the above questions, an appropriate design tool for RWH systems should generally be able to size the system components based on reliable and adequately resolved data, support the designer in detecting sensitivities in design parameters, e.g., due to changes in use patterns or premises, and compare different system configurations and boundaries. For such design tasks, the field of Knowledge-Based Engineering (KBE) offers methods and a comprehensive set of tools [25,26,27].
This article presents a modular KBE system for the design of residential rainwater harvesting and distribution systems, in which the authors apply KBE techniques such as model- and resource-based configuration and Bayesian decision networks. The following contributions are associated with this: First, the authors propose a probabilistic consumer model to make the prediction of water demand more robust and to also evaluate the effects of different influencing parameters over time. Second, the impacts of a networked RWH system in a neighborhood with a central buffer tank can be compared to those of isolated RWH systems so that appropriate design recommendations or design obligations can be made.
The article is organized as follows: First, in Section 2, the theoretical background of KBE systems in general, Bayesian networks as a tool for modeling uncertainties, and the sizing of RWH systems is presented. Section 3 then contains a description of the Bayesian network consumer model concept as well as the mental model and premises the authors used for system development. Afterwards, Section 4 presents the structure and setup of the implemented system, before Section 5 shows the application of the system for different configurations of a neighborhood as well as the generation of design knowledge by analyzing sensitivities of parameter changes for the provision and consumption of water. In Section 6, the application as well as the methodological approach are discussed before Section 7 concludes the article.
3. Model Development
Although several approaches that account for uncertainty about future parameters have been reported in the literature, they seem to focus heavily on the supply side [65]. The demand side is commonly modeled as average water consumption per person, supplemented by models for, e.g., garden irrigation during the summer months [66].
To assess the robustness of RWH systems, the authors aim to provide new impetus by introducing a probabilistic consumer model and a comparative assessment of isolated and networked RWH systems on multiple properties, allowing resource balancing between the individual systems. Therefore, a BN will represent the user behavior and allow the calculation of water demand. A model- and resource-based configuration approach for the RWH system then enables the comparison of different system configurations and reasoning about single design parameters.
3.1. Consumer Model
The consumer model consists of a BN with four layers, where the fourth layer represents a person’s total consumption based on their habits and behaviors.
Figure 2 shows the topology of the network and all influencing factors.
In the model, the single consumptions for shower, toilet, laundry, and car washing are represented and divided into six discrete areas for a first approximation. Sensitivities of
are considered to better assess transitions between discrete areas. The attributes of a person are set as nodes in the graph. Each node contains a CPT, where the size of the table depends on the number of parent nodes. The conditional probabilities themselves were populated on the basis of real statistical data from market and opinion research, as well as from economic and official statistics for the year 2021 with reference to the Federal Republic of Germany. The CPTs for the part consumptions used for the model are included in the appendix in Table A7, Table A8, Table A9 and Table A10. These part consumptions are in turn dependent on the user’s characteristics, which can be divided into coarse and fine filtering. Coarse filtering includes the distribution according to age and gender. Age was divided into six discrete ranges (Table 1), which are not equally distributed to take into account the greater mobility and fluctuation of residents of a younger age.
For fine filtering, the focus is more on the influencing factors for water consumption itself. These include digestive system diseases, hair length, sportiness, hygiene, place of work, and availability of a car, all broken down to their impact on single part consumption. For example, if a person works more from a home office, water consumption from toilet flushing is likely to increase while consumption from laundry is likely to decrease. In the same way, a high sportiness is likely to increase the consumption from showers and laundry. The CPTs for the fine filtering are attached in the appendix in Table A1, Table A2, Table A3, Table A4, Table A5 and Table A6. For adapting the model to other locations, the CPTs need to be updated with the probability distributions of that location by substituting the corresponding values.
In the first stage of implementation, binary values were assumed for the user characteristics, so that the more probable value was assumed for the update of the BN. Based on this, the conditional probabilities of the consumers are updated. The initial level of the BN represents the statistically likely consumption for an individual. The inference of the BN is performed using a junction tree algorithm, which takes the nodes as intersections and divides the graph into small decision trees so that it can update the probabilities step by step. To this end, from the BN, which is modeled as a directed acyclic graph (DAG), first, an undirected graph, called the moral graph, is constructed, where the parent nodes of a common child are connected [56,74]. Subsequently, more edges are added to divide the graph into triangles of nodes. From these triangles, clusters are determined which consist of subsets of nodes from the triangulated graph. In the last step, a junction tree is formed from the clusters, which allows minimization of computational time [75].
Figure 3 shows first a BN in the form of a DAG, second the undirected triangulated graph, and third the junction tree derived from it.
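The moralization step described above can be illustrated with a short Python sketch. It is not the authors' MATLAB implementation; the function name `moralize` and the small example graph are hypothetical and only loosely modeled on the consumer network. The sketch covers moralization only, not triangulation or junction tree construction.

```python
from itertools import combinations

def moralize(parents):
    """Return the undirected edge set of the moral graph of a DAG.

    `parents` maps each node to the list of its parent nodes.
    """
    edges = set()
    for child, pars in parents.items():
        # Keep every original arc, but drop its direction.
        for p in pars:
            edges.add(frozenset((p, child)))
        # "Marry" the parents: connect every pair that shares this child.
        for a, b in combinations(pars, 2):
            edges.add(frozenset((a, b)))
    return edges

# Hypothetical fragment: gender and age influence hair length,
# hair length and sportiness influence shower consumption.
dag = {
    "gender": [],
    "age": [],
    "sportiness": [],
    "hair_length": ["gender", "age"],
    "shower": ["hair_length", "sportiness"],
}
moral_edges = moralize(dag)
```

In this example the four original arcs are kept as undirected edges, and two new edges appear: one between `gender` and `age`, and one between `hair_length` and `sportiness`.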
Based on the determined junction tree, the conditional probabilities can be calculated. According to Huang and Darwiche [75], the junction tree is initialized with the probabilities of the nodes and the observations are introduced, which leaves the junction tree in an inconsistent state. Propagation, for example by message passing, then restores consistency within the junction tree. In the last step, the conditional probabilities have to be marginalized and normalized so that the posterior probabilities sum to 1 again. This form of algorithm works well for smaller BNs, as in this example, where few discrete values are used. As soon as the BN becomes larger or even continuous, the inference must be performed using sampling algorithms. Now that the BN has been modeled, the user-induced water consumption can be determined by updating the probabilities for the user property nodes. For example, if it is known that the person is female and between 14 and 24 years old, the probability of long hair increases from 29.87% to 70% and so does the assumed water consumption for showering.
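The effect of entering evidence can be reproduced on a toy network by brute-force enumeration, which for such a small discrete network gives the same posterior as junction tree propagation. The following Python sketch is an illustration only: all priors and CPT entries are hypothetical and are not the values from the appendix tables, and the two-parent structure (gender and age as parents of hair length) is a simplifying assumption.

```python
from itertools import product

# Hypothetical priors and CPT; not the values from the article's appendix.
p_gender = {"female": 0.5, "male": 0.5}
p_age = {"14-24": 0.2, "other": 0.8}
p_long_hair = {  # P(long hair | gender, age)
    ("female", "14-24"): 0.70,
    ("female", "other"): 0.45,
    ("male", "14-24"): 0.10,
    ("male", "other"): 0.05,
}

def prob_long_hair(evidence=None):
    """P(long hair | evidence), computed by enumerating all parent states."""
    evidence = evidence or {}
    num = den = 0.0
    for g, a in product(p_gender, p_age):
        # Skip parent states that contradict the observed evidence.
        if evidence.get("gender", g) != g or evidence.get("age", a) != a:
            continue
        weight = p_gender[g] * p_age[a]
        num += weight * p_long_hair[(g, a)]
        den += weight
    # Normalize so that the posterior is a proper probability.
    return num / den

prior = prob_long_hair()
posterior = prob_long_hair({"gender": "female", "age": "14-24"})
```

With these assumed numbers, the marginal probability of long hair is 0.28 without evidence and rises to 0.70 once gender and age are observed, mirroring the kind of shift described in the text.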
As a simplified alternative to the BN, a second calculation model uses a virtual tree diagram, which is composed of the properties of a resident to show a statistically probable water consumption. For this purpose, the calculated path probabilities of all possible property combinations of a person are determined. The six combinations with the highest probability are set as possible standard residents.
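A minimal Python sketch of the tree diagram idea is given below. It enumerates all property combinations, multiplies the probabilities along each path, and keeps the most probable ones. The attribute names and probabilities are hypothetical, and the properties are treated as independent for brevity, which is a simplifying assumption (a real tree diagram would use conditional probabilities along each path).

```python
from itertools import product
from math import prod

# Hypothetical resident properties with assumed probabilities.
attributes = {
    "gender": {"female": 0.51, "male": 0.49},
    "car": {"yes": 0.8, "no": 0.2},
    "home_office": {"yes": 0.25, "no": 0.75},
}

def top_profiles(attrs, k=6):
    """Return the k most probable property combinations as (p, profile)."""
    names = list(attrs)
    paths = []
    for combo in product(*(attrs[n].items() for n in names)):
        values = [value for value, _ in combo]
        # Path probability: product of the probabilities along the path.
        p = prod(probability for _, probability in combo)
        paths.append((p, dict(zip(names, values))))
    paths.sort(key=lambda item: item[0], reverse=True)
    return paths[:k]

standard_residents = top_profiles(attributes, k=6)
```

With three binary properties there are eight paths, of which the six most probable are kept as standard residents.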
3.2. System Dynamics Model and Premises
The basis of the KBE system for networked rainwater harvesting and distribution systems is a model of the respective water grid. In this context, the representation of stocks and flows emphasizes the resource-based modeling approach that considers sources and sinks of the network as well as different characteristics for inflows and outflows.
Figure 4 shows a single residential unit as a System Dynamics model, while Figure 5 presents the System Dynamics model for a neighborhood consisting of 10 properties, each instantiating the single residential unit model above. At this stage, it was not the aim to fully reproduce the urban water cycle, e.g., as introduced in [11], but to visualize the grid and the premises explained below.
Rainwater, cleaning water, and tap water are assumed to be the main sources for water collection considering a single residential unit, while toilet flushing, shower, laundry, car cleaning, and garden irrigation are assumed to be the main consumers. Rainwater was modeled on the historical data from the German Weather Service (DWD) on a day-based resolution for the past 5 years (1.1), so that seasonal fluctuations could also be taken into account. These data are publicly available and can later be automatically retrieved from the KBE system after the designer enters the location of the RWH system to be designed. To model the actual yield of rainwater, the roof yield coefficient (2) represents different roof types and is stored as an efficiency table based on the values of EN 16941-1. Additionally, the authors introduce a catchment yield coefficient (2.1) as different commercially available rainwater collectors have their own efficiencies, which are also stored as tables based on real provider data. As an option, a first-flush diverter (3) reduces the yield from each rainfall event by 0.33
, according to [66], so that no contaminants enter the storage tank.
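The yield chain described above (precipitation, roof yield coefficient, catchment yield coefficient, optional first-flush diversion) can be sketched in Python as follows. The function and parameter names are hypothetical. Treating the first-flush loss as a rain depth in millimetres and treating each rainy day as one rainfall event are both assumptions of this sketch, and the coefficient values in the usage lines are placeholders rather than values from EN 16941-1.

```python
def daily_yield_litres(rain_mm, roof_area_m2, roof_coeff, catchment_coeff,
                       first_flush_mm=0.0):
    """Estimate the harvested rainwater for one day in litres.

    Assumption: the first-flush diverter removes a fixed rain depth
    (first_flush_mm) from each day's rainfall before collection.
    """
    effective_mm = max(rain_mm - first_flush_mm, 0.0)
    # 1 mm of rain on 1 m2 of roof corresponds to 1 litre of water.
    return effective_mm * roof_area_m2 * roof_coeff * catchment_coeff

# Placeholder inputs: 10 mm of rain on a 100 m2 roof.
without_diverter = daily_yield_litres(10.0, 100.0, 0.8, 0.9)
with_diverter = daily_yield_litres(10.0, 100.0, 0.8, 0.9, first_flush_mm=2.0)
```

Keeping the two coefficients as separate factors makes it easy to swap in table values for a different roof type or a different collector independently.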
The second source of water is a household internal water cycle that reuses water, e.g., from cleaning vegetables, where no chemical detergents occur. Therefore, it is assumed that the water is collected in a small tank in the kitchen and then manually transferred into the storage tank. Cleaning water depends on the number of people and the probability of cooking in the household itself (4.1) and how often the water is supplied to the tank, e.g., twice a week (4.2).
The tap water (5) represents the water from the mains water supply, e.g., provided by the local distributor. Tap water is added when the consumption (5.1) exceeds the collected water from the other sources. Its amount is a design parameter to be minimized by the system. To be able to integrate further water sources into the system, a respective placeholder is included. The water from all sources is first led into the water tank (7) before it is transferred from there to the various sinks. Losses due to leakage and evaporation (7.1) could be considered to model the efficiency of the RWH system but are neglected in this stage of implementation as they are not considered as a design-determining variable. To clean the water tank in regular cycles, the tank is allowed to overflow (8). The overflow is considered as a design variable here, in the sense that the cistern should overflow at least a given number of times per year by a given volume. This is included in the later calculation of the mass balances.
As for user-induced consumers, the model integrates toilet (9), shower (10), laundry (11), and car cleaning (12) with their daily water demand, based on either values from standards or provider data. The influencing factors (10.1, 11.1, and 12.1) map the characteristics from the consumer model described above. The garden (13) is considered a context-specific consumer, as it depends on the garden size and needs to be watered differently depending on the outdoor temperature (13.1). The current implementation uses an average irrigation demand of 2.5 L per square meter per day, which is integrated into the mass balances when the temperature is above 15 °C. To be able to expand the system on the consumer side, it is also possible to model additional consumers (14). For evaluating different scenarios for tank size and for resource balancing procedures, an additional buffer tank (15) is implemented in the model.
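The garden irrigation rule stated above (2.5 L per square meter per day when the temperature is above 15 °C) translates directly into a small Python sketch. The function name is hypothetical, and whether the threshold refers to the daily mean or maximum temperature is not specified here, so the input is simply taken as one temperature value per day.

```python
def garden_demand_litres(garden_area_m2, daily_temps_c,
                         litres_per_m2=2.5, threshold_c=15.0):
    """Return the irrigation demand per day in litres.

    Irrigation is only counted on days whose temperature exceeds the
    threshold; on all other days the demand is zero.
    """
    return [
        garden_area_m2 * litres_per_m2 if temp > threshold_c else 0.0
        for temp in daily_temps_c
    ]

# A 100 m2 garden over three days at 10, 15 and 20 degrees Celsius.
demand = garden_demand_litres(100.0, [10.0, 15.0, 20.0])
# demand == [0.0, 0.0, 250.0]
```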
Another System Dynamics model represents the super system of a neighborhood with ten residential units (Figure 5). The use of a central buffer to balance resources between individual housing units plays a key role in this. Note that the central buffer does not have its own catchment, such as sidewalk gutters, but is fed only by the connected residential units. Each residential unit contains its own water tank, into which the entire water collection of the residential unit enters as input and the total consumption of the residential unit leaves the tank as output. Surplus water can be fed into the buffer tank by the individual housing units and can also be retrieved when the buffer tank is filled (15.1). In addition, the piping network (15.2) has to be filled before the water reaches the consumers so that the length and pipe dimensions are also stored.
4. Implementation
The system was implemented in MATLAB, version R2022a, as it offers many toolboxes, a high number of numerical algorithms, and a simple visualization [60].
Figure 6 shows the basic program flow chart.
After starting the program, the input window opens, where the user can select the location (Figure 7a). The possible locations result from the location list of the measuring stations of the German Weather Service (DWD). The script downloads the corresponding local precipitation and temperature data and prepares them in the targeted resolution. In addition, the approach for the consumption calculation is requested in this tab where the designer selects between the BN and the simplified virtual tree diagram. Furthermore, the designer chooses whether the multifamily house mode (MFHM) should be activated. If so, the frame of reference changes from a neighborhood with ten properties to a multifamily house with ten flats so that the demand of the residents can be calculated accordingly. In the MFHM, only the roof area of house one is considered, and the interconnection of several tanks is deactivated.
Figure 7b shows the input window for the respective houses/apartments, where the number of inhabitants, roof area, roof coefficient, size of the garden, and the single consumers of rainwater are requested. The last option allows the developer to compare different scenarios, e.g., when no car cleaning is allowed, and thus investigate the sensitivity of such measures on the system’s behavior.
Afterwards, the yield is calculated using the building specifications, the precipitation data, and first-flush diversion. The system also instantiates the household’s internal water cycle. Depending on the number of people, it is therefore estimated that between 12 and 30 L of cleaning water is produced per week and returned to the storage tank. Based on statistics for the monthly number of home-cooked meals, a weighted random generator maps cooking behavior week by week for each household.
In the next step, the demand for the units is calculated. The system populates the residential units based on the chosen demand calculation method. The consumption per day and person for toilet flushing (24 L), showering (35 L), laundry (15 L), and car cleaning (400 L) are assumed as a starting point. These values are multiplied by factors derived from the generated user portfolios so that the user-induced influence can be considered to a greater extent. To create the BN in MATLAB, a library called Bayes Net Toolbox (BNT) was used, which was developed by Kevin Murphy in 1997 and kept up to date until 2014 [60]. The graph of the BN is represented in BNT as a matrix, where the rows and columns represent the nodes and the entries within the matrix represent the connection between these nodes as arcs. Once each consumer’s individual factors are determined based on user characteristics, the resulting consumptions are added together to create a total water consumption per person. This step is performed for all individuals in a household, which in turn leads to a total water consumption for a residential unit. Once all water consumptions per person and per unit have been determined, all occupant profiles and the resulting consumptions of all houses are saved in a text file for documentation purposes.
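The scaling of base consumptions by user-specific factors and the aggregation to a household total can be sketched in Python as follows. The base values are the starting points named in the text; the factor values and the function names are hypothetical, and in the actual system the factors would come from the BN or the tree diagram rather than being entered by hand.

```python
# Starting values per day and person as named in the text (litres).
BASE_LITRES_PER_DAY = {
    "toilet": 24.0,
    "shower": 35.0,
    "laundry": 15.0,
    "car": 400.0,
}

def person_demand(factors):
    """Scale each base consumption by the person's factor (default 1.0)."""
    return {
        use: base * factors.get(use, 1.0)
        for use, base in BASE_LITRES_PER_DAY.items()
    }

def household_demand(residents):
    """Sum the scaled consumptions of all residents of one unit."""
    return sum(sum(person_demand(f).values()) for f in residents)

# Hypothetical factors for a two-person household.
household = [
    {"shower": 1.2, "laundry": 1.1, "car": 0.0},  # sporty, no car
    {"toilet": 1.3, "car": 0.01},                 # home office, rare car wash
]
total_litres = household_demand(household)
```

For the assumed factors, the first resident contributes 82.5 L and the second 85.2 L, giving a household total of 167.7 L per day.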
The context-specific consumptions are then calculated according to the user-induced factors. For the application example, the temperature distributions for the selected location are analyzed to calculate the water consumption for garden irrigation at temperatures above 15 °C. As an intermediate result, the water consumption per household is output in a monthly resolution so that, on the one hand, tank sizes can be determined on this basis and, on the other hand, resource balancing can be performed.
Considering the calculated yields and consumption, the monthly tank volumes are calculated based on the mass balances for the residential units according to EN 16941-1. The self-sufficiency period is also considered as a design variable and set to a default of 21 days following the standard. Finally, the median volume of the individual monthly volumes is selected. Yields, consumption, and tank volumes are plotted per house and displayed as bar graphs. The tank must overflow regularly to drain off contaminants on the water surface, the so-called floating layer, such as leaves and pollen. It is assumed that the tank should overflow three times a year and drain two percent of the nominal volume. To accomplish this, the year is divided into three sections and the maximum fill level is determined in each section. The difference between the current tank content and the target volume (102% of the nominal volume) is then covered with tap water. A monthly approach is taken for the final assessment of tank size for a unit so that fluctuations and outliers can be compensated for.
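To make the notion of a mass balance concrete, the following Python sketch steps a single tank through a series of days, topping up with tap water when the tank runs dry and recording overflow when it is full. This is a generic storage balance under simplifying assumptions (demand is withdrawn before spillage is evaluated, no losses), not the sizing procedure of EN 16941-1 or the authors' implementation; all names and numbers are hypothetical.

```python
def simulate_tank(yields, demands, capacity, start_level=0.0):
    """Run a daily mass balance for one tank (all quantities in litres).

    Returns the fill level after each day, the total tap water needed
    to cover deficits, and the total overflow.
    """
    level = start_level
    tap_water = 0.0
    overflow = 0.0
    levels = []
    for inflow, demand in zip(yields, demands):
        level += inflow - demand
        if level < 0.0:
            # Deficit: the missing volume is supplied from the mains.
            tap_water += -level
            level = 0.0
        if level > capacity:
            # Surplus beyond the tank capacity leaves via the overflow.
            overflow += level - capacity
            level = capacity
        levels.append(level)
    return levels, tap_water, overflow

levels, tap_water, overflow = simulate_tank(
    yields=[100.0, 0.0, 500.0],
    demands=[50.0, 80.0, 20.0],
    capacity=300.0,
)
# levels == [50.0, 0.0, 300.0], tap_water == 30.0, overflow == 180.0
```

Running such a balance for several candidate capacities shows how tank size trades off tap water demand against overflow volume.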
Up to this point, the program has viewed each RWH system as an isolated individual. For the networked view of the water grid, the program follows the model shown in Figure 5 with the central buffer tank. For resource balancing, a daily resolution is used to better analyze when and how much water can be exchanged and how many days can be covered with the buffer tank. A balanced matrix is created in which all relevant data are stored. The consumptions are divided into A, B, and C categories. The classification is based on the priority with which the individual consumptions are to be covered. A consumptions are toilet, shower, and laundry and should be covered with high priority. B consumptions are for garden irrigation and are fed when there is still water left after feeding the A consumptions. C-category water is used for car washing. It has the lowest demand priority and is fed only when A and B consumptions are covered. In the case of water rationing, B and C consumers can be blocked.
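The A/B/C priority rule can be expressed as a short Python sketch. The function name and the `blocked` parameter used to model rationing are hypothetical; the order of service follows the description above.

```python
def serve_by_priority(available, demands, blocked=()):
    """Cover demands in the order A, B, C from the available water.

    `demands` maps a category to its demand in litres; categories listed
    in `blocked` receive nothing (water rationing).
    """
    served = {}
    for category in ("A", "B", "C"):
        if category in blocked:
            served[category] = 0.0
            continue
        granted = min(available, demands.get(category, 0.0))
        served[category] = granted
        available -= granted
    return served, available

demands = {"A": 70.0, "B": 50.0, "C": 400.0}
served, left = serve_by_priority(100.0, demands)
# served == {"A": 70.0, "B": 30.0, "C": 0.0}, left == 0.0
rationed, left_rationed = serve_by_priority(500.0, demands, blocked=("B", "C"))
# rationed == {"A": 70.0, "B": 0.0, "C": 0.0}, left_rationed == 430.0
```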
To be able to investigate and evaluate the aspect of resource balancing, two scenarios are distinguished, one being the allocation of shared water evenly according to per-head consumption and the other being allocation using prioritization. For the per-head allocation scenario, the current amount of water in the buffer tank is allocated proportionally to the total number of users. If a deficit remains in the user’s balance despite the allocated amount of water, the tank must be refilled with tap water. If the household has more water than it needs, the unused balance remains in the buffer tank.
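One possible reading of the per-head allocation is sketched below in Python: each household is entitled to a share of the buffer content proportional to its number of residents, uses at most what it needs, and covers any remaining deficit with tap water. The data layout and names are hypothetical, and the sketch assumes at least one resident in total.

```python
def allocate_per_head(buffer_litres, households):
    """Split the buffer content proportionally to household head counts.

    `households` is a list of dicts with the keys 'residents' and
    'deficit' (litres still missing after the household's own harvest).
    """
    total_heads = sum(h["residents"] for h in households)
    remaining = buffer_litres
    results = []
    for h in households:
        share = buffer_litres * h["residents"] / total_heads
        used = min(share, h["deficit"])
        remaining -= used
        # Whatever the share cannot cover is refilled from the mains.
        results.append({"from_buffer": used, "tap_water": h["deficit"] - used})
    return results, remaining

results, remaining = allocate_per_head(
    100.0,
    [{"residents": 2, "deficit": 80.0}, {"residents": 3, "deficit": 10.0}],
)
```

In this example the first household receives its full share of 40 L and still needs 40 L of tap water, while the second household uses only 10 L of its 60 L share, so 50 L remain in the buffer tank.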
The prioritization scenario considers both demand and the composition of the single sources and sinks, so that a bonus or malus can be set. For example, for the bonus, small differences are served first or the water that is supplied from external sources and cleaning water is subtracted from the difference so that the delivery of additional water is rewarded with a higher score. If the user uses most of their water for B and C consumption, they will again receive less water as a malus. The daily score is stored in the balanced matrix. On each day that households request water from the buffer tank, these requests are sorted by their score and served in order until all requests are met or the buffer tank is empty.
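The serving order in the prioritization scenario can be sketched in Python as below. How the bonus and malus are combined into a score is not reproduced here; the scores are simply taken as given inputs, and the tuple layout and function name are hypothetical.

```python
def serve_requests(buffer_litres, requests):
    """Serve buffer requests in descending score order.

    `requests` is a list of (household, litres, score) tuples. Each
    request is granted as far as the buffer content allows.
    """
    served = {}
    ranked = sorted(requests, key=lambda request: request[2], reverse=True)
    for household, litres, _score in ranked:
        granted = min(litres, buffer_litres)
        served[household] = granted
        buffer_litres -= granted
    return served, buffer_litres

served_scored, left_scored = serve_requests(
    100.0,
    [("H1", 80.0, 0.2), ("H2", 50.0, 0.9), ("H3", 30.0, 0.5)],
)
# served_scored == {"H2": 50.0, "H3": 30.0, "H1": 20.0}, left_scored == 0.0
```

Here the household with the lowest score is served last and receives only the 20 L that remain after the higher-ranked requests are met.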
6. Discussion
The above examples show the applicability of the implemented KBE system for the design of networked RWH systems. The consumer model allows conclusions to be drawn about the total water consumption of a household based on the composition and behaviors of its inhabitants. It thus allows the demand side of the mass balances to be simulated with a consideration of uncertainties in the requirements and extends the possibilities from the standards and most of the systems in the literature. A question that remains is the quality of the data used for modeling the probabilities. For this work, statistical assumptions obtained from available databases allowed the differentiation of different profiles according to age, gender, and individual habits. The installation of smart meters and recording their data on the resource streams within the water cycle of a standardized sample neighborhood would, of course, make the sizing more accurate. Additionally, the choice of the weather data set also has a direct impact on the calculation and represents an uncertainty for the inflow prediction. In the above scenario, the median data set of the last five years was chosen, but calculations can also be performed and compared for the median, wettest, and driest years. Complementing this with a fully functional rainfall generator, such as that mentioned in Section 2.3, would then allow for assessing the robustness of the RWH systems, e.g., with calculated mass balances ten, twenty, and forty years in the future.
The actual premises result in different avenues for model refinement. Regarding the catchment, the implementation of a first-flush model, such as that proposed in [66], instead of the fixed value, as well as a model for the gutter capacity, which limits the catchment during heavy rain, such as that mentioned in [20], seems promising to raise the precision of the inflow prediction. Regarding the consumers, a more sophisticated garden irrigation model that distinguishes different types of beds and fields and integrates evaporation to calculate the irrigation water demand more precisely would be interesting, also with a view to an integrated simulation of smart irrigation control. Additionally, a calculation of the evaporation rates of pools or ponds in the garden and the thus required replenishment is conceivable. Finally, the efficiency of the RWH system in terms of leakage and losses could be integrated. In the context of model refinement, it would generally be interesting to determine the effects of data resolution. As is known from the literature, especially for small storage tanks below the size of a cubic meter, a resolution at a sub-hour level improves the calculation quality of the mass balances significantly. In the above examples, several tanks pose this challenge.
In thinking about a completely decentralized water supply and the degree of self-sufficiency for a quarter, it is also possible to consider a wider range of sources and sinks. The goal must then be to further reduce or even replace the amount of tap water from the mains water supply. In addition to the obvious options of further increasing yield, drilling a well on the individual properties or connecting a well to the buffer tank, local wastewater treatment facilities are particularly interesting from the point of view of sustainability to increase water recycling. An alternative way to increase utilization here would be to sequence the different consumers. For example, the initially collected rainwater can be used for showering, and after the surfactants have been filtered out the water could still be used for toilet flushing or garden irrigation. As for additional sinks with a focus on residential districts, the necessary regular flushing of the sewage system could be integrated and also linked to times when the buffer tank is well filled. However, new applications from the construction sector might also be interesting, especially under the aspect of ecological building, e.g., the idea of adiabatic building cooling. In this natural cooling principle, rainwater is injected into the exhaust air of the building and cools it by evaporation. An air-to-air heat exchanger thus cools down the building’s supply of air. As a result, the energy required for building air conditioning can be reduced by up to 70%. Per cubic meter of rainwater, 700 kWh of cooling capacity is possible [76].
A further step in the development of the presented KBE system besides model refinement is the extension with configurable instrumentation data, including commercially available tanks, the choice of available pumps in accordance with the water demands of the individual consumers, and a calculation of necessary pipe diameters. In this way, the KBE system would output a basic bill of materials with the main components of the water grid. Coupling this to a 3D computer-aided design system and adding information about the geometric configuration of the properties, their buildings, and down pipe positions, for example, would then allow for building a design generator for water grids. The piping is then part of the output geometric data and could further be used for hydraulic simulations and for visualization of the resource streams in the grid itself. Adding relevant data about maintenance intervals, wear parts, and consumables, such as filter inlays, for the single water grid components, would then allow the quality of service cost estimations to be improved [77].
7. Conclusions
The authors successfully applied methods and tools from KBE to the design objects of RWH systems with different system boundaries. Designers are able (1) to investigate the effects of different catchment areas or alternatively calculate needed catchment areas according to the occurring demands, (2) to adjust or minimize the storage tank sizes and evaluate their effects on the individual harvest and the exchange with the central buffer, (3) to evaluate the demands within a neighborhood either with respect to maximum peak water demands or the temporal development over the yearly projection, and (4) to test the sensitivities of the single sinks and sources to the water grid. For urban planners, this offers the possibility, e.g., to make design obligations for housing construction or for the refurbishment of settlements.
In this increment, the KBE system is intended as a design support system in which the necessary measures for optimization and the comparison of different system configurations are still performed by a human designer. However, the optimization of the local water network still requires experience. Some of the individual design variables influence each other, while individual measures such as shortening the self-sufficiency period appear counterintuitive at first glance. To fully automate the design, the different viewpoints that occur in the design need to be mapped into the system. Following the principle of distributed artificial intelligence, a multi-agent system approach could be an interesting option. It would additionally allow the discussion and negotiation of the individual agents to be followed, in order to find the global optimum of the system configuration and to be explainable and trustworthy. The multi-agent system approach could also be of interest for the online management of existing water networks, e.g., integrating revenue models for surplus water offered to other network participants.