1. Introduction
Flooding is a significant and recurring issue in Nebraska. In the past 100 years alone, five historic floods have been recorded [1]. The most recent of those historic floods was the March 2019 flood, during which several new records were set for river crests, snowfall prior to the event, and precipitable water values [2]. The flood caused extensive damage to infrastructure and personal property, including the destruction of bridges and roads, the breaching of 50 levees, and more. Five lives were lost, and thousands more were forced to evacuate their homes and businesses [1]. By August of 2019, flood damage cost estimates had reached $3 billion [2].
In addition to the widespread infrastructure damage inflicted, Nebraska’s agriculture industry also suffered heavy repercussions from the March 2019 flood. Cows and calves died or were stranded, crop fields were flooded or left covered in chunks of river ice, and grain stores were contaminated, resulting in approximately $400 million in cattle losses and $440 million in crop losses [3]. Destruction of roads due to the floods posed further challenges for farmers beyond immediate losses, as many farms suddenly became inaccessible or could be reached only by detours of tens of extra miles in order to tend to livestock. According to an estimate made by the President of the Nebraska Farm Bureau, the additional transportation costs of these detours, as well as related costs of fuel and feed, were costing the Nebraska cattle industry roughly $1 million a day at the time [4]. The flooding of March 2019 was a devastating example of the serious damage to crops and livestock that floods can inflict on farmers and rural areas.
The March 2019 flood occurred as a result of the culmination of several meteorological and hydrological events. The warm and wet start to winter resulted in unfrozen ground that was heavily saturated with moisture, which then froze following a shift to colder temperatures in late January; early winter conditions were approximately 1–2 °C warmer than average, while late January conditions ranged from 3–5 °C colder than average and resulted in frost depths of 60–90 cm. Record-breaking snowfall followed, developing a significant snowpack with a snow-water equivalent of 3–10 cm, and rivers froze, creating the potential for ice jams. When warmer temperatures returned coincident with a cyclone that brought 25–50 mm and 40–75 mm of rain across northeastern and central Nebraska, respectively, the excessive runoff generated from the combination of snowmelt, precipitation, and rain-on-snow events could not percolate into the frozen, saturated ground, thus overwhelming rivers and creating significant flooding [2].
The mechanism by which the March 2019 flood was produced, along with its repercussions, highlights the importance of understanding the interactions of antecedent hydrological and meteorological factors in creating flooding. Increased understanding of conditions that contribute to flooding can help with forecasting and decision-making [5]. As seen during the March 2019 flood, monitoring and modeling of several hydrological and meteorological inputs allowed for forecasts and warnings to be put out in advance of the flood event, which likely saved many lives and much personal property [2]. Further improvements to flood models and additions to the understanding of flood development can aid in yielding more accurate and timely predictions for flooding.
This study aimed to contribute to the improvement of flood forecasting and hydrologic process understanding in a specific rural watershed in Nebraska, Shell Creek Watershed (SCW). Furthermore, preliminary insights were drawn from the results of this study regarding the effects of conservation practices on attenuating flooding in the watershed. SCW provides a unique example of a watershed that has had a comprehensive and strategic conservation plan aimed at addressing flooding and water quality issues, and this plan has been in place for a long enough span of time that observations can reasonably be drawn regarding the relationship between conservation practices and flood attenuation. This study opens the door to such discussion, laying groundwork and preliminary observations for future researchers to build upon when analyzing the impact of SCW’s conservation plan on flooding.
SCW is a major tributary of the Platte River located in east-central Nebraska. The watershed is largely agrarian, with 93% of its area designated as agricultural land, and has a history of chronic flooding [6]. In the past 80 years, SCW experienced 17 river crests above flood stage, with an additional seven above action stage [7]. As previously exemplified, flooding can cause devastating damage and setbacks to Nebraskan farms through crop and cattle losses, making the frequent flooding of SCW an important matter of concern. In the early 1990s, some farmers began implementing conservation practices on their land in efforts to reduce flood risk on a small scale. Later, in 1999, the Shell Creek Watershed Improvement Group (SCWIG) organized to address flooding in SCW, then shifted their focus towards water quality impairments in the watershed and drove the development and implementation of the 2005 Shell Creek Watershed Management Plan. The plan was fully implemented in 2015; however, certain water quality impairments and other issues remained unresolved, warranting the development of the 2016 Shell Creek Watershed Environmental Enhancement Plan to improve soil health, reduce runoff, improve water quality, and address stream conditions impacted by watershed degradation. The 2016 plan is to be implemented in several phases extending to 2032 and beyond, and efforts will require a more detailed understanding of the hydrology of SCW [6]. The existing restoration efforts and conservation practices implemented over the course of these management plans have been credited anecdotally with somewhat reducing flooding, an observation that warrants exploration and discussion and, as yet, lacks validation. This paper’s research on flood modeling of SCW will aid in addressing the demand for further understanding of SCW hydrological processes, investigate potential changes in streamflow patterns, and support the improvement of forecasting for the watershed’s frequent floods. Furthermore, analysis of the impact conservation practices have had on flood attenuation in the watershed will aid managers in making decisions regarding future conservation implementations.
To model flooding in SCW, a feedforward artificial neural network (ANN) was developed, trained, and tested in MATLAB. ANNs are useful for their ability to determine relationships between given inputs and recorded outputs of a process without explicit physical process information [8,9,10] and have been successfully employed in several hydrological applications [11,12,13,14,15].
This study designed and trained a single-hidden-layer feedforward ANN to model the rainfall-runoff process in SCW using varying combinations of potential flooding factors. Building a rainfall-runoff model allows for a deeper understanding of the hydrological processes, specifically flooding, in SCW, and aids in the accomplishment of the four main objectives of this study: (1) characterizing the efficacy of the selected controlling factors in SCW flood prediction, (2) building a rainfall-runoff model for the watershed using ANN, (3) assessing drought intensity and flood frequency in the watershed, and (4) drawing preliminary insights on the effects of conservation practices on flooding in the watershed.
2. Materials and Methods
2.1. Study Area
Shell Creek Watershed is a rural Nebraskan watershed that serves as a major tributary of the Lower Platte River, which eventually feeds into the Missouri River. Approximately 193 km long and spanning 123,391 hectares of land, the watershed runs through Antelope County, Boone County, Madison County, Platte County, and Colfax County. The cropland map of SCW is shown in Figure 1. Land use in SCW is predominantly agricultural, with 93% of the watershed area dedicated to farmland. The main agricultural crops of the watershed by land-use area are corn (48%) and soy (28%), and other agricultural products include swine, cattle, and alfalfa. Developed land only takes up about 4% of SCW. Land use immediately adjacent to the channel (305 m to either side) is 73% crop cultivation and 11% grassland and grazing, with the rest populated by forest, wetlands, or development. In the upper half of SCW, cropland is often farmed up to the edge of the channel. In the lower half, the edges of the channel are typically lined with a narrow forest buffer [6].
The meteorological patterns of SCW are seasonal, and thus average temperature, precipitation, and streamflow in the region vary by season. Summers are warm, with an average temperature of 23.9 °C. Precipitation falls as rain in the form of showers and storms during summer, reaching a cumulative 50.8 cm from April to September out of the total annual 66 cm. The average summer surface water flow is 2.04 m³/s. During winter, temperatures average −4.4 °C, and precipitation is mainly in the form of snow. Average snowfall is about 63.5 cm, and average surface water flow is 0.71 m³/s. The average mid-afternoon relative humidity in SCW is 60% [6].
Like most of eastern Nebraska, SCW is characterized by rolling hills of easily erodible soils [6]. The majority of soils found in SCW are classified in the B Hydrologic Soil Group: moderately deep to deep soils that are between 10% and 20% clay and 50% to 90% sand in composition. These soils have a moderate infiltration rate when thoroughly wet [16]. SCW also contains a sizeable amount of loess [17], a silt blanket of which eastern Nebraska has some of the thickest deposits in the Midwest. Loess is highly erodible when wet and thus contributes significantly to the channel stability problems observed in SCW. There are three main types of bedrock in SCW, which divide the watershed into thirds; the western third is composed of mudstone and sandstone, the middle third of limestone, and the lower third of shale. Bedrock rarely impacts stream processes in SCW.
The designated beneficial uses of Shell Creek are recreation, aquatic life, agricultural water supply, and aesthetics. However, a history of agricultural use and anti-conservationist attitudes and practices left Shell Creek impaired in the areas of recreation and aquatic life by atrazine, selenium, and Escherichia coli (E. coli) [18]. Furthermore, historic anthropogenic hydrological and environmental modification in the way of clearing large tracts of land for cultivation has resulted in an increased rate and volume of storm flow in SCW, increasing the risk and frequency of flooding and accelerating erosion. Since 1999, SCWIG has been working to address these issues systematically on a watershed scale. The result of their efforts was the development of the 2005 Shell Creek Watershed Management Plan, which focused on both the quantity and quality of runoff to resolve flooding and water quality issues in Shell Creek. The 2016 Shell Creek Watershed Environmental Enhancement Plan was subsequently developed to address persisting water quality impairments not resolved by the 2005 plan. Through these plans and the work of SCWIG and SCW landowners, more than 340 conservation practices have been implemented on the land. These practices include no-till farming, cover crops, and filter and buffer strips and have resulted in the successful delisting of Shell Creek for aquatic life impairment due to atrazine in 2018 [19]. Outside of improvements to water quality, these conservation practices have additionally been credited with alleviating flooding somewhat in SCW [20].
2.2. Data
The majority of hydrological and meteorological data for this study were obtained from Phase 2 of the North American Land Data Assimilation System (NLDAS) [21,22]. The available dataset includes precipitation totals (PCP), above-ground convective available potential energy (CAPE), the fraction of total precipitation that is convective (CNFRAC), longwave radiation flux downwards (DLWRF), shortwave radiation flux downwards (DSWRF), potential evaporation (PEVAP), surface pressure (S.P.), specific humidity (S.H.), temperature, zonal wind speed (UGRD), and meridional wind speed (VGRD), and starts from January 1979. Hourly data for these variables were masked to the study region, then aggregated to daily averages. Daily snow water equivalent (SWE) data were pulled from the National Snow and Ice Data Center [23], and daily groundwater level (GWL) data were obtained from the National Water Information System (NWIS). A large number of initial variables was selected intentionally, as it allows for the narrowing down of variables to determine what factors are relevant for flooding in SCW, which is one of the main objectives of this study.
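The hourly-to-daily aggregation described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB code used in the study, which is not shown; the function name `daily_means` and the sample values are hypothetical, and the records are assumed to be already masked to the study region.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def daily_means(hourly_records):
    """Aggregate (timestamp, value) pairs into {date: mean of that day's values}."""
    buckets = defaultdict(list)
    for stamp, value in hourly_records:
        buckets[stamp.date()].append(value)
    # Sort by date so the daily series is in chronological order.
    return {day: fmean(values) for day, values in sorted(buckets.items())}


# Hypothetical hourly temperature values (deg C) for two days.
hourly = [
    (datetime(2019, 3, 13, 0), 2.0),
    (datetime(2019, 3, 13, 12), 6.0),
    (datetime(2019, 3, 14, 0), 1.0),
    (datetime(2019, 3, 14, 12), 3.0),
]
daily = daily_means(hourly)
```

The same grouping applies to each forcing variable in turn.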
Discharge data were obtained from the U.S. Geological Survey (USGS) from a streamflow gauge near Columbus, NE. The Columbus gauge is the only gauge for Shell Creek that provides daily discharge data to date, and as discharge is the target variable in our rainfall-runoff model, the study region and time period for this research are restricted by the location and available data of this gauge. The Columbus gauge is located upstream from the point where SCW flows into the Lower Platte River and just slightly upstream from where Loseke Creek feeds into Shell Creek, so this study will focus on the portion of the watershed above the gauge and exclude the downstream area beyond Columbus. Approved daily discharge data from the Columbus gauge only extend as far back as 1990, and thus the total time interval used for variable selection and model training in this work is from 1990 to 2020.
While daily discharge data from the Columbus gauge are only available from 1990, annual peak flow data from the same gauge are documented and available all the way back to 1947, with the exception of 1976 and 1977. These annual peak flow data from the USGS are used for flood frequency analysis of the watershed. Drought analysis was conducted using NLDAS precipitation data from 1982 to 2020.
2.3. Model: Artificial Neural Network
Artificial neural networks are machine learning models loosely based on the structure of biological networks in the brain. These networks are composed of artificial neurons organized into distinct layers—the input layer, hidden layers, and output layer—each of which contains at least one neuron. Data inputs are run from the input layer through the hidden layers to the output layer, where a prediction is output. Neurons in the hidden layers and output layer take the outputs of other neurons as inputs, then compute nonlinear transformations of those inputs to generate their own outputs. In recurrent neural networks, a neuron may take the outputs of other neurons from the previous layer and from within the same layer. In feedforward neural networks (FFNNs), neurons only receive outputs as inputs from the previous layer.
In an ANN, a connection between two neurons has a weight associated with it that represents the connection strength. Changing these weights of an ANN changes the final output of the model, and thus it is necessary for such weights to be adjusted to optimize the performance of the model. This is done through training. Two widely used categories of ANN training are supervised and unsupervised training (note that there are other types of training as well, such as semi-supervised or self-supervised training, which are not discussed here). Supervised training is a method in which an established pair of inputs and outputs is compared against the model’s outputs for the same established data inputs, and then feedback is given in order to minimize the deviation of the model outputs from the expected outputs. Unsupervised learning is a method in which unlabeled training data (data without a target variable) is given to an algorithm, from which the algorithm then identifies patterns and categories on its own. This study employed supervised learning to train the ANN model.
A common method used for adjusting network weights in supervised training is error backpropagation. Backpropagation compares the network output for a set of inputs with the observed target, then evaluates the error with a loss function. This error is then propagated backward to adjust connection weights and improve the accuracy of the model.
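As an illustration of error backpropagation, the sketch below trains a single-hidden-layer feedforward network on toy data in plain Python. It is not the study's MATLAB implementation: the tanh activation, the per-sample gradient descent update, the learning rate, the number of hidden neurons, and the toy data are all assumptions chosen for brevity.

```python
import math
import random


def train_ffnn(samples, targets, n_hidden=3, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer network with backpropagation; return MSE per epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, y in zip(samples, targets):
            # Forward pass: tanh hidden layer, linear output neuron.
            h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(w1, b1)]
            y_hat = sum(w * hj for w, hj in zip(w2, h)) + b2
            err = y_hat - y  # derivative of 0.5 * err**2 with respect to y_hat
            sq_err += err * err
            # Backward pass: propagate the error to every weight and bias.
            for j in range(n_hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)  # uses w2[j] before its update
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(sq_err / len(samples))
    return history


# Toy data (hypothetical): the target is a linear function of two inputs.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [0.0, 0.5, 1.0, 1.5]
history = train_ffnn(samples, targets)
```

The returned `history` holds the training MSE of each epoch, which should fall as the weights are adjusted.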
This study used an FFNN with a single hidden layer to model rainfall-runoff processes in SCW. A single-layer model was chosen for this research as preliminary testing of a single dataset on both a single-layer and double-layer model yielded nearly identical results, making the added complexity of the double-layer ANN redundant.
ANN codes were written in MATLAB, and all datasets were normalized using z-score normalization prior to input to the ANN according to the following expression:
\[ z = \frac{x - x_{\mathrm{mean}}}{\sigma} \]
where z is the normalized value, x is the data point, x_mean is the mean value of the data, and σ is the standard deviation of the data. Once data had been normalized, they were used in the training of an ANN whose number of neurons in the hidden layer was varied from 1 to 50. The model was trained on 70% of the dataset, then validated on the remaining 30% of the data. Mean squared error (MSE) from validation for the output of each model was compared, and a number of neurons between 10 and 50 was then selected for the prediction ANN model by determining which ANN had the lowest MSE after training and validation. These bounding values of 10 and 50 neurons were chosen in order to ensure the selection of a model that was neither oversimplified nor overly complex.
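The preprocessing and model-selection steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's MATLAB code: the population standard deviation is used (the text does not say whether the sample or population form was applied), the 70/30 split is taken in record order (the text does not state how records were assigned), and the validation MSE values are hypothetical.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Return (x - mean) / sigma for each value (population sigma, an assumption)."""
    mean = fmean(values)
    sigma = pstdev(values)
    return [(v - mean) / sigma for v in values]


def split_70_30(rows):
    """Split rows in order into a 70% training part and a 30% validation part."""
    cut = int(round(0.7 * len(rows)))
    return rows[:cut], rows[cut:]


def pick_hidden_size(val_mse_by_size, lo=10, hi=50):
    """Pick the hidden-layer size with the lowest validation MSE within [lo, hi]."""
    candidates = {n: mse for n, mse in val_mse_by_size.items() if lo <= n <= hi}
    return min(candidates, key=candidates.get)


normalized = zscore([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
train, valid = split_70_30(list(range(10)))
# Hypothetical validation MSE for a few candidate hidden-layer sizes.
best_size = pick_hidden_size({5: 0.10, 10: 0.30, 20: 0.22, 50: 0.25})
```

In this toy case the 5-neuron network has the lowest MSE but lies outside the 10 to 50 bounds, so the 20-neuron network is selected.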
Following training, the selected ANN model was made to predict discharge, and the performance of the model was measured using the Kling–Gupta efficiency (KGE) [24]. KGE is a statistic that compares the bias, variability, and timing of a model’s output to those of the observed data, and is calculated as follows:
\[ \mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2} \]
where r is the linear correlation coefficient between simulated and observed values, α is a measure of relative variability in the simulated and observed values (the ratio of their standard deviations), and β represents bias (the ratio of the simulated mean to the observed mean). KGE values range from negative infinity to 1, with 1 indicating that a model’s outputs perfectly match with the observed target data. The closer a KGE value is to 1, the better the model is performing.
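A minimal Python sketch of the KGE calculation follows. It uses the standard form KGE = 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), with α taken as the ratio of simulated to observed standard deviations and β as the ratio of simulated to observed means; the function name and the sample discharge values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def kge(sim, obs):
    """Kling-Gupta efficiency of simulated values against observed values."""
    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = fmean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)   # linear correlation (timing)
    alpha = sd_s / sd_o       # relative variability
    beta = mu_s / mu_o        # bias
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical observed discharge and two simulations.
obs = [1.0, 3.0, 2.0, 5.0, 4.0]
perfect = kge(obs, obs)                     # identical series
doubled = kge([2.0 * q for q in obs], obs)  # right timing, doubled mean and spread
```

A simulation that doubles every observed value keeps r at 1 but has α = β = 2, so its KGE is 1 − √2, which shows how bias and variability errors lower the score even when timing is perfect.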
The ANN was trained and run numerous times, and on data for three different time periods: the full period, 1990–2020; the pre-planning period, 1990–2004; and the post-planning period, 2005–2020. In this paper, the pre-planning period is sometimes noted as the non-conservation period, and the post-planning period is sometimes noted as the conservation period. Discharge was the target variable for all trials. The first trial for each period included all predictive variables, and its KGE was recorded as a reference KGE for future trials. Leave-one-out analysis was then conducted where the model for each period was run multiple times, each with a different variable left out, and the resulting KGE of each trial was compared to the reference KGE for that time period in order to determine the respective influence of each variable on the model’s output accuracy and performance. The least influential variables were eliminated, and the process was repeated until the most effective model had been obtained.
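One round of the leave-one-out procedure can be sketched as below. The `score_fn` argument stands in for "train the ANN on these inputs and return its KGE"; here it is replaced by a hypothetical toy scoring function so that the example runs on its own, and the variable contributions are invented for illustration only.

```python
def leave_one_out(variables, score_fn):
    """Return the reference score and the score drop caused by omitting each variable."""
    reference = score_fn(variables)
    drops = {}
    for left_out in variables:
        subset = [v for v in variables if v != left_out]
        # A large positive drop means the omitted variable was influential.
        drops[left_out] = reference - score_fn(subset)
    return reference, drops


# Hypothetical stand-in for model training and KGE evaluation.
TOY_CONTRIBUTION = {"PCP": 0.30, "SWE": 0.10, "UGRD": 0.00}


def toy_score(subset):
    return 0.4 + sum(TOY_CONTRIBUTION[v] for v in subset)


reference, drops = leave_one_out(["PCP", "SWE", "UGRD"], toy_score)
least_influential = min(drops, key=drops.get)
```

The variable with the smallest drop would be the candidate for elimination before the next round.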
2.4. Flood Frequency Analysis
Flood frequency analysis is a method employed in hydrology to estimate the exceedance probabilities corresponding to specific streamflow values for a given river. Annual peak flow or peak-over-threshold data existing over a sizeable number of consecutive years (typically more than 30) are collected and used to fit probability distribution functions from which the exceedance probabilities can be calculated. In this study, flood frequency analysis was conducted using the U.S. Army Corps of Engineers Hydrologic Engineering Center’s (HEC) Statistical Software Package (HEC-SSP 2.2). Flood frequency analysis for SCW was also broken into three different time periods to allow for the examination and identification of flood frequency differences before and after the implementation of conservation practices. These periods are the full period, 1947–2020; the pre-planning period, 1947–2004; and the post-planning period, 2005–2020. However, only results from the full period and pre-planning period were analyzed in this paper, as the post-planning period extends only 15 years back, so it would not be a reliable source to draw conclusions from regarding flood frequencies. Conclusions about flood frequency changes during the post-planning period were instead drawn from the comparison of the full period and the pre-planning period. It is to be noted that there always has been some level of conservation in the watershed, but the key to success in the Shell Creek watershed was the development and implementation of the comprehensive and strategic plan to address the flooding and water quality issues as opposed to random acts of conservation.
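To make the idea of an exceedance probability concrete, the sketch below computes empirical exceedance probabilities from annual peaks using the Weibull plotting position, p = rank / (n + 1). This is a deliberately simpler technique than the fitted probability distributions produced by HEC-SSP and is not the procedure used in the study; the peak flow values are hypothetical.

```python
def weibull_exceedance(annual_peaks):
    """Return (peak, exceedance probability) pairs, largest peak first."""
    ranked = sorted(annual_peaks, reverse=True)
    n = len(ranked)
    # Rank 1 is the largest flood; it is exceeded with probability 1 / (n + 1).
    return [(q, rank / (n + 1)) for rank, q in enumerate(ranked, start=1)]


# Hypothetical annual peak flows (m^3/s) for four years.
table = weibull_exceedance([120.0, 300.0, 80.0, 210.0])
# The return period in years is the reciprocal of the exceedance probability.
return_periods = [1.0 / p for _, p in table]
```

With only four years of record, the largest peak is assigned a return period of about five years, which illustrates why short records such as the 2005–2020 period give unreliable frequency estimates.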
2.5. Drought Analysis
Droughts are often measured using drought indexes, which analyze data for selected drought indicators over various time intervals in order to output a drought index value. This drought index value is a single number interpreted on a range from abnormally wet to abnormally dry. In order to conduct a drought analysis for SCW, this study employed the Standardized Precipitation Index (SPI) [25], which compares actual precipitation accumulation over a region for a certain time period to the probability of precipitation according to historical records for that same time period. SPI values indicate the number of standard deviations from mean moisture conditions, with positive SPI values representing wet conditions and negative SPI values representing dry conditions. SPI values ranging in magnitude from 0 to 0.99 indicate mild conditions, from 1 to 1.49 indicate moderate conditions, from 1.5 to 1.99 indicate severe conditions, and 2 or above indicate extreme conditions [25]. In this study, SPI was run for 1-month, 3-month, 6-month, and 12-month time intervals. Inferences about soil moisture levels in an area can be drawn from the 3-month SPI, while SPI values corresponding to longer timescales (e.g., 6- to 12-month) relate information about wet and dry periods.
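The SPI severity classes listed above can be expressed as a small lookup. The sketch treats the class boundaries as half-open intervals (for example, magnitudes from 1.0 up to but not including 1.5 are moderate), which is an assumption needed to cover values such as 0.995, and it labels an SPI of exactly zero as "neutral", which is also an assumption.

```python
def classify_spi(spi):
    """Map an SPI value to (severity, direction) using the thresholds in the text."""
    magnitude = abs(spi)
    if magnitude < 1.0:
        severity = "mild"
    elif magnitude < 1.5:
        severity = "moderate"
    elif magnitude < 2.0:
        severity = "severe"
    else:
        severity = "extreme"
    # Positive SPI is wetter than normal, negative SPI is drier than normal.
    if spi > 0:
        direction = "wet"
    elif spi < 0:
        direction = "dry"
    else:
        direction = "neutral"
    return severity, direction


examples = [classify_spi(v) for v in (-1.7, 2.3, 0.4)]
```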
2.6. Analysis Framework
Figure 2 shows the analysis framework followed in this study. In order to determine variables relevant to flooding, both model-free and model-based elimination were employed. Once data were masked to study area and aggregated to daily data, pair plots and cross-correlation plots were created that plotted each variable individually against the target variable discharge, as well as all variables against each other. These plots were generated for the three time intervals previously mentioned: the full time period, 1990–2020; the pre-planning period, 1990–2004; and the post-planning period, 2005–2020. This step allowed for the examination of relationships between each variable and the target variable and the identification of multicollinearity between independent variables. The variables were then run through the ANN, which first underwent training and validation with the data, then generated predictions for the target variable discharge. These predictions were evaluated using KGE, and using those evaluations, the variables most relevant to flooding were determined. This identification of factors and conditions conducive to flooding allowed for an improved understanding of SCW hydrological processes.
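The model-free screening step can be approximated numerically as well as graphically. The sketch below computes lag-zero Pearson correlations between variables and flags strongly correlated pairs as a simple multicollinearity check. It is a simplification of the pair plots and cross-correlation plots described above; the 0.9 threshold and the sample values are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def collinear_pairs(columns, threshold=0.9):
    """List variable pairs whose absolute correlation meets the threshold."""
    return [
        (n1, n2)
        for n1, n2 in combinations(columns, 2)
        if abs(pearson(columns[n1], columns[n2])) >= threshold
    ]


# Hypothetical daily values for three predictors.
columns = {
    "PCP": [0.0, 5.0, 1.0, 8.0, 2.0],
    "DSWRF": [200.0, 120.0, 190.0, 90.0, 180.0],
    "TEMP": [1.0, 2.0, 3.0, 4.0, 5.0],
}
flagged = collinear_pairs(columns)
```

In this toy data, precipitation and shortwave radiation are strongly negatively correlated, so that pair is flagged while the pairs involving temperature are not.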
Additional pathways towards improved hydrologic understanding taken on in this research are flood frequency analysis and drought analysis. Using USGS annual peak flow data, flood frequency analysis was carried out to identify the exceedance probabilities of streamflow in SCW. Using long-term precipitation data, drought analysis was conducted with the SPI in order to better understand patterns of dryness and wetness in SCW.
5. Conclusions
In this study, Shell Creek Watershed was examined in order to identify the respective influence of selected controlling factors on the hydrology of the catchment, improve understanding of hydrological processes there, and develop an effective rainfall-runoff model for the watershed using ANN. An additional intention of the work was to identify any possible relationship between changes in flood severity and the implementation of conservation practices. Variable selection was broken into three time periods determined by the start of widespread conservation practice implementation in SCW: the full time period (1990–2020), the pre-planning period (1990–2004), and the post-planning period (2005–2020). Model-free and model-based variable selection were used to identify variable influence on flooding, following which ANN rainfall-runoff models were developed using the optimal combination of input variables for each period. Additionally, drought and flood frequency analyses were conducted to improve hydrological understanding of the watershed. Their results contributed to analyses of the efficacy of certain variables in discharge prediction and the identification of changes in peak flow trends.
From the results and analysis of this work, several conclusions can be drawn, as well as some suggestions for future study. In terms of individual variable predictability, our results suggest that of the variables considered, precipitation is most tied to flooding in SCW. However, the correlation between precipitation and discharge for SCW is still quite low, so dependence solely on this factor for discharge prediction is discouraged. Instead, a combination of all variables is encouraged for this purpose. Out of all of the models developed in this study, the model for the post-planning period most closely reflects the current conditions of the watershed and has overall good prediction performance, making it the strongest candidate for discharge prediction in SCW. This model included all variables used in this study as inputs; however, future work may benefit from the investigation and inclusion of soil moisture, as the 3-month SPI results implied cases of high soil moisture coinciding with flooding. With regard to the relationship between flood change and conservation implementation, flooding has seen an overall decrease in intensity during the post-planning period (with the exception of a few outliers); however, more intensive investigation is warranted to determine a cause for this trend. Furthermore, the incorporation of climate change effects on SCW in such future studies may aid in shedding light on flood trend shifts and the appearance of anomalous flows. Understanding the potentially countering effects of climate change and conservation practices on flooding is crucial to advancing our fundamental understanding of the hydrological processes in the basin. This will require a more in-depth investigation with high-resolution remote sensing datasets and advanced hydrologic modeling.