Enhancing Peak Runoff Forecasting through Feature Engineering Applied to X-Band Radar Data
Abstract
1. Introduction
2. Study Area and Dataset
2.1. Study Area
2.2. Dataset
3. Methods
3.1. Determination of Independent Peak Runoff Events
3.2. Development of Peak Runoff Forecasting Models
3.2.1. Runoff and Precipitation Lags
3.2.2. Random Forest (RF) Algorithm for Regression
- i. Bootstrap resampling is applied to randomly draw samples, with replacement, from the IFS; each bootstrap sample is used to construct an individual regression tree. The data not included in a particular bootstrap sample form its “out-of-bag” (OOB) sample, which serves as a validation set for the corresponding tree and allows an unbiased estimate of the regression error.
- ii. Each tree grown from a bootstrap sample of step (i) is split node by node, and the predictors considered at each node are drawn at random. To reduce the risk of overfitting, a maximum number of features to consider when searching for the optimal split must be specified from the complete set of predictors in the feature space. This ensures diversity among the trees and avoids the construction of duplicate models.
- iii. All trees generated in the bootstrap stage grow according to the splits defined in step (ii). Their growth is restricted by an upper limit, set through a hyperparameter that governs the maximum depth or through the minimum number of samples required in a final node. Regulating the maximum size of the trees (pruning) decreases the structural complexity of the model, which reduces noise and keeps the model simple.
- iv. The regression prediction is obtained as the arithmetic mean of the responses from all the regression trees.
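Steps (i) to (iv) can be illustrated with a small, dependency-free sketch. It is a toy version written for readability, not the implementation used in the study: the function names, the synthetic two-feature dataset, and the hyperparameter values are all assumptions made for illustration.

```python
import random
from statistics import mean

def best_split(X, y, features, min_leaf):
    """Return the (feature, threshold) pair with the lowest summed squared error."""
    best, best_sse = None, float("inf")
    for f in features:
        for t in sorted({row[f] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # step iii: enforce the minimum leaf size
            ml, mr = mean(left), mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best_sse:
                best, best_sse = (f, t), sse
    return best

def grow(X, y, max_features, max_depth, min_leaf, rng, depth=0):
    """Steps ii-iii: split on a random feature subset until a size limit is hit."""
    if depth >= max_depth:
        return mean(y)  # leaf: mean response of the samples in the node
    features = rng.sample(range(len(X[0])), max_features)
    split = best_split(X, y, features, min_leaf)
    if split is None:
        return mean(y)
    f, t = split
    children = []
    for keep in (lambda v: v <= t, lambda v: v > t):
        idx = [i for i, row in enumerate(X) if keep(row[f])]
        children.append(grow([X[i] for i in idx], [y[i] for i in idx],
                             max_features, max_depth, min_leaf, rng, depth + 1))
    return (f, t, children[0], children[1])

def predict_tree(node, row):
    """Walk down a tree until a leaf (a plain number) is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees, max_features, max_depth, min_leaf, seed=0):
    """Step i: one bootstrap sample (drawn with replacement) per tree."""
    rng, n = random.Random(seed), len(X)
    trees, oob_sets = [], []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        oob_sets.append(set(range(n)) - set(idx))  # samples this tree never saw
        trees.append(grow([X[i] for i in idx], [y[i] for i in idx],
                          max_features, max_depth, min_leaf, rng))
    return trees, oob_sets

def predict_forest(trees, row):
    """Step iv: arithmetic mean of the responses of all trees."""
    return mean(predict_tree(tree, row) for tree in trees)

def oob_rmse(trees, oob_sets, X, y):
    """Validate each sample using only the trees for which it was out-of-bag."""
    sq_errors = []
    for i, (row, target) in enumerate(zip(X, y)):
        preds = [predict_tree(t, row) for t, oob in zip(trees, oob_sets) if i in oob]
        if preds:
            sq_errors.append((mean(preds) - target) ** 2)
    return (sum(sq_errors) / len(sq_errors)) ** 0.5

# Synthetic data (assumed): two predictors and a noisy linear response.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10), data_rng.uniform(0, 5)] for _ in range(60)]
y = [2.0 * q + 3.0 * p + data_rng.gauss(0, 0.5) for q, p in X]

trees, oob_sets = fit_forest(X, y, n_trees=25, max_features=1, max_depth=4, min_leaf=2, seed=1)
prediction = predict_forest(trees, [5.0, 2.5])
oob_error = oob_rmse(trees, oob_sets, X, y)
```

In practice a library implementation would be used instead; the sketch only makes the role of each hyperparameter (number of trees, maximum features per split, maximum depth, minimum leaf size) explicit.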
3.2.3. Object-Based Approach to Derive Precipitation Attributes for Enhanced Forecasting Models
Overview of Object-Based Approach (OBA) Process Implementation
Object Attributes
3.3. Model Evaluation between Referential and Enhanced Models
4. Results
4.1. Independent Peak Runoff Events
4.2. Development of Peak Runoff Forecasting Models
Precipitation Attributes for Enhanced Forecasting Models
4.3. Comparison between Referential and Enhanced Models
5. Discussion
6. Conclusions
- Applying the feature engineering (FE) strategy enhanced model efficiencies and enabled us to better leverage precipitation radar data by incorporating attributes of precipitation, such as the precipitation volume, the areal extent of precipitation objects, and the distance between the centroid of these objects and the outlet of the catchment.
- All enhanced models demonstrated improvements in their efficiencies. The models for the 3 and 6 h lead times exhibited larger enhancements (NSE gains of 15% and 18%, respectively) than the 1 h forecast (1%), where the autoregressive behavior already produced an efficient model.
- To fully utilize the high spatial resolution of radar data for modeling, it is crucial to extract relevant attributes rather than use the entire dataset, which introduces noise into the models. The enhanced models achieved a significant reduction in input data, underscoring the efficiency gained through selective attribute extraction. This points to a simplified method that makes better use of ground-based radar data.
- This study has demonstrated the positive impact of improving the representativeness of precipitation retrieved from a high-resolution X-band weather radar. By extracting relevant attributes from high-resolution imagery, we were able to better capture the spatial characteristics of precipitation and improve the assimilation of this information into the RF models.
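As an illustration of the kind of attribute extraction summarized above, the following sketch derives the three attributes named in the first conclusion (precipitation volume, areal extent, and centroid-to-outlet distance) from a small gridded rainfall field. It is a minimal sketch under stated assumptions and not the study's object-based implementation: the function name, the wet-cell threshold, the 4-connectivity rule, the 1 km cell size, and the toy grid are all hypothetical.

```python
from collections import deque
from math import hypot

def precipitation_objects(grid, threshold, cell_km, outlet_rc):
    """Label 4-connected cells with rain >= threshold and describe each object."""
    n_rows, n_cols = len(grid), len(grid[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    objects = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or grid[r0][c0] < threshold:
                continue
            # Breadth-first flood fill collects every cell of this object.
            queue, cells = deque([(r0, c0)]), []
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                cells.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not seen[rr][cc] and grid[rr][cc] >= threshold):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            cell_area = cell_km ** 2
            centroid_r = sum(r for r, _ in cells) / len(cells)
            centroid_c = sum(c for _, c in cells) / len(cells)
            objects.append({
                "area_km2": len(cells) * cell_area,
                # Rain depth (mm) times area (km^2), summed over the object.
                "volume_mm_km2": sum(grid[r][c] for r, c in cells) * cell_area,
                "distance_to_outlet_km": hypot(centroid_r - outlet_rc[0],
                                               centroid_c - outlet_rc[1]) * cell_km,
            })
    return objects

# Toy rainfall field in mm (assumed), with the catchment outlet at row 3, column 0.
grid = [
    [0, 0, 0, 0],
    [0, 2, 3, 0],
    [0, 0, 0, 0],
    [5, 0, 0, 0],
]
objects = precipitation_objects(grid, threshold=1, cell_km=1.0, outlet_rc=(3, 0))
```

Each radar image is thereby reduced to a handful of numbers per object, which is the sense in which selective attribute extraction shrinks the input data compared with feeding every pixel to the model.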
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Hyperparameter | Values |
---|---|
n_trees ^a | 50; 800; 10 |
max_features | n_features ^b, n_features^(1/2), log2(n_features) |
max_depth ^a | 5; 200; 5 |
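The search space in the table above can be enumerated as follows. This is a sketch under an explicit assumption: the triplets "50; 800; 10" and "5; 200; 5" are read as minimum; maximum; step, which the table itself does not state. The function name is hypothetical, and truncating the square root and the base-2 logarithm to integers is likewise an assumption.

```python
from itertools import product
from math import log2, sqrt

def build_grid(n_features):
    """Enumerate (n_trees, max_features, max_depth) candidates."""
    n_trees = range(50, 801, 10)    # assumed reading: min 50, max 800, step 10
    max_depth = range(5, 201, 5)    # assumed reading: min 5, max 200, step 5
    # Three options from the table; duplicates collapse when they coincide.
    max_features = sorted({n_features, int(sqrt(n_features)), int(log2(n_features))})
    return list(product(n_trees, max_features, max_depth))

grid = build_grid(9688)
```

Under this reading there are 76 candidate values for n_trees and 40 for max_depth, so a full grid over three max_features options contains 76 × 3 × 40 = 9120 combinations.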
Metric | Equation | Range | Ideal Value |
---|---|---|---|
NSE | | (−∞, 1] | 1 |
KGE | | (−∞, 1] | 1 |
RMSE | | [0, +∞) | 0 |
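For reference, the three metrics can be computed as in the sketch below, which uses their commonly cited textbook definitions. The equations of the original table are not reproduced in this section, so the exact variants are an assumption; in particular, the KGE here is the formulation based on correlation, variability ratio, and bias ratio, and the sample series are invented for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

def rmse(obs, sim):
    """Root mean square error: 0 for a perfect simulation."""
    return sqrt(mean((s - o) ** 2 for o, s in zip(obs, sim)))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def kge(obs, sim):
    """Kling-Gupta efficiency from correlation r, variability ratio, bias ratio."""
    mean_obs, mean_sim = mean(obs), mean(sim)
    std_obs, std_sim = pstdev(obs), pstdev(sim)
    covariance = mean((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    r = covariance / (std_obs * std_sim)
    alpha = std_sim / std_obs    # variability ratio
    beta = mean_sim / mean_obs   # bias ratio
    return 1.0 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Invented runoff series, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 37.0]
scores = {"NSE": nse(observed, simulated),
          "KGE": kge(observed, simulated),
          "RMSE": rmse(observed, simulated)}
```

Note that this KGE is undefined when either series is constant, because the standard deviation in the denominator is then zero.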
Model | Lead Time | n_trees | max_features | max_depth |
---|---|---|---|---|
Referential | 1 h | 300 | 9688 | 55 |
Referential | 3 h | 450 | 9688 | 5 |
Referential | 6 h | 400 | 9688 | 35 |
Enhanced | 1 h | 420 | 32 | 25 |
Enhanced | 3 h | 310 | 32 | 65 |
Enhanced | 6 h | 130 | 5 | 40 |
Lead Time | NSE (Referential) | NSE (Enhanced) | KGE (Referential) | KGE (Enhanced) | RMSE (Referential) | RMSE (Enhanced) |
---|---|---|---|---|---|---|
1 h | 0.93 | 0.94 | 0.90 | 0.92 | 7.33 | 6.83 |
3 h | 0.65 | 0.75 | 0.54 | 0.66 | 16.72 | 14.14 |
6 h | 0.42 | 0.50 | 0.37 | 0.44 | 21.56 | 20.07 |
Lead Time | NSE | KGE | RMSE |
---|---|---|---|
1 h | 1% | 2% | 7% |
3 h | 15% | 23% | 15% |
6 h | 18% | 17% | 7% |
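The percentages above can be approximately recomputed from the referential and enhanced scores reported in the preceding table. The sketch assumes that improvement is defined relative to the referential model, as an increase for NSE and KGE and as a decrease for RMSE; this definition is not stated in this section, and the function name is hypothetical.

```python
# (referential, enhanced) scores copied from the comparison table.
scores = {
    "1 h": {"NSE": (0.93, 0.94), "KGE": (0.90, 0.92), "RMSE": (7.33, 6.83)},
    "3 h": {"NSE": (0.65, 0.75), "KGE": (0.54, 0.66), "RMSE": (16.72, 14.14)},
    "6 h": {"NSE": (0.42, 0.50), "KGE": (0.37, 0.44), "RMSE": (21.56, 20.07)},
}

def relative_improvement(referential, enhanced, lower_is_better=False):
    """Percentage change relative to the referential model (assumed definition)."""
    change = (referential - enhanced) if lower_is_better else (enhanced - referential)
    return 100.0 * change / referential

improvements = {
    lead: {
        metric: relative_improvement(ref, enh, lower_is_better=(metric == "RMSE"))
        for metric, (ref, enh) in metrics.items()
    }
    for lead, metrics in scores.items()
}
```

Because the inputs here are already rounded to two decimals, a few recomputed values differ from the published percentages by one or two points (for example, the 3 h KGE gain comes out near 22% rather than 23%), presumably because the published figures were derived from unrounded scores.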
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Álvarez-Estrella, J.; Muñoz, P.; Bendix, J.; Contreras, P.; Célleri, R. Enhancing Peak Runoff Forecasting through Feature Engineering Applied to X-Band Radar Data. Water 2024, 16, 968. https://doi.org/10.3390/w16070968