1. Introduction
The quantitative precipitation estimation (QPE) method is a foundational pillar in hydro-meteorological sciences, with far-reaching implications for energy generation, agricultural planning, and environmental conservation. This study’s motivation arose from Brazil’s hydroelectric heart, Western Paraná, underpinned by the Itaipu Binational Dam, a global leader in hydroelectric energy output [1]. The complexities of weather forecasting in this region not only have academic and economic significance but are also vital for the strategic operation of hydroelectric reservoirs and for the protection of communities against the unpredictable forces of nature [2,3].
The Z–R relationship [4], a cornerstone in radar meteorology for the conversion of radar reflectivity (Z) into rainfall rates (R), has long been recognized for its broad applicability, but it is also subject to various limitations [5,6]. Factors such as variability in the raindrop size distribution, the presence of mixed-phase precipitation, radar signal attenuation, calibration challenges, and physical obstructions contribute to these uncertainties, which are particularly pronounced in the complex landscape of Western Paraná, Brazil.
These issues are common in various geographical settings, necessitating a novel methodological approach to enhance the accuracy of rainfall estimation [4]. The Z–R relationship is based on empirical correlations that vary geographically and temporally under the influence of local climatic conditions, which underscores the need for adaptive approaches that can adjust to specific meteorological conditions.
This research aimed to evaluate the applications of the Z–R relationship in a particular environmental context and explore the potential of machine learning (ML) to improve QPE. This study focused on tree-based machine learning models—random forest [7] and gradient boosting [8]—to leverage their ability to model the nonlinear complexities of precipitation data. While previous research has tested machine learning methods for QPE in various parts of the world, including the Southern Andes of Ecuador [9], South Korea [10], and Switzerland [11], none have yet employed a hybrid approach integrating multiple machine learning models to enhance rainfall estimation.
In our academic pursuits, we have extensively utilized machine learning to achieve progress in quantitative precipitation estimation (QPE). Previous studies have suggested that machine learning can be effective in this field, but its use is still relatively new, particularly in the development of hybrid machine learning models. Rollenbeck et al. [12] showed that machine learning can outperform empirical approaches in calibrating X-band radar for extreme weather events in a region of complex precipitation in northern Peru, highlighting the potential of advanced algorithms in such scenarios.
This research explored the application of machine learning in meteorology to obtain more accurate precipitation estimates in a region where the weather patterns are closely linked with hydroelectric power generation. Our study analyzed the performances of two models, random forest and gradient boosting, in both classification and regression scenarios. We aimed to create a more resilient and accurate meteorological practice, which could have implications beyond Western Paraná.
This article is structured to guide the reader through the research process. The analysis begins with an in-depth examination of the dataset, which forms the basis of the hybrid machine learning approach for QPE (Section 2). This is followed by an explanation of the selected machine learning models, the data transformation procedures, and the benchmarks for performance evaluation (Section 3). An assessment of the current QPE methods establishes the context for a detailed analysis of the proposed hybrid model’s effectiveness, calibration, and configuration (Section 3.1). This article then discusses the practical application of the model, its adaptability to operational demands, and its validation against real-world precipitation events (Section 3.2). Finally, this article concludes by synthesizing the findings, discussing their implications, and considering the potential of the hybrid approach within the broader context of QPE advancements (Section 4).
2. Materials and Methods
The research methodology can be categorized into five phases: data collection, data preprocessing, feature engineering, model development, and model evaluation.
2.1. Data Collection
Paraná is one of the five most developed states in Brazil, with a strong economy that is centered around agriculture and industry. In addition, it has the second-highest energy potential in the country.
In the Paraná region, which encompasses six micro-regions and their hydroelectric plants, there is a need for increased monitoring due to various critical factors. These factors include the region’s significant climate variability, the potential impact on hydroelectric plant operations, and the importance of accurate weather forecasts for water resource management and natural disaster prevention.
To address these challenges, a primary dataset was derived from a dual-polarization weather radar system located in Cascavel, Paraná, Brazil. Installed in 2014, this radar system captures atmospheric data with high granularity, providing comprehensive representations of precipitation events over several years.
Figure 1 illustrates the study area and the network of 36 rain gauges, which complements the radar data. The rain gauge network features automatic tipping bucket mechanisms with a sensitivity greater than 0.1 mm. The EEC S-band CAS radar is equipped with an 8.5 m parabolic antenna, offering a gain of over 45.0 dB and a half-power beam width of 0.95 degrees. It supports both linear horizontal and vertical polarization, with an angular positioning accuracy of 0.05 degrees. The radar’s scanning speed can reach up to 10 rpm, and it includes a magnetron transmitter with a peak power of 850 kW. It uses a single receiver, with a typical minimum discernible signal of −114 dBm and a linear dynamic range of up to 105 dB.
Table 1 presents the spatial and temporal resolutions of the radar and rain gauge data, both in their original and transformed forms. Initially, the radar data feature a spatial resolution of 1° in azimuth by 250 m in range at a single elevation, with a temporal resolution of 5 min. These data are then reprocessed in the database to a different format by redefining the range resolution as a ’bin gate’ to better understand the precipitation patterns over different distances. This transformation also adjusts the temporal resolution to 15 min for consistency in analysis.
Similarly, the rain gauge data, which originally use latitude and longitude as the spatial reference and have a 15 min temporal resolution, are converted into a format compatible with the radar data, using azimuth and range coordinates, while maintaining the same temporal resolution. The binning process of the radar data allows for a more nuanced interpretation of the spatial variability in the precipitation, aligning them with the rain gauge data for a comprehensive analysis.
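As an illustration of this coordinate transformation, the sketch below converts a gauge position to radar-centric azimuth/range and assigns a 250 m "bin gate". It is a minimal example, not the operational code; the radar and gauge coordinates used in the usage line are hypothetical, and the great-circle/bearing formulas are the standard textbook expressions.

```python
import numpy as np

EARTH_RADIUS_M = 6_371_000.0

def gauge_to_radar_coords(radar_lat, radar_lon, gauge_lat, gauge_lon,
                          range_bin_m=250.0):
    """Convert a gauge position (degrees) to radar-centric azimuth/range.

    Returns the azimuth (degrees clockwise from north), the great-circle
    range (m), and the index of the 250 m range bin ('bin gate').
    """
    lat1, lon1, lat2, lon2 = map(np.radians,
                                 [radar_lat, radar_lon, gauge_lat, gauge_lon])
    dlat, dlon = lat2 - lat1, lon2 - lon1

    # Haversine great-circle distance (radar -> gauge).
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    range_m = 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))

    # Initial bearing (azimuth) from the radar to the gauge.
    x = np.sin(dlon) * np.cos(lat2)
    y = np.cos(lat1) * np.sin(lat2) - np.sin(lat1) * np.cos(lat2) * np.cos(dlon)
    azimuth_deg = (np.degrees(np.arctan2(x, y)) + 360.0) % 360.0

    return azimuth_deg, range_m, int(range_m // range_bin_m)

# Illustrative only: hypothetical coordinates, not the actual Cascavel site or a real gauge.
az, rng, gate = gauge_to_radar_coords(-24.95, -53.50, -25.30, -52.90)
print(f"azimuth={az:.1f} deg, range={rng/1000:.1f} km, bin gate={gate}")
```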
This research examined the Z–R relationships used in the operational environment in Paraná, Brazil (Figure 2). The relationships chosen for comparison were derived from the methodologies of Marshall and Palmer [4], Calheiros [13], and NEXRAD [14], which are operationally viable within the regional context. While other coefficients, such as those proposed by Vulpiani et al. [15], are available, this study prioritized the operational applicability of the chosen Z–R relationships. These relationships were specifically selected because they have been extensively tested and are commonly used in both operational and research settings, ensuring a practical and targeted analysis. This selection allows for a coherent evaluation of the Z–R relationship’s performance in real-world weather radar applications.
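For reference, inverting a power-law Z–R relation of the form Z = aR^b can be sketched as follows. The coefficients shown (a = 200, b = 1.6) are the classical Marshall–Palmer values; this is an illustrative sketch rather than the operational implementation, and the 15 min accumulation step simply mirrors the temporal resolution used in this study.

```python
import numpy as np

def zr_rain_rate(dbz, a=200.0, b=1.6):
    """Invert a power-law Z-R relation (Z = a * R**b) to obtain R in mm/h.

    dbz is the radar reflectivity in dBZ; Z = 10**(dBZ / 10) in mm^6 m^-3.
    """
    z_linear = 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)
    return (z_linear / a) ** (1.0 / b)

def accumulate_15min(rate_mm_h):
    """Convert an instantaneous rain rate (mm/h) to a 15 min accumulation (mm)."""
    return rate_mm_h * (15.0 / 60.0)

# Example reading of 28.63 dBZ (see the Pinhao case discussed in Section 3).
rate = zr_rain_rate(28.63)
print(f"{rate:.2f} mm/h -> {accumulate_15min(rate):.2f} mm per 15 min")
```

With the Marshall–Palmer coefficients, this reproduces the approximately 0.56 mm per 15 min reported later for the 28.63 dBZ Pinhao reading; the Calheiros and NEXRAD relations follow the same form with different a and b values.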
2.2. Data Preprocessing
In this study, we considered the distribution of the precipitation over the Western Paraná region from 2018 to 2022. Our dataset predominantly comprised events with no precipitation, with almost 94% of the data showing precipitation levels below 0.1 mm. Due to this imbalance, we chose to focus on precipitation events exceeding 0.2 mm per 15 min, which aligns with the calibration settings of the rain gauges used. These gauges were optimized to accurately detect minimal yet significant precipitation events, ensuring an effective analysis.
We created a distribution graph of the precipitation data between 2018 and 2022 that focuses on rain events exceeding 0.2 mm (Figure 3). This graph displays the amount of precipitation in millimeters on a logarithmic scale. The analysis shows that the majority of the data cluster between 0.2 and 10 mm per 15 min, with a significant peak in the range of 5 to 10 mm. There is also a noticeable decrease in the frequency of data for rainfall volumes exceeding 10 mm. Only a tiny fraction, approximately 0.02%, of the data correspond to precipitation events that exceed 30 mm. This refined focus on specific precipitation ranges allowed for a more targeted and accurate analysis of the data, which is essential in understanding and predicting rainfall patterns in the context of weather forecasting, climate studies, and urban planning.
We ensured the integrity and reliability of the meteorological data by comprehensively addressing several critical aspects during the data preprocessing phase. This phase included the following key steps:
Distance Filtering: We refined the dataset for an accurate analysis by applying filters to remove data from the radar’s blind spot and areas beyond its reliable range.
Handling Missing Data: We implemented techniques to address missing values in key variables, such as reflectivity, ensuring the dataset’s completeness.
Polarimetric Variable Selection: The meticulous selection and filtering of crucial polarimetric variables were conducted to enhance the quality of the data. Variables such as the co-polar correlation coefficient (ρHV), the specific differential phase (KDP), and the differential reflectivity (ZDR) were carefully chosen based on their importance in distinguishing meteorological phenomena, as described in [17]. The thresholds defined for these variables were as follows:
- For ρHV, a threshold of ≥0.5 identifies cloud data, whereas values closer to unity indicate the high linear polarization associated with precipitation.
- A further threshold was set for selecting rain data, which exhibit high horizontal diffraction.
- Similarly, an additional threshold identifies rain data related to differential polarization diffraction.
- For snow data selection, a threshold exploiting the low differential polarization diffraction was used.
These thresholds are instrumental in ensuring precise discrimination between different meteorological conditions.
Consistency Check Between Reflectivity and Precipitation: A thorough validation ensured consistency between the radar reflectivity measurements and rain gauge data.
These preprocessing steps were pivotal in preparing the radar and rain gauge data for subsequent machine learning model training and validation. The cleaned dataset significantly improved the accuracy of our precipitation estimation models by facilitating more reliable and detailed meteorological analyses.
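To illustrate how the steps above could be expressed in code, the sketch below applies distance filtering, drops records with missing radar variables, masks non-meteorological echoes, and checks gauge–radar consistency. The column names, distance limits, and threshold values are placeholder assumptions for illustration only; they are not the operational settings of this study.

```python
import pandas as pd

# Placeholder limits: the operational blind-spot and maximum-range values differ.
MIN_RANGE_KM, MAX_RANGE_KM = 5.0, 200.0
RHOHV_MIN, RAIN_MM_MIN = 0.5, 0.2   # illustrative thresholds only

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the distance, completeness, polarimetric, and consistency filters."""
    out = df.copy()

    # 1. Distance filtering: keep bins outside the blind spot and inside the reliable range.
    out = out[(out["range_km"] >= MIN_RANGE_KM) & (out["range_km"] <= MAX_RANGE_KM)]

    # 2. Handling missing data: drop records lacking key radar variables.
    out = out.dropna(subset=["ZH", "ZDR", "KDP", "RHOHV"])

    # 3. Polarimetric variable selection: mask likely non-meteorological echoes.
    out = out[out["RHOHV"] >= RHOHV_MIN]

    # 4. Consistency check: quantifiable rain at the gauge should coincide with radar echo.
    out = out[~((out["rain_mm_15min"] >= RAIN_MM_MIN) & (out["ZH"] <= 0.0))]
    return out

# Usage: clean = preprocess(pd.read_parquet("radar_gauge_pairs.parquet"))
```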
2.3. Feature Engineering
Feature engineering plays a critical role in enhancing the performance of machine learning models in meteorological applications, particularly in precipitation estimation using radar data. The process involves the meticulous selection and preprocessing of the variables to ensure their relevance and effectiveness in the predictive models.
In the training and evaluation of our models, we carefully selected the variables based on their relevance to the collected data and the objectives of our study. The primary variables included the horizontal reflectivity (ZH), differential reflectivity (ZDR), specific differential phase (KDP), and co-polar correlation coefficient (ρHV), all of which are measured by the radar system. Additionally, we incorporated the altitude and distance from the radar as critical features in our machine learning models. These additions were crucial in enhancing the accuracy of our quantitative precipitation estimates, allowing our models to account for variations in elevation and radar beam dispersal over different distances.
Table 2 presents a clear and visual description of the variables selected for the model. These variables were chosen because they notably impacted the model’s accuracy in predicting precipitation events. To make these variables comparable and improve the precision of the predictions, they were preprocessed. This conversion from their original units, such as decibels (dBZ) for ZH, to more relevant units, such as millimeters of precipitation over a 15 min interval, was necessary.
For a complete understanding, Table 3 shows the input variables utilized for the precipitation estimation model from the radar and station data.
The data from 2018 to 2021 were divided into training (70%) and validation (30%) sets, with the data from 2022 reserved for testing. The feature engineering process, including variable selection and preprocessing, proved crucial in enhancing the model’s precision and reliability in predicting the precipitation patterns.
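A minimal sketch of this temporal split is shown below, assuming a DataFrame with a timestamp column; the 70/30 split of the 2018–2021 data and the 2022 hold-out follow the text, while the column name and random seed are illustrative.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def temporal_split(df: pd.DataFrame, time_col: str = "timestamp", seed: int = 42):
    """Split 2018-2021 data into train (70%) / validation (30%); hold out 2022 for testing."""
    year = pd.to_datetime(df[time_col]).dt.year
    development = df[year <= 2021]
    test = df[year == 2022]
    train, valid = train_test_split(development, test_size=0.30, random_state=seed)
    return train, valid, test
```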
2.4. Model Development
As part of our research, we designed a model to improve quantitative precipitation estimation (QPE) using machine learning techniques, specifically the random forest (RF) and gradient boosting (GB) methods. These techniques were chosen for their exceptional performance in classification and regression tasks, which are crucial for accurate precipitation prediction. The RF technique combines the predictions of multiple decision trees (denoted as $T_i$) by either voting or averaging:

$\hat{y}_{\mathrm{RF}} = \frac{1}{N}\sum_{i=1}^{N} T_i(x)$ (1)

where $\hat{y}_{\mathrm{RF}}$ denotes the random forest outcome, $T_i(x)$ is the prediction of the $i$-th tree for input $x$, and $N$ represents the number of trees.
Gradient boosting is an ensemble technique that builds a series of models in a sequential manner, with each subsequent model aiming to correct the errors made by its predecessors. Specifically, GB refines the predictions by focusing on the errors of prior iterations:

$F_n(x) = \sum_{i=1}^{n} \gamma_i h_i(x)$ (2)

where $F_n(x)$ is the composite model’s prediction for input $x$, $n$ is the iteration count, $\gamma_i$ is the weight for iteration $i$’s model $h_i$, and $h_i(x)$ is the prediction of the weak learner at iteration $i$.
Our methodology comprised two primary stages: classification and regression. Initially, we distinguished between ‘rain’ and ‘no rain’ events using thresholds based on the precipitation values. Subsequently, we employed regression to estimate the precipitation intensity in the ‘rain’ data, using the same machine learning techniques. This dual approach, which combined RF’s ensemble method and GB’s error-minimizing capability, ensured an all-inclusive and accurate QPE model, effectively leveraging the strengths of both techniques in handling complex meteorological data.
The classification stage in our methodology employs the RF and GB techniques to effectively distinguish between ‘rain’ and ‘no rain’ events. This distinction is crucial, as it allows our model to identify operational patterns that signal rain events above the 0.2 mm/15 min threshold, ensuring that only data representing quantifiable rain enter the regression stage. This process not only enhances the accuracy of our rainfall estimation but also optimizes the computational efficiency by focusing on relevant events. These methodological steps are depicted in Figure 4, which illustrates the model application workflow.
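The two-stage design can be sketched with scikit-learn estimators as follows: a classifier first flags samples above the 0.2 mm/15 min threshold, and a regressor trained only on the ‘rain’ samples then estimates the amount. This is a simplified sketch of the idea, not the study’s code; the feature names and hyperparameters are placeholders, and the GBGB, RFGB, and GBRF variants discussed later are obtained by swapping in gradient boosting estimators.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

RAIN_THRESHOLD = 0.2  # mm per 15 min

class HybridQPE:
    """Two-stage QPE: classify rain / no rain, then regress the rain amount."""

    def __init__(self):
        self.clf = RandomForestClassifier(n_estimators=100, random_state=0)
        self.reg = RandomForestRegressor(n_estimators=100, random_state=0)

    def fit(self, X, y_mm):
        is_rain = (np.asarray(y_mm) >= RAIN_THRESHOLD).astype(int)
        self.clf.fit(X, is_rain)                 # stage 1: rain / no rain
        rain_mask = is_rain == 1
        self.reg.fit(X[rain_mask], np.asarray(y_mm)[rain_mask])  # stage 2: rain only
        return self

    def predict(self, X):
        y_hat = np.zeros(len(X))
        rain_mask = self.clf.predict(X) == 1
        if rain_mask.any():
            y_hat[rain_mask] = self.reg.predict(X[rain_mask])
        return y_hat
```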
We selected key variables for model training and evaluation, including radar-measured factors such as the reflectivity (ZH), differential reflectivity (ZDR), specific differential phase (KDP), and co-polar correlation coefficient (ρHV).
2.5. Model Evaluation
To evaluate the model’s performance comprehensively, we compared its estimated precipitation rates with the actual values from the test dataset. Additionally, we compared the model’s predictions with the theoretical Z–R model.
During the validation process, ground-based meteorological station data were used to compare the model’s forecasts with actual observations. This enabled a detailed analysis to identify any discrepancies. To determine the accuracy of the regression model, statistical metrics such as the root mean square error (RMSE), mean absolute error (MAE), and Kling–Gupta efficiency (KGE) were used [18,19]. The mean squared error (MSE), which is defined in Equation (3), measures the average of the squared differences between the forecast and actual values and provides a residue variance metric:

$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$ (3)

where $\hat{y}_i$ denotes the forecast value of $y_i$, and $y_i$ is the observed value.
The RMSE is a metric that consolidates the forecast errors into a single predictive power score. It is calculated by taking the square root of the MSE (Equation (4)). When extrapolating the precipitation estimates across the Paraná grid, the RMSE is particularly sensitive to larger magnitude errors, such as potential outliers that may result from the extrapolation process.
The mean absolute error (MAE) is a commonly used metric in data analysis that calculates the average of the absolute differences between the predicted and actual values in a dataset. The metric is widely utilized due to its simple interpretation and compatibility with the prediction target’s scale. Equation (5) is used to calculate the MAE:

$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$ (5)

where $\hat{y}_i$ symbolizes the forecast value, $y_i$ is the observed value, and $n$ represents the total number of data points.
The last metric used to evaluate the performance of the regression model was the Kling–Gupta efficiency (KGE) metric [18,19,20]. This metric is obtained by Equation (6), with its parameters r, α, and β calculated through Equations (7)–(9), respectively. This metric quantifies the degree of overlap between the observed and forecast time series by examining their correlations, mean values, and standard deviations, thereby providing a comprehensive analysis of the regression model’s performance. One key advantage of using the KGE metric over other metrics is its global applicability and effectiveness in diverse hydrological contexts.
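The sketch below shows one way to compute these regression scores. The KGE expression follows the standard 2009 formulation (with r the correlation, α the ratio of standard deviations, and β the ratio of means); whether the study used this form or the later revision is not stated, so the exact formulation here is an assumption.

```python
import numpy as np

def rmse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.sqrt(np.mean((obs - sim) ** 2))

def mae(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.mean(np.abs(obs - sim))

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]        # linear correlation
    alpha = np.std(sim) / np.std(obs)      # variability ratio
    beta = np.mean(sim) / np.mean(obs)     # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Toy example with a few 15 min accumulations (mm).
obs = np.array([0.4, 1.2, 3.5, 0.8])
sim = np.array([0.5, 1.0, 3.0, 1.1])
print(rmse(obs, sim), mae(obs, sim), kge(obs, sim))
```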
With regard to measuring the performance of regression models in predicting precipitation rates, it is important to use certain metrics to ensure an accurate evaluation. By taking a holistic approach and adopting these metrics, we can assess the model’s accuracy against the observed data with precision.
To evaluate the performance of classification models, it is customary to employ confusion matrices and metrics such as accuracy, recall, and precision. In our hybrid methodology, which amalgamates elements of both classification and regression, it is imperative to apply these performance metrics to the classifier component to ensure the overall efficacy of the model. A confusion matrix (shown in Table 4) is a useful tool for displaying classification results in matrix format. The rows represent the actual classes, while the columns represent the predicted classes. The matrix contains true positive (TP), false positive (FP), true negative (TN), and false negative (FN) values.
Accuracy (Equation (10)) is a general metric indicating the total percentage of correct predictions (both positive and negative) of the model, calculated using the following formula:

$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (10)

Although widely used, the accuracy can be misleading for imbalanced classes, since it tends to favor the dominant class.
Other important metrics include the recall and precision, defined, respectively, as follows:

$\mathrm{Recall} = \frac{TP}{TP + FN}$ (11)

$\mathrm{Precision} = \frac{TP}{TP + FP}$ (12)

Recall, also known as sensitivity, is a metric that measures how well a model can correctly identify positive cases among all true positive cases. In simpler terms, it shows the proficiency of the model in detecting positive cases and reducing the number of false negatives. This metric is particularly important when failing to identify a positive case can lead to severe consequences, such as diagnosing a disease. For instance, in a test for a severe illness, it is crucial to have high recall to ensure that most patients with the disease are detected. Equation (11) represents the recall.
Equation (12) calculates the precision of a model. The precision measures the proportion of correct positive predictions. It is an essential metric used to evaluate the quality of a model’s positive predictions. If a model has high precision, it means that a significant number of its positive classifications are correct. This is crucial in cases where a false positive can have severe consequences, such as spam filtering. For example, high precision in an email filtering system is desirable to prevent legitimate emails from being mistakenly marked as spam.
It is essential to balance the precision and recall in certain situations. In cases where false positives can have serious consequences, the precision cannot be ignored entirely. Therefore, the decision to use relevant metrics will depend on the context of the application and the impact of classification errors.
In this work, it was crucial to prioritize the recall. In the classification that we adopted, it is essential to identify as many positive cases as possible, even if this leads to an increase in false positives. The consequences of failing to recognize truly positive cases can be severe. Failure to alert an at-risk population may lead to injuries or even deaths during natural disasters.
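For completeness, these classification scores can be obtained directly from scikit-learn; the sketch below uses toy labels (1 denotes ‘rain’, 0 denotes ‘no rain’) and illustrates the evaluation rather than reproducing the study’s code.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # 1 = rain, 0 = no rain (toy labels)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))           # rows: actual, columns: predicted
print("accuracy :", accuracy_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))     # prioritized in this work
print("precision:", precision_score(y_true, y_pred))
```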
3. Results and Discussion
The ML models developed in this research focused on the task of precipitation estimation. Thus, for the evaluation of the models’ capabilities, we compared their estimations against observed data obtained from a collocated gauge–radar dataset. This approach aimed to address the challenge of categorizing meteorological events, such as rain and no rain, while concurrently estimating the intensity of the precipitation when present.
The classification step helps to categorize meteorological events as “rain” or “no rain”. Once the data have been classified as “rain”, the regression step is used to accurately estimate the precipitation amount. This is essential for various applications, such as weather forecasting, climate analysis, water resource management, and urban planning. Combining classification and regression methods provides a more in-depth and comprehensive meteorological analysis, offering valuable insights into the precipitation and its intensity.
Tuning the hyperparameters of the machine learning models, such as setting ‘n_estimators’ to 100 for RF and 500 for GB, optimizes the performance and enabled us to achieve more accurate results in this study. These values were selected based on evaluations using cross-validation to balance the model’s complexity and generalization. This section explores how the combination of these methods can lead to significant outcomes in understanding and predicting precipitation conditions.
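One way such values can be selected is with a cross-validated grid search, as sketched below; the parameter grids, scoring choice, and number of folds are illustrative assumptions rather than the exact search performed in this study.

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Illustrative grids around the values adopted in the study (100 for RF, 500 for GB).
searches = {
    "RF": GridSearchCV(RandomForestRegressor(random_state=0),
                       {"n_estimators": [50, 100, 200]},
                       scoring="neg_root_mean_squared_error", cv=5),
    "GB": GridSearchCV(GradientBoostingRegressor(random_state=0),
                       {"n_estimators": [100, 300, 500]},
                       scoring="neg_root_mean_squared_error", cv=5),
}

# Usage (X_train / y_train taken from the feature table described in Section 2.3):
# for name, gs in searches.items():
#     gs.fit(X_train, y_train)
#     print(name, gs.best_params_, -gs.best_score_)
```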
Analyzing the results in Table 5, we can observe the performances of the RF and GB classifiers. Both algorithms demonstrated remarkable performances in the task of classifying meteorological events. During the validation phase, they achieved high accuracy, with values of 0.90 for both, indicating the ability of these models to classify conditions such as “rain” or “no rain” accurately.
The models’ effectiveness was evaluated using the recall metric, which measured their ability to correctly identify rain events. The classifiers performed exceptionally well during the validation phase, with recall values approaching 1.0. This result indicates the low occurrence of false negatives, meaning that very few instances of rain were erroneously classified as “no rain”. This outcome emphasizes the accuracy of the classifiers in correctly identifying and categorizing rain events.
The precision, which measured the proportion of true positives relative to the total predicted positives, also showed significant results, with values of 0.85 for RF and 0.84 for GB in the validation phase, suggesting that most of the models’ rain predictions were correct.
The results in the test phase, although slightly lower than during validation, were still satisfactory, with both classifiers maintaining good performance in terms of accuracy, recall, and precision. This consistency between the validation and test phases suggests that the models can generalize well to new data.
In Table 6, we examine the RF and GB regressors’ performances in estimating the precipitation intensity. The results highlight the efficacy of these models in quantitative estimation. The RMSE and MAE are standard metrics for the evaluation of the quality of predictions. In both metrics, the regressors achieved good performances.
During the validation phase, RF registered an RMSE of 1.58 mm and an MAE of 1.07 mm, while GB recorded an RMSE of 1.56 mm and an MAE of 0.73 mm. These values indicate that the regressors’ predictions were very close to the actual values, with relatively low average errors. The performances in the test phase were also excellent, with both regressors maintaining low values for the RMSE and MAE.
The results confirm that the hybrid approach effectively classifies meteorological events and estimates the precipitation intensity. The coordinated combination of these two steps provides a comprehensive and precise view of the weather conditions, essential for various practical applications. The consistency of the results between the validation and test phases demonstrates that these models can generalize well to new data, further emphasizing their usefulness in real-world scenarios.
In Table 7, we present a comparative analysis of the performance of the hybrid model in relation to theoretical Z–R relations and the three Z–R relations of the meteorological radar (DSD [13], Marshall–Palmer [4], and Nexrad [14]). The table displays the RMSE, MAE, and KGE metrics for the validation and test phases.
We observed remarkable results when comparing the hybrid model with the estimated theoretical Z–R relations. For both phases, validation and testing, the random forest–random forest (RFRF) model achieved RMSEs of only 1.00 mm and 0.82 mm, respectively, indicating that the quantitative predictions of the model were very close to the actual values. The MAEs were also low, with values of 0.41 mm and 0.34 mm in the validation and test phases. Moreover, the KGE coefficient demonstrated good agreement between the predictions and observations, with values of 0.62 and 0.80 for the validation and test phases.
The results for the gradient boosting–gradient boosting (GBGB) model are also consistent, with RMSEs of 1.30 mm and 0.70 mm, MAEs of 0.35 mm and 0.23 mm, and KGE values of 0.80 and 0.90 for the validation and test, respectively. These results confirm the ability of the hybrid model methodology to accurately estimate the precipitation intensity.
However, when evaluating different combinations of the classifiers and regressors, such as random forest–gradient boosting (RFGB) and gradient boosting–random forest (GBRF), we observed variations in performance, highlighting the importance of selecting the algorithms appropriately. We also compared the performances of these models with the Z–R relations of the meteorological radar—DSD [13], Marshall–Palmer [4], and Nexrad [14]—noting that the hybrid approach surpassed these in terms of accuracy.
We also highlight the performance of the Oracle (OC) method, which combines the three Z–R methods and chooses the closest to the observed value. The Oracle method served as a benchmark in our analysis, providing an idealized scenario where the best-performing Z–R relation was selected for each event. This approach allowed us to gauge the potential upper limit of accuracy achievable by dynamically adapting the Z–R relations based on real-time observations. This model achieved good results, with RMSEs of 1.10 mm and 1.20 mm, MAEs of 0.53 mm and 0.56 mm, and KGE values of 0.60 and 0.78 for the validation and test, respectively.
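The Oracle benchmark can be sketched as follows: for each observation it keeps whichever Z–R estimate is closest to the gauge value, which is why it represents an idealized upper bound rather than an operational method. The column names in this sketch are illustrative.

```python
import numpy as np
import pandas as pd

def oracle_estimate(df: pd.DataFrame,
                    zr_cols=("rain_mp", "rain_nexrad", "rain_dsd"),
                    obs_col="rain_obs"):
    """Per row, select the Z-R estimate closest to the observed value."""
    estimates = df[list(zr_cols)].to_numpy(float)
    obs = df[obs_col].to_numpy(float)[:, None]
    best = np.argmin(np.abs(estimates - obs), axis=1)   # index of the closest method
    return estimates[np.arange(len(df)), best]

# Usage: df["rain_oracle"] = oracle_estimate(df)
```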
Figure 5 presents graphs of the test dataset, providing a clearer visualization of the performances of the different linear regression models and the coefficient of determination (Equation (13)), which represents the proportion of data variation explained by the model. In this analysis, we focused on the models that are highlighted in Table 7: RFRF, GBGB, and Oracle.
When observing the linear regression figures, it is evident that these three models demonstrate remarkable performances compared to the others. The regression lines fitted for RFRF, GBGB, and Oracle are very close to the data points, suggesting a considerable ability to estimate the precipitation intensity.
However, to determine which of the three models can be identified as the best, it is essential to consider the evaluation metrics presented in Table 7, where the RMSE indicates how close the predictions are to the actual values, with lower values indicating a better fit of the model to the data.
Figure 6 illustrates the correlation between the observed data and the predictions generated by the models. The graph displays points (a) to (h) that identify cases where the precipitation was notably underestimated. The color variations in the graph indicate critical areas: red identifies where filters should be applied to exclude anomalous data or outliers, and blue shows where the results are considered reliable without the need for additional filtering.
The data suggest that 80.79% of the values with low correlations require filtering, indicating that filters are an effective tool for improving the accuracy of predictions. Of particular interest is the observation of 29.6 mm indicated by points (a) and (e), which shows a significant discrepancy in the data processed by the ML models. This case underlines the importance of examining points with lower correlations to identify possible failures or limitations in the models used.
Table 8 compares the precipitation estimates of the RFRF and GBGB models with the actual measurements. The analysis focuses on the underestimated events, as illustrated in Figure 6. Notably, points (a) and (e), corresponding to the Pinhao station, reveal a significant disparity between the estimates of 0.65 mm (RFRF) and 0.78 mm (GBGB) against the observed value of 29.6 mm.
This case illustrates the complexity of precipitation estimation and highlights the influence of factors such as the distance between the meteorological station and the radar. The distance of 193 km between the Pinhao station and the radar affects the accuracy of the reflectivity reading. For this point, the value of 28.63 dBZ suggests possible distortions caused by radar beam scattering and atmospheric variations at great distances. Furthermore, interpreting this value of 28.63 dBZ using different methods of calculating the precipitation rate (different Z–R relationships) reveals a significant discrepancy compared to the observed value. The calculated rates are as follows:
Using Marshall–Palmer [4], 1948—0.56 mm/15 min;
Using Nexrad [14]—0.47 mm/15 min;
Using DSD [13]—0.53 mm/15 min.
These lower values suggest that the observed precipitation rate may be overestimated, indicating a possible error at the Pinhao station. The high linear polarization (RHOHV = 0.97) and the differential polarization diffraction values (KDP = 0.20 and ZDR = 0.37) point to complex meteorological conditions that the models may not have adequately interpreted. The complex nature of these polarimetric parameters, as discussed regarding the selection of the polarimetric variables, suggests that specific aspects of the meteorological phenomena were beyond the estimation capacity of the machine learning models.
Table 8 presents a comparative analysis between the recorded data and the model estimates. Taking the case of Pinhao as an example, we observe a notable discrepancy between the observed data and the estimates. In the hour before the event, a precipitation volume of 0.0 mm was recorded, corroborated by the model estimate of 0.00 mm. However, a considerable precipitation event was observed in the subsequent hour, reaching 29.6 mm. This amount contrasts significantly with the model estimate of only 0.65 mm.
This discrepancy could be attributed to potential reading errors at the meteorological station, possibly caused by technical failures. The fact that no amount of rain was measured immediately before a significant rainfall event is a possible indicator of an instrument malfunction. Alternatively, it may indicate an inherent limitation of the model in accurately predicting intense rain events, particularly in scenarios where previous readings suggest low levels or the absence of precipitation. This observation highlights the need to refine the estimation models, seeking the better representation and capture of the temporal dynamics of intense rain events.
3.1. Rain Intensity Analysis
We next consider a model analysis regarding different rain types, from the lightest rain to the heaviest rain, to evaluate the accuracy of the algorithms in differentiating between various rainfall intensities.
Table 9 presents the accuracy percentages for the RFRF and GBGB algorithms across different rainfall intensities, including no rain, light rain, moderate rain, and heavy rain scenarios. The results from both the validation and testing phases are represented as precision values.
Under moderate rain conditions (intensity greater than 2 mm/15 min and up to 5 mm/15 min), both algorithms exhibited high accuracy, with RF achieving up to 97% precision in the testing phase and GB also demonstrating robust performance, which indicates the effectiveness of these models in correctly identifying and classifying moderate rain events, a crucial aspect of accurate meteorological nowcasting and water resource management.
However, when faced with heavy rain events (intensity greater than 5 mm/15 min), a significant variation in performance was observed. While GB maintained high precision, indicating its robustness and adaptability under extreme precipitation conditions, RF showed a decrease in precision. This difference underscores the importance of selecting the appropriate algorithm for the modeling of intense rain events, highlighting GB as a valuable tool for practical applications where the accurate identification of heavy rainfall is essential.
These results highlight the sensitivity of the RFRF and GBGB algorithms to the rainfall intensity, with a particular focus on moderate and heavy rain. The superior performance of GB under these conditions suggests its applicability in scenarios where distinguishing between different precipitation levels is crucial for informed decision making in meteorology and water resource management. This analysis reinforces the need for tailored algorithm selection and data preprocessing strategies to enhance the estimation and classification of intense precipitation events.
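One way to reproduce such a per-intensity breakdown is sketched below. The class boundaries for ‘light’ rain and the hit criterion (a ±0.5 mm tolerance) are our own illustrative assumptions, since the text quotes only the moderate (>2 to 5 mm/15 min) and heavy (>5 mm/15 min) bounds and does not specify how the per-class precision in Table 9 was computed.

```python
import numpy as np
import pandas as pd

# Intensity classes in mm per 15 min; the upper bound of 'light' (2 mm) is inferred.
BINS = [-np.inf, 0.2, 2.0, 5.0, np.inf]
LABELS = ["no rain", "light", "moderate", "heavy"]

def accuracy_by_intensity(obs_mm, est_mm, tolerance=0.5):
    """Share of estimates within +/- tolerance mm of the observation, per intensity class."""
    df = pd.DataFrame({"obs": obs_mm, "est": est_mm})
    df["class"] = pd.cut(df["obs"], bins=BINS, labels=LABELS)
    df["hit"] = (df["obs"] - df["est"]).abs() <= tolerance
    return df.groupby("class", observed=True)["hit"].mean()

# Usage: accuracy_by_intensity(test["rain_obs"], test["rain_rfrf"])
```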
3.2. Comparison of Images from Different Precipitation Estimation Methodologies
This section presents a comparative analysis of the model outputs and theoretical Z–R relationships, emphasizing the application of Radial Basis Function Interpolation (RBF) [21] for data transformation. A series of images are showcased to illustrate the models’ performance in replicating the radar reflectivity and estimating the precipitation rates.
Figure 7 displays images representing the model outputs for 11 October 2022, at 11:00 UTC. On this date, significant rainfall was experienced in the western region of Paraná, which was covered by the Cascavel radar. The figure comprises several sub-figures, with the first depicting the radar reflectivity, while the subsequent images show the precipitation rates over 15 min intervals.
Notably, the RFRF (Figure 7b) and GBGB (Figure 7c) models exhibit alignment with the radar reflectivity field, indicating their precision in capturing the observed precipitation system’s characteristics. The visual representations in Figure 7 are invaluable in evaluating the models’ capability to reproduce the actual weather conditions, particularly under heavy rainfall scenarios.
The transformation of the station data to the radar grid was achieved using RBF interpolation for 117 stations. It is crucial to highlight that the employed RBF interpolation produced a smooth and efficient surface (Figure 7), although the parameter selection and overfitting potential are critical considerations in ensuring the method’s accuracy.
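A minimal sketch of such an interpolation with SciPy’s RBFInterpolator is shown below. The station layout, grid extent, kernel, and smoothing value are illustrative assumptions; only the station count (117) and the 500 m grid spacing follow the text.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# points: (n_stations, 2) projected x/y coordinates in metres;
# values: 15 min rainfall accumulation (mm) observed at each station.
rng = np.random.default_rng(0)
points = rng.uniform(0, 100_000, size=(117, 2))        # 117 stations, toy layout
values = rng.gamma(shape=1.5, scale=2.0, size=117)

rbf = RBFInterpolator(points, values, kernel="thin_plate_spline", smoothing=1.0)

# 500 m target grid (toy extent); flatten, evaluate, reshape back to 2-D.
xs = np.arange(0, 100_000, 500.0)
ys = np.arange(0, 100_000, 500.0)
grid = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
field = rbf(grid).reshape(len(ys), len(xs))
print(field.shape)
```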
The comparison between the station-obtained data values and those derived from the radar images (as illustrated in
Figure 8) reveals a tendency for overestimation within both metrics. Specifically, the machine learning models RFRF (
Figure 8a) and GBGB (
Figure 8b) exhibit this overestimation, closely followed by the OC_Sub (Oracle) model (
Figure 8c).
Our employed methodology relies on point-based interpolation to generate a 500 m resolution grid. In the northwest region, for example, where interpolated rain gauge data are unavailable, all estimation methods demonstrate overestimation. Thus, it is expected that an increase in the number of rain gauges in the region would reduce this overestimation, by providing more data points for the interpolation method.
Additionally, this overestimation is more pronounced for regions distant from the radar. As the radar beam travels away from the radar, its height above the surface increases due to the Earth’s curvature, which in turn affects the radar measurements. Adjustments are made for the radar beam height, relevant for regions farther from the radar, and further analysis will be conducted that is focused on improving these adjustments.
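Such beam-height corrections are commonly computed with the standard 4/3 effective Earth radius model; the sketch below is that textbook expression, offered as an illustration of the effect rather than the exact correction applied operationally.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0
KE = 4.0 / 3.0  # effective Earth radius factor for standard atmospheric refraction

def beam_height_km(range_km, elevation_deg, radar_height_km=0.0):
    """Height of the radar beam centre above the surface (4/3 Earth radius model)."""
    r = np.asarray(range_km, float)
    el = np.radians(elevation_deg)
    ka = KE * EARTH_RADIUS_KM
    return np.sqrt(r**2 + ka**2 + 2 * r * ka * np.sin(el)) - ka + radar_height_km

# Example: at 193 km (the Pinhao distance) and an assumed 0.5 degree elevation,
# the beam centre is already a few kilometres above the surface.
print(f"{beam_height_km(193.0, 0.5):.2f} km")
```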
To improve future investigations, it would be useful to incorporate additional predictive variables such as the temperature, humidity, and atmospheric pressure into the models. This extension would enable a more detailed and comprehensive analysis of the weather conditions, resulting in more precise precipitation estimates. Additionally, it is crucial to train the models with more data points around the radar station.
Figure 9 and Figure 10 display the algorithm outcomes and theoretical Z–R relationships for the weather events of 12 July 2023, at 22:30 UTC and 2 September 2023, at 13:00 UTC, respectively. These instances confirm the previously discussed overestimation trend seen in the machine learning data for the event on 11 October 2022, at 11:00 UTC.