1. Introduction
Climate extremes such as floods, droughts and heatwaves have become topical issues because they have triggered most of the natural disasters of recent decades, which can affect humans and the natural environment [1]. Climate extreme events occur regularly across the globe and impact society in various ways, leading to loss of lives, shortage of food, failure of crops, famine, mass migration and health issues [2]. The increased number, frequency and intensity of natural hazards such as floods, heatwaves and hurricanes are generally attributed to climate change [3,4,5]. In Africa, impacts of a changing climate vary significantly by region [6,7]. More than 90% of natural disasters in southern Africa are related to weather, climate and water. Understanding extreme climate events will help prepare and formulate mitigation strategies to cope with events associated with climate change. Modelling and predicting future extreme events is therefore increasingly relevant to commercial agriculture, insurance companies, statisticians and meteorologists.
Extreme climate and weather events such as floods, droughts and heatwaves negatively impact society, the environment and resource management in developing countries [6,8,9]. In South Africa, anomalous cut-off lows, tropical cyclones and tropical storms are the major extreme rainfall producing systems affecting the Limpopo province, while the Botswana High becomes dominant during heatwaves and drought. Extreme weather events are common in Limpopo during summertime and often coincide with mature phases of the El Niño Southern Oscillation. In February 2000, about 700 people lost their lives and over a million residents were displaced in Mozambique due to flooding associated with tropical cyclone Eline [10,11]. In recent decades (1980–2015), southern Africa experienced 491 climate disasters (hydrological, climatological and meteorological), which resulted in 110,978 deaths and left 2.49 million people homeless [8,12]. Climate extreme events therefore pose risks to the lives and livelihoods of South African society [13]. South Africa is highly vulnerable to extreme climate events due to its geographical location and socio-economic factors. Several tropical cyclones have affected countries such as Madagascar, Mozambique and South Africa [14].
Rainfall is highly variable over southern Africa on several space and time scales [15]. Climate change has altered rainfall characteristics, including the duration of the rainy season, the length of dry spells, the frequency of rainy days and the occurrence of heavy rainfall events [16]. This results in regular and severe water-related extremes such as floods and drought [17]. In South Africa, the Limpopo province experiences hot to very hot conditions during the austral summer season [18,19]. Extreme drought is a critical problem in the region, affecting the agricultural sector due to high temperatures and unreliable rainfall [8,20]. This study builds on this background and on the challenges and impacts of climate and weather extreme events in the Limpopo province.
Long-term data from the analysis of historical climate extremes offer considerable scope for better management, forecasting and mitigation of climate extremes [7]. Extreme value theory (EVT) is a powerful method for quantifying the stochastic behaviour of a process at unusually large or small levels. EVT has been widely used in various fields such as atmospheric science (e.g., [21]), hydrology (e.g., [22]), finance [23] and many other fields of application. The observational and statistical modelling results of the studies mentioned above have shown remarkable increases in the intensity of precipitation extremes.
This study aims to employ extreme value theory to model future climate extreme events using the generalised extreme value distribution (GEVD), with parameters estimated by the maximum likelihood method. The GEVD is a family of asymptotic distributions that describes the behaviour of extreme conditions. It comprises three extreme value distributions, namely the Gumbel, Fréchet and Weibull families, which are also referred to as type I, II and III extreme value distributions [24].
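As an illustration of the GEVD family described above, the following minimal Python sketch fits a GEVD by maximum likelihood and computes a return level. It is not the implementation used in this study. It assumes SciPy's genextreme distribution, whose shape parameter c has the opposite sign to the conventional shape parameter xi, and the function names and the tolerance used to label the family are hypothetical choices made for this example.

```python
import numpy as np
from scipy.stats import genextreme


def fit_gev_return_level(block_maxima, return_period):
    """Fit a GEVD by maximum likelihood; return (xi, mu, sigma, return level)."""
    block_maxima = np.asarray(block_maxima, dtype=float)
    # SciPy parameterises the shape as c = -xi (assumption stated in the lead-in).
    c, mu, sigma = genextreme.fit(block_maxima)
    xi = -c
    # The T-block return level is the quantile exceeded with probability 1/T.
    level = genextreme.ppf(1.0 - 1.0 / return_period, c, loc=mu, scale=sigma)
    return xi, mu, sigma, level


def gev_family(xi, tol=0.05):
    """Crude label for the GEVD type from a point estimate of xi.

    The tolerance is an illustrative assumption; in practice the decision
    would rest on a confidence interval or a likelihood ratio test.
    """
    if abs(xi) <= tol:
        return "Gumbel"      # type I, xi = 0
    if xi > 0:
        return "Fréchet"     # type II, heavy upper tail
    return "Weibull"         # type III, bounded upper tail
```

A negative fitted xi corresponds to the Weibull family, whose upper tail has a finite end point at mu - sigma/xi.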
Chifurira and Chikobvu [25] fitted a GEVD to average yearly rainfall with the objective of modelling the upper tail of the rainfall distribution. The Gumbel class of distributions was found to fit the data well using the Anderson–Darling goodness of fit test. The GEVD with constant shape and scale parameters but a location parameter varying over time was inadequate to model Zimbabwe’s extreme maximum rainfall. The study indicated that a high mean annual rainfall of 1193 mm is expected in approximately 300 years [25]. A similar analysis to the present study in multivariate extreme value theory (MEVT) is that of [26], who used bivariate threshold excess models for temperature extremes in the Limpopo province at three meteorological stations: Thohoyandou, Lephalale and Polokwane. Similar to the present study, the approach by [26] also used a penalised cubic smoothing spline to perform a nonlinear detrending of the temperature data before fitting bivariate threshold excess models to positive residuals above the threshold. The present study, which deals with rainfall as the main parameter, extends the approach of [26] by using a time-varying threshold instead of a constant threshold to capture the climate change effects in the monthly maximum rainfall data series. Recent studies on modelling extreme rainfall using extreme value theory and the r-largest order statistics, considering model and return level uncertainty, include those of [27,28,29], among others.
This study applies extreme value distributions to model maximum annual rainfall in Limpopo province. The results can contribute to the knowledge of EVT applications to long-term rainfall data and inform recommendations to government agencies and private organisations on extreme events and their negative impact on the economy. To our knowledge, no published study has modelled long-term yearly maximum rainfall in Limpopo province using the EVT approaches applied in this study.
Studies such as [30] model the influence of temperature on average daily electricity demand in South Africa for the period 2000–2010 using a piecewise linear regression model and the generalised extreme value theory approach. Severe weather conditions increase electricity demand because air-conditioning appliances are used in summer and heating systems are used in winter [6,30]. South Africa is also concerned about the impacts of extreme heatwave events on the public and how these events may change in the future [31,32]. A critical step in extreme value theory is the choice of a threshold when using the peaks-over-threshold (POT) approach. We also closely follow the work of [33,34].
Southworth et al. [34] provide a detailed computational approach to conditional modelling of multivariate extreme value data using an R package called ‘texmex’. In another study on threshold choice, Ref. [35] proposed a covariate-dependent threshold based on expectiles. They argued that, although no threshold choice method is universally the best, a strong argument against the use of a constant threshold is that an observation considered extreme at one covariate level may not qualify as extreme at another covariate level. The present study uses threshold stability plots, a graphical method that is widely used to determine the threshold. The idea of this plot is that the exceedances of a sufficiently high threshold follow a generalised Pareto distribution (GPD), so that the estimated shape and modified scale parameters should remain approximately constant above a suitable threshold. The study by [36] used a GPD with time-varying covariates and thresholds to model daily peak electricity demand for South Africa. They used an intervals estimator method in declustering observations that exceed the threshold. Furthermore, the findings of [36] showed a better fit for the GPD model to the data compared to the generalised extreme value distribution (GEVD).
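The numbers behind a threshold stability plot can be sketched as follows. This is a minimal illustration rather than the procedure used in this study: it assumes SciPy's genpareto distribution (whose shape parameter c equals the GPD shape xi), a constant threshold for each fit, and a hypothetical minimum number of exceedances. The function name is also hypothetical.

```python
import numpy as np
from scipy.stats import genpareto


def threshold_stability(values, thresholds, min_exceedances=30):
    """Fit a GPD to the excesses over each candidate threshold.

    Returns one tuple per threshold:
    (threshold, shape xi, modified scale sigma_u - xi * u, number of excesses).
    Above a suitable threshold the shape and the modified scale should be
    roughly constant, which is what the stability plot displays.
    """
    values = np.asarray(values, dtype=float)
    rows = []
    for u in thresholds:
        excess = values[values > u] - u
        if excess.size < min_exceedances:
            continue  # too few points for a meaningful fit (assumed cut-off)
        # Location is fixed at zero because excesses are measured from u.
        xi, _, sigma_u = genpareto.fit(excess, floc=0.0)
        rows.append((float(u), xi, sigma_u - xi * u, int(excess.size)))
    return rows
```

Plotting the second and third columns against the first, with confidence intervals added, gives the usual pair of stability plots; the lowest threshold above which both curves flatten is a common choice.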
The main contribution of this paper is to employ extreme value theory to model climate extreme events using the r-largest order statistics. The knowledge and understanding of extreme climate events will help prepare and formulate mitigation strategies to cope with events associated with climate change. In this study, the interest was in deriving extreme maximum rainfall return levels from 1960 to 2020. The study combines two main approaches: the bivariate conditional extremes model [33,34,37] and a time-varying threshold [36]. The rest of the paper is organised as follows: Section 2 presents the materials and methods. The empirical results are presented in Section 3. A discussion of the results is given in Section 4, while Section 5 concludes the paper.
4. Discussion
The current study was motivated by the work of [44,55], who used the r-largest order statistics in modelling extreme wind speed and estimation of maximum daily temperature, respectively. The results of this study were obtained by applying the GEVD fitted to the r-largest order statistics (GEVD_r) and the GPD models. The parameters of the models were estimated using maximum likelihood estimation (MLE). Empirical results from the evaluation metrics for prediction intervals suggest that the GEVD_r model with intervals based on the profile likelihood produces prediction intervals with the smallest PINAW. Modelling of extreme maximum rainfall is important in the field of hydrology for decision making. By modelling extreme maximum rainfall in the study area, Thabazimbi, stakeholders can be informed of return levels and return periods. This supports decision making and alerts the community living in and around Thabazimbi when they are likely to experience extreme, destructive rainfall.
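For readers unfamiliar with the r-largest order statistics model, the sketch below extracts the r largest values per block and evaluates the corresponding negative log-likelihood. It is a simplified illustration under stated assumptions, not the code used in this study: the function names are hypothetical, blocks are assumed to be identified by a label such as the year, values within a block are treated as independent, and the Gumbel limit (xi = 0) is excluded rather than handled separately.

```python
import numpy as np


def r_largest_per_block(values, block_ids, r):
    """Return an array with one row per block holding its r largest values,
    sorted in decreasing order. Blocks with fewer than r values are skipped."""
    values = np.asarray(values, dtype=float)
    block_ids = np.asarray(block_ids)
    rows = []
    for block in np.unique(block_ids):
        block_values = np.sort(values[block_ids == block])[::-1]
        if block_values.size >= r:
            rows.append(block_values[:r])
    return np.array(rows)


def gev_r_neg_log_likelihood(params, top_r):
    """Negative log-likelihood of the r-largest order statistics model.

    params = (mu, sigma, xi); top_r has shape (n_blocks, r) with each row in
    decreasing order. Returns inf outside the support or when xi is ~0
    (the Gumbel case is deliberately left out of this sketch).
    """
    mu, sigma, xi = params
    top_r = np.asarray(top_r, dtype=float)
    if sigma <= 0 or abs(xi) < 1e-8:
        return np.inf
    t = 1.0 + xi * (top_r - mu) / sigma
    if np.any(t <= 0):
        return np.inf
    r = top_r.shape[1]
    smallest = t[:, -1]  # term for the r-th largest value in each block
    log_lik = (-smallest ** (-1.0 / xi)
               - r * np.log(sigma)
               - (1.0 / xi + 1.0) * np.log(t).sum(axis=1))
    return -log_lik.sum()
```

With r = 1 the expression reduces to the ordinary GEVD log-likelihood for block maxima. The function can be passed to a numerical optimiser such as scipy.optimize.minimize to obtain maximum likelihood estimates.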
In this study, the data were tested for the existence of a monotonic trend using the Cox–Stuart (CS) trend test. Using the CS test, the p-value was 0.2492, implying no monotonic trend at the 5% level of significance. However, upon using the Mann–Kendall (MK) trend test and the seasonal MK test, we rejected the null hypothesis of no trend and concluded that there is both a local trend (p-value = 0.0001784) and a seasonal global trend (p-value = 0.000002689). The magnitude of the trend was then computed using Sen’s slope estimator. The Weibull class of distributions is the best fitting model for the data in all the modelling frameworks. This implies that the distributions of extreme maximum rainfall are bounded above.
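The Mann–Kendall test and Sen's slope mentioned above can be sketched in a few lines of Python. This is a minimal, non-seasonal version written for illustration only: it uses the normal approximation with a tie correction, ignores serial correlation, and the function names are hypothetical. It is not the software used to obtain the p-values reported in this study.

```python
import numpy as np
from scipy.stats import norm


def mann_kendall(x):
    """Two-sided Mann-Kendall trend test (normal approximation).

    Returns (S statistic, Z score, p-value). The null hypothesis is that
    there is no monotonic trend; it is rejected when the p-value is below
    the chosen significance level.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # signs[i, j] = sign(x[j] - x[i]); the upper triangle keeps pairs i < j.
    signs = np.sign(x[None, :] - x[:, None])
    s = float(np.triu(signs, k=1).sum())
    # Variance of S with the standard correction for tied values.
    _, ties = np.unique(x, return_counts=True)
    tie_term = np.sum(ties * (ties - 1) * (2 * ties + 5))
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s > 0:
        z = (s - 1.0) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1.0) / np.sqrt(var_s)
    else:
        z = 0.0
    return s, z, 2.0 * norm.sf(abs(z))


def sens_slope(x):
    """Sen's slope: the median of all pairwise slopes (x[j] - x[i]) / (j - i)."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)
    return float(np.median((x[j] - x[i]) / (j - i)))
```

The slope is expressed per time step of the series, so for monthly data it is a change per month.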
The study declustered the exceedances above a sufficiently high threshold before fitting the GPD model. It should be noted that, although the procedure of declustering and then fitting the GPD to cluster maxima gives a valid statistical model whose underlying assumptions are met, this may not be of ultimate interest in practice. For example, rainfall information can be helpful if the assessment of flood damage is the ultimate goal. Here it may be more informative to analyse complete clusters and understand the aggregate rainfall over a rainy spell rather than focus on the largest value within that spell. The lack of long-term rainfall data for other stations in Limpopo province prevented those stations from being investigated in this study. The correlation between the rainfall data and ocean-atmosphere drivers such as the Southern Oscillation Index (SOI) and the Indian Ocean Dipole (IOD) was weak. As a result, these two variables were not included as covariates in the models developed in this study.
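The distinction between cluster maxima and cluster totals can be made concrete with a small sketch. It uses simple runs declustering, which is an assumption made for this illustration (the declustering scheme is not restated in this section, and [36] used an intervals estimator instead). The function name and the run length parameter are hypothetical.

```python
import numpy as np


def runs_decluster(values, threshold, run_length):
    """Runs declustering of threshold exceedances.

    Exceedances belong to the same cluster unless they are separated by at
    least `run_length` consecutive values at or below the threshold.
    Returns (indices of cluster maxima, cluster maxima, cluster totals),
    where a cluster total sums all values from the first to the last
    exceedance of the cluster.
    """
    values = np.asarray(values, dtype=float)
    exceed_idx = np.flatnonzero(values > threshold)
    if exceed_idx.size == 0:
        empty = np.array([], dtype=float)
        return np.array([], dtype=int), empty, empty
    # A gap of more than run_length positions starts a new cluster.
    breaks = np.flatnonzero(np.diff(exceed_idx) > run_length)
    clusters = np.split(exceed_idx, breaks + 1)
    peak_idx = np.array([c[np.argmax(values[c])] for c in clusters])
    totals = np.array([values[c[0]:c[-1] + 1].sum() for c in clusters])
    return peak_idx, values[peak_idx], totals
```

The cluster maxima are what a GPD would be fitted to, while the totals correspond to the aggregate rainfall over a spell discussed above.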
Empirical results from this study show that the prediction interval widths from the profile likelihood method are preferred to those from the delta method, as seen in Table 5. From Table 5, the delta method applied to a GEVD_r model has the lowest PINAW value, 1.97, among the four (model, method) combinations, in which each of the two GEVD_r models is paired with the delta and the profile likelihood methods. The prediction intervals from the GPD, Mean and Median models are too narrow and do not capture the uncertainty in the return levels. Prediction intervals that are narrow yet still capture the uncertainty are more useful to decision-makers in hydrology than intervals that are too wide. Our results are consistent with those of [56], who estimated extreme flood heights using the r-largest order statistics and modelled the uncertainty in the extreme quantiles of flood heights using the delta and the profile likelihood methods, respectively. Similar studies on the use of the r-largest order statistics are given in [28,38,40,57], among others.
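For reference, the PINAW metric used to compare prediction intervals can be sketched as below, together with a coverage measure that is usually reported alongside it. The normalisation by the range of the observed values is a common convention and is an assumption here, since the formula is not restated in this section. The function names are hypothetical, and a reported value may be scaled differently (for example, as a percentage).

```python
import numpy as np


def pinaw(lower, upper, observed):
    """Prediction interval normalised average width.

    Average interval width divided by the range of the observed values
    (assumed normalisation). Smaller values mean narrower intervals.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    observed = np.asarray(observed, dtype=float)
    data_range = observed.max() - observed.min()
    return float(np.mean(upper - lower) / data_range)


def picp(lower, upper, observed):
    """Prediction interval coverage probability: the share of observations
    that fall inside their interval. Narrow intervals are only useful if
    this stays close to the nominal level."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    observed = np.asarray(observed, dtype=float)
    inside = (observed >= lower) & (observed <= upper)
    return float(np.mean(inside))
```

Reading the two numbers together guards against preferring a model simply because its intervals are narrow.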