1. Introduction
Ultraviolet radiation can cause different effects on Earth’s life. In living organisms, UVB radiation destroys DNA, produces protein denaturation, triggers coagulation of albumin, as well as erythema and skin problems. In humans, UVB radiation causes the weakening of immune system, creates conditions for the development of skin cancer, cataract, aging as well as the formation of erythema [
1], dealing to wide economic losses in public health and thousands of deaths each year due to skin cancer [
2].
The intensity of UVB radiation at ground level is affected by the absorption of energy required by the chemical reactions that occur in the atmosphere and by the reflection caused by particles and gases. One of the most important is ozone, which among all atmospheric gases plays an active role in the absorption of UV radiation and protection against dangerous levels of solar radiation [
3]. In the stratosphere, UVB is absorbed mainly in the ozone layer. This region concentrates ninety percent of total ozone at an altitude of between 9 to 18 miles forming a protective shield against UVB radiation. The ozone concentration varies spatially due to chemical reactions which constantly create or destroy this element. In densely populated areas with air pollution problems, UVB radiation interacts with pollutant oxides of nitrogen and nonmethane hydrocarbons in the troposphere to form ozone. There are other air pollutants called ozone-depleting substances (ODS) which, in contact with UVB radiation, release chemical compounds such as chlorine and bromine, which destroy the ozone in the stratosphere. In all latitudes, except the equatorial zone, from 1979 to 1998, the decrease in ozone was the cause of the annual average increase in ultraviolet radiation from 290 nm to 325 nm [
4].
The relationships between ultraviolet radiation and air pollutants have been widely used to analyze the spatial distribution of continuous levels of UVB radiation through statistical models [
5]. The covariates that have been used by these models to explain the spatial and temporal distribution of UVB radiation are clouds, ozone, nitrogen oxides, particulate matter, carbon monoxide and sulfur dioxide [
6]. The clouds have an effect on the distribution of UVB radiation. Scattering by clouds increases the rate of photochemical reactions and reduces the radiation below them. The particulate matter concentration is also an important factor that plays an important role in the distribution of ultraviolet radiation. In [
7], Sun et al. showed that the magnitude of correlation between PM
2.5 and ultraviolet radiation is of the order of −0.5 in the near-surface layer. They deduced that the maximum and the daily average UV radiations could be attenuated by particulate matter by 40% at most. Their results showed that if one day the average UV radiation was high, the next day the average UV radiation was also high, the reason was that the amount of chemical reactions related to UV radiation created new particulate material. Fluctuations of intensities in ultraviolet radiation are also caused by the amounts of atmospheric NO
2 and SO
2. In [
8], McKenzie et al. found that if the amount of NO
2 is increased 10 or more times than the average amount, then the irradiation of UVA rays decreases up to 40.
One of the most important results of statistical theory were developed by Fisher and Tippett [
9], and Gnedenko [
10] on the asymptotic distribution of the maximum of a random sample. They showed that if the maximum of a random sample centered and scaled by properly chosen constants converges to some distribution, it should be one of the following: Fréchet, Weibull or Gumbel. Later, Jenkinson [
11] combined these three distributions into a single distribution, known as the generalized extreme value distribution, also known as the generalized extreme value (GEV) distribution. The GEV distribution uses three parameters corresponding to the location, scale and shape. The sign of the shape parameter determines the type of the distribution: negative values correspond to the Weibull, positive values to the Fréchet and zero to the Gumbel distribution. The estimation of the parameters has been made using maximum likelihood [
12,
13], partial probability weighted moments, L-moments [
14] as well as several Bayesian approaches [
15,
16]. The L-moments estimators are sometimes more accurate in small samples than those obtained by maximum likelihood and in the case of outliers in the data, are more robust than the conventional moments methods [
17,
18]. Martins and Stedinger [
19] showed that the method of penalized maximum likelihood provided better estimates than maximum likelihood and method of moments when the sample sizes are small and the GEV distribution has heavy-tailed tails.
The generalized distribution of extreme values was developed under the assumption of independent samples with stationary distribution. However, since most real applications have spatial or temporal trends, it has been adapted for the study of nonstationary processes [
20]. The nonstationary extreme value analysis has been widely used to study extreme events in hydrology, hydroclimatology, as well as in environmental, anthropogenic and geophysical processes. Particularly, it has been used to study the long-term risks in rainfall [
21], winds [
22], heat waves [
16] and earthquakes [
23,
24]. In these studies, it can be seen that the trend of extreme values has been adjusted using several approaches. In fact, one of these approaches is that if the patterns follow the law determined by a model, then the GEV parameters of the corresponding model are estimated [
25,
26]. In contrast, for some others, it is more appropriate to adjust the trend with smoothing functions [
21,
27]. In both cases, the trend is adjusted by estimating the parameters on predictors of the location parameter of the GEV distribution. Analogously, an adjustment similar to the logarithm of the scale parameter is made [
28,
29,
30]. The contrast occurs for the shape parameter, which is assumed to be constant because the estimation is numerically fraught when this parameter is allowed to be too flexible [
28]. Extreme nonstationary values have also been studied extensively using the Bayesian approach. For instance, Gaetan and Grigoletto [
15] studied rainfall maxima with Markov random fields approximated based on smoothing kernel, Reich et al. [
16] studied heat waves using a Bayesian hierarchical model with the generalized Pareto distribution (GPD) and Sang and Gelfand [
31] studied the extreme values of spatial stochastic processes and modeled the observed trend as a function of spatial covariates.
3. Results and Discussion
The descriptive statistics of the data by each monitoring station are shown in
Table 1. On this table, we can see that there are differences in the distributions of the extreme values in each of the monitoring stations, i.e., the mean of the distributions is not constant. We verified these results in the boxplot presented in
Figure 4. By a simple inspection of the descriptive statistics, we observed that the distribution of the extremes is not stationary, therefore the use of a nonstationary model of extreme values is justified for the analysis of trends. The inspection of
Table 1 indicates that in station 1, located in the Acatlán area, the maximum intensity recorded was 6.09 W/m
2, in contrast to the maximum intensity measured in station 5, located in the San Agustín area, located in the municipality of Ecatepec de Morelos, in the State of Mexico, where the maxima was 5.65 W/m
2. Similarly, the station 3, located at the Merced and the station 7, located at Tlalnepantla, showed the lowest UVB radiation values in comparison with the other stations. Moreover, on these three stations where we observe more frequently intense periods of air pollution within the study region, the level of UVB radiation is lower than other less polluted areas in the MCMA.
The results of the comparison between the modeling of maximum likelihood method and the modeling of penalized maximum likelihood is shown in
Table 2. In this table, we observed that the estimated parameters have a considerable shrinkage, which is a desirable characteristic that indicates a strong regularization of the model. We validated the results obtained by observing that the value of −12.49 corresponding to the log-likelihood of the adjusted model has been significantly improved in relation to the value of −180.07 corresponding to the log likelihood of the stationary model. We also validated the proposed model using the Deviance statistic [
20]. In the contrast of two models,
with
a parametric vector against another model
with
a subset vector such as
, the deviance statistic defined by
, where
is the maximized log likelihood function of model M, is used to prove the superiority of the
model. Values of D greater than the quantile
of the
distribution with k degrees of freedom, are considered significant, where k is the difference between the dimensions of
and
. In
Table 2 shows the results of the validation of the proposed model using the statistical deviance test, which evidences the improvement of the proposed model compared to the stationary model. These results indicate that the deviance statistics is significant with a reliability level of 99%. Therefore, we concluded that our model allows us to explain spatial and temporal trends using the relationship between covariates and the amount of UVB radiation measured at ground surface.
The estimates obtained for the
and
parameters were 0.2504 and −0.0356, respectively. The spatial smoothing for the years 2000, 2005, 2010, 2015, 2018 and 2019 for the location functions are shown in
Figure 5. The results show well-defined patterns related to the trend in the spatial plane. The magnitude of the UVB radiation maxima decreased as we moved toward the east direction of the study area. This region coincides with the most industrialized areas of the MCMA. Therefore, these results indicate that the air pollution covariates reduce the net amount of UVB radiation that reaches the ground surface. In contrast, we observed that in the less polluted areas of MCMA there is a greater amount of UVB radiation. However, we were able to observe that, although the general trends are maintained throughout the study period, between 2010 and 2015 there was a small decrease in the intensity of UVB radiation in the central region, which return to the initial levels in 2018 and 2019. These results show that our model allowed us to identify the complete temporal dynamics of the trend throughout the study period. This is one of the strengths of the proposed model, which allows us to identify patterns in the distribution of maximum UVB radiation, make inferences and obtain conclusions about extreme values throughout the study region over time. The results also show the advantage of using a spline model with radial-based functions to estimate trends in extreme values. The nonlinear spatial function estimates in each of the different periods through a single model show the existence of the spatial variation of UVB radiation maxima. The proposed model also has the advantage that it includes the effects of covariate interactions in the model through the use of spline functions that depend on the norm of Euclidean distance between covariates and knots. This feature of the model combined with the penalty of the parameters results in a smooth continuous surface as shown in
Figure 5. Future research could include the study of other types of distances between observations.
The results of the temporal trend of UVB radiation over the years keeping the other covariates constant are shown in
Figure 6. In this figure, we can observe the existence of cyclic temporal patterns in the trends of the maxima in the regions located around the monitoring stations. An important finding related to the temporal behavior of the extremes, which can also be seen in
Figure 6, is the decrease in the location parameter over time. An explanation for this finding could be the increase in pollution, specifically the amount of ozone (O
3), nitrogen oxides (NO
x), particles of 10 μm or less in diameter (PM
10) and carbon monoxide (CO), which decreases the amount of UVB radiation as a result of direct chemical reactions or by radiation blockage.
The spatial distribution of the extreme values of UVB radiation is influenced by the physicochemical interactions it has with covariates such as ozone, nitrogen oxides, particles of 10 μm or less in diameter (PM
10), carbon monoxide (CO), relative humidity (RH) and sulfur dioxide (SO
2) [
6]. The atmospheric concentrations of some of these covariates also present seasonal behaviors which modify the intensities of UVB radiation over time. These covariates are used in the nonstationary extreme value model to estimate the trend on UVB radiation maxima, through the linear predictors corresponding to the location and scale parameters. In order to increase the likelihood of the model, we built the design matrix by using a nonlinear function of the square of distance of each observation to knots on the vector space of the observations. Each knot represented one of the k centroids resulting from a hierarchical clustering. There are several approaches to obtain a basis in the column space of the covariates, however, considering the sample size and the number of nodes, the radial basis functions are sufficient to obtain a linearly independent set.
One of the most important applications of the models obtained with the analysis of nonstationary extreme values, consists in the elaboration of risk probability maps and the return level maps. The return level
is the threshold at which an extreme value is exceeded with probability p, which is expected to occur once every 1/p years (Fawcett and Green [
35]).
Figure 7 shows the maximum expected UVB radiation for a return period of 25 years. Isolines on the map (
Figure 7) were used to visualize the spatial risk of maximum UVB radiation. In fact, the highest UVB radiation values over a 25-year return period can be expected in the west part of the study area in the regions surrounding the SFE and PED monitoring stations (
Figure 7). An interesting fact that explains the spatial trend of the maxima is the amount of atmospheric pollution resulted from the emissions of internal combustion vehicles and industrial emissions, among others. Further, in densely populated areas, such as the Merced or Hangares, where a large number of vehicles circulate daily, as well as in industrialized areas such as Naucalpan and Tlalnepantla, we can expect to have the lowest return levels of UVB radiation of the entire study area. An opposite situation occurs in regions farther from urban areas, in which an increase in the estimates of UVB radiation intensity can be observed. Therefore, the map of return levels shown in
Figure 7 allows us to confirm the potential risk of dangerous levels of UVB radiation in the study region. These findings should encourage the creation of policies and the revision of standards related to the protection against UVB radiation as well as the delimitation of critical areas of risk.
We agree with the results of Ailliot et al. [
36], who reported that for stationary case, the imposing constraints improves the performance of the estimation on
parameter. However, we verified these results in the nonstationary case. Similar to the results of Martins and Stedinger [
19], we obtained implausibly large estimates of
for unconstrained maximum likelihood. We observed that the Newton–Raphson method does not reach the global optimal solution for most of the initial values. This happens because we have enough variables to estimate and the quadratic approximation, which is the basis of the optimization algorithm, is not appropriate to approximate the log likelihood function when the initial values are distant from the optimal value. On this case, the optimization of the penalized log likelihood was carried out in two stages. The first stage consists of finding an initial point or seed, which will be used in the second stage of the algorithm. To achieve this, the optimization was performed in a smaller parametric space. Once the maximum is found in a parametric space with a dimension smaller than the original, we used these values as seeds or initial values to perform the approximation using the Newton–Raphson algorithm in the initial parametric space. We also agree with the results of Coles and Dixon [
37], which found that estimators are improved using the maximum penalized likelihood method by restricting the range of
.
Similar to the work of Bais et al. [
38], we conclude that there is a relationship between UVB radiation and ozone, SO
2 and clouds on the spatial UVB radiation distribution across the metropolitan area of Mexico City. Other researchers have found similar relationships through direct chemical studies, concluding that chemical reactions that involve UVB radiation to produce compounds such as ozone, nitric oxide or sulfur dioxide decrease the amount of UVB radiation that reaches the ground [
3,
39]. There also exists other factors that interact with UVB radiation. In heavily polluted regions, there are several types of particulate matter which block the UVB radiation path [
38]. The goal of our study was to analyze the spatio-temporal distribution of the UVB radiation maxima, since extremes have a strong impact on public health, while in most studies only study continuous measurements. We take advantage of the chemical interactions between air pollution and UVB radiation to model the temporal dynamics on the spatial distribution of maximas in the study area.
The monitoring and analysis of UVB radiation levels is a priority concern in terms of public health for all the largest population centers. In previous studies on UVB radiation in the metropolitan area of Mexico City, Acosta and Evans [
40] found that UVB radiation levels, measured over the international standard units, reached dangerous levels for humans. We agree with their findings, in which they also detected a strong attenuation of UVB radiation at ground level in the urban troposphere under polluted conditions. However, in contrast to them, we have used the GEV distribution. An alternative to this distribution is the skew generalized extreme value distribution (SGEV) [
41], which showed that it improves the return level estimation in the case of a slow convergence or in the heavy-tailed case. Future work should consider the use of the SGEV distribution for analysis of UVB radiation extremes.
4. Conclusions
In this study, we have developed a nonstationary extreme value model for UVB radiation maxima on the metropolitan area of Mexico City using a semi-parameterized model to obtain a spatio-temporal smoothing of the location parameter of the GEV distribution. We have estimated return levels of extreme events of UVB radiation through a nonstationary extreme value model in which we use both spatial and environmental covariates. UVB maxima were obtained in each of the monitoring stations through the block maxima method. The spatial and temporal trend was approximated by means of the location parameter of the GEV distribution using linear predictors based on Gaussian basis functions of the observations to knots, in order to include the effect of the interaction between the covariates in the model. One of the advantages of this model is that the estimated smooth curve allows for the adjustment of a wide variety of nonlinear functions, allowing its application in a wide variety of real situations. The regularization of the model is obtained by penalizing the parameters via penalized maximum likelihood (PML) which has the advantage of producing a shrinkage of the coefficient estimates and reducing overfitting. These methods are equivalent to the optimization of the constrained maximum likelihood and also to Bayesian methods in which the coefficients have a priori normal distributions with zero mean. The deviance test was used to validate the fitted model. The results showed that the adjusted model was significantly better than the stationary model with a reliability of 99%.
Regarding the empirical analysis of UVB radiation on the metropolitan area of Mexico City, we characterized the distribution of maxima in the spatial and temporal plane. In the spatial plane, although the results show the existence of differentiated local patterns, the estimates of the location parameter of the GEV distribution showed that there is a plane that determines the trend in the entire study region, which evidences the existence of a positive linear correlation in the west direction of the study area. These results are consistent with the demographic characteristics of the area. In the temporal plane, we observe cyclical observations on the location parameter of the spatial distribution of maxima. Such oscillations are dominated by a negative linear trend with respect to time, which is consistent with the increase in population and its corresponding consequences on air pollution. Our findings also revealed the existence of areas with well-defined spatio-temporal patterns which should help administrative authorities to improve prevention policies and standards to mitigate the impact of UVB radiation maxima. Regarding the simulation results, these demonstrate that it is feasible to identify the nonlinear characteristics of the trends reliably under the parametric conditions used in the simulation, which were established with values of the GEV parameters similar to those found in real conditions. Particularly, the spatial function for the location parameter used in the simulation, which contains nonlinear features that we can expect to find in real data, was satisfactorily estimated. However, we conclude that optimal simulations related to the distribution of nonstationary GEV is an issue that requires further investigation.
Future work can first include to analyze new functions of distance between vectors, since it is natural to think that some variables may be more important as explanatory variables than others. Secondly, to examine the asymptotic properties of the estimators regarding to the number of knots. Finally, some other further studies could consider the analysis of the sensitivity of estimates on more complex nonlinear functions using simulations on the GEV distribution under different sample sizes.