1. Introduction
Contrary to the general prospects of global warming [1], a number of unusual extreme cold events (ECE), such as bitterly cold waves, have frequently occurred in recent years in mid-latitude regions of the Northern Hemisphere during winter [2,3,4,5,6,7,8,9,10,11]. Some mid-latitude regions, like Central Asia and eastern Siberia, have even shown a downward winter temperature trend over the past decades [12,13]. These extreme cold events often last from several days to more than ten days and have a great impact on people's health and economic life. It is therefore very useful to be able to forecast such events well in advance, i.e., two weeks or more before they occur.
The range of predictability of weather forecasts is about two weeks [14]. Forecasts beyond two weeks fall into the subseasonal category. The subseasonal-to-seasonal (S2S) time scale has long been considered a "predictability gap": it is sufficiently long that much of the memory of the initial conditions of the atmosphere is lost, yet it may be too short for the variability in the boundary forcing (such as ocean, land and sea ice) to be large enough [15,16,17,18,19]. Recent studies have indicated important potential sources of predictability in this time range through improved understanding and representation of atmospheric phenomena such as the Madden–Julian Oscillation (MJO) [20,21,22,23]. To bridge the gap between weather and seasonal predictions and meet the demands of user communities, the World Weather Research Program (WWRP) and the World Climate Research Program (WCRP) established the Subseasonal-to-Seasonal Prediction Project [24] in 2013 (http://www.s2sprediction.net, accessed on 20 May 2018). An important outcome of this project is the establishment of an extensive database containing subseasonal-to-seasonal reforecasts and near real-time forecasts from 11 operational centres. It provides a powerful community resource for investigating the mechanisms of S2S predictability, as well as for assessing the skill and usefulness of state-of-the-art subseasonal forecasts for applications.
Since extreme events are rare, reforecasts covering a longer period with more verification years are desirable for validation. The S2S project provides such longer reforecasts (their length varies from 12 to 33 years), and these data have so far not been considered for verification. This study investigates whether the ECE with the largest impacts during winter are usefully forecasted by the current major operational models on a subseasonal timescale. The paper is structured as follows: the S2S database and the observational data used for verification are introduced in Section 2; Section 3 describes the verification metrics; Section 4 shows the BSS and ROC scores of probabilistic forecasts for ECE and the effects of the MJO on the probabilistic forecast skill for ECE; and a summary and a discussion of the potential for further research are provided in Section 5.
2. Data
Reforecasts and near real-time forecasts (available with a 3-week delay) with a forecast lead time of up to 60 days from 11 operational centres—the Australian Bureau of Meteorology (BoM), the China Meteorological Administration (CMA), the European Centre for Medium-Range Weather Forecasts (ECMWF), Environment and Climate Change Canada (ECCC), the Institute of Atmospheric Sciences and Climate of Italy (CNR-ISAC), the Hydrometeorological Centre of Russia (HMCR), the Japan Meteorological Agency (JMA), the Korea Meteorological Administration (KMA), Météo-France/Centre National de Recherche Météorologiques (CNRM), the National Centres for Environmental Prediction (NCEP) and the United Kingdom's Met Office (UKMO)—can be freely downloaded from the S2S database (https://apps.ecmwf.int/datasets/data/s2s/levtype=sfc/type=cf/, accessed on 20 May 2018). Every reforecast or forecast includes a control forecast (from a single non-perturbed initial state) and a number of perturbed forecasts that are used to produce probabilistic forecasts or ensemble means.
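For readers who wish to reproduce the retrieval, a minimal sketch using the ECMWF Web API (the ecmwf-api-client Python package, which provides ECMWFDataServer) is given below. The MARS keyword values shown (origin, parameter code, dates, steps) are illustrative assumptions and should be checked against the S2S archive catalogue for each model:

```python
# Minimal S2S retrieval sketch via the ECMWF Web API (ecmwf-api-client).
# The keyword values below are illustrative; consult the S2S archive
# catalogue for the valid origin/date/step combinations of each model.
from ecmwfapi import ECMWFDataServer

server = ECMWFDataServer()  # reads credentials from ~/.ecmwfapirc
server.retrieve({
    "class": "s2",             # S2S archive
    "dataset": "s2s",
    "origin": "ecmf",          # ECMWF; e.g., "kwbc" for NCEP, "ammc" for BoM
    "expver": "prod",
    "stream": "enfo",          # real-time ensemble forecasts
    "type": "cf",              # control forecast ("pf" for perturbed members)
    "levtype": "sfc",
    "param": "167",            # 2 m temperature
    "date": "2015-11-02",      # one initialization date in the study period
    "time": "00:00:00",
    "step": "0/to/1104/by/24",
    "grid": "1.5/1.5",         # the common 1.5 x 1.5 degree grid used here
    "target": "t2m_s2s_ecmwf_cf.grib",
})
```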
Table 1 details the configurations of the S2S models for reforecasts and forecasts. The reforecast frequency and length differ among models. The various horizontal resolutions of the atmospheric models were uniformly interpolated to a 1.5° × 1.5° latitude–longitude grid. In addition, the ensemble size and whether the atmospheric model is coupled to an ocean or sea-ice model are not consistent among the models. Despite these differences, the S2S models share enough commonalities to allow a comparison, as will be shown in this work.
Six of the 11 S2S models were selected according to their ensemble size and the nature of their probabilistic forecasts: the ECMWF, NCEP (its original forecast ensemble size is 16 per day, extended to 48 using a 3-day lag methodology, as sketched below), BoM, JMA, CNRM and CNR-ISAC models. Among these six models, the ensemble size varies from 33 to 51.
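As an illustration of that lagged extension, the sketch below pools members from forecasts initialized on successive days at a common valid day; the array layout and variable names are assumptions made for this example:

```python
import numpy as np

def lagged_ensemble(forecasts, valid_day, n_lags=3):
    """Pool members from forecasts initialized on n_lags successive days,
    all matched at the same valid day.

    forecasts: dict mapping init day (int) -> array (members, lead_days, ...)
    valid_day: day (int) at which the lagged forecasts are aligned
    Returns the pooled members stacked along the first axis.
    """
    pooled = []
    for lag in range(n_lags):
        init = valid_day - lag      # earlier initializations...
        lead = valid_day - init     # ...contribute at longer lead times
        pooled.append(forecasts[init][:, lead])
    return np.concatenate(pooled, axis=0)  # e.g., 3 x 16 = 48 members for NCEP
```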
The observational data used in this study are the daily 2 m temperature from the ERA-Interim reanalysis [25] on a 1.5° × 1.5° latitude–longitude grid. The MJO indices used in this article were provided by the ECMWF [26].
3. Verification Metrics
Forecast verification is the process of assessing the quality of forecasts. As the application of probabilistic forecasts becomes ever more common, their evaluation becomes increasingly important [27,28,29,30,31,32,33,34,35,36,37,38]. The skill of ECE probabilistic forecasts in this study is illustrated and measured in the form of the Brier Skill Score (BSS) and the area under the Relative Operating Characteristic (ROC) curve.
The Brier score (BS) is essentially the mean squared error (MSE) of probabilistic forecasts of a dichotomous event [39,40,41]. It is one of the most commonly used scores for assessing probabilistic forecasts of binary events (e.g., whether an ECE occurs or not), and it forms the basis of other widely used probabilistic scores, such as the ranked probability score [42,43]. The BS is often formulated as a skill score by relating it to the score obtained via a reference forecast strategy, which usually refers to the relevant climatology. Thus, the BSS is given by:

$$\mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{cl}}$$

where $\mathrm{BS}_{cl}$ is the Brier score of the climatological reference forecast. The BSS is the conventional skill score form that uses the Brier score as the underlying accuracy measure. Positive values of the BSS indicate forecast benefit with respect to the climatological forecast. Since the BSS is subject to a negative bias that is strongest for small ensemble sizes, the debiased BSS (BSSD) was introduced; it is calculated by adding a correction term D to the reference score [35,44]:

$$\mathrm{BSS}_D = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{cl} + D}, \qquad D = \frac{\varepsilon\,(1-\varepsilon)}{M}$$

where $\varepsilon$ is the probability of the event and $M$ is the ensemble size. The BSSD provides a powerful and easily implemented tool for the evaluation of probabilistic forecasts with small ensembles and for the comparison of ensemble prediction systems with different ensemble sizes. Thus, the BSS used in this work is the BSSD.
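A minimal sketch of these scores, assuming binary observations and ensemble-derived event probabilities as inputs (function and variable names are ours, not from any S2S toolchain):

```python
import numpy as np

def brier_score(p, o):
    """Brier score: mean squared error of probability forecasts p in [0, 1]
    against binary observations o in {0, 1}."""
    p, o = np.asarray(p, float), np.asarray(o, float)
    return np.mean((p - o) ** 2)

def debiased_bss(p, o, ens_size, event_prob=0.1):
    """Debiased Brier skill score (BSS_D) against the climatological forecast.
    event_prob: climatological probability of the event (0.1 for a 10th-
    percentile ECE); ens_size: ensemble size M."""
    o = np.asarray(o, float)
    bs = brier_score(p, o)
    bs_clim = brier_score(np.full_like(o, event_prob), o)    # BS of climatology
    d = event_prob * (1.0 - event_prob) / ens_size           # correction term D
    return 1.0 - bs / (bs_clim + d)
```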
The BS can be decomposed into three non-negative terms:

$$\mathrm{BS} = \underbrace{\frac{1}{N}\sum_{k=1}^{K} n_k\,(p_k - \bar{o}_k)^2}_{\mathrm{REL}} \;-\; \underbrace{\frac{1}{N}\sum_{k=1}^{K} n_k\,(\bar{o}_k - \bar{o})^2}_{\mathrm{RES}} \;+\; \underbrace{\bar{o}\,(1-\bar{o})}_{\mathrm{UNC}}$$

where the $N$ forecast–observation pairs are grouped into $K$ bins of forecast probability $p_k$, $n_k$ is the number of forecasts in bin $k$, $\bar{o}_k$ is the observed event frequency in bin $k$, and $\bar{o}$ is the overall observed climatological frequency. The three terms are known, respectively, as reliability (REL), which measures the conditional bias of the forecasts and equals the weighted average of the squared differences between the forecast probabilities and the conditional observed frequencies; resolution (RES), which measures the ability of the forecast system to discriminate between event occurrences and non-occurrences; and uncertainty (UNC), which is the variance of the observations. The corresponding BSS can be written as follows:

$$\mathrm{BSS} = \mathrm{BSS}_{res} - \mathrm{BSS}_{rel}, \qquad \mathrm{BSS}_{rel} = \frac{\mathrm{REL}}{\mathrm{UNC}}, \qquad \mathrm{BSS}_{res} = \frac{\mathrm{RES}}{\mathrm{UNC}}$$

where BSSrel and BSSres are the skill scores of the reliability and resolution components, respectively. Their values lie within the range [0, 1]. The reliability is negatively oriented, like the probability score itself: the lower the reliability term, the better the score. The resolution is positively oriented: the higher the resolution, the better the score. The BSS represents the actual prediction skill of the probabilistic prediction of a certain event, since its calculation involves both discrimination and reliability.
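The decomposition can be computed by binning the forecast probabilities; a sketch, assuming the probabilities are binned into the discrete values an M-member ensemble can produce:

```python
import numpy as np

def brier_decomposition(p, o, n_bins=11):
    """Murphy decomposition of the Brier score into reliability, resolution
    and uncertainty; returns (BSS_rel, BSS_res). Assumes the event occurs at
    least once so that the uncertainty term is non-zero."""
    p, o = np.asarray(p, float), np.asarray(o, float)
    n = len(p)
    obar = o.mean()                           # climatological base rate
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(p, edges[1:-1])         # bin index 0 .. n_bins - 1
    rel = res = 0.0
    for k in range(n_bins):
        mask = idx == k
        nk = mask.sum()
        if nk == 0:
            continue
        pk = p[mask].mean()                   # mean forecast probability in bin
        ok = o[mask].mean()                   # observed frequency in bin
        rel += nk * (pk - ok) ** 2
        res += nk * (ok - obar) ** 2
    rel, res = rel / n, res / n
    unc = obar * (1.0 - obar)                 # uncertainty term
    return rel / unc, res / unc               # BSS_rel (lower is better), BSS_res (higher is better)
```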
The ROC [45,46,47,48,49] is considered by the World Meteorological Organization to be a recommended method of indicating the skill of probabilistic weather and climate forecasts. The ROC score is usually calculated by plotting the hit rate against the false-alarm rate for different warning criteria and then computing the area under the curve. The area under the ROC curve is a simple index that summarizes the skill of a forecast system. As there is skill only when the hit rate exceeds the false-alarm rate, a skillful ROC curve lies above the 45° line from the origin, and the total area under the curve is greater than 0.5. Where the curve lies close to the diagonal (the hit rate equals the false-alarm rate and the area under the curve equals 0.5), the forecast system does not provide any useful information. If the curve lies below the 45° line, negative skill is indicated. The ROC curve illustrates the varying quality of the forecast system at different levels of confidence in the warning and can be used to optimize forecast value based on the specifics of an individual user's cost–loss table. Since the ROC measures discrimination, it is often regarded as a measure of potential predictability.
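A sketch of the ROC-area computation as described above, sweeping the warning threshold and integrating the hit rate against the false-alarm rate (the names and the threshold grid are illustrative):

```python
import numpy as np

def roc_area(p, o, thresholds=None):
    """Area under the ROC curve: hit rate vs. false-alarm rate as the
    probability threshold for issuing a warning is varied."""
    p, o = np.asarray(p, float), np.asarray(o, int)
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 21)
    hits, fas = [1.0], [1.0]                  # threshold 0: always warn
    for t in thresholds:
        warn = p >= t
        hit = (warn & (o == 1)).sum() / max((o == 1).sum(), 1)  # hit rate
        fa = (warn & (o == 0)).sum() / max((o == 0).sum(), 1)   # false-alarm rate
        hits.append(hit)
        fas.append(fa)
    hits.append(0.0)
    fas.append(0.0)                           # threshold > 1: never warn
    return -np.trapz(hits, fas)               # negated: fas decreases along the curve
```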
A Z-test is a statistical test used to evaluate whether a finding or association is statistically significant; it applies when samples are large enough that the test statistic can be assumed to be normally distributed. Here, a Z-test was employed to evaluate whether differences in prediction skill are statistically significant when analysing the influence of the MJO on the ECE probabilistic forecast skill.
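A sketch of this procedure, assuming the standard error of the score difference is estimated by bootstrap resampling of the verification pairs (as done with 10,000 resamples for the MJO analysis later in the paper; the score function is passed in):

```python
import numpy as np
from math import erfc, sqrt

def bootstrap_z_test(score_fn, p1, p2, o, n_boot=10000, seed=0):
    """Z-test for the difference between two forecast scores, with the
    sampling spread of the difference estimated by bootstrap resampling
    of the forecast-observation pairs."""
    p1, p2, o = (np.asarray(a, float) for a in (p1, p2, o))
    rng = np.random.default_rng(seed)
    n = len(o)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)            # resample with replacement
        diffs[b] = score_fn(p1[idx], o[idx]) - score_fn(p2[idx], o[idx])
    z = diffs.mean() / diffs.std(ddof=1)       # Z statistic
    p_value = erfc(abs(z) / sqrt(2.0))         # two-sided normal p-value
    return z, p_value
```

For example, passing the ROC area or Brier score defined above as score_fn allows the skill of forecasts issued with and without an MJO in the initial conditions to be compared.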
4. Results
An extreme event is often defined as occurring when the verifying analysis falls in the tail(s) of the climatological distribution. In this study, the probabilistic forecast of an ECE is defined as the ratio of the number of ensemble members forecasting values below the 10th percentile of the reforecast climatology to the total number of members, and an ECE is required to last longer than 5 days. We evaluate the probabilistic prediction skill of six S2S models for ECE from November to April (Northern Hemisphere extended winter) over the real-time period 2015–2017, since probabilistic verification inherently requires large samples and the real-time ensembles are much larger than the reforecast ensembles.
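A sketch of this event-probability definition, assuming gridded arrays of ensemble forecasts and reforecast climatology (shapes and names are illustrative):

```python
import numpy as np

def ece_probability(ens_forecast, reforecast_clim, q=10.0):
    """Probabilistic ECE forecast: fraction of ensemble members falling below
    the q-th percentile of the reforecast climatology at each grid point.

    ens_forecast:    array (members, lat, lon) of forecast 2 m temperature
    reforecast_clim: array (samples, lat, lon) of reforecast temperatures
    """
    threshold = np.percentile(reforecast_clim, q, axis=0)  # 10th percentile
    return np.mean(ens_forecast < threshold, axis=0)       # event probability
```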
4.1. Potential Predictability
Firstly, we calculated the ROC score, i.e., the area under the ROC curve, of probabilistic forecasts for ECE. The ROC score measures the ability of the forecast to discriminate between two alternative outcomes, i.e., resolution. It is not sensitive to biases in the forecast, so it says nothing about reliability. A biased forecast may still have good resolution and produce a good ROC curve, which means that it may be possible to improve the forecast through calibration. The ROC can thus be considered a measure of potential usefulness [50].
ROC scores of the six models for ECE, computed over all land grid points in three regions, are shown in Figure 1. The three regions are the Northern Hemisphere extra-tropics (NH, north of 20° N), the Southern Hemisphere extra-tropics (SH, south of 20° S), and the tropics (TR, between 20° S and 20° N).
Figure 1 shows that the ROC scores of the six selected S2S models for ECE were all greater than 0.5 out to week 4 over both the extra-tropics and the tropics, and declined rapidly with lead time. In the first week, the ROC scores for ECE were around 0.8, except for the BoM model, which was below 0.7 over the SH, indicating that the potential prediction skill of the six models for ECE was much higher than that of climatological forecasts within this time range. In the second week, the ROC scores dropped to between 0.6 and 0.7. In the third week, the ROC scores were around 0.6 over the extra-tropics and above 0.6 over the tropics for most of the models, suggesting that the models had moderate skill and still performed better than climatology. In the fourth week, the ROC scores were only slightly above 0.5 over the extra-tropics and around 0.6 over the tropics, indicating almost no skill for ECE over the extra-tropics at this lead time.
As shown above, we found that three weeks may be the limit of predictability for ECE probabilistic forecasts over the extra-tropics. Therefore, the ROC score of ECE at each land grid point in the third week was computed.
Figure 2a shows the ROC scores of the ECMWF model for ECE in the third week. They were greater than 0.6 at most of the world's land points. This result suggests that the ECMWF model performs well and can provide a more useful prediction than climatology at this lead time. The ROC scores of the NCEP model were also greater than 0.6 over most tropical land points, East Asia, Northern Europe, North America and Australia, showing that the probabilistic prediction of the NCEP model for ECE was potentially skillful over these regions. The probabilistic prediction of the CNRM model for ECE was potentially skillful over some tropical regions, East Asia, Northeast Asia, Eastern Europe, western North America, south-central South America, and most Australian land areas. The skill of the CNR-ISAC model was greater than 0.6 over parts of the tropics, parts of East and North Asia, North America and Australia. The ROC scores of the JMA and BoM models were greater than 0.6 only over some tropical regions, East Asia, Europe, North America, South Africa and some Australian land areas, and were smaller than those of the other four models in most regions, indicating that these two models had no potential prediction skill in some areas during this week.
Figure 3 shows the ROC scores of the six models for ECE in the fourth week. Although the ROC scores of all six models were around 0.5, no more skillful than climatological forecasts at most land points and even below 0.5 in some regions, each model still showed moderate potential prediction skill in some areas, such as the tropics and some extra-tropical regions.
4.2. Actual Prediction Skill
As noted in Section 3, the ROC measures the discrimination of probabilistic forecasts and is insensitive to reliability, so it diagnoses the potential, rather than the actual, predictability of probabilistic forecasts. The actual probabilistic prediction skill of a model is determined by both discrimination and reliability. The BSS represents the actual prediction skill of the probabilistic prediction of a certain event, since its calculation involves both.
Figure 4 shows the region-averaged BSSs of the six S2S models for the weekly probabilistic forecasts of ECE over the three regions. Positive values of the BSS indicate a forecasting benefit with respect to the climatological forecast (the larger the value, the higher the skill); negative values mean that the model's probabilistic forecast is poorer than the climatological forecast. For the regional average, the probabilistic forecasts of the ECMWF model for ECE were actually more skillful than the climatological forecast for up to 3 weeks over the NH and SH, and up to 4 weeks over the TR. The probabilistic forecasts of the NCEP model were actually skillful for up to 2 weeks over the NH and SH, though, surprisingly, they showed no skill over the TR even in the first week. The forecast of the BoM model was skillful over the NH for only 1 week. The performance of the CNR-ISAC model was better than that of climatology over the NH and SH for only 1 week, but over the TR for up to 4 weeks. The first week's forecasts of the JMA model were useful over the NH, but not over the SH and TR (where the BSS is negative). The CNRM model provided a skillful probabilistic forecast for ECE for only 1 week over the NH and SH.
Figure 5 maps the BSS at each land grid point in the first week. At this lead time, the BSS of the ECMWF model is positive and greater than 0.2 in most areas, indicating that its probabilistic forecasts are more skillful than the climatological probabilistic prediction. The BSSs of the NCEP and BoM models are positive over the extra-tropics and negative over the tropics. The probabilistic predictions of the JMA, CNR-ISAC and CNRM models are skillful only over some areas. In the second week (Figure 6), the BSSs decreased for all models, but the ECMWF model still performed better than climatological forecasts; the other five models had only moderate skill over sporadic areas and were worse than climatological forecasts in most areas.
Overall, the scores show that the actual probabilistic prediction skill of the models for ECE is about 1–3 weeks, shorter than the potential prediction skill. To identify the cause, we decomposed the BSS.
Figure 7 shows the BSS decomposition terms BSSrel and BSSres averaged over the three defined regions. For some models, such as the ECMWF, NCEP, CNR-ISAC and CNRM models, the discrimination for ECE is slightly higher over the NH and SH than over the TR in the first week. The discrimination for ECE depends strongly on forecast time: as the forecast time lengthens, the discrimination gradually decreases. BSSrel represents the reliability of the prediction; the smaller the value, the more reliable the probabilistic prediction. As can be seen from Figure 7, the BSSrel of the ECMWF model remains relatively stable and small out to 4 weeks, indicating that the reliability of the ECMWF model is good, and its BSSres shows a good ability to discriminate whether an ECE occurs or not. Therefore, the BSS of the ECMWF model is positive in weeks 1–3, reflecting good reliability and discrimination. In the first week, because of poor reliability over the TR, the BSS of the NCEP model is worse there than the climatological prediction; over the extra-tropics, its reliability is better than over the tropics, and its forecasts are skillful in the first and second weeks. The reliability of the BoM model in the first week over the NH is good, and its discrimination is also reasonable, so the BSS is positive at this lead time. Both the discrimination and the reliability of the BoM model over the SH and the TR are low, so its forecasts perform worse than climatology; its lower resolution is possibly one of the reasons for this outcome. The discrimination and reliability of the JMA model are not as good and need further improvement, especially over the SH. The reliability of the CNR-ISAC model is good, but its discrimination decreases rapidly with time, causing poor performance. The BSS of the CNRM model is negative mainly because of insufficient discrimination and lower reliability.
4.3. Impact of MJO
The MJO is the dominant mode of variability in the tropical atmosphere on weather and sub-seasonal time scales (<90 days). The MJO can widely influence global weather and climate systems; it acts as a bridge between weather (3–8 days) and climate (>90 days) and provides a source of predictability for subseasonal forecasts (2–6 weeks) [51,52,53,54,55]. Moreover, some studies have suggested that higher forecast skill exists during MJO events [56,57,58]. Therefore, we further analysed the impact of the MJO on the probabilistic forecast skill of ECE. The ECMWF model was used because of its good performance for this event.
Both the ROC score and the BSS show that the MJO has an impact on the probabilistic forecast skill of ECE. When there is an MJO in the initial conditions, the probabilistic forecast skill of ECE over Europe from the first to the fourth week is higher than under non-MJO conditions. For example, in the third week the ROC score of probabilistic forecasts for ECE over most grid points covering European land was around 0.6 when there was no MJO in the initial conditions, meaning that the model's probabilistic forecasts for ECE had moderate potential skill (Figure 8a). The ROC score was around 0.7 at some grid points over European land, especially over Western Europe, when there was an MJO in the initial conditions (Figure 8b); thus, the presence of an MJO in the initial conditions improves the forecast skill of the model (Figure 8c).
The impact of an MJO on ECE probabilistic forecasts is even more important in the fourth week, a time range often considered to have very low predictability and reliability over the extra-tropics [15,53,59]. Figure 8d shows that, in the fourth week, the ROC score of ECE is around 0.5 over most of Europe when there is no MJO in the initial conditions, meaning that the forecasts have no more potential skill than climatological forecasts. Interestingly, the ROC scores exceed 0.6 over parts of Europe when there is an MJO event in the initial conditions, indicating that forecasts are likely to be useful in this time range (Figure 8e). Over Europe as a whole, the ROC scores of ECE forecasts in the fourth week are 0.56 without an MJO and 0.64 with an MJO (Figure 9). To estimate the statistical significance of the impact of the MJO on European extreme weather regimes, a Z-test was performed using 10,000 bootstrap resamples; the difference is highly significant during the fourth week. These results suggest that the MJO represents a major source of predictability over the NH in this time range. They also demonstrate that the probabilistic forecast skill in this time range is not always as low as previous studies have suggested, and that forecasts of ECE in the fourth week can be potentially useful over the NH. From a practical point of view, this result also suggests that users of the ECMWF monthly forecasting system could use the presence of an MJO in the initial conditions to decide whether the fourth week of a monthly forecast should be trusted. The improvement in probabilistic forecast skill for ECE over Europe is mainly related to MJOs in phase 6/7 or 2/3 in the initial conditions (Figure 10). The probabilistic forecast of ECE is more skillful over Europe about 1–2 weeks after an MJO occurs over the tropical Indian Ocean or the western–central Pacific Ocean. This lag is close to those reported in previous studies of the connection between the Indian Ocean and Europe [60,61,62]. An MJO over these two ocean regions forces Rossby waves whose teleconnections increase the probability of cold temperature anomalies over Europe.
The impact of an MJO on ECE probabilistic forecasts is also important over North America in the third week. When there is no MJO in the initial conditions, the ROC scores over most of the North American continent are around 0.6, whereas they are around 0.7 with an MJO in the initial conditions (Figure 11a,b). The BSSs over the North American continent are almost zero without an MJO in the initial conditions, indicating no more skill than the climatological forecast, but become positive when there is an MJO in the initial conditions (Figure 11d,e), indicating that the forecasts are then more skillful and useful than the climatological forecast.
Figure 12 shows that the clear influence of the MJO on the probabilistic forecast of ECE over North America is mainly due to the contribution of MJOs in phase 6/7. When there is an MJO over the western Pacific Ocean in the initial conditions, the probabilistic forecast of ECE over North America three weeks later is more skillful than when there is no MJO in the initial conditions. The influence of the tropical MJO on the probabilistic prediction skill of ECE over the middle and high latitudes is mainly exerted through wave trains, and further analysis of this mechanism is required in future studies.
5. Summary and Discussion
The S2S project provides a powerful community resource for assessing the skill of models from different operational centres and the usefulness of state-of-the-art sub-seasonal forecasts for applications. The main goal of this study is to evaluate the probabilistic forecast skill of the S2S models for extreme cold events on monthly time ranges based on the BSS and ROC scores.
For regionally averaged probabilistic forecasts of ECE, the potential prediction skill of the six S2S models over the NH, SH and TR lasts for up to 4 weeks, except for the BoM model, whose skill lasts only up to 3 weeks over the SH. As the forecast time increases, the potential prediction skill decays quickly. In the third and fourth weeks, forecasts over the TR are more skillful than those over the extra-tropics.
Although the ROC scores show that the six S2S models have good potential prediction skill for ECE probabilistic forecasts, with forecasts made 1–4 weeks in advance more useful than climatology, the BSS results show that the actual prediction skill of the six models differed. The probabilistic forecast of ECE by the ECMWF model was actually skillful for up to 3 weeks over the extra-tropics and 4 weeks over the tropics. The forecast of the NCEP model was actually skillful for only up to 2 weeks over the extra-tropics. The BoM, JMA and CNRM models were actually more skillful than climatological forecasts for only up to 1 week over the NH. The actual prediction skill for ECE of the CNR-ISAC model was apparent only for up to 1 week over the extra-tropics and 4 weeks over the tropics.
The actual prediction skill of the six S2S models for the ECE probabilistic forecast is thus only 1–3 weeks, shorter than the 4-week length of potential predictability. Given good resolution and good reliability, the actual prediction skill of the ECMWF model extends up to 3 weeks. The actual prediction skill of the NCEP model is only up to 2 weeks because of its poor reliability, especially in the tropics; calibration and bias correction of the prediction results may improve its actual forecasting skill. For the BoM, JMA and CNRM models, the short range of actual forecast skill arises because they all have poor discrimination and poor reliability. The CNR-ISAC model has only one week of actual prediction skill because of insufficient discrimination, which is related to the ensemble prediction spread. There is still much room for improvement in the ability of models to forecast ECE.
Using the ECMWF model's forecasts, the influence of the MJO on the probabilistic prediction skill of ECE was analysed. When there is an MJO in phase 2/3 or 6/7 in the initial conditions, the potential and actual prediction skill of ECE over Europe in the third and fourth weeks is higher than without an MJO. When an MJO in phase 6/7 exists in the initial conditions, the potential and actual prediction skill of ECE over central-eastern North America in the third week is higher than in the absence of an MJO. These results further indicate that the tropical MJO has an important influence on the prediction of extreme temperatures in the middle and high latitudes. How an MJO in the tropics affects extreme low temperatures in these two regions needs further study.
Although ECE are rare, the results presented in this paper show that they can be skillfully predicted on monthly time scales. Many studies have shown that the use of a multi-model ensemble can significantly increase the skill of probabilistic seasonal forecasts. For subseasonal probabilistic forecasts of extreme events, model output statistics such as logistic regression and non-homogeneous Gaussian regression could be used to assign simple weights to each model when computing multi-model ensemble combinations. This prospect is worth further exploration.
Since the MJO has an important effect on probabilistic forecasts of extreme surface temperature events over the northern extra-tropics, especially Europe, it is necessary to study how the MJO affects them and the mechanism linking the MJO and ECE. Generally, we believe that a major improvement in monthly extreme surface temperature prediction might be feasible once the MJO, the most important source of intra-seasonal variability, is better represented by the models.
Author Contributions
Conceptualisation, F.V.; formal analysis, X.L.; methodology, F.V. and X.L.; resources, F.V. and X.L.; investigation, X.L.; supervision, T.W.; writing—original draft, X.L.; writing—review and editing, X.L., F.V. and T.W. All authors have read and agreed to the published version of the manuscript.
Funding
This study was funded by the National Natural Science Foundation of China Carbon Neutrality Project (42341202) and the National Natural Science Foundation of China (Grant No. 42230608).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
All data generated or analysed during this study are included in this published article.
Acknowledgments
The authors would like to thank the ECMWF for providing the database and working facilities and the help provided by the ECMWF Earth System Predictability Section.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Intergovernmental Panel on Climate Change (IPCC). Climate Change 2014: Mitigation of Climate Change: Working Group III Contribution to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, 2014.
2. Scaife, A.A.; Knight, J.R. Ensemble simulations of the cold European winter of 2005–2006. Quart. J. R. Meteor. Soc. 2008, 134, 1647–1659.
3. Wang, D.H.; Liu, C.J.; Liu, Y.; Wei, F.; Xu, X. A preliminary analysis of features and causes of the snow storm event over the Southern China in January 2008. Acta Meteorol. Sin. 2008, 66, 405–422.
4. Cattiaux, J.; Vautard, R.; Cassou, C.; Yiou, P.; Masson-Delmotte, V.; Codron, F. Winter 2010 in Europe: A cold extreme in a warming climate. Geophys. Res. Lett. 2010, 37, L20704.
5. Palmer, T. Record-breaking winters and global climate change. Science 2014, 344, 803–804.
6. Wallace, J.M.; Held, I.M.; Thompson, D.W.J.; Trenberth, K.E.; Walsh, J.E. Global warming and winter weather. Science 2014, 343, 729–730.
7. Bellprat, O.; Massonnet, F.; García-Serrano, J.; Fučkar, N.S.; Guemas, V.; Doblas-Reyes, F.J. The role of Arctic sea ice and sea surface temperatures on the cold 2015 February over North America. Bull. Amer. Meteor. Soc. 2016, 97, S36–S41.
8. Iida, M.; Sugimoto, S.; Suga, T. Severe cold winter in North America linked to Bering Sea ice loss. J. Clim. 2020, 33, 8069–8085.
9. Xu, F.; Liang, X.S. The synchronization between the zonal jet stream and temperature anomalies leads to an extremely freezing North America in January 2019. Geophys. Res. Lett. 2020, 47, e2020GL089689.
10. Zhou, C.; Dai, A.; Wang, J.; Chen, D. Quantifying human-induced dynamic and thermodynamic contributions to severe cold outbreaks like November 2019 in the eastern United States. Bull. Amer. Meteor. Soc. 2021, 102, S17–S23.
11. Zhang, Y.J.; Yin, Z.C.; Wang, H.; He, S.P. 2020/21 record-breaking cold waves in east of China enhanced by the 'Warm Arctic-Cold Siberia' pattern. Environ. Res. Lett. 2021, 16, 094040.
12. Cohen, J.; Screen, J.A.; Furtado, J.C.; Barlow, M.; Whittleston, D.; Coumou, D.; Francis, J.; Dethloff, K.; Entekhabi, D.; Overland, J.; et al. Recent Arctic amplification and extreme mid-latitude weather. Nat. Geosci. 2014, 7, 627–637.
13. McCusker, K.E.; Fyfe, J.C.; Sigmond, M. Twenty-five winters of unexpected Eurasian cooling unlikely due to Arctic sea-ice loss. Nat. Geosci. 2016, 9, 838–842.
14. Lorenz, E.N. The predictability of a flow which possesses many scales of motion. Tellus 1969, 21, 289–307.
15. Vitart, F. Monthly forecasting at ECMWF. Mon. Weather Rev. 2004, 132, 2761–2779.
16. Waliser, D.E. Predictability of the tropical intraseasonal variability. In Predictability of Weather and Climate; Palmer, T.N., Hagedorn, R., Eds.; Cambridge University Press: Cambridge, UK, 2006; pp. 275–305.
17. Waliser, D.E. Predictability and forecasting. In Intraseasonal Variability of the Atmosphere–Ocean Climate System, 2nd ed.; Lau, W.K.M., Waliser, D.E., Eds.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 433–468.
18. Hoskins, B. Predictability beyond the deterministic limit. WMO Bull. 2012, 61, 33–36.
19. Hoskins, B. The potential for skill across the range of the seamless weather-climate prediction problem: A stimulus for our science. Quart. J. R. Meteor. Soc. 2013, 139, 573–584.
20. Weaver, S.J.; Wang, W.; Chen, M.; Kumar, A. Representation of MJO variability in the NCEP Climate Forecast System. J. Clim. 2011, 24, 4676–4694.
21. Vitart, F.; Woolnough, S.; Balmaseda, M.A.; Tompkins, A. Monthly forecast of the Madden–Julian Oscillation using a coupled GCM. Mon. Weather Rev. 2007, 135, 2700–2715.
22. Vitart, F.; Buizza, R.; Alonso Balmaseda, M.; Balsamo, G.; Bidlot, J.R.; Bonet, A.; Fuentes, M.; Hofstadler, A.; Molteni, F.; Palmer, T.N. The new VAREPS-monthly forecasting system: A first step towards seamless prediction. Quart. J. R. Meteor. Soc. 2008, 134, 1789–1799.
23. Vitart, F. Evolution of ECMWF sub-seasonal forecast skill. Quart. J. R. Meteor. Soc. 2014, 140, 1889–1899.
24. Robertson, A.W.; Vitart, F. The Sub-Seasonal to Seasonal (S2S) Prediction Project. In ECMWF Sub-Seasonal Workshop, Reading, 2–5 November 2015; ECMWF: Reading, UK, 2015.
25. Dee, D.P.; Uppala, S.M.; Simmons, A.J.; Berrisford, P.; Poli, P.; Kobayashi, S.; Andrae, U.; Balmaseda, M.A.; Balsamo, G.; Bauer, D.P.; et al. The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. R. Meteor. Soc. 2011, 137, 553–597.
26. Gottschalck, J.; Wheeler, M.; Weickmann, K.; Vitart, F.; Savage, N.; Lin, H.; Hendon, H.; Waliser, D.; Sperber, K.; Nakagawa, M.; et al. A framework for assessing operational Madden–Julian Oscillation forecasts: A CLIVAR MJO Working Group project. Bull. Amer. Meteor. Soc. 2010, 91, 1247–1258.
27. Atger, F. The skill of ensemble prediction systems. Mon. Weather Rev. 1999, 127, 1941–1953.
28. Atger, F. Spatial and interannual variability of the reliability of ensemble-based probabilistic forecasts: Consequences for calibration. Mon. Weather Rev. 2003, 131, 1509–1523.
29. Kumar, A.; Barnston, A.G.; Hoerling, M.P. Seasonal predictions, probabilistic verifications, and ensemble size. J. Clim. 2001, 14, 1671–1676.
30. Hou, D.C.; Kalnay, E.; Droegemeier, K.K. Objective verification of the SAMEX '98 ensemble forecasts. Mon. Weather Rev. 2001, 129, 73–91.
31. Candille, G.; Talagrand, O. Evaluation of probabilistic prediction systems for a scalar variable. Quart. J. R. Meteor. Soc. 2005, 131, 2131–2150.
32. Candille, G.; Cote, C.; Houtekamer, P.L.; Pellerin, G. Verification of an ensemble prediction system against observations. Mon. Weather Rev. 2007, 135, 2688–2699.
33. Rodwell, M.J.; Doblas-Reyes, F.J. Medium-range, monthly, and seasonal prediction for Europe and the use of forecast information. J. Clim. 2006, 19, 6025–6046.
34. Casati, B.; Wilson, L.J. A new spatial-scale decomposition of the Brier score: Application to the verification of lightning probability. Mon. Weather Rev. 2007, 133, 81–101.
35. Weigel, A.P.; Liniger, M.A.; Appenzeller, C. The discrete Brier and ranked probability skill scores. Mon. Weather Rev. 2007, 135, 118–124.
36. McCollor, D.; Stull, R. Evaluation of probabilistic medium-range temperature forecasts from the North American Ensemble Forecast System. Weather Forecast. 2009, 24, 3–17.
37. Doblas-Reyes, F.J.; García-Serrano, J.; Lienert, F.; Biescas, A.P.; Rodrigues, L.R. Seasonal climate predictability and forecasting: Status and prospects. Wiley Interdiscip. Rev. Clim. Change 2013, 4, 245–268.
38. Johnson, S.J.; Stockdale, T.N.; Ferranti, L.; Balmaseda, M.A.; Molteni, F.; Magnusson, L.; Tietsche, S.; Decremer, D.; Weisheimer, A.; Balsamo, G.; et al. SEAS5: The new ECMWF seasonal forecast system. Geosci. Model Dev. 2019, 12, 1087–1117.
39. Brier, G.W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 1950, 78, 1–3.
40. Wilks, D.S. Statistical Methods in the Atmospheric Sciences; International Geophysics Series; Academic Press: Cambridge, MA, USA, 1995; Volume 59, 467p.
41. Jolliffe, I.T.; Stephenson, D.B. Forecast Verification: A Practitioner's Guide in Atmospheric Science; Wiley: Hoboken, NJ, USA, 2003; 240p.
42. Epstein, E.S. A scoring system for probability forecasts of ranked categories. J. Appl. Meteor. 1969, 8, 985–987.
43. Mason, S.J. On using "climatology" as a reference strategy in the Brier and ranked probability skill scores. Mon. Weather Rev. 2004, 132, 1891–1895.
44. Tippett, M.K. Comments on "The discrete Brier and ranked probability skill scores". Mon. Weather Rev. 2008, 136, 3629–3633.
45. Mason, I.B. A model for assessment of weather forecasts. Austral. Met. Mag. 1982, 30, 291–303.
46. Mason, S.J.; Graham, N.E. Conditional probabilities, relative operating characteristics, and relative operating levels. Weather Forecast. 1999, 14, 713–725.
47. Mason, S.J.; Graham, N.E. Areas beneath relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation. Quart. J. R. Meteor. Soc. 2002, 128, 2145–2166.
48. Kharin, V.V.; Zwiers, F.W. On the ROC score of probability forecasts. J. Clim. 2003, 16, 4145–4150.
49. Harvey, L.O.; Hammond, K.R.; Lusk, C.M.; Mross, E.F. The application of signal detection theory to weather forecasting behavior. Mon. Weather Rev. 1992, 120, 863–883.
50. Doblas-Reyes, F.J.; Hagedorn, R.; Palmer, T.N. The rationale behind the success of multi-model ensembles in seasonal forecasting. Part II: Calibration and combination. Tellus 2005, 57A, 234–252.
51. Madden, R.A.; Julian, P.R. Detection of a 40–50 day oscillation in the zonal wind in the tropical Pacific. J. Atmos. Sci. 1971, 28, 702–708.
52. Lin, H.; Brunet, G. The influence of the Madden–Julian oscillation on Canadian wintertime surface air temperature. Mon. Weather Rev. 2009, 137, 2250–2262.
53. Vitart, F.; Molteni, F. Simulation of the Madden–Julian oscillation and its teleconnections in the ECMWF forecast system. Quart. J. R. Meteor. Soc. 2010, 136, 842–855.
54. Wang, B.; Chen, G.; Liu, F. Diversity of the Madden–Julian Oscillation. Sci. Adv. 2019, 5, eaax0220.
55. Chen, G. Diversity of the global teleconnections associated with the Madden–Julian Oscillation. J. Clim. 2021, 34, 397–414.
56. Ferranti, L.; Palmer, T.N.; Molteni, F.; Klinker, E. Tropical-extratropical interaction associated with the 30–60 day oscillation and its impact on medium and extended range prediction. J. Atmos. Sci. 1990, 47, 2177–2199.
57. Jones, C.; Waliser, D.E.; Lau, K.M.; Stern, W. The Madden–Julian Oscillation and its impact on Northern Hemisphere weather predictability. Mon. Weather Rev. 2004, 132, 1462–1471.
58. Jung, T.; Miller, M.J.; Palmer, T.N. Diagnosing the origin of extended-range forecast errors. Mon. Weather Rev. 2010, 138, 2434–2446.
59. Weigel, A.P.; Baggenstos, D.; Liniger, M.A.; Vitart, F.; Appenzeller, C. Probabilistic verification of monthly temperature forecasts. Mon. Weather Rev. 2008, 136, 5162–5182.
60. Cassou, C. Intraseasonal interaction between the Madden–Julian Oscillation and the North Atlantic Oscillation. Nature 2008, 455, 523–527.
61. Lee, R.W.; Woolnough, S.J.; Charlton-Perez, A.J.; Vitart, F. ENSO modulation of MJO teleconnections to the North Atlantic and Europe. Geophys. Res. Lett. 2019, 46, 13535–13545.
62. Abid, M.A.; Kucharski, F.; Molteni, F.; Almazroui, M. Predictability of Indian Ocean precipitation and its North Atlantic teleconnections during early winter. npj Clim. Atmos. Sci. 2023, 6, 17.
Figure 1. ROC areas of probabilistic forecasts of ECE over (a) the extra-tropical Northern Hemisphere, (b) the extra-tropical Southern Hemisphere, and (c) the tropics. The vertical grey bars represent the 95% level of confidence computed via a 10,000 bootstrap re-sampling procedure.
Figure 2. ROC areas of probabilistic forecasts of ECE in the third week during the NDJFMA of the period 2015–2017.
Figure 3. Same as Figure 2, but in the fourth week.
Figure 4. BSS of probabilistic forecasts of ECE over (a) the extra-tropical Northern Hemisphere, (b) the extra-tropical Southern Hemisphere, and (c) the tropics. The vertical grey bars represent the 95% level of confidence computed via a 10,000 bootstrap re-sampling procedure.
Figure 5. BSSs of probabilistic forecasts of ECE in the first week during the NDJFMA of the period 2015–2017.
Figure 6. Same as Figure 5, but in the second week.
Figure 7. Resolution (BSSres) and reliability (BSSrel) terms of the BSSs over (a,d) the extra-tropical Northern Hemisphere, (b,e) the extra-tropical Southern Hemisphere, and (c,f) the tropics.
Figure 8. ROC area and BSS of probabilistic forecasts of ECE in the third and fourth weeks over the European land area during the NDJFMA of the period 2008–2017 with and without an MJO in the initial conditions and their differences. (a,d,g) without an MJO, (b,e,h) with an MJO, and (c,f,i) the differences. Only scores over land points are shown.
Figure 9. ROC diagrams of ECE probabilistic forecasts over Europe in (a) the third and (b) the fourth weeks during the NDJFMA of the period 2008–2017. The red curve represents the diagram obtained via forecasts with an MJO in the initial conditions (amplitude of the MJO index larger than 1, independently of the phase), and the blue curve represents the diagram obtained via forecasts without an MJO in the initial conditions (amplitude of the MJO index less than 1).
Figure 10. (a,b) ROC area and (c,d) BSS of ECE probabilistic forecasts over Europe with and without an MJO in each phase in the initial conditions during the NDJFMA of the period 2008–2017. In order to reduce the number of lines in the figures, phases 2 and 3, 4 and 5, 6 and 7, and 8 and 1 have been merged. The vertical grey bars represent the 95% level of confidence computed via a 10,000 bootstrap re-sampling procedure.
Figure 11. Same as Figure 8, but over the North American land area. (a,d) without an MJO, (b,e) with an MJO, and (c,f) the differences. Only scores over land points are shown.
Figure 12. Same as Figure 10, but over the North American land area. (a,b) ROC area and (c,d) BSS.
Table 1. List of models participating in the S2S project.
Models | Time Range | Resolution | Ens. Size | Frequency | Re-Forecasts | Rfc Length | Rfc Frequency | Rfc Size |
---|---|---|---|---|---|---|---|---
BoM (ammc) | d 0–62 | T47L17 | 3 × 11 | 2/week | fix | 1981–2013 | 6/month | 3 × 11 |
CMA (babj) | d 0–60 | T106L40 | 4 | daily | fix | 1994–2014 | daily | 4 |
CNR-ISAC (isac) | d 0–32 | 0.75 × 0.56 L54 | 41 | weekly | fix | 1981–2010 | every 5 days | 5 |
CNRM (lfpw) | d 0–32 | T255L91 | 51 | weekly | fix | 1993–2014 | 4/month | 15 |
ECCC (cwao) | d 0–32 | 0.45 × 0.45 L40 | 21 | weekly | on the fly | 1998–2017 | weekly | 4 |
ECMWF (ecmf) | d 0–46 | Tco639/319 L91 | 51 | 2/week | on the fly | past 20 years | 2/week | 11 |
HMCR (rums) | d 0–61 | 1.1 × 1.4 L28 | 20 | weekly | on the fly | 1985–2010 | weekly | 10 |
JMA (rjtd) | d 0–33 | Tl479/Tl319L100 | 50 | weekly | fix | 1981–2010 | 3/month | 5 |
KMA (rksl) | d 0–60 | N216L85 | 4 | daily | on the fly | 1991–2010 | 4/month | 3 |
NCEP (kwbc) | d 0–44 | T126L64 | 16 | daily | fix | 1999–2010 | daily | 4 |
UKMO (egrr) | d 0–60 | N216L85 | 4 | daily | on the fly | 1993–2016 | 4/month | 7 |
Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).