Predicting Atlantic Hurricanes Using Machine Learning
Round 1
Reviewer 1 Report
The paper is quite clear - there is some repetition that is a bit confusing but not severe. Basically, they took a binary series of annual hurricane occurrence in the ATL basin and looked at its wavelet spectra and then fit a nonlinear autoregressive model to it -- what the exogenous variable is was not clearly described and this needs to be clarified. The lags selected by the model should also be clarified for each case. To evaluate the performance of the model on a binary series it is necessary to show more than just the graphic of the wave that they present - perhaps it would be useful to have a logistic function as the final processor of their model so that the probability of a hurricane is output each year and this then allows one to use some classical performance measures for a binary outcome -- what they refer to as the cost function would then very naturally be a maximum likelihood problem.
It is also not clear whether any sort of cross validation or out of sample testing was employed - this is almost necessary for all papers nowadays. The fact that the periodic structures of hurricanes of different categories are not the same is interesting and perhaps worthy of further exploration and attribution. It is possible that the spatial expression is also very different and it seemed that they were going to explore that but how/what was done in that regard is not so clear.
Author Response
Review 1
The paper is quite clear - there is some repetition that is a bit confusing but not severe. Basically, they took a binary series of annual hurricane occurrence in the ATL basin and looked at its wavelet spectra and then fit a nonlinear autoregressive model to it -- what the exogenous variable is was not clearly described and this needs to be clarified
Response: We appreciate the suggestion. Exogenous variables are variables determined outside the model; in the NARX model, the exogenous variables are the input data.
The lags selected by the model should also be clarified for each case.
Response: We appreciate the suggestion and reminder. We have discussed the lag variables in Equation 4 and specified the selection in the algorithm of section 2.3.2 through iterative steps.
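To make the roles of the lags and the exogenous input concrete, the following is a minimal sketch of how lagged regressors for a generic NARX model y(t) = f(y(t-1), ..., y(t-n_y), u(t-1), ..., u(t-n_u)) could be assembled. It is not the authors' Equation 4 or their algorithm; the function name `narx_design` and the lag orders `n_y` and `n_u` are hypothetical and chosen only for illustration.

```python
import numpy as np

def narx_design(y, u, n_y=2, n_u=2):
    """Build lagged regressors for a generic NARX model (illustrative only).

    Each row holds [y(t-1), ..., y(t-n_y), u(t-1), ..., u(t-n_u)];
    the matching target is y(t). n_y and n_u are assumed lag orders.
    """
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(n_y, n_u)  # first t for which every lag is available
    rows = []
    for t in range(start, len(y)):
        past_y = y[t - n_y:t][::-1]  # most recent lag first
        past_u = u[t - n_u:t][::-1]
        rows.append(np.concatenate([past_y, past_u]))
    return np.array(rows), y[start:]
```

Any nonlinear regressor could then be fitted on the returned matrix and targets; reporting `n_y` and `n_u` for each hurricane category is one way to state the selected lags explicitly.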
To evaluate the performance of the model on a binary series it is necessary to show more than just the graphic of the wave that they present - perhaps it would be useful to have a logistic function as the final processor of their model so that the probability of a hurricane is output each year and this then allows one to use some classical performance measures for a binary outcome.
Response: We appreciate the suggestion. We have added to the algorithm that the radial basis function can be a logistic function or a Gaussian function. These radial basis functions can be used regardless of whether binary data or other types of variables are being analyzed. Mathematically, there is no criterion for preferring one radial basis function over the other, and the two functions are intrinsically related. The choice is left to the user, and both radial functions give similar results. We selected the Gaussian function. In the algorithm of section 2.3.2 we have added a detailed discussion of our selection.
-- what they refer to as the cost function would then very naturally be a maximum likelihood problem.
Response: We appreciate the suggestion. The user can select any cost function. We used the mean squared error (MSE). We have added a discussion of our selection in the iterative steps of the algorithm under section 2.3.2.
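For readers comparing the two positions, the sketch below illustrates the reviewer's suggestion rather than the authors' implementation: a logistic function maps a raw model output to a probability, the Bernoulli negative log-likelihood serves as the maximum likelihood cost, and the Brier score is the MSE computed on probabilities. The raw scores and observations are invented purely for illustration.

```python
import math

def logistic(z):
    """Map a raw model output to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def bernoulli_nll(y, p, eps=1e-12):
    """Mean negative log-likelihood of binary outcomes y given probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # guard against log(0)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total / len(y)

def brier_score(y, p):
    """Mean squared error between probabilities and binary outcomes."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

# Hypothetical raw outputs and observed yearly occurrences (not real data).
raw_scores = [2.0, -1.0, 0.5, -2.5, 1.5]
observed = [1, 0, 1, 0, 0]
probs = [logistic(z) for z in raw_scores]
nll = bernoulli_nll(observed, probs)
brier = brier_score(observed, probs)
```

Minimizing `bernoulli_nll` is the maximum likelihood fit for a binary outcome, while `brier_score` is the closest analogue of the MSE cost the authors report using.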
It is also not clear whether any sort of cross-validation or out of sample testing was employed - this is almost necessary for all papers nowadays. The fact that the periodic structures of hurricanes of different categories are not the same is interesting and perhaps worthy of further exploration and attribution. It is possible that the spatial expression is also very different and it seemed that they were going to explore that, but how/what was done in that regard is not so clear.
Response: We used K-fold cross-validation for training, validation, testing, and determination of the model hyper-parameters. In this work we used K = 10, although K can be varied between 5 and 10. In the algorithm of section 2.3.2 we have now provided an extended explanation and discussion. In the case of the spatial distribution (Figure 9), no cross-validation is performed. Figure 9 shows the speed distribution and classification according to the methodology described in section 2.4.
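As an illustration of the K-fold procedure mentioned in the response, the sketch below generates K = 10 train/test index splits with NumPy. It is a generic shuffled K-fold, not the authors' code; the sample count of 72 assumes one value per year from 1950 to 2021, and the function name and seed are hypothetical.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled K-fold split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, k)  # fold sizes differ by at most one
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Assumed: 72 annual samples (1950 to 2021), K = 10 as stated in the response.
splits = list(k_fold_indices(72, k=10))
```

Each sample is held out exactly once. For autocorrelated series, a blocked or forward-chaining split is a common alternative to shuffling, since shuffled folds mix past and future years.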
Author Response File: Author Response.pdf
Reviewer 2 Report
In this manuscript, a time-frequency wavelet analysis was first performed to investigate the variability of hurricanes and identify intrinsic patterns related to their activities. Then, a machine learning model was applied to forecast the hurricane occurrences and activities. The manuscript is well thought out, clear, and well organized. The authors have done a very good job in this regard. The following comments/questions are given with eagerness to bring out the full potential of the content and fulfill its intentions.
- The authors need to slightly rewrite their abstract since the current version reads more like a conclusion than an abstract.
- The authors need to be consistent in the use of terms hurricanes and tropical cyclones (although they refer to the same phenomenon)
- The authors claim that they have proposed a ‘newly invented Bayesian Machine Learning algorithm’. However, several studies have adopted almost the same model, so what is the key difference of the proposed machine learning model in comparison with existing algorithms?
- A unitary annual digital signal of the hurricanes was selected. Therefore, the number of hurricanes in a specific year for a given category is not accounted for. This might affect the results and the corresponding interpretations. Why not use a digital model whose amplitude is equal to the number of hurricanes that occurred in a specific year?
- The authors need to have an additional paragraph in the introduction to cover current state of the art research in hurricane/tropical cyclones using machine learning techniques.
- The authors need to carefully read the manuscript and correct several typos (e.g., Line 164 – There are different models can approximate …)
Author Response
Review 2
In this manuscript, a time-frequency wavelet analysis was first performed to investigate the variability of hurricanes and identify intrinsic patterns related to their activities. Then, a machine learning model was applied to forecast the hurricane occurrences and activities. The manuscript is well thought out, clear, and well organized. The authors have done a very good job in this regard. The following comments/questions are given with eagerness to bring out the full potential of the content and fulfill its intentions.
The authors need to slightly rewrite their abstract since the current version reads more like a conclusion than an abstract.
Response: The referee’s constructive comment and feedback are appreciated. We have made some changes to the abstract.
The authors need to be consistent in the use of terms hurricanes and tropical cyclones (although they refer to the same phenomenon)
Response: We appreciate the comment, and in the revised manuscript we now use only the term hurricane.
The authors claim that they have proposed a 'newly invented Bayesian Machine Learning algorithm'. However, several studies have adopted almost the same model, so what is the key difference of the proposed machine learning model in comparison with existing algorithms?
Response: We appreciate the question and have now better clarified and described our work. We propose a new methodology that combines wavelet spectral analysis and the Bayesian Machine Learning algorithm.
A unitary annual digital signal of the hurricanes was selected. Therefore, the number of hurricanes in a specific year for a given category is not accounted for. This might affect the results and the corresponding interpretations. Why not use a digital model whose amplitude is equal to the number of hurricanes that occurred in a specific year?
Response: We appreciate the suggestion. Using the total annual number of hurricanes in each category does not change the periodicities reported in each of the spectral analyses. The only change is in the relative spectral power. In the forecast, there are changes in the breadth of the Bayesian model. Therefore, there are no effects on the results or the forecasts. An advantage of the proposed methodology is that it leaves the user free to choose the input data. In our case, we decided to use the proposed model in our work.
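One way a reader might examine how a unitary (0/1) signal compares with an annual-count signal is to build both from the same counts and compare their spectra. The sketch below does this with a plain FFT periodogram, which is swapped in here for simplicity and is not the wavelet analysis used in the manuscript. The counts are synthetic, and the function names are hypothetical.

```python
import numpy as np

def to_binary(counts):
    """Unitary signal: 1 if at least one hurricane occurred that year, else 0."""
    return (np.asarray(counts) > 0).astype(float)

def periodogram(signal):
    """Return (frequencies in cycles per year, spectral power), zero frequency dropped."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the mean so it does not dominate the spectrum
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0)  # one sample per year
    return freqs[1:], power[1:]

def dominant_period(signal):
    """Period (in years) of the strongest spectral peak."""
    freqs, power = periodogram(signal)
    return 1.0 / freqs[np.argmax(power)]

# Synthetic annual counts, invented for illustration only (not real data).
counts = np.array([0, 2, 1, 0, 0, 3, 0, 1, 0, 0, 2, 0])
binary = to_binary(counts)
```

Calling `dominant_period` on both `counts` and `binary` for the real series would show whether the leading periodicities coincide and how the relative spectral power differs between the two representations.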
The authors need to have an additional paragraph in the introduction to cover current state of the art research on hurricane/tropical cyclones using machine learning techniques.
Response: We appreciate the recommendation. We have now added a new sentence in the Introduction describing two more recent examples of new works applying Machine Learning to the problem of forecasting hurricane activity and tropical cyclone intensity and track.
The authors need to carefully read the manuscript and correct several typos (e.g., Line 164 – There are different models can approximate …)
Response: We appreciate the gentle reminder. We hope that the revised manuscript has reduced typos to a minimum, if not eliminated them.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
I think the authors have improved the paper and addressed my questions. I wonder if they would like to add a discussion as to how one could update and incorporate early season climate conditions in their model, and/or consider the generation of tracks associated with each storm category - as in
Nakamura, J., Lall, U., Kushnir, Y., Harr, P. A., & McCreery, K. (2021). Early Season Hurricane Risk Assessment: Climate-Conditioned HITS Simulation of North Atlantic Tropical Storm Tracks. Journal of Applied Meteorology and Climatology, 60(4), 559-575.
Nakamura, J., Lall, U., Kushnir, Y., & Rajagopalan, B. (2015). HITS: Hurricane intensity and track simulator with North Atlantic Ocean applications for risk assessment. Journal of Applied Meteorology and Climatology, 54(7), 1620-1636.
Author Response
We fully appreciate and understand the referee’s idea to add discussion on early season prediction of Atlantic hurricanes. Unfortunately, we think that this is a different class of forecasting problem, one that would require far more information and fine-grained detail from weekly and seasonal precursors. Therefore, we hope the referee can understand and accept our reluctance to take on this suggestion directly.
However, we have added the following discussion and clarifying explanation to the manuscript which we hope is adequate for the narrow purpose of our paper: (marked in bold font around page 18 of the final revised manuscript)
The accuracy of any forecast with Machine Learning is limited by an uncertainty principle (see Velasco Herrera et al., 2015; this uncertainty principle could be generalized to a certain quantum Machine Learning principle). Greater precision in the temporal forecast implies a significant uncertainty in the forecast of spatial location (i.e., the hurricane tracks). We suggest that one promising advance in hurricane forecasting may consist in changing the prediction paradigm from an “exact or precise” approach to probabilistic forecasting of future hurricane activity cycles. This is why we focus on the problem of temporal forecasting using Bayesian Machine Learning in this work. Our forecast is based on the natural patterns of multi-annual and decadal variations for each of the hurricane categories analyzed rather than on any pre-season weather and climatic conditions and variables.
In our forecasting model, there is a clear absence of any a priori knowledge of pre-season activity. From the point of view of signal theory, this means that such pre-season information can be viewed as weather and climatic noise. Therefore, in order to predict hurricanes in the long-term sense, we do not focus on forecasting these pre-season weather and climatic conditions, because this weather and climatic noise is highly variable and can accordingly be treated as a stochastic process. Without additional fine-grained information on pre-season activity, it is impossible to say exactly what the weather and climatic conditions will be for the following year. We note, however, that for the study of hurricane trajectories that are preconditioned on the early season large-scale climate state, the Cluster-Based Climate-Conditioned Hurricane Intensity and Track Simulator (C3-HITS) model can be used (see Nakamura et al. 2015; 2021). Ultimately, our work highlights that the multiannual and decadal variations (trends) of Atlantic hurricanes from categories 2 to 5 are stable and consistent from 1950 to 2021, which can be assumed to result from the more persistent and coherent interactions of the coupled atmosphere-ocean-geographical system in affecting and modulating hurricanes. This notable property is indeed a useful signal from the point of view of signal theory and supports a probabilistic forecast such as the one we have performed in this study.
Author Response File: Author Response.pdf
Reviewer 2 Report
The manuscript can be accepted in the present form.
Author Response
We welcome all suggestions and comments from the Reviewer.
Author Response File: Author Response.pdf