Using ARIMA to Predict the Growth in the Subscriber Data Usage
Round 1
Reviewer 1 Report
ARIMA is among the most common forecasting models; there are hundreds, if not thousands, of studies that use it for forecasting.
I have to reject this paper for lack of novelty (nothing is new).
As a recommendation for improvements:
1. Explore different forecasting models. You can combine classical mathematical models with the strengths of neural networks, specifically recurrent neural networks (RNNs), using various ensemble algorithms.
2. This paper does not explore the most important part of modeling, hyperparameter optimization. In my experience, this is the most neglected part of any forecasting model.
3. It would also be better to explore other factors that affect subscribers' data usage, specifically through multivariate forecasts.
4. This is a good paper but needs major revision on the methods as stated in number 1.
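A minimal sketch of the kind of hyperparameter optimization raised in point 2, assuming a simple exponential smoothing model tuned by grid search. The `usage` values, the `grid`, and both function names are hypothetical and invented for illustration; they are not taken from the manuscript.

```python
def ses_one_step_errors(series, alpha):
    """Absolute one-step-ahead errors of simple exponential smoothing."""
    level = series[0]  # initialise the level with the first observation
    errors = []
    for y in series[1:]:
        errors.append(abs(y - level))  # the forecast for this step is the current level
        level = alpha * y + (1 - alpha) * level  # update the level after observing y
    return errors


def tune_alpha(series, grid):
    """Return the alpha with the lowest mean absolute one-step-ahead error."""
    scores = {}
    for alpha in grid:
        errs = ses_one_step_errors(series, alpha)
        scores[alpha] = sum(errs) / len(errs)
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical monthly data usage values (GB), invented for illustration
usage = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
best_alpha, scores = tune_alpha(usage, grid)
print(best_alpha, scores)
```

The errors here are computed on the same series used for tuning; a held-out or rolling-origin split would give a more honest estimate, and the same search pattern applies to the orders of an ARIMA model.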
Author Response
Dear Reviewer, attached are my responses to your comments.
Thank you for your recommendations and suggestions.
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors proposed and claimed to predict 5 subscriber usage trends based on historical time-stamped data to improve the Quality of Experience (QoE). The abstract and conclusion are very well structured, the ideas are clear, and the writing is concise and argumentative. Overall, the manuscript is well framed and firm. The methods chosen in the study, ARIMA and CNN, are adequate. The following amendments are suggested to improve the quality of the work.
1- Algorithm-1 should be presented in the correct way. It must show the process of EDA instead of the steps of a Python script.
2- The following article can be considered as a potential reference:
Kumar, R., Kumar, P., & Kumar, Y. (2022). Multi-step time series analysis and forecasting strategy using ARIMA and evolutionary algorithms. International Journal of Information Technology, 14(1), 359-373.
Author Response
Dear reviewer, thank you for your recommendations and suggestions.
Author Response File: Author Response.pdf
Reviewer 3 Report
Overall, the article is relevant for the readers of the journal. However, it needs a major revision in order to be considered for publication. In particular, the structure of the paper has to be improved, so that it is clear (i) which problem shall be solved, (ii) why this problem is relevant, (iii) which method is applied to solve the problem and (iv) how this method performs in comparison to standard methods.
The introduction is not structured appropriately. It should describe the problem you want to solve and only marginally indicate the approach that you use to solve the problem. The detailed description of the approach should follow in a later section. Moreover, the introduction should clearly describe the importance of the problem at hand and it should name some approaches that tackled the problem in the past.
On page 2 in line 51 you state: “… ARIMA is a class of predictive classifiers…”. ARIMA is not a class of predictive classifiers but instead it is a class of time series forecasting models. Hence it is a special case of a class of regression models, not a class of classification models. See
Box, G. E. P., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: Forecasting and control (5th ed). John Wiley & Sons.
and
Hyndman, R. J., & Khandakar, Y. (2008). Automatic time series forecasting: The forecast package for R. Journal of Statistical Software, 27(1), 1–22.
I do not clearly understand the purpose of Sections 2.1 and 2.2. What do the authors want to review here? From my point of view, the literature review should describe approaches to tackle the problem to be solved (predicting the growth of subscriber data). The current literature review does not show this information. The mathematical formulation of ARIMA models in 2.3 is useful. However, it should be moved to Section 3 and it should not only describe a standard ARIMA model (since this can be found in several standard textbooks). Instead, the ARIMA model should be directly adapted to the application problem of the paper.
Section 3.1 describes the database in detail. Here, the problem to be solved becomes clearer. However, this is far too late in the article. Section 1 should at least describe the problem to be solved.
On page 8 in line 252 the authors describe p-values. From my point of view, each reader of the paper should have an idea of the meaning of p-values. However, if the authors decide to describe the meaning of p-values, they have to state which statistical test they are conducting and which null hypothesis they are testing.
On page 15 in line 379 the authors state: “Both experimental datasets achieved the same predictive accuracy value of 90%.” From my point of view, a dataset cannot achieve a predictive accuracy. However, a predictive model (such as an ARIMA model) can achieve a predictive accuracy on a dataset.
The comparison of ARIMA and CNN in Section 4.2 is useful and necessary. However, the predictive results should be compared to some other standard forecasting approaches, e.g. exponential smoothing, random walk, k-nearest neighbors (kNN), linear regression, and a multilayer perceptron (MLP). Moreover, it should be discussed why some approaches achieve good results and, specifically, why ARIMA works well for the problem at hand.
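A minimal sketch of such a baseline comparison, assuming a single hold-out split and mean absolute error as the metric. The `usage` series and all function names are hypothetical, and only three simple baselines (random walk, historical mean, random walk with drift) are shown; the manuscript's models would be added to the same dictionary.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(train, horizon):
    """Random-walk forecast: repeat the last observed value."""
    return [train[-1]] * horizon


def mean_forecast(train, horizon):
    """Forecast the historical mean at every step."""
    return [sum(train) / len(train)] * horizon


def drift_forecast(train, horizon):
    """Random walk with drift: extend the average historical change."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]


# Hypothetical data usage series, invented for illustration
usage = [10.0, 12.0, 13.0, 15.0, 16.0, 18.0, 19.0, 21.0, 22.0, 24.0]
train, test = usage[:7], usage[7:]

baselines = {"naive": naive_forecast, "mean": mean_forecast, "drift": drift_forecast}
results = {name: mae(test, f(train, len(test))) for name, f in baselines.items()}
for name, score in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: MAE = {score:.3f}")
```

A proposed model only adds value if it beats baselines like these on the same hold-out data, which is why they belong in the same table as ARIMA and CNN.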
The results section is not structured appropriately. The authors should try to describe their approach as clearly and easily as possible. Moreover, they should start with an overview of the overall approach in order to avoid confusing the reader.
All figures and tables in the article seem to be pixel based. Therefore, some of them are hard to read. The authors should use vector formats so that the figures and tables can be scaled appropriately.
Author Response
Dear reviewer, thank you for your recommendations and suggestions.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
With all comments on my part addressed, this paper is ready for publication. Proofread the work thoroughly before publication.
Author Response
Attached is my response.
Author Response File: Author Response.pdf
Reviewer 3 Report
It seems that in the first review round, the paper had two authors, while in the second review round, only one author is mentioned. From my point of view, this is possible and the second author should join again or he should at least be acknowledged for his work.
Sections 2 and 3 should be included into Section 1.
Why did the authors choose the specific parameters for the ARIMA model in Section 5.8? It may be more successful to try different parameter configurations or to use an automatic algorithm that chooses the parameters by AIC (such as auto.arima from the forecast package in R).
In Table 7, you have to add to the description which statistical test the p-value belongs to.
Section 4 does not include what it should include. From the name, this section should discuss other approaches to predict subscriber data usage. However, it rather includes mathematical details about stationarity and activation functions, as well as a comparison of ARIMA and LSTM in several applications. From my point of view, these paragraphs are not relevant for the paper since they are too detailed. Instead, the authors should compare different approaches to solve the application problem of the paper.
Section 4.4 is relevant and can be kept; however, I do not understand why cyclostationarity has a special paragraph with a heading but without a numbering.
The structure of the results section could be improved.
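A minimal sketch of AIC-based order selection, restricted to pure autoregressive AR(p) models fitted by the Levinson-Durbin recursion. This is a simplified stand-in and not the auto.arima algorithm, which also searches over differencing and moving-average orders. The synthetic series and the function names are assumptions made for illustration.

```python
import math
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t - k] for t in range(k, n)) / n for k in range(max_lag + 1)]


def ar_aic_by_order(x, max_p):
    """Return {p: AIC} for AR(p), p = 0..max_p, using Levinson-Durbin."""
    n = len(x)
    gamma = autocovariances(x, max_p)
    sigma2 = gamma[0]  # innovation variance of the AR(0) model
    phi = []           # AR coefficients of the current order
    aic = {0: n * math.log(sigma2)}
    for p in range(1, max_p + 1):
        # Reflection coefficient (partial autocorrelation at lag p)
        acc = gamma[p] - sum(phi[j] * gamma[p - 1 - j] for j in range(p - 1))
        k = acc / sigma2
        # Update the coefficients from order p-1 to order p
        phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
        sigma2 *= 1 - k * k
        # AIC up to an additive constant: fit term plus penalty of 2 per coefficient
        aic[p] = n * math.log(sigma2) + 2 * p
    return aic


# Synthetic AR(1) series with coefficient 0.8, invented for illustration
random.seed(0)
series = [0.0]
for _ in range(499):
    series.append(0.8 * series[-1] + random.gauss(0, 1))

aic = ar_aic_by_order(series, max_p=5)
best_p = min(aic, key=aic.get)
print(best_p, aic)
```

The order with the lowest AIC balances goodness of fit against the number of estimated coefficients, which is the same principle an automatic ARIMA search applies over a larger model space.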
Author Response
Attached is my response.
Author Response File: Author Response.pdf