The Effect of the Ransomware Dataset Age on the Detection Accuracy of Machine Learning Models
Round 1
Reviewer 1 Report
This paper describes a set of experiments aimed at showing that machine learning models trained on data from new ransomware samples are ineffective at detecting old ransomware samples, and vice versa.
While the paper is well-written and has a good flow for readers, several things should be improved to strengthen its content.
- The ML models used in these experiments were supervised learning only. However, ML also includes unsupervised and semi-supervised learning, which were not explored here. Therefore, I suggest the authors be specific, since using the term "machine learning" alone could be misleading.
- 10-fold cross-validation means that you are using 90% of the data for training and the remaining 10% for testing. It wasn't clear to me how this was used if 70% of the data was used for training and the remaining 30% for testing. I suggest being clear about this.
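The distinction the reviewer raises can be made concrete. A minimal pure-Python sketch (hypothetical data; a library such as scikit-learn would normally handle this) contrasting a fixed 70/30 hold-out split with 10-fold cross-validation, where each fold trains on 90% of the samples and tests on the remaining 10%:

```python
def holdout_split(samples, train_frac=0.7):
    """Fixed hold-out split: one training set, one test set."""
    cut = int(len(samples) * train_frac)
    return samples[:cut], samples[cut:]

def k_fold_splits(samples, k=10):
    """k-fold CV: k (train, test) pairs; each test fold is ~1/k of the data."""
    fold_size = len(samples) // k
    for i in range(k):
        test = samples[i * fold_size:(i + 1) * fold_size]
        train = samples[:i * fold_size] + samples[(i + 1) * fold_size:]
        yield train, test

data = list(range(100))            # 100 hypothetical ransomware samples
train, test = holdout_split(data)  # 70 train / 30 test
folds = list(k_fold_splits(data))  # 10 folds: 90 train / 10 test each
```

The two procedures answer different questions, so a paper should state which one produced each reported accuracy.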
- Neural networks are also a common machine learning model that can either be supervised or unsupervised. Is there a way that new experiments use something simple like a feed-forward neural network which is mostly used for supervised learning and compared to the performance of the other supervised learning models? It would be interesting to also consider the use of deep learning models as part of future work.
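To illustrate what a feed-forward baseline would involve, here is a minimal sketch of a forward pass with one hidden layer and sigmoid activations. The weights are illustrative, not trained; in practice an off-the-shelf implementation such as scikit-learn's MLPClassifier would be used for a fair comparison against the other supervised models:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """x: input features; w_hidden: one weight vector per hidden unit;
    w_out: output-unit weights over the hidden activations."""
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for w in w_hidden]
    return sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)))

# Hypothetical 3-feature permission vector for one app
score = forward([1.0, 0.0, 1.0],
                w_hidden=[[0.5, -0.2, 0.3], [-0.4, 0.1, 0.6]],
                w_out=[1.0, -1.0])
label = 1 if score >= 0.5 else 0   # 1 = ransomware, 0 = benign
```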
- Paper [22] seems to be a relevant and recent work whose reported accuracy was 99.3%, but it was not considered in Table 9, where the proposed approach is compared with prior works. Why is that? I believe it should be mentioned.
- Describe what else will be explored as part of future work. The paper only mentions the need for different splits of the ransomware years, but other directions could also be explored, for example, adding more recent samples (i.e., 2021 and 2022) and trying different training set sizes (besides 70%, sizes of 80% or 90% could also be considered).
- Also, I suggest the authors explain in the Results section why one supervised model performs better than the others. For example, RF commonly outperforms the other models, but in some experiments this was not the case. This is a very important aspect that is missing and should be incorporated before the paper is published.
Overall, this is a topic well received by the research community, and the future work will be beneficial. In addition, listing the top 10 most frequent permission features is very helpful, as it allows for replication.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
1. Provide a reference for the following sentence “Kaspersky, which is one of the common antiviruses, has detected more than 17k new mobile ransomwares out of three million malicious applications in 2021.”
2. Number the equations.
3. In the SVM, why did you select the linear kernel? Most studies show that the Gaussian radial basis function (RBF) is the best kernel; for more information, see the following paper:
a. Performance Investigation of Principal Component Analysis for Intrusion Detection System Using Different Support Vector Machine Kernels
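A minimal sketch of the two kernels under discussion, the linear kernel used in the paper versus the RBF kernel the reviewer suggests (pure Python; `gamma` is a hypothetical width parameter that would be tuned in practice):

```python
import math

def linear_kernel(x, y):
    """k(x, y) = x . y"""
    return sum(xi * yi for xi, yi in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2)"""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

a, b = [1.0, 0.0], [0.0, 1.0]
lin = linear_kernel(a, b)   # 0.0 for orthogonal vectors
rbf = rbf_kernel(a, b)      # exp(-0.5 * 2), still nonzero
```

Unlike the linear kernel, the RBF kernel maps samples into an implicit infinite-dimensional space, which is why it often handles non-linearly separable feature sets better.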
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The authors present a paper on the effect of the ransomware dataset age on machine learning models. A few comments are available.
Points in favor:
+ Good examples and illustrations.
Points against:
- Some parts are missing.
Detailed Feedback
Abstract
The abstract does not state the exact research problem with current machine learning models or datasets.
All keywords must be in the abstract.
Introduction
Too lengthy; please focus on the research problem and the generic approach, and move the purpose and history text into a new section, say, a background section (new section).
It is highly recommended to move the feature discussion into a new section.
The contributions would be more readable if some points were merged; for example, points 1 and 2 are the same except for old/new.
Related work
It is recommended to update Table 2; it is a little hard to follow.
Remove the "X" shown under work [43].
Add the authors' names to each related work, or split the first column into two columns.
Section 2.4 is titled "Discussion"; it is recommended to rename it "Literature Gaps" or "Issues in Related Work".
proposed approach
Section 3.3 is about detection models; it is hard to know the rationale for selecting these methods.
These methods are well documented in textbooks, so to keep the paper focused, it is recommended not to detail how they work (Table 3 should be enough).
Readers might need information on what is concluded from this research. For example, do we need new and old feature sets in all machine learning models? What is the impact of new versions of the Android system or software on that?
Please add a separate section explaining the implications of this research.
Minor comments
Please make sure the text is aligned correctly in the paper: some tables are wider than the text, and the images are aligned too far to the right.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf