Deep Learning Based Android Anomaly Detection Using a Combination of Vulnerabilities Dataset
Round 1
Reviewer 1 Report
This paper addresses Android malware detection. It is generally well written, but it has a long introduction and short conclusions, and it does not describe the methods used with sufficient mathematical validation. In my opinion, this work fails to prove that the approach used outperforms other state-of-the-art strategies for detecting malware in Android apps. The goal of the work is interesting and useful, but the paper is not acceptable in its current version. In Eq. 2, "bais" should be "bias".
Author Response
We would like to thank the reviewer for the evaluation and the useful comments.
This paper addresses Android malware detection. It is generally well written, but it has a long introduction and short conclusions, and it does not describe the methods used with sufficient mathematical validation. In my opinion, this work fails to prove that the approach used outperforms other state-of-the-art strategies for detecting malware in Android apps. The goal of the work is interesting and useful, but the paper is not acceptable in its current version. In Eq. 2, "bais" should be "bias".
We have addressed the comment by improving the “Introduction” section (1) and extending the “Conclusion” section (6). Furthermore, we updated the “Related work” section (5) by adding recent work and comparing it with our approach. In addition, we have provided more details about our approach by adding three algorithms describing our processes and by providing complexity information about our models’ computation. We have also improved on the previous version by adding more comparisons with the state of the art.
Eq. 2. The typo has been fixed.
Reviewer 2 Report
This paper relies on the AndroVul dataset to train Deep Learning and Support Vector Machine models, enabling them to learn to distinguish between benign and malicious apps as well as malware families.
The novelty of the approach is questionable, as it only uses existing ML techniques and applies them. The proposed tool is interesting from a professional aspect but questionable as a novel contribution. The novelty of the approach needs to be corroborated.
A detailed description must be included in the paper that emphasizes the main pros and cons of the authors’ proposal with regard to the state of the art.
The authors should probably provide more information about the proposed architecture. This is a major issue of the paper of how the authors have chosen this specific architecture for the proposed processing method, how it emerged and why the proposed architecture is the optimal solution.
The paper includes descriptions of the different methodologies and techniques employed in their proposal that can be directly extracted from the standard bibliography. These descriptions have currently a long extension, thus reducing the parts of the paper really providing a specific contribution. The distribution of these descriptions along the different sections of the paper also originates including a mixture of descriptions about the state of the art, the authors’ proposal, and the results obtained in most of the sections of the paper. These parts must be isolated in different sections of the paper, also reducing the explanation about standard methods and techniques.
The authors should clearly state the innovations from the application point of view and they should clearly define which are the innovative features of their proposal with respect to adopted logic.
On the other hand, there are many up-to-date theoretical studies on Machine Learning and well-established communities working on different theoretical aspects and techniques. The authors must extend the explanation about the main differences between the current submission and their previous studies. I would suggest a comparison study.
I suggest to the authors to re-arrange the entire paper and improve a lot the background in order to make the paper self-consistent. Then insert a state of the art, in particular, to explain where the original contribution of the work is. Furthermore, I suggest giving more details about the considered architecture and about the application of the proposed method to it.
I think, with improvements suggested in the previous section, the paper makes an acceptable case for publication.
Author Response
Answers to Editor and Reviewers’ Comments
We thank the reviewers for their insightful comments, which helped improve the quality and content of the manuscript. We appreciate the time and effort they spent reviewing the paper.
In this document, we address the specific comments that the reviewers have made. For the reviewers’ comments, we have retyped their comments in bold followed by our response in italics and bold.
Reviewer #2:
We would like to thank the reviewer for the evaluation of our paper and the insightful comments.
This paper relies on the AndroVul dataset to train Deep Learning and Support Vector Machine models, enabling them to learn to distinguish between benign and malicious apps as well as malware families. The novelty of the approach is questionable, as it only uses existing ML techniques and applies them. The proposed tool is interesting from a professional aspect but questionable as a novel contribution. The novelty of the approach needs to be corroborated.
In the “Introduction” section (1), we have added more details and highlighted the novelty and the contributions of our work.
We have also improved the “Related work” section (5) by adding more discussion to contrast our work with the state of the art. In addition, we provided comparisons with reference works as well as with our previous conference paper.
We have also improved the presentation of our approach, in particular, by adding algorithms and discussing their complexities.
A detailed description must be included in the paper that emphasizes the main pros and cons of the authors’ proposal with regard to the state of the art.
We addressed the comment by adding more details highlighting the differences between our approach and the state of the art.
In the “Related work” section (5), we added more references and provided a more detailed comparison between our approach and the existing approaches, as well as a summary table. Additionally, our experimental results are compared with state-of-the-art reference works.
The authors should probably provide more information about the proposed architecture. This is a major issue of the paper of how the authors have chosen this specific architecture for the proposed processing method, how it emerged and why the proposed architecture is the optimal solution. The paper includes descriptions of the different methodologies and techniques employed in their proposal that can be directly extracted from the standard bibliography. These descriptions have currently a long extension, thus reducing the parts of the paper really providing a specific contribution. The distribution of these descriptions along the different sections of the paper also originates including a mixture of descriptions about the state of the art, the authors’ proposal, and the results obtained in most of the sections of the paper. These parts must be isolated in different sections of the paper, also reducing the explanation about standard methods and techniques.
We added more description of the algorithms and the best hyperparameters. We reduced the descriptions drawn from the standard bibliography to highlight our specific contribution.
- In the “Background” section (2), we have given more explanation regarding the features we used in our work.
- In the “Methodology” section (3), we have improved the following subsections:
- In the “Dataset” subsection (3.1), we added Algorithm 1, which explains how we collected malware apps from the VirusShare repository. We have also added a description of each step.
- In the “Feature extraction” subsection (3.2), we added more explanation of the extraction process and described in detail the algorithm summarizing the feature extraction (Algorithm 2).
- In the “Android Malware Detection based on Deep Learning” subsection (3.4), we added Algorithm 3 and described it in detail. We added the complexity of the DL model. We also added Figure 6, which depicts our model, and Table 2, which shows the best hyperparameters obtained by experimentation.
- In the “Android Malware Detection based on Support vector machine” subsection (3.5), we added Algorithm 4 and explained it in detail. We added the complexity of the SVM model.
In the “Related work” section(5), we added the following changes:
- We have re-arranged the “Related work” section (5) to add more information to the taxonomy reported in Table 9. That taxonomy focuses on the comparison between the state of the art and our approach.
- We have discussed our previous work and explained the distinctions between it and the new work presented in our paper.
- We have added more recent references and further discussed how they compare with our approach.
The authors should clearly state the innovations from the application point of view and they should clearly define which are the innovative features of their proposal with respect to adopted logic. On the other hand, there are many up-to-date theoretical studies on Machine Learning and well-established communities working on different theoretical aspects and techniques. The authors must extend the explanation about the main differences between the current submission and their previous studies. I would suggest a comparison study.
We addressed the comment as follows. We have explained the differences between our previous studies and this paper. In the “Related work” section (5), we have discussed our previous work and explained the differences between it and the new work presented in our paper. In short, our previous conference paper aimed to demonstrate the potential of the proposed features for detecting malware. In contrast to that work, our key objective in this research is to propose a finely tuned machine learning approach able to outperform existing approaches and anti-virus products. The additional work involved tuning the hyperparameters of the machine learning approaches, increasing the number of malware apps, and conducting additional experiments and comparisons with the existing literature and antivirus products.
I suggest to the authors to re-arrange the entire paper and improve a lot the background in order to make the paper self-consistent. Then insert a state of the art, in particular, to explain where the original contribution of the work is. Furthermore, I suggest giving more details about the considered architecture and about the application of the proposed method to it. I think, with improvements suggested in the previous section, the paper makes an acceptable case for publication.
We have addressed the comment accordingly.
- In the “Background” section (2), we have given more explanation regarding the different concepts and features used in our work.
- We have improved the “Methodology” section (3): we have described the feature extraction in more detail. We have also added algorithms accordingly (subsection 3.1).
- We have clarified our contribution and highlighted the novelty of our approach (section 1).
- We have added the models’ algorithms and the tables, as well as the figures that explain the best hyperparameters (subsections 3.4 and 3.5).
- We have re-arranged the paper, particularly the related work, and compared the results achieved by our approach with those obtained by the state of the art. We performed that comparison based on several criteria: accuracy, the features used, etc.
- We have expanded the conclusion with further information and future work (section 6).
We would like to thank all reviewers once again. We have addressed the raised comments and updated the manuscript accordingly.
Reviewer 3 Report
The manuscript is about deep learning based Android anomaly detection. It is well written. On the other hand, the authors should improve the manuscript.
Basic reporting
The work is somewhat new, but several technical limitations make it hard to grasp the main point of the work. Some of them are listed below:
+ In my opinion, the abstract is too cumbersome, and it is hard to catch the key point.
+ It is essential to address their method using the algorithm, which makes it clear to grasp the steps of the improvements of the technique.
+ Some related references should be discussed and added, such as 10.7717/peerj-cs.346, 10.7717/peerj-cs.285
+ Authors are recommended to have thorough proofread to avoid the typos.
Experimental design
+ The time and space complexity and the algorithm are not specified.
+ The test setup and tuning for the work should be elaborated and detailed for future productions.
+ You need to add one table to show all the algorithms' parameters that were tuned in this research.
The validity of the findings
+ What is your justification for using your method?
Comments for the Author
+ Overall good work, well done.
+ The literature has to be strongly updated with some relevant and recent papers focused on the fields dealt with in the manuscript.
Author Response
Answers to Editor and Reviewers’ Comments
We thank the reviewers for their insightful comments, which helped improve the quality and content of the manuscript. We appreciate the time and effort they spent reviewing the paper.
In this document, we address the specific comments that the reviewers have made. For the reviewers’ comments, we have retyped their comments in bold followed by our response in italics.
Reviewer #3:
We would like to thank the reviewer for the evaluation of our paper and the useful comments.
The manuscript is about deep learning based Android anomaly detection. It is well written. On the other hand, the authors should improve the manuscript.
Basic reporting
The work is somewhat new, but several technical limitations make it hard to grasp the main point of the work. Some of them are listed below:
+ In my opinion, the abstract is too cumbersome, and it is hard to catch the key point.
We have addressed the comment accordingly. We elaborated more about the key point in the abstract.
+ It is essential to address their method using the algorithm, which makes it clear to grasp the steps of the improvements of the technique.
We have addressed the comment accordingly. We have added the algorithms for both ML models and we studied their complexities.
- In the “Android Malware Detection based on Deep Learning” subsection (3.4), we added Algorithm 3 and described it in detail. We added the complexity of the DL model. We also added Figure 6, which depicts our built model, and Table 2, which shows the best hyperparameters obtained by experimentation.
- In the “Android Malware Detection based on Support vector machine” subsection (3.5), we added Algorithm 4 and explained it in detail. We added the complexity of the SVM model.
+ Some related references should be discussed and added, such as 10.7717/peerj-cs.346, 10.7717/peerj-cs.285
We have addressed the comment accordingly. In the “Related work” section (5), we have added these new references, among others, and discussed them.
+ Authors are recommended to have thorough proofread to avoid the typos.
We have addressed the proofreading issues accordingly.
Experimental design
+ The time and space complexity and the algorithm are not specified.
We have addressed the comment accordingly by studying the complexity of the proposed algorithms, namely “Android Malware Detection based on Deep Learning” (subsection 3.4) and “Android Malware Detection based on Support vector machine” (subsection 3.5).
+ The test setup and tuning for the work should be elaborated and detailed for future productions.
We have addressed the comment accordingly: we have provided the best parameters to use in our approach and built a summary of our approach.
- In the “Android Malware Detection based on Deep Learning” subsection (3.4), we added Figure 6, which depicts our built model, and Table 2, which describes the best hyperparameters.
+ You need to add one table to show all the algorithms' parameters that were tuned in this research.
We have addressed the comment accordingly:
- In the “Android Malware Detection based on Deep Learning” subsection (3.4), we have added algorithms (Algorithms 3 and 4) and described them in detail. In addition, we have added the complexity description of the models. We have also added Figure 6, which depicts our built model, and Table 2, which shows the best hyperparameters.
The validity of the findings
+ What is your justification for using your method?
Malware developers exploit vulnerabilities in Android apps to inject their malicious code. Thus, identifying vulnerabilities in Android apps and including them as features was our starting point. Then, to improve malware detection in an autonomous way, we deployed machine learning techniques, feeding the extracted features (derived from our previous work) to train the model.
We have updated the “Introduction” section (1) accordingly to make this idea clearer.
Comments for the Author
+ Overall good work, well done.
+ The literature has to be strongly updated with some relevant and recent papers focused on the fields dealt with in the manuscript.
In the “Related work” section (5), we have added other recent references and discussed them with regard to our solution.
We would like to thank all reviewers once again. We have addressed the raised comments and updated the manuscript accordingly.
Round 2
Reviewer 1 Report
The authors have improved the manuscript, taking my comments into consideration. Hence, it is my opinion that the improved version of the manuscript can be accepted.
Reviewer 2 Report
I have read the authors' responses to my comments and have read the revised version of the paper. I am satisfied that, following the initial review, the authors have addressed the comments appropriately. The paper reads much better now and the clarity of the work presented has improved. I believe that the response provided has enhanced the paper to a level acceptable for the readership and the scientific standing of this journal. I have no further comments to make.
Reviewer 3 Report
The authors answered all my comments. In my opinion the manuscript is acceptable.