Predictive Analysis of COVID-19 Symptoms in Social Networks through Machine Learning
Round 1
Reviewer 1 Report
The authors' paper on symptom detection in social networks appears to be a well-written research paper. The literature review is presented comprehensively, taking into account the latest research. In addition, the research conducted corresponds to the topic of the journal. I suggest accepting the paper in its current form.
Author Response
We would like to thank the reviewer for taking the time to read the results of our work.
Reviewer 2 Report
The authors study the detection of symptoms on Twitter. I have some comments and questions that, if considered, might improve the quality of this research.
1- As a reader, I find this paper attractive and educational. However, it is neither a survey paper nor an original research article, meaning that:
- If the authors would like to publish a survey, they need to include the most recent articles (full coverage of existing methods and surveys); their contribution would then be the categorization, a good presentation, and, most importantly, the conclusions and future directions.
- If the authors would like to publish an original research article, there needs to be enough novelty and contribution. Obviously, what counts as enough differs between journals and conferences.
In its current form, this research meets neither of the above criteria.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 3 Report
The work presented for review deals with an important and current topic: the analysis of messages on social networks. The authors study messages on social networks in order to analyse emotions, feelings, ideas and even symptoms of certain diseases. Specifically, they analyse symptoms related to possible SARS-CoV-2 infection. The data were studied by applying eight machine learning algorithms and were labelled using 9 keywords provided by the WHO. The results indicate that the Random Forest algorithm performed better than the other algorithms used.
However, there are a number of aspects that should be reviewed:
- Revise the title, as "Symptoms Self-Report Detection in Social Networks" does not accurately describe the work, which tests the effectiveness of certain machine learning algorithms for the predictive analysis of symptoms of COVID-19 disease.
- Revise the abstract for the changes I am going to suggest below.
- Revise the introduction. The paper addresses a topic of current relevance, so the authors should include more up-to-date bibliographic references. Only 17 of the references presented are from the last 5 years, which is 36.17%; this percentage should be increased to at least 60%.
- After the introduction, the authors should clearly state the research objectives or hypotheses of the work.
- Regarding the methodology, the authors should give more information about access to the database that contained the tweets. They indicate that the tweets were extracted from Lamsal; however, they should state the permissions obtained for this purpose or describe the process of accessing the data in more detail.
- The authors point out that the messages were labelled with respect to 9 key terms provided by the WHO. They should include a figure showing these terms and the frequency of each in the database. The symptoms should also be related to the possible combinations used for the diagnosis of SARS-CoV-2. In addition, the authors talk about detecting emotions and feelings reflected in the tweets. This is a separate objective of the work, and the tweets should be classified with their own criteria, as has been done for the symptoms. These objectives should be kept distinct and treated individually so that the results can be checked.
- Another doubt concerns the classification algorithms. I understand that they are applied to relate the data to a SARS-CoV-2 diagnosis in the users, but how did the authors relate the tweet data to the diagnosis actually obtained by the people who sent the tweets, given that, as I understand it, this is confidential information?
- On the other hand, the authors talk about prediction in the results, but they have used classification algorithms, not prediction algorithms. This aspect needs to be clarified.
- Likewise, the authors compare the classification efficiency of different algorithms, although in order to carry out a study of this type, they should include adjusted Rand index (ARI) tests.
An example can be found in this paper: Sáiz-Manzanares, M.C., Ramos Pérez, I., Arnaiz-Rodríguez, Á., Rodríguez-Arribas, S., Almeida, L., & Martín, C.F. (2021). Analysis of the learning process through eye tracking technology and feature selection techniques. Applied Sciences, 11, 6157, 1-24. https://doi.org/10.3390/app11136157
Alternatively, a fit check can be carried out using structural equations by applying a goodness-of-fit index test.
Another example can be found in
Sáiz-Manzanares, M.C., Rodríguez-Díez, J.J., Díez-Pastor, J.F., Rodríguez-Arribas, S., Marticorena, R., & Ji, P.Y. (2021). Monitoring of Student Learning in Learning Management Systems: An Application of Educational Data Mining Techniques. Applied Sciences, 11, 2677, 1-16. doi: 10.3390/app11062677.
- On the other hand, authors should review the presentation of tables and figures according to the journal's guidelines. Under each table or figure they should add a note describing the meaning of the acronyms.
- Also, the citation of references should be reviewed according to the journal's guidelines.
- With regard to the structuring of the headings of the article, the authors must follow the journal's rules, so the results section must be independent of the discussion section. In this section, the authors must relate the results found with previous studies that have served as a reference for the work based on each of the hypotheses put forward. Likewise, the method section should be divided into participants, instruments, procedure, data analysis. An example of all of these can be found in the references given as an example.
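As an illustration of the adjusted Rand index (ARI) check suggested above, the following is a minimal sketch in plain Python. It is not the procedure used in the manuscript or in the cited papers; the function name `adjusted_rand_index` and the toy label lists are hypothetical and serve only to show how the agreement between a reference labelling and each classifier's output could be quantified.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items.

    Hypothetical helper for illustration: 1.0 means identical partitions,
    values near 0.0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: trivially identical

    # Contingency counts and their marginals
    pair_counts = Counter(zip(labels_a, labels_b))
    row_counts = Counter(labels_a)
    col_counts = Counter(labels_b)

    # Numbers of item pairs grouped together in each view
    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / total_pairs  # chance-level agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single group in both labelings
    return (index - expected) / (max_index - expected)


# Toy data (assumed, not taken from the manuscript):
# 1 = tweet labelled as reporting a symptom, 0 = no symptom reported.
reference = [1, 1, 1, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 1, 0, 0, 0],
    "model_b": [1, 1, 0, 0, 0, 1],
}

for name, predicted in predictions.items():
    print(name, round(adjusted_rand_index(reference, predicted), 3))
```

Because the ARI is invariant to how the groups are named, it measures agreement between partitions rather than accuracy on a specific class, so it would complement, not replace, the usual classification metrics.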
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
The authors have tried to address the main concern with the paper. In my view, the gap still remains.
Reviewer 3 Report
The authors have incorporated the recommendations made in the first review. Therefore, in my opinion, the article can be accepted.