Integrating Text Classification into Topic Discovery Using Semantic Embedding Models
Round 1
Reviewer 1 Report
This paper proposes an approach that integrates text classification into topic discovery from large amounts of English textual data, such as the 20-Newsgroup and Reuters corpora. The proposed approach is quite novel. However, there are some issues to be fixed in this article:
- The abstract must be rewritten based on WHAT, WHY, and HOW.
- English should be extensively revised and corrected. It is highly inadequate for publishing. And it is strongly suggested that the whole work should be carefully revised.
- The authors need to summarize the contributions of the paper at the end of the introduction.
- The literature survey section needs to be rewritten to focus on the most important information.
- The state-of-the-art gap, and how the proposed work bridges it, are not clear.
- This paper must address one of the methods related to the text. Unfortunately, this paper ignores the embedding techniques (BERT, GloVe, and FastText). I suggest adding a new section related to one of these techniques.
- This paper must address the list of text-related features; for more information, see: (https://ejournal.um.edu.my/index.php/MJCS/article/view/29394/14289).
- The proposed method is too simple; more state-of-the-art methods need to be added.
- The conclusion could be more compact, highlighting only the salient findings.
- I cannot check the list of references; it is not available.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 2 Report
Although the article presents an interesting approach in the field of natural language processing, in its current form it lacks references entirely. I assume this is simply due to a compilation error. In its present form the work is impossible to evaluate; please submit a version with a properly formatted bibliography so that its quality can be assessed.
A quick proofreading to correct minor typos is recommended.
Rewriting some paragraphs in a more straightforward style would make the text easier to understand.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 3 Report
The abstract is well written. The key objectives and the motivation behind the research require more emphasis.
The papers considered in the related work are adequate, but I suggest including a few more papers from 2022 and 2023. Moreover, the key research challenges from each paper considered in the related work must be reported. The common research challenges should then be compiled, and the proposed problem statement must be aligned with these challenges, which emerge from the existing literature. This analysis is missing from the related work section, which requires modification.
What are the challenges in segregating the topics from the discovered topics?
Why were these three approaches (LDA, LSA, and PLSA) chosen?
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 4 Report
The paper describes an approach to text classification and integrates it into topic discovery with semantic embedding models by means of a convolutional neural network.
The advantages of the proposed method compared with other similar methods in the field are not presented. Also, there is no comparative analysis of it with other similar techniques in the field.
The Bibliographic References section is missing, and in the text of the paper, no reference is made to any bibliographic reference in the field.
The paper has some spelling mistakes.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
No comments
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The work is well presented, and the authors have responded promptly to the suggestions raised by the reviewers.
However, for methodological rigor, it would be appropriate to supplement some citations in section 2. In particular:
Regarding open information extraction:
- Guarasci, R., Damiano, E., Minutolo, A., Esposito, M., & De Pietro, G. (2020). Lexicon-grammar based open information extraction from natural language sentences in Italian. Expert Systems with Applications, 143, 112954.
- Ro, Youngbin, Yukyung Lee, and Pilsung Kang. "Multi$^2$OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT." arXiv preprint arXiv:2009.08128 (2020).
Regarding the application of machine learning and sentiment analysis techniques to the COVID-19 domain:
- Catelli, Rosario, et al. "Lexicon-based sentiment analysis to detect opinions and attitude towards COVID-19 vaccines on Twitter in Italy." Computers in Biology and Medicine 158 (2023): 106876.
- Guo, Yi, et al. "The application of artificial intelligence and data integration in COVID-19 studies: a scoping review." Journal of the American Medical Informatics Association 28.9 (2021): 2050-2067.
Regarding techniques involving semantic aspects:
- Bovi, Claudio Delli, Luca Telesca, and Roberto Navigli. "Large-scale information extraction from textual definitions through deep syntactic and semantic analysis." Transactions of the Association for Computational Linguistics 3 (2015): 529-543.
The corrections have been made accurately; only minimal typos remain.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
The work has been completed and improved considering the observations made, so it can be accepted for publication.
Author Response
Please see the attachment.
Author Response File: Author Response.docx