Topic Modelling: Going beyond Token Outputs
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1. The paper briefly compares the proposed approach with traditional topic modeling methods, but it lacks a comprehensive comparison with other state-of-the-art techniques in the field.
2. The paper may lack generalizability due to potential biases in the dataset used for evaluation. Furthermore, the proposed approach heavily relies on the keyword extraction algorithm (RAKE) to extend topic descriptions. The efficacy and quality of the extended topics are thus contingent upon RAKE's ability to extract pertinent keywords from the text data.
3. While the proposed approach exhibits promise in improving interpretability, addressing scalability concerns remains paramount. Implementing the proposed method on large-scale datasets or real-time applications may present challenges related to computational resources and processing time.
4. The paper mentions employing manual annotations to assess interpretability, quality, usefulness, and efficiency. However, providing additional details about the evaluation criteria and ensuring inter-annotator agreement is crucial for bolstering the reliability and validity of the results. Given the subjective nature of interpretability, it's imperative to offer robust evidence supporting the proposed approach.
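As an illustration of the inter-annotator agreement requested in point 4, the sketch below computes Cohen's kappa for two annotators who label the same items. It is only one possible agreement statistic; the report does not say which measure the authors used or should use, and the function name and example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Illustrative only: the choice of kappa is an assumption, not
    something specified in the review or the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set.")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical judgements: 1 = topic description judged interpretable.
    annotator_1 = [1, 1, 0, 0]
    annotator_2 = [1, 0, 0, 0]
    print(cohen_kappa(annotator_1, annotator_2))
```

Values near 1 indicate agreement well above chance, while values near 0 indicate agreement no better than chance, which is why a statistic of this kind helps support subjective interpretability ratings.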
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The paper focuses on enhancing the interpretability of topic modeling outputs beyond traditional token lists. Topic modeling is a valuable text mining technique that identifies themes across a collection of documents, usually producing a set of topics represented by frequently co-occurring tokens. However, these outputs often require manual interpretation, which can be both challenging and subjective, leading to inaccuracies in understanding the topics' meanings.
Addressing this challenge, the authors propose a novel approach that extends the output of traditional topic modeling methods by utilizing the textual data itself, rather than relying on external language sources. This method involves extracting high-scoring keywords from the text and mapping them to the topic model's outputs, aiming to enhance the interpretability of topics from a human perspective. The effectiveness of this approach is demonstrated through a series of experiments, showing that it produces more informative and interpretable topic descriptions compared to traditional methods.
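The general idea of extracting high-scoring keywords from the text and mapping them to topic tokens can be sketched as follows. This is a simplified, RAKE-style illustration and not the authors' implementation: the stopword list, the degree/frequency scoring, the overlap-based mapping rule, and all function names are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much larger).
STOPWORDS = {"the", "of", "and", "a", "in", "to", "is", "for", "on", "with"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def score_phrases(phrases):
    """Score each phrase as the sum of degree/frequency over its words."""
    freq = Counter()
    degree = Counter()
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, including the word itself
    return {p: sum(degree[w] / freq[w] for w in p) for p in set(phrases)}


def extend_topic(topic_tokens, scored, top_n=3):
    """Return the top-scoring phrases that share a word with the topic tokens."""
    tokens = set(topic_tokens)
    matches = [(score, " ".join(p)) for p, score in scored.items() if tokens & set(p)]
    matches.sort(key=lambda item: (-item[0], item[1]))
    return [phrase for _, phrase in matches[:top_n]]


if __name__ == "__main__":
    # Hypothetical corpus snippet and topic tokens.
    text = ("Topic modelling of customer reviews. "
            "Customer reviews mention battery life and screen quality.")
    scored = score_phrases(candidate_phrases(text))
    print(extend_topic(["battery", "screen"], scored))
```

Because the phrases come from the corpus itself, a sketch like this needs no external language resource, which is the property the report highlights; the quality of the result still depends on how well the keyword extraction step performs.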
This study contributes to the field of text mining by offering a technique that reduces the reliance on external sources for topic interpretation, thus addressing issues related to the availability, relevance, and privacy concerns associated with external resources. The proposed method's capability to generate more contextually rich and interpretable topic descriptions could significantly benefit applications such as decision-making and information retrieval, where understanding the underlying themes of textual data is crucial.
As a drawback, the paper does not extensively compare the proposed approach with existing baseline methods in topic modeling. Such comparisons could provide a clearer picture of the method's relative performance and innovation.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Accept in present form.
Author Response
We have addressed the editor's comments.