1. Introduction
In today’s information age, dominated by social media, online platforms have become vital venues for information dissemination. Every day, hundreds of millions of users share various types of information on these platforms, including numerous high-profile events. This information often contains rich personal opinions and emotional biases. The rise of social media has made information dissemination faster and more widespread: a single post or comment can spark extensive discussion and reaction within a short period. This rapid spread and large-scale interaction have not only transformed the way that information circulates but have also profoundly influenced the formation and evolution of public opinion. Through posting, commenting, and sharing, users express their personal stances, creating a complex information ecosystem. This phenomenon has brought about unprecedented social changes and challenges [1]. While the free flow of information promotes public participation, information on social media may include false or misleading content, and even malicious or hate speech, all of which pose potential threats to public opinion and social stability [2]. Therefore, in this dynamic and ever-changing environment, understanding and harnessing the power of social media has become a crucial topic in the information age [3].
In the context of public opinion evolution, the concept of symmetry plays a central role in understanding the balance and dynamics of emotional responses. Symmetry, in emotional analysis, refers to the equilibrium between the valence and arousal dimensions. This study applies the concept of symmetry to both the design and analysis of the research methodology for public opinion evolution. Specifically, we examine how emotional responses during public opinion events, such as the Zibo Barbecue Incident, exhibit symmetrical patterns in different stages. By capturing the balance between emotional intensity and polarity, we gain a deeper insight into the emotional dynamics of public sentiment, offering a more comprehensive understanding of how public opinion evolves over time.
The “Zibo Barbecue Incident” not only garnered widespread attention in a short period but also sparked intense public discussion on online platforms. Zibo barbecue first gained attention on Douyin, and during the COVID-19 pandemic, the Zibo local government’s warm hospitality and farewell barbecue banquets for quarantined university students laid the groundwork for its popularity. After the pandemic, university students began organizing group trips via high-speed trains to Zibo to taste its barbecue, sharing their experiences on social media. This gradually brought Zibo barbecue into the spotlight. Its unique features, such as “large meat skewers, flatbread, and scallions” and the ceremonial dining style of “one table, one grill, and one rolled pancake”, provided visitors with a distinctive dining experience. The combined efforts of the Zibo municipal government and local businesses, including market regulation, maintaining merchant reputations, and launching dedicated transportation routes, significantly contributed to the widespread popularity of Zibo barbecue. This paper takes the “Zibo Barbecue Incident” as a case study, using Python web scraping to collect comments from videos related to the event on the BiliBili platform as the dataset. Textual topic mining was performed using Top2Vec, and a custom sentiment dictionary was constructed based on the Dalian University of Technology Sentiment Dictionary for annotating the text. The sentiment classification was then evaluated using the RoBERTa model. According to the lifecycle theory, the public opinion dissemination cycle was divided into the initiation, outbreak, decline, and resolution stages. Based on the two-dimensional theory of emotion, comments were classified along the valence and arousal dimensions, and two-dimensional topic extraction was performed accordingly.
Finally, the evolution of sentiment means, the two-dimensional topics, and the evolution of comment popularity were analyzed to characterize the evolution of public opinion. This approach provides valuable insights for understanding and addressing related issues and offers a useful reference for public opinion analysis in similar events.
The innovations of this study are as follows:
(1) Traditional sentiment analysis typically focuses only on the categories of emotions, neglecting the intensity of emotions. This study integrates the two-dimensional theory of emotion, analyzing changes in different emotional states from the dimensions of valence and arousal. This approach allows for a more comprehensive capture and understanding of the emotional dynamics in the evolution of public opinion.
(2) By combining a sentiment dictionary with deep learning models, this study addresses the limitation of low efficiency in traditional manual annotation. The sentiment dictionary provides rich prior knowledge to help identify emotional tendencies in the text, while the deep learning model, through automatic learning from large datasets, further enhances classification accuracy and generalization ability. Through model training, the accuracy of sentiment classification results is more comprehensively evaluated and validated.
Section 2 reviews the current state of research methods in online public opinion.
Section 3 outlines the research framework and methodology of this study.
Section 4 presents the research process and results analysis.
Section 5 concludes the study.
2. Related Research
Compared to traditional public opinion, online public opinion relies on major information exchange platforms, offering a broader reach that can connect with groups across different regions, age groups, professions, and interests. Traditional public opinion is limited by geography and media channels, whereas online public opinion transcends these limitations, often exerting a more extensive influence [4]. Analyzing comment data on social media can help us understand public sentiment and the patterns of public opinion evolution [5]. In recent years, sentiment classification and topic mining have been widely applied and extensively researched in this field [6].
2.1. Research on Sentiment Classification Methods
Sentiment classification methods can be categorized into fine-grained and coarse-grained approaches. Fine-grained analysis involves sentiment classification methods based on emotional polarity and intensity using sentiment dictionaries. Sentiment classification methods based on sentiment dictionaries use pre-constructed lexicons to determine the sentiment orientation of the text by matching and analyzing the emotional vocabulary within the text. Constructing a sentiment dictionary involves filtering and categorizing vocabulary to create a lexicon that accurately reflects emotional nuances. Zhang et al. expanded sentiment dictionaries by extracting and constructing related dictionaries, such as network terminology dictionaries and negation dictionaries, to enhance topic monitoring on Weibo [7]. Nie et al. proposed a method that combines semantic mapping functions with dictionary construction to capture the rich emotions hidden in hotel review texts [8]. Liu et al. combined sentiment dictionaries with pre-trained word embeddings and used TF-IDF values for weighting. By calculating the weights of sentiment words and neutral words separately and highlighting the role of sentiment words in sentence vectors, they improved the accuracy of sentiment analysis [9].
Coarse-grained text sentiment analysis methods involve using machine learning or deep learning techniques to classify the overall sentiment of the entire text.
Machine-learning-based sentiment classification methods rely on training large-scale labeled datasets to learn patterns and features of emotional expression. By extracting text features and applying classifiers, these methods identify sentiments within the text. Stefanis et al. explored the emotions related to daily COVID-19 monitoring reports posted on Facebook pages and used machine learning algorithms to predict sentiment classifications [10]. Rahman et al. proposed a multilayer classification model that employs supervised machine learning techniques, achieving better recall rates in sentiment classification tasks [11]. Hokijuliandy et al. used a combination of SVM classification and chi-square feature selection methods for sentiment analysis. Their analysis of user comments revealed the main trends in positive reviews [12].
Deep learning models use word embedding techniques (such as Word2Vec and GloVe) to simplify feature engineering and capture semantic information. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks handle sequential data; Convolutional Neural Networks (CNNs) capture local features [13] and can incorporate attention mechanisms; and Transformer models (such as BERT) capture global dependencies [14] and improve generalization through large-scale pre-training and fine-tuning. Sisi et al. used a CNN model, combining encoded emotional sequence features with traditional word embedding features for email sentiment classification [15]. Arbane et al. used Bi-LSTM to reveal various issues related to COVID-19 public opinion, aiming to understand people’s concerns during the pandemic [16]. Pota et al. used the BERT model to evaluate the impact of tweet pre-processing operations on sentiment analysis performance. They considered available data in two languages (English and Italian) to assess language dependency [17]. He et al. proposed a BERT-CNN-BiLSTM-Att hybrid model for text sentiment analysis, addressing issues of ambiguity and feature extraction in the sentiment analysis process [18].
2.2. Research on Topic Mining Methods
Topic mining, as an important technique in natural language processing, aims to discover hidden semantic structures and thematic information from text data. The traditional topic model, Latent Dirichlet Allocation (LDA), proposed by Blei et al., represents each document as a mixture of topics and each topic as a probability distribution over words. It has been widely applied to thematic analysis and document summarization tasks [19]. Zhao et al. used word frequency statistics and LDA methods to identify key terms related to tourism in Nanjing, thereby promoting tourism development [20]. Uthirapathy et al. used the LDA method to identify topics related to climate change in an existing Twitter dataset of public discussions [21]. Yoo et al. utilized LDA and Word2Vec algorithms to extract papers related to specific keywords from research on COVID-19 and identified detailed topics [22].
In summary, there are some limitations in existing research. Current mainstream topic mining methods mainly rely on LDA models and Word2Vec technology. However, these methods have limited capabilities in understanding complex semantic relationships, particularly when dealing with unstructured social media text, where capturing deep semantic information is challenging. Although deep-learning-based methods provide various metrics to evaluate model performance, they often require cumbersome manual annotation, which is a massive engineering task for large-scale data and carries a significant degree of subjectivity. Additionally, relying solely on models for sentiment analysis lacks theoretical support and has lower credibility.
Therefore, this study develops a public opinion topic analysis framework based on the two-dimensional theory of emotion and lifecycle theory, using the Top2Vec topic mining method. In addition, it combines a sentiment dictionary with the RoBERTa model to perform sentiment polarity analysis on public opinion comments. The sentiment dictionary is used to calculate sentiment values and perform initial sentiment classification, while the RoBERTa model is used to evaluate the accuracy of sentiment classification.
4. Results
4.1. Data Collection and Pre-Processing
This study employs Python web scraping to collect comment texts related to the “Zibo Barbecue Incident” from the BiliBili platform. The dataset was collected using the keywords “Zibo Explosion” and “Zibo Barbecue” and includes comments from content on BiliBili between 2023-03-22 and 2023-07-01. The texts are segmented with the Jieba segmentation tool, supplemented by a custom vocabulary, and a stopword list is used to filter out stopwords. After removing duplicate data and meaningless texts, a total of 17,873 valid comments are obtained.
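The cleaning steps described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function name `preprocess` and the sample data are hypothetical, and a whitespace tokenizer stands in for Jieba so that the example runs without third-party packages. With Jieba installed, `jieba.lcut` could be passed as `tokenize`, and a custom vocabulary loaded beforehand with `jieba.load_userdict`.

```python
import re

def preprocess(comments, stopwords, tokenize=str.split):
    """Deduplicate comments, tokenize them, and drop stopwords and empty results."""
    seen = set()
    cleaned = []
    for text in comments:
        # Normalize whitespace so trivially different copies count as duplicates.
        text = re.sub(r"\s+", " ", text).strip()
        if not text or text in seen:
            continue
        seen.add(text)
        tokens = [t for t in tokenize(text) if t not in stopwords]
        if tokens:  # discard comments that contain nothing but stopwords
            cleaned.append(tokens)
    return cleaned

# Hypothetical sample: one duplicate, one blank comment, one stopword-only comment.
sample = ["the barbecue is great", "the barbecue is great", "   ", "the the"]
print(preprocess(sample, stopwords={"the", "is"}))  # [['barbecue', 'great']]
```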
4.2. Lifecycle Classification
The distribution of public opinion data over time is shown in Figure 4. Using the public opinion evolution cycle classification method, the data for the “Zibo Barbecue Incident” are divided into four stages: the Initiation Stage (2023-03-22 to 2023-04-07), the Outbreak Stage (2023-04-08 to 2023-04-11), the Decline Stage (2023-04-12 to 2023-05-06), and the Resolution Stage (2023-05-07 to 2023-07-01).
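Given these date ranges, assigning a comment to a lifecycle stage reduces to an interval lookup. The sketch below uses the stage boundaries stated above; the names `STAGES` and `stage_of` are illustrative and do not come from the study.

```python
from datetime import date

# Stage boundaries as reported in this section (both endpoints inclusive).
STAGES = [
    ("Initiation", date(2023, 3, 22), date(2023, 4, 7)),
    ("Outbreak", date(2023, 4, 8), date(2023, 4, 11)),
    ("Decline", date(2023, 4, 12), date(2023, 5, 6)),
    ("Resolution", date(2023, 5, 7), date(2023, 7, 1)),
]

def stage_of(day):
    """Return the lifecycle stage containing `day`, or None if it is out of range."""
    for name, start, end in STAGES:
        if start <= day <= end:
            return name
    return None

print(stage_of(date(2023, 4, 10)))  # Outbreak
```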
4.3. Sentiment Analysis
4.3.1. Sentiment Calculation
In this study, the sentiment score and emotion categories of each text are calculated from the vocabulary, emotion labels, and intensity scores of the Dalian University of Technology Sentiment Dictionary. First, each text is segmented into words or phrases using the Jieba tokenizer. The segmented results are then matched against the dictionary to extract all sentiment words. The dictionary covers multiple emotion categories (such as joy, anger, and sadness) and assigns each word a sentiment intensity score. Each matched sentiment word is weighted by its dictionary score, and the total sentiment score of the text is calculated from the matched words and their scores. The emotion categories that occur in each text and their frequencies are also counted. Table 3 illustrates this procedure: the score of each sentiment word is weighted, the comprehensive sentiment score of the text is derived, and the predominant emotion categories and their counts are identified.
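A dictionary-based scoring step of this kind can be sketched as below. The lexicon entries, the English words, and the weighting rule (polarity sign times intensity, summed over matched words) are assumptions made for illustration; the exact weighting used with the Dalian University of Technology Sentiment Dictionary is not specified here.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> (emotion category, intensity, polarity sign).
LEXICON = {
    "delicious": ("Good", 5, 1),
    "happy": ("Happy", 7, 1),
    "overpriced": ("Disgust", 5, -1),
}

def score_text(tokens, lexicon=LEXICON):
    """Return (total sentiment score, {emotion category: count}) for one text."""
    total = 0
    categories = Counter()
    for tok in tokens:
        entry = lexicon.get(tok)
        if entry is None:
            continue  # word is not a sentiment word
        category, intensity, sign = entry
        total += sign * intensity  # assumed weighting: signed intensity
        categories[category] += 1
    return total, dict(categories)

print(score_text(["delicious", "but", "overpriced", "delicious"]))
# (5, {'Good': 2, 'Disgust': 1})
```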
4.3.2. RoBERTa Model Sentiment Classification
This study uses the RoBERTa model to evaluate the accuracy of sentiment labels and further classify the comments annotated with the sentiment dictionary. The experimental environment for the research includes Windows, Jupyter Notebook as the development environment, Python 3.8 as the programming language, and TensorFlow 2.10.0 as the deep learning framework.
(1) Dataset splitting: From the corpus, 20% is randomly selected as the test set. Then, 20% is randomly selected from the remaining 80% of the dataset to form the validation set, with the remaining portion used as the training set. Overall, the dataset is divided into test, validation, and training sets in a ratio of 0.2:0.16:0.64.
(2) Input processing: To prepare the text data for RoBERTa, each input sentence is tokenized using the tokenizer provided with the pre-trained RoBERTa model. The tokenizer splits the text into tokens (subwords, words, or phrases) and maps them to integer indices in the RoBERTa vocabulary. Special tokens such as [CLS] (classification token) and [SEP] (separator token) are added to mark the start and end of a sequence, ensuring that the model processes the input in the expected format. Additionally, an attention mask is generated for each sequence, indicating which tokens should be attended to during processing and which should be ignored (e.g., padding tokens).
(3) Training procedure: The encoded data are fed into the RoBERTa model, which uses a multilayer bidirectional transformer architecture to capture contextual information. The model is trained to minimize the cross-entropy loss between the predicted and actual sentiment labels. To prevent overfitting, early stopping criteria were applied. After six epochs, the training accuracy continued to improve, but the validation accuracy plateaued, indicating overfitting. As a result, the training process was capped at six epochs.
(4) Performance metrics: Accuracy and loss curves for both the training and validation sets across all epochs are plotted and analyzed to monitor the model’s performance. These curves are depicted in Figure 5.
(5) Model evaluation and comparison: In order to assess the performance of the RoBERTa model and compare it with other models, we evaluated the accuracy of four different approaches: RoBERTa, BERT, LSTM, and BiLSTM. The results of these comparisons are summarized in Table 4.
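The two-step split in step (1) can be illustrated with the standard library alone. The function name and the fixed random seed are assumptions added for reproducibility of the example; the proportions follow the description above.

```python
import random

def split_dataset(items, test_frac=0.2, val_frac=0.2, seed=42):
    """Hold out `test_frac` for testing, then `val_frac` of the rest for validation."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seed is an assumption, for repeatability
    n_test = round(len(items) * test_frac)
    test, rest = items[:n_test], items[n_test:]
    n_val = round(len(rest) * val_frac)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 640 160 200
```

Taking 20% of the remaining 80% yields 16% of the full corpus for validation, which gives the stated 0.2:0.16:0.64 ratio.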
As shown in Table 4 and Figure 5, after training, the RoBERTa model achieved an accuracy of 98.67% on the validation set and 98.46% on the test set and demonstrated superior performance in terms of precision, recall, and F1-score, indicating good model fitting performance. Because the labels were produced by the sentiment dictionary, these results show that the model reproduces the dictionary-based annotation with high consistency, which supports the feasibility and practical value of annotating comment sentiment values based on the dictionary.
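For reference, accuracy, precision, recall, and F1-score can be computed from true and predicted labels as in the sketch below. It reports per-label values; whether the study used per-label, macro, or weighted averages is not stated, so the function `evaluate` and the toy labels are illustrative only.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, {label: (precision, recall, f1)}) for two label lists."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_label = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_label[label] = (precision, recall, f1)
    return accuracy, per_label

# Hypothetical labels: one positive comment is misclassified as negative.
acc, report = evaluate(["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "neg"])
print(acc)  # 0.75
```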
4.3.3. Two-Dimensional Emotion Analysis
Based on the lifecycle and emotional valence, “Happy”, “Good”, and “Surprise” are categorized as high-valence emotions, while “Sadness”, “Anger”, “Disgust”, and “Fear” are categorized as low-valence emotions. The results are shown in Table 5 and Figure 6.
From the valence dimension, it can be observed that in the Initiation Stage, the proportion of high-valence and low-valence emotions in public comments is relatively low. This suggests limited emotional feedback, likely due to the event being in its early stages and attracting less public attention. During the Outbreak Stage, there is a significant increase in both high-valence and low-valence emotions, with proportions being nearly equal. This reflects a vigorous reaction and diverse public sentiment as the controversy intensifies. In the Decline Stage, both high-valence and low-valence emotions show a similar but lower proportion, indicating weakened emotional responses as the event wanes. In the Resolution Stage, high-valence emotions slightly surpass low-valence emotions. Although overall emotional feedback remains balanced, positive comments slightly dominate, which may indicate that the public feels relatively satisfied with the resolution of the issue.
Based on emotional arousal, “Good”, “Sadness”, and “Disgust” are categorized as low-arousal emotions, while “Happy”, “Anger”, “Surprise”, and “Fear” are categorized as high-arousal emotions. The results are shown in Table 6 and Figure 7.
Table 6 reports, for each arousal level, the proportion of comments falling into each stage; the four values for each level sum to 100%, so each figure is interpreted as a stage’s share of that level’s comments. The Initiation Stage accounts for 3.39% of high-arousal comments and 5.39% of low-arousal comments, indicating a relatively calm public emotional response at this time. In the Outbreak Stage, these shares rise sharply to 36.29% and 37.86%, respectively, even though the stage lasts only four days, showing that the event triggered intense public attention and emotional reactions. The Decline Stage holds the largest shares, 52.89% of high-arousal and 50.64% of low-arousal comments, indicating that public emotions remained active even as the event gradually faded. Finally, in the Resolution Stage, the shares drop to 7.43% and 6.11%, reflecting a significant decrease in emotional engagement after the event’s resolution, with comments becoming calmer.
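The two-dimensional categorization can be expressed as a lookup from emotion category to (valence, arousal) level, using the groupings given above. The helper `stage_shares` assumes that a reported proportion is the share of one arousal level's comments that falls into each stage; that reading, along with all names and sample records below, is an assumption for illustration.

```python
from collections import Counter

# Emotion category -> (valence level, arousal level), as grouped in this section.
EMOTION_DIMENSIONS = {
    "Happy": ("high", "high"),
    "Good": ("high", "low"),
    "Surprise": ("high", "high"),
    "Sadness": ("low", "low"),
    "Anger": ("low", "high"),
    "Disgust": ("low", "low"),
    "Fear": ("low", "high"),
}

def stage_shares(records, level="high"):
    """Percentage of `level`-arousal comments per stage.

    `records` is an iterable of (stage, emotion category) pairs.
    """
    counts = Counter(
        stage for stage, category in records
        if EMOTION_DIMENSIONS[category][1] == level
    )
    total = sum(counts.values())
    return {stage: 100 * n / total for stage, n in counts.items()}

# Hypothetical records, not data from the study.
records = [("Outbreak", "Happy"), ("Outbreak", "Anger"),
           ("Decline", "Fear"), ("Decline", "Good")]
print(stage_shares(records, "low"))  # {'Decline': 100.0}
```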
4.3.4. Evolution of Sentiment Mean
Figure 8 shows the evolution of the average sentiment values of public comments on the same date throughout the public opinion period. Significant fluctuations in sentiment values are observed between different dates. During the Resolution Stage, the number of comments decreases sharply, with some dates having only a few comments. In such cases, the sentiment values calculated from a small number of comments may cause extreme fluctuations in the results. Therefore, sentiment values from the Resolution Stage are excluded from the analysis, focusing only on data from stages with a higher volume of comments.
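The daily mean with the Resolution Stage excluded can be sketched as follows. The cutoff date is the last day of the Decline Stage as defined in Section 4.2; the function name and the sample scores are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def daily_sentiment_mean(records, cutoff=date(2023, 5, 6)):
    """Average sentiment score per date, ignoring dates after `cutoff`.

    `records` is an iterable of (date, sentiment score) pairs.
    """
    by_day = defaultdict(list)
    for day, score in records:
        if day <= cutoff:  # drop the sparse Resolution Stage
            by_day[day].append(score)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}

# Hypothetical scores: the last record falls in the Resolution Stage and is dropped.
records = [(date(2023, 4, 8), 4), (date(2023, 4, 8), 2), (date(2023, 5, 10), -9)]
print(daily_sentiment_mean(records))
```

A single extreme comment on a sparse date, such as the score of -9 here, would otherwise determine that date's mean on its own, which is the instability the exclusion avoids.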
By analyzing the sentiment means during the high-comment-volume stages, we can more clearly capture the evolution of public sentiment throughout the event. Significant fluctuations in sentiment are closely related to specific points in time. In the Initiation Stage, public understanding of the event was limited and comments were highly diverse, leading to noticeable positive and negative fluctuations in the sentiment means. These polarized comments reflect emotional uncertainty and incomplete information at the early stage of the event. As time progresses and more information is disclosed, public understanding of the event deepens. During the Outbreak Stage, the overall sentiment mean reaches its peak, which corresponds to the public’s positive feedback on Zibo barbecue after the pandemic ended and people experienced it firsthand. In the Decline Stage, the gradual stabilization of sentiment reflects the diminishing impact of the event.
4.4. Topic Analysis
Based on the Two-Dimensional Theory of Emotion, this study categorizes texts according to valence and arousal dimensions and uses Top2Vec for topic extraction. The distribution of topics and keywords under these two dimensions is shown in Table 7.
From the perspective of valence, Topic 1 illustrates a pleasant dining experience with an overall positive emotional inclination. Topic 2 includes both positive emotions such as “honest” and “reassured” as well as negative emotions like “short weight”, resulting in a mixed emotional tendency. Topic 3 involves positive experiences related to the city and marketing, with an overall positive emotional inclination. Topic 4 encompasses both positive aspects such as “harmonious governance” and negative aspects like “deceitful”, resulting in a more complex emotional tone. This indicates that the quality of dining experiences and market management significantly affects the public’s emotional experience from positive to negative.
From the perspective of arousal, Topic 1 mainly involves novel and special experiences with moderate emotional arousal, displaying a certain level of excitement. Topic 2 includes elements of surprise and astonishment, with higher arousal that may provoke stronger emotional reactions. Topic 3 has lower emotional arousal, conveying a calmer emotional tone. Topic 4 primarily describes local characteristics and stable experiences, also with low arousal, conveying a sense of calm and satisfaction. This suggests that integrity and pricing significantly influence public emotional responses, highlighting the importance of better serving public needs.
From this, we can conclude that public emotional experiences are diverse, influenced by factors such as dining out and market management. Novel and special experiences can enhance positive emotions, while integrity and pricing have significant impacts on emotional responses, emphasizing the importance of maintaining freshness and integrity. Despite an overall positive emotional tendency, issues in market management still provoke negative emotions, indicating a need for further supervision and management. Additionally, stable and reliable experiences convey calm and satisfaction, showing that stability and reliability are key factors in improving public satisfaction. Therefore, focusing on diverse factors, particularly novelty, integrity, fairness, and stability, is crucial for enhancing public emotional experiences and overall well-being.
4.5. Public Opinion Evolution Analysis
The details of the “Zibo Barbecue Incident” are shown in Table 8. Due to the pandemic, students from Shandong University were quarantined at home in Zibo. During this period, the local government warmly hosted them for free and arranged a barbecue for them before their departure. This event added warmth and human touch to the image of Zibo city and marked the initiation of the incident, with relatively low public attention at this time.
In the Initiation Stage, public sentiment began to fluctuate gradually. This corresponds to the topic of “students group visiting Zibo for barbecue” gaining traction on social media from 5 April. Extensive discussion and sharing of Zibo barbecue on social platforms rapidly increased the event’s popularity, causing a surge in public attention. However, public understanding was still limited, and sentiment was mixed. In the Outbreak Stage, corresponding to 8–10 April 2023, public sentiment was highly positive. The incident gained traction due to the confirmed integrity of local businesses and the release of several favorable policies by local authorities, making it a hot topic on social media. In the Decline Stage, although the event remained a topic of discussion, attention began to decline gradually and sentiment levels stabilized, indicating that public emotional responses had become more stable.
Keywords in the public opinion themes such as “political stability”, “honest”, “reassuring”, and “dishonest” indicate the public’s focus on policies and businesses. In fact, the confirmation of local favorable policies and conscientious businesses resulted in positive public sentiment regarding the “Zibo Barbecue Incident”. Overall, conscientious businesses and positive government policies contribute to favorable evaluations and development in the local area.
5. Conclusions
This study proposed a method for analyzing the evolution of public sentiment based on the Two-Dimensional Theory of Emotion and the Top2Vec-RoBERTa model, incorporating a sentiment analysis approach that combines sentiment dictionaries with deep learning techniques. By integrating these two methods, sentiment analysis results regarding the central figures in public opinion were obtained. Using the “Zibo Barbecue Incident” as a case study, 17,873 comments from BiliBili videos related to the event were collected as samples. The sentiment of these comments was annotated and analyzed using the Dalian University of Technology sentiment dictionary and the RoBERTa model, which reduced the workload of manual annotation. Top2Vec, combined with the Two-Dimensional Theory of Emotion, was used to analyze changes in emotional states from both the valence and arousal dimensions, providing a more comprehensive understanding of the emotional dynamics throughout the development of public opinion. The RoBERTa model was used to evaluate the sentiment classification and reached an accuracy of 98.46% on the test set. The analysis of sentiment mean evolution, two-dimensional topic analysis, and comment popularity evolution provided deeper insights and solutions for related issues. A limitation of this study is that emojis were not taken into account when sentiment values were calculated with the sentiment dictionary. Future research will consider more granular sentiment classification to improve sentiment analysis accuracy.