Keyword-Enhanced Multi-Expert Framework for Hate Speech Detection
Abstract
1. Introduction
- To better examine the interaction between hate and sentiment information, we propose a multi-task learning (MTL) model tailored to hate speech detection, which extracts features with shared experts and task-specific experts and fuses them through feature-filtering gates (an illustrative sketch follows this list).
- Given that previous work makes little use of important word information, we introduce contrastive learning into the pre-trained model so that it can better identify keywords in the text.
- Experimental results on three benchmark datasets demonstrate that our model is effective for hate speech detection.
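As a concrete illustration of the first contribution, the following is a minimal PyTorch sketch of a shared/task-specific expert layout fused through feature-filtering gates. The layer sizes, expert counts, the sigmoid-mask gate formulation, the mean-pooling of experts, and the head dimensions are illustrative assumptions, not the exact KMT architecture.

```python
import torch
import torch.nn as nn


def make_expert(hidden: int) -> nn.Module:
    """A single expert: one feed-forward block over the encoder output."""
    return nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())


class FilterGate(nn.Module):
    """Illustrative feature-filtering gate: a learned sigmoid mask decides
    how much of the shared-expert output to keep before fusion."""

    def __init__(self, hidden: int):
        super().__init__()
        self.proj = nn.Linear(2 * hidden, hidden)

    def forward(self, task_feat: torch.Tensor, shared_feat: torch.Tensor) -> torch.Tensor:
        mask = torch.sigmoid(self.proj(torch.cat([task_feat, shared_feat], dim=-1)))
        return mask * shared_feat  # element-wise filtering of the shared features


class MultiExpertMTL(nn.Module):
    """Shared + task-specific experts with feature-filtering fusion for a
    hate-detection head and an auxiliary sentiment head."""

    def __init__(self, hidden: int = 768, n_shared: int = 2, n_specific: int = 2):
        super().__init__()
        self.shared = nn.ModuleList([make_expert(hidden) for _ in range(n_shared)])
        self.hate_experts = nn.ModuleList([make_expert(hidden) for _ in range(n_specific)])
        self.senti_experts = nn.ModuleList([make_expert(hidden) for _ in range(n_specific)])
        self.hate_gate = FilterGate(hidden)
        self.senti_gate = FilterGate(hidden)
        self.hate_head = nn.Linear(hidden, 1)   # regression score (e.g., Ruddit) or binary logit
        self.senti_head = nn.Linear(hidden, 3)  # e.g., neutral / negative / positive

    def forward(self, x: torch.Tensor):
        # x: (batch, hidden) pooled sentence representation from a pre-trained encoder.
        shared = torch.stack([e(x) for e in self.shared]).mean(dim=0)
        hate = torch.stack([e(x) for e in self.hate_experts]).mean(dim=0)
        senti = torch.stack([e(x) for e in self.senti_experts]).mean(dim=0)
        hate_fused = hate + self.hate_gate(hate, shared)     # keep only useful shared features
        senti_fused = senti + self.senti_gate(senti, shared)
        return self.hate_head(hate_fused), self.senti_head(senti_fused)


# Usage with a random batch of pooled embeddings.
hate_out, senti_out = MultiExpertMTL()(torch.randn(8, 768))
```

The uniform mean over experts is a simplification; a PLE-style model would typically weight experts with a learned gating distribution per task.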
2. Related Work
3. Methodology
3.1. Multi-Task Learning Module
3.2. Feature-Filtering Module
3.3. Contrastive Learning Module
3.4. Loss Function
4. Experiments
4.1. Datasets
4.2. Training Details
4.3. Comparison with Baselines
- (1) HateBERT performs much better than BERT on all three datasets. In particular, the improvement is significant on the Abuse dataset, which indicates that HateBERT better captures the semantic relationships between words in hate speech and is therefore better suited to hate speech detection.
- (2) Our proposed model KMT achieves good performance on all three datasets. Compared with the best-performing baseline, KMT increases the Pearson correlation by 0.006 on the Ruddit dataset and improves the F1 score by nearly 0.028 on the Abuse dataset. These results illustrate the effectiveness of our method (a brief sketch of the evaluation metrics follows this list).
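The tables below report Pearson correlation and MSE for the Ruddit regression task and F1 for the classification tasks. The following sketch shows how such scores can be computed; the use of SciPy/scikit-learn, the toy values, and the macro averaging are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import f1_score, mean_squared_error

# Regression task (Ruddit): Pearson correlation (higher is better), MSE (lower is better).
y_true_reg = np.array([0.12, -0.40, 0.75, -0.05, 0.33])
y_pred_reg = np.array([0.10, -0.35, 0.70, 0.00, 0.30])
pear, _ = pearsonr(y_true_reg, y_pred_reg)
mse = mean_squared_error(y_true_reg, y_pred_reg)

# Classification tasks (Offen: 2 classes, Abuse: 3 classes): F1 score.
y_true_cls = [0, 1, 2, 1, 0, 2]
y_pred_cls = [0, 1, 1, 1, 0, 2]
f1 = f1_score(y_true_cls, y_pred_cls, average="macro")

print(f"Pearson={pear:.3f}  MSE={mse:.4f}  macro-F1={f1:.3f}")
```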
4.4. Ablation Experiments
- (1) When the contrastive learning module is removed, performance on the two datasets decreases the most, indicating that swear words and certain identity terms in the sentences are highly correlated with hate speech. This shows that the contrastive learning module improves the model's sensitivity to keywords and thus effectively improves hate speech detection (an illustrative objective is sketched after this list).
- (2) When the MTL module is removed, performance on all three datasets also decreases, indicating that adding sentiment information effectively assists hate speech detection.
- (3) When the feature-filtering module is replaced with a basic gating network, performance also decreases slightly, indicating that the proposed feature-filtering gates fuse the outputs of the various experts more effectively and reduce the influence of noise.
- (4) KMT outperforms the ablated variants, which directly demonstrates the importance and effectiveness of sentence keyword information and external sentiment information for hate speech detection.
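For reference, the keyword-sensitivity objective referred to in point (1) can be realized with an InfoNCE-style contrastive loss. The sketch below is an assumption-laden illustration: the construction of positive/negative views (keyword-preserving vs. keyword-masked), the temperature, and the function name `keyword_contrastive_loss` are not taken from the paper.

```python
import torch
import torch.nn.functional as F


def keyword_contrastive_loss(anchor: torch.Tensor,
                             positive: torch.Tensor,
                             negatives: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss.
    anchor:    (B, H) sentence embeddings.
    positive:  (B, H) embeddings of keyword-preserving views.
    negatives: (B, K, H) embeddings of keyword-masked views.
    Pulling anchors toward keyword-preserving views and pushing them away
    from keyword-masked ones encourages sensitivity to the keywords."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature      # (B, 1)
    neg_sim = torch.einsum("bh,bkh->bk", anchor, negatives) / temperature  # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1)                          # (B, 1+K)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)  # positive view sits at index 0


# Example with random embeddings: batch of 4, hidden size 768, 5 negatives each.
loss = keyword_contrastive_loss(torch.randn(4, 768), torch.randn(4, 768), torch.randn(4, 5, 768))
```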
4.5. Effect of Number of Experts
4.6. Effect of Extraction Network Layer Number
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Munro, E.R. The Protection of Children Online: A Brief Scoping Review to Identify Vulnerable Groups; Childhood Wellbeing Research Centre: London, UK, 2011.
- Jahan, M.S.; Oussalah, M. A systematic review of hate speech automatic detection using natural language processing. arXiv 2021, arXiv:2106.00742.
- Zhang, Z.; Luo, L. Hate speech detection: A solved problem? The challenging case of long tail on Twitter. Semant. Web 2019, 10, 925–945.
- Tekiroglu, S.S.; Chung, Y.L.; Guerini, M. Generating counter narratives against online hate speech: Data and strategies. arXiv 2020, arXiv:2004.04216.
- Hada, R.; Sudhir, S.; Mishra, P.; Yannakoudakis, H.; Mohammad, S.M.; Shutova, E. Ruddit: Norms of offensiveness for English Reddit comments. arXiv 2021, arXiv:2106.05664.
- Wang, C. Interpreting neural network hate speech classifiers. In Proceedings of the 2nd Workshop on Abusive Language Online (ALW2), Brussels, Belgium, 31 October 2018; pp. 86–92.
- Chiril, P.; Pamungkas, E.W.; Benamara, F.; Moriceau, V.; Patti, V. Emotionally informed hate speech detection: A multi-target perspective. Cogn. Comput. 2022, 14, 322–352.
- Kapil, P.; Ekbal, A. A deep neural network based multi-task learning approach to hate speech detection. Knowl.-Based Syst. 2020, 210, 106458.
- Zhou, X.; Yong, Y.; Fan, X.; Ren, G.; Song, Y.; Diao, Y.; Yang, L.; Lin, H. Hate speech detection based on sentiment knowledge sharing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Virtual, 1–6 August 2021; pp. 7158–7166.
- Sap, M.; Card, D.; Gabriel, S.; Choi, Y.; Smith, N.A. The risk of racial bias in hate speech detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1668–1678.
- Tang, H.; Liu, J.; Zhao, M.; Gong, X. Progressive layered extraction (PLE): A novel multi-task learning (MTL) model for personalized recommendations. In Proceedings of the Fourteenth ACM Conference on Recommender Systems, New York, NY, USA, 22 September 2020; pp. 269–278.
- Lai, T.; Ji, H.; Bui, T.; Tran, Q.H.; Dernoncourt, F.; Chang, W. A context-dependent gated module for incorporating symbolic semantics into event coreference resolution. arXiv 2021, arXiv:2104.01697.
- Hu, J.; Li, Z.; Chen, Z.; Li, Z.; Wan, X.; Chang, T.H. Graph enhanced contrastive learning for radiology findings summarization. arXiv 2022, arXiv:2204.00203.
- Kshirsagar, R.; Cukuvac, T.; McKeown, K.; McGregor, S. Predictive embeddings for hate speech detection on Twitter. arXiv 2018, arXiv:1809.10644.
- Gou, J.; Yu, B.; Maybank, S.J.; Tao, D. Knowledge distillation: A survey. Int. J. Comput. Vis. 2021, 129, 1789–1819.
- Liu, H.; Burnap, P.; Alorainy, W.; Williams, M.L. Fuzzy multi-task learning for hate speech type identification. In Proceedings of the World Wide Web Conference, New York, NY, USA, 13 May 2019; pp. 3006–3012.
- Ousidhoum, N.; Lin, Z.; Zhang, H.; Song, Y.; Yeung, D.Y. Multilingual and multi-aspect hate speech analysis. arXiv 2019, arXiv:1908.11049.
- Gou, J.; He, X.; Lu, J.; Ma, H.; Ou, W.; Yuan, Y. A class-specific mean vector-based weighted competitive and collaborative representation method for classification. Neural Netw. 2022, 150, 12–27.
- Gou, J.; Yuan, X.; Du, L.; Xia, S.; Yi, Z. Hierarchical graph augmented deep collaborative dictionary learning for classification. IEEE Trans. Intell. Transp. Syst. 2022, 23, 25308–25322.
- Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality reduction by learning an invariant mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1735–1742.
- Meng, Y.; Xiong, C.; Bajaj, P.; Bennett, P.; Han, J.; Song, X. COCO-LM: Correcting and contrasting text sequences for language model pretraining. Adv. Neural Inf. Process. Syst. 2021, 34, 23102–23114.
- Janson, S.; Gogoulou, E.; Ylipää, E.; Cuba Gyllensten, A.; Sahlgren, M. Semantic re-tuning with contrastive tension. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 4 May 2021.
- Kim, T.; Yoo, K.M.; Lee, S.G. Self-guided contrastive learning for BERT sentence representations. arXiv 2021, arXiv:2106.07345.
- Yan, Y.; Li, R.; Wang, S.; Zhang, F.; Wu, W.; Xu, W. ConSERT: A contrastive framework for self-supervised sentence representation transfer. arXiv 2021, arXiv:2105.11741.
- Gao, T.; Yao, X.; Chen, D. SimCSE: Simple contrastive learning of sentence embeddings. arXiv 2021, arXiv:2104.08821.
- Robinson, J.; Chuang, C.Y.; Sra, S.; Jegelka, S. Contrastive learning with hard negative samples. arXiv 2020, arXiv:2010.04592.
- Zampieri, M.; Malmasi, S.; Nakov, P.; Rosenthal, S.; Farra, N.; Kumar, R. SemEval-2019 Task 6: Identifying and categorizing offensive language in social media (OffensEval). arXiv 2019, arXiv:1903.08983.
- Caselli, T.; Basile, V.; Mitrović, J.; Kartoziya, I.; Granitzer, M. I feel offended, don’t be abusive! Implicit/explicit messages in offensive and abusive language. In Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France, 11–16 May 2020; pp. 6193–6202.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Caselli, T.; Basile, V.; Mitrović, J.; Granitzer, M. HateBERT: Retraining BERT for abusive language detection in English. arXiv 2020, arXiv:2010.12472.
- Basile, V.; Bosco, C.; Fersini, E.; Nozza, D.; Patti, V.; Pardo, F.M.R.; Rosso, P.; Sanguinetti, M. SemEval-2019 Task 5: Multilingual detection of hate speech against immigrants and women in Twitter. In Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA, 6–7 June 2019; pp. 54–63.
Dataset | Total | Classes
---|---|---
Ruddit | 5828 | Score 0–1 (2514), Score −1–0 (3442)
Offen | 14,100 | hate (4640), non-hate (9460)
Abuse | 14,100 | exp-hate (2129), imp-hate (798), non-hate (11,173)
RSA | 37,249 | neutral (13,142), negative (8277), positive (15,830)
TSA | 31,962 | negative (2242), positive (29,720)
Models | Ruddit (regression) Pear ↑ | Ruddit (regression) MSE ↓ | Abuse (3-class) F1 ↑ | Offen (2-class) F1 ↑
---|---|---|---|---
BERT * [5,30] | 0.873 ± 0.005 | 0.027 ± 0.001 | 0.727 ± 0.008 | 0.803 ± 0.006
HateBERT * [5,30] | 0.886 ± 0.005 | 0.025 ± 0.001 | 0.765 ± 0.006 | 0.809 ± 0.008
KMT (BERT) | 0.8764 ± 0.007 | 0.027 ± 0.0007 | 0.7882 ± 0.01 | 0.8028 ± 0.02
KMT (HateBERT) | 0.8921 ± 0.006 | 0.0231 ± 0.001 | 0.7929 ± 0.01 | 0.8064 ± 0.01
Models | Ruddit (regression) Pear ↑ | Ruddit (regression) MSE ↓ | Abuse (3-class) F1 ↑ | Offen (2-class) F1 ↑
---|---|---|---|---
KMT | 0.8879 ± 0.005 | 0.0246 ± 0.0005 | 0.7827 ± 0.02 | 0.7995 ± 0.024
KMT | 0.8907 ± 0.004 | 0.0234 ± 0.0008 | 0.7846 ± 0.02 | 0.7957 ± 0.019
KMT | 0.8892 ± 0.004 | 0.0249 ± 0.001 | 0.7886 ± 0.02 | 0.8035 ± 0.02
KMT | 0.8921 ± 0.006 | 0.0231 ± 0.001 | 0.7929 ± 0.01 | 0.8064 ± 0.01
Models | Ruddit (regression) Pear ↑ | Ruddit (regression) MSE ↓
---|---|---
1 layer | 0.8921 ± 0.006 | 0.0231 ± 0.001
2 layers | 0.8731 ± 0.007 | 0.0283 ± 0.001