
Detection of Hate Speech, Racism and Misogyny in Digital Social Networks: Colombian Case Study

by Luis Gabriel Moreno-Sandoval 1,*,†, Alexandra Pomares-Quimbaya 1,†, Sergio Andres Barbosa-Sierra 1,† and Liliana Maria Pantoja-Rojas 2,†

1 Engineering Faculty, Pontificia Universidad Javeriana, Bogotá 110231, Colombia
2 Engineering Faculty, Universidad Distrital Francisco José de Caldas, Bogotá 111611, Colombia
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Big Data Cogn. Comput. 2024, 8(9), 113; https://doi.org/10.3390/bdcc8090113
Submission received: 18 June 2024 / Revised: 13 August 2024 / Accepted: 20 August 2024 / Published: 6 September 2024

Abstract
The growing popularity of social networking platforms worldwide has substantially increased the presence of offensive language on these platforms. To date, most of the systems developed to mitigate this challenge focus primarily on English content. However, this issue is a global concern, and other languages, such as Spanish, are therefore also involved. This article addresses the task of identifying hate speech, racism, and misogyny in Spanish within the Colombian context on social networks, and introduces a gold standard dataset specifically developed for this purpose. The experiment compares the performance of Transformer Language Models (TLMs) from Deep Learning, such as BERT, RoBERTa, XLM, and BETO, adjusted to the Colombian slang domain, and then compares the best TLM against a GPT model, which has a significant impact on achieving more accurate predictions in this task. Finally, this study provides a detailed understanding of the different components used in the system, including the architecture of the models and the selection of functions. The best results show that the BERT model achieves an accuracy of 83.6% for hate speech detection, while the GPT model achieves an accuracy of 90.8% for racism detection and 90.4% for misogyny detection.

1. Introduction

In recent years, digital platforms worldwide, especially social networks, have facilitated contact between people, enabling them to share ideas, exchange information and content on particular topics, and express feelings and opinions through an accessible medium. An example is the social network X (previously Twitter), which currently has approximately 330 million active users, an increase of 3.93% in 2022 [1]. This considerable growth in the number of users has transformed how people communicate and express their feelings.
Nonetheless, the freedom of expression granted by social networks expands the ability to easily spread hate speech (HS) and misinformation that could cause harm to individuals and society as a whole [2]. The term “hate speech” has been defined as any communication denigrating a person or group, judging by specific features such as color, ethnicity, gender, sexual orientation, nationality, and religion, among others [3]. In addition, when hate speech is gender-oriented and specifically targets women, it is called misogyny [4] (MS).
In the digital era, the increasing spread of HS on social networks and the urgent need for effective action have driven the research community to dig deeper in order to comprehend, measure, and monitor user sentiment toward certain topics or events. Using data from the year 2022, the International Telecommunication Union (ITU) [5] pointed out that 75% of people in the world between the ages of 15 and 24 used the Internet, ten percentage points more than the rest of the population (65%). Moreover, as an alarming fact, Latin America reported the highest level of harassment on social networks, at 76%.
In a short time, many methods have been developed for the automated detection of HS online [3]. However, implicitly offensive language remains a particularly tough case [6]. The challenge addressed in most studies is detecting offensive language as a single optimization task. In a globalized world, the development of computational natural language models with Artificial Intelligence (AI), focused on combating racism, gender violence, and hate, among other problems affecting the social environment, contributes to the global commitment to prevent and counteract the spread of illegal hate speech online [7] and has a significant impact on the common well-being of society.
Social networks are a part of the daily lives of millions of people. In 2022, the average world use per day dedicated to social networks was 2 h and 27 min, representing 37% of the total time spent on the Internet. Colombia is the second country in the world whose citizens spend the most time a day using digital platforms, with an average of 3 h and 45 min [8]. Recent studies show that seven out of ten children in Latin America report suffering from cyberbullying [9], which implicitly includes inciting the spread of hate, threats of violence, false accusations, blackmail, and sexual harassment [10].
Colombia is one of the countries where this problem is currently most prevalent, with statistics ranging between 30% and 40%, along with Argentina and Peru; this means that two out of five people have suffered cyberbullying [9]. Therefore, the severe and irreparable consequences of the propagation of hate, together with the inherent limitations of human interpretation, make it necessary to implement tools for the early detection, monitoring, and regulation of hate speech online. So, the following research question arises: how can the performance of existing text classification models focused on hate speech, racism, and misogyny in social networks be improved in the local Colombian context?
The widespread adoption of social networking platforms has led to an unprecedented rise in the exchange of information and ideas across the globe. However, this has also resulted in a significant increase in the prevalence of offensive language, including hate speech, racism, and misogyny. While substantial efforts have been made to address this issue in English content, there is a notable gap in resources and systems designed to tackle these challenges in other languages, such as Spanish. The motivation of the article is to develop an experiment to detect offensive language, including hate speech, racism, and misogyny, in order to reduce the gap in the Spanish language. This gap is particularly pronounced in specific regional contexts, where dialects and local expressions may differ significantly from standard Spanish. With its unique slang and regional expressions, the Colombian context presents a challenge for automated hate speech detection systems.
This study presents several significant contributions. It introduces a novel dataset tailored to the Colombian Spanish context, featuring 13,339 manually labeled tweets extracted from social networks and categorized into hate speech, racism, and misogyny. This georeferenced dataset, enriched with dialect-specific features, offers a unique resource for analyzing offensive content within the Colombian context. Furthermore, this research proposes a challenging approach to improve the Transformer Language Model (TLM) performance for the task of hate identification in the Colombian dialect and compares it with a Large Language Model (LLM). Therefore, a dataset in Colombian slang is created to carry out the transfer learning and fine-tuning of existing TLMs to achieve better performance as evidenced in the analysis of the experimental results of the evaluation with various configurations of models and data. Subsequently, a set of tweets from the previous dataset is chosen to compare the best-performing TLM model with a GPT model (representative of the LLM models).
The article is structured into eight sections. Section 2, Material and Methods, is divided into two subsections: Background, which provides essential theoretical context, and Methodology, which details the research approach and its specific phases. Section 3, Dataset, covers the dataset implementation and the pre-processing techniques applied. Section 4, Models, discusses model selection, hyperparameter definitions, optimization algorithms, and fine-tuning strategies. Section 5 presents the experimental results, practical implications, and threats to validity. Section 6, Discussion, interprets the findings within the context of existing research. Section 7, Conclusion, summarizes the main findings of the study. Finally, Section 8, Future Work, outlines recommendations and directions for further research.

2. Material and Methods

This section is organized into two main parts. The first part, Background, provides essential theoretical concepts and discusses the key milestones that have influenced the development of the research. The second part, Methodology, details the research approach and is structured into three phases to illustrate the process comprehensively.

2.1. Background

There has been a lack of consensus among researchers regarding offensive language, leaving the possibility of subjective interpretations open. Therefore, the same linguistic phenomenon can receive different terms; conversely, the same label can be used for different meanings or expressions [11].
In particular, offensive language intends to offend a person or a specific group through derogatory, hurtful, or obscene expressions [12], which may include insults, toxic comments, threats, profanity, or swearing [13]. Therefore, the similarities between the approaches proposed in previous works have encouraged the argument of a typology that differentiates whether the offensive language is directed to a specific individual or entity or a generalized group and whether the abusive content is explicit or implicit [13].
Recently, many researchers have been investigating the characterization and taxonomy of offensive language to identify abusive content and develop classification systems with different types: aggression identification, cyberbullying detection, HS identification, bullying identification, offensive language, and the identification of toxic comments [14].
Therefore, hate speech as a part of offensive language is a type of manifestation characterized by expressing hostility, prejudice, discrimination, and verbal aggression towards individuals or groups based on their ethnic origin, religion, gender, sexual orientation, and disability, among other factors [15].
Racism (RS) and misogyny are specific forms of hate speech directed at particular groups. The current research on sexism in social networks focuses on detecting misogyny or hatred towards women. The Oxford Dictionary defines misogyny as “a feeling of hate or dislike towards women, or a feeling that women are not as good as men” [16]. The Royal Spanish Academy (RSA) dictionary, or RAE in Spanish, defines it as “aversion to women” [17]. Instead, racism refers to “the belief that some races of people are better than others, or a general belief about a whole group of people based only on their race” [18], which leads to discrimination or social persecution.
Generally, a comment is sexist when it discriminates against people based on gender. This discrimination, whose predominant objective is women, is a prevalent cultural component based on the superiority of men over women in different sectors of life, such as work, politics, society, and the family [19]. Although both men and women can experience violence and abuse online, women are much more likely to be the victims of harmful actions in severe and repeated forms. Young girls are particularly vulnerable to sexual exploitation and abuse, as well as bullying by their peers in the digital space [20].
In the context of the Internet and social networks, hate speech not only creates tension between groups of people, but its impact can also influence business or even lead to real-life conflicts [21]. Therefore, to prevent and counter the spread of hate speech on social networks, the European Commission agreed with Facebook, Microsoft, Twitter, and YouTube on a code of conduct to counter illegal hate speech online [7,22,23]. However, controlling and filtering all the content is a challenge. For this reason, researchers have tried to develop different automatic hate speech detection tools developed in the Natural Language Processing (NLP) and Machine Learning (ML) fields.
In ML, hate speech detection can be modeled as a dichotomous (hate speech or non-hate speech) or multiclass classification problem (misogyny, racism, etc.), which can be adequately addressed using both classical learning algorithms and Deep Learning (DL) algorithms [24]. NLP techniques such as sentiment analysis and the identification of offensive keywords and linguistic patterns can also be used. Although Recurrent Neural Network (RNN) algorithms have been widely used in processing data streams, their limitation in the length of the sequences they can handle has led to the increasing popularity of self-attention-based models, such as the TLM.
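As a sketch of this framing (not the paper's pipeline), the following scikit-learn snippet fits the same toy corpus under both a dichotomous and a multiclass labeling; the texts and labels are invented for illustration only:

```python
# Framing hate speech detection as binary vs. multiclass classification.
# Toy Spanish corpus and labels, invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "odio a ese grupo",        # hypothetical hateful example
    "me encanta esta ciudad",  # neutral
    "las mujeres no sirven",   # hypothetical misogynistic example
    "buen partido de futbol",  # neutral
] * 10  # repeat so the classifier has something to fit

binary_labels = ["HS", "NONE", "HS", "NONE"] * 10          # dichotomous
multi_labels = ["HS", "NONE", "MS", "NONE"] * 10           # multiclass

binary_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
binary_clf.fit(texts, binary_labels)

multi_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
multi_clf.fit(texts, multi_labels)

print(binary_clf.predict(["odio a ese grupo"]))
print(multi_clf.predict(["las mujeres no sirven"]))
```

The same feature pipeline serves both framings; only the label set changes, which is why the two problem formulations are often evaluated side by side.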
End-to-end memory networks are based on a recursive attention mechanism rather than sequence-aligned recursion and have been shown to perform well in simple language question-answering and natural language modeling tasks [25]. Nevertheless, the literature reveals that the TLM is the first transduction model completely based on self-attention to calculate representations of its input and output without using sequence-aligned RNN or convolution [26].
Regarding works on the automatic detection of hate speech, the vast majority of developments are in the English language, predominating over other languages, such as Spanish. In short, since Spanish is the third most used language on the Internet [27], it is essential to research and implement natural language models focused on the Spanish language, especially the domain and dialect of Spanish-speaking countries such as Colombia.
Hate speech in Colombia is profoundly influenced by historical, cultural, and socioeconomic factors. The armed conflict and social polarization have created an environment where hate can emerge as a tool of resistance or rejection. The country’s ethnic and regional diversity also means that expressions of hate vary depending on the group and region, while socioeconomic inequalities foster resentments that can manifest as discriminatory speech. Additionally, media and social networks amplify and propagate these discourses, although they can also serve as platforms for denouncement and mobilization against hate. Also, the level of education and social norms impact the prevalence of hate speech, with higher prevalence in areas with lower awareness of tolerance issues. Thus, it is crucial to consider these factors when addressing hate speech and formulating strategies specifically tailored to the Colombian context.
In short, the detection of offensive language is a difficult task. The most critical challenges identified in this research are factors of the subjective nature of language framed in culture, gender, demographics, and the social environment. Therefore, there is a great complexity of words that have different senses or meanings depending on the region; this leads to significant challenges, such as the variation of vocabulary depending on the place where it is spoken, the variants of Spanish (e.g., Spain and Latin America), the content of comments with sarcasm, literary figures, linguistic registers in social networks such as emojis, and even rhetorical figures with an ironic sense such as hyperbole, for instance, “casi me muero del susto”, “Tu cara parece la Luna con tantos hoyos”, “Eres más lenta que una Tortuga”, “Te dejé un millón de mensajes en tu celular y nunca me devolviste la llamada”, among others.
The detection of hate speech is an active and constantly evolving research topic [28]. Early studies mainly used a combination of feature extraction and ML modeling for detection in social networks [29]. A variation on that approach was to use Paragraph2vec with Bag of Words (BOW) to detect hate speech in a collection of comments pulled from Yahoo! Finance. In total, 56,280 comments with hate speech and 895,456 comments without hate speech were collected. The results showed that the Paragraph2vec had a higher Area Under the Curve (AUC) than any BOW model [30].
The authors [31] propose an approach to automatically detect hate speech on Twitter using n-grams and patterns as features to train the Support Vector Machine (SVM) algorithm. The approach achieves a precision of 87.4% for binary classification and 78.4% for multiclass classification.
In [32], the performance of different feature extraction techniques (Term Frequency-Inverse Document Frequency (TF-IDF), Word2vec, and Doc2vec) and ML algorithms are compared, such as Naïve Bayes, Random Forest, Decision Tree, Logistic Regression, SVM, K-Nearest Neighbors, AdaBoost and Multilayer Perceptron (MLP) to detect hate speech messages. The best-performing combination led to the representation of TF-IDF features with bi-gram features and SVM algorithm, achieving an overall accuracy of 79%.
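A minimal sketch of that best-performing combination, TF-IDF with bi-gram features feeding a linear SVM, might look as follows (toy data, not the corpus used in [32]):

```python
# Hedged sketch of TF-IDF (uni- + bi-grams) with a linear SVM,
# echoing the best combination reported in [32]. Data are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = ["insulto horrible", "comentario amable",
               "mensaje de odio", "saludo cordial"] * 10
train_labels = [1, 0, 1, 0] * 10  # 1 = hate speech, 0 = non-hate (toy labels)

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # unigrams and bigrams
    LinearSVC(),
)
clf.fit(train_texts, train_labels)
print(clf.predict(["mensaje de odio"]))
```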
In addition, text mining features for predicting different forms of online hate speech are explored, including features such as character and word n-grams, dependency tuples, sentiment scores, and first- and second-person pronoun counts [21].
Nevertheless, despite the good results of the above approach, the unstructured nature of human language presents various intrinsic challenges for automated text classification methods [24]. In 2017, researchers introduced a neural network-based method that learned semantic word embeddings to handle this complexity. Running experiments with a reference dataset of 16K tweets, they found that these DL methods outperform state-of-the-art character/word n-gram methods by 18 points in the F1 score [33].
In [34], Arabic tweets were classified into five hate speech categories using SVM and four DL models. The results showed that the DL models outperformed the SVM model. Indeed, the Long Short-Term Memory (LSTM) network model with a Convolutional Neural Network (CNN) layer achieved the highest performance, with a precision of 72%, recall of 75%, and F1 score of 73%.
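For reference, the precision, recall, and F1 figures quoted throughout this section can be computed with scikit-learn's metrics; the labels and predictions below are invented:

```python
# Computing precision, recall, and F1 for a binary task with scikit-learn.
# y_true / y_pred are invented for illustration.
from sklearn.metrics import precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # 1 = hate, 0 = non-hate
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # one false negative, one false positive

p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary")
print(round(p, 2), round(r, 2), round(f1, 2))
```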
The researchers in [35] propose a hate detection system with a set of RNN and LSTM classifiers, incorporating user-related information characteristics, such as bias towards racism or sexism. Likewise, the work [29] develops a hate classifier for different social networks using multiple algorithms, such as Logistic Regression, Naive Bayes, SVM, XGBoost, and neural networks, in addition to the use of feature representations (BoW, TF-IDF, Word2Vec, Bidirectional Encoder Representations from Transformers (BERT), and their combination), obtaining the BERT [26] model with the best results when comparing individual characteristics.
The evaluation of pre-trained models based on TLM for detecting Spanish (Castilian) speech has obtained promising results [36]. In particular, Table 1 compares the performance of pre-trained multilingual models (mBERT [37]) and Cross-lingual Language Models (XLM) [36] with a monolingual Spanish BERT model (BETO) [38] trained with a specific Spanish corpus.
The results in Table 1 show that the BETO pre-trained model scores better on the F1 evaluation metric than mBERT and XLM. In this context, it can be concluded that it is necessary to train a model in Spanish (Colombian slang), since such a system is capable of modeling the vocabulary more accurately.
Consequently, the DL models generally implemented for detecting hate speech are pre-trained with the union of monolingual corpora from different languages. Although models such as mBERT and XLM provide a greater vocabulary, it is observed that the greatest coverage in the case of Spanish is for the BETO model [38], which could be one of the main reasons why it achieves the best performance on the HatEval [44] and HaterNet [36,39] hate speech datasets.
In particular, the automatic detection of hate speech in Spanish is closely related to participation in Task 5: Multilingual detection of hate speech against immigrants and women on Twitter at SemEval-2019 [36]. The task consists of two sub-tasks in English and Spanish, the first for detecting hate speech and the second for classifying aggressive hate tweets and identifying the affected individual or group.
Table 2 details the statistics of the public datasets of hate speech in Spanish based on social networks. Considering the limited availability of publicly accessible data, there are approaches for training the data augmentation of hate speech sequences with examples automatically generated with BERT and Generative Pre-trained Transformer 2 (GPT-2) [45]. These advances present significant improvements in the performance of the models when increasing the training data, leading to an improvement of +73% in recall and, consequently, an increase of +33.1% in F1 score [45]. To a great extent, the close relationship between the decrease in performance and the training of a classifier with small amounts of data is evident.
On the other hand, for the solution of the first SemEval task, the participants with the highest score for the Spanish language implemented an SVM model with a combinatorial framework with linguistically motivated techniques and different types of n-grams, whereas for the second task, they opted for a multilabel approach using the Random Forest (RF) classifier [42]. However, one of the main limitations of this type of traditional classifiers is that they are not flexible enough to capture more complex relationships naturally and do not usually work well on large datasets [36].
As illustrated in Table 3, it is possible to observe general trends in the literature on the different techniques of NLP to address the problem of hate speech classification.
After a careful review of previous studies, it becomes clear that a number of recent pre-trained LLMs based on the Transformer mechanism have not yet been tested for detecting HS in Spanish. However, recent work has explored Generative LLMs (GLLMs) such as GPT-3, GPT-3.5 (ChatGPT), and GPT-4 for various text classification problems, such as sentiment analysis [49], stance detection [50], intent classification [51], mental health analysis [52], hate speech detection [53], misinformation detection [54], paraphrase detection [55], news classification [56], natural language inference [55], and text classification [56].
In addition, there is no evidence of comparative studies of monolingual and multilingual pre-trained LLMs that prove their validity for this language and its regional dialects.
Finally, this article is based on TLMs due to their main advantage of not requiring a large dataset, which is often unavailable for languages other than English. Indeed, TLMs can capture long-term dependencies in the language and effectively incorporate hierarchical relationships, which is very important in languages such as Spanish due to their syntactic and semantic complexity [36]. Additionally, a comparative study is carried out between the traditional transformer language systems, considered the baselines in this study, and the TLM and GPT models pre-trained with the Colombian slang dataset.
Table 3. Summarized state of the art.
Model | Dataset | Contribution
LR [30] | 951,736 Yahoo Finance user comments | Applied BOW, TF, TF-IDF, and paragraph2vec embeddings. AUC 0.8007.
SVM, NB, kNN [57] | Tweets | Applied uni-grams, TF-IDF, retweets, favourites, and page authenticity. F1 score 0.971.
LSTM, CNN + LSTM, GRU, CNN + GRU [34] | 11,000 Arabic tweets in five classes: none, religious, racial, sexism, or general hate | SVM achieves an overall recall of 74%, and the DL models an average recall of 75%. However, adding a CNN layer to the LSTM yields the best overall detection performance: 72% precision, 75% recall, and 73% F1 score.
CNN + GRU [45] | 1M hate and non-hate examples produced by BERT and GPT-2 | Significant improvements in the performance of a classification model.
ELMo, BERT, and CNN [58] | SemEval 2019 Task 5, 13,000 tweets in English | The fusion method performs better than the original methods (accuracy = 0.750 and F1 score = 0.704).
Ensemble of BERT models for Spanish (BETO) [36,59] | MeOffendEs IberLEF 2021: Offensive Language Detection in Spanish Variants | External data from hate speech detection and sentiment analysis was used to augment the training set [59].
BETO [36] | HaterNet and HatEval (Spanish) | The results obtained with the LM BETO outperform the other ML models.
XLM-RoBERTa [60] | MeOffendEs IberLEF 2021 | The model was trained with both tweets and sentiment analysis data in Spanish [60]. Among the configurations tested, a model pre-trained on tweets and sentiment analysis data obtained the best performance [61].
Bidirectional LSTM + BERT (bert-base-ML) [62] | MeOffendEs IberLEF 2021 | Better results were obtained with the Bi-LSTM model.
Transformer-based [46] | IberLEF 2021 | Pre-trained transformers for Spanish were very helpful in the modeling process; most of the top-ranked participants used transformers, and more specialized mechanisms could further boost performance.

2.2. Methodology

In the previous sections, the evolution, methods, applications, and techniques of NLP were analyzed, emphasizing the complexity of solving the hate speech classification problem and the linguistic complexities for the domain of Colombian Spanish present in communicative interactions on the digital stage. The methodology section is divided into three phases, which analyze the methods for implementing the database and the computational models.
Figure 1 shows the workflow of the hate speech, misogyny, and racism detection process from the initial stage, which used various datasets, including our own (Dataset 3), the implementation of pre-processing with a variety of methods, techniques, and analyses, and finally models and their evaluation.
Now, the methodology used in the research is outlined and divided into three phases: the first phase involves the initial selection and evaluation of models; the second focuses on pre-training with the Gold Standard dataset; and the third examines the comparison of these models with GPT models.

2.2.1. Phase 1: Initial Model Selection and Evaluation

In the first phase, a systematic evaluation was conducted to select the best models for the task and optimize their performance. The evaluation began by choosing BERT, RoBERTa, BETO, and RuPERTa as our baseline models without pre-training, given their proven effectiveness in NLP tasks according to the literature. These models were evaluated using Dataset 1 to analyze their initial performance. BERT and RoBERTa are well-established models known for their robustness in various NLP tasks [58]. BETO, designed specifically for Spanish [36], and RuPERTa, a robust model for Romance languages, were included to address the linguistic nuances of our datasets. These models were selected based on state-of-the-art research, as they demonstrated strong performance on text classification tasks. Subsequently, the three best-performing models were pre-trained using the Spanish subset of the SemEval and HatEval datasets to refine their accuracy further.

2.2.2. Phase 2: Pre-Training with Gold Standard

The second phase involved pre-training the models with the best initial performance using a consolidated database of Colombian tweets. This dataset, comprising 13,339 tweets, was categorized into hate speech (HS), misogyny (MS), and racism (RS). The selection of this dataset was crucial to ensure that the models could handle the specific dialectical and cultural context of Colombian Spanish. Each dataset underwent a rigorous pre-processing phase, including data cleaning, symbol removal, tokenization, and more, as detailed in the pre-processing and data formation subsection. This phase was essential for tailoring the models to detect offensive content accurately within the Colombian context.
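The exact pipeline is detailed in the pre-processing subsection; as an illustrative approximation only, a regex-based cleaning and tokenization step could look like this (the specific rules are assumptions, not the paper's implementation):

```python
# Illustrative pre-processing sketch: URL/mention/hashtag removal, symbol
# stripping, lowercasing, and whitespace tokenization. These rules only
# approximate the kind of cleaning described in the paper.
import re

def preprocess(tweet: str) -> list[str]:
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)     # remove URLs
    text = re.sub(r"[@#]\w+", " ", text)          # remove mentions and hashtags
    text = re.sub(r"[^a-záéíóúüñ\s]", " ", text)  # keep Spanish letters only
    return text.split()                           # whitespace tokenization

print(preprocess("@usuario Qué chimba de día! https://t.co/xyz #Bogotá"))
```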

2.2.3. Phase 3: Comparison with GPT Models

In the third phase, the best-performing BERT model from the previous phase was compared with an LLM from the GPT family using Dataset 3 and the ADAM optimization algorithm. The closed-source models of the GPT-3 family can understand and generate natural language. In addition, the GPT-3.5 models have the additional ability to generate code [63]. However, neither the GPT-3 nor the GPT-3.5 models are optimized for use in chat rooms. This drawback is solved by the ChatGPT (GPT-3.5-turbo-0125) and GPT-4 models [64].
The GPT-3.5-turbo-0125 model was chosen for its suitability for chatbot implementation, which closely simulates the social media interactions where hate content classification is critical, as well as for its enhanced complex reasoning abilities, instruction-following capabilities, and reduced generation of harmful text. This selection was informed by its fine-tuning on code data and its use of Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), making it an optimal choice for these specific needs [65].
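The paper does not reproduce its prompt, so the following chat-message structure is only a hypothetical sketch of how a zero-shot classification request to a chat model could be framed; the label set mirrors the study's categories, but the wording is invented:

```python
# Hypothetical chat-message structure for zero-shot hate content
# classification. The system prompt and label codes are illustrative
# assumptions, not the prompt used in the study.
def build_messages(tweet: str) -> list[dict]:
    system = (
        "Eres un clasificador de contenido. Responde únicamente con una de "
        "estas etiquetas: HS (discurso de odio), RS (racismo), "
        "MS (misoginia), NONE."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Tweet: {tweet}\nEtiqueta:"},
    ]

messages = build_messages("ejemplo de tweet a clasificar")
print([m["role"] for m in messages])
```

A list of messages in this shape is what chat-completion APIs expect; the model's one-token reply would then be mapped back to the HS/RS/MS/NONE labels.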
Each phase’s methodological choices were driven by the need to balance accuracy, cultural relevance, and practical application in the scope of detecting offensive content in Colombian Spanish.

3. Dataset

The performance of Deep Learning models is highly dependent on an appropriate and valid dataset. The following datasets are used in this research.

3.1. Dataset 1: AMAZON Reviews Multilanguage

Dataset 1 is a collection of Amazon product reviews in multiple languages, including Spanish [66]. It contains over 200 million product reviews in 16 different languages, totaling 5.8 million reviews in Spanish. From this, a random subset of 5000 reviews was selected for analysis.
The dataset includes several key attributes. Product categories provide contextual information about the reviewed items. The number of stars reflects customer ratings on a scale from 1 to 5, indicating sentiment polarity. Additionally, the number of helpful votes measures how many users found the review useful.
This Spanish subset served as the foundation for assessing comment polarity and enabled a detailed analysis of the model’s performance with Spanish reviews. The product categories and detailed ratings contribute to a nuanced evaluation of sentiment and review usefulness. Nevertheless, it is essential to note that while this analysis is refined for the Spanish language, it does not explicitly address the Colombian context, including local slang and dialects, which are crucial for our study.
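Drawing such a reproducible random subset can be sketched as follows; the stand-in corpus and fixed seed are illustrative, not the paper's procedure:

```python
# Sketch of selecting a reproducible random subset, as when sampling
# 5000 of the 5.8M Spanish reviews. The corpus here is a stand-in list.
import random

corpus = [f"review_{i}" for i in range(100_000)]  # stand-in for the full set
rng = random.Random(42)                           # fixed seed for reproducibility
subset = rng.sample(corpus, k=5000)               # sampling without replacement

print(len(subset))
```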

3.2. Dataset 2: SEMEVAL and HATEVAL

Dataset 2 of SemEval-2018 Task 1: Affect in Tweets consists of 6348 tweets tagged in English and Spanish [67]. The subset of data in Spanish consists of 1589 tagged tweets. In terms of polarity, it consists of 813 positive tweets, 443 negative tweets, and 333 neutral tweets. Regarding emotional intensity, 441 tweets were labeled as high intensity, 712 as medium intensity, and 436 as low intensity.
The SemEval-2019 Task 5: Multilingual detection of hate speech against immigrants and women in the Twitter dataset consists of 19,600 tweets [44]. They are distributed in two hate categories: 9091 tweets about immigrants and 10,509 tweets about women. For Spanish, the training set includes 5000 tweets, of which 3209 allude to the female gender and 1991 about immigrants. The test set consists of 1600 tweets in Spanish, of which 660 are hateful and 940 are not.
The attributes in this dataset include the comment text, which forms the basis of the analysis. It features a binary indicator for hate speech (HS), specifying whether hate speech is present against the given targets, such as women or immigrants. If hate speech is identified, the dataset also includes information on the target range (TG), indicating whether the target is a generic group or a specific individual. An aggressiveness (AG) attribute shows whether the tweet is aggressive when hate speech is present.
Emotional polarities and intensities provide valuable context for interpreting emotions in tweets. By differentiating across hate categories, the model’s ability to detect various forms of hate speech, such as misogyny and racism targeting different minorities, can be evaluated. However, it is important to note that these tweets do not yet address the specific Colombian context.

3.3. Dataset 3: Colombian Context HS, RS and MS (Gold Standard)

Raw tweet data extracted from Twitter were labeled and classified into three distinct categories for independent analysis, in order to create a database tailored to the Colombian Spanish context. The tweets were randomly pre-selected based on their georeference in Colombia, and the categories were balanced to prevent biases; as a result, some tweets were not included in this study. This approach ensured an equitable representation of the various types of offensive and non-offensive content, facilitating a more accurate and relevant analysis within the Colombian context.
In total, the dataset includes 13,339 manually labeled tweets categorized as hate speech (HS), racism (RS), and misogyny (MS), which allowed for a detailed examination of specific offensive content relevant to the Colombian context.
Following the guidelines proposed to minimize subjectivity [4,44], the first category corresponds to hate speech, regardless of whether or not it is directed against a collective or vulnerable group. This type of message is labeled HS. It may involve attacks and threats toward activists, public figures, and/or celebrities, as well as messages with hostile and generalized content against social groups, migrants, racialized people or members of ethnic groups, women, people from the LGBTIQ+ collective, and other minorities [44].
The second category is racism, which is labeled RS. Racist comments are considered to be those that incite contempt, violence, or the denial of rights to people or groups perceived as different due to their somatic features (e.g., skin color), origin (e.g., foreigners), cultural identity, language, and traditions. Racist comments typically associate the origin/ethnicity of others with cognitive deficiencies, a predisposition to criminal behavior, laziness, or other vices. In addition, they usually attribute the perceived inferiority of these groups to the majority group or others perceived as superior.
Finally, the third labeled class was misogyny, categorized with the MS label. Misogynistic messages typically express prejudice and animosity with characteristics culturally associated with women. This animosity often reaches the point of openly expressing aggressive positions towards women, encouraging insults, belittling, harassment, threats of violence, objectification, harmful stereotypes, or denying male responsibility in matters concerning women and men alike. Misogynistic language is often camouflaged in sexist comments, reinforcing anachronistic ideas about women’s roles and relationships with other genders. Likewise, this offensive language exaggerates characteristics or behaviors traditionally attributed to women to disseminate messages for ideological purposes that oppose women’s rights.
In the implementation, specific terms from the Colombian dialect, such as “paila”, were identified. In Colombia, “paila” is used to describe a bad or unfortunate situation, whereas in other Spanish-speaking countries like Mexico, “paila” primarily refers to a cooking pot with no negative connotations. This variation in usage and context highlights the importance of adapting the model to local linguistic characteristics. Recognizing these specific terms enables the more accurate detection of offensive language and enhances the model’s effectiveness in different cultural contexts.
For the previous categories, each label was set to 0 when the text lacked the characteristics described above and to 1 when hate patterns were present in its content. The labeling process was also reviewed by multiple experts. Table 4 shows the content of tweets and the general composition of the Gold Standard for training and testing.
The available Twitter API was used to download around forty million tweets to obtain the data proposed in this research. This information was stored in a non-relational database, from which samples of N records were drawn with the aggregation function aggregate([{'$sample': {'size': N}}]), which returns an iterator over N records. Given the size of the database, several random samples of N = 5,000,000 were formed for the labeling process. Likewise, the location metadata of each record were filtered to guarantee that all the processed tweets belong to Colombia.
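The sampling call above can be sketched as follows; the database and collection names are illustrative assumptions, not the authors' actual configuration.

```python
def sample_pipeline(n):
    """MongoDB aggregation pipeline that returns n randomly chosen documents."""
    return [{"$sample": {"size": n}}]

# With pymongo (illustrative connection and names):
# from pymongo import MongoClient
# cursor = MongoClient()["twitter_db"]["tweets"].aggregate(sample_pipeline(5_000_000))
# for tweet in cursor:
#     ...  # send to the labeling queue
```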
For the lemmatization and tokenization of the data, a set of broad-spectrum models from the spaCy library was used, in particular the large es_core_news_lg model, which draws on the AnCora collection, an extensive corpus of texts in Catalan and Spanish with various levels of annotation designed at the University of Barcelona [68]. Additionally, stopwords were removed from the tweets with Python's NLTK library, resulting in a refined version with reduced complexity.
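A minimal sketch of the tokenize-and-filter step. The actual pipeline uses spaCy's es_core_news_lg model and NLTK's Spanish stopword list; here a tiny hardcoded stopword list stands in so the example is self-contained.

```python
import re

# Tiny stand-in stopword list; the actual pipeline uses NLTK's full Spanish list.
SPANISH_STOPWORDS = {"el", "la", "los", "las", "de", "que", "y", "en", "un", "una", "es"}

def tokenize(text):
    """Lowercase and keep only runs of Spanish word characters."""
    return re.findall(r"[a-záéíóúñü]+", text.lower())

def remove_stopwords(tokens):
    return [t for t in tokens if t not in SPANISH_STOPWORDS]

tokens = remove_stopwords(tokenize("La paila es un problema en el barrio"))
# tokens == ["paila", "problema", "barrio"]
```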
Simultaneously, to gain deeper knowledge of the data, metrics applied in other studies were used to understand and analyze its characteristics [69]. First, a polarization score of the lemmatized tweets was calculated using the pysentimiento library [70], which specializes in analyzing social media posts in Spanish and offers several useful text analysis tools. An initial view of the dataset's composition was obtained, indicating a balance between "positive" and "negative" samples, as well as samples without explicitly harmful information or explicit forms of hate ("neutral"); a neutral score, however, does not rule out harmful language expressed in a seemingly friendly way.
To confirm the above, the distribution of labels was analyzed following the previous guidelines of experts in similar tasks [71,72]. Non-truncated tweets (i.e., without loss of information), the responses associated with the publications, and all the tweets georeferenced in Colombia were extracted. To label a tweet as hateful content, a series of criteria based on observations and heuristics were established [73]:
1. Sexist or racist content following the above definitions.
2. An explicit expression of attacks on ethnic or racial, national, and international minorities and/or women.
3. Content that seeks to belittle or silence the opinions of women, of people who are part of minorities, or of third parties who are not in favor of their causes.
4. Comments that criticize stereotypical and limited aspects of women based on unfounded arguments or misrepresented figures or statistics.
5. Comments that indirectly promote crimes or violence against these groups.
6. Comments that exaggerate real or perceived negative features associated with women or minorities to shape general opinion about them.
7. Shows of support with hashtags alluding to social minorities whose content is hateful, such as #ChaoVenecos, #Indios, and #LasViejasSiempre, among others.
8. Tweets with political content addressed to public figures or parties were discarded, because terms specific to these groups can bias the results.
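Criterion 7 above can be operationalized as a simple lookup against a curated list of flagged hashtags. This sketch uses the example hashtags from the text; the list and function name are illustrative, not the authors' actual heuristic.

```python
import re

# Example hate-associated hashtags taken from the text (illustrative list).
FLAGGED_HASHTAGS = {"#chaovenecos", "#indios", "#lasviejassiempre"}

def has_flagged_hashtag(tweet):
    """Return True if any hashtag in the tweet is on the flagged list."""
    tags = {t.lower() for t in re.findall(r"#\w+", tweet)}
    return bool(tags & FLAGGED_HASHTAGS)

has_flagged_hashtag("Fuera todos #ChaoVenecos")   # True
has_flagged_hashtag("Feliz día #Colombia")        # False
```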
The relationship between these measurements and the manual and heuristic labeling was also analyzed. For this, the correlation coefficient of all the mined variables was calculated to determine which ones could help explain the behavior of the data. A significant observation is the visible correlation between the negative polarity score and the prevalence of HS; likewise, an inverse correlation was observed between the prevalence of HS and the neutral polarity score. This characteristic is fundamental, since it suggests that these measures can serve as indicators of patterns associated with whether a publication contains hateful language.
Finally, we sought to keep the data balanced and to characterize the token domain using a word map that scales tokens according to their prevalence, noting that the data typically follow a scale-free distribution in line with Zipf's law [74]. Some words are mapped to special tokens such as <user>, e.g., the name tagged in a publication, since social networks such as Twitter base their operation on chains of mentions. The high incidence of the <url> token indicates a large number of web links per tweet, which explains an average raw record length of 7838 characters, even though the social network supports at most 280 characters per tweet, including special characters and emojis.
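The frequency analysis behind the word map can be sketched as follows, with mentions and links collapsed into the <user> and <url> tokens before counting; under Zipf's law, frequency falls roughly as the inverse of rank. The toy corpus is illustrative.

```python
import re
from collections import Counter

def normalize_tokens(tweet):
    """Collapse mentions and links into special tokens, then split."""
    tweet = re.sub(r"@\w+", "<user>", tweet)
    tweet = re.sub(r"https?://\S+", "<url>", tweet)
    return tweet.lower().split()

corpus = ["@ana mira esto https://t.co/x", "@luis mira https://t.co/y mira"]
counts = Counter(t for tw in corpus for t in normalize_tokens(tw))
ranked = counts.most_common()
# ranked[0] == ("mira", 3)
```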

Pre-Processing

To develop a methodology for understanding the data, the work in [39] was taken as a basis; it uses the HateBase database (SemEval competition) as the primary source that collects and keeps up to date information on forms of HS. The main challenges considered for understanding the data are as follows [40]:
1. Discern whether the content of a publication can be classified, with little or no ambiguity, as hate speech.
2. Understand the syntactic and semantic structures that characterize hate speech. Likewise, it is crucial to consider the local characteristics of Colombian Spanish and the most common patterns found in Colombian hate speech.
3. Create appropriate linguistic resources to detect hate language in Spanish.
4. Develop a method for exploring the available data and filtering publications potentially classifiable as hate language, based on lexicographical techniques, lexicons, n-grams, and ML- or DL-based classification. This method will produce a first Gold Standard with tweets classified as hate speech, racism, and misogyny, and expert systems capable of discerning between each category.
5. Compare the results with those obtained in other challenges and similar investigations.
This research addresses these challenges using a completely new database, built from a set of raw tweets through sampling, manual processing, and labeling to obtain the data that feed the classification models. The process includes the following steps:
1. Normalize tweets by converting uppercase to lowercase and removing accents.
2. Transliterate emojis into written descriptions of what they represent.
3. Replace user references, so that a mention of another user becomes the @user keyword.
4. Convert hashtags into sentences. For example, if a tweet contains the hashtag #todossomoselpueblo, this expression may be converted to todos somos el pueblo.
5. Search for n-grams based on the frequency of occurrence of grouped expressions. The most common ones were converted to individual tokens to reduce the complexity of our dictionary.
6. Remove links to web pages referenced in a tweet.
7. Re-encode the characters as UTF-8 to remove ambiguities in the text and ensure that special characters can be processed using regular expressions.
8. Remove special characters, tabs, line breaks, and extra spaces.
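A condensed sketch of the normalization steps above (emoji transliteration, hashtag-to-sentence expansion, and n-gram grouping require external resources and are omitted here):

```python
import re
import unicodedata

def normalize_tweet(text):
    text = text.lower()                                   # step 1: lowercase
    text = unicodedata.normalize("NFD", text)             # step 1: strip accents
    text = "".join(c for c in text if unicodedata.category(c) != "Mn")
    text = re.sub(r"@\w+", "@user", text)                 # step 3: mentions
    text = re.sub(r"https?://\S+", "", text)              # step 6: links
    text = re.sub(r"[^\w@#\s]", " ", text)                # step 8: special chars
    return re.sub(r"\s+", " ", text).strip()              # step 8: extra spaces

normalize_tweet("¡Hola @María! Mirá esto https://t.co/abc   :)")
# → "hola @user mira esto"
```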
In previous conferences and challenges focused on this task, the characteristics of publications with hateful, racist, and misogynistic content were established [71]. A helpful technique was segmenting the data based on keywords from a list of obscene words typically used pejoratively against vulnerable groups, such as the SHARE resource, a lexicon of harmful expressions used by Spanish speakers [75]. Additionally, dictionaries, expert criteria, and technical documents were used to verify that these expressions are typical of the Colombian context.
Finally, the data partition retained was 80% for training and 20% for testing. Tests were carried out with other configurations, but none exceeded the performance of the models. The following sections explain this performance in detail.
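The retained 80/20 partition can be sketched as a seeded shuffle-and-cut; the seed and function name are illustrative:

```python
import random

def train_test_split(records, test_ratio=0.2, seed=42):
    """Shuffle a copy of the records and cut it into train/test partitions."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(list(range(13_339)))
# len(train) == 10671, len(test) == 2668
```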

4. Models

This section explains the selection of TLMs in detail, based on state-of-the-art literature. The section also covers the definition and tuning of hyperparameters, which are divided into subsections focusing on the choice of optimization algorithm and fine-tuning strategy.

4.1. Selection of Models

This subsection provides a detailed explanation of the selection process for TLMs, grounded in state-of-the-art literature. The choice of models is informed by their performance in similar tasks and their suitability for handling the specific characteristics of the dataset. Key considerations include the models’ ability to address the nuances of Spanish and their effectiveness in detecting hate speech, racism, and misogyny. This approach ensures that the chosen models are well aligned with current best practices and advancements in the field.
According to the literature, TLM model architectures have outperformed RNN model implementations in addressing text classification problems, particularly sentiment prediction and hate speech detection. As established in previous research [76,77], these models will be employed for the analytical task of text classification.
TLMs are usually worked on in two phases [77]. One is pre-training, where the model learns to structure the language in a general way and acquires a generic knowledge of the meaning of the words. The other phase is tuning, which aims to add specific layers to the architecture to adapt models to tasks based on a specific domain.
First, the DL models BERT, ROBERTA, BETO, and RuPERTa were selected without performing the transfer of learning and/or pre-training. Then, these models were evaluated using Amazon Dataset 1 in Spanish, which contains 5000 comments (negative, neutral, and positive), to predict sentiments. Next, Table 5 shows the results.
Given that the ROBERTUITO model is pre-trained on roBERTa's base model, it was decided to pre-train the three models with the best precision in the Dataset 1 evaluation: BERT, ROBERTA, and BETO. After pre-training, the models were evaluated with Dataset 2, SemEval and HatEval in Spanish, from the multilingual hate speech challenges on Twitter for 2018 and 2019, analyzing and treating the data before use in the models. As shown in Table 6, the evaluation results with the models' pre-training show an increase in accuracy over the previously observed results.
Consequently, the models finally selected for transfer learning with the Gold Standard consolidated from the georeferenced tweets in Colombia were BERT, BETO, and roBERTa. As a first approximation, after preparing the data, a fraction of Dataset 3 was used. However, as shown in Table 7, the results for the initial tests of the pre-trained algorithms show worse performance than in previous tests.
Of particular significance during the development of this research, the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology was proposed because, in any situation, it is flexible to return to the previous phases during the life cycle of the research. In this case, it was common to return to the data preparation stage during the evaluation phase [78] as a strategy to improve both the quantity and quality of the labeled data for the three categories—HS, RS, and MS. Initially, Dataset 3 contained 5200 tweets. However, during the experimental phases, it was determined that increasing this number was necessary to improve the performance results of the pre-trained models. Consequently, the dataset was expanded to a total of 13,339 tweets, ultimately forming the Gold Standard.
Consequently, with the selection of the models and the balanced Dataset 3, the pre-training was performed with the Compute Unified Device Architecture (CUDA) resource, defining a strategy for tuning the hyperparameters that will be deepened in the next subsection. The results of the models proposed in this article will be presented in the results section.

4.2. Definitions of Hyperparameters

Essentially, the main objective of the manual hyperparameter search was to match the model's effective capacity to the complexity of the task. The effective capacity was limited by three factors: the model's representational capacity, the learning algorithm's ability to minimize the cost function used to train the model, and the degree to which the cost function regularizes model training [79].
The process of configuring the hyperparameters for a DL model requires experience, trial, and error. Therefore, for the proposed model to provide the best result, it was necessary to find the optimal value of these modified hyperparameters during model pre-training.
To perform the pre-training of the BERT, ROBERTA, and BETO TLMs, it was necessary to initially define the optimizer and establish a hyperparameter tuning strategy. The values of the base models (number of layers, neurons, dropout, activation function, weight update, regularization rates, etc.) were kept fixed, and the algorithm’s hyperparameters (number of epochs, learning rate, and batch) were adjusted. The next two subsections will emphasize the selection of the optimizer and the strategy used to fine-tune the hyperparameters.

4.2.1. Algorithm of Optimization

In the first instance, ADAM was the optimizer implemented for training the models. The method computes individual adaptive learning rates for different parameters from estimates of the first and second moments of the gradients; the name ADAM derives from adaptive moment estimation. One of its main advantages is that it is very robust to the choice of learning rate, making it more likely to find the global minimum and less sensitive during training than other optimizers such as gradient descent [77].
The ADAM algorithm updates exponential moving averages of the gradient (m_t) and of the squared gradient (v_t), where the hyperparameters β1, β2 ∈ [0, 1) control the exponential decay rates of these averages. The moving averages themselves are estimates of the first moment (the mean) and the second raw moment (the uncentered variance) of the gradient [77]. As shown in Table 8, we set a default optimizer configuration that has yielded good results on machine learning problems. Another of its advantages is that it has intuitive interpretations and normally requires little tuning [77].
Therefore, based on the existing literature, the default settings of the ADAM optimizer were used as a baseline criterion for our research, serving as the initial starting point for model tuning during pre-training and testing.
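The update rule described above can be sketched as a minimal one-parameter ADAM loop with the default β1, β2, and ε; the learning rate and step count are raised here only so this toy example on f(x) = x² converges quickly, and are not the study's training values:

```python
def adam(grad, x0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=500):
    """Minimal single-parameter ADAM: bias-corrected moment estimates."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g      # second raw-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (v_hat ** 0.5 + eps)
    return x

x_min = adam(grad=lambda x: 2 * x, x0=5.0)  # minimize f(x) = x**2
# x_min ends near 0, the minimizer of x**2
```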

4.2.2. Fine-Tuning Strategy

The proposed strategy for the fine-tuning of the hyperparameters during the TLMs’ pre-training is the “Grid Search”, which is a search process where the different hyperparameter values are combined to create a grid that includes all possible combinations of uniformly distributed parameters. In that case, the search process consists of actions that allow the algorithm to move through the grid, opting for the best parameter selections based on the result obtained by the objective function [80].
Although the ADAM optimizer was used with standard settings, the learning rate was a hyperparameter that required tuning and was initially set as a constant for the model. Nevertheless, grid search with short runs was considered to identify learning rates that either converge or diverge. This approach involves applying the “Cyclical Learning Rates” (CLRs) strategy proposed by Leslie N. Smith, who argues that varying the learning rate during training is generally beneficial. Smith suggests adjusting the learning rate cyclically within a range of values rather than keeping it fixed [81]. In this research, the learning rate of 1 × 10−5 yielded the best results.
For batch tuning, unlike the ADAM learning rate hyperparameter, whose value does not affect computation time, the batch size must be weighed against the training execution time; this parameter is limited by the available hardware memory. Some researchers recommend using the largest batch size that fits in hardware memory, which allows higher learning rates to be used.
Indeed, experiments were run over combinations of model configurations: batch sizes {4, 16, 32}, epochs {3, 5, 10, 20}, and learning rates {1 × 10−3, 1 × 10−5, 6 × 10−4}. The loss and accuracy values were monitored during training by a callback function, which stops the training if it observes an increase in the loss or a decrease in the accuracy.
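The grid explored above can be enumerated with itertools.product; the scoring function below is a dummy stand-in for an actual training run, not the authors' code:

```python
from itertools import product

grid = {
    "batch": [4, 16, 32],
    "epochs": [3, 5, 10, 20],
    "lr": [1e-3, 1e-5, 6e-4],
}

def grid_search(score_fn):
    """Return the (batch, epochs, lr) combination with the best score."""
    combos = product(grid["batch"], grid["epochs"], grid["lr"])
    return max(combos, key=lambda c: score_fn(*c))

# Dummy scorer standing in for a full training run; it happens to prefer
# the learning rate that performed best in this study, 1e-5.
best = grid_search(lambda batch, epochs, lr: -abs(lr - 1e-5))
# best[2] == 1e-5
```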
Regularization techniques were applied to address the overfitting problem, and the model’s performance was evaluated to ensure effective generalization to unseen data. Data balancing was performed to handle class imbalance, which is crucial for avoiding biases in the model. Additionally, early stopping was implemented to halt training when performance on the validation set began to deteriorate, thus preventing overfitting. The data were divided into training and test sets for proper model evaluation, ensuring that the model was trained and assessed on distinct, unrelated data.
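The early-stopping behavior described above can be sketched as a small callback that halts training once the validation loss stops improving for a set number of epochs (the patience value and loss sequence are illustrative):

```python
class EarlyStopping:
    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Return True when training should stop."""
        if val_loss < self.best:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=2)
losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.5]
stopped_at = next(i for i, l in enumerate(losses) if stopper.step(l))
# stopped_at == 4: two epochs without improvement after the 0.6 minimum
```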
Consequently, when testing on a machine with 64 GB of RAM, an Intel Xeon Gold 5118 processor @ 2.30 GHz (2 processors, 48 cores), an NVIDIA Quadro P4000 dedicated graphics card with 8 GB of memory, and the Debian GNU/Linux 11 operating system, it was possible to use batches of different sizes and compare the performance of the model. Smaller batch sizes increase regularization, while larger batch sizes have a lesser impact, thus requiring a balance to achieve the optimal amount of regularization. The literature often recommends using larger batch sizes to accommodate higher learning rates [82]. This consideration was an additional criterion during the pre-training process aimed at improving the performance of the models.
Finally, the TLM pre-training requires keeping the first layers of the neural network’s weights intact. This requirement is because the first layers capture universal features relevant to the text classification problem [83]. In contrast, subsequent layers were adapted to focus on learning features specific to Dataset 3, which comprises tweets from Colombia. For this study, no architectural adjustments or new feature fusion techniques were applied.

5. Results

This section explores the capabilities and limits of the different Transformer Language (TL) approaches that were evaluated. The research uses the usual metrics in NLP tasks, including Accuracy (A) and F1 score (F1). Table 9 presents the evaluation metrics of the best results, along with the hyperparameter tuning implemented to achieve them. For all tests, the optimizer ADAM was configured with β1 = 0.9, β2 = 0.999, and ε = 1 × 10−8. The learning rate (α) hyperparameter varied, but the predominant value was 1 × 10−5, which demonstrated optimal performance compared to the other values tested.
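The two reported metrics can be computed from scratch for a binary label as follows; the example predictions are illustrative, not the paper's outputs:

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 1]
accuracy(y_true, y_pred)  # 0.6
f1_score(y_true, y_pred)  # ≈ 0.667
```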
The best final F1 score results on the Dataset 3 test set were 0.8566, 0.9598, and 0.8795 for HS, RS, and MS, respectively. All these measurements correspond to the BERT model, which obtained the best scores and training times per epoch for the respective categories.
Figure 2 and Table 10 show the results comparing the BERT model and the GPT-3.5-turbo-0125 model for the three offensive speech labels (HS, RS, and MS) using Dataset 3. The GPT model shows lower performance in detecting hate speech, reaching 63.6%, in contrast to the 83.6% obtained by the BERT model. This disparity is attributed to the lower ability of the GPT model to identify insults in Colombian slang. As for the detection of racist content, the GPT model outperforms with 90.8%, while the model trained with BERT achieves 88.4%. For the misogyny category, the GPT model maintains a superior performance of 90.4%, compared to the 76.4% observed in the BERT model.

5.1. Validation of Results

This section presents the validation process for the best-performing model. The fundamental task is to detect whether content denotes hate in its most general form, or racism and/or misogyny more specifically. Once the best models were saved according to the evaluation metrics, an interface was developed that allows the user to view the confidence metric and interact practically with the deployed model.
As illustrated in Figure 3, the pipeline detailed in the methodology section is defined: a tweet is entered manually into a FastAPI web API, implemented in Python, which handles the text classification. After processing the sample tweet analogously to the training data, the model state saved with torch.save() is restored with its companion function torch.load(), and the TLM ingests a tokenized, lemmatized version of the processed tweet produced with BertTokenizer.from_pretrained(...).
This input allows the model to return the ranked categorical evaluations and the probabilities associated with those categories using the SoftMax function. The application presents the results from the API in the console in JSON format, containing the inferences for the HS, RS, and MS categories.
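The final step can be sketched as a numerically stable SoftMax over a category head's logits, serialized as JSON; the logit values and key names are illustrative:

```python
import json
import math

def softmax(logits):
    m = max(logits)                         # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax([2.0, -1.5])                # e.g., [hateful, not hateful]
response = json.dumps({"HS": round(probs[0], 3)})
# probs[0] ≈ 0.97
```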
Warning: The following images contain offensive language for illustrative purposes. The following examples are not intended to allude to any person or group.
In Figure 4, it can be seen that the tweet enters the API interface: ¡Ese <usuario> es una malparida negra doméstica! <emoji><emoji>. In the response, it is evident that the tweet is indeed classified in all categories of hate, misogyny, and racism, with probabilities greater than 95%. Indeed, the correct classification of the explicit language of hate is evident.
In another example, shown in Figure 5, "bobo hpta <emoji>" is entered into the API. It is classified as hateful, and with a probability of 99% as neither misogynistic nor racist. In this case, the classification is correct even though the tweet combines offensive language with sarcasm, accompanying the rudeness with a laughing emoji.

5.2. Practical Implications

This research offers important practical implications, especially for automated content moderation and sentiment analysis. By developing and evaluating TLMs tailored for detecting hate speech, racism, and misogyny in Colombian Spanish, this study lays the groundwork for more accurate and culturally relevant content moderation tools. These models can be integrated into social media platforms and online forums to improve the detection and management of offensive content, contributing to safer and more inclusive online environments.
The fine-tuning of models to address regional linguistic features demonstrates their effectiveness in capturing local nuances. This enhances automated systems in applications such as customer service and content filtering, where understanding regional language variations is essential.

5.3. Threats to Validity

One of the primary limitations of this study is the relative data size limitations, which consist of 13,339 manually labeled tweets. With a limited dataset, there is an inherent challenge in generalizing the results across a broader context. The small sample size may not fully represent the diversity of language use or the full spectrum of offensive content, potentially impacting the model’s ability to generalize to unseen data or other contexts.
The dataset is tailored to the Colombian Spanish context, which, while specific and relevant, may not encompass all variations of Spanish or the full range of dialects and slang used in different regions of Colombia. This specificity could limit the generalizability of the findings to other Spanish-speaking populations or different cultural contexts.
Given the manual labeling process, subjective biases may be introduced. These biases may affect the labeling accuracy and the subsequent performance evaluation of the models. Efforts were made to ensure balanced categorization, but inherent data collection and labeling biases could still influence the results.
The evaluation of models based on this dataset provides valuable insights but should be interpreted with caution. The performance metrics reported are specific to the dataset and its characteristics. The effectiveness of the models in real-world applications or with other datasets may vary.

6. Discussion

It is relevant to consider that studying hate speech in social networks with analytical tools is a multidisciplinary field in which mathematics and computing interact with the human sciences and psychology. For the best results, it is recommended that human behavior experts supervise the data-labeling heuristics to reduce bias and error and, in turn, generate reliable labels for offensive content. Otherwise, there is always a risk that the data fed to the models are not general enough to capture the phenomenon being sought: determining, with the greatest accuracy and objectivity, the incidence of hateful content in social networks, taking into account the ambiguities and complexity of language.
From the perspective of previous studies, the results demonstrate a significant performance improvement, outperforming the winning teams Atalaya and MineriaUNAM of SemEval-2019 Task 5 for hate detection in Spanish by 14.35%. This improvement can be attributed to our corpus, which is 3.5 times larger than that of the mentioned competition, coupled with the manual tagging of our database. Consequently, our findings align with the literature on the advantages of TLM models in Deep Learning over classical classification models, specifically the BERT model's superior performance compared to the winning teams' SVM.
From a general perspective, the results of our model surpass all the mentioned studies that utilized the HartNet and HatEval datasets. When considering the amount of data labeled by HatEval (almost half the size of our database), the difference in performance is only 6.7%. This study thus demonstrates the substantial improvement achieved when the model is trained with specific (Colombian) slang as opposed to the variety of sociolects in the HatEval dataset.

7. Conclusions

Understanding the scope of text classification and natural language comprehension in social networks and language analysis is essential. This is evident when one considers that most previous research in Spanish used the lexicon and dialects of general Spanish slang. The Colombian context proved sufficiently different that the grammatical structure of the sentences produced different results than those seen with the test databases. This means that the construction of Colombian corpora is essential to distinguish hate speech in Colombia and to exploit its possible applications in filtering the content of a comment chain.
It was possible to identify the improvement in the precision metric of the models pre-trained with the Gold Standard created in this research to detect hate in the Colombian context, compared with the language models without transfer learning. The ADAM optimizer with hyperparameter tuning during pre-training guaranteed an increase in performance relative to the initial baseline for the family of multilanguage transformer models. Performance will likely continue to improve if the dataset is enlarged and if greater computational power allows pre-training with more complex hyperparameter tuning across different architectures and parameter sets.
In comparing TLM and LLM models with Dataset 3, better performance was obtained when identifying offensive language that is not directly targeted at a group: the BERT model outperformed GPT by 19.8% on tweets tagged as HS. On the other hand, the GPT model outperformed the TLMs on the MS and RS labels, suggesting its advantage in identifying hate speech directed at specific groups of people. However, it should also be considered that increasing the dataset size may involve higher computational costs if a GPT model is chosen over a TLM model, given the execution-time difference observed for a sample of 250 tweets (Table 10).

8. Future Works

This study considered the classification of hate in general, misogyny, and racism. However, social networks also contain comments with homophobic and xenophobic content. For future work, expanding the database with these additional categories is recommended to complement the results obtained in this study. Likewise, future work should consider integrating the different linguistic phenomena that may be part of the expression of offensive language.
To address the limitations of this study, future research should consider expanding the dataset, including additional varieties of Spanish, and exploring other regions and contexts. Further validation with larger and more diverse datasets will help assess the models' robustness and generalizability.

Author Contributions

Conceptualization, L.G.M.-S. and S.A.B.-S.; methodology, L.G.M.-S., A.P.-Q. and S.A.B.-S.; software, L.G.M.-S. and S.A.B.-S.; validation, L.G.M.-S., A.P.-Q. and S.A.B.-S.; formal analysis, L.G.M.-S., A.P.-Q. and S.A.B.-S.; investigation, L.G.M.-S., S.A.B.-S. and L.M.P.-R.; resources, L.G.M.-S. and S.A.B.-S.; data curation, S.A.B.-S.; writing-original draft preparation, L.G.M.-S., S.A.B.-S. and L.M.P.-R.; writing-review and editing, S.A.B.-S. and L.M.P.-R.; visualization, A.P.-Q. and S.A.B.-S.; supervision, L.G.M.-S. and A.P.-Q.; project administration, L.G.M.-S. and L.M.P.-R.; funding acquisition, L.G.M.-S. and A.P.-Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Pontificia Universidad Javeriana grant number 21186. The APC was funded by Pontificia Universidad Javeriana.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

This work has been supported by Pontificia Universidad Javeriana and the Center of Excellence and Appropriation in Big Data and Data Analytics in Colombia (CAOBA). Thanks are also expressed to the students of the Master in Artificial Intelligence, Eng. Gisell Natalia Cristiano and Eng. Andrés Felipe Ethorimn, for their valuable contributions. We also thank the International Research Group in Computer Science, Communications and Knowledge Management (GICOGE) and IDEAS of the Universidad Distrital Francisco José de Caldas, and the LUMON LV TECH research group.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
HS        Hate Speech
RS        Racism
MS        Misogyny
ITU       International Telecommunication Union
AI        Artificial Intelligence
TLM       Transformer Language Model
LLM       Large Language Model
RSA       Royal Spanish Academy
NLP       Natural Language Processing
ML        Machine Learning
DL        Deep Learning
RNN       Recurrent Neural Network
BoW       Bag of Words
AUC       Area Under the Curve
SVM       Support Vector Machine
TF-IDF    Term Frequency-Inverse Document Frequency
MLP       Multilayer Perceptron
LSTM      Long Short-Term Memory
CNN       Convolutional Neural Network
BERT      Bidirectional Encoder Representations from Transformers
XLM       Cross-lingual Language Model
BETO      Monolingual Spanish BERT model
GPT       Generative Pre-trained Transformer
RF        Random Forest
SFT       Supervised Fine-Tuning
RLHF      Reinforcement Learning from Human Feedback
CRISP-DM  Cross-Industry Standard Process for Data Mining
CUDA      Compute Unified Device Architecture
CLR       Cyclical Learning Rates
TL        Transformer Language

References

  1. Ash Turner. How Many Users Does Twitter Have? Available online: https://www.bankmycell.com/blog/how-many-users-does-twitter-have (accessed on 15 November 2023).
  2. LibertiesEU. Freedom of Expression on Social Media: Filtering Methods, Rights, and Future Perspectives. Available online: https://www.liberties.eu/es/stories/libertad-expresion-redes-sociales/43773 (accessed on 25 May 2023).
  3. Zhang, Z.; Luo, L. Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter. arXiv 2018. [Google Scholar] [CrossRef]
  4. Pamungkas, E.W.; Basile, V.; Patti, V. Misogyny Detection in Twitter: A Multilingual and Cross-Domain Study. Inf. Process. Manag. 2020, 57, 102360. [Google Scholar] [CrossRef]
  5. International Telecommunication Union, ITU Publications. Measuring Digital Development: Facts and Figures 2022. Available online: https://www.itu.int/hub/publication/d-ind-ict_mdd-2022/ (accessed on 5 April 2024).
  6. Wiegand, M.; Siegel, M.; Ruppenhofer, J. Overview of the GermEval 2018 Shared Task on the Identification of Offensive Language. In Proceedings of the GermEval 2018, 14th Conference on Natural Language Processing (KONVENS 2018), Vienna, Austria, 21 September 2018. [Google Scholar]
  7. Council Europe. Initiatives, Policies, Strategies. Available online: https://www.coe.int/en/web/cyberviolence/-/european-commission-the-eu-code-of-conduct-on-countering-illegal-hate-speech-online (accessed on 4 January 2024).
  8. Simon Kemp. Digital 2022: Global Overview Report. Available online: https://datareportal.com/reports/digital-2022-global-overview-report (accessed on 14 May 2024).
  9. Semana Magazine: New Campaign against Cyberbullying Launched in Colombia. Available online: https://www.semana.com/economia/empresas/articulo/lanzan-nueva-campana-contra-el-ciberbullying-en-colombia/202245/ (accessed on 15 December 2023).
  10. Federation of Progressive Women; Government of Spain. Information Guide on Gender-Based Hate Crimes and Cyber-Violations. Available online: https://plataformavoluntariado.org/wp-content/uploads/2021/06/guia-ciberacoso-fmp-2020-1.pdf (accessed on 19 November 2023).
  11. Plaza-del-Arco, F.M.; Molina-González, M.D.; Ureña-López, L.A.; Martín-Valdivia, M.T. Integrating implicit and explicit linguistic phenomena via multi-task learning for offensive language detection. Knowl.-Based Syst. 2022, 258, 109965. [Google Scholar] [CrossRef]
  12. Wiegand, M.; Ruppenhofer, J.; Kleinbauer, T. Detection of Abusive Language: The Problem of Biased Datasets. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers); Association for Computational Linguistics: Minneapolis, MN, USA, 2019; pp. 602–608. [Google Scholar] [CrossRef]
  13. Zampieri, M.; Malmasi, S.; Nakov, P.; Rosenthal, S.; Farra, N.; Kumar, R. Predicting the Type and Target of Offensive Posts in Social Media. In Proceedings of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, MN, USA, 2–7 June 2019; pp. 1415–1420. [Google Scholar]
  14. Kogilavani, S.V.; Malliga, S.; Jaiabinaya, K.R.; Malini, M.; Kokila, M.M. Characterization and mechanical properties of offensive language taxonomy and detection techniques. Mater. Today Proc. 2023, 81, 630–633. [Google Scholar] [CrossRef]
  15. United Nations. What Is Hate Speech? Available online: https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech (accessed on 21 March 2021).
  16. Oxford Dictionaries. Misogyny. Available online: https://www.oxfordlearnersdictionaries.com/definition/english/misogyny?q=misogyny (accessed on 8 March 2023).
  17. Royal Spanish Academy. Misogyny. Available online: https://dle.rae.es/misoginia (accessed on 10 March 2023).
  18. Royal Spanish Academy. Racism. Available online: https://dle.rae.es/racismo?m=form (accessed on 10 March 2023).
  19. Rodríguez-Sánchez, F.; Carrillo-de-Albornoz, J.; Plaza, L. Automatic Classification of Sexism in Social Networks: An Empirical Study on Twitter Data. IEEE Access 2020, 8, 219563–219576. [Google Scholar] [CrossRef]
  20. Council Europe. No Space for Violence against Women and Girls in the Digital World. Available online: https://www.coe.int/en/web/commissioner/-/no-space-for-violence-against-women-and-girls-in-the-digital-world (accessed on 17 October 2023).
  21. Qureshi, K.A.; Sabih, M. Un-Compromised Credibility: Social Media Based Multi-Class Hate Speech Classification for Text. IEEE Access 2021, 9, 109465–109477. [Google Scholar] [CrossRef]
  22. X. Hateful Conduct. Available online: https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy (accessed on 26 April 2022).
  23. Youtube. Hate Speech Policy. Available online: https://support.google.com/youtube/answer/2801939?hl=en (accessed on 14 February 2023).
  24. Mutanga, R.T.; Naicker, N.; Olugbara, O.O. Hate Speech Detection in Twitter using Transformer Methods. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 9. [Google Scholar] [CrossRef]
  25. Sukhbaatar, S.; Weston, J.; Fergus, R. End-to-end memory networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS 2015), Montreal, QC, Canada, 7–12 December 2015; pp. 2440–2448. [Google Scholar]
  26. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–8 December 2017. [Google Scholar]
  27. Raquel Martín. What Languages Are Most Used on the Internet? Available online: https://forbes.es/listas/5184/que-lenguas-son-las-mas-utilizadas-en-internet/ (accessed on 8 March 2024).
  28. Mansur, Z.; Omar, N.; Tiun, S. Twitter Hate Speech Detection: A Systematic Review of Methods, Taxonomy Analysis, Challenges, and Opportunities. IEEE Access 2023, 11, 16226–16249. [Google Scholar] [CrossRef]
  29. Salminen, J.; Hopf, M.; Chowdhury, S.A.; Jung, S.G.; Almerekhi, H.; Jansen, B.J. Developing an online hate classifier for multiple social media platforms. Hum.-Centric Comput. Inf. Sci. 2020, 10, 1. [Google Scholar] [CrossRef]
  30. Djuric, N.; Zhou, J.; Morris, R.; Grbovic, M.; Radosavljevic, V.; Bhamidipati, N. Hate Speech Detection with Comment Embeddings. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15 Companion), New York, NY, USA, 18–22 May 2015; pp. 29–30. [Google Scholar]
  31. Watanabe, H.; Bouazizi, M.; Ohtsuki, T. Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection. IEEE Access 2018, 6, 13825–13835. [Google Scholar] [CrossRef]
  32. Sindhu, A.; Sarang, S.; Zahid, H.; Zafar, A.; Sajid, K.; Ghulam, M. Automatic Hate Speech Detection using Machine Learning: A Comparative Study. Int. J. Adv. Comput. Sci. Appl. 2020, 11. [Google Scholar] [CrossRef]
  33. Badjatiya, P.; Gupta, S.; Gupta, M.; Varma, V. Deep Learning for Hate Speech Detection in Tweets. In Proceedings of the 26th International Conference on World Wide Web Companion (WWW ’17 Companion), Perth, Australia, 3–7 April 2017; pp. 759–760. [Google Scholar]
  34. Al-Hassan, A.; Al-Dossari, H. Detection of hate speech in Arabic tweets using deep learning. Multimed. Syst. 2022, 28, 1963–1974. [Google Scholar] [CrossRef]
  35. Pitsilis, G.K.; Ramampiaro, H.; Langseth, H. Effective hate-speech detection in Twitter data using recurrent neural networks. Appl. Intell. 2018, 48, 4730–4742. [Google Scholar] [CrossRef]
  36. Plaza-del-Arco, F.M.; Molina-González, M.D.; Ureña-López, L.A.; Martín-Valdivia, M.T. Comparing pre-trained language models for Spanish hate speech detection. Expert Syst. Appl. 2021, 166, 114120. [Google Scholar] [CrossRef]
  37. Sohn, H.; Lee, H. MC-BERT4HATE: Hate Speech Detection using Multi-channel BERT for Different Languages and Translations. In Proceedings of the 2019 International Conference on Data Mining Workshops (ICDMW), Beijing, China, 8–11 November 2019; pp. 551–559. [Google Scholar]
  38. Cañete, J.; Chaperon, G.; Fuentes, R.; Ho, J.H.; Kang, H.; Pérez, J. Spanish Pre-Trained BERT Model and Evaluation Data. In Proceedings of the Practical ML for Developing Countries Workshop, Addis Ababa, Ethiopia, 26 April 2020. [Google Scholar]
  39. Pereira-Kohatsu, J.C.; Quijano-Sánchez, L.; Liberatore, F.; Camacho-Collados, M. Detecting and Monitoring Hate Speech in Twitter. Sensors 2019, 19, 4654. [Google Scholar] [CrossRef]
  40. Plaza-Del-Arco, F.M.; Molina-González, M.D.; Ureña-López, L.A.; Martín-Valdivia, M.T. Detecting Misogyny and Xenophobia in Spanish Tweets Using Language Technologies. ACM Trans. Int. Technol. 2020, 20, 1–19. [Google Scholar] [CrossRef]
  41. Gertner, A.; Henderson, J.; Merkhofer, E.; Marsh, A.; Wellner, B.; Zarrella, G. MITRE at SemEval-2019 Task 5: Transfer Learning for Multilingual Hate Speech Detection. In Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA, 6–7 June 2019; pp. 453–459. [Google Scholar]
  42. Vega, L.E.A.; Reyes-Magaña, J.C.; Gómez-Adorno, H.; Bel-Enguix, G. MineriaUNAM at SemEval-2019 Task 5: Detecting Hate Speech in Twitter using Multiple Features in a Combinatorial Framework. In Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA, 6–7 June 2019; pp. 447–452. [Google Scholar]
  43. Paetzold, G.H.; Zampieri, M.; Malmasi, S. UTFPR at SemEval-2019 Task 5: Hate Speech Identification with Recurrent Neural Networks. In Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA, 6–7 June 2019; pp. 519–523. [Google Scholar]
  44. Basile, V.; Bosco, C.; Fersini, E.; Nozza, D.; Patti, V.; Pardo, F.M.R.; Rosso, P.; Sanguinetti, M. SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter. In Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA, 6–7 June 2019; pp. 54–63. [Google Scholar]
  45. Wullach, T.; Adler, A.; Minkov, E. Towards Hate Speech Detection at Large via Deep Generative Modeling. IEEE Internet Comput. 2021, 25, 48–57. [Google Scholar] [CrossRef]
  46. Plaza del Arco, F.M.; Casavantes, M.; Escalante, J.H.; Martín-Valdivia, M.; Montejo-Ráez, A.; Montes-y-Gómez, M.; Jarquín-Vásquez, H.; Villaseñor-Pineda, L. Overview of MeOffendEs at IberLEF 2021: Offensive Language Detection in Spanish Variants. Proces. Del Leng. Nat. 2021, 67, 183–194. [Google Scholar]
  47. Gonzalo, J.; Montes-y-Gómez, M.; Rosso, P. IberLEF 2021 Overview: Natural Language Processing for Iberian Languages. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), Malaga, Spain, 21 September 2021. [Google Scholar]
  48. Plaza-del-Arco, F.M.; Montejo-Raez, A.; Urena-López, L.A.; Martín-Valdivia, M.T. OffendES: A New Corpus in Spanish for Offensive Language Research. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), Virtual Event, 1–3 September 2021; pp. 1096–1108. [Google Scholar]
  49. Wang, Z.; Xie, Q.; Feng, Y.; Ding, Z.; Yang, Z.; Xia, R. Is ChatGPT a good sentiment analyzer? A preliminary study. arXiv 2023. [Google Scholar] [CrossRef]
  50. Zhang, B.; Fu, X.; Ding, D.; Huang, H.; Li, Y.; Jing, L. Investigating chain-of-thought with ChatGPT for stance detection on social media. arXiv 2023. [Google Scholar] [CrossRef]
  51. Parikh, S.; Vohra, Q.; Tumbade, P.; Tiwari, M. Exploring zero and few-shot techniques for intent classification. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 10–12 July 2023; pp. 744–755. [Google Scholar]
  52. Lamichhane, B. Evaluation of ChatGPT for nlp-based mental health applications. arXiv 2023. [Google Scholar] [CrossRef]
  53. Chiu, K.L.; Collins, A.; Alexander, R. Detecting hate speech with GPT-3. arXiv 2021. [Google Scholar] [CrossRef]
  54. Bang, Y.; Cahyawijaya, S.; Lee, N.; Dai, W.; Su, D.; Wilie, B.; Lovenia, H.; Ji, Z.; Yu, T.; Chung, W.; et al. A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, Nusa Dua, Bali, 1–4 November 2023; pp. 675–718. [Google Scholar]
  55. Zhong, Q.; Ding, L.; Liu, J.; Du, B.; Tao, D. Can ChatGPT understand too? A comparative study on ChatGPT and fine-tuned bert. arXiv 2023. [Google Scholar] [CrossRef]
  56. Li, X.; Chan, S.; Zhu, X.; Pei, Y.; Ma, Z.; Liu, X.; Shah, S. Are ChatGPT and GPT-4 general-purpose solvers for financial text analytics? An examination on several typical tasks. arXiv 2023. [Google Scholar] [CrossRef]
  57. Tehseen, Z.; Akram, S.M.; Nawaz, M.S.; Shahzad, B.; Abdullatif, M.; Mustafa, U.R.; Lali, M.I. Identification of Hatred Speeches on Twitter. In Proceedings of the 52nd The IRES International Conference, Kuala Lumpur, Malaysia, 5–6 November 2016; pp. 27–32. [Google Scholar]
  58. Zhou, Y.; Yang, Y.; Liu, H.; Liu, X.; Savage, N. Deep Learning Based Fusion Approach for Hate Speech Detection. IEEE Access 2020, 8, 128923–128929. [Google Scholar] [CrossRef]
  59. Gómez-Espinosa, V.; Muñiz-Sanchez, V.; López-Monroy, A.P. Transformers pipeline for offensiveness detection in Mexican Spanish social media. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), Malaga, Spain, 21 September 2021; pp. 251–258. [Google Scholar]
  60. Aroyehun, S.T.; Gelbukh, A. Evaluation of intermediate pretraining for the detection of offensive language. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), Malaga, Spain, 21 September 2021; pp. 313–328. [Google Scholar]
  61. Huerta-Velasco, D.A.; Calvo, H. Using lexical resources for detecting offensiveness in Mexican Spanish tweets. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), Malaga, Spain, 21 September 2021; pp. 240–250. [Google Scholar]
  62. Sreelakshmi, K.; Premjith, B.; Soman, K. Transformer based offensive language identification in Spanish. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), Malaga, Spain, 21 September 2021; pp. 233–239. [Google Scholar]
  63. Kalyan, K.S. A survey of GPT-3 family large language models including ChatGPT and GPT-4. Nat. Lang. Process. J. 2023, 6. [Google Scholar] [CrossRef]
  64. OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; et al. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774. [Google Scholar] [CrossRef]
  65. Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training language models to follow instructions with human feedback. In Proceedings of the 36th Conference on Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022. [Google Scholar]
  66. Keung, P.; Lu, Y.; Szarvas, G.; Smith, N.A. The Multilingual Amazon Reviews Corpus. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, Online, 16–20 November 2020; pp. 4563–4568. [Google Scholar]
  67. Mohammad, S.; Bravo-Marquez, F.; Salameh, M.; Kiritchenko, S. SemEval-2018 Task 1: Affect in Tweets. In Proceedings of the 12th International Workshop on Semantic Evaluation, New Orleans, LA, USA, 5–6 June 2018. [Google Scholar]
  68. Zeman, D.; Martínez-Alonso, H. The Spanish Data for the anCora Corpus. Available online: https://github.com/UniversalDependencies/UD_Spanish-AnCora (accessed on 18 June 2024).
  69. Wolf, T.; Debut, L.; Sanh, V.; Chaumond, J.; Delangue, C.; Moi, A.; Cistac, P.; Rault, T.; Louf, R.; Funtowicz, M.; et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, 16–20 November 2020; pp. 38–45. [Google Scholar]
  70. Pérez, J.M.; Rajngewerc, M.; Giudici, J.C.; Furman, D.A.; Luque, F.; Alemany, L.A.; Martínez, M.V. Pysentimiento: A Python Toolkit for Sentiment Analysis and Social NLP tasks. arXiv 2021. [Google Scholar] [CrossRef]
  71. Álvarez-Carmona, M.A.; Guzmán-Falcón, E.; Montes-y-Gómez, M.; Escalante, H.J.; Villaseñor-Pineda, L.; Reyes-Meza, V.; Rico-Sulayes, A. Overview of MEX-A3T at IberEval 2018: Authorship and aggressiveness analysis in Mexican Spanish tweets. In Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), Seville, Spain, 18 September 2018; pp. 74–96. [Google Scholar]
  72. Fersini, E.; Rosso, P.; Anzovino, M. Overview of the task on automatic misogyny identification at IberEval 2018. In Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), Seville, Spain, 18 September 2018; pp. 215–228. [Google Scholar]
  73. Waseem, Z.; Hovy, D. Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proceedings of the NAACL Student Research Workshop, San Diego, CA, USA, 16 June 2016; pp. 88–93. [Google Scholar]
  74. Bafna, P.B.; Saini, J.R. An Application of Zipf’s Law for Prose and Verse Corpora Neutrality for Hindi and Marathi Languages. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 261–265. [Google Scholar] [CrossRef]
  75. Plaza-del-Arco, F.M.; Parras-Portillo, A.B.; López-Úbeda, P.; Gil, B.; Martín-Valdivia, M.T. SHARE: A Lexicon of Harmful Expressions by Spanish Speakers. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, Marseille, France, 20–25 June 2022; pp. 1307–1316. [Google Scholar]
  76. Institute of Knowledge Engineering: Transformers in Natural Language Processing. Available online: https://www.iic.uam.es/innovacion/transformers-en-procesamiento (accessed on 19 June 2024).
  77. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the Conference Paper at the 3rd International Conference for Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–15. [Google Scholar]
  78. IBM. CRISP-DM Help Overview. Last Updated: 2021-08-17. Available online: https://www.ibm.com/docs/en/spss-modeler/saas?topic=dm-crisp-help-overview (accessed on 29 June 2024).
  79. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; pp. 1–800. ISBN 978-0262035613. [Google Scholar]
  80. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
  81. Smith, L.N. Cyclical Learning Rates for Training Neural Networks. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 464–472. [Google Scholar]
  82. Nabi, J. Hyper-Parameter Tuning Techniques in Deep Learning. Available online: https://towardsdatascience.com/hyper-parameter-tuning-techniques-in-deep-learning-4dad592c63c8 (accessed on 29 March 2024).
  83. Yu, F. A Comprehensive Guide to Fine-Tuning Deep Learning Models in Keras (Part I). Available online: https://flyyufelix.github.io/2016/10/03/fine-tuning-in-keras-part1 (accessed on 29 September 2023).
Figure 1. Pipeline of text classification methodology.
Figure 2. Bar chart evaluating the results between BERT-finetuning-Colombian-context and GPT-3.5. The checkmarks indicate the best-performing model according to evaluation criteria (Time execution or Accuracy). These agree with the bold values in Table 10.
Figure 3. Deployment pipeline of pre-trained models.
Figure 4. Example 1. Operation of the API deployment for the detection of a tweet with hate, racism, and misogyny content.
Figure 5. Example 2. Operation of the API deployment for the detection of a tweet with hate, racism, and misogyny content.
Table 1. State-of-the-art results for HS detection in Spanish.

Dataset    System                      F1(0_HS)  F1(1_HS)  MacroF1
HaterNet   SVM [39]                    -         48.3      -
           LSTM + MLP [39]             -         61.1      -
           BETO [36]                   88.7      65.8      77.2
HatEval    multichannel BERT [37]      -         -         76.6
           Ensem. voting class [40]    80.0      68.8      74.2
           BERT [41]                   73.0      72.7      72.9
           SVM [42]                    76.1      69.9      73.0
           BiGRU [43]                  77.1      52.1      64.6
           BETO [36]                   79.7      75.5      77.6
Table 2. Publicly available Spanish hate speech datasets whose examples are strictly labeled as hate or non-hate.

Dataset                 Source                         Hate   Non-Hate
SemEval-2019 [42]       Twitter                        1661   2060
HatEval [44]            Twitter                        3828   2772
HaterNet [39]           Twitter                        4433   1567
IberLEF_2021 [46,47]    Twitter, YouTube, Instagram    3982   26,524
OffendES_spans [48]     Twitter, YouTube, Instagram    4527   28,895
Table 4. Number of tweets in Spanish in configuration Dataset 3.

Category   Class     Total     Train (80%)   Test (20%)
HS         Class 1   7264      6174          1090
           Class 0   6075      5164          911
           Total     13,339    11,338        2001
RS         Class 1   1623      1379          244
           Class 0   1600      1360          240
           Total     3223      2739          484
MS         Class 1   1719      1461          258
           Class 0   1700      1445          255
           Total     3419      2906          513
Note: Data from the RS and MS categories are contained within the HS data.
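The per-class counts in Table 4 come from splitting each label's positive and negative examples separately into training and test portions. A minimal sketch of such a stratified split follows; the function name, seed, and exact ratio handling are illustrative assumptions, not the authors' code.

```python
import random

# Illustrative per-class (stratified) train/test split, as summarized
# in Table 4. Each class is shuffled and cut independently so that the
# class balance is preserved in both portions.
def stratified_split(examples, labels, test_ratio=0.2, seed=42):
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    train, test = [], []
    for y, items in by_class.items():
        rng.shuffle(items)
        cut = int(len(items) * (1 - test_ratio))  # per-class cut point
        train += [(x, y) for x in items[:cut]]
        test += [(x, y) for x in items[cut:]]
    return train, test

# 100 toy tweets: 40 labeled 1 (offensive) and 60 labeled 0.
tweets = [f"tweet_{i}" for i in range(100)]
labels = [1] * 40 + [0] * 60
train, test = stratified_split(tweets, labels)
```

With this toy input, the split keeps the 40/60 class ratio in both portions, which is what prevents a rare class such as RS or MS from being under-represented in the test set.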
Table 5. Evaluation results (accuracy) with Dataset 1.

Comparison of Models
Evaluated Model                       Time (s)   Accuracy
BERT base multilingual sentiment      330        0.7646
ROBERTA BNE sentiment analysis es     390        0.6848
ROBERTUITO sentiment analysis         1223       0.6754
BETO sentiment analysis               420        0.6568
RuPERTa base sentiment analysis es    314        0.5622
Table 6. Evaluation results (accuracy) with Dataset 2.

Comparison of Models
Evaluated Model                       Time        Accuracy
BETO sentiment analysis               12 h 36 m   0.804
BERT base multilingual sentiment      9 h 14 m    0.791
DISTILROBERTA-base                    4 h 36 m    0.779
Table 7. Evaluation results (accuracy) with Dataset 3.

Comparison of Pre-Trained Models
Evaluated Model                       Time     Accuracy   F1 Score
BETO sentiment analysis               15 h     0.767      0.590
BERT base multilingual sentiment      118 m    0.710      0.727
DISTILROBERTA base                    12 h     0.750      0.594
Table 8. ADAM, the proposed algorithm for stochastic optimization [77].

Parameter                                                       Value
Stepsize: α                                                     0.001
Exponential decay rate for the first-moment estimate: β1        0.9
Exponential decay rate for the second-moment estimate: β2       0.999
Numerical-stability constant: ϵ                                 1 × 10−8
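The update rule behind the hyperparameters in Table 8 is Kingma and Ba's Adam algorithm [77]. A minimal scalar sketch follows; the quadratic objective is only an illustrative stand-in for a model's training loss, not the paper's setup.

```python
import math

# Scalar sketch of the Adam update rule [77] using the Table 8
# hyperparameters (alpha, beta1, beta2, eps).
def adam_minimize(grad, x0, alpha=0.001, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=5000):
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)           # bias-corrected second moment
        x -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3.0), x0=0.0)
```

Because the effective step size is bounded by α regardless of the gradient magnitude, the stepsize of 0.001 trades convergence speed for the stability needed when fine-tuning the transformer models.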
Table 9. Best evaluation results with models pre-trained with Dataset 3.

Label   Model     Batch     ADAM α     Epoch   Val_Loss   Accuracy   F1
HS      BERT      [4–32]    1 × 10−5   10      1.2122     0.8538     0.8402
        BERT      [4–32]    1 × 10−5   5       1.3021     0.8435     0.8566
        BETO      [16–16]   1 × 10−5   10      1.2951     0.6979     0.5675
        BETO      [16–16]   1 × 10−5   3       1.2795     0.6810     0.5480
        RoBERTa   [16–16]   6 × 10−4   5       1.1567     0.6604     0.5358
RS      BERT      [4–32]    1 × 10−5   10      0.3395     0.9606     0.9598
        BERT      [4–32]    1 × 10−5   5       0.2221     0.9606     0.9594
        BERT      [4–32]    1 × 10−5   20      0.4772     0.9522     0.9505
MS      BERT      [4–32]    1 × 10−5   5       0.5649     0.8828     0.8795
        BERT      [4–32]    1 × 10−5   5       0.6642     0.8574     0.8507
        BERT      [4–32]    1 × 10−5   10      1.5427     0.8478     0.8433
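The accuracy and F1 columns in Table 9 follow the standard definitions over per-tweet predictions. A plain-Python sketch is shown below; since the text does not state whether macro or positive-class F1 is reported, the positive-class variant here is an assumption.

```python
# Minimal implementations of the binary accuracy and F1 metrics
# reported in Table 9 (label 1 = offensive class).
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: 6 tweets, 4 predicted correctly.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Reporting F1 alongside accuracy matters here because the HS, RS, and MS classes in Dataset 3 are close to balanced but not identical, so accuracy alone can mask weak recall on the positive class.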
Table 10. Evaluation results comparing the TLM and LLM models with Dataset 3.

            Time Execution (s)      Accuracy (%)
Category    GPT      BERT           GPT     BERT
HS          2342     18             63.6    83.6
RS                                  90.8    88.4
MS                                  90.4    76.4
Note: The execution time and results correspond to the evaluation of 250 tweets.
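The computational-cost argument raised in the conclusions can be made concrete by converting the Table 10 execution times into per-tweet latency. This is a rough reading of the reported figures, not an independent benchmark.

```python
# Per-tweet latency derived from the Table 10 execution times:
# 2342 s (GPT) and 18 s (BERT) over the same 250 evaluated tweets.
def per_tweet_latency(total_seconds, n_tweets):
    return total_seconds / n_tweets

gpt_latency = per_tweet_latency(2342, 250)    # seconds per tweet for GPT
bert_latency = per_tweet_latency(18, 250)     # seconds per tweet for BERT
speedup = gpt_latency / bert_latency          # BERT's relative throughput
```

This works out to roughly 9.4 s per tweet for GPT versus 0.072 s for BERT, about a 130-fold gap, which is why scaling the dataset is far costlier if the GPT model is chosen.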
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
