1. Introduction
Throughout history, social protests have been an effective means for individuals and groups to express their demands, concerns, and aspirations in the search for a significant change in several dimensions of society. These collective movements take many forms, from marches to public demonstrations, boycotts, and street protests. They all share a common goal: to raise awareness about a specific problem and exert pressure for social change. This constant quest for social transformation has evolved over time, and in the current digital era social media has emerged as a revolutionary tool that has completely transformed the dynamics of social protests.
The influence of social media in the emergence of demonstrations is undeniable. As stated by Isa et al. [1], these digital platforms have provided an instantaneous and global avenue for effervescent individual ideas and opinions to connect and amplify, overcoming geographical and cultural barriers. This phenomenon has radically changed how social protests are organised, promoted, and brought to success in contemporary society. In this context, understanding how social networks influence social demonstrations has become imperative, as it allows us to unravel both the complex interactions and the transfer of mass effects from virtual societies to tangible reality.
Previous studies have pointed to the interplay between online platforms and the subsequent effect of street demonstrations. In a survey of more than 3000 individuals conducted by Gray-Hawkins et al. [2], it was evidenced that 44% of the interviewees publicly expressed their support for political campaigns on social networks and that
had attended political demonstrations in the last five years. Similarly, it was observed that
of men and
of women changed their choices on political or social issues due to information they consumed through social networks. Remarkably, the authors reported that among those who changed their opinion due to social networks
consider that social networks allow them to find people sharing similar opinions on important issues,
consider that they allow them to get involved in political and social issues, and
consider that social media allows them to express their opinion on such issues. In addition, it was also evident that
of the total number of respondents had participated in protests and demonstrations.
Among the various social networks, X (formerly Twitter) stands out as one of the most impactful platforms for driving social movements. As a public platform, it facilitates the rapid dissemination of information, enabling activists to make decisions swiftly and trigger prompt mobilisation among its users [3,4]. X also supports hashtags, denoted by the symbol “#” before a word or phrase. Initially conceived as “channel tags” to allow users to participate in specific discussions, the hashtag has gained recognition for its pivotal role in promoting social movements [5]. In [6], the authors reported that hashtags work as thematic markers that highlight the relevance of well-known topics and facilitate the effective dissemination of information beyond an individual’s network of followers. Using hashtags increases the visibility of a message, as tweets with hashtags are easier to find than plain-text messages. This visibility is crucial to gaining symbolic influence, as it contributes to a rapid and wide dissemination of information. Therefore, the strategic use of hashtags allows the content of a tweet to spread to a broader and more diverse audience [5].
In a previous work based on the hashtags of tweets posted during four nationwide social demonstrations, Beiro et al. [7] proposed that the high connectivity at specific points of a social demonstration shares some characteristics with the critical transitions studied in physics. Specifically, it resembles the divergence of the correlation length in those types of transitions. They argue that the simplifications observed in the statistics of the demonstrations studied in that article result from those high correlations, as happens in the mentioned critical transitions.
For example, they found that among all demonstrations the distribution of hashtag frequency shows the highest heterogeneity in the time window during the protests. They argued that this was not due to increased activity but to heterogeneity in user activity. However, there are at least three possible sources for such heterogeneity: one is due to the temporal activity of users, the second is due to the heterogeneity in how users choose which hashtags to post in their tweets, and the last option is related to how users share the hashtags they post. The most relevant heterogeneity type has essential implications for the field of complex systems, as it can shed light on the interplay between the temporal and spatial domains related to the high correlation occurring at several massive social events.
Furthermore, in the same work by Beiro et al. the authors reported that social demonstrations are also characterised by several temporal points of a sustained correlation where hashtags are mostly connected into modular structures. In addition, all the demonstrations presented a transition between that modular interconnectivity and a state characterised by a nested hierarchical structure. Hence, each system passes through several phases characterised by the coordination within subgroups and one state where the system is self-organised into nestedness. A network is said to be perfectly nested when the contacts of a node of a given degree are a subset of the contacts of all the nodes of a higher degree. In terms of the networks of hashtags connected by users posting, this situation means that the hashtags posted by
n users are a subset of the hashtags posted by
users. Although the hashtag networks were not perfectly nested, the metrics were high enough to confirm that nestedness was present at those temporal points [7].
Modular-to-nested transitions have already been studied in social demonstrations [7,8]. However, the reason for such a dramatic change is still unknown. Our work is an attempt to understand this phenomenon better. Specifically, we present an entropy-based study addressing two points. First, as stated above, we are interested in identifying which diversity pattern is mainly responsible for the high heterogeneity found in the hashtag frequency distribution reported in [7]. Secondly, we want to answer the following question: can entropy locate the transition point without the need to build network representations? Payrató-Borràs et al. [9] have already conducted an entropic study of nestedness in ecological systems. However, as in most of the scientific literature, that study of nestedness refers only to static structures that have always been in that state, whereas here we are interested in using entropic metrics to analyse dynamic structural change during the modular-to-nested transition.
The manuscript continues with a brief description of the events around the three nationwide social demonstrations analysed, followed by an explanation of the data collection. Then, we explain the construction of the networks. We continue with the definitions used for the different calculations of the entropy-based metrics. We also briefly explain the metrics used to compute modularity and nestedness. Next, we show our results, continue with the discussion, and present our overall conclusions in Section 5.
2. Data and Methods
2.1. The Historical Context of the Nation-Wide Events Analysed
The first dataset (9n) involves a protest against the government’s proposed justice reform plans in Argentina on 9 November 2019, known as the “9ngranmarchaporlajusticia”. That major demonstration attracted the attention and participation of a wide range of people throughout the country, including opposition groups, civil society organisations, and concerned citizens. The use of the hashtags “9ngranmarchaporlajusticia” and “9n” in social networks allowed protesters to organise and express their grievances, increasing the reach and influence of the demonstration. This event demonstrated the importance of active citizen participation and civic engagement in Argentina’s democratic processes.
The second dataset involves the “noaltarifazo” protest that also occurred in Argentina between 4 and 6 January of the same year. During this protest, citizens expressed their discontent with government policies, particularly those related to taxes and the cost of essential public services such as electricity and gas. The use of the hashtags “noaltarifazo” and “ruidazonacional” in social networks played a crucial role in mobilising and organising protesters, allowing them to coordinate their efforts and share information about the demonstration. Furthermore, it demonstrated the power of social media to channel public discontent with government policies.
Finally, the last dataset is related to the tragic event that marked France in January 2015. The terrorist attack on the offices of the satirical magazine Charlie Hebdo in Paris shocked the country. The assailants targeted the magazine for publishing controversial caricatures of the Prophet Muhammad. The attack resulted in the death of twelve people, including prominent cartoonists, and sparked debates worldwide on freedom of expression, extremism, and national security. Public reactions swiftly inundated social media platforms, with Twitter being the focal point. Commencing on 7 January, millions of tweets surfaced employing hashtags like #CharlieHebdo and #JesuisCharlie [10,11], which resulted in a massive demonstration on 11 January.
2.2. Data Collection
For each protest, we first identified the two most used hashtags during the event day(s), as already mentioned for each demonstration. Next, we created the universe of users by listing all users who tweeted at least one of those two hashtags on the event day(s), as shown in Table 1. Finally, we collected all the hashtags posted by the universe of users over a broader period, which is also displayed in the same table.
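The three collection steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(user, timestamp, hashtags)`, the function names `build_universe` and `collect_hashtags`, the toy tweets, and the boundaries of the broader window are all hypothetical and are not taken from the actual pipeline.

```python
from datetime import date, datetime

# Hypothetical record layout: (user_id, timestamp, list_of_hashtags). Toy data only.
tweets = [
    ("u1", datetime(2019, 1, 4, 20, 15), ["noaltarifazo"]),
    ("u2", datetime(2019, 1, 5, 9, 30), ["ruidazonacional", "tarifas"]),
    ("u3", datetime(2019, 1, 5, 11, 0), ["futbol"]),
    ("u1", datetime(2018, 12, 28, 8, 0), ["economia"]),
    ("u3", datetime(2018, 12, 29, 8, 0), ["noaltarifazo"]),
]

def build_universe(tweets, seed_hashtags, event_days):
    """Users who posted at least one seed hashtag on an event day."""
    seeds = {h.lower() for h in seed_hashtags}
    return {
        user
        for user, ts, tags in tweets
        if ts.date() in event_days and seeds & {t.lower() for t in tags}
    }

def collect_hashtags(tweets, universe, start, end):
    """All (user, timestamp, hashtag) posts by universe users within [start, end]."""
    return [
        (user, ts, tag.lower())
        for user, ts, tags in tweets
        if user in universe and start <= ts <= end
        for tag in tags
    ]

event_days = {date(2019, 1, 4), date(2019, 1, 5), date(2019, 1, 6)}
universe = build_universe(tweets, ["noaltarifazo", "ruidazonacional"], event_days)
# The broader window below is an arbitrary placeholder, not the period in Table 1.
posts = collect_hashtags(tweets, universe, datetime(2018, 12, 20), datetime(2019, 1, 20))
```

In this toy example, user `u3` is excluded from the universe because the seed hashtag was posted outside the event days, so none of that user's hashtags are collected.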
2.3. Construction of the Networks
We build one-hour temporal networks in which nodes are hashtags. The link weight between two hashtags represents the number of users who posted that pair of hashtags during that hour. We compute modularity and nestedness on our temporal networks. Modularity quantifies the presence of community structure in the network, and we use the Louvain method [12] from the NetworkX package. In general terms, modularity compares the number of edges observed within clusters with the number expected in a comparable-size network in which edges or links are randomly distributed. High modularity means dense intra-community connections but sparse inter-community ones.
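As a minimal sketch of this construction, the following code derives weighted hashtag pairs per hour from a list of posts. It assumes a hypothetical input of `(user, hour, hashtag)` triples and counts a pair for a user whenever that user posted both hashtags within the same hour, not necessarily in the same tweet; the function name `hourly_hashtag_network` is illustrative.

```python
from collections import defaultdict
from itertools import combinations

def hourly_hashtag_network(posts):
    """Return {hour: {(tag_a, tag_b): weight}} from (user, hour, hashtag) posts.

    The weight of a pair is the number of distinct users who posted both
    hashtags within the same hour.
    """
    # hour -> user -> set of hashtags posted by that user in that hour
    tags_by_user = defaultdict(lambda: defaultdict(set))
    for user, hour, tag in posts:
        tags_by_user[hour][user].add(tag)

    networks = {}
    for hour, users in tags_by_user.items():
        weights = defaultdict(int)
        for tags in users.values():
            # each user adds 1 to every pair of hashtags they posted this hour
            for pair in combinations(sorted(tags), 2):
                weights[pair] += 1
        networks[hour] = dict(weights)
    return networks

# Toy data; hours are plain integers here for brevity.
posts = [
    ("u1", 0, "a"), ("u1", 0, "b"),
    ("u2", 0, "a"), ("u2", 0, "b"), ("u2", 0, "c"),
    ("u3", 1, "a"), ("u3", 1, "c"),
]
networks = hourly_hashtag_network(posts)
```

Each resulting dictionary can be loaded into a graph library as a weighted edge list; sorting the hashtags before pairing keeps each undirected edge under a single key.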
We also quantify the presence of nested structure in our networks, i.e., the extent to which the neighbourhood of each node is contained in the neighbourhood of nodes with higher degrees. For measuring nestedness, we used the Nestedness Calculator for Python based on the measurement proposed in [13]. The two forms of self-organisation, modular and nested, are incompatible, as the former is arranged into low-connected communities while a hierarchical structure characterises the latter.
2.4. Entropy
We calculate the entropy as defined in information theory [14], namely as the uncertainty or variability of the probabilities of a specific output. We calculate the entropy or variability in each hour, $h$, by:

$$S(h) = -\sum_{i} p_i(h) \log p_i(h),$$

where $p_i(h)$ is the probability of output $i$ in hour $h$.
We compute the entropy for four different ways of defining the probabilities. First, we calculate the variability of hashtags per hour. For this purpose, we compute, for each hour, the frequency of each hashtag existing in that hour. In this way, we obtain the probability that a user, from the universe of users in that hour, posts each of the Twitter hashtags used in that hour. In that sense, a high entropy value refers to the situation where some hashtags are posted very little while others are posted quite a lot. Conversely, a low entropy value is related to a situation where either only one (or very few) hashtag(s) is (are) posted or all of them are posted with the same frequency.
Secondly, we perform the same calculation for users. After computing the frequency of each user, i.e., the number of hashtags (whether repeated or not) posted in each hour, we calculate the entropy. Hence, low entropy means that only one (or very few) user(s) posts in that hour or that everyone posts the same number of times. Maximum entropy is the case where users posting few and many hashtags are equally likely.
In what follows, we are interested in the diversity connecting users and hashtags. First, we compute the diversity of users per hashtag. For this purpose, we group by hashtags in each hour and compute the frequency of users (normalised number of unique users). Using this probability in the entropy calculation allows us to quantify the variability in how hashtags are more or less shared by different users in that hour. A low entropy value means that roughly all hashtags have been posted by the same number of users. On the other hand, a high entropy indicates that some hashtags were posted by only a few users while others were posted by many.
Finally, we are also interested in quantifying the variability of how users post different hashtags each hour, i.e., the variability of hashtags per user. It makes a difference whether a user posts the same hashtag (or a few of them) or different ones during the hour, h. For this purpose, we grouped by users and computed the normalised number of unique hashtags in each hour. Hence, high entropy means little difference between the number of users posting a few and many different hashtags.
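The four measures can be illustrated with a short sketch. It takes a literal reading of the definitions above, in which the counts in each grouping are normalised to probabilities and passed to the Shannon formula. The natural logarithm, the `(user, hashtag)` input layout, and the names `shannon_entropy` and `hourly_entropies` are assumptions made for illustration. Under this reading a uniform set of counts gives the largest value; if the entropy is instead taken over the distribution of count values, the `counts` argument would have to be a histogram of those values.

```python
import math
from collections import Counter, defaultdict

def shannon_entropy(counts):
    """Shannon entropy (natural log) of counts normalised to probabilities."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts)

def hourly_entropies(posts):
    """posts: iterable of (user, hashtag) pairs observed within one hour."""
    posts = list(posts)
    hashtag_freq = Counter(tag for _, tag in posts)   # posts per hashtag
    user_freq = Counter(user for user, _ in posts)    # posts per user
    users_of = defaultdict(set)                       # hashtag -> unique users
    tags_of = defaultdict(set)                        # user -> unique hashtags
    for user, tag in posts:
        users_of[tag].add(user)
        tags_of[user].add(tag)
    return {
        "hashtags": shannon_entropy(hashtag_freq.values()),
        "users": shannon_entropy(user_freq.values()),
        "users_per_hashtag": shannon_entropy([len(u) for u in users_of.values()]),
        "hashtags_per_user": shannon_entropy([len(t) for t in tags_of.values()]),
    }

# Toy hour: u1 repeats hashtag "a", u2 posts two different hashtags.
posts = [("u1", "a"), ("u1", "a"), ("u2", "a"), ("u2", "b")]
result = hourly_entropies(posts)
```

Applying `hourly_entropies` to each one-hour slice yields four time series that can be inspected without building any network.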
2.5. Modularity and Nestedness
We calculate the modularity in the networks using the community module of the Python ‘python-louvain’ library. The equation for modularity in this library is based on the original definition proposed by Newman and Girvan [15]:

$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$

where:
$Q$ is the modularity score and $m$ is the total number of edges in the network;
$A_{ij}$ is the element in the adjacency matrix representing the connection between nodes i and j;
$k_i$ and $k_j$ are the degrees of nodes i and j, respectively, representing the number of edges connected to each node;
$c_i$ and $c_j$ are the community assignments of nodes i and j, respectively;
$\delta(c_i, c_j)$ is the Kronecker delta function, which is 1 when nodes i and j are in the same community and 0 otherwise.
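To make the Newman and Girvan definition concrete, the sketch below evaluates the modularity of a given partition on a small unweighted graph. It is not the python-louvain implementation: it uses the per-community form $Q = \sum_c [L_c/m - (d_c/2m)^2]$, which is algebraically equivalent to the pairwise sum, where $L_c$ is the number of edges inside community $c$ and $d_c$ is the sum of the degrees of its nodes. The function name and the toy graph are illustrative assumptions.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of an unweighted, undirected graph.

    edges: iterable of (i, j) pairs, each undirected edge listed once, no self-loops.
    community: dict mapping every node to its community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    degree = {}
    internal = {}  # community -> number of edges with both ends inside it
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
        if community[i] == community[j]:
            internal[community[i]] = internal.get(community[i], 0) + 1
    degree_sum = {}  # community -> sum of the degrees of its nodes
    for node, k in degree.items():
        c = community[node]
        degree_sum[c] = degree_sum.get(c, 0) + k
    # Q = sum over communities of [ L_c / m - (d_c / 2m)^2 ]
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Two triangles joined by a single bridge edge (toy example).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {n: "A" for n in range(1, 7)}
q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles gives a positive score, whereas placing all nodes in one community gives zero, which matches the intuition of dense intra-community and sparse inter-community links.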
Similarly, we calculate the nestedness in the networks using the NODF (Nestedness metric based on Overlap and Decreasing Fill) index [13]:
The NODF index compares the overlap (O) of nodes between modules in a network with the total overlap expected in a perfectly nested network.
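The idea behind the two NODF ingredients, paired overlap and decreasing fill, can be illustrated with a simplified sketch. It is not the Nestedness Calculator used in this work: it only averages over pairs from a single set of rows, whereas the standard index also averages over column pairs, and it returns a fraction in [0, 1] rather than the usual 0 to 100 scale. The function name `nodf_rows` and the input layout are assumptions.

```python
from itertools import combinations

def nodf_rows(rows):
    """NODF-style nestedness over one set of rows, as a fraction in [0, 1].

    rows: dict mapping a row label to the set of columns it is linked to.
    """
    labels = list(rows)
    n = len(labels)
    if n < 2:
        return 0.0
    total = 0.0
    for a, b in combinations(labels, 2):
        # order the pair so that `big` has at least as many links as `small`
        big, small = (a, b) if len(rows[a]) > len(rows[b]) else (b, a)
        if len(rows[big]) == len(rows[small]) or not rows[small]:
            continue  # no decreasing fill (or empty row): the pair contributes 0
        # paired overlap: share of the smaller row's links also found in the larger row
        total += len(rows[big] & rows[small]) / len(rows[small])
    return total / (n * (n - 1) / 2)

# Toy examples: each row is a subset of every larger row in `nested`.
nested = {"a": {1, 2, 3, 4}, "b": {1, 2, 3}, "c": {1, 2}, "d": {1}}
disjoint = {"a": {1, 2}, "b": {3}}
equal = {"a": {1, 2}, "b": {1, 2}}
score_nested = nodf_rows(nested)
score_disjoint = nodf_rows(disjoint)
score_equal = nodf_rows(equal)
```

The `equal` case shows why decreasing fill matters: two rows with identical links overlap completely but do not count as nested, because neither has more links than the other.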
4. Discussion
Our results are consistent over the three demonstrations analysed. First, looking at the points of active hours, we can see that, in general, the four entropy-based measurements remain at stable values except at the point of the highest activity, signalled by the red line. At that point, the entropy of hashtags decreases while the entropy of hashtags per user increases. The latter can be understood from the nested structure of the system at that critical point. Namely, hashtags posted by N users, with considerable statistics, are a subset of the hashtags posted by users. This will inevitably lead to a configuration characterised by less diversity of hashtags but, at the same time, high variability in how users share them due to the hierarchical structure in hashtag posting.
Regarding the users’ entropy and entropy of users per hashtag, we cannot say that they present a significant behaviour at the critical point (red line). However, they both attain their maximum values at that temporal point. With respect to our first research question about the primary source of heterogeneity in user activity, we can say that it is how users share hashtags that is more variable. Finally, regarding the second research question, entropy-based measures can well characterise when a system presents nestedness.
Concerning the possible limitations of our study, the entropic analysis does not depend on the network representation, which eliminates possible sources of bias related to the type of representation. However, the data itself could suffer from bias. For example, other points of high nestedness or modularity could exist that are not captured by the Twitter data but would be visible in other media, such as other social networks. This limitation is difficult to overcome, as studies are often based on a single data source.