1. Introduction
Polarization is a segregation process based on the construction of a mindset around an assumed confrontation between identities [1]. The polarization process does not take into account what the people of both groups may have in common, but rather their differences, creating a dynamic of rejection that can lead to growing hostility and discrimination. Ultimately, this can lead to situations in which intolerance gives rise to hate messages that are produced, in this case, on social media networks, but that can be reflected in the offline world [2]. Within such an environment, certain groups or individuals may become radicalized to violent extremism [3].
The sixteenth Sustainable Development Goal [4] promotes peaceful and inclusive societies, providing access to justice for all and building effective, accountable, and inclusive institutions at all levels. This goal aligns with the broader human rights framework by promoting societies that respect and uphold individual rights, such as the right to privacy, freedom of expression, and access to information. In accordance with this goal, the emergence of polarization in discussions on social media networks, and the responsibility of companies in this problem, is a topic attracting significant interest in society [5], although the complete relationship between digital platforms and polarized attitudes remains unclear [6]. For example, there is some work indicating that, at least for polarization in people's views of political parties, the increased use of social networks and the internet may not necessarily be increasing polarization [7]. In this sense, a systematic literature review has been developed by Iandoli et al. in [8]. The results of their analysis showed that social media contribute to increasing polarization, either by amplifying and escalating social processes that also occur offline, or in specific ways enabled by their design affordances, which also make these platforms prone to manipulation.
Additionally, polarizing content is often the target of hoaxes and fake news [9]. Thus, the ability to detect polarization is essential for the early detection of misinformation. Indeed, misinformation and disinformation are a threat to society. For example, political disinformation can interfere with elections in democratic countries, while medical misinformation can pose a public health threat [10]. For this reason, the European Union is concerned about the risk of disinformation and misinformation, and has decided to tackle it [11].
Online social networks are platforms for social and communicative interaction. Studies have shown that polarization varies greatly across platforms [12], with the strength of the results being a function of how polarization is measured [13]. Previous work has studied the presence of polarization in different concrete examples, trying to analyze the relevant characteristics in these cases. For example, the works [14,15] studied the emergence of so-called echo chambers, where users are found to interact mainly or only with the users that agree with them, and to have very few interactions with other groups. However, online discussions in social networks can also show polarization where there are answers with negative sentiment between different groups, which can be considered the most worrying situation. For example, ref. [16] studied hyper-partisanship and polarization on Twitter during the 2018 Brazilian presidential election. Their findings showed that there was more interaction within each group (pro-/anti-Bolsonaro), but there was also interaction between both groups. Indeed, there are also cases where the interaction between groups can be more relevant than that within groups, as in the case studied in [17], where the analysis of the 2016 U.S. elections on Reddit showed a significant number of interactions between pro-Trump and pro-Hillary supporters. So, the extent to which bias is due to online echo chambers and filter bubbles is also poorly understood, with evidence pointing in opposite directions [6]. A filter bubble (https://www.techopedia.com/definition/28556/filter-bubble, accessed on 12 August 2024) is the intellectual isolation that occurs when websites use artificial intelligence (AI)-driven algorithms to selectively present information based on user behavior, such as clicks, browsing history, search history, and location. This can lead to users being exposed to only one side of an issue, which can make it difficult to form an informed opinion.
A major problem is that divisive content tends to spread widely and quickly on social media. Posts that express moral outrage [18] or bash one's outparty [19], for example, tend to be particularly successful at going viral. This virality may be due, in part, to the prioritization made by social media algorithms, coupled with people's possible inclination to favor sensational content. As a consequence, when some people log on to social media, they are likely to see content that is divisive and presses their emotional buttons. In addition, these trends encourage politicians, media outlets, and potential influencers to publish divisive content, because it is the most likely to generate the engagement they desire.
It is worth noting that people exposed to a comments section when reading an online news article tend to express a more extreme view on the topic than people who read the same article without being exposed to the comments [20]. Thus, online polarization can lead to a more polarized population.
To try to mitigate the factors that may be helping the spread of divisive content, Facebook, for instance, has launched some initiatives [21], even though this kind of content may be the type that attracts the most attention from its users, and thus also the type that produces the greatest economic benefit.
Another approach to the analysis of polarization is proposed in the works of Auletta et al. [22] and Candogan et al. [23], which consider, in an analytical way, the effect of the social media algorithm on the opinions of members. On the one hand, in [22], the authors studied how the strength of the influence of the social media platform and the homophily ratio affect the probability of users reaching a consensus, and how these factors can determine the type of consensus reached. The homophily ratio measures how much individuals weigh the opinions of similar individuals compared to those of the rest of the population. In particular, they showed that when the homophily ratio is large, social media has very little capacity to determine the outcome of the opinion dynamics. However, when the homophily ratio is low, the social media influence can play an important role in the dynamics, either by making it harder to reach a consensus, or by inducing a consensus on extreme opinions. On the other hand, in [23], the authors defined an opinion dynamics model where social network users update their opinions based on their neighbors' opinions and on the content shown to them by the platform. Specifically, they showed that users' opinions approximately converge to some limiting opinion, which is either consensus, where all users agree, or persistent disagreement, where users' opinions differ.
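Such opinion dynamics can be illustrated with a small simulation. The sketch below is not the model of [22,23]: the update rule, the weights a and b, and the fully synchronous scheme are simplifying assumptions made for illustration. It does, however, reproduce the two qualitative outcomes mentioned above, consensus and persistent disagreement.

```python
def opinion_step(x, neighbors, platform, a=0.4, b=0.2):
    """One synchronous update: each user mixes their own opinion (weight
    1 - a - b), the average opinion of their neighbors (weight a), and the
    content the platform shows them (weight b). Opinions lie in [-1, 1]."""
    new = []
    for i, xi in enumerate(x):
        ns = neighbors[i]
        avg = sum(x[j] for j in ns) / len(ns) if ns else xi
        new.append((1 - a - b) * xi + a * avg + b * platform[i])
    return new

# Fully connected network, neutral platform content: opinions contract
# towards a consensus.
x = [1.0, -1.0, 0.5, -0.5]
nbrs = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
for _ in range(200):
    x = opinion_step(x, nbrs, [0.0] * 4)

# Two homophilous clusters with the platform reinforcing each side:
# persistent disagreement.
y = [0.2, 0.1, -0.1, -0.2]
clusters = [[1], [0], [3], [2]]
for _ in range(200):
    y = opinion_step(y, clusters, [1.0, 1.0, -1.0, -1.0])
```

With the first network, the spread max(x) − min(x) shrinks towards 0 (consensus); with the second, the two clusters settle close to +1 and −1, respectively.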
Because each online social network company can have its own private interests regarding how and when to control anomalous and extremist behaviors, one fundamental aspect is to define more transparent ways to monitor such possibly undesirable behaviors, so that we can inform solutions for creating healthier social media environments. From our point of view, the goal to be achieved in this direction is to provide mechanisms that allow us to detect situations where one can deduce that polarization is taking place within an online debate, and to quantify its severity by means of a criterion that can be measured.
With this goal in mind, and given that Reddit turned out to be a decisive political debate platform in different elections in the United States, such as those of 2016, in previous work [24] we introduced the problem of finding the most polarized bipartition of a weighted and labeled graph that represents a Reddit debate, where nodes represent users' opinions and edges represent agreement or disagreement between users.
To find the most polarized bipartition of a Reddit debate, our model considers the premise that, in a debate, interactions can occur in two ways: within groups and between groups. Then, an important aspect of our polarization measure is that it allows us to quantify the relevance of both kinds of interactions within the debate, and the polarization degree of the bipartition, which is based on how homogeneous each partition is, and how negative the interactions are between the two partitions.
Our measure does not try to establish a correlation between the polarization of a particular online debate and the true opinions of its users. That is, the measure takes into account only the specific comments expressed in a particular debate.
Also, in [24], we showed that finding this target bipartition is an optimization problem that can be seen as a generalization of the maxcut problem, and we introduced a basic local search algorithm to find approximate solutions. Later on, in [25], we also studied the complexity of solving real instances from Reddit discussions with both complete and approximate algorithms. These results showed that real Reddit instances could be solved with little computational cost as the size of the instances increased, and that all the instances, obtained from different subreddits, tended to show polarization values away from neutral discussions, although without reaching the case of extreme polarization.
Our aim in this paper is to further investigate the complexity of this problem by introducing an instance generation model, where a single parameter controls the polarization of the instances in such a way that it correlates with the average complexity of solving those instances. The average complexity results we obtain are consistent with our hypothesis: the higher the polarization of the instance, the easier it is to find the corresponding polarized bipartition. Although one can consider alternative ways to measure polarization in online social networks, we believe that our generation model could be used to easily check other measures. The results obtained with our generation model, together with the previous results on real instances from Reddit discussions, seem to indicate that, in practice, it may be feasible to compute the polarization of online discussions, at least with a polarization measure like the one we use here. Therefore, in view of the experimental results, checking polarization with this measure seems to be difficult only in unrealistic cases. Clearly, it is advantageous to offer algorithms for monitoring polarization in online social networks.
The structure of the rest of the paper is as follows. In Section 2, we present both the model of weighted and labeled graphs that represent an online debate, and the measure to quantify the polarization in such debate graphs, studied and developed in [24]. In Section 3, we prove that the computation of such a measure is NP-hard. In Section 4, we describe the algorithms that we use to solve the bipartite polarization problem. In Section 5, we introduce our random generation model for user debate graphs, based on a main parameter that controls the polarization of the instances. In Section 6, we characterize a case of the bipartite polarization problem that can be solved in linear time. Finally, in Section 7, we perform an empirical evaluation to study how the complexity of solving the typical instances obtained with our random model changes as we modify this parameter, and how this, at the same time, affects the polarization of the instances.
2. Problem Definition
Polarization corresponds to the social process whereby a group of persons is divided into two opposing subgroups with conflicting and contrasting positions, goals, and viewpoints, with few individuals remaining neutral or holding an intermediate position [26]. The model we consider in this work is inspired by the model defined in [27] to identify supporting or opposing opinions in online debates, based on finding a maximum cut in a graph that models the interactions between users. Moreover, as we are interested in quantifying polarization, we consider a model based on a weighted graph with labeled edges, where node weights represent the side of the user in the debate, and edge labels represent the overall sentiment between two users. Then, given a bipartition of this graph, the polarization degree of the bipartition is based on how homogeneous each partition is, and how negative the interactions are between both partitions. Finally, the measure of debate polarization is based on the maximum polarization obtained over all the possible bipartitions of the graph.
Adapting from [24], an online debate Γ on a root comment r is a non-empty set of comments that originated as successive answers to r. An online debate is modeled as a two-sided debate tree, where nodes are labeled with a binary value that denotes whether the comment is in agreement (1) or in disagreement (−1) with the root comment. Notice that in [24] we introduced this model specifically for Reddit debates. However, it is clearly a model suitable for other, similar social networks.
Definition 1 (Two-Sided Debate Tree). Let Γ be an online debate on a root comment r. A Two-Sided Debate Tree (SDebT) for Γ is a tuple ⟨C, E, W, S⟩, defined as follows:
For every comment in Γ, there is a node in C.
Node r is the root node of the tree.
If a comment c answers another comment c′, there is a directed edge (c, c′) in E.
W is a labeling function for answers (edges) that assigns to every edge (c, c′) a value W(c, c′) in [−2, 2], denoting the sentiment of the answer c with respect to c′, from highly negative (−2) to highly positive (2).
S is a labeling function for comments (nodes) that assigns to every node a value in {−1, 1}, denoting whether the comment is in agreement (1) or in disagreement (−1) with the root comment r, and it is defined as follows:
- S(r) = 1;
- For all other nodes c in C, S(c) = 1 if, for some node c′ with (c, c′) in E, either S(c′) = 1 and W(c, c′) ≥ 0, or S(c′) = −1 and W(c, c′) < 0; otherwise, S(c) = −1.
Only the nodes and edges obtained by applying this process belong to C and E, respectively.
Next, we present the formalization of a User Debate Graph based on a Two-Sided Debate Tree, where now all the comments of the same user are aggregated into a single node that represents the user’s opinion.
Definition 2 (User Debate Graph). Let Γ be an online debate on a root comment r with users' identifiers U, and let ⟨C, E, W, S⟩ be an SDebT for Γ. A User Debate Graph (UDebG) for the SDebT is a tuple ⟨V, E′, S_V, S_E⟩, where:
V is the set of nodes, defined as the set of users' opinions: for every user u ∈ U, there is a node that aggregates all the comments of u in the SDebT.
E′ is the set of edges, defined as the set of interactions between different users in the debate: there is an edge (u, u′), with u ≠ u′, if and only if some comment of user u answers some comment of user u′.
S_V is an opinion weighting scheme for V that expresses the side of users in the debate based on the side of their comments. It assigns to every node a value in the real interval [−1, 1], which expresses the side of the user with respect to the root comment, from strict disagreement (−1) to strict agreement (1), going through undecided opinions (0).
S_E is an interaction weighting scheme for E′ that expresses the overall sentiment between users by combining the individual sentiment values assigned to the responses between their comments. It assigns to every edge (u, u′) a value w in [−2, 2] that expresses the overall sentiment of user u regarding the comments of user u′, from highly negative (−2) to highly positive (2).
Only the nodes and edges obtained by applying this process belong to V and E′, respectively.
Given a User Debate Graph, a model to measure the level of polarization in the debate between its users was also introduced in [24]. It is based on two characteristics that a polarization measure should capture. First, a polarized debate should contain a bipartition of the set of users into two sets (L, R), such that the set L contains mainly users in disagreement, the set R contains mainly users in agreement, and both sets are similar in size. The second ingredient is the sentiment between users of L and R. A polarized discussion should contain most of the negative interactions between users of L and users of R, whereas the positive interactions, if any, should be mainly within the users of L and within the users of R.
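As an illustration, the following sketch implements one plausible instantiation of these two ingredients, together with a brute-force maximization over all bipartitions. The concrete formulas (homogeneity of each side times a balance factor for the consistency part, normalized cross-edge negativity for the sentiment part, and their product as the combined value) are simplifying assumptions made for this sketch, not the paper's exact definitions.

```python
from itertools import combinations

def cons(L, R, S):
    # Consistency and balance: L should hold mostly disagreeing users
    # (S < 0), R mostly agreeing ones (S > 0), with similar sizes.
    if not L or not R:
        return 0.0
    hom_l = sum(-S[v] for v in L) / len(L)              # in [-1, 1]
    hom_r = sum(S[v] for v in R) / len(R)               # in [-1, 1]
    balance = min(len(L), len(R)) / max(len(L), len(R))
    return max(0.0, hom_l) * max(0.0, hom_r) * balance

def sent(L, R, W):
    # Normalized negativity of the edges crossing the bipartition.
    cross = [w for (u, v), w in W.items()
             if (u in L and v in R) or (u in R and v in L)]
    if not cross:
        return 0.0
    return sum(max(0.0, -w) / 2.0 for w in cross) / len(cross)

def bipartite_polarization(S, W):
    # Brute-force maximization over all 2^n bipartitions (n small!).
    users = list(S)
    best, best_split = 0.0, (frozenset(), frozenset(users))
    for k in range(len(users) + 1):
        for subset in combinations(users, k):
            L = frozenset(subset)
            R = frozenset(users) - L
            p = cons(L, R, S) * sent(L, R, W)
            if p > best:
                best, best_split = p, (L, R)
    return best, best_split

# A tiny fully polarized debate: two users against (-1), two in favor (1),
# negative sentiment across the groups, positive within them.
S = {'a': -1.0, 'b': -1.0, 'c': 1.0, 'd': 1.0}
W = {('a', 'c'): -2.0, ('b', 'd'): -2.0, ('a', 'b'): 2.0, ('c', 'd'): 2.0}
value, (L, R) = bipartite_polarization(S, W)   # value == 1.0, L == {a, b}
```

On this toy instance, the search recovers the fully polarized split {a, b} versus {c, d}. The exponential loop over bipartitions is precisely what makes the exact problem hard as the number of users grows.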
To capture these two characteristics with a single value, two different measures are combined in a final one, referred to as the bipartite polarization.
Definition 3 (Bipartite polarization). Given a User Debate Graph and a bipartition (L, R) of its set of nodes, we define:
The level of consistency and balance of (L, R), cons(L, R), a real value in [0, 1] that is higher the more homogeneous each of the two sides is and the more similar in size they are.
The sentiment of the interactions between users of different sides, sent(L, R), a real value in [0, 1] that is higher the more negative the interactions between users of L and users of R are.
Notice that, according to [25], each edge label can incorporate a correction factor that is used to modify the final weight used in the sent function. However, to simplify the model notation, in this work we consider that the interaction weighting scheme already reflects this factor.
Then, the bipartite polarization of the graph on a bipartition (L, R) is a value in the real interval [0, 1] that combines cons(L, R) and sent(L, R). Finally, the bipartite polarization of the graph is the maximum of this value among all possible bipartitions (L, R).
5. A Random Generator Model for UDebG Instances
We present a random generator of UDebG instances, where the goal is to control the expected bipartite polarization of the instance by means of a single polarization parameter that lies in the range (0, 1], and where the number of nodes (users) is given by a parameter m. Note that, in the limit case where the parameter equals 0, all the nodes would have an opinion value of 0. This extreme case corresponds to the trivial problem where the bipartite polarization is zero for any bipartition. For this reason, the value 0 is not considered in our random generator.
The generation process consists of the two following steps:
Generation of the set of nodes with their opinion values. Each node is generated with an opinion value obtained with a bimodal distribution that arises as a mixture of two truncated normal distributions: the first defined on the interval [−1, 0] and the second, analogously, on the interval [0, 1], with means and standard deviations that depend on the polarization parameter. So, with this bimodal distribution, the values are concentrated around the two modes, and the standard deviation is inversely related to the parameter, so the higher the value of the parameter, the more tightly the values concentrate around the modes.
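A sketch of this first step, using only the standard library (rejection sampling for the truncated normals). Since the exact formulas for the means and standard deviations are not given in this extract, the parameterisation below (means at −p and +p, standard deviation (1 − p)/2 plus a small floor eps) is an assumption that merely reproduces the qualitative behavior described above.

```python
import random

def truncated_normal(mu, sigma, lo, hi, rng=random):
    """Rejection-sample a normal(mu, sigma) restricted to [lo, hi]."""
    while True:
        x = rng.gauss(mu, sigma)
        if lo <= x <= hi:
            return x

def generate_node_values(m, p, eps=0.05, rng=random):
    """Opinion values for m users: a 50/50 mixture of truncated normals
    on [-1, 0] and [0, 1].  Assumed parameterisation: means -p and +p,
    standard deviation (1 - p) / 2 + eps, so the two modes tighten
    around -1 and 1 as p approaches 1."""
    sigma = (1.0 - p) / 2.0 + eps
    return [truncated_normal(-p, sigma, -1.0, 0.0, rng)
            if rng.random() < 0.5 else
            truncated_normal(p, sigma, 0.0, 1.0, rng)
            for _ in range(m)]
```

For p close to 1, almost all users get opinions near −1 or 1; for p close to 0, the opinions cluster around 0, yielding a nearly neutral discussion.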
Generation of the set of edges with their sentiment values. For each node i, we randomly select a set of k target vertices, with k randomly selected up to a maximum number of out edges, to build outgoing edges from i to those vertices. The sentiment value of each edge (i, j) is generated with a truncated normal distribution on the interval [−2, 2], with a mean and standard deviation that depend on the opinion values of users i and j and on the polarization parameter, as follows: when the users i and j are on the same side (their opinion values are both positive or both negative), the mean of the distribution is positive, and the more similar their opinion values are, and the closer the polarization parameter is to 1, the closer the mean is to 2. By contrast, when the users i and j are on different sides, the more different their opinion values are, and the closer the parameter is to 1, the closer the mean is to −2. Observe that the sign of the mean depends on the signs of the opinions of users i and j.
Regarding the absolute value of the mean, it depends on the opinion values of both users, but more on that of user i. This is because, in principle, the overall sentiment of the answers from user i to user j is not necessarily equal to that of the answers from user j to user i, although it is natural to think that there will be some positive correlation between them. In any case, we think that it makes sense to force the sign of the mean to be the same when generating the sentiment in both directions. As for the absolute value, it is natural that users with stronger opinions (those with opinion values closer to −1 or 1) have stronger sentiments towards the rest of the users.
The overall intention of this way of selecting the distribution parameters is to move from a UDebG instance corresponding to a clearly neutral discussion when the polarization parameter approaches 0, to one corresponding to a highly polarized discussion when it approaches 1. We control this by making the expected values of both the opinion and sentiment weights approach the neutral value 0 as the parameter approaches 0. Observe, however, that even in that case the standard deviation for the generation of sentiment values is not 0. This allows for the generation of some negative and positive values around the mean, to ensure a more realistic generation of users where not all the answers to any user are strictly neutral. Then, as the parameter approaches 1, the opinion values tend to be more tightly concentrated around −1 or 1, and the sentiment values more concentrated around the extreme values −2 and 2, as the standard deviations of both distributions tend to their minimum.
We set that, when the parameter equals 1, the standard deviations for the generation of opinion and sentiment values are not 0, but very low, because we do not believe that, in practice, one encounters real discussions with such an extreme polarization value. So, we have preferred instead to move towards a zone of high polarization without reaching the extreme case. However, it is very easy to expand the range of possible polarizations by simply using other values to move the standard deviations of both distributions closer to 0 as the parameter approaches 1.
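Analogously, the sentiment value of an edge can be sketched as follows. The concrete mean (sign given by the product of the opinion signs, magnitude weighting the source opinion twice as much as the target's and scaled by p) and the standard deviation ((1 − p) plus a floor eps) are assumptions made for illustration; the extract does not give the exact formulas.

```python
import random

def edge_sentiment(s_i, s_j, p, eps=0.1, rng=random):
    """Sample the sentiment W(i, j) in [-2, 2] from a truncated normal.

    Assumed parameterisation: the sign of the mean follows the product of
    the opinion signs; its magnitude weighs the source opinion s_i twice
    as much as s_j and scales with p; the standard deviation (1 - p) + eps
    shrinks as p approaches 1."""
    sign = 1.0 if s_i * s_j >= 0 else -1.0
    mu = 2.0 * p * sign * (2.0 * abs(s_i) + abs(s_j)) / 3.0
    sigma = (1.0 - p) + eps
    while True:                      # rejection sampling on [-2, 2]
        w = rng.gauss(mu, sigma)
        if -2.0 <= w <= 2.0:
            return w
```

Same-side pairs with strong opinions get sentiments near 2, opposite-side pairs near −2, and everything flattens towards 0 as p decreases.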
To select the number of out edges for each node, we use a slowly growing function of m as the maximum number of out edges per node. Once more, we use such a function because we believe it better represents the number of interactions of real debates in social networks. If we compute the maximum out degree and mean out degree of each instance used in [25] (instances obtained from the authors of the paper), we obtain that the mean of the maximum out degrees of all the instances is 4.05, and the mean of the mean out degrees is 0.89, where the instances have a median number of nodes of around 50. Notice that there are some nodes (users) that only answer the root node. As the answers to the root node are not considered in the UDebG, these nodes have no out edges, so it is possible to have a mean out degree below 1 in the UDebG. With those numbers in mind, this function allows us to limit the maximum out degree to a realistic number of edges as m increases and, at the same time, keep a low mean value.
6. A P-Time Solvable Case
In this section, we present a simple case of the bipartite polarization problem that can be trivially solved in linear time. It is an interesting case because the characteristics of its instances are clearly more present in the instances of the random generator model presented in the previous section as the polarization parameter tends to 1. So, the fact that this case can be solved in polynomial time supports the experimental results that we show in the next section, where the higher the value of the parameter, the lower the typical complexity of solving the instances of the random generator.
The instances of this case satisfy the restriction that every negative edge connects users on opposite sides, and every positive edge connects users on the same side. That is, this case represents those instances where the only negative interactions are between users on opposite sides, and the only positive ones are between users on the same side. Observe that this case does not only contain the extreme case of bipartite polarization equal to one, as it also contains instances where the number of users with positive opinion value can be as different as we want from the number of users with negative opinion value.
Then, it turns out that an optimal bipartition for any such instance is simply (L*, R*), where L* contains the nodes with negative (or zero) opinion value and R* contains the nodes with positive opinion value, because any other bipartition (L, R) has a bipartite polarization smaller than or equal to that of (L*, R*). To see why, consider the two following sets that define the differences between (L, R) and (L*, R*): the first contains the nodes with negative (or zero) opinion value that are not in L but in R; similarly, the second contains the nodes with positive opinion value that are not in R but in L.
Given these differences between (L, R) and (L*, R*), it turns out the following. Regarding the consistency and balance value, if (L, R) is different from (L*, R*), then at least one of the two difference sets is not empty, so the consistency and balance value of (L, R) will be smaller than or equal to that of (L*, R*). Observe that it will be equal only when one difference set is empty and every vertex v in the other has opinion value 0.
Regarding the sentiment value of (L, R), it can be equal to or smaller than that of (L*, R*). It will be equal when all the negative edges between L* and R* are also present between L and R, all the positive edges between vertices within L* are also present within L, and all the positive edges between vertices within R* are also present within R. It will be smaller when there are some negative edges within L or within R, so that these negative edges are not present between L and R; or it will also be smaller when there are positive edges between L and R that, in (L*, R*), lie within L* or within R*. So, the sentiment value of (L, R) will be smaller than or equal to that of (L*, R*). Then, as it can be checked in linear time whether an input instance satisfies the restriction of this class, this class can be solved in polynomial time.
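A sketch of this linear-time procedure: check the restriction with one pass over the edges and, if it holds, return the bipartition that puts the users with non-positive opinion value on one side and the rest on the other. The names and the dictionary-based encoding are illustrative choices, not the paper's notation.

```python
def ptime_bipartition(S, W):
    """Linear-time solution for the restricted class: if every negative
    edge crosses sides and every positive edge stays within a side,
    return the optimal bipartition (L*, R*) with L* the users whose
    opinion value is negative or zero; otherwise return None."""
    for (u, v), w in W.items():
        same_side = (S[u] <= 0) == (S[v] <= 0)
        if w < 0 and same_side:      # negative edge inside a side
            return None
        if w > 0 and not same_side:  # positive edge across sides
            return None
    L = frozenset(v for v in S if S[v] <= 0)
    return L, frozenset(S) - L

# A compliant instance: the check passes and the split is immediate.
S = {'a': -0.8, 'b': -0.5, 'c': 0.7, 'd': 0.9}
W = {('a', 'c'): -2.0, ('a', 'b'): 1.5, ('c', 'd'): 2.0}
```

Note that the check visits every edge exactly once, which is what makes the whole procedure linear in the size of the instance.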
Even though none of the solving algorithms presented in Section 4 explicitly checks for this case, it turns out that the random initialization of the bipartition in the local search algorithm creates a bipartition that is very close to the optimal one for this case when the opinion values are close to the extreme values −1 and 1. Therefore, at least for the instances generated by our random generator when the polarization parameter tends to 1, it is highly likely that the initial bipartition generated by the local search algorithm is very close to the optimal one.
7. Experimental Results
In this section, we present an empirical evaluation of the complexity of solving the instances obtained with our random generator for different values of the polarization parameter and an increasing number of nodes (users). Our main interest is to understand how the complexity of the exact algorithm based on integer programming changes as the parameter moves from almost 0 to 1. As a second goal, we also want to compare the quality of the approximate polarization value returned by the local search algorithm. The results reported in [25] indicate that, for real instances coming from Reddit discussions, the median time complexity was always very low and that the approximation of the polarization value given by the local search algorithm was always almost identical to the exact value. The experiments were performed on an Intel® Core™ i5-6400 CPU at 2.70 GHz with 16 GB of RAM.
The results of the median time to solve the instances, as well as their median bipartite polarization, are shown in Figure 1. For each combination of the polarization parameter and m, the median value shown in the figure is obtained from a set of 50 instances produced with our random generator. We have used values of the parameter close to 0 because, as one can observe, the median time to solve the instances increases quickly as m increases for very low values of the parameter, but as we move away from 0 the median time decreases abruptly. At the same time, the median bipartite polarization of the instances increases slowly for low values of the parameter, but then it starts to increase more quickly as the parameter increases. So, these results are consistent with our hypothesis about the relation between polarization and the complexity of solving the instances.
If we compare these results with the ones obtained with real instances from Reddit in [25], which have a median number of nodes of around 50, we observe that they are also consistent, in the sense that in the Reddit discussions used in that paper the median polarization of the instances was around 0.5, and the median time was around 2 s, using the same hardware used here. To make a more complete comparison of the complexity, we have also computed the median number of nodes explored by the exact algorithm based on the SCIP solver (the same one used in that previous paper), given that in our previous paper we also reported the median number of nodes of the SCIP search tree.
Figure 2 shows the median number of nodes for the same instances as Figure 1. As we can observe, the number of nodes of the SCIP search tree follows the same qualitative behavior as the time and, for instances with median polarization around 0.5, the median number of nodes would be around 1 (by interpolating between our parameter values), as happens in the results of our previous paper with the Reddit instances.
Although our choice of the maximum number of out edges for each node in our random generator is consistent with the characteristics of the Reddit instances solved in [25], we wanted to check that the complexity patterns observed in the previous experiments were not a result of too low a value for the number of out edges. This is relevant because the NP-hardness results for maxcut in [28], and the problem we have used to show the NP-hardness of our problem, are valid for graphs with bounded degree 3, whereas the graphs generated for the experiments in Figure 1 and Figure 2 have an average degree of around 1, given the maximum value used for m (40). So, in order to eliminate the possibility of easy instances due to low-degree graphs, we have generated a second set of instances of larger size, with m ranging from 40 to 90, and where the maximum out degree of the instances is now increased, giving the value 6 as the maximum out degree for vertices, with 3.5 being the average degree.
Figure 3 shows the same plots as Figure 1, but obtained with these new instances with higher vertex degrees. Notice that the plots now start at a higher value of the polarization parameter, since the complexity of solving these instances for lower values is expected to be greater than in the previous experiments, given the higher values of m. In the left plot, we can observe that the complexity in CPU time increases as we lower the parameter value, but this increase does not seem exponential, at least in the range considered. The right plot shows the median bipartite polarization of the instances, and we can see that its behavior is very similar to that of the previous experiment. So, these results confirm that, even when increasing the vertex degree, the typical complexity of solving the instances seems to be tightly correlated with the value of the polarization parameter: the exponential time increase seems clearer for low values of the parameter, whereas higher values show a much slower increase in the typical complexity.
Finally, we were also interested in comparing the quality of the approximation provided by the local search approach, presented in Section 4, with that of the exact algorithm. Table 1 shows, for each combination of the number of nodes and the value of the polarization parameter, the minimum, median, and maximum polarization obtained in each set of instances with the exact algorithm (columns labeled SCIP BipPol). Observe that, for higher values of the parameter, we observe more variation in the polarization of the instances (difference between the min and max values). Then, in the next three columns, we show, for the same three instances of the previous columns (the instances with minimum, median, and maximum polarization), the ratio of the approximate polarization computed by the local search solver to the exact polarization of the instance (LS ratio). As we can observe, the quality of the solutions computed by the local search algorithm is perfect for instances with high parameter values (which are precisely the ones that are easily solved by the exact algorithm), and for lower values we observe only a tiny relative error. Overall, we can say that the quality of the solution provided by the local search algorithm is always satisfactory, with the advantage that its computational complexity is always linear with respect to the number of nodes, so there is no hard region for the local search approach.