1. Introduction
Online social networks (OSN), like other types of social organizations, undergo a steady evolution. This is not only caused by the behavior of users while using the network [
1], but even more if new users enter the network, while other users may decide to leave. New links between users are formed online, based on shared information, friendship, common interests, etc., and existing links may be deleted if users leave, or commonalities have changed. Under normal circumstances, such events may not jeopardize the existence of the OSN, in particular if the OSN is still popular and growing. However, losing users and links can pose a serious risk to OSN, even for large and successful ones on the scale of
Twitter and
Facebook. As a recent empirical study [
2] has shown, the dropout of some users can trigger cascades of other users leaving, which quickly accumulates to a level that threatens the existence of the OSN.
To investigate the emergence and the impact of such cascades analytically and numerically, we propose a model that explicitly considers the costs and benefits of users of an OSN.
Benefits are measured in terms of relative
reputation (see
Section 2.1), that is we take into account (i) the
number and the
direction of links between users, and (ii) the impact of the reputation of the counterparts on the reputation of a given user. A gain of reputation can be seen as a benefit of being part of a social network. However, there are also
costs involved in being a member of an OSN, even if they are implicit. These include, among others, the efforts to login, to maintain the individual profile, to learn how to efficiently use the GUI of the provider, or the effort to adapt to changes in the GUI.
Our model extends previous notions of reputation in social networks which are based on the degree or the centrality of a user, measured by traditional centrality metrics [
3]. In contrast to such topological metrics, we define the reputation of a user based on the reputation of those users that are linked to it. That means there is a
value assigned to incoming and outgoing links, as they increase either the own reputation or that of others. This motivates us to consider (i) a
directed network and (ii) a self-consistent
dynamics of how reputation changes with the reputation of others.
The assumption that individual reputation increases with the reputation of connected users is quite common in different ranking schemes. For example, the earliest version of
Google’s PageRank algorithm calculates the rank of a website as a function of the ranks of the sites linking to it. Similarly, the vulnerability of financial institutions can be calculated dependent on the vulnerability of the counterparties connected to it [
4]. In more general terms, such an assumption follows (hetero)catalytic models of prebiotic evolution or foodweb interactions, e.g., from chemistry and biology, where the concentration of a particular (chemical) species depends on the concentration of those species that produce, or feed, it [
5].
Jain and Krishna [
6] have combined these dynamics with a network dynamic that runs on a different time scale. At each time step (measured in network time), the system was perturbed by an extremal dynamic where the least performing node in the network was replaced by a new node that randomly rewires itself back to the system. This model was already analyzed in detail [
7,
8] and extended to cover other phenomena, such as strategic link formation between economic agents [
9].
In this paper, we build on the existing model class, but extend it in several important points: (i) In contrast to the simple extremal dynamics (where only one user is removed), we introduce a condition for users to decide to leave, namely we allow
many more users to leave dependent on their
personal cost-benefit ratio. (ii) We focus on the effects that cascades of users leaving have on the robustness and the performance of the OSN.
Performance is quantified using the long-term average reputation of users (see also
Section 3.1) and is taken as a systemic measure for the OSN rather than as an individual one.
Robustness implies that the decision of users to leave the OSN does not trigger large cascades of other users leaving. We will proxy this by means of the lifetime of the core of the OSN (see also
Section 2.3).
Our aim is not to focus on the
size of cascades, which was done in different network approaches to
systemic risk [
10,
11], but rather to understand the impact that leaving users have on the OSN. As mentioned, users will leave if their costs exceed their benefits. Different from previous publications we
vary this cost, which is the crucial parameter in our model. Interestingly, we find that both the robustness and the performance of the OSN are maximized for a
non-zero cost of usage.
To explain this, we must investigate how directed links impact the reputation across the OSN. This leads us to the problem of reciprocity: If a user contributes to the reputation of others, she may expect that these users in return also increase her own reputation. Mutual directed links between two users, e.g., , would indicate direct reciprocity. However, users can also indirectly increase their reputation if they are part of a cycle, e.g., , which indicates indirect reciprocity.
While direct and indirect reciprocity is advantageous for the OSN from a user perspective, it is not well understood how reciprocity adds to systemic properties such as the robustness of the OSN. We will address this question by a mathematical analysis that explains the impact such cycles on the core-periphery structure of the OSN. Users that are part of a cycle are well integrated in the OSN, maintain a high reputation and have little incentives to leave. However, if users leave because of a negative cost-benefit ratio, this will impact the number and sizes of such cycles, which we need to understand by means of an analytic approach that is complemented by computer simulations.
Our paper is organized as follows:
Section 2 introduces the model and the analytical framework to analyze the structure of an OSN. Different Appendices allow the study of this framework by following concrete step-by-step examples. In
Section 3 we present the results of computer simulations, to demonstrate the emergence of an optimal cost value, and we discuss the impact of different model parameters on the structure and the performance of the OSN. A discussion of the general insights and concluding remarks are provided in
Section 4.
2. A Reputation Model
2.1. Costs and Benefits
Why do users leave an online social network (OSN)? A rational answer should be, they leave because their costs of staying in the network exceed, at a given point in time, their benefits of being members. This can be expressed by the dynamics:
Here,
characterizes the current state of user
i at time
t as a binary variable:
means that the user is part of the network and
means that at time
t the user leaves.
is the Heavyside function which returns 1 if
and 0 if
. Thus, the current state of user
i depends on the difference between its benefits
and costs
.
In the following, we assume that the benefits of a user to join, and to stay, in the OSN result from the reputation that the user receives from being connected to other users in the OSN. There are many social networks that operate this way: in Twitter users get a reputation from the number of their followers, in product review communities like Amazon or YouTube users earn their reputation from the votes of other users. In our model, we particularly assume that the reputation of a user, i, does not just depend on the sheer number of other users that follow i, but also on their reputation. In other words, if a user j with a high reputation connects to user i, the latter receives more reputation than from a follower with low reputation.
Such a reputation measure can be explicitly displayed on the site, like the Reddit karma or the RG score of Researchgate, which increases with the reputation of followers and the feedback of the community. On the other hand, user reputation can be implicit and not part of a user profile, but can still be perceived through the activity of other users. Examples of this implicit reputation are retweets in Twitter and likes in Facebook.
We can express the impact of the followers on the reputation of a user by the following dynamics:
The coefficients
’s are elements of the adjacency matrix of the OSN,
A. They represent the link between users
j and
i in the OSN at time
t. These are unweighted, but directed links, because it makes a difference whether user
j follows user
i, or the other way round. If
then there is a link from
j to
i, in which case
j is called a
follower of
i.
indicates the absence of directed connection from
j to
i. Since a user cannot follow herself, we set
for all
t.
The sum in Equation (
2) is over all users that can potentially link to
i. That means if several users leave the OSN, they are replaced by the same number of new users joining the OSN. This way the total number of users,
N, is kept constant. This is a first, and not necessarily the most realistic, approximation to consider an
entry and exit dynamics, which can be refined in subsequent investigations. The second term on the r.h.s. of Equation (
2) takes into account the effort to keep a certain level of reputation. Without several followers, the reputation of user
i cannot be maintained and thus decays exponentially over time with a rate
. The larger
, the higher the effort to maintain the reputation.
To relate reputation to the benefits of users in the OSN, we could simply assume that benefits grow in proportion to reputation. On the other hand, the costs to stay in the OSN can be assumed to be a constant
equal for all users, which can be seen as the effort to login and stay active. Please note that there is no cost involved in maintaining links. According to Equation (
2) and dependent on the specific social network, the reputation of users can grow to large numbers, while on the other hand it can become infinitely small, but never zero. Because we are more interested in the reputation of users
relative to others, we will rescale the benefits from the reputation by the largest value
, which makes different networks more comparable. Hence, the benefits and costs are specified in this model as follows:
Please note that in contrast to Equation (
3), it is common to express the relative reputation in terms of the sum of the individual reputations,
:
In the context of an OSN, however, this is unrealistic, as it requires that either the total reputation or everyone’s reputation is public knowledge. For this reason, we posit that users compare their reputation to the most reputable individual, who is often visible in rankings of user reputation. We note that despite the conceptual difference, a solution to Equation (
3) can be mapped directly to a corresponding solution to Equation (
4), by normalizing
with respect to
as follows:
In
Appendix A, we further show that an equilibrium solution to Equation (
2) is also an equilibrium for
and
up to a scaling factor. This means that the entry/exit dynamics introduced in
Section 2.3 is invariant to the particular way in which users evaluate their relative reputation.
According to Equation (
1), users leave the OSN at time
t if their
relative reputation is lower than a fixed threshold. Their links
are then set to zero, which according to Equation (
2) reduces the reputation of other users
j at the next time step. This can lead to cascades of users leaving the OSN at consecutive time steps. The aim of our paper is to understand how a decrease of users’ motivation to stay because of an increase in their fixed costs
will impact the OSN. Therefore, as a next step, in
Section 2.2 we first investigate how the reputation depends on the social network, before turning to the entry/exit dynamics in
Section 2.3.
2.2. Quasi-Stationary Equilibrium
Let us first discuss the reputation dynamics of users for a
fixed social network. Expressing the dynamics of
from Equation (
3) yields (see
Appendix A):
where
z is the index of the individual with highest absolute reputation
. The first term describes the reputation boost that individual
i obtains from all her followers. The second term is a scaling factor and represents the reputation decay with strength equal to the total boost in reputation that user
z receives.
2.2.1. Eigenvalues and Eigenvectors
The set of Equations (
6) forms a linear dynamical system of coupled first-order differential equations with the initial conditions described by the vector
. The (constant in time)
forms the elements of an adjacency matrix
of size
(see
Figure 1).
At equilibrium we require
, hence
In matrix form, this equation becomes
If
is an eigenvector of
with a corresponding eigenvalue
, then rescaling
will also produce an eigenvector, i.e.,
is a solution to Equation (
8). In this case the scaling factor
gives the
th component,
, of the vector
. Since
, it follows that
.
The dynamics of Equation (
8) are dominated by the largest eigenvalue of
and the corresponding eigenvector gives us the steady-state reputation values. Let us illustrate this by the didactical example of a rather small network shown in
Figure 1.
The characteristic polynomial determining the eigenvalues of the corresponding adjacency matrix
given in
Figure 1 is
and the largest eigenvalue is
. The corresponding eigenvector gives us the absolute reputation:
. Rescaling this eigenvector by
gives the relative reputation in equilibrium:
.
2.2.2. Core-Periphery Structure
To facilitate the interpretation of the relative reputation values in
Figure 1, let us take a look at the corresponding network structure. This toy network already shows a topological feature typical for many social networks, known as a
core-periphery structure [
12,
13,
14,
15,
16,
17,
18]. Following Borgatti and Everett [
19], Everett and Borgatti [
20], the core is defined as a cohesive subgroup (e.g., a clique,
n-clique,
n-club or
n-clan,
k-plex) and the periphery is everything else.
Accordingly, and accounting for the directionality in reputation-based OSNs, we define the
core,
Q, as the
largest strongly connected component (SCC) in the OSN. Each node in the SCC is reachable from all other nodes in the SCC. The
periphery consists of all nodes that do not belong to the core. In
Appendix B we characterize the structure of the code
Q as it depends on the largest eigenvalue of the adjacency matrix of the OSN
. In
Figure 1, the core is the SCC formed by users
1,
2,
3, who mutually boost each other’s reputations. The periphery consists of users
4 and
5 who only benefit from being connected to the core. The periphery usually contains simple
chains of users (
4 →
5) which emanate from the core, in this case from user
2. The core itself does not contain simple chains, but
cycles, that is
closed directed chains that involve
users [
7]. In the example, we observe two cycles,
1 →
2→
1 and
1→
2→
3 →
1, i.e., users can be part of more than one cycle. Moreover, all users have one follower each, except for user
1 who has two followers, thus her reputation can be expected to be higher than that of the others. Please note that even though users
4 and
5 have the same number of followers, their reputation is different, as user
4 has a more reputable follower belonging to the core.
More formally, a
cycle is defined as a subgraph in which there is a closed path from every node of the subgraph back to itself. Cycles and structures of interlocking cycles represent
irreducible subgraphs. The core
Q must always contain at least one cycle to qualify as a strongly connected component. In
Appendix B we show the dependency between the largest eigenvalue of the adjacency matrix of the network and the length and number of the cycles in the core.
2.2.3. Direct and Indirect Reciprocity
The shortest cycle,
, expresses
direct reciprocity. In the example in
Figure 1 users
1 and
2 mutually follow each other and boost each other’s reputations. For
, we have
indirect reciprocity, i.e., user
2 follows
3; however,
3 does not follow
2, but follows
1 instead, and only
1 may follow
2, thus closing the loop.
Direct reciprocity is very common in OSN, e.g., in twitter or google+ it is seen as good practice to link back to someone that has chosen to follow you or to have you as his/her friend. Likewise, likes, +1, or shared posts often take reciprocity into account. Compared to this, indirect reciprocity is more difficult to detect. To boost interaction along a chain of followers, and to hopefully close the loop, some OSN, e.g., google+ or researchgate, indicate for each follower the number of additional users that the user and the follower both have in common. This may increase the likelihood of creating shortcuts and shorter cycles.
2.2.4. Length of Simple Chains
At equilibrium,
, we can insert the eigenvector
corresponding to
into Equation (
6) to get
This means that in the long run, the reputation
of user
i is equal to the sum of the reputations of all users
j that follow
i,
attenuated by a factor
[
6,
7].
Equation (
10) allows us to draw some conclusions about the maximum length of simple chains involving peripheral users. In the example discussed, we note that the reputations of users
4 and
5 are related by the attenuation factor in Equation (
10) such that
,
, and in general
for a chain of length
n. If we require a simple chain to be exactly of length
n,
and
must hold. In other words, the
th peripheral user finds it beneficial to stay while the
th leaves. Hence, we obtain for
n:
since we require
n to be an integer value. The maximum length of a simple chain in a core-periphery network thus depends on the cost level
, the connectedness within the core
Q expressed by the largest eigenvalue
, and the relative reputation of the core user who connects the core to the chain (in our example user
2).
2.2.5. Unstable Cores
Imposing the condition
in Equation (
11) requires that
and
which holds only if the OSN contains
cycles. Without these cycles,
and the core-periphery structure breaks down. The condition,
, requires that the core user
2, who connects to the simple chain, needs to obtain a net gain from participating in the OSN. Otherwise, due to the attenuation factor in Equation (
10), none of the users in the chain would have an incentive to stay in the network.
The special case of represents an important exception. In this case, and provided that , the length of the simple chain is no longer bounded, since all users in the chain would have the same relative reputation. More importantly, however, network structures like these are very unstable, because core users have a reputation comparable to the periphery. In fact, peripheral users can often obtain a higher reputation from multiple connections to the core, which reduces the benefit of the core users and increases their likelihood of leaving. In contrast to peripheral users, the leave of core users considerably affects other core users that are part of the same cycle. This starts cascades of users leaving and thus destroys the core. Without the core, the periphery would not be able to sustain its reputation and would break down as well.
2.2.6. Number and Length of Cycles
Unstable core-periphery structures can be avoided as long as
interlocking cycles appear in the core. These contain users involved in multiple cycles which in turn receive a much higher reputation and increase the benefit (the relative reputation), also for others. Both the number of cycles in a network and their length have an impact on the largest eigenvalue
as illustrated in
Appendix B. In general, we can conclude that
increases with the
number of cycles, but
decreases with the
length of the cycle (keeping all other variables constant).
The number of cycles in the network further depends on the average density m (average number of links per user), a parameter discussed in the next section when we introduce the dynamics for the network.
2.3. Network Dynamics
In the previous section, we have explained that the reputation dynamics of Equation (
2), for a
fixed network, converges to an equilibrium state in which the relative reputations of users are fixed. This convergence time defines the time scale for the reputation dynamics. Dependent on their stationary reputation value that defines their benefit in relation to their costs
, Equation (
3), users can decide to leave the OSN, Equation (
1), and will be replaced by new users joining the OSN. As described above, we assume that
, to keep
N constant.
T is the time at which entry and exit happen. We assume that the time scale for entry and exit, i.e., for changing the network structure, can be separated from the time scale of the reputation dynamics which is much shorter. This means that users make their decision based on the quasi-stationary benefit, which can only change after the network has changed.
If a user leaves the OSN, all her (incoming and outgoing) links are set to zero. The newcomer will establish the same number of incoming and outgoing links on average but, assuming that she does not know all other users, these links will be distributed at random. We assume that each user follows on average m other users. Precisely, there is a constant probability p that a new user links to the other users, and is a constant related to the average density of a random OSN.
Our major interest in this paper is in the role of the cost
that according to Equations (
1), (
3) defines the level at which users will leave the OSN, measured in terms of the relative reputation. The latter is between zero and one, so
gives the fractional benefit that must be reached to stay in the OSN. In our computer simulations, we will vary this level from zero to 0.5 to study the impact of increasing costs.
would imply no costs. To still allow for a network dynamics in this case, we apply the so-called extremal dynamics. In this case, we choose the user with the lowest relative reputation, and force her to leave the OSN, to allow a new user to enter. In case of several users with the same low b value, we choose one of them randomly. The other limiting case would imply that all N users will leave and be replaced by a completely new cohort. Then, the network at every time T starts as a new random network, which has no chance to evolve. Hence, small or intermediate values of would be most appropriate.
4. Discussion
In this paper, we have extended previous works on online social networks (OSN) in different respect: (a) we have studied directed networks of users and their followers; (b) users are characterized by a dynamic variable, their reputation, which is quantified not by simple topological centrality measures, but by a non-local measure, eigenvector centrality; (c) to model changes in reputation, we use the idea of hetero-catalytic processes, i.e., cycle-related feedback, which is not often used in describing the dynamics of online social networks. These features are reflected in an agent-based model, which captures several generic aspects of OSN. We summarize some of these in the following.
4.1. Core-Periphery Structure
Most OSNs are characterized by a core of active users that are closely linked to each other, and a periphery of less active users that are loosely connected to the core. Empirical studies have shown [
2,
21] that the core of OSN is not only quite large, compared to the size of the OSN, but also “deep”, i.e., most users are well integrated in the OSN. Our model can reproduce such a structure, but also to explain its origin, as a combination of direct and indirect reciprocity. The former implies mutual directed links between two users, whereas the latter occurs if users are part of closed cycles and thus benefit from others through the cycle.
4.2. Reputation Dynamics
Many models of OSN take the number of “friends”, i.e., direct links between users into account and take degree and traditional centrality metrics as proxies of a user’s reputation [
22,
23,
24]. Our model, in contrast, not only considers that such social relations are
directional (i.e., a link from user
j to
i does not necessarily implies a link from
i to
j), but also explicitly takes the different impact of users into account, weighting it by means of a (generalized)
reputation. Notably, reputation is not determined by the user herself, instead it results from the interaction with others. A user is said to have a high reputation if it is followed by many other users with high reputation. In this way, the
value or "quality" of one’s followers is taken into account, in addition to the sheer
number. Reputation that is not actively maintained will fade out in the course of time, so that users that are not able to attract followers automatically lose in impact.
4.3. Cost-Benefit Analysis
In our model, users join the OSN for a purpose. That means they bear costs, to obtain some benefit. If their costs are higher than their benefits, users leave the OSN. Hence, our model includes an entry-exit dynamics which is motivated by rational arguments. We have assumed that a user’s benefit increases with the number of followers it attracted, which in turn boost its reputation. Hence, benefits are measured by the relative reputation of a user, scaled by the maximum reputation in the OSN. The latter allows us to better compare OSNs of different sizes and link densities. Costs, on the other hand, are assumed to be fixed, they include for example the effort of using the website or to maintain a profile.
Our main focus in this paper was on the role of the cost incurred for every active user. Increasing would imply a pressure on users to leave because it worsens their cost-benefit relation. This short-term negative effect, however, has a positive influence on the core-periphery structure of the OSN. As we demonstrate in this paper, an optimal cost level can increase the long-term benefit of users in terms of a higher relative reputation.
4.4. Performance
To estimate the impact of increasing costs, we have defined the long-term benefit averaged over the whole OSN, , as a performance measure. Intuition would suggest that monotonously decreases with increasing costs up to a point where the whole OSN collapses. Interestingly, this picture does not hold for comparably small cost values. On the contrary, a small cost up to improves the situation, i.e., the long-term benefit increases compared to a reference case without any cost (where no user would leave). In other words, a small cost forces those users to leave which were never able to attract any follower. This, in our model, gives way to new users that might be more successful in this respect.
4.5. Robustness
As a second important insight, we analyze both mathematically and by means of computer simulations how increasing costs change the structure, and hence the robustness, of the OSN. Our model is set up in a way that it allows the expression of the outcome of the reputation dynamics by , the largest eigenvalue of the adjacency matrix that describes the interaction between users. We verified that an increasing cost leads to higher values of , which means a smaller, but more compact core and, most remarkably, in an increasing likelihood to have more than one follower. This, on the other hand, increases the direct and indirect reciprocity that characterizes the core.
Robustness implies that cascades of users leaving will not destroy the core, i.e., the lifetime of the core can be used as a proxy measure of robustness. We have shown that this lifetime is maximized for the optimal cost value .
We emphasize that the above conclusions are obtained from a model that, as with every model, only captures part of the features of real OSN. The value of the model is in what it can produce despite some of the simplified assumptions. Our insights now allow us to specifically search for optimal cost-benefit relations, or to test other assumptions to calculate the benefit.
Why is this issue important? First, there
are costs involved in being a member of an OSN, even if they are implicit. These include, among others, the efforts to login and to retain one’s own social network (e.g., by maintaining regular information stream), the effort of learning how to efficiently use the GUI of the provider, or the effort to adapt to changes in the GUI. The latter can make it harder to maintain social contacts, at least temporarily. As Garcia et al. [
2] argued, it was ill-timed interface changes that caused the massive dropout of users from
Friendster, which came at a time when new competitors, such as
Facebook, were ramping up in popularity.
There is a second issue involved in this discussion. Even if most users enjoy participating in an OSN free of charge, companies would like to know these users would respond, if at some point in time costs such as monthly membership fees are introduced. Will members remain loyal, or will this lead to a massive exodus of users, making the OSN less attractive for investors? Subsequently, if users leave how would this affect the OSN? What will be the impact of less active users leaving compared to core users leaving? What incentives should be introduced to keep power users engaged?
With our agent-based modeling framework that consider both entry-exit dynamics and cost-benefit considerations of users, we can address such questions. It also helps to better understand the relation between the integration of users and the overall performance of the OSN, which allows to the formal consideration of issues such as user satisfaction and service quality.