Next Article in Journal
Casimir Friction between Dense Polarizable Media
Next Article in Special Issue
Low-Temperature Behaviour of Social and Economic Networks
Previous Article in Journal
Optimization of Curvilinear Tracing Applied to Solar Physics and Biophysics
Previous Article in Special Issue
Exploring the Characteristics of Innovation Adoption in Social Networks: Structure, Homophily, and Strategy
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

What Do Leaders Know?

Abdus Salam International Centre for Theoretical Physics, Strada Costiera 11, Trieste 34151, Italy
*
Author to whom correspondence should be addressed.
Entropy 2013, 15(8), 3031-3044; https://doi.org/10.3390/e15083031
Submission received: 18 June 2013 / Revised: 19 July 2013 / Accepted: 22 July 2013 / Published: 26 July 2013
(This article belongs to the Special Issue Social Networks and Information Diffusion)

Abstract

:
The ability of a society to make the right decisions on relevant matters relies on its capability to properly aggregate the noisy information spread across the individuals of which it is made. In this paper, we study the information aggregation performance of a stylized model of a society, whose most influential individuals—the leaders—are highly connected among themselves and uninformed. Agents update their state of knowledge in a Bayesian manner by listening to their neighbors. We find analytical and numerical evidence of a transition, as a function of the noise level in the information initially available to agents, from a regime where information is correctly aggregated, to one where the population reaches consensus on the wrong outcome with finite probability. Furthermore, information aggregation depends in a non-trivial manner on the relative size of the clique of leaders, with the limit of a vanishingly small clique being singular.

The Chinese famines of 1958–1961 killed, it is now estimated, close to thirty million people [...]. The so called Great Leap Forward initiated in the late 1950s had been a massive failure, but the Chinese government refused to admit that and continued to pursue dogmatically much of the same disastrous policies for three more years [...]. In 1962, just after the famine had killed so many millions, Mao made the following observation, to a gathering of seven thousand cadres: “Without democracy, you have no understanding of what is happening down below; the situation will be unclear; you will be unable to collect sufficient opinions from all sides; there can be no communication between top and bottom; top-level organs of leadership will depend on one sided and incorrect material to decide issues [...]”.
(from A. Sen [1])

1. Introduction

Amartya Sen [1] argues that famine and other catastrophes are easily avoided in a democracy. This argument relies on the fact that where information can freely diffuse, decision makers can form an unbiased picture of the state of a society and take proper measures. Biases due to individual opinions are expected to be washed out in the information aggregation process, a phenomenon often referred to as the “wisdom of crowds” [2]. Still, cases of information aggregation failure abound even in democratic societies (while writing, Turkey is in a state of turmoil exacerbated by the fact that the democratically elected government failed to properly understand and respond to the issues that were raised [3].). For example, in the aftermath of the 2008 Lehman Brothers bankruptcy, former Federal Reserve chairman, Alan Greenspan, expressed his state of “shocked disbelief” during his hearing before the US Committee of Government Oversight and Reform, leaving the public opinion to wonder: where did he get the information the policy of the FED was based on? Al Gore [4] argues that shortcuts between decision makers and the media are often such that the former are not in the best position to be informed about what is going on.
A number of models on social dynamics have addressed the issue of information aggregation (see, e.g., [5] for an excellent review). The simplest is probably the voter model, which entails agents taking the same opinion of a randomly chosen node amongst their neighbors. This allows for sharp predictions [6,7], where, generally, the information aggregation process converges to the incorrect outcome with finite probability. Other contributions have, instead, proposed different opinion dynamics mechanisms, such as majority rules [8] or social impact models [9], which support different conclusions. These models, however, come short in their micro-economic foundation, as the interaction mechanism is somewhat arbitrary.
More detailed micro-economic models of social learning have been proposed in economics literature. It is well known that information aggregation may fail when agents free ride on the information gathered by others, without seeking independent sources. This phenomenon, called rational herding [10], is also supported by experimental evidence [11].
A sequel of papers have focused on Bayesian learning schemes (At odds with models of rational herding, where agents deduce signals from the behavior of others, in Bayesian learning schemes, agents exchange the full probability distribution on the signals they have received. This avoids the loss of information, which lies at the heart of rational herding. This phenomenon in the context of a social network is discussed, e.g., in [12]) [13,14,15,16], coming to the generic conclusion that when agents update their beliefs following Bayes’ rule, society correctly aggregates information (still, see [17,18,19]). Some authors have focused on the impact of dominant groups of individuals on the aggregation of information. For example, Bala and Goyal [20] introduced the notion of “Royal Family” as a group of agents whose behavior is observed by anyone else. Alternatively, Golub and Jackson [21] defined t-step “prominent groups” as those groups whose behavior eventually influences all other agents within time, t. Regardless of the specific definitions, these and other studies unanimously highlighted the negative role that exceedingly influential groups have on the information aggregation process.
In this paper, we focus on an extremely stylized model of a society, and we address the issue of whether information distributed across the population is able to diffuse to an uniformed, well-connected clique of decision makers. Our model assumes Bayesian learning, but differently from [13,14], who study a continuum of agents, we study a finite, but large population of agents connected by a social network. On a finite network, when agents talk repeatedly with their peers, they may not be able to disentangle what in their peer’s opinion is new information and what reflects information exchanged in previous interactions, including the one provided by themselves to them. This phenomenon, called “persuasion bias” in [22], introduces a non-trivial positive feedback and leads to information aggregation failure, at odds with the conclusions of [13,14].
The main conclusions of our paper can be summarized in two points: (i) Information aggregation crucially depends on the synchronicity of the information updates of different agents: in the extreme case of a parallel update dynamics, where we can derive analytical results, information diffusion leads to the correct outcome in the limit of a very large society or for very informative initial signals. When the fraction of agents who update their beliefs at each time step is lower than a critical threshold, the society converges with finite probability to the wrong outcome, no matter how large the society is; (ii) In the case of parallel dynamics, information aggregation degrades as the size of the clique of uninformed agents gets smaller. In particular, the limit of a vanishingly small clique of uninformed agents behaves markedly differently from the case of a homogeneous society (with no clique). Both results suggest that it might not be wise to rely on crowds in situations that are reminiscent of those prevailing in our societies, where update is sequential and the social network is characterized by highly connected cliques (news corporations, political parties).
The paper is organized as follows. In Section 2, we detail the network structure and the information update rules, as well as our quantitative measure of a social network’s ability to correctly aggregate information. In Section 3, we provide analytic results for the case of parallel information update. In Section 4, we numerically compare these results with those obtained in sequential update schemes. We conclude the paper with a few conclusive remarks in Section 5.

2. Dynamics and Network Topology

Following [13,14], let us consider a population of agents who need to decide whether a certain event, X, will occur ( X = + 1 ) or not ( X = 1 ). Let us denote the prior probability of X as P 0 { X = + 1 } = ν = 1 P 0 { X = 1 } , where ν ( 0 , 1 ) .

2.1. Core-Periphery Network Structure

As already stated, our interest is mainly focused on the information aggregation process as performed by societies where a fraction of individuals matters much more than the vast majority of the population. In the language of networks, the most obvious measure of the importance of a node is its degree, i.e., the number of neighbors. For this very reason, throughout the rest of the paper, we shall focus on a highly stylized society structure, where only few nodes have a large degree, which we build starting from a connected regular graph, where all N nodes have degree c O ( 1 ) . Then, a randomly chosen set, H , made of N H non-neighboring sites are connected among themselves, thus forming a clique of nodes with degree N H + c 1 (this construction is such that each hub has exactly c links connecting it to nodes outside, H ). In the following, we shall be mostly interested in the case, N H c , i.e., when H becomes a group of mutually connected hubs.

2.2. Initial Beliefs

At time t = 0 , each agent i H , receives signals about event X, which are independently drawn from a probability distribution, P H { s | X } . We assume these signals to be informative [13,17], i.e., P H { s i = ± 1 | X = ± 1 } > P H { s i = ± 1 | X = 1 } i , and we focus on the particular case:
p = P H { s = X | X } = 1 P H { s = X | X } .
On the other hand, agents i H —the leaders—are assumed to be initially uninformed. This means that their signals are independently drawn from a probability, P H { s i | X } = 1 / 2 for s i , X = ± 1 .

2.3. Belief Update Dynamics

In our model, agents repeatedly exchange information with their neighbors. In this exchange, the generic agent i collects a certain number n of signals that we denote by s i = ( s i ( 0 ) , s i ( 1 ) , s i ( n ) ) , where s i ( 0 ) are the initial signals discussed above. Given this information set, s i , by Bayes’ theorem [23] the agent’s state of knowledge about X is quantified by the conditional probability:
P { X | s i } = P { s i | X } P 0 { X } P { s i } ,
where P { s i } is the probability of the signals s i . Notice that the likelihood ratio of P { X = + 1 | s i } and P { X = 1 | s i } does not depend on P { s i } . If the agent believes that the different signals are independent, then:
P { s i | X } = a = 0 n P { s i ( a ) | X } ,
and the logarithm of the likelihood ratio, which embodies the state of information of agent i, can be described by a single variable θ i :
θ i = log P { X = + 1 | s i } P { X = 1 | s i } log ν 1 ν = a = 0 n log P { s i ( a ) | X = + 1 } P { s i ( a ) | X = 1 } .
At t = 0 , agents have just one signal. Then, we have n = 0 , and the above expression reduces to the very compact form:
θ i = s i ( 0 ) log p 1 p .
When two agents, say i and j with signals s i and s j respectively, meet, they communicate by exchanging signals, and, as a result, their state of knowledge changes. Indeed, if s i s i = ( s i , s j ) , then θ i θ i = θ i + θ j . Likewise, if s j s j = ( s i , s j ) , then θ j θ j = θ i + θ j .
Starting from an initial state of knowledge θ i ( t = 0 ) , for i = 1 , , N , one can think of different types of information updating. Our assumption will be that at each time step, t = 1 , 2 , , a certain fraction Φ = N Φ / N (where N Φ N ) of randomly selected agents update their state of knowledge by listening to their neighbors. Therefore, assuming that agents in the set I t = { i 1 , i 2 , , i N Φ } are the ones to update their information at time t, one has:
θ i ( t + 1 ) = θ i ( t ) + j = 1 N a i j θ j ( t ) , i I t ,
where a i j is the ( i , j ) element of the adjacency matrix, A = { a i j } i , j = 1 , , N , i.e. a i j = a j i = 1 if agents i and j are connected, and a i j = a j i = 0 if they are not.
Clearly, the above dynamics has two limiting cases: Φ = 1 / N and Φ = 1 . The former describes cases where agents update their information one at a time, and we shall refer to this particular situation as random node sequential (RNS) dynamics. The latter case, instead, describes a parallel dynamics, where all agents simultaneously update their state of knowledge. This information update rule was initially proposed in [16] and, due to its analytical tractability, represents the most frequent choice in social learning models. In the following, we also shall investigate this type of dynamics, and then explore other cases in Section 4.
The dynamics in Equation (6) is unbounded, i.e. each θ i will either diverge to + or . Thus, information aggregation properties can be assessed simply by looking at the signs of the θ i s in the long run. Thus, a good measure of information aggregation is given by the “magnetization” of the system:
Θ ( t ) = 1 N i = 1 N sign ( θ i ( t ) ) .
The quantity X Θ ( t ) tells what is the fraction of the population holding the right information on event X at time t. A quantitative measure of information aggregation is given by the probability P { X Θ ( t ) > 0 } that the majority will converge to the true outcome in an ensemble of repeated trials.

3. Parallel Dynamics

According to the parallel dynamics prescription, all agents in a social network listen to their neighbors at any time, t = 1 , 2 , and update their state of knowledge accordingly:
θ i ( t + 1 ) = θ i ( t ) + j = 1 N a i j θ j ( t ) , i .
By collecting all θ i ( t ) s into a column vector, | θ ( t ) (here, we switch to bracket notation, i.e. we denote by | w the column vector with components w 1 , w 2 , , and by w | the corresponding row vector), the dynamics described in Equation (8) can be rewritten as:
| θ ( t ) = ( 1 + A ) t | θ ( 0 ) .
The above equation clearly suggests that the spectral properties of the adjacency matrix A play a crucial role in the time evolution of the state of knowledge vector | θ ( t ) . Being symmetric, the adjacency matrix A yields N real eigenvalues, λ 1 λ 2 λ N , whose corresponding eigenvectors | λ i ( i = 1 , , N ) form an orthogonal set in R N . By decomposing the adjacency matrix as A = i = 1 N λ i | λ i λ i | , one can see that, for large enough times, Equation (9) becomes:
| θ ( t ) ( 1 + λ 1 ) t λ 1 | θ ( 0 ) | λ 1 .
As is well known from the Frobenius-Perron theorem [24], all components of the eigenvector | λ 1 , corresponding to the largest eigenvalue of the adjacency matrix A, share the same sign, which we shall assume to be positive from now on. Thus, in light of the relation in Equation (10), two main points become apparent:
  • For large enough times, | θ ( t ) is proportional to | λ 1 , meaning that all agents in the network either learn the correct value of X or they all get it wrong.
  • The sign of the components in | λ 1 is completely determined by the sign of the overlap λ 1 | θ ( 0 ) , so that the probability of the whole network learning the right information reads:
    P { X Θ ( t ) > 0 } = P { X λ 1 | θ ( 0 ) > 0 } .
In the following, we shall compute the probability Equation (11) for the simple network topology discussed above.
For the sake of simplicity, let us assume X = + 1 , so that the probability in Equation (11) is equivalent to the probability of the scalar product λ 1 | θ ( 0 ) being positive, and that each agent is initially given one signal, s = ± 1 , at time t = 0 . Assuming that hubs, i.e. nodes in the clique H , have no initial information ( θ i ( 0 ) = 0 for i H ), such a scalar product can be written as a sum over the N N H sites not belonging to H :
Y = λ 1 | θ ( 0 ) = i H λ 1 ( i ) θ i ( 0 ) ,
where λ 1 ( i ) denotes the i-th component of the first eigenvector and θ i ( 0 ) = s i log p 1 p (see Equation (5)). A good approximation scheme to estimate the probability of the quantity in Equation (12) being positive is via the central limit theorem: as a matter of fact, the scalar product in Equation (12) is the sum of N N H random variables, each given by the product of two random variables: y i = λ 1 ( i ) θ i ( 0 ) . Thus, the probability of Y in Equation (12) being positive is approximately given by:
P { Y > 0 } 1 2 erfc μ Y 2 σ Y ,
where μ Y and σ Y denote the mean and standard deviation, respectively, of the random variable Y. Given the independence of the θ i s and the eigenvector components λ 1 ( i ) s, such two quantities are given by:
μ Y = ( N N H ) μ θ μ λ
σ Y 2 = ( N N H ) μ θ 2 σ λ 2 + μ λ 2 σ θ 2 + σ θ 2 σ λ 2 ,
where μ θ and σ θ denote the mean and standard deviation of the random variables θ i , whereas μ λ and σ λ denote the mean and standard deviation of the eigenvector components λ 1 ( i ) for i H . Computing μ θ and σ θ is easy. Recalling that signals must be informative (see Equation (1)), one has p = P { s = + 1 | X = + 1 } > 1 / 2 . Let us rewrite such probability as p = ( 1 + x ) / 2 with x ( 0 , 1 ) . Then, one can immediately verify that:
μ θ = x log 1 + x 1 x
σ θ = 1 x 2 log 1 + x 1 x .
As regards μ λ and σ λ , good approximate expressions for them can be computed by employing standard perturbation theory up to the second order (see Appendix A for the details). With the leading order in N, one gets:
(16) μ λ c f ( 1 f ) N 3 / 2 σ λ 2 c ( 1 f ( c + 1 ) ) f 2 ( 1 f ) 2 N 3 ,
where f = N H / N denotes the fraction of hubs in the network. As can be seen from the inset in Figure 1, the above approximations are in excellent agreement with results obtained from numerical diagonalization of adjacency matrices, especially for large network sizes.
Figure 1. The prediction by Equation (18) for the probability of correct information aggregation (solid line) is compared with the results of numerical simulations run with the parallel dynamics of Equation (9). Each dot represents the empirical estimate of such a probability for a given value of x computed as the fraction (over 10 4 samples) of networks that reached consensus on the true value of X. All simulations were performed on networks with c = 4 and f = 0 . 05 for different values of the system size N (shown in the plot). Inset: Comparison between the large N approximations (solid lines) for μ λ and σ λ in Equation (19) and the corresponding quantities estimated by averaging over the top eigenvectors | λ 1 of 100 network configurations. As can be seen, for large enough values of N the empirically measured mean and standard deviation are in excellent agreement with the approximations in Equation (16). In all cases, we have c = 4 and f = 0 . 05 .
Figure 1. The prediction by Equation (18) for the probability of correct information aggregation (solid line) is compared with the results of numerical simulations run with the parallel dynamics of Equation (9). Each dot represents the empirical estimate of such a probability for a given value of x computed as the fraction (over 10 4 samples) of networks that reached consensus on the true value of X. All simulations were performed on networks with c = 4 and f = 0 . 05 for different values of the system size N (shown in the plot). Inset: Comparison between the large N approximations (solid lines) for μ λ and σ λ in Equation (19) and the corresponding quantities estimated by averaging over the top eigenvectors | λ 1 of 100 network configurations. As can be seen, for large enough values of N the empirically measured mean and standard deviation are in excellent agreement with the approximations in Equation (16). In all cases, we have c = 4 and f = 0 . 05 .
Entropy 15 03031 g001
Plugging Equations (15) and (16) into Equation (14), one can eventually compute the probability of converging to the right value of event X, as in Equation (13):
P { Y > 0 } 1 2 erfc x N 2 c f ( 1 f ) 1 f ( 1 + c x 2 ) .
As already stated, we are mostly interested in cases where only a few nodes in the network play the role of hubs, i.e. f 1 : in this case, the probability in Equation (17) further simplifies to the following remarkably simple expression:
P { Y > 0 } 1 2 erfc x N 2 f c .
In Figure 1, the prediction by the above equation is compared with the results of numerical simulations for c = 4 and f = 0 . 05 , and for several different system sizes N: all results are in very good agreement with Equation (18) (all data points are rescaled in order to collapse on the function erfc ( x ) / 2 ).
A few comments are in order on the approximate result of Equation (18). Since erfc ( 2 ) / 2 1 , according to Equation (18) for each system size N correct information aggregation happens with a probability that, for all practical purposes, can be considered equal to one when the initial signals’ informativeness is p p 0 = ( 1 + x 0 ) / 2 , where x 0 = 2 2 / ( N f c ) . This point essentially means that for any population size N correct information aggregation is possible, for informative enough initial signals, despite the presence of a fraction f of dominant nodes. Such a result shows that the presence of a group of individuals with large influence does not necessarily jeopardize correct information aggregation. Moreover, the threshold value x 0 is inversely proportional to N , meaning that large populations will be able to aggregate information correctly as soon as signals are informative, i.e. as soon as p is slightly larger than 1 / 2 . This is essentially a stronger statement of previous results obtained for infinite networks (see, for example, [17]), where the presence of signals with arbitrarily large informativeness, combined with the lack of individuals with unbounded influence, is identified as a sufficient condition for correct information aggregation. On the other hand, for p < p 0 the population reaches consensus on the wrong value of X with non-zero probability.
A very interesting role in the information aggregation process is played by the fraction of hubs f. In Figure 2, one can see how, for a fixed system size N, the probability of correct information aggregation behaves when increasing the fraction f of hubs in the network. Furthermore, it is rather interesting to compare such results with the information aggregation capabilities of a regular graph, where all nodes have the same degree c. In such a case, one can immediately verify that the first eigenvector of the adjacency matrix is uniform with all components equal to 1 / N , and the probability of the scalar product in Equation (12) being positive simply reduces to the probability of the sum i = 1 N θ i ( 0 ) being positive. Therefore, one can compute the probability of correct information aggregation of a regular network with easy central limit theorem considerations, analogous to those already presented in this Section. Such a probability does not depend on c and reads:
P RN { Y > 0 } = 1 2 erfc x N 2 ( 1 x 2 ) 1 2 erfc x N 2 ,
where the last approximation holds for large values of N. As one can see, Equation (18) reduces to the above expression for f = c 1 (though, numerically, one does not find perfect agreement between the two, since Equations (17) and (18) represent good approximations only for very low values of f). Therefore, the lesson to be learned from the plots in Figure 2 is two-fold. First, one can see that as soon as a very small clique of uninformed hubs is introduced, in a regular graph, the overall population’s ability to correctly aggregate information decreases sharply. This can also be understood by observing that the probability in Equation (17) does not recover the regular network (RN) result, Equation (19), when considering vanishingly small fractions of hubs, i.e.
lim f 0 P { Y > 0 } P RN { Y > 0 } .
On the other hand, whenever a clique of hubs is present in the network, then information aggregation can actually be improved by increasing the size of the clique itself, up to the point (for f c 1 ) where the aggregation ability of the original regular graph can almost be reproduced.
Figure 2. Probability of correct information aggregation as a function of the informativeness level of initial signals. The solid line refers to the case of a regular graph with N = 10 4 (see Equation (19)). The other data points refer to networks with N = 10 4 and c = 4 with different fractions of hubs f. As can be seen, “perturbing” a regular graph with the introduction of a very small fraction of hubs seriously reduces the network’s information aggregation performance. Increasing the fraction of hubs up to f c 1 progressively restores the regular graph levels of aggregation. Each data point was obtained by averaging over 10 4 independent networks.
Figure 2. Probability of correct information aggregation as a function of the informativeness level of initial signals. The solid line refers to the case of a regular graph with N = 10 4 (see Equation (19)). The other data points refer to networks with N = 10 4 and c = 4 with different fractions of hubs f. As can be seen, “perturbing” a regular graph with the introduction of a very small fraction of hubs seriously reduces the network’s information aggregation performance. Increasing the fraction of hubs up to f c 1 progressively restores the regular graph levels of aggregation. Each data point was obtained by averaging over 10 4 independent networks.
Entropy 15 03031 g002
Intuitively, the above findings can be altogether understood in the following terms. According to our setting, all hubs in the clique H are mutually connected and have a degree equal to N H + c 1 . This means that each hub has exactly c neighbors outside H , so that one can expect roughly c N H = c f N nodes to fall within the clique’s neighborhood, H . Therefore, for very low values of f, H contains a negligibly small number of nodes, which, however, will largely influence the initially uninformed hubs whenever they communicate for the first time. Given the small size of H , its initial state of knowledge will be much more sensitive to fluctuations in the initial signal’s distribution among agents. On the other hand, when f c 1 the number of nodes in the neighborhood of H becomes of order N, hence much more robust with respect to fluctuations.
In summary, the role of hubs in our model is subtle, as a handful of them is enough to heavily damage the good information aggregation properties of a population of equals (as modeled by a regular graph), whereas increasing their number also has “healing” effects, which can restore such good properties.

4. General Dynamics

So far, we only have considered the most popular and widely used evolution rule for the information propagation on a network, i.e. the parallel dynamics introduced in Equation (8). However, as already discussed in Section 2, parallel dynamics represents one of the two extreme cases of the general dynamics in Equation (6), according to which a fraction Φ of agents listens to their neighbors at each time step t, i.e. the case Φ = 1 . The other extreme case is the already mentioned RNS dynamics ( Φ = 1 / N ), according to which agents update their state of knowledge one at a time.
Numerical simulations highlight significant differences in a social network’s ability to aggregate information correctly under parallel or RNS dynamics, the latter performing much worse than the former: as shown in the left panel of Figure 3, the probability of correct information aggregation under parallel dynamics outperforms the one obtained under RNS dynamics over a wide range of signal informativeness levels (we also performed numerical simulations under a random link sequential (RLS) dynamics, i.e. an information update rule according to which at each time step t a link is randomly selected, and the two nodes that share it exchange information. However, none of the simulations we performed highlighted any significant difference between such a dynamics and the parallel one). Moreover, results obtained via RNS dynamics show no relevant dependence on the system size N.
The above findings suggest to look for a transition in information aggregation as a function of the number of agents that update their state of knowledge at a given time step by letting the parameter Φ take values over the whole interval [ 1 / N , 1 ] . In the right panel of Figure 3, we plot the probability P of correct information aggregation as a function of Φ for different system sizes and a fixed informativeness level of the signals initially distributed to agents (the qualitative overall appearance of the results is not changed when considering different levels of informativeness). As can be seen, for increasing values of Φ a transition is observed towards better information aggregation capabilities for all system sizes. This can essentially be interpreted in terms of the speed of information update. As one could expect, RNS dynamics is extremely slow compared to parallel dynamics (depending on the system size, we find, on average, that RNS dynamics reaches consensus in times that are 3–4 orders of magnitude larger than the ones required by parallel update), hence more prone to allow the spreading of misleading signals in the agents’ initial distribution. On the other hand, parallel dynamics is fast, in such a way that, in a few time steps, each agent receives through his/her neighbors aggregated information coming from the whole network.
Figure 3. Left: Probability of correct information aggregation as a function of the informativeness level of initial signals. The different data sets refer to different types of dynamics run on networks with N = 10 3 or N = 5 · 10 3 , f = 0 . 05 and c = 4 . As can be seen, RNS dynamics does not show any significant dependence on the system size and performs much worse than parallel dynamics at correctly aggregating information. Right: Probability of correct information aggregation as a function of the fraction Φ of agents that listen to their neighbors at each time step. The extreme cases, Φ = 1 / N and Φ = 1 , correspond, respectively, to RNS and parallel dynamics. All data were obtained for signal informativeness fixed as x = 0 . 16 . In both plots, all data points are obtained by averaging over 10 4 independent network configurations.
Figure 3. Left: Probability of correct information aggregation as a function of the informativeness level of initial signals. The different data sets refer to different types of dynamics run on networks with N = 10 3 or N = 5 · 10 3 , f = 0 . 05 and c = 4 . As can be seen, RNS dynamics does not show any significant dependence on the system size and performs much worse than parallel dynamics at correctly aggregating information. Right: Probability of correct information aggregation as a function of the fraction Φ of agents that listen to their neighbors at each time step. The extreme cases, Φ = 1 / N and Φ = 1 , correspond, respectively, to RNS and parallel dynamics. All data were obtained for signal informativeness fixed as x = 0 . 16 . In both plots, all data points are obtained by averaging over 10 4 independent network configurations.
Entropy 15 03031 g003

5. Conclusions

In summary, we have presented a stylized dynamic network model of the information diffusion throughout a large society featuring a small fraction of uninformed leaders. The model’s simplicity allows, in some cases, to make analytical considerations. Namely, when assuming all agents to simultaneously update their state of knowledge on a given issue, we are able to provide a closed-form expression for the probability of correct information aggregation as a function of the system size, i.e. the number of agents in the society, and the fraction of individuals playing the role of hubs. Our results partially overlap with previous works from the social learning literature in economics, as we show that larger populations are better, on average, at aggregating information. On the other hand, we provide interesting novel results on the role played by the size of an uninformed élite, portrayed in our model by a clique of nodes that do not own any prior information on the issue being discussed by the population. First, we show a rather counterintuitive result, i.e. that increasing the relative size (compared to the overall population) of such uninformed élites actually helps the information aggregation process. Moreover, we show that letting the fraction of hubs go to zero does not recover the results obtained for the corresponding hub-free regular network.
Rather interestingly, we also show our model to be sensitive to the information update speed, as defined by the fraction of agents who simultaneously revise their information at each time step, by showing the existence of a transition towards better information aggregation capabilities when moving from the low speed towards the high speed regime.

Conflict of Interest

The authors declare no conflict of interest.

References

  1. Sen, A. Development as Freedom; Oxford University Press: Oxford, UK, 1999; p. 182. [Google Scholar]
  2. Surowiecki, J. The Wisdom of Crowds: Why the Many are Smarter than the Few and How Collective Wisdom Shapes Business, Economies, Societies, and Nations; Doubleday Books: New York, NY, USA, 2004. [Google Scholar]
  3. Shafak, E. The view from Taksim Square: why is Turkey now in turmoil? Available online: http://www.guardian.co.uk/world/2013/jun/03/taksim-square-istanbul-turkey-protest (accessed on 3 June 2013).
  4. Gore, A. The Assault on Reason; Penguin Press: London, UK, 2007. [Google Scholar]
  5. Castellano, C.; Fortunato, S.; Loreto, V. Statistical physics of social dynamics. Rev. Mod. Phys. 2009, 81, 591–646. [Google Scholar] [CrossRef]
  6. Clifford, P.; Sudbury, A.W. A model for spacial conflict. Biometrika 1973, 60, 581–588. [Google Scholar] [CrossRef]
  7. Redner, S. A Guide to First-passage Processes; Cambridge University Press: Cambridge, UK, 2001. [Google Scholar]
  8. Galam, S. Minority opinion spreading in random geometry. Eur. Phys. J. B 2002, 25, 403–406. [Google Scholar] [CrossRef]
  9. Sznajd-Weron, K.; Sznajd, J. Opinion evolution in closed community. Int. J. Mod. Phys. C 2000, 11, 1157–1165. [Google Scholar] [CrossRef]
  10. Bikhchandani, S.; Hirshleifer, D.; Welch, I. A theory of fads, fashion, custom, and cultural change as informational cascades. J. Polit. Econ. 1992, 100, 992–1026. [Google Scholar] [CrossRef]
  11. Lorenz, J.; Rauhut, H.; Schweitzer, F.; Helbing, D. How social influence can undermine the wisdom of crowd effect. Proc. Natl. Acad. Sci. USA 2011, 108, 9020–9025. [Google Scholar] [CrossRef] [PubMed]
  12. Curty, P.; Marsili, M. Phase coexistence in a forecasting game. J. Stat. Mech. 2006, P03013. [Google Scholar] [CrossRef]
  13. Duffie, D.; Manso, G. Information percolation in large markets. Am. Econ. Rev. 2007, 97, 203–209. [Google Scholar] [CrossRef]
  14. Duffie, D.; Giroux, G.; Manso, G. Information percolation. Am. Econ. J. Microecon. 2010, 2, 100–111. [Google Scholar] [CrossRef]
  15. Gale, D.; Kariv, S. Bayesian learning in social networks. Games Econ. Behav. 2003, 45, 329–246. [Google Scholar] [CrossRef]
  16. DeGroot, M.H. Reaching a consensus. J. Am. Stat. Assoc. 1974, 69, 118–121. [Google Scholar] [CrossRef]
  17. Acemoglu, D.; Dahleh, M.A.; Lobel, I.; Ozdaglar, A. Bayesian learning in social networks. Rev. Econ. Stud. 2011, 78, 1201–1236. [Google Scholar] [CrossRef]
  18. González-Avella, J.C.; Eguíluz, V.M.; Marsili, M.; Vega-Redondo, F.; San Miguel, M. Threshold learning dynamics in social networks. PLoS One 2011, 6, e20207. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  19. Smith, L.; Sørensen, P. Pathological outcomes of observational learning. Econometrica 2000, 68, 371–398. [Google Scholar] [CrossRef]
  20. Bala, V.; Goyal, S. Learning from neighbors. Rev. Econ. Stud. 1998, 65, 595–621. [Google Scholar] [CrossRef]
  21. Golub, B.; Jackson, M.O. Naive learning in social networks and the wisdom of crowds. Am. Econ. J. Microecon. 2010, 2, 112–149. [Google Scholar] [CrossRef]
  22. Demarzo, P.M.; Vayanos, D.; Zwiebel, J. Persuasion bias, social influence, and unidimensional opinions. Q. J. Econ. 2003, 18, 909–968. [Google Scholar] [CrossRef]
  23. Lee, P.M. Bayesian Statistics: An Introduction; Arnold: London, UK, 1997. [Google Scholar]
  24. Berman, A.; Plemmons, R.J. Nonnegative Matrices in the Mathematical Sciences; SIAM: New York, NY, USA, 1994. [Google Scholar]

Appendix

A. Perturbative Approximation of the Top Eigenvector | λ 1

When assuming hubs to be identified by nodes 1 , , N H , the network adjacency matrix A takes the following block form:
A = I G G T C .
In the above equation, I is an N H × N H block, such that I i j = 1 for i j and I i i = 0 i . The off-diagonal block G is of size N H × ( N N H ) , and it accounts for neighbors of the clique H , i.e. G i j = 1 for i H and j H , or vice versa, and zero otherwise. Lastly, the block C is of size ( N N H ) × ( N N H ) , and it accounts for links between nodes that do not belong to H .
Spectral properties of the adjacency matrix A expressed in block form, as in Equation (21), can be deduced from standard perturbation theory. As a matter of fact, such a matrix can be decomposed as A = A H + A ˜ , where:
A H = I 0 0 0 , A ˜ = 0 G G T C .
For small values of c (i.e., the degree of nodes outside of H ), the matrix A ˜ above is sparse and can be interpreted as a perturbation to the matrix A H , describing the fully connected clique H plus a sea of N N H disconnected nodes.
Let us denote the eigenvalues and eigenvectors of the “unperturbed” adjacency matrix A H as λ i , H and | λ i , H . They fall within three categories:
  • The largest eigenvalue reads λ 1 , H = N H 1 , and its normalized eigenvector | λ i , H has the first N H components equal to 1 / N H and the remaining N N H ones equal to zero.
  • λ i , H = 1 for i = 2 , , N N H , with eigenvectors having non-zero components only in the first N N H sites.
  • λ i , H = 0 for i = N N H + 1 , , N , with eigenvectors that can simply be chosen as having all components equal to zero, except for the i-th component being equal to one.
Let us then approximate the first eigenvector of the full adjacency matrix A as:
| λ 1 | λ 1 , H + | λ 1 + | λ 1 ,
where | λ 1 and | λ 1 denote the first and second order corrections, respectively, to the unperturbed eigenvector | λ 1 , H . The first order correction only involves neighbors of the clique H and it reads:
| λ 1 = j > 1 λ j , H | A ˜ | λ 1 , H λ 1 , H λ j , H | λ j , H = 1 N H ( N H 1 ) j H n j | λ j , H ,
where n i = j H a i j represents the number of neighbors that node i has within the clique H . The second order correction (second order corrections also involve the first N H components of λ 1 , H ; however, such corrections are irrelevant to our analysis, so we can safely neglect them) involves neighbors of the nearest neighbors of the clique, H :
| λ 1 = j > 1 | λ j , H > 1 λ j , H | A ˜ | λ , H λ , H | A ˜ | λ 1 , H ( λ 1 , H λ j , H ) ( λ 1 , H λ , H )
= 1 N H ( N H 1 ) 2 j ( H ) n j | λ j , H ,
where ( H ) denotes the set of next to nearest neighbors of the clique H , whereas n i = j H a i j is the number of neighbors that node i has amongst neighbors of the clique H . In order to perform exact calculations up to the second order, one should, in principle, compute the expected number of nodes belonging to H and ( H ) and the expected values of the quantities n j in Equation (24) and n j in Equation (25) by averaging over all possible network configurations built as explained in Section 2 for given N, N H and c. However, in order to keep things simple, let us just assume that each node in H has just one neighbor in the clique H , and, in a similar fashion, that each node in ( H ) has just one neighbor in H , which amounts to posing n i = 1 , i H , and n j = 1 , j ( H ) . Clearly, both such approximations work well, as long as the number of nodes in H is small compared to N, i.e. for f 1 , where f = N H / N .
According to the above approximations, the N N H nodes not belonging to H yield the following components in | λ 1 , as computed with Equation (23):
  • c N H components (i.e. the number of nodes in H ) equal to 1 / ( N H ( N H 1 ) ) 1 / N H 3 / 2
  • c ( c 1 ) N H components (i.e. the number of nodes in ( H ) ) equal to 1 / ( N H ( N H 1 ) 2 ) 1 / N H 5 / 2
  • N ( 1 + c 2 ) N H components equal to zero.
Therefore, the mean μ λ and standard deviation σ λ (see Equation (14)), can be computed as follows:
μ λ = 1 N N H c N H + c ( c 1 ) N H 3 / 2
σ λ 2 = 1 N N H 1 N H 3 / 2 μ λ 2 c N H + 1 N H 5 / 2 μ λ 2 c ( c 1 ) N H + μ λ 2 N ( 1 + c 2 ) N H ,
and the approximations in Equation (16) can be immediately derived as leading order results in N of the above expressions.

Share and Cite

MDPI and ACS Style

Livan, G.; Marsili, M. What Do Leaders Know? Entropy 2013, 15, 3031-3044. https://doi.org/10.3390/e15083031

AMA Style

Livan G, Marsili M. What Do Leaders Know? Entropy. 2013; 15(8):3031-3044. https://doi.org/10.3390/e15083031

Chicago/Turabian Style

Livan, Giacomo, and Matteo Marsili. 2013. "What Do Leaders Know?" Entropy 15, no. 8: 3031-3044. https://doi.org/10.3390/e15083031

APA Style

Livan, G., & Marsili, M. (2013). What Do Leaders Know? Entropy, 15(8), 3031-3044. https://doi.org/10.3390/e15083031

Article Metrics

Back to TopTop