Games and Economic Behavior 120 (2020) 121–131

Note
Innovation adoption and collective experimentation ✩

Evan Sadler, Columbia University, United States of America

Article history: Received 15 October 2018; Available online 8 January 2020
Keywords: Experimentation; Innovation adoption; Networks; Social learning

Abstract

I study learning about an innovation with costly information acquisition and knowledge sharing through a network. Agents situated in an arbitrary graph follow a myopic belief update rule. The network structure and initial beliefs jointly determine long-run adoption behavior. Networks that share information effectively converge on a consensus more quickly but are prone to errors. Consequently, dense or centralized networks have more volatile outcomes, and efforts to seed adoption should focus on individuals who are disconnected from one another. I argue that anti-seeding, preventing central individuals from experimenting early in the learning process, is an effective intervention because the population as a whole may gather more information. © 2020 Elsevier Inc. All rights reserved.

1. Introduction

Mounting evidence demonstrates that information transmission through social ties plays a central role in the diffusion of innovations. From agricultural technologies (Munshi, 2004; Conley and Udry, 2010) to health products (Dupas, 2014) and plans (Sorensen, 2006) to investments (Duflo and Saez, 2003) and microfinance (Banerjee et al., 2013), social learning strongly influences whether people adopt. Learning is often important because of uncertainty about the merits of innovations. To resolve this uncertainty, someone must first gather information. While we can acquire information through our own experiences—say, planting a new high-yielding seed variety or using the latest smartphone—we can also learn from the experiences of friends and acquaintances. The structure of social networks can therefore influence the gathering and dissemination of information about new technologies.

This paper addresses two key questions about innovation adoption and collective experimentation. First, how does the structure of a social network affect adoption patterns? I seek to better understand how individual choices to experiment interact with the information sharing network. Second, how can we encourage communities to adopt new technologies? Many authors study how best to “seed” a network (Kempe et al., 2003, 2015), but I argue we should also study whom to avoid seeding.

To answer these questions, I study a social learning model that incorporates both costly information acquisition and information sharing. A population faces a two-armed bandit problem in which one arm, the standard technology, has known payoffs, and the other, the innovation, has uncertain payoffs. Agents are embedded in a social network. They choose how much to experiment with the innovation, and they share all resulting information with their neighbors.

✩ I am grateful to Roy Radner for many fruitful discussions about this project, and I thank Ilan Lobel, David Pearce, and Elliot Lipnowski for their comments and suggestions. I am also grateful to participants at the IESE Workshop on Learning in Social Networks and the 2014 European Winter Meeting of the Econometric Society for their feedback. E-mail address: [email protected].

https://doi.org/10.1016/j.geb.2019.12.011


The agents are strategically myopic: while they use Bayes’ rule to update their beliefs in response to experimental outcomes, they do not make further inferences from neighbors’ choices whether to experiment. Over time, agents update their beliefs about the innovation’s merits, and they continue using the innovation only if they are sufficiently optimistic. The network structure governs a collective tradeoff between gathering more information and wasting effort on bad innovations. I characterize this tradeoff, highlighting why dense and centralized networks are more prone to mistakes, and I use simulations to explore its implications for seeding strategies.

In this context, encouraging adoption means minimizing the probability that the society mistakenly abandons a good innovation. To reduce the probability of long-run mistakes, we need to increase the amount of information that society gathers. If we can induce several individuals to start experimenting, this goal is best served by choosing disconnected seeds. When seeds share information with each other, their beliefs move together, and adoption decisions become more correlated. Increased independence makes it more likely that someone keeps experimenting, and eventually the information they gather will spread.

In addition to seeding, I highlight a counterintuitive intervention: preventing certain agents from gathering information. Individuals who are widely observed contribute to more correlated adoption decisions throughout the population. Stopping experimentation by such individuals leads to more experimentation overall. This finding depends crucially on information sharing being the main channel of social influence, and it also requires that others are already willing to engage in some experimentation.

The strategies that encourage adoption in this setting contrast with what works for other influence mechanisms. When social influence acts through direct externalities (e.g. Morris, 2000; Kempe et al., 2003), clustering seeds together is typically important to ensure agents have sufficiently many adopting neighbors. The prescription here is exactly the opposite. Recent research uses field experiments to distinguish underlying influence mechanisms (Bursztyn et al., 2014). The results in this paper suggest another approach to disentangle social learning from social utility: we can look at the macroscopic effects of different seeding strategies. If social learning and information gathering are important, then clustered seeds should lead to higher variance outcomes with lower average adoption levels.

1.1. Related work

This paper takes insights from the literature on social learning and attempts to draw practical advice. Recent work in the development literature has begun to explore the role of social networks in technology adoption and the potential of seeding to encourage broader adoption (Banerjee et al., 2012, 2013). The findings in this paper can help guide further work in this direction. As in much of the social learning literature, long-run consensus is a robust feature of the model. While most work focuses on conditions under which this consensus is correct in large network limits (Acemoglu et al., 2011; Lobel and Sadler, 2015; Golub and Jackson, 2010, 2012), I try to quantify the extent of error in networks of a fixed size.

A number of recent papers consider learning under various behavioral assumptions. The most popular approaches assume correlation neglect (Dasaratha and He, 2018) or some form of model misspecification (Bohren and Hauser, 2018; Frick et al., 2018). My agents never draw an incorrect inference, and they do not suffer from misspecification. Instead, they ignore a potential source of additional information because that source is difficult to interpret.

Within the social learning literature, Bala and Goyal (1998) is closest to the present paper. Like Bala and Goyal, I assume agents are strategically myopic, and I permit heterogeneous priors—this serves to focus our attention on the network microstructure rather than on individual incentives. I adapt the two-armed Brownian bandit of Bolton and Harris (1999) to obtain more complete outcome characterizations. I argue that network microstructure has a first-order effect on long-run adoption behavior, dominating the effect of strategic considerations like free-riding and encouragement. In more recent related work, Salish (2015) studies strategic experimentation in networks using a discrete exponential bandit to model the information process. That paper characterizes Markov equilibrium strategies for three special cases: complete networks, ring networks, and star networks. Its focus is on welfare comparisons across these different network structures, while the present paper focuses on long-run outcomes and interventions.

My work falls within a broader literature in recent years that studies the economic consequences of social networks. This literature explores a range of problems related to product adoption (Jackson and Yariv, 2007), diffusion (Jackson and Rogers, 2007), public goods (Elliott and Golub, 2019), games on networks (Galeotti et al., 2010), and network formation (Galeotti and Goyal, 2010). Studying information transmission, and how network structures influence who gathers that information, is a natural complement to this research.

2. The learning process

A population of N agents faces a choice between two competing technologies: technology 0 is the “standard” technology, and technology 1 is the “innovation.” At each instant t, the agents choose which of the two to use, and I write $\alpha_i(t) \in \{0, 1\}$ for the choice of agent i at time t. Agent i’s actions produce cumulative output $\pi_i(t)$ that evolves stochastically over time. If $\alpha_i(t) = 0$, we have

\[
d\pi_i(t) = \mu_0\, dt + \sigma\, dZ_i(t),
\]

and if $\alpha_i(t) = 1$, we have

\[
d\pi_i(t) = \mu_1\, dt + \sigma\, dZ_i(t),
\]

where $\{Z_i\}_{i=1}^N$ are mutually independent standard Brownian motions. Hence, the output $\pi_i(t)$ follows a Brownian motion with drift that depends on agent i’s decision at each instant. The value $\mu_0 > 0$ is commonly known, while $\mu_1$ takes one of two possible values $\{l, h\}$ with $l < \mu_0 < h$. If $\mu_1 = h$, the innovation is good; otherwise it is bad.

Agents are uncertain whether the innovation is good. Write $p_i(t)$ for the probability that agent i assigns to the event that $\mu_1 = h$ at time t—these beliefs can vary across agents, and I do not assume common priors. Agents update their beliefs using the information that they observe. This information comes from two sources. First, each agent observes her own output process. Additionally, an agent has some neighbors in the population, and she observes both the action of each neighbor and the output process associated with each neighbor.

I represent the network of neighbor relationships as a directed graph G—I allow any graph. Write $g_{ij}$ for the entries of the associated adjacency matrix, where $g_{ij} = 1$ indicates that i observes the output of j, and we take $g_{ii} = 1$ by convention. Write $G_i \equiv \{j : g_{ij} = 1\}$ for the set of agents that i observes and $d_i = |G_i| - 1$ for i’s (out-)degree. At time t, agent i has conducted total experimentation

\[
\eta_i(t) = \int_0^t \alpha_i(s)\, ds,
\]

resulting in total output $X_i(t)$ obtained from the innovation. If $j \in G_i$, then agent i observes the values $\eta_j(t)$ and $X_j(t)$ for all t. Given their observations, agents update their beliefs according to Bayes’ rule. The output $X_i(t)$ is normally distributed with mean $\eta_i(t)\mu_1$ and variance $\eta_i(t)\sigma^2$. After total experimentation $\eta_i$, the relative likelihood of an output $X_i$ in each state is

\[
\ell_{\eta_i}(X_i) = \frac{dP(X_i \mid \eta_i, \mu_1 = l)}{dP(X_i \mid \eta_i, \mu_1 = h)} = e^{\frac{1}{2\sigma^2}\left(2(l-h)X_i + \eta_i(h^2 - l^2)\right)}. \tag{1}
\]

Since i observes the output process for each of her neighbors, her beliefs at time t are

\[
p_i(t) = P\left(\mu_1 = h \mid X_j, \eta_j,\ j \in G_i\right) = \left(1 + \frac{1 - p_i(0)}{p_i(0)} \prod_{j \in G_i} \ell_{\eta_j}(X_j)\right)^{-1}. \tag{2}
\]
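To make the updating rule concrete, the following minimal sketch simulates terminal outputs under $\mu_1 = h$ and computes each agent’s belief from (1) and (2). The three-agent line network, the priors, and the parameter values (l = 0, h = 1, σ = 1, and a shared experimentation clock) are illustrative assumptions of mine, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
l, h, sigma, T = 0.0, 1.0, 1.0, 5.0
mu1 = h                                  # simulate the "good" state
G = np.array([[1, 1, 0],                 # g_ij = 1 when i observes j's output
              [1, 1, 1],                 # (hypothetical 3-agent line network)
              [0, 1, 1]])
p0 = np.array([0.6, 0.5, 0.4])           # heterogeneous priors

eta = np.full(3, T)                      # suppose every agent experiments on [0, T]
X = mu1 * eta + sigma * np.sqrt(eta) * rng.standard_normal(3)  # X_i ~ N(eta mu1, eta sigma^2)

# Log of the likelihood ratio (1) for each agent's own experimentation:
log_lr = (2 * (l - h) * X + eta * (h ** 2 - l ** 2)) / (2 * sigma ** 2)

# Beliefs from (2): each agent pools the ratios of everyone she observes.
p = 1.0 / (1.0 + (1 - p0) / p0 * np.exp(G @ log_lr))
print(p)                                 # beliefs drift toward 1 when mu_1 = h
```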

I make a behavioral assumption that all agents adopt a threshold policy.

Assumption 1. Each agent i has a threshold belief $\bar{p}_i \in (0, 1)$ such that i chooses $\alpha_i(t) = 1$ whenever $p_i(t) > \bar{p}_i$ and $\alpha_i(t) = 0$ whenever $p_i(t) < \bar{p}_i$.

As in Bala and Goyal (1998), the agents are strategically myopic in that they do not make inferences based on neighbors’ actions, and they do not take account of how their own actions might influence neighbors. In essence, I assume the reasons behind neighbors’ actions are hard for the agents to model, so they do not try. Due to this myopia assumption, beliefs about neighbors’ beliefs are irrelevant to agents’ decisions. Note this does not necessarily imply that agents ignore the future value of information. While I am agnostic about the precise nature of these thresholds, they could reflect many idiosyncratic considerations, such as:

• Heterogeneous priors;
• Heterogeneous benefits from produced output;
• Expectations about how much additional information will be gained from neighbors.

In an individual decision problem, a threshold rule is clearly optimal. I immediately assume the thresholds to avoid excess notation and focus attention on the network’s role. Assumption 1 precludes policies that are non-monotonic in beliefs, carefully calibrated based on expectations about neighbors—in principle, an agent might choose to experiment more or less at a given moment in order to manipulate the amount of experimentation her neighbors carry out in the future. A growing empirical literature on learning in networks suggests myopia is a far more realistic approach than studying equilibria with such forward-looking agents.¹ Reasoning about network structure is complex, and individuals follow simple heuristics in practice.

Moreover, several results are robust to changes in the decision rule. As long as each agent follows a policy that is Markovian in own beliefs, and stops experimenting when sufficiently pessimistic, then the results on consensus, and versions of the bounds in Theorem 2, continue to hold.

¹ See, for instance, Eyster et al. (2015); Grimm and Mengel (2015); Chandrasekhar et al. (2016); Mueller-Frank and Neri (2017).


One obvious limitation is a failure to capture the effect of changing individual behavior in response to changes in the network structure. There are a few reasons why the analysis may still offer a reasonable first-order approximation for the intended applications. First, individuals typically know the structure of their local neighborhoods but have limited knowledge of the global network. Networks with very different structures can look similar from a local perspective, and the results allow non-trivial comparisons between such networks. Second, seeding interventions rely on individual incentives to experiment more or less without changing the overall structure of the network. Particularly if such interventions are unobserved by neighbors, there is little reason to expect feedback effects. Finally, the role of strategic effects—free-riding and the encouragement effect—has been studied previously (Bolton and Harris, 1999). As I argue in section 5.1, at least in the case of regular graphs, these effects are an order of magnitude less significant than the effect of increasing or decreasing network density—the direct effect of access to information should be larger than the indirect effect of the equilibrium response.

3. Outcome metrics

Our main questions are whether the society ends up adopting a good innovation and how long it takes to abandon a bad innovation. There is a natural tradeoff to experimentation. To ensure good innovations get adopted in the long run, agents must continue to experiment, even after bad results. However, doing so could mean continuing to use a bad innovation. Individuals make this tradeoff via their belief thresholds. Fixing the thresholds, the network shapes how society as a whole conducts this tradeoff.

Definition 1. If $\lim_{t \to \infty} \alpha_i(t) = 1$, I say agent i adopts the innovation. If $\lim_{t \to \infty} \alpha_i(t) = 0$, I say agent i abandons the innovation. If all agents adopt (abandon) the innovation, I say that society adopts (abandons) the innovation.

Let $A_0$ denote the event that society abandons the innovation and $A_1$ the event that society adopts the innovation. As in many models of social learning, long-run consensus is a robust outcome, so it makes sense to talk about how society balances the costs and benefits of experimentation.

Proposition 1. If $\mu_1 = l$, then society abandons the innovation with probability one:

\[
P(A_0 \mid \mu_1 = l) = 1.
\]

If $\mu_1 = h$, and G is strongly connected, then with probability one either society adopts the innovation or society abandons the innovation. That is,

\[
P(A_0 \mid \mu_1 = h) + P(A_1 \mid \mu_1 = h) = 1.
\]

I omit the straightforward proof of Proposition 1. No individual can ever adopt a bad innovation in the long run because, with continued use, agents are sure to learn that $\mu_1 = l$. If an innovation is good, and at least one agent adopts it, a strongly connected network ensures that all eventually learn that fact. Even though society always abandons bad innovations, this may occur only after much experimentation. Write

\[
\eta(t) = \sum_{i=1}^{N} \eta_i(t)
\]

for the total experimentation across society at time t, and write

\[
\eta = \lim_{t \to \infty} \eta(t)
\]

for the total experimentation society ever conducts. Similarly, write $\eta_i$ for the total experimentation by agent i. Formal results explore how the network structure affects the tradeoff between the abandonment probability $P(A_0 \mid \mu_1 = h)$ and the expected waste $E[\eta \mid \mu_1 = l]$.

4. The conditional belief process

To study the abandonment probability and the expected waste, we first need to understand how beliefs in the network evolve conditional on the underlying state. Taking the natural logarithm of (1), we define the random process

\[
y_i(\eta_i) = \ln \ell_{\eta_i}(X_i) = \frac{1}{2\sigma^2}\left(2(l-h)X_i + \eta_i(h^2 - l^2)\right). \tag{3}
\]

Conditional on a value of $\mu_1$, the output $X_i$ follows a time-changed Brownian motion with drift—the total experimentation $\eta_i$ serves as the “clock” for the process.


Lemma 1. Conditional on $\mu_1$, the log-likelihood process is a Brownian motion with drift. If $\mu_1 = l$, then

\[
dy_i(\eta_i) = \frac{(h-l)^2}{2\sigma^2}\, d\eta_i + \frac{h-l}{\sigma}\, dB(\eta_i),
\]

where B is a standard Brownian motion. If $\mu_1 = h$, then

\[
dy_i(\eta_i) = -\frac{(h-l)^2}{2\sigma^2}\, d\eta_i + \frac{h-l}{\sigma}\, dB(\eta_i).
\]

Proof. Conditional on $\mu_1 = l$, we can characterize $X_i$ via

\[
dX_i(\eta_i) = l\, d\eta_i + \sigma\, dB(\eta_i),
\]

where B is a standard Brownian motion. Hence, the log-likelihood $y_i(\eta_i)$ evolves as

\[
dy_i(\eta_i) = \frac{1}{2\sigma^2}\left(2\sigma(l-h)\, dB(\eta_i) + (h-l)^2\, d\eta_i\right) = \frac{(h-l)^2}{2\sigma^2}\, d\eta_i + \frac{h-l}{\sigma}\, dB(\eta_i).
\]

The calculation for $\mu_1 = h$ is analogous. □
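As a quick numerical sanity check on Lemma 1, the sketch below draws terminal outputs $X_i$ under $\mu_1 = l$ and verifies that $y_i(\eta)$ from (3) has mean $(h-l)^2\eta/(2\sigma^2)$ and variance $(h-l)^2\eta/\sigma^2$. The parameter values are illustrative choices of mine, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
l, h, sigma, eta = 0.0, 1.0, 1.0, 4.0    # hypothetical parameter values

# Terminal output after total experimentation eta, conditional on mu_1 = l:
X = l * eta + sigma * np.sqrt(eta) * rng.standard_normal(100_000)

# Log-likelihood process (3) evaluated at eta:
y = (2 * (l - h) * X + eta * (h ** 2 - l ** 2)) / (2 * sigma ** 2)

print(y.mean(), (h - l) ** 2 * eta / (2 * sigma ** 2))   # both ≈ 2.0
print(y.var(), (h - l) ** 2 * eta / sigma ** 2)          # both ≈ 4.0
```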

If $\mu_1 = l$, the log-likelihood process drifts upward over time, and if $\mu_1 = h$, the process drifts downward. Agent i’s belief at any time depends not just on her own log-likelihood process, but on those of her neighbors as well. Substituting into (2), we have

\[
p_i(t) = \left(1 + \frac{1 - p_i(0)}{p_i(0)} \prod_{j \in G_i} e^{y_j(\eta_j(t))}\right)^{-1}. \tag{4}
\]

Agent i’s beliefs at time t are completely characterized by the ith entry of the vector

\[
\mathbf{y}_G(t) = G\mathbf{y}(t).
\]

There exists a unique $\bar{y}_i$, depending on $\bar{p}_i$, such that agent i continues experimenting only if $y_{G,i} \leq \bar{y}_i$; explicitly, from (4), $\bar{y}_i = \ln\frac{1-\bar{p}_i}{\bar{p}_i} - \ln\frac{1-p_i(0)}{p_i(0)}$. Think of $\bar{y}_i$ as how much bad news the agent tolerates before giving up on the innovation.

While agents’ beliefs are correlated through the network G, conditional on the clocks $\{\eta_i\}_{i=1}^N$, the log-likelihood processes $\{y_i\}_{i=1}^N$ are mutually independent. This independence provides a crucial tool to analyze the evolution of beliefs in the population. A complication arises because the clocks are correlated: whether $d\eta_i$ equals 0 or dt depends on whether $y_{G,i}$ is greater or less than $\bar{y}_i$. Holding fixed the set K of agents who experiment, the processes $\{y_i\}_{i \in K}$ evolve independently, while the processes $\{y_i\}_{i \notin K}$ remain fixed. Periodically, an agent’s belief will cross her experimentation threshold, changing who is in the set K.

One consequence of Lemma 1 is a fact I call the information invariance principle. The variance parameter $\sigma$ is an inverse measure of experimentation’s informativeness—higher $\sigma$ means agents learn more slowly. Changes in $\sigma$ are formally equivalent to a time rescaling of the underlying Brownian motions. This means that, holding the network and thresholds fixed, more informative experimentation does not increase the likelihood that society will adopt an innovation. Society may simply abandon it more quickly.

Corollary 1 (Information Invariance Principle). The abandonment probability does not depend on the variance parameter $\sigma$ or the output parameters l and h.

Intuitively, more informative experimentation speeds up decisions without affecting the underlying tradeoffs. An agent who observes data twice as fast reaches a decision to abandon in half the time and is just as likely to make a mistake. Corollary 1 clearly depends on holding fixed the individual thresholds at which agents abandon the innovation. One might expect the thresholds to decrease when experimentation becomes more informative because, due to time discounting, the effective cost of experimentation is lower. This would cause agents to gather more information, reducing the probability of a mistake.
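The invariance is easy to see numerically in the single-agent case, where the abandonment probability equals $e^{-\bar{y}}$ (the hitting bound used again in section 5.1). The sketch below, with a discretization step, horizon, and parameter values that are my own choices rather than the paper’s, estimates the probability that the log-likelihood process of Lemma 1 ever crosses $\bar{y} = 1$ for two different values of σ; both estimates should be close to $e^{-1} \approx 0.368$, up to small discretization bias.

```python
import numpy as np

def p_abandon_single(sigma, h=1.0, l=0.0, ybar=1.0,
                     dt=0.01, t_max=200.0, runs=4000, seed=0):
    """Estimate P(a lone agent abandons | mu_1 = h): the chance that y_i,
    a Brownian motion with drift -(h-l)^2/(2 sigma^2) and volatility
    (h-l)/sigma (Lemma 1), ever reaches ybar. An isolated stopped agent
    sees no new information, so crossing ybar is absorbing."""
    rng = np.random.default_rng(seed)
    n = int(t_max / dt)
    drift = -(h - l) ** 2 / (2 * sigma ** 2)
    vol = (h - l) / sigma
    count = 0
    for _ in range(runs):
        steps = drift * dt + vol * np.sqrt(dt) * rng.standard_normal(n)
        if np.cumsum(steps).max() >= ybar:
            count += 1
    return count / runs

# Roughly exp(-1) ≈ 0.368 for both values of sigma:
print(p_abandon_single(sigma=1.0), p_abandon_single(sigma=3.0), np.exp(-1.0))
```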


5. Network structure and long-run outcomes

This section studies the collective tradeoff society makes between the risk of abandoning good innovations and the experimentation wasted on bad ones. I first give a characterization in terms of differential equations. As these may not have simple solutions, I subsequently establish bounds that are easier to compute. In general, networks that disseminate information more effectively—those with dense or centralized connections—are more likely to abandon good innovations. I also explore implications for seeding strategies designed to maximize the probability of long-run adoption. If endogenous information gathering is the primary mechanism of social influence, seeds should be as separated from each other as possible to ensure the collection of more independent information.

Let $K(\mathbf{y})$ denote the set of agents who experiment when the output observed up to the present moment is $\mathbf{y}$: agent i is contained in $K(\mathbf{y})$ if and only if $y_{G,i} \leq \bar{y}_i$. If we reach the observed output $\mathbf{y}$, let $P(\mathbf{y})$ denote the probability that society subsequently abandons the innovation, conditional on the innovation being good. Similarly, let $W(\mathbf{y})$ denote the expected waste going forward, conditional on the innovation being bad—that is, the expected waste assuming the thresholds are given by $\bar{\mathbf{y}} - \mathbf{y}$.

Theorem 1. The abandonment probability $P(\mathbf{y})$ satisfies the differential equation

\[
\sum_{i \in K(\mathbf{y})} \frac{\partial^2 P}{\partial y_i^2} - \sum_{i \in K(\mathbf{y})} \frac{\partial P}{\partial y_i} = 0 \tag{5}
\]

subject to the boundary conditions $P(\mathbf{y}) = 1$ whenever $G\mathbf{y} \geq \bar{\mathbf{y}}$, and

\[
\lim_{y_i \to -\infty} P(\mathbf{y}) = 0
\]

for each i. The expected waste $W(\mathbf{y})$ satisfies the differential equation

\[
|K(\mathbf{y})| + \frac{(h-l)^2}{2\sigma^2} \sum_{i \in K(\mathbf{y})} \left(\frac{\partial W}{\partial y_i} + \frac{\partial^2 W}{\partial y_i^2}\right) = 0, \tag{6}
\]

subject to the boundary conditions $W(\mathbf{y}) = 0$ whenever $G\mathbf{y} \geq \bar{\mathbf{y}}$, and

\[
\lim_{y_i \to -\infty} W(\mathbf{y}) = \infty
\]

for each i.

Proof. We compute

\[
\begin{aligned}
P(\mathbf{y}) &= E[P(\mathbf{y} + d\mathbf{y}) \mid \mu_1 = h] \\
&\approx P(\mathbf{y}) + E\left[\sum_{i=1}^{N} \frac{\partial P}{\partial y_i}\, dy_i + \frac{1}{2}\sum_{i=1}^{N} \frac{\partial^2 P}{\partial y_i^2}\, (dy_i)^2 + \sum_{i \neq j} \frac{\partial^2 P}{\partial y_i \partial y_j}\, dy_i\, dy_j\right] \\
&\approx P(\mathbf{y}) - \sum_{i \in K(\mathbf{y})} \frac{\partial P}{\partial y_i} \frac{(h-l)^2}{2\sigma^2}\, dt + \sum_{i \in K(\mathbf{y})} \frac{\partial^2 P}{\partial y_i^2} \frac{(h-l)^2}{2\sigma^2}\, dt,
\end{aligned}
\]

where we have dropped higher-order terms, used the independence of the log-likelihood processes to eliminate cross-terms, and used that $(dB(t))^2 = dt$ for a standard Brownian motion $B(t)$. The abandonment probability therefore satisfies the differential equation

\[
\sum_{i \in K(\mathbf{y})} \frac{\partial^2 P}{\partial y_i^2} = \sum_{i \in K(\mathbf{y})} \frac{\partial P}{\partial y_i}.
\]

Similarly, for the expected waste we have

\[
\begin{aligned}
W(\mathbf{y}) &= |K(\mathbf{y})|\, dt + E[W(\mathbf{y} + d\mathbf{y}) \mid \mu_1 = l] \\
&\approx |K(\mathbf{y})|\, dt + W(\mathbf{y}) + \sum_{i \in K(\mathbf{y})} \frac{\partial W}{\partial y_i} \frac{(h-l)^2}{2\sigma^2}\, dt + \sum_{i \in K(\mathbf{y})} \frac{\partial^2 W}{\partial y_i^2} \frac{(h-l)^2}{2\sigma^2}\, dt,
\end{aligned}
\]

implying the second part of the result. □


Each subset of agents defines an element in a partition of $\mathbb{R}^N$: the collection of $\mathbf{y}$ such that the experimenting agents are exactly those in the subset. The differential equations change at the boundaries of the partition elements. Within each partition element, we have an elliptic equation that can be solved numerically. Solving for the functions $P(\mathbf{y})$ and $W(\mathbf{y})$ gives us values for all possible sets of individual thresholds—for any threshold vector $\tilde{\mathbf{y}}$, the initial abandonment probability and expected waste are $P(\bar{\mathbf{y}} - \tilde{\mathbf{y}})$ and $W(\bar{\mathbf{y}} - \tilde{\mathbf{y}})$ respectively. Unfortunately, closed-form solutions do not generally exist, motivating a study of simpler bounds.

5.1. Simple bounds, network density, and centralization

We can leverage standard results for Brownian motions to obtain simple bounds on the abandonment probability and expected waste. Given a threshold vector $\bar{\mathbf{y}}$, agents must gather a minimal amount of “bad news” in order to get $G\mathbf{y} \geq \bar{\mathbf{y}}$. This minimal amount of bad news solves the linear program

\[
\begin{aligned}
\min_{\mathbf{y}} \quad & \sum_{i \leq N} y_i \\
\text{s.t.} \quad & G\mathbf{y} \geq \bar{\mathbf{y}}, \\
& \mathbf{y} \geq 0.
\end{aligned} \tag{7}
\]

Assuming no one gathers any good news (i.e. $y_i \geq 0$ for each i), the solution $y^*(\bar{\mathbf{y}})$ to (7) describes the easiest path to abandonment. We obtain the following bounds.

Theorem 2. Let $y^*(\bar{\mathbf{y}})$ denote the minimal objective value for the problem (7). We have

\[
P(A_0 \mid \mu_1 = h) \leq e^{-y^*(\bar{\mathbf{y}})}
\]

and

\[
E[\eta \mid \mu_1 = l] \geq \frac{2\sigma^2}{(h-l)^2}\, y^*(\bar{\mathbf{y}}).
\]

Proof. I note two well-known facts about Brownian motions. First, let $X(t) = \sigma B(t) + \mu t$ be a Brownian motion with drift $\mu < 0$ and variance $\sigma^2 t$. Suppose $X(0) = 0$, and let $M = \max_{t \geq 0} X(t)$ denote the maximum the process attains. The probability that M is above some threshold $x > 0$ is

\[
P(M > x) = e^{-\frac{2|\mu|}{\sigma^2} x}.
\]

This will allow us to bound the probability of abandonment. If we suppose instead that the drift $\mu$ is positive, then the expected hitting time of $x > 0$ is

\[
E(T_x) = \frac{x}{\mu}.
\]

This will allow us to bound the expected total experimentation. Fixing a vector $\mathbf{x} \in \mathbb{R}^N$, the probability that we can ever have $\mathbf{y} \geq \mathbf{x}$ is no more than

\[
\prod_{i \leq N} P\left(\max_{\eta_i \geq 0} y_i(\eta_i) \geq x_i\right) = e^{-\sum_{i \leq N} \max\{0, x_i\}}. \tag{8}
\]

The linear program (7) exactly maximizes this probability subject to the necessary condition for abandonment. Similarly, conditional on $\mu_1 = l$, the expected total experimentation that agent i observes must satisfy

\[
\sum_{j \in G_i} E[\eta_j \mid \mu_1 = l] \geq \frac{2\sigma^2}{(h-l)^2}\, \bar{y}_i,
\]

and the same linear program describes the minimal expected experimentation in the network that can satisfy this condition. □

While far from an exact characterization, the bounds in Theorem 2 are tight in special cases. In particular, if the agents are partitioned into completely connected cliques, with no links between cliques, the bounds are necessarily met with equality. These bounds suggest a few patterns for how the network structure influences the way society trades off errors in each state. We can interpret the vector y in the linear program as a scaled allocation of who gathers bad news when society abandons the innovation. We minimize the amount of bad news required when we allocate it to those who are most widely observed in G. Doing so induces a larger shift in societal beliefs for a given amount of bad news. Intuitively, the bound on the abandonment probability is higher, and the bound on expected waste lower, when the network is more dense or more centralized.
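To make the bound computable in practice, here is a minimal sketch that solves the linear program (7) with scipy.optimize.linprog and evaluates both bounds from Theorem 2. The use of scipy is my tooling choice, not the paper’s; the clique configuration and parameters (h − l = σ = 1, $\bar{y}_i = 1$) are those used for Table 1 below, where the bound holds with equality.

```python
import numpy as np
from scipy.optimize import linprog

def theorem2_bounds(G, ybar, h=1.0, l=0.0, sigma=1.0):
    """Solve (7): min sum(y) s.t. G y >= ybar, y >= 0, and return
    (y_star, bound on P(A_0 | mu_1 = h), bound on E[eta | mu_1 = l])."""
    N = G.shape[0]
    res = linprog(c=np.ones(N), A_ub=-G, b_ub=-np.asarray(ybar, float),
                  bounds=[(0, None)] * N, method="highs")
    y_star = res.fun
    return y_star, np.exp(-y_star), 2 * sigma ** 2 / (h - l) ** 2 * y_star

# 12 agents in four cliques of three (in-degree 2), with ybar_i = 1:
G = np.kron(np.eye(4, dtype=int), np.ones((3, 3), dtype=int))
print(theorem2_bounds(G, np.ones(12)))  # P bound = e^{-4} ≈ 0.0183, as in Table 1
```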


Table 1
Abandonment probabilities and network density.

In-degree   Ring network   Random network   Clique network   Upper bound
1           0.0002         0.0000           0.0025           0.0025
2           0.0012         0.0002           0.0183           0.0183
3           0.0022         0.0016           0.0498           0.0498
4           0.0062         0.0042           —                0.0907
5           0.0152         0.0134           0.1353           0.1353
6           0.0258         0.0228           —                0.1801
7           0.0354         0.0358           —                0.2231
8           0.0638         0.0476           —                0.2636
9           0.0988         0.0828           —                0.3012
10          0.1602         0.1352           —                0.3359

We can isolate the role of density by looking at graphs in which all agents have the same in-degree. Define $\hat{G}_i = \{j : g_{ji} = 1\}$ as the set of agents who observe agent i, and consider a network in which $|\hat{G}_i| = k$ for all i and some integer k. If we further suppose that each agent i has the same threshold $\bar{y}_i = \bar{y} > 0$, the optimization problem (7) admits a particularly simple solution: we take $y_i = \bar{y}/k$ for each i, and the minimum is $N\bar{y}/k$.

Corollary 2. Suppose $|\hat{G}_i| = k$ for each i, and all agents share the abandonment threshold $\bar{y}$. We have

\[
P(A_0 \mid \mu_1 = h) \leq e^{-\frac{N}{k}\bar{y}}
\]

and

\[
E[\eta \mid \mu_1 = l] \geq \frac{2\sigma^2}{(h-l)^2}\, \frac{N}{k}\, \bar{y}.
\]

The bound on the abandonment probability declines exponentially with population size, while the bound on expected waste increases linearly. The exponent and the slope in the respective bounds are smaller when the network is more dense. This suggests that, on average, dense networks experiment less, making more mistakes.

Simulations show that the intuition we get from this bound is not spurious. Table 1 presents abandonment probabilities for several networks with varying density.² Each network contains 12 agents, and the left column gives the in-degree of the agents. To construct the ring networks, one can imagine arranging agents in a ring, where an agent with degree d observes the d nearest agents. In the random networks, neighbors are chosen uniformly at random. The clique networks are only defined when d + 1 divides the number of agents—in these networks the agents are partitioned into completely connected groups. In all cases, I assume $h - l = \sigma = 1$ and $\bar{y}_i = 1$ for all i.

In each column, there is a clear positive relationship between the abandonment probability and density. It is also clear that other details of the network structure matter. While the clique networks attain the upper bound exactly, the other two networks have significantly lower abandonment probabilities. Within a clique, belief changes are perfectly correlated, so whatever bad news gets gathered affects all members of the clique. In the other graphs, it matters exactly who gathers how much bad news, even holding the total amount of bad news fixed. This makes it harder for all agents to abandon the innovation.

A comparison with the complete network is also instructive. If all agents observe everyone else’s output, beliefs move in lock step. Holding thresholds fixed at $\bar{y}$, the abandonment probability is exactly $e^{-\bar{y}}$, and the expected waste is $\frac{2\sigma^2}{(h-l)^2}\bar{y}$, independent of the population size. In light of our upper bound, we see that lower network density leads to a rapid reduction in the probability of mistakes relative to this benchmark.

Considering other factors that may influence collective experimentation, the effect of network density should dominate that of individual-level strategic effects. In the model of Bolton and Harris (1999), which studies the symmetric Bayesian equilibrium in a complete network, the abandonment probability declines at a rate 1/N, and the expected waste grows logarithmically with population size. The scaling rates in Corollary 2 highlight that removing links has a far more pronounced impact than informational free-riding.

Table 2 highlights the effect of centralization. The table reports abandonment probabilities for random networks with increasing centralization.³ All of the networks have average degree 5, but going down the table the distribution of in-degrees becomes more unequal. Increasing centralization leads to higher correlation in beliefs and hence a greater chance of abandoning the good innovation.
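For readers who want to reproduce estimates like those in Table 1, the sketch below simulates the learning process conditional on $\mu_1 = h$, using the drift and volatility from Lemma 1 and the threshold rule of Assumption 1. The Euler step, horizon, and early-exit heuristic are my own choices and are not details reported in the paper; the parameters h − l = σ = 1 and $\bar{y}_i = 1$ match the simulation setup described above.

```python
import numpy as np

def simulate_abandonment(G, ybar, h=1.0, l=0.0, sigma=1.0,
                         dt=0.01, t_max=200.0, rng=None):
    """One Euler-discretized run conditional on mu_1 = h. G is an (N, N)
    0/1 matrix with g_ij = 1 when i observes j (diagonal = 1); ybar holds
    the thresholds. Returns True if society abandons the innovation."""
    rng = np.random.default_rng() if rng is None else rng
    N = G.shape[0]
    y = np.zeros(N)                            # log-likelihood processes y_i
    drift = -(h - l) ** 2 / (2 * sigma ** 2)   # Lemma 1, mu_1 = h
    vol = (h - l) / sigma
    for _ in range(int(t_max / dt)):
        yG = G @ y                             # evidence y_{G,i} each agent sees
        active = yG <= ybar                    # who experiments (Assumption 1)
        if not active.any():
            return True                        # everyone has stopped: abandonment
        if yG.min() < -25.0:                   # heuristic early exit: decisive
            return False                       # good news, heading to adoption
        dB = np.sqrt(dt) * rng.standard_normal(N)
        y = np.where(active, y + drift * dt + vol * dB, y)
    return False

def ring(N, d):
    """Ring network: each agent observes herself and her d nearest agents."""
    G = np.eye(N, dtype=int)
    for i in range(N):
        for k in range(1, d + 1):
            offset = (k + 1) // 2 * (-1) ** k  # alternate left and right
            G[i, (i + offset) % N] = 1
    return G

rng = np.random.default_rng(0)
G = ring(12, 5)
runs = [simulate_abandonment(G, np.ones(12), rng=rng) for _ in range(2000)]
print(np.mean(runs))   # compare with the ring, in-degree 5 entry of Table 1
```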

5.2. Seeding

How might we encourage more widespread adoption of an innovation? Common approaches take the form of “seeding” strategies: induce adoption among a small subset of the population targeted to maximize social influence on others.

² For the ring and random networks, values are estimated using 5000 simulations.
³ These networks again contain 12 agents. I measure centralization using a version of the Gini coefficient applied to in-degrees—a more centralized network is one with a less equal distribution of attention from others.


Table 2
Abandonment probabilities and centralization.

Gini     Prob. abandon
0.000    0.0134
0.083    0.0166
0.167    0.0686
0.226    0.1130

Fig. 1. Two possible seeding strategies.

Fig. 2. Decentralized and centralized experimentation.

In practice, this entails providing information, subsidies, or rewards to selected individuals in a community. The subsequent effects of such programs depend both on who is seeded and the underlying mechanisms of social influence. When the key mechanism is information sharing, seeding can be a double-edged sword. A well-connected seed helps quickly spread good news about an innovation, but it also helps spread bad news that could lead to abandonment—long-run adoption may be more likely if well-connected individuals avoid gathering and sharing information.

Two examples help illustrate the key ideas. The intuition is that effective seeding minimizes correlations in individuals’ belief evolution processes. More correlation leads to higher outcome variance, which typically means a higher abandonment probability: society either quickly learns an innovation is good, or quickly gives up after bad early results. Reducing correlation means choosing seeds who are not connected to each other and avoiding central individuals. In some cases, we might like to “anti-seed” some agents, preventing them from experimenting to give others a chance to gather their own information.

First, suppose agents are highly skeptical of the innovation, so $\bar{y}_i \ll 0$ for each i. Seeding might take the form of a subsidy—e.g., pay some agents to use the innovation, thereby lowering the adoption thresholds. Fig. 1 illustrates two possible seedings in a simple network, where the red nodes indicate seeds. Since the unseeded nodes will not adopt without a large amount of good news, we can ignore them as we estimate the abandonment probability. In seeding (a), the two seeded agents will reach a long-run decision independently from one another. If one seed adopts, the entire network will eventually learn the true state and adopt. In order for the network to abandon the innovation, both seeds must independently abandon it. Contrast this outcome with that of seeding (b), in which the two seeds share identical beliefs unless and until the unseeded agents start experimenting. Since the beliefs of the two seeds move together, we effectively have just one seed. If $p_a$ is the abandonment probability under seeding (a), the abandonment probability under seeding (b) is approximately $\sqrt{p_a}$. Placing seeds adjacent to each other eliminates independent experimentation and therefore reduces the amount of information that the population gathers. Disconnected seeds gather more independent information, making abandonment less likely.
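Reusing simulate_abandonment from the sketch in section 5.1, a quick Monte Carlo makes the $\sqrt{p_a}$ relationship visible. The thresholds $\bar{y}_i = 1$ are an illustrative choice of mine, and only the two seeds are modeled, since the skeptical unseeded agents can be ignored when estimating the abandonment probability.

```python
import numpy as np

# Reuses simulate_abandonment from the section 5.1 sketch.
G_a = np.eye(2, dtype=int)             # seeding (a): disconnected seeds
G_b = np.ones((2, 2), dtype=int)       # seeding (b): adjacent seeds
ybar = np.ones(2)
rng = np.random.default_rng(1)

def estimate(G, runs=5000):
    return np.mean([simulate_abandonment(G, ybar, rng=rng) for _ in range(runs)])

print(estimate(G_a), np.exp(-2.0))     # p_a ≈ e^{-2} ≈ 0.135
print(estimate(G_b), np.exp(-1.0))     # p_b ≈ e^{-1} = sqrt(p_a) ≈ 0.368
```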
The same intuition helps us understand why anti-seeding can encourage adoption in the long run: if central agents experiment less, then other agents’ beliefs are less correlated. Consider the network in Fig. 2 with a central agent whom all others observe. Suppose all agents have the same abandonment threshold $\bar{y} > 0$, so all are inclined to engage in some experimentation. Society can abandon the innovation only if the central agent does so. Following our earlier analysis, the central agent abandons with probability $e^{-\bar{y}}$. Conditional on the central agent abandoning the innovation, whether each peripheral agent abandons is an independent event. Writing P for the probability that a peripheral agent abandons the innovation conditional on the central agent doing so, the probability that society abandons the innovation is $P^N e^{-\bar{y}}$.

Consider two interventions on the central agent. First, suppose we provide a subsidy to lower the abandonment threshold. This has two effects. Most obviously, the central agent is less likely to abandon the innovation. However, it also increases the conditional probability P that peripheral agents abandon it—in order to abandon, the central agent must gather more bad news, which makes it easier for each peripheral agent to abandon as well. For N sufficiently large, this latter effect dominates, and abandonment becomes more likely. Suppose instead that we block the central agent from experimenting. We lose the term $e^{-\bar{y}}$ in our evaluation of the abandonment probability, but P goes down for each peripheral agent. Again, if N is sufficiently large, the latter effect dominates: anti-seeding may be more effective than seeding in encouraging long-run adoption.
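The effect of blocking the center can also be checked numerically. In the sketch below, which again reuses simulate_abandonment, the star size N = 6, the thresholds $\bar{y}_i = 1$, and the run counts are illustrative choices of mine; setting the center’s threshold to $-\infty$ implements the anti-seed. With the center blocked, the abandonment probability equals $e^{-N}$, which falls below the seeded configuration’s $P^N e^{-\bar{y}}$ once N is large enough.

```python
import numpy as np

# Star of Fig. 2: agents 1..N observe the center (agent 0) and themselves.
N = 6
G = np.eye(N + 1, dtype=int)
G[1:, 0] = 1                                 # every peripheral observes the center
ybar = np.ones(N + 1)
rng = np.random.default_rng(2)

seeded = np.mean([simulate_abandonment(G, ybar, rng=rng)
                  for _ in range(10_000)])   # unoptimized sketch; takes a while

ybar_blocked = ybar.copy()
ybar_blocked[0] = -np.inf                    # anti-seed: center never experiments
blocked = np.mean([simulate_abandonment(G, ybar_blocked, rng=rng)
                   for _ in range(10_000)])

print(seeded, blocked)                       # blocking the center lowers abandonment
```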


This simple example already highlights some subtlety in the relationship between network structure, individual thresholds, and the abandonment probability. Decreasing an individual’s abandonment threshold may increase society’s abandonment probability. Not only that, the relationship between an individual’s threshold and society’s abandonment probability may be non-monotonic.

Fig. 3. A seeding example.

Extreme centralization as in the above example is unnecessary to see this effect. Consider the network in Fig. 3, in which there are three cliques of three agents each and a single intermediary. Suppose all agents initially have thresholds $\bar{y} = -0.5$, and we can seed three agents, increasing their thresholds to 0.5 to induce experimentation. If we seed the intermediary and two of that agent’s neighbors, the abandonment probability is approximately 0.3386. If instead we seed just the three neighbors of the intermediary, the abandonment probability falls to 0.2416. If we further have the ability to reduce the threshold of the intermediary, say to $\bar{y} = -2$, the abandonment probability falls further to 0.2096. By delaying spillovers across the three cliques, each is able to independently gather more information, and this increases the chance that seeding succeeds.

6. Final remarks

Information sharing networks are important drivers of innovation diffusion in firms, organizations, and communities. When individuals must engage in costly experimentation to learn about an innovation, the network structure has complex effects on learning and long-term adoption patterns. There is a tradeoff between gathering information and sharing information. The network structure and individual choices jointly determine how the group as a whole conducts this tradeoff. When individuals who are separated from one another gather most of the information, the group is less likely to reject useful innovations, but it takes longer to eliminate inferior ones.

The analysis offers insight on how—or how not—to encourage more widespread adoption of innovations. One finding is the information invariance principle: holding all else equal, making experimentation more informative does not make adoption more likely, even after conditioning on a good innovation. This would suggest, for instance, that increasing the size of test plots for new agricultural technologies should have little effect on long-term adoption. Individuals learn faster, but abandonment is just as likely.

Our exploration of seeding strategies illustrates the key role of belief correlations. Long-run adoption depends upon gathering independent information. When more information is shared, beliefs move together, and decisions to stop experimenting become correlated. This simple observation has important implications for how to encourage widespread adoption. While seeding individuals directly contributes to information acquisition, it may also increase correlations between agents’ beliefs.
In some cases, this can result in less information gathering overall. Anti-seeding, preventing certain individuals from experimenting, may be just as important as seeding because it helps reduce belief correlations. These results suggest a novel way to distinguish different mechanisms of social influence: we can look at the macro-level effects of different seeding strategies. If network externalities are behind social influence, we should expect clustered seeds to produce more adoption more reliably. In contrast, if social learning and experimentation are the primary mechanisms, then clustered seeds should produce higher variance outcomes with lower average adoption.

References

Acemoglu, Daron, Dahleh, Munther, Lobel, Ilan, Ozdaglar, Asuman, 2011. Bayesian learning in social networks. Rev. Econ. Stud. 78, 1201–1236.
Bala, Venkatesh, Goyal, Sanjeev, 1998. Learning from neighbours. Rev. Econ. Stud. 65, 595–621.
Banerjee, Abhijit, Breza, Emily, Chandrasekhar, Arun, Duflo, Esther, Jackson, Matthew, 2012. Come Play With Me: Experimental Evidence of Information Diffusion about Rival Goods. Working Paper.
Banerjee, Abhijit, Chandrasekhar, Arun, Duflo, Esther, Jackson, Matthew, 2013. The diffusion of microfinance. Science 341.


Bohren, Aislinn, Hauser, Daniel, 2018. Social Learning with Model Misspecification: A Framework and a Characterization. Working Paper.
Bolton, Patrick, Harris, Christopher, 1999. Strategic experimentation. Econometrica 67, 349–374.
Bursztyn, Leonardo, Ederer, Florian, Ferman, Bruno, Yuchtman, Noam, 2014. Understanding mechanisms underlying peer effects: evidence from a field experiment on financial decisions. Econometrica 82, 1273–1301.
Chandrasekhar, Arun, Larreguy, Horacio, Xandri, Juan Pablo, 2016. Testing Models of Social Learning on Networks: Evidence from a Framed Field Experiment. NBER Working Paper No. 21468.
Conley, Timothy, Udry, Christopher, 2010. Learning about a new technology: pineapple in Ghana. Am. Econ. Rev. 100, 35–69.
Dasaratha, Krishna, He, Kevin, 2018. Network Structure and Naïve Sequential Learning. Working Paper.
Duflo, Esther, Saez, Emmanuel, 2003. The role of information and social interactions in retirement plan decisions: evidence from a randomized experiment. Q. J. Econ. 118, 815–842.
Dupas, Pascaline, 2014. Short-run subsidies and long-run adoption of new health products: evidence from a field experiment. Econometrica 82, 197–228.
Elliott, Matthew, Golub, Ben, 2019. A network approach to public goods. J. Polit. Econ. 127, 730–776.
Eyster, Erik, Rabin, Matthew, Weizsäcker, Georg, 2015. An Experiment on Social Mislearning. Working Paper.
Frick, Mira, Iijima, Ryota, Ishii, Yuhta, 2018. Misinterpreting Others and the Fragility of Social Learning. Working Paper.
Galeotti, Andrea, Goyal, Sanjeev, 2010. The law of the few. Am. Econ. Rev. 100, 1468–1492.
Galeotti, Andrea, Goyal, Sanjeev, Jackson, Matthew, Vega-Redondo, Fernando, Yariv, Leeat, 2010. Network games. Rev. Econ. Stud. 77, 218–244.
Golub, Ben, Jackson, Matthew, 2010. Naïve learning in social networks and the wisdom of crowds. Am. Econ. J. Microecon. 2, 112–149.
Golub, Ben, Jackson, Matthew, 2012. How homophily affects the speed of learning and best-response dynamics. Q. J. Econ. 127, 1287–1338.
Grimm, Veronika, Mengel, Friederike, 2015. Experiments on Belief Formation in Networks. Working Paper.
Jackson, Matthew, Rogers, Brian, 2007. Relating network structure to diffusion properties through stochastic dominance. B.E. J. Theor. Econ. 7.
Jackson, Matthew, Yariv, Leeat, 2007. Diffusion of behavior and equilibrium properties in network games. Am. Econ. Rev. 97, 92–98.
Kempe, David, Kleinberg, Jon, Tardos, Éva, 2003. Maximizing the spread of influence through a social network. In: KDD Conference Proceedings.
Kempe, David, Kleinberg, Jon, Tardos, Éva, 2015. Maximizing the spread of influence through a social network. Theory Comput. 11, 105–147.
Lobel, Ilan, Sadler, Evan, 2015. Information diffusion in networks through social learning. Theor. Econ. 10, 807–851.
Morris, Stephen, 2000. Contagion. Rev. Econ. Stud. 67, 57–78.
Mueller-Frank, Manuel, Neri, Claudia, 2017. Quasi-Bayesian Updating in Social Networks. Working Paper.
Munshi, Kaivan, 2004. Social learning in a heterogeneous population: technology diffusion in the Indian green revolution. J. Dev. Econ. 73, 185–213.
Salish, Mirjam, 2015. Learning Faster or More Precisely? Strategic Experimentation in Networks. Working Paper.
Sorensen, Alan, 2006. Social learning and health plan choice. Rand J. Econ. 37, 929–945.