1 of 202

Social Networks & Health

2023

2 of 202

Welcome!

Monday 15

Tuesday 16

Wednesday 17

Thursday 18

Friday 19

9:00

jimi adams:

Panel on Network Interventions

James Moody

James Moody

9:30

Modalities and Pragmatics in data collection

Tom Valente

Review/Intro statistics on networks

Agent Based Models for SN&H.

 

10:00

Craig Rawlings:

Yamile Molina

James Moody

Sam Jenness

10:30

Social Balance, Children, Video

Nina Yamanis

New Features of ERGM 4.0

EpiModel: Epidemic Disease Model Sim

10:45

Coffee

Coffee

Coffee

Coffee

11:00

Moody:

Panel on Network Experiments

Thomas Wolff

Peter Cho, Jessilyn Dunn

11:30

Implications of Missing data

Sharique Hassan: Field Experiments

ERGM for Ego nets & Bipartite graphs

Wearables for Data Collection

12:00

Brea Perry

Ashley Harrell: Lab Experiments

Carter Butts:

Summary Q&A session

12:30

Advances in ego network data collection and analysis?

In Silico Network Experiments to Probe Mechanisms and Inform Study Design

1:00

Welcome

Lunch: DNAC Crew

Lunch: Dana Pasquale:

Lunch: Robin Dodsworth

1:30

James Moody

Open Discussion on Ethical Issues

Data Archiving

Linguistic variation and social networks

2:00

Networks & Health

Ashton Verdery

Craig Rawlings

Alex Volfovsky

2:30

Review

RDS/Link Data

Peer Influence Models

Latent Spaces & Causal Effects

2:45

Coffee

Coffee

Coffee

Coffee

3:00

DNAC RA

James Moody

Ashton Verdery

David Schaefer

3:30

Coding in R for SNA

What's the best community detection method?

Simulating Kinship Networks

Siena Models

4:00

Peter Mucha

James Moody

Scott Duxbury

4:30

Gabriel Varela: IDEANet

Proper Community Detection via CHAMP

Relational Block Models

Relational Event Models

5:00

Stump the Chumps

Stump the Chumps

Stump the Chumps

Stump the Chumps

3 of 202

  1. Part 1: Introduction & Theory
    1. History & Big Picture
    2. Network Relevance to Health Research
    3. Network Theory
      1. Connections & Positions

  • Part 2: Points & Lines
    • Network data
    • (Visualization)
    • Network metrics

  • Part 3: Network Models
    • Diffusion
      • Disease
      • Network autocorrelation
    • (Random Graph Models)
    • (SOAM)
    • Open Questions

Outline

Social Networks & Health

4 of 202

But scientists are starting to take networks seriously:

“Networks”

Introduction: why networks?

5 of 202

“Networks”

“Obesity”

But scientists are starting to take networks seriously: why?

5918 papers in 2018

Introduction: why networks?

6 of 202

…and NSF is investing heavily in it.

Introduction: why networks?

7 of 202

Social network analysis is:

  • a set of relational methods for systematically understanding and identifying connections among actors. SNA
    • is motivated by a structural intuition based on ties linking social actors
    • is grounded in systematic empirical data
    • draws heavily on graphic imagery
    • relies on the use of mathematical and/or computational models.

  • Social Network Analysis embodies a range of theories relating types of observable social spaces to individual and group behavior.

Introduction: why networks?

8 of 202

Social Determinants of Health

Social effects hold promising multiplier effects:

Introduction: why networks & health?

9 of 202

A brief history of Networks & health

Chapman, Alexander, Ashton M. Verdery, and James Moody. (2022). “Analytic Advances in Social Networks and Health in the Twenty-First Century.” Journal of Health and Social Behavior, 221465221086532. doi:10.1177/00221465221086532

10 of 202

A brief history of Networks & health

11 of 202

12 of 202

Social Science & Medicine, 2000

13 of 202

14 of 202

15 of 202

16 of 202

17 of 202

Science

Social

Networks

18 of 202

…more than 7500 publications…

19 of 202

Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris (2003). statnet: Software tools for the Statistical Modeling of Network Data. URL http://statnetproject.org

20 of 202

21 of 202

22 of 202

State of the field

Trends

English-language articles indexed in the Web of Science Social Science Citation Index matching: (("health" or "well being" or "medicine") and "network*").

There were 18,572 such papers between 2000 and 2018.

23 of 202

State of the field

Trends

24 of 202

State of the field

Big-Picture

Bibliographic Similarity Networks: 1-step neighborhood of a single paper

25 of 202

State of the field

Big-Picture

Bibliographic Similarity Networks: 2-step neighborhood of a single paper

26 of 202

State of the field

Big-Picture

Since the net is large…

Use a force-directed layout to display the full space & overlay clusters….

27 of 202

The example paper…

28 of 202

Modularity:

Top-Level: 0.798 @ 32 Clusters

2nd Level: 0.785 @ 150 Clusters

29 of 202

30 of 202

Rogers:

31 of 202

Valente: Various

32 of 202

Christakis & Fowler

33 of 202

Martina Morris: Concurrency

34 of 202

Provan: Network Effectiveness

35 of 202

Add Health

36 of 202

37 of 202

38 of 202

39 of 202

  1. Part 1: Introduction & Theory
    1. History & Big Picture
    2. Network Relevance to Health Research
    3. Network Theory
      1. Connections & Positions

  • Part 2: Points & Lines
    • Network data
    • (Visualization)
    • Network metrics

  • Part 3: Network Models
    • Diffusion
      • Disease
      • Network autocorrelation
    • (Random Graph Models)
    • (SOAM)
    • Open Questions

Outline

Social Networks & Health

40 of 202

Introduction

Network Research Lifecycle

41 of 202

Introduction

Shameless Plug:

Available this fall!!

42 of 202

Introduction

Key Questions

Social Network analysis lets us answer questions about social interdependence. These include:

“Networks as Variables” approaches

        • Are kids with smoking peers more likely to smoke themselves?
        • Do unpopular kids get in more trouble than popular kids?
        • Do central actors control resources?

“Networks as Structures” approaches

        • What generates hierarchy in social relations?
        • What network patterns spread diseases most quickly?
        • How do role sets evolve out of consistent relational activity?

Both: Connectionist vs. Positional features of the network

We don’t want to draw this line too sharply: emergent role positions can affect individual outcomes in a ‘variable’ way, and variable approaches constrain relational activity.

43 of 202

Connectionist:

Positional:

Networks as pipes

Networks as roles

Ego

Complete

Multiple

- Structural Holes

- Density

- Mixing Models

- Size

- Community Detection

- Reachability

- Homophily

- Degree Distribution

- Social Balance

- ERGM

- Multi-layer networks

- Multi-level models of multiple networks

- Local Roles (Mandel 1983, Mandel & Winship 1984)

- Relational Block Models

- Motifs

Centrality

Cohesive blocking

2 ideas:

  • Patterns in networks

  • Patterns of networks

Connections & Positions: Network Problems

44 of 202

Why do networks matter?

Two fundamental mechanisms: Connections

Connectionist network mechanisms : Networks matter because of the things that flow through them. Networks as pipes.

  • Disease diffusion
  • Adoption of innovations
  • Joining social movements
  • Spread of misinformation online
  • Rumors
  • Peer influence

45 of 202

The spread of any epidemic depends on the number of secondary cases per infected case, known as the reproductive rate (R0). R0 depends on the probability that a contact will be infected over the duration of contact (β), the likelihood of contact (c), and the duration of infectiousness (D).

For network transmission problems, the trick is specifying c, which depends on the network.
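In the standard textbook formulation, the three quantities defined above combine multiplicatively. The expression below is a sketch of that relationship using the slide's own symbols; the threshold condition is the usual interpretation and is an addition here, not something stated on the slide.

```latex
R_0 = \beta \, c \, D, \qquad \text{with sustained spread expected when } R_0 > 1 .
```

Under this reading, the network enters through $c$: two populations with the same $\beta$ and $D$ can have very different $R_0$ if their contact structures differ.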

Why do networks matter?

Two fundamental mechanisms: Connections example

46 of 202

Isolated vision

Why do networks matter?

Two fundamental mechanisms: Connections example

47 of 202

Connected vision

Why do networks matter?


Two fundamental mechanisms: Connections example

48 of 202

Connections: Diffusion

Connectionist approaches are (by far) the most common aspect of network models in health research.

Theoretically any feature of the setting that governs spread through the network is of interest and will be reflected at multiple levels of the network.

Often we don’t have exact traces of the diffusion itself, only roughly timed outcome differences, which causes problems.

49 of 202

Provides food for

Romantic Love

Bickers with

Why do networks matter?

Two fundamental mechanisms: Positions

Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.

50 of 202

Parent

Parent

Child

Child

Child

Provides food for

Romantic Love

Bickers with

Why do networks matter?

Two fundamental mechanisms: Positions

Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.

C

P

X

Y

51 of 202

Pescosolido & Rubin

Basic structuralist duality: persons are the intersection of who (& how) they are connected to; while collectives are the emergent structure built from those connections.

Social life is a collective & structured affair: the groups we belong to & the roles we occupy simultaneously define us and our social setting.

Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.

Why do networks matter?

Two fundamental mechanisms: Positions

52 of 202

There is a classic structure-action tension between network structuralism and any duality-as-identity.

On the one hand, strong structures constrain interaction opportunities and acceptable activities to such a degree that the network, and one’s position in it, is fixed. Actors are structural dupes.

On the other hand, personal agency implied by self-authorship & “selection” suggests rapidly changing networks that (seem to) lack the substantive stability necessary to warrant being called “structure” in any meaningful sense. Networks become the epiphenomenon of action.

Network tools allow us to empirically interrogate the (also classic) route out of this dilemma: structures are (re)constituted in the ways actors behave; regularized ways of interacting create dynamically-stable settings.

This creates multiple levels of measurement & modeling

Why do networks matter?

Two fundamental mechanisms: wider theory

53 of 202

Why do networks matter?

Two fundamental mechanisms: Multiple levels

54 of 202

Stability & Trajectory patterns across:

  1. Direct ties: consistent micro mechanisms
  2. Popularity structure: global macro structure
  3. Role positions
  4. Peer groups

Each level requires new techniques or tools to capture setting

That we find stability at the macro level despite lots of churn at the local level raises interesting questions:

Why do networks matter?

Two fundamental mechanisms: Multiple levels


58 of 202

Each level requires new techniques or tools to capture setting

Positional approaches are less common in health research, but very promising as new ways to conceptualize context effects.

That we find stability at the macro level despite lots of churn at the local level raises interesting questions:

Stability & Trajectory patterns across:

  1. Direct ties: consistent micro mechanisms
  2. Popularity structure: global macro structure
  3. Role positions
  4. Peer groups

Why do networks matter?

Two fundamental mechanisms: Multiple levels

59 of 202

  1. Part 1: Introduction & Theory
    1. History & Big Picture
    2. Network Relevance to Health Research
    3. Network Theory
      1. Connections & Positions

  • Part 2: Points & Lines
    • Network data
    • (Visualization)
    • Network metrics

  • Part 3: Network Models
    • Diffusion
      • Disease
      • Network autocorrelation
    • Random Graph Models
    • SOAM
    • Open Questions

Outline

Social Networks & Health

60 of 202

The unit of interest in a network is the combined set of actors and their relations.

We represent actors with points and relations with lines.

Actors are referred to variously as:

Nodes, vertices or points

Relations are referred to variously as:

Edges, Arcs, Lines, Ties

Example:

a

b

c

e

d

Social Network Data

61 of 202

In general, a relation can be:

Binary or Valued

Directed or Undirected

a

b

c

e

d

Undirected, binary

Directed, binary

a

b

c

e

d

a

b

c

e

d

Undirected, Valued

Directed, Valued

a

b

c

e

d

1

3

4

2

1

Social Network Data

62 of 202

In general, a relation can be: (1) Binary or Valued (2) Directed or Undirected

Social Network Data

Basic Data Elements

The social process of interest will often determine what form your data take. Conceptually, almost all of the techniques and measures we describe can be generalized across data format, but you may have to do some of the coding work yourself….

Directed,

Multiplex categorical edges

a

b

c

e

d

63 of 202

We can examine networks across multiple levels:

1) Ego-network

- Have data on a respondent (ego) and the people they are connected to (alters).

- May include estimates of connections among alters

2) Partial network

- Ego networks plus some amount of tracing to reach contacts of contacts

- Something less than full account of connections among all pairs of actors in the relevant population

- Example: CDC Contact tracing data for STDs

Social Network Data

Basic Data Elements: Levels of analysis

64 of 202

3) Complete or “Global” data

- Data on all actors within a particular (relevant) boundary

- Never exactly complete (due to missing data), but boundaries are set

    • Example: Coauthorship data among all writers in the social sciences, friendships among all students in a classroom

We can examine networks across multiple levels:

Social Network Data

Basic Data Elements: Levels of analysis

65 of 202

Ego-Net

Global-Net

Best Friend

Dyad

Primary

Group

Social Network Data

Basic Data Elements: Levels of analysis

2-step

Partial network

66 of 202

Social Network Data

Social network data are substantively divided by the number of modes in the data.

1-mode data represents edges based on direct contact between actors in the network. All the nodes are of the same type (people, organizations, ideas, etc.). Examples: communication, friendship, giving orders, sending email.

This is commonly what people think about when thinking about networks: nodes having direct relations with each other.

67 of 202

Social Network Data

Social network data are substantively divided by the number of modes in the data.

2-mode data represents nodes from two separate classes, where all ties are across classes. Examples:

People as members of groups

People as authors on papers

Words used often by people

Events in the life history of people

The two modes of the data represent a duality: you can project the data as people connected to people through joint membership in a group, or groups to each other through common membership

There may be multiple relations of multiple types connecting your nodes.

68 of 202

Bipartite networks imply a constraint on the mixing, such that ties only cross classes.

Here we see a tie connecting each woman with the party she attended (Davis data)

Social Network Data

Basic Data Elements: Modes


70 of 202

By projecting the data, one can look at the memberships shared between people or the members common to groups: this is the person-to-person projection of the 2-mode data.

Social Network Data

Basic Data Elements: Modes

71 of 202

Social Network Data

Basic Data Elements: Modes

By projecting the data, one can look at the memberships shared between people or the members common to groups: this is the group-to-group projection of the 2-mode data.

72 of 202

Casalino, Lawrence P., Michael F. Pesko, Andrew M. Ryan, David J. Nyweide, Theodore J. Iwashyna, Xuming Sun, Jayme Mendelsohn and James Moody. “Physician Networks and Ambulatory Care Admissions” Medical Care 53:534-41

Social Network Data

Example of a 2-mode network: Patients & Care Settings

73 of 202

Social Network Data

Multi-layer networks

A generalization of multiplex networks and multi-mode networks is the multi-layer network. A multi-layer network is a network that has qualitatively different classes of nodes (like multi-mode networks) and qualitatively different types of relations (like multi-plex). Multiplex and multi-mode networks can be subsumed under the multi-layer formalism.

See Kivelä, Mikko, Alex Arenas, Marc Barthelemy, James P. Gleeson, Yamir Moreno, and Mason Porter. “Multilayer Networks.” Journal of Complex Networks 2:203-271. https://doi.org/10.1093/comnet/cnu016

74 of 202

From pictures to matrices

[Figure: undirected, binary graph on nodes a-e]

X =

  a b c d e
a 0 1 0 0 0
b 1 0 1 0 0
c 0 1 0 1 1
d 0 0 1 0 1
e 0 0 1 1 0

For an undirected graph, the corresponding matrix is symmetric.

The traditional way to store & represent network data is with an adjacency matrix.

The matrix (X) at right represents an undirected binary network. Each node (a-e) is listed on both the row and the column.

The ith row and the jth column (Xij) records the value of a tie from node i to node j. For example, the line between nodes a and b is represented as an entry in the first row and second column (red at right).

Because the graph is undirected, the ties sent are the same as the ties received, so every entry above the diagonal equals the corresponding entry below the diagonal.
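The construction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the five-node example graph has the ties a-b, b-c, c-d, c-e, and d-e (the edge set given in the arc-list slide of this deck); the variable names are illustrative only.

```python
# Nodes and undirected ties for the five-node example graph (assumed edge set).
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]

index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)

# X[i][j] = 1 if there is a tie from node i to node j, else 0.
X = [[0] * n for _ in range(n)]
for u, v in edges:
    X[index[u]][index[v]] = 1
    X[index[v]][index[u]] = 1  # undirected: record the tie in both directions

# An undirected graph yields a symmetric matrix: X[i][j] == X[j][i].
is_symmetric = all(X[i][j] == X[j][i] for i in range(n) for j in range(n))

for name, row in zip(nodes, X):
    print(name, row)
print("symmetric:", is_symmetric)
```

For a directed graph, dropping the second assignment inside the loop would record only the tie that was sent, and the symmetry check would generally fail.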

Basic Data Structures

Social Network Data

75 of 202

Directed, binary

a

b

c

e

d

a

b

c

d

e

a

b

c

d

e

1

1

1

1

1

1

1

For a directed graph, the corresponding matrix is generally asymmetric.

We can see that Xab = 1 and Xba = 1; therefore a “sends” to b and b “sends” to a. However, Xbc = 0 while Xcb = 1; therefore c “sends” to b, but b does not reciprocate.

Basic Data Structures

Social Network Data

76 of 202

a

b

c

d

e

a

b

c

d

e

1

3

1

2

4

2

1

For a directed graph, the corresponding matrix is generally asymmetric.

We can see that Xab = 1 and Xba = 1; therefore a “sends” to b and b “sends” to a. However, Xbc = 0 while Xcb = 1; therefore c “sends” to b, but b does not reciprocate.

Basic Data Structures

Social Network Data

Directed, Valued

a

b

c

e

d

77 of 202

From matrices to lists (binary)

Adjacency Matrix

  a b c d e
a 0 1 0 0 0
b 1 0 1 0 0
c 0 1 0 1 1
d 0 0 1 0 1
e 0 0 1 1 0

Adjacency List

a b
b a c
c b d e
d c e
e c d

Arc List

a b
b a
b c
c b
c d
c e
d c
d e
e c
e d

Social network analysts also use adjacency lists and arc lists to store network data more efficiently.
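As a rough sketch of how the three storage formats relate, the Python below derives an adjacency list and an arc list from the binary adjacency matrix of the five-node example. The matrix values are taken from the arc list on this slide; the variable names are illustrative.

```python
nodes = ["a", "b", "c", "d", "e"]

# Binary adjacency matrix for the five-node undirected example.
X = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
]

# Adjacency list: each node followed by the nodes it sends ties to.
adjacency_list = {
    nodes[i]: [nodes[j] for j, x in enumerate(row) if x]
    for i, row in enumerate(X)
}

# Arc list: one (sender, receiver) pair per nonzero cell of the matrix.
arc_list = [
    (nodes[i], nodes[j])
    for i, row in enumerate(X)
    for j, x in enumerate(row)
    if x
]

for node, contacts in adjacency_list.items():
    print(node, *contacts)
print(len(arc_list), "arcs stored instead of", len(nodes) ** 2, "matrix cells")
```

The saving is small here, but in a sparse network with thousands of nodes the lists grow with the number of ties while the matrix grows with the square of the number of nodes.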

a

b

c

e

d

Basic Data Structures

Social Network Data

78 of 202

From matrices to lists (valued)

Adjacency Matrix

  a b c d e
a 0 1 0 0 0
b 1 0 2 0 0
c 0 2 0 3 5
d 0 0 3 0 1
e 0 0 5 1 0

Adjacency List

a b
b a c
c b d e
d c e
e c d

Arc List

a b 1
b a 1
b c 2
c b 2
c d 3
c e 5
d c 3
d e 1
e c 5
e d 1

Social network analysts also use adjacency lists and arc lists to store network data more efficiently.

a

b

c

e

d

Basic Data Structures

Social Network Data

1

2

5

1

3

a 1

b 1 2

c 2 3 1

d 3 1

e 5 1

contact

value

79 of 202

From matrices to lists (valued)

a

b

c

d

e

a

b

c

d

e

1

1

2

3

a b 1 green

b a 1 green

c b 2 green

c b 2 red

c d 3 green

c e 5 red

e c 5 red

e d 1 red

Arc List

Social network analysts also use adjacency lists and arc lists to store network data more efficiently.

Basic Data Structures

Social Network Data

a

b

c

e

d

a

b

c

d

e

2

5

5

1

Set/stack/list of adj mats

80 of 202

Working with two-mode data

A person-to-group adjacency matrix is rectangular, with one mode (persons, say) down rows and the other (groups, say) across columns

A =

  1 2 3 4 5
A 0 0 0 0 1
B 1 0 0 0 0
C 1 1 0 0 0
D 0 1 1 1 1
E 0 0 1 0 0
F 0 0 1 1 0

Each column is a group, each row a person, and the cell = 1 if the person in that row belongs to that group.

You can tell how many groups two people both belong to by comparing the rows: Identify every place that both rows = 1, sum them, and you have the overlap.

Basic Data Structures

Social Network Data

$G = A^{T} A$ = group-to-group projection

$P = A A^{T}$ = person-to-person projection
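The two projections can be checked by hand with a small Python sketch using the person-by-group matrix from this slide. The helper functions `transpose` and `matmul` are written out here for illustration (they are not from any particular library); in practice a matrix library would do the same job.

```python
persons = ["A", "B", "C", "D", "E", "F"]
groups = [1, 2, 3, 4, 5]

# Person-by-group membership matrix from the slide (rows = persons).
A = [
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def matmul(m1, m2):
    """Plain matrix product of two lists-of-lists."""
    cols = transpose(m2)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in m1]

AT = transpose(A)
P = matmul(A, AT)  # person-to-person: P[i][j] = groups shared by persons i and j
G = matmul(AT, A)  # group-to-group:  G[g][h] = members shared by groups g and h

print("Groups shared by D and F:", P[3][5])
print("Members shared by groups 3 and 4:", G[2][3])
```

The diagonal of P counts how many groups each person belongs to, and the diagonal of G counts the size of each group, so the diagonals are usually set aside when the projection is treated as a network.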

81 of 202

Social Network Data

Multi-layer networks

5 waves of Newcomb fraternity data as multi-layer.

82 of 202

  1. Part 1: Introduction & Theory
    1. History & Big Picture
    2. Network Relevance to Health Research
    3. Network Theory
      1. Connections & Positions

  • Part 2: Points & Lines
    • Network data
    • Visualization
    • Network metrics

  • Part 3: Network Models
    • Diffusion
      • Disease
      • Network autocorrelation
    • Random Graph Models
    • SOAM
    • Open Questions

Outline

Social Networks & Health

83 of 202

Exploratory analysis: Visualization

One of the first analysis tasks one undertakes is to visualize the network. This gives you a chance to gut-check the data, make sure nothing looks terribly out of place and start exploring key “shape” functions of the setting.

A good visualization can go a long way toward building intuition about what features are shaping the network. This is a very deep field; we’re just going to touch on enough to get you started.

84 of 202

A good network drawing allows viewers to come away from the image with an almost immediate intuition about the underlying structure of the network being displayed.

However, because there are multiple ways to display the same information, and standards for doing so are few, the information content of a network display can be quite variable.

Consider the 4 graphs drawn at right and ask yourself what intuition you gain from each graph.

Now trace the actual pattern of ties. You will see that these 4 graphs are exactly the same.

Exploratory analysis: Visualization

85 of 202

Network visualization helps build intuition, but you have to keep the drawing heuristic. Here we show the same graphs with two different techniques:

Tree-Based layouts

Most effective for very sparse, regular graphs. Very useful when relations are strongly directed, such as organization charts, cluster trees or internet connections.

“Force” layouts

Most effective with graphs that have a strong community structure (clustering, etc.). Provides a very clear correspondence between social distance and plotted distance.

Two images of the same network

(good)

(Fair - poor)

Basic Layout Heuristics

86 of 202

Tree-Based layouts

Two layouts of the same network

(poor)

(good)

Network visualization helps build intuition, but you have to keep the drawing heuristic. Here we show the same graphs with two different techniques:

Basic Layout Heuristics

“good” = high correlation between spatial distance and network distance.

“Force” layouts

87 of 202

Fixed Coordinate Layouts

Two layouts of the same network

Network visualization helps build intuition, but you have to keep the drawing heuristic. Here we show the same graphs with two different techniques:

Basic Layout Heuristics

Fit: 0.528

Fit: .682

88 of 202

Fixed Coordinate Layouts

Two layouts of the same network

Network visualization helps build intuition, but you have to keep the drawing heuristic. Here we show the same graphs with two different techniques:

Basic Layout Heuristics

Fit=0.57

Fit=0.26

89 of 202

Levels of distortion?

Two layouts of the same network

Network visualization helps build intuition, but you have to keep the drawing heuristic. Here we show the same graphs with two different techniques:

Basic Layout Heuristics

Shrink each cluster to accent separation

Fit=0.75

Fit=0.81

90 of 202

Secondary style elements

Some seemingly innocuous features can shape the effect of network diagrams

Use curves sparingly…most useful when there are multiple edges between the same nodes.

91 of 202

Abstraction helps – consider dense networks

Co-voting similarity, 109th Senate

Basic Problem 1: Density/Scale

92 of 202

An alternative is to think of a network trace as the likely number reached from any given node, where “likely” is determined either by an edge transmission p(transfer|dist) or simple distance (geodesic), and follow each starting node.

Trace Curves: Number reachable at each step

The collection of curves is useful, to either compare to a null hypothesis or to contextualize an observed epidemic. Curves are simple to use, as we have good statistical measures of their shape.

But they are still one step removed from the network itself, so how might we incorporate this information into the network?

Basic Problem 2: Spread over a dynamic network

93 of 202

Tree Based Layout.

Since concurrency is an edge property, we color the edges by the concurrency status of the tie…

Start with Context: Now follow only the diffusion paths

Basic Problem 2: Spread over a dynamic network

94 of 202

Tree Based Layout.

..but since the path carries the infection, we highlight the cumulative effect of concurrency by noting any place where a transmission would have been impossible were it not for an earlier concurrent edge.

This is one way to deal with the importance of network dynamics.

Start with Context: Now follow only the diffusion paths

Basic Problem 2: Spread over a dynamic network

95 of 202

Another way to represent time is by stacking relations against time:

Basic Problem 3: Dynamic network Evolution

96 of 202

  1. Part 1: Introduction & Theory
    1. History & Big Picture
    2. Network Relevance to Health Research
    3. Network Theory
      1. Connections & Positions

  • Part 2: Points & Lines
    • Network data
    • Visualization
    • Network metrics

  • Part 3: Network Models
    • Diffusion
      • Disease
      • Network autocorrelation
    • Random Graph Models
    • SOAM
    • Open Questions

Outline

Social Networks & Health

97 of 202

Basic Network Metrics

Much of network “analysis” is measuring a metric on the graph. These often define features that are then used as independent variables in a substantive analysis.

We next describe some of the most common basic elements; there are many alternatives and nuances to each…this is the tip of the iceberg.

  • Basic Volume (Degree, Density)
  • Reachability & Paths
  • Centrality
  • Community Detection

98 of 202

Density: Mean of the adjacency matrix

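[Figure: the five-node example graph (nodes a-e) with tie values a-b = 1, b-c = 2, c-d = 3, c-e = 5, d-e = 1]

Valued matrix:

   a  b  c  d  e
a  -- 1  0  0  0
b  1  -- 2  0  0
c  0  2  -- 3  5
d  0  0  3  -- 1
e  0  0  5  1  --

Binary matrix:

   a  b  c  d  e
a  -- 1  0  0  0
b  1  -- 1  0  0
c  0  1  -- 1  1
d  0  0  1  -- 1
e  0  0  1  1  --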

For most real-life social networks, density is hard to interpret because the denominator scales by the square of population. So average degree is often better.
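To make the contrast between density and average degree concrete, here is a minimal Python sketch on the five-node binary example. It assumes the edge set a-b, b-c, c-d, c-e, d-e used elsewhere in this deck, and it excludes the diagonal from the density denominator; both choices are stated assumptions rather than the slide's own computation.

```python
# Binary adjacency matrix for the five-node undirected example (assumed edge set).
X = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
]
n = len(X)

# Density: mean of the off-diagonal cells of the adjacency matrix.
ties = sum(X[i][j] for i in range(n) for j in range(n) if i != j)
density = ties / (n * (n - 1))

# Degree: number of ties adjacent to each node (row sums for a binary matrix).
degrees = [sum(row) for row in X]
mean_degree = sum(degrees) / n

print("density:", density)
print("degrees:", degrees)
print("mean degree:", mean_degree)
```

Holding mean degree fixed at 2 while the population grows to 1,000 nodes would push density down to roughly 0.002, which is why average degree is usually the easier quantity to compare across settings.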

Social Network Metrics: Basic Volume

99 of 202

Degree: Number of links adjacent to a node

a

b

c

d

e

a

b

c

d

e

1

1

1

1

1

1

1

1

1

3

0

2

1 2 1 2 1 7

a

b

c

d

e

a

b

c

d

e

1

1

2

3

1

5

1

1

1

6

0

6

1 3 5 4 1 14

Social Network Metrics: Basic Volume

100 of 202

Connectivity refers to how actors in one part of the network are connected to actors in another part of the network.

    • Reachability: Is it possible for actor i to reach actor j? This can only be true if there is a chain of contact from one actor to another.

    • Distance: Given they can be reached, how many steps are they from each other?

    • Redundancy: How many different paths connect each pair?

Social Network Metrics: Connectivity

101 of 202

d

e

c

Indirect connections are what make networks systems. One actor can reach another if there is a path in the graph connecting them.

a

b

c

e

d

f

b

f

a

Paths can be directed, leading to a distinction between strong and weak components

Social Network Metrics: Connectivity

102 of 202

Basic elements in connectivity

    • A path is a sequence of nodes and edges starting with one node and ending with another, tracing the indirect connection between the two. On a path, you never go backwards or revisit the same node twice.

Example: a → b → c → d

    • A walk is any sequence of nodes and edges, and may go backwards. Example: a → b → c → b → c → d

    • A cycle is a path that starts and ends with the same node. Example: a → b → c → a

Social Network Metrics: Connectivity

103 of 202

Reachability

If you can trace a sequence of relations from one actor to another, then the two are reachable. If there is at least one path connecting every pair of actors in the graph, the graph is connected and is called a component.

Intuitively, a component is the set of people who are all connected by a chain of relations.

Social Network Metrics: Connectivity

104 of 202

This example contains many components.

Social Network Metrics: Connectivity

105 of 202

Because relations can be directed or undirected, components come in two flavors. For a graph with any directed edges, there are two types of components:

Strong components consist of the set(s) of all nodes that are mutually reachable.

Weak components consist of the set(s) of all nodes that are connected when the direction of the ties is ignored.
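The distinction can be illustrated with a short Python sketch. The six-node directed graph below is hypothetical (it does not come from the slides), and weak components are computed by ignoring tie direction, which is the standard graph-theoretic definition. The approach is a simple reachability search rather than an optimized algorithm.

```python
from collections import deque

def reachable(adj, start):
    """Return the set of nodes reachable from start by following ties."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def components(nodes, arcs, strong):
    """Strong: mutually reachable sets. Weak: reachable ignoring direction."""
    adj = {u: set() for u in nodes}
    for u, v in arcs:
        adj[u].add(v)
        if not strong:
            adj[v].add(u)  # weak components treat every tie as undirected
    reach = {u: reachable(adj, u) for u in nodes}
    found, assigned = [], set()
    for u in nodes:
        if u in assigned:
            continue
        # Keep only nodes that can also reach u back (mutual reachability).
        comp = {v for v in reach[u] if u in reach[v]}
        found.append(comp)
        assigned |= comp
    return found

# Hypothetical example: a 3-cycle a->b->c->a, a tie c->d, and a separate tie e->f.
nodes = ["a", "b", "c", "d", "e", "f"]
arcs = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "f")]

strong = components(nodes, arcs, strong=True)
weak = components(nodes, arcs, strong=False)
print("strong:", strong)
print("weak:", weak)
```

In this example d joins the a-b-c group only in the weak sense: c can reach d, but d cannot reach anyone, so d forms a strong component of its own.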

Social Network Metrics: Connectivity

106 of 202

There are only 2 strong components with more than 1 person in this network.

Components are the minimum requirement for social groups. As we will see later, they are necessary but not sufficient.

All of the major network analysis software packages identify strong and weak components.

Social Network Metrics: Connectivity

107 of 202

Many large networks are characterized by a highly skewed distribution of the number of partners (degree)

Social Network Metrics: Connectivity

Degree distributions


110 of 202

The scale-free model focuses on the distance-reducing capacity of high-degree nodes:

If a preferential attachment model is active, then high-degree nodes are hubs that, if removed, can disconnect the network.

Since hubs are rare, PA networks are robust to random attack, but very fragile to targeted attack.

Social Network Metrics: Connectivity

Degree distributions

111 of 202

Colorado Springs High-Risk

(Sexual contact only)

  • Network is approximately scale-free, with λ = -1.3

  • But connectivity does not depend on the hubs.

  • Preferential attachment → scale-free distribution, but a scale-free distribution does not imply preferential attachment.

Social Network Metrics: Connectivity

Degree distributions

112 of 202

Geodesic distance is measured by the smallest (weighted) number of relations separating a pair.

Actor “a” is:

1 step from 4 others

2 steps from 5 others

3 steps from 4 others

4 steps from 3 others

5 steps from 1 other

Social Network Metrics: Connectivity

Distance

113 of 202

     a  b  c  d  e  f  g  h  i  j  k  l  m
    --------------------------------------
a    .  1  2  .  .  .  .  .  .  .  .  2  1
b    3  .  1  .  .  .  .  .  .  .  .  1  2
c    .  .  .  .  .  .  .  .  .  .  .  .  .
d    4  3  1  .  1  2  1  .  2  .  .  2  3
e    3  2  2  1  .  1  2  .  1  .  .  1  2
f    4  3  3  2  1  .  3  .  2  .  .  2  3
g    5  4  4  3  2  1  .  .  3  .  .  3  4
h    .  .  .  .  .  .  .  .  1  .  .  .  .
i    .  .  .  .  .  .  .  .  .  .  .  .  .
j    .  .  .  .  .  .  .  .  1  .  .  .  .
k    .  .  .  .  .  .  .  .  1  .  .  .  .
l    2  1  2  .  .  .  .  .  .  .  .  .  1
m    1  2  3  .  .  .  .  .  .  .  .  1  .

When the graph is directed, distance is also directed (distance to vs distance from), following the direction of the tie.
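
A short sketch of why distance becomes directional, assuming a hypothetical four-node directed network (not the one pictured on the slide). Breadth-first search along outgoing ties gives "distance from"; the reverse distance generally differs and may not exist at all.

```python
from collections import deque


def distances_from(adj, source):
    """Geodesic distance from source to every node it can reach along directed ties."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hypothetical arcs: a -> b -> c -> a, plus d -> a.
arcs = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
from_a = distances_from(arcs, "a")
from_b = distances_from(arcs, "b")
from_d = distances_from(arcs, "d")
print(from_a["b"], from_b["a"])   # 1 step from a to b, 2 steps from b to a
print("d" in from_a, from_d["a"])  # a cannot reach d, but d reaches a in 1 step
```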

Social Network Metrics: Connectivity

Distance

114 of 202

Reachability in Colorado Springs

(Sexual contact only)

(Node size = log of degree)

  • High-risk actors over 4 years
  • 695 people represented
  • Longest path is 17 steps
  • Average distance is about 5 steps
  • Average person is within 3 steps of 75 other people

Social Network Metrics: Connectivity

Distance

115 of 202

As a graph statistic, the distribution of distance can tell you a good deal about how close people are to each other (we’ll see this more fully when we get to closeness centrality).

The diameter of a graph is the longest geodesic, giving the maximum distance. We often use l, the mean distance between every pair of nodes, to characterize the entire graph.

For example, all else equal, we would expect rumors to travel faster through settings where the average distance is small.
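
As an illustration of these two graph-level summaries, the sketch below computes the diameter and the mean distance for a 7-node line, assuming an undirected, connected graph stored as an adjacency dictionary (the function names are illustrative).

```python
from collections import deque
from statistics import mean


def bfs_distances(adj, source):
    """Geodesic distances from source in an undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def distance_summary(adj):
    """Return (diameter, mean distance) over all reachable pairs."""
    lengths = [
        d
        for source in adj
        for target, d in bfs_distances(adj, source).items()
        if target != source
    ]
    return max(lengths), mean(lengths)


# A line of 7 nodes: 0 - 1 - 2 - 3 - 4 - 5 - 6
line = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
diameter, mean_distance = distance_summary(line)
print(diameter, round(mean_distance, 2))  # 6 2.67
```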

Measuring Networks: Connectivity

Distance

116 of 202

For a real network, people’s friends are not random, but clustered. We can modify the random equation by adjusting a, such that some portion of the contacts are random and the rest are not. This adjustment is a ‘bias’ (i.e., a non-random element in the model) that gives rise to the notion of ‘biased networks’. People have studied (mathematically) biases associated with:

      • Race (and categorical homophily more generally)
      • Transitivity (Friends of friends are friends)
      • Reciprocity (i--> j, j--> i)

There is still a great deal of work to be done in this area empirically, and it promises to be a good way of studying the structure of very large networks.

Measuring Networks: Connectivity

Redundancy (Local)

117 of 202

Measuring Networks: Connectivity

Redundancy (Local)

Local redundancy is known as “clustering” or “transitivity” - that one’s friends are friends with each other.

Density is the proportion of pairs tied, excluding ego.

Transitivity is the proportion of two-step ties that are closed (Friend of a Friend is a friend)
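
A small sketch of both measures, assuming a hypothetical ego network (ego plus three alters, with one tie among the alters); the values are for this toy case only and do not reproduce the slide's figures.

```python
from itertools import combinations


def density(nodes, ties):
    """Proportion of node pairs that are tied (undirected)."""
    pairs = list(combinations(sorted(nodes), 2))
    if not pairs:
        return 0.0
    tied = sum(1 for i, j in pairs if frozenset((i, j)) in ties)
    return tied / len(pairs)


def transitivity(nodes, ties):
    """Proportion of two-step paths (i - center - j) whose ends are also tied."""
    neighbors = {n: set() for n in nodes}
    for tie in ties:
        i, j = tuple(tie)
        neighbors[i].add(j)
        neighbors[j].add(i)
    two_paths = closed = 0
    for center in nodes:
        for i, j in combinations(sorted(neighbors[center]), 2):
            two_paths += 1
            closed += frozenset((i, j)) in ties
    return closed / two_paths if two_paths else 0.0


# Hypothetical ego network: ego tied to a, b, c; a and b are tied to each other.
ego = "ego"
alters = ["a", "b", "c"]
ties = {frozenset(t) for t in [("ego", "a"), ("ego", "b"), ("ego", "c"), ("a", "b")]}
alter_ties = {t for t in ties if ego not in t}

alter_density = density(alters, alter_ties)       # 1 of 3 alter pairs is tied
closure = transitivity([ego] + alters, ties)      # 3 of 5 two-step paths are closed
print(round(alter_density, 3), round(closure, 3))
```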

Density   Transitivity   Transitivity (no ego)
0         0              0
0.4       0.71           1.0
1         1              1
0.7       0.78           0.64

118 of 202

Node Connectivity

As size of cut-set

0

1

2

3

Structural Cohesion:

A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.
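
The cut-set definition can be sketched by brute force: try every set of 0, 1, 2, ... actors and stop at the first size whose removal disconnects the network. This is only workable for very small graphs, and the helper names are illustrative; returning n - 1 for a complete graph is a common convention assumed here.

```python
from itertools import combinations


def is_connected(adj, removed=frozenset()):
    """True if the graph stays connected after dropping the removed nodes."""
    remaining = [n for n in adj if n not in removed]
    if len(remaining) <= 1:
        return True
    seen = {remaining[0]}
    stack = [remaining[0]]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in removed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(remaining)


def structural_cohesion(adj):
    """Smallest number of nodes whose removal disconnects the graph."""
    n = len(adj)
    for k in range(n - 1):
        for cut in combinations(adj, k):
            if not is_connected(adj, frozenset(cut)):
                return k
    return n - 1  # complete graph: no cut-set exists (assumed convention)


two_dyads = {1: {2}, 2: {1}, 3: {4}, 4: {3}}
line = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
complete = {i: {j for j in range(4) if j != i} for i in range(4)}
cohesion = [structural_cohesion(g) for g in (two_dyads, line, ring, complete)]
print(cohesion)  # [0, 1, 2, 3]
```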

Measuring Networks: Connectivity

Redundancy Global: Structural Cohesion

119 of 202

0

1

2

3

Node Connectivity

As number of node-independent paths

Measuring Networks: Connectivity

Redundancy Global: Structural Cohesion

Structural Cohesion:

A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.

120 of 202

1

2

3

4

Nestedness Structure

Cohesive Blocks

Depth

Sociogram

5

1

2

3

4

5

6

7

8

9

Cohesive Blocking

The arrangement of subsequently more connected sets by branches and depth uniquely characterize the connectivity structure of a network

Measuring Networks: Connectivity

Redundancy Global: Structural Cohesion

121 of 202

Distance & Connectivity measures “locate” a node based on particular features of the path structure, but there are many other ways of locating nodes in networks.

Centrality refers to (one dimension of) location, identifying where an actor resides in a network.

    • For example, we can compare actors at the edge of the network to actors at the center.

    • In general, this is a way to formalize intuitive notions about the distinction between insiders and outsiders.

As a terminology point, some authors distinguish centrality from prestige based on the directionality of the tie. Since the formulas are the same in every other respect, I stick with “centrality” for simplicity.

Measuring Networks

Centrality

122 of 202

Conceptually, centrality is fairly straightforward: we want to identify which nodes are in the ‘center’ of the network. In practice, identifying exactly what we mean by ‘center’ is somewhat complicated, but substantively we often have reason to believe that people at the center are very important with respect to some pattern of flow/spread on the network.

The standard centrality measures capture a wide range of “importance” in a network:

      • Degree
      • Closeness
      • Betweenness
      • Eigenvector / Power measures

After discussing these, I will describe measures that combine features of each of them.

Measuring Networks

Centrality

123 of 202

Measuring Networks

Centrality

..and just a teaser of all the elements we are leaving out:

Schoch lists over 100 different measures…

124 of 202

The most intuitive notion of centrality focuses on degree. Degree is the number of direct contacts a person has. The idea is that the actor with the most ties is the most important:

Measuring Networks

Centrality

125 of 202

Degree centrality, however, can be deceiving, because it is a purely local measure.

Measuring Networks

Centrality

126 of 202

If we want to measure the degree to which the graph as a whole is centralized, we look at the dispersion of centrality:

Simple: variance of the individual centrality scores.

Or, using Freeman’s general formula for centralization (which ranges from 0 to 1):

UCINET, SPAN, PAJEK and most other network software will calculate these measures.
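
As a rough sketch, one common form of Freeman's degree centralization sums the gap between the largest degree and every other degree, then divides by (n - 1)(n - 2), the value that sum reaches in a star. The code below assumes an undirected graph with at least three nodes and uses the population variance for the simple alternative; the function name is illustrative.

```python
from statistics import pvariance


def degree_centralization(adj):
    """Return (Freeman degree centralization, variance of degree)."""
    degrees = [len(neighbors) for neighbors in adj.values()]
    n = len(degrees)
    top = max(degrees)
    freeman = sum(top - d for d in degrees) / ((n - 1) * (n - 2))
    return freeman, pvariance(degrees)


# An 8-node star and an 8-node ring.
star = {0: set(range(1, 8)), **{i: {0} for i in range(1, 8)}}
ring = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
star_scores = degree_centralization(star)
ring_scores = degree_centralization(ring)
print(star_scores)  # (1.0, 3.9375)
print(ring_scores)
```

The star's values appear consistent with the "Freeman: 1.0, Variance: 3.9" pair in the slide of centralization scores, and the ring scores zero on both.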

Measuring Networks

Centrality

127 of 202

Degree Centralization Scores

Freeman: .07

Variance: .20

Freeman: 1.0

Variance: 3.9

Freeman: .02

Variance: .17

Freeman: 0.0

Variance: 0.0

Measuring Networks

Centrality

128 of 202

A second measure of centrality is closeness centrality. An actor is considered important if he/she is relatively close to all other actors.

Closeness is based on the inverse of the distance of each actor to every other actor in the network.

Closeness Centrality:

Normalized Closeness Centrality
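
A minimal sketch of both scores, assuming a connected undirected graph: closeness is taken as 1 over the sum of an actor's distances to all others, and the normalized version multiplies by n - 1. These definitions reproduce the star and line values tabulated in the example slides.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic distances from source in an undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def closeness(adj):
    """Map each node to (closeness, normalized closeness); assumes a connected graph."""
    n = len(adj)
    scores = {}
    for node in adj:
        total = sum(bfs_distances(adj, node).values())
        scores[node] = (1 / total, (n - 1) / total)
    return scores


star = {0: set(range(1, 8)), **{i: {0} for i in range(1, 8)}}
line = {i: {j for j in (i - 1, i + 1) if 0 <= j < 7} for i in range(7)}
star_scores = closeness(star)
line_scores = closeness(line)
print([round(v, 3) for v in star_scores[0]])  # [0.143, 1.0]
print([round(v, 3) for v in star_scores[1]])  # [0.077, 0.538]
print([round(v, 3) for v in line_scores[3]])  # [0.083, 0.5]
```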

Measuring Networks

Centrality

129 of 202

Distance Closeness normalized

0 1 1 1 1 1 1 1 .143 1.00

1 0 2 2 2 2 2 2 .077 .538

1 2 0 2 2 2 2 2 .077 .538

1 2 2 0 2 2 2 2 .077 .538

1 2 2 2 0 2 2 2 .077 .538

1 2 2 2 2 0 2 2 .077 .538

1 2 2 2 2 2 0 2 .077 .538

1 2 2 2 2 2 2 0 .077 .538

Closeness Centrality in the examples

Distance Closeness normalized

0 1 2 3 4 4 3 2 1 .050 .400

1 0 1 2 3 4 4 3 2 .050 .400

2 1 0 1 2 3 4 4 3 .050 .400

3 2 1 0 1 2 3 4 4 .050 .400

4 3 2 1 0 1 2 3 4 .050 .400

4 4 3 2 1 0 1 2 3 .050 .400

3 4 4 3 2 1 0 1 2 .050 .400

2 3 4 4 3 2 1 0 1 .050 .400

1 2 3 4 4 3 2 1 0 .050 .400

Measuring Networks

Centrality

130 of 202

Distance Closeness normalized

0 1 2 3 4 5 6 .048 .286

1 0 1 2 3 4 5 .063 .375

2 1 0 1 2 3 4 .077 .462

3 2 1 0 1 2 3 .083 .500

4 3 2 1 0 1 2 .077 .462

5 4 3 2 1 0 1 .063 .375

6 5 4 3 2 1 0 .048 .286

Closeness Centrality in the examples

Measuring Networks

Centrality

131 of 202

Distance Closeness normalized

0 1 1 2 3 4 4 5 5 6 5 5 6 .021 .255

1 0 1 1 2 3 3 4 4 5 4 4 5 .027 .324

1 1 0 1 2 3 3 4 4 5 4 4 5 .027 .324

2 1 1 0 1 2 2 3 3 4 3 3 4 .034 .414

3 2 2 1 0 1 1 2 2 3 2 2 3 .042 .500

4 3 3 2 1 0 2 3 3 4 1 1 2 .034 .414

4 3 3 2 1 2 0 1 1 2 3 3 4 .034 .414

5 4 4 3 2 3 1 0 1 1 4 4 5 .027 .324

5 4 4 3 2 3 1 1 0 1 4 4 5 .027 .324

6 5 5 4 3 4 2 1 1 0 5 5 6 .021 .255

5 4 4 3 2 1 3 4 4 5 0 1 1 .027 .324

5 4 4 3 2 1 3 4 4 5 1 0 1 .027 .324

6 5 5 4 3 2 4 5 5 6 1 1 0 .021 .255

Closeness Centrality in the examples

Measuring Networks

Centrality

132 of 202

Betweenness Centrality:

Model based on communication flow: A person who lies on communication paths can control communication flow, and is thus important. Betweenness centrality counts the number of shortest paths between i and k that actor j resides on.

Measuring Networks

Centrality

133 of 202

Betweenness Centrality:

Where gjk = the number of geodesics connecting jk, and

gjk(ni) = the number that actor i is on.

Usually normalized by:
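
A sketch of the calculation using the definitions above: for every pair j, k, add the share gjk(ni) / gjk of their geodesics that pass through actor i. The normalization shown, (n - 1)(n - 2) / 2 for an undirected graph, is the usual choice and is an assumption here; the function names are illustrative.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adj, source):
    """BFS from source: distance and number of geodesics to each reachable node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                count[nxt] += count[node]
    return dist, count


def betweenness(adj, normalized=True):
    """Betweenness centrality for an undirected graph."""
    info = {node: shortest_path_counts(adj, node) for node in adj}
    n = len(adj)
    scores = {node: 0.0 for node in adj}
    for j, k in combinations(adj, 2):
        dist_j, count_j = info[j]
        if k not in dist_j:
            continue  # j and k are not connected
        for i in adj:
            if i in (j, k):
                continue
            dist_i, count_i = info[i]
            # i lies on a j-k geodesic when the two legs add up to the j-k distance.
            if i in dist_j and k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                scores[i] += count_j[i] * count_i[k] / count_j[k]
    if normalized:
        scale = (n - 1) * (n - 2) / 2
        scores = {node: value / scale for node, value in scores.items()}
    return scores


star = {0: set(range(1, 8)), **{i: {0} for i in range(1, 8)}}
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
star_scores = betweenness(star)
square_raw = betweenness(square, normalized=False)
print(star_scores[0], star_scores[1])  # 1.0 0.0
print(square_raw[0])                   # 0.5
```

In the 4-node ring, each opposite pair has two geodesics, so each intermediary is credited with half a path.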

Measuring Networks

Centrality

134 of 202

Centralization: 1.0

Centralization: .31

Centralization: .59

Centralization: 0

Betweenness Centrality:

Measuring Networks

Centrality

135 of 202

Centralization: .183

Betweenness Centrality:

Measuring Networks

Centrality

136 of 202

Information Centrality:

It is quite likely that information can flow through paths other than the geodesic. The Information Centrality score uses all paths in the network, and weights them based on their length.

Measuring Networks

Centrality

137 of 202

Comparing across these 3 centrality values

    • Generally, the 3 centrality types will be positively correlated
    • When they are not correlated (or only weakly correlated), it probably tells you something interesting about the network.

 

High degree, low closeness: Embedded in cluster that is far from the rest of the network

High degree, low betweenness: Ego's connections are redundant; communication bypasses him/her

High closeness, low degree: Key player tied to important/active alters

High closeness, low betweenness: Probably multiple paths in the network; ego is near many people, but so are many others

High betweenness, low degree: Ego's few ties are crucial for network flow

High betweenness, low closeness: Very rare cell. Would mean that ego monopolizes the ties from a small number of people to many others.

Measuring Networks

Centrality

138 of 202

Bonacich Power Centrality: Actor’s centrality (prestige) is equal to a function of the prestige of those they are connected to. Thus, actors who are tied to very central actors should have higher prestige/ centrality than those who are not.

This is a variant of eigenvector centrality/PageRank.

  • α is a scaling constant, which is set to normalize the score.
  • β reflects the extent to which you weight the centrality of people ego is tied to.
  • R is the adjacency matrix (can be valued)
  • I is the identity matrix (1s down the diagonal)
  • 1 is a column vector (an n × 1 matrix) of all ones.
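
Bonacich's measure is commonly written c(α, β) = α (I − βR)^(-1) R 1. The sketch below avoids a matrix inverse by iterating the equivalent fixed point c = R1 + βRc, which converges only when |β| is below 1 over the largest eigenvalue of R (an assumption the caller must satisfy). It assumes a binary, undirected adjacency structure, and it sets α so the squared scores sum to n, which is one common normalization rather than the only one.

```python
def bonacich_power(adj, beta, iterations=200):
    """Approximate Bonacich power centrality by fixed-point iteration."""
    nodes = list(adj)
    degree = {i: len(adj[i]) for i in nodes}  # R 1 for a binary matrix
    c = dict(degree)
    for _ in range(iterations):
        c = {i: degree[i] + beta * sum(c[j] for j in adj[i]) for i in nodes}
    # alpha: scale so that the squared scores sum to the number of nodes.
    alpha = (len(nodes) / sum(v * v for v in c.values())) ** 0.5
    return {i: alpha * v for i, v in c.items()}


# Hypothetical 3-node line: a - b - c
line = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}


def ratio(beta):
    scores = bonacich_power(line, beta)
    return scores["b"] / scores["a"]


ratio_zero, ratio_pos, ratio_neg = ratio(0.0), ratio(0.2), ratio(-0.2)
print(round(ratio_zero, 3), round(ratio_pos, 3), round(ratio_neg, 3))
```

With β = 0 the scores are proportional to degree (b scores twice a). A positive β narrows the gap because a is tied to the central actor b; a negative β widens it.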

Measuring Networks

Centrality

139 of 202

Bonacich Power Centrality:

The magnitude of β reflects the radius of power. Small values of β weight local structure, larger values weight global structure.

If β is positive, then ego has higher centrality when tied to people who are central.

If β is negative, then ego has higher centrality when tied to people who are not central.

As β approaches zero, you get degree centrality.

Measuring Networks

Centrality

140 of 202

Bonacich Power Centrality:

β = 0.23

Measuring Networks

Centrality

141 of 202

A primary interest in Social Network Analysis is the identification of “significant social subgroups” – some smaller collection of nodes in the graph that can be considered, at least in some senses, as a “unit” based on the pattern, strength, or frequency of ties.

Measuring Networks

Community Detection

There are many ways to identify groups. They all insist on a group being in a connected component, but other than that the variation is wide.

We’re covering this in-depth tomorrow!

142 of 202

Modularity is probably the most commonly used contemporary clustering metric

Intuitively this is the

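Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)

A_{ij}: observed ties

k_i k_j / 2m: expected ties

\gamma: resolution parameter

\delta(c_i, c_j): indicator for same group (0,1)

1 / 2m: normalizing constant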
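
A minimal sketch of the calculation, assuming the standard form of modularity: for every pair of nodes in the same group, take observed ties minus the resolution parameter times the ties expected from their degrees (k_i k_j / 2m), then divide the total by 2m. The graph and the grouping below are hypothetical.

```python
def modularity(adj, group, gamma=1.0):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    two_m = sum(len(neighbors) for neighbors in adj.values())  # 2m = sum of degrees
    total = 0.0
    for i in adj:
        for j in adj:
            if group[i] != group[j]:
                continue  # indicator for same group is 0
            observed = 1.0 if j in adj[i] else 0.0
            expected = gamma * len(adj[i]) * len(adj[j]) / two_m
            total += observed - expected
    return total / two_m


# Two triangles joined by a single bridge (2 - 3).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
by_triangle = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
one_group = {node: "all" for node in adj}
q_split = modularity(adj, by_triangle)
q_single = modularity(adj, one_group)
print(round(q_split, 3), round(q_single, 3))  # 0.357 0.0
```

Raising gamma above 1 penalizes expected ties more heavily, which tends to favor smaller groups.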

Measuring Networks

Community Detection

143 of 202

Q=0.24

Q=0.51

Q=0.68

Measuring Networks

Community Detection

144 of 202

Elements of a Role:

    • Rights and obligations with respect to other people or classes of people

    • Roles require a ‘role complement’: another person with respect to whom the role-occupant acts

Examples:

Parent – child

Teacher – student

Lover – lover

Friend – Friend

Husband - Wife

Nadel (Following functional anthropologists and sociologists) defines ‘logical’ types of roles, and then examines how they can be linked together.

Measuring Networks

Community Role Detection

145 of 202

Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Thus, we might represent a family as:

P

P

C

C

C

Provides food for

Romantic Love

Bickers with

(and there are, of course, many other relations inside a family)

White et al: From logical role systems to empirical social structures

Measuring Networks

Community Role Detection

146 of 202

Blockmodeling: basic steps

In any positional analysis, there are 4 basic steps:

1) Identify a definition of equivalence

2) Measure the degree to which pairs of actors are equivalent

3) Develop a representation of the equivalencies

4) Assess the adequacy of the representation

At the end of the day, this is community detection on a role-similarity matrix rather than an adjacency matrix.

Measuring Networks

Community Role Detection

147 of 202

If the model is going to be based on asymmetric or multiple relations, you simply stack the various relations, usually including both “directions” of asymmetric relations:

P

P

C

C

C

Provides food for

Romantic Love

Bickers with

Measuring Networks

Community Role Detection

Sim

1 1 0 0 0

1 1 0 0 0

0 0 1 1 1

0 0 1 1 1

0 0 1 1 1
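
The similarity matrix above can be reproduced with a strict structural-equivalence check on the stacked relations. The exact arcs are an assumption about the family sociogram (each parent provides food for each child, the two parents share romantic love, all children bicker with each other), and the function names are illustrative.

```python
def stacked_profile(relations, nodes, i, skip):
    """i's sent and received ties across all relations, ignoring nodes in skip."""
    profile = []
    for arcs in relations.values():
        for k in nodes:
            if k in skip:
                continue
            profile.append((i, k) in arcs)  # tie sent by i
            profile.append((k, i) in arcs)  # tie received by i
    return profile


def structurally_equivalent(relations, nodes, i, j):
    """True if i and j have identical ties to all others in every relation."""
    if i == j:
        return True
    skip = {i, j}
    same_to_others = (
        stacked_profile(relations, nodes, i, skip)
        == stacked_profile(relations, nodes, j, skip)
    )
    # Ties between i and j themselves must be mirrored in each relation.
    mirrored = all(((i, j) in arcs) == ((j, i) in arcs) for arcs in relations.values())
    return same_to_others and mirrored


nodes = ["P1", "P2", "C1", "C2", "C3"]
parents, children = nodes[:2], nodes[2:]
relations = {
    "provides food for": {(p, c) for p in parents for c in children},
    "romantic love": {("P1", "P2"), ("P2", "P1")},
    "bickers with": {(a, b) for a in children for b in children if a != b},
}
sim = [
    [int(structurally_equivalent(relations, nodes, i, j)) for j in nodes]
    for i in nodes
]
for row in sim:
    print(*row)
```

Real data rarely show exact equivalence, so in practice the 0/1 check is replaced by a graded similarity (for example a correlation or distance between stacked profiles) that is then clustered.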

Covering this in-depth on WED

148 of 202

  1. Part 1: Introduction & Theory
    1. History & Big Picture
    2. Network Relevance to Health Research
    3. Network Theory
      1. Connections & Positions

  • Part 2: Points & Lines
    • Network data
    • Visualization
    • Network metrics

  • Part 3: Network Models
    • Diffusion
      • Disease
      • Network autocorrelation
    • Random Graph Models
    • Experimental interventions (brief)
    • Open Questions

Outline

Social Networks & Health

149 of 202

Introduction

Network Research Lifecycle

Models?

-OF networks

- An explication of why a network looks as it does.

- formal or informal

ERGM vs. “Birds of a feather”

-ON networks.

- An explication of a (social) process that uses networks

- Diffusion of innovations

-Peer Influence

- Brokerage

We spend a lot of time here on formal statistical models; but substantive models should drive the case.

150 of 202

Network Models: Diffusion

The primary route from networks to health is via diffusion, either of a pathogen or a health-related behavior.

We start by discussing biological diffusion as it’s a clear and mechanistic model that we can build from for social diffusion. The treatment here is *very brief,* aimed at giving a sense of the issues at play.

151 of 202

Network Diffusion & Peer Influence

Basics

Classic (disease) diffusion makes use of compartmental models. Large N and homogeneous mixing allow one to express spread as generalized probability models.

Works very well for highly infectious bits in large populations…

SI(S) model – actors are in only two states, susceptible or infectious.

SEIR(S) model – adds an “exposed” but not infectious state and a recovered state.
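
A minimal sketch of the simplest compartmental case, an SI model in discrete time under homogeneous mixing: each step, new infections equal beta × S × I / N. The parameter values are arbitrary illustrations, and beta is assumed to be at most 1 so a step cannot overshoot the population.

```python
def si_curve(population, initial_infected, beta, steps):
    """Number infected at each time step of a discrete-time SI model."""
    infected = float(initial_infected)
    curve = [infected]
    for _ in range(steps):
        susceptible = population - infected
        # Homogeneous mixing: every susceptible is equally exposed to every infective.
        new_cases = beta * susceptible * infected / population
        infected = min(population, infected + new_cases)
        curve.append(infected)
    return curve


curve = si_curve(population=1000, initial_infected=1, beta=0.5, steps=40)
print([round(x) for x in curve[::5]])
```

The result is the familiar S-shaped curve: slow start, rapid growth once infectives are common, and saturation as susceptibles run out. Network models keep this logic but replace the S × I / N mixing term with the actual contact pattern.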

152 of 202

Network Diffusion & Peer Influence

Basics

Network Models

Same basic SI(R, etc.) setup, but connectivity is not assumed random; rather, it is structured by the network contact pattern.

If pij is small or the network is very clustered, these two can yield very different diffusion patterns.*

Real

Random

*These conditions do matter. Compartmental models work surprisingly well if the network is large or dense, or if the bit is highly infectious, because most networks have a bit of randomness in them. We are focusing on the elements that are unique/different for network as opposed to general diffusion.

153 of 202

Network Diffusion & Peer Influence

Basics

If 0 < pij < 1

154 of 202

Network Diffusion & Peer Influence

Basics

If 0 < pij < 1

0.01

0.06

0.11

0.26

0.46

155 of 202

In addition to* the dyadic probability that one actor passes something to another (pij), two factors affect flow through a network:

Topology

    • the shape, or form, of the network

- Example: one actor cannot pass information to another unless they are either directly or indirectly connected

Time

- the timing of contact matters

- Example: an actor cannot pass information he has not yet received

*This is a big conditional! – lots of work on how the dyadic transmission rate may differ across populations.

Key Question: What features of a network contribute most to diffusion potential?

Network Diffusion & Peer Influence

Network diffusion features

Use simulation tools to explore the relative effects of structural connectivity features

156 of 202

  • A network has to be connected for a bit to pass over it
  • If transmission is uncertain, the longer the distance the lower the likelihood of spread.

[Figure: Distance and diffusion. Probability of transfer (y-axis) by path distance (x-axis, 2 to 6), where p(transfer) = pij^dist. Here pij = 0.6.]

Network Diffusion & Peer Influence

Network diffusion features

We need:

(1) reachability

(2) distance

(3) local clustering

(4) multiple routes

(5) star spreaders

157 of 202

  • Local clustering turns flow “in” on a potential transmission tree

Arcs: 11

Largest component: 12,

Clustering: 0

Arcs: 11

Largest component: 8,

Clustering: 0.205

We need:

(1) reachability

(2) distance

(3) local clustering

(4) multiple routes

(5) star spreaders

Network Diffusion & Peer Influence

Network diffusion features

158 of 202

  • The more alternate routes one has for transmission, the more likely flow should be.
    • Operationalize alternate routes with structural cohesion

We need:

(1) reachability

(2) distance

(3) local clustering

(4) multiple routes

(5) star spreaders

Network Diffusion & Peer Influence

Network diffusion features

159 of 202

[Figure: Probability of transfer by distance and number of non-overlapping paths, assuming a constant pij of 0.6. Probability (y-axis) by path distance (x-axis, 2 to 6), with separate curves for 1, 2, 5, and 10 paths.]

Cohesion → Redundancy → Diffusion
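
The curves can be sketched with a simple calculation, assuming every path has the same length, transmission along each tie is independent, and the paths share no intermediate nodes. A single path of length d transmits with probability pij^d, so at least one of k such paths succeeds with probability 1 − (1 − pij^d)^k. The function name is illustrative.

```python
def transfer_probability(p_ij, distance, paths=1):
    """Chance that a bit crosses at least one of several independent paths."""
    single_path = p_ij ** distance
    return 1 - (1 - single_path) ** paths


for paths in (1, 2, 5, 10):
    row = [round(transfer_probability(0.6, d, paths), 3) for d in range(2, 7)]
    print(paths, row)

one_path = transfer_probability(0.6, 2, 1)
two_paths = transfer_probability(0.6, 2, 2)
```

With one path the probability falls off quickly with distance; adding redundant paths keeps it high over longer distances, which is the sense in which cohesion supports diffusion.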

Network Diffusion & Peer Influence

Network diffusion features

160 of 202

0

1

2

3

Node Connectivity

As number of node-independent paths

Structural Cohesion:

A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.

Network Diffusion & Peer Influence

Network diffusion features

161 of 202

STD Transmission danger: sex or drugs?

Structural core more realistic than nominal core

Data from “Project 90,” of a high-risk population in Colorado Springs

Network Diffusion & Peer Influence

Network diffusion features

162 of 202

  • Much of the work on “core groups” or “at risk” populations focuses on high-degree nodes. The assumption is that high-degree nodes are likely to contact lots of people.

We need:

(1) reachability

(2) distance

(3) local clustering

(4) multiple routes

(5) star spreaders

Network Diffusion & Peer Influence

Network diffusion features

164 of 202

Network Diffusion & Peer Influence

Network diffusion features

Assortative mixing:

A more traditional way to think about “star” effects.

165 of 202

Partner

Distribution

Component

Size/Shape

Emergent Connectivity in low-degree networks

Network Diffusion & Peer Influence

A closer look at emerging connectivity

166 of 202

In both distributions, a giant component & reconnected core emerges as density increases, but at very different speeds and ultimate extent.

Network Diffusion & Peer Influence

A closer look at emerging connectivity

167 of 202

In addition to* the dyadic probability that one actor passes something to another (pij), two factors affect flow through a network:

Topology

    • the shape, or form, of the network

- Example: one actor cannot pass information to another unless they are either directly or indirectly connected

Time

- the timing of contact matters

- Example: an actor cannot pass information he has not yet received

*This is a big conditional! – lots of work on how the dyadic transmission rate may differ across populations.

Key Question: What features of a network contribute most to diffusion potential?

Network Diffusion & Peer Influence

Relational Dynamics

Use simulation tools to explore the relative effects of structural connectivity features

168 of 202

Contact network: Everyone, it is a connected component

Who can “A” reach?

Network Diffusion & Peer Influence

Relational Dynamics

Discussions of network effects on STD spread often speak loosely of “the network.”

There are three relevant networks that are often conflated:

Three relevant networks

169 of 202

Exposure network: here, node “A” could reach up to 8 others

Who can “A” reach?

Network Diffusion & Peer Influence

Relational Dynamics

Discussions of network effects on STD spread often speak loosely of “the network.”

There are three relevant networks that are often conflated:

Three relevant networks

170 of 202

Transmission network: upper limit is 8 through the exposure links (dark blue). Transmission is path dependent: if no transmission to B, then also none to {K,L,O,J,M}

Who can “A” reach?

Exposable Link (from A’s p.o.v.)

Contact

Network Diffusion & Peer Influence

Relational Dynamics

Discussions of network effects on STD spread often speak loosely of “the network.”

There are three relevant networks that are often conflated:

Three relevant networks

171 of 202

The mapping between the contact network and the exposure network is based on relational timing. In a dynamic network, edge timing determines if something can flow down a path because things can only be passed forward in time.

Definitions:

Two edges are adjacent if they share a node.

A path is a sequence of adjacent edges (E1, E2, …Ed).

A time-ordered path is a sequence of adjacent edges where, for each pair of consecutive edges in the sequence, the start time of the earlier edge is less than or equal to the end time of the next edge: S(E1) ≤ E(E2)

Adjacent edges are concurrent if they share a node and have start and end dates that overlap. This occurs if:

S(E2) < E(E1)

Concurrency

Network Diffusion & Peer Influence

Relational Dynamics

172 of 202

A

B

C

D

time

1 2 3 4 5 6 7 8 9 10

AB

BC

CE

E

CD

2 - 7

1 - 3

5 - 6

8 - 9

S(ab)

E(ab)

S(bc)

E(bc)

S(ce)

E(ce)

The mapping between the contact network and the exposure network is based on relational timing. In a dynamic network, edge timing determines if something can flow down a path because things can only be passed forward in time.

Concurrency

Network Diffusion & Peer Influence

Relational Dynamics

173 of 202

The constraints of time-ordered paths change our understanding of the system structure of the network. Paths make a network a system: linking actors together through indirect connections. Relational timing changes how paths cumulate in networks.

Indirect connectivity is no longer transitive:

A

B

C

D

1 - 2

3 - 4

1 - 2

Here A can reach C, and C can reach D. But A cannot reach D (nor D A). Why? Because any infection A passes to C would have happened after the relation between C and D ended.
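
This example can be checked with a small earliest-arrival sketch. The assignment of the three intervals to the ties A–B (1 - 2), B–C (3 - 4), and C–D (1 - 2) is an assumption about the figure, and transmission is treated as instantaneous while a relation is active.

```python
def time_ordered_reach(edges, source):
    """Nodes reachable from source along time-ordered paths.

    edges: list of (i, j, start, end) undirected relations.
    """
    arrival = {source: float("-inf")}  # earliest time each node can hold the bit
    changed = True
    while changed:
        changed = False
        for i, j, start, end in edges:
            for sender, receiver in ((i, j), (j, i)):
                # The sender must hold the bit before the relation ends.
                if sender not in arrival or arrival[sender] > end:
                    continue
                when = max(arrival[sender], start)
                if when < arrival.get(receiver, float("inf")):
                    arrival[receiver] = when
                    changed = True
    return set(arrival) - {source}


# Assumed timing for the chain A - B - C - D.
edges = [("A", "B", 1, 2), ("B", "C", 3, 4), ("C", "D", 1, 2)]
reach = {node: sorted(time_ordered_reach(edges, node)) for node in "ABCD"}
print(reach["A"])  # ['B', 'C']
print(reach["C"])  # ['B', 'D']
print(reach["D"])  # ['B', 'C']
```

Ignoring time, all four actors sit in one component; with time order, A and D never reach each other.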

A

B

C

D

1 - 2

3 - 4

1 - 2

Network Diffusion & Peer Influence

Relational Dynamics

174 of 202

Edge time structures are characterized by sequence, duration and overlap.

Paths between i and j have length and duration, but these need not be symmetric even if the constituent edges are symmetric.

Network Diffusion & Peer Influence

Relational Dynamics

175 of 202

Implied Contact Network of 8 people in a ring

All relations Concurrent

Reachability = 1.0

Network Diffusion & Peer Influence

Relational Dynamics

176 of 202

Implied Contact Network of 8 people in a ring

Serial Monogamy (1)

1

2

3

7

6

5

8

4

Reachability = 0.71

Network Diffusion & Peer Influence

Relational Dynamics

177 of 202

Implied Contact Network of 8 people in a ring

Mixed Concurrent

2

2

1

1

2

2

3

3

Reachability = 0.57

Network Diffusion & Peer Influence

Relational Dynamics

178 of 202

Implied Contact Network of 8 people in a ring

Serial Monogamy (3)

1

2

1

1

2

1

2

2

Reachability = 0.43

Network Diffusion & Peer Influence

Relational Dynamics

179 of 202

1

2

1

1

2

1

2

2

Timing alone can change mean reachability from 1.0 when all ties are concurrent to 0.42.

In general, ignoring time order is equivalent to assuming all relations occur simultaneously – assumes perfect concurrency across all relations.

Network Diffusion & Peer Influence

Relational Dynamics

180 of 202

Resulting infection trace from a simulation (Morris et al, AJPH 2010).

Observed infection paths from 10 seeds in an STD simulation, edges coded for concurrency status.

Network Diffusion & Peer Influence

Relational Dynamics

182 of 202

Timing constrains potential diffusion paths in networks, since bits cannot flow through edges that have ended.

This means that:

    • Structural paths are not equivalent to the diffusion-relevant path set.
    • Network distances don’t build on each other.
    • Weakly connected components overlap without diffusion reaching across sets.
    • Small changes in edge timing can have dramatic effects on overall diffusion
    • Diffusion potential is maximized when edges are concurrent and minimized when they are “inter-woven” to limit reachability.

Combined, this means that many of our standard path-based network measures will be incorrect on dynamic graphs.

Network Diffusion & Peer Influence

Relational Dynamics

183 of 202

Network Diffusion & Peer Influence

Structural Transmission Dynamics: beyond disease diffusion

Complex Contagion

Thus far we have focused on a “simple” dyadic diffusion parameter, pij, where the probability of passing/receiving the bit depends purely on the discordant status of the dyad. This is sometimes called the “independent cascade model,” which suggests a monotonic relation between the number of times you are exposed through peers and the likelihood of adoption.

High exposure could be due to repeated interaction with one person or weak interaction with many, effectively equating:

Alternative models exist. Under “complex contagion” for example, the likelihood that I accept the bit that flows through the network depends on the proportion of my peers that have the bit.

184 of 202

Network Diffusion & Peer Influence

Structural Transmission Dynamics: beyond disease diffusion

1

1

2

3

Complex Contagion

Assume adoption requires k neighbors having adopted, then transmission can only occur within dense clusters:
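
A minimal sketch of this threshold rule, assuming adopters never drop out and the process runs until no one else qualifies. The network below is hypothetical: two fully connected groups of four joined by a single bridge.

```python
def threshold_cascade(adj, seeds, k):
    """Final adopters when a node adopts once at least k neighbors have adopted."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node in adj:
            if node not in adopted and len(adj[node] & adopted) >= k:
                adopted.add(node)
                changed = True
    return adopted


def clique(members):
    """Adjacency sets for a fully connected group."""
    return {i: set(members) - {i} for i in members}


# Two dense clusters joined by one bridging tie (4 - 5).
adj = {**clique([1, 2, 3, 4]), **clique([5, 6, 7, 8])}
adj[4].add(5)
adj[5].add(4)

complex_reach = threshold_cascade(adj, seeds={1, 2}, k=2)
simple_reach = threshold_cascade(adj, seeds={1, 2}, k=1)
print(sorted(complex_reach))  # [1, 2, 3, 4]
print(sorted(simple_reach))   # [1, 2, 3, 4, 5, 6, 7, 8]
```

With k = 2 the cascade fills the seed cluster but stalls at the bridge, because node 5 has only one adopting neighbor; with k = 1 the same bridge is enough to reach everyone.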

185 of 202

Network Diffusion & Peer Influence

Structural Transmission Dynamics: beyond disease diffusion

Complex Contagion

Assume adoption requires k neighbors having adopted, then transmission can only occur within dense clusters:

For this network under weak complex diffusion (k=2), the maximum risk size reaches 98%.

One of the Prosper schools:

Start

186 of 202

Network Diffusion & Peer Influence

Structural Transmission Dynamics: beyond disease diffusion

Complex Contagion

Can lead to widely varying sizes of potential diffusion cascades. Here’s the distribution across all PROSPER schools:

Distribution is largely bimodal (even with a connected pair start)

187 of 202

Network Diffusion & Peer Influence

Structural Transmission Dynamics: beyond disease diffusion

Complex Contagion

Can lead to widely varying sizes of potential diffusion cascades. Here’s the distribution across all PROSPER schools:

The governing factors are (a) curved effect of local redundancy and (b) structural cohesion

Network Average Proportion Reached

k=2 complex contagion

Mean Cascade Size

Coh=0.3

Coh=1.2

Coh=2.2

Coh=3.2

Coh=4.1

188 of 202

Background:

      • Long standing research interest in how our relations shape our attitudes and behaviors.

      • The most often assumed mechanism is that people (through conversation or similar) change each other's beliefs/opinions, which changes behavior.

This implies that position in a communication network should be related to attitudes.

      • Alternatives:
        • Modeling behavior: ego copies behavior of alter to gain respect, esteem, etc.
        • Distinction: Ego tries to be different from (some) alter to gain respect, esteem, etc.
        • Access: Ego wants to do Y, but can only do so because alter provides access (say, being old enough to buy cigarettes).

Craig Rawlings will cover this later in the week!

Network Diffusion & Peer Influence

Peer Influence Dynamics

189 of 202

Statistical Models for Networks

Simple Random Graphs

Long history of model development for networks.

Here we are just hinting at what is here and why useful.

We often want a way to build models that explain the topology in a network. The foundation of these models is the random graph.

I will cover this later in the week

190 of 202

Open Problems

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms

  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

191 of 202

Open Problems

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms:
  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

Models that allow for real-time feedback & data updates, population dynamics, etc. It's doable now in a compartmental framework but largely ad hoc

193 of 202

Open Problems

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms:
  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

Peer

Behavior

Substantively, peers and behavior co-constitute each other in a naturally endogenous and over-determined way. Notions of distinguishing the causal effect of peers on behavior net of behavior on peers mis-ask the question. We need some radical new thinking on this.

Is not equal to

Peer

Behavior

+

Peer

Behavior

194 of 202

Open Problems

Parent

Parent

Child

Child

Child

Positional models are fundamentally under-developed; yet hold the greatest promise of realizing the potential of relational models to provide deep insights into social organization and behavior.

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms

  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

195 of 202

Open Problems

Example: Social Exchange in developing contexts

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms

  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

196 of 202

Open Problems

Example: Social Exchange in developing contexts

Required: the theory probably needs to include the content of the relation (at least valence, likely more).

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms

  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

197 of 202

Open Problems

Do we know how relations should change over time?

→ A 4-year-old should not relate to parents the same way a 14-year-old does. But what about old friends? Neighbors? Etc.? What is the life history of a relation?

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms

  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

198 of 202

Open Problems

The real controversy over the Framingham studies turned on social mechanism: how do relations get “inside”?

Current models are largely passive transmission or stress-response; both seem much too simple.

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms

  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

199 of 202

Open Problems

Networks exist within an institutional context; the only way to understand that context is to return to communities.

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms:
  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

200 of 202

Open Problems

Radio-collar studies of people might be a bit much (though talk to Kitts!), but we leave clear digital traces. Can we use them smartly?

  1. Methods:
    1. Large-scale dynamic Diffusion
    2. Missing data
    3. Bounding Causal questions

  • Theory:
    • Roles & Multiplex Network dynamics
    • Network “life history”: relational evolution
    • Health Mechanisms

  • Data:
    • Return to community studies
    • Electronic Traces
    • EMR

202 of 202