Social Networks & Health
2023
Welcome!
| Monday 15 | Tuesday 16 | Wednesday 17 | Thursday 18 | Friday 19 |
9:00 | | jimi adams: | Panel on Network Interventions | James Moody | James Moody |
9:30 | | Modalities and Pragmatics in data collection | Tom Valente | Review/Intro statistics on networks | Agent Based Models for SN&H |
10:00 | | Craig Rawlings: | Yamile Molina | James Moody | Sam Jenness |
10:30 | | Social Balance, Children, Video | Nina Yamanis | New Features of ERGM 4.0 | EpiModel: Epidemic Disease Model Sim |
10:45 | | Coffee | Coffee | Coffee | Coffee |
11:00 | | Moody: | Panel on Network Experiments | Thomas Wolff | Peter Cho, Jessilyn Dunn |
11:30 | | Implications of Missing data | Sharique Hassan: Field Experiments | ERGM for Ego nets & Bipartite graphs | Wearables for Data Collection |
12:00 | | Brea Perry | Ashley Harrell: Lab Experiments | Carter Butts: | Summary Q&A session |
12:30 | | Advances in ego network data collection and analysis? | | In Silico Network Experiments to Probe Mechanisms and Inform Study Design | |
1:00 | Welcome | Lunch: DNAC Crew | Lunch: Dana Pasquale: | Lunch: Robin Dodsworth | |
1:30 | James Moody | Open Discussion on Ethical Issues | Data Archiving | Linguistic variation and social networks | |
2:00 | Networks & Health | Ashton Verdery | Craig Rawlings | Alex Volfovsky | |
2:30 | Review | RDS/Link Data | Peer Influence Models | Latent Spaces & Causal Effects | |
2:45 | Coffee | Coffee | Coffee | Coffee | |
3:00 | DNAC RA | James Moody | Ashton Verdery | David Schaefer | |
3:30 | Coding in R for SNA | What's the best community detection method? | Simulating Kinship Networks | Siena Models | |
4:00 | | Peter Mucha | James Moody | Scott Duxbury | |
4:30 | Gabriel Varela: IDEANet | Proper Community Detection via CHAMP | Relational Block Models | Relational Event Models | |
5:00 | Stump the Chumps | Stump the Chumps | Stump the Chumps | Stump the Chumps | |
Outline
Social Networks & Health
But scientists are starting to take networks seriously:
“Networks”
Introduction: why networks?
“Networks”
“Obesity”
But scientists are starting to take networks seriously: why?
5918 papers in 2018
Introduction: why networks?
…and NSF is investing heavily in it.
Introduction: why networks?
Social network analysis is:
Introduction: why networks?
Social Determinants of Health
Social effects hold promising multiplier effects:
Introduction: why networks & health?
A brief history of Networks & health
Chapman, Alexander, Ashton M. Verdery, and James Moody. (2022). “Analytic Advances in Social Networks and Health in the Twenty-First Century.” Journal of Health and Social Behavior, 221465221086532. doi:10.1177/00221465221086532
A brief history of Networks & health
Social Science & Medicine, 2000
Science
Social
Networks
…more than 7500 publications…
Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris (2003). statnet: Software tools for the Statistical Modeling of Network Data. URL http://statnetproject.org
State of the field
Trends
English-language articles indexed in the Web of Science Social Science Citation Index on: ("health" or "well being" or "medicine") and "network*".
There were 18572 such papers between 2000 and 2018.
State of the field
Trends
State of the field
Big-Picture
Bibliographic Similarity Networks: 1-step neighborhood of a single paper
State of the field
Big-Picture
Bibliographic Similarity Networks: 2-step neighborhood of a single paper
State of the field
Big-Picture
Since the net is large…
Use a force-directed layout to display the full space & overlay clusters….
The example paper…
Modularity:
Top-Level: 0.798 @ 32 Clusters
2nd Level: 0.785 @ 150 Clusters
Rogers:
Valente: Various
Christakis & Fowler
Martina Morris: Concurrency
Provan: Network Effectiveness
Add Health
Outline
Social Networks & Health
Introduction
Network Research Lifecycle
Introduction
Shameless Plug:
Available this fall!!
Introduction
Key Questions
Social Network analysis lets us answer questions about social interdependence. These include:
“Networks as Variables” approaches
“Networks as Structures” approaches
Both: Connectionist vs. Positional features of the network
We don’t want to draw this line too sharply: emergent role positions can affect individual outcomes in a ‘variable’ way, and variable approaches constrain relational activity.
Connectionist:
Positional:
Networks as pipes
Networks as roles
Ego
Complete
Multiple
- Structural Holes
- Density
- Mixing Models
- Size
- Community Detection
- Reachability
- Homophily
- Degree Distribution
- Social Balance
- ERGM
- Multi-layer networks
- Multi-level models of multiple networks
- Local Roles (Mandel 1983, Mandel & Winship 1984)
- Relational Block Models
- Motifs
Centrality
Cohesive blocking
2 ideas:
Connections & Positions: Network Problems
Why do networks matter?
Two fundamental mechanisms: Connections
Connectionist network mechanisms : Networks matter because of the things that flow through them. Networks as pipes.
The spread of any epidemic depends on the number of secondary cases per infected case, known as the reproductive rate (R0). R0 depends on the probability that a contact will be infected over the duration of contact (β), the likelihood of contact (c), and the duration of infectiousness (D).
For network transmission problems, the trick is specifying c, which depends on the network.
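A minimal sketch of the quantities just named, assuming the classic product form R0 = β × c × D (the slide only says R0 "depends on" these terms). The function name and all numbers are hypothetical, chosen only to show how much the answer moves with the contact term c.

```python
def basic_reproduction_number(beta, c, duration):
    """R0 under the simple product form: beta * c * D (an assumed functional form)."""
    return beta * c * duration

# Hypothetical values: 5% transmission chance per contact, 10 time units infectious.
low_contact = basic_reproduction_number(beta=0.05, c=1.0, duration=10)
high_contact = basic_reproduction_number(beta=0.05, c=4.0, duration=10)

print(low_contact, high_contact)
```

With everything else fixed, moving c across the threshold where R0 = 1 is the difference between an outbreak that dies out and one that grows, which is why specifying c from the network matters.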
Why do networks matter?
Two fundamental mechanisms: Connections example
Isolated vision
Why do networks matter?
Two fundamental mechanisms: Connections example
Connected vision
Why do networks matter?
Why do networks matter?
Two fundamental mechanisms: Connections example
Connections: Diffusion
Connectionist approaches are (by far) the most common aspect of network models in health research.
Theoretically any feature of the setting that governs spread through the network is of interest and will be reflected at multiple levels of the network.
Often we don’t have exact traces of the diffusion itself, only roughly timed outcome differences, which causes problems.
Provides food for
Romantic Love
Bickers with
Why do networks matter?
Two fundamental mechanisms: Positions
Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.
Parent
Parent
Child
Child
Child
Provides food for
Romantic Love
Bickers with
Why do networks matter?
Two fundamental mechanisms: Positions
Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.
C
P
X
Y
Pescosolido & Rubin
Basic structuralist duality: persons are the intersection of who (& how) they are connected to; while collectives are the emergent structure built from those connections.
Social life is a collective & structured affair: the groups we belong to & the roles we occupy simultaneously define us and our social setting.
Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.
Why do networks matter?
Two fundamental mechanisms: Positions
There is a classic structure-action tension between network structuralism and any duality-as-identity.
On the one hand, strong structures constrain interaction opportunities and acceptable activities to such a degree that the network, and one’s position in it, is fixed. Actors are structural dupes.
On the other hand, the personal agency implied by self-authorship & “selection” suggests rapidly changing networks that (seem to) lack the substantive stability necessary to warrant being called “structure” in any meaningful sense. Networks become the epiphenomenon of action.
Network tools allow us to empirically interrogate the (also classic) route out of this dilemma: structures are (re)constituted in the ways actors behave; regularized ways of interacting create dynamically-stable settings.
This creates multiple levels of measurement & modeling
Why do networks matter?
Two fundamental mechanisms: wider theory
Why do networks matter?
Two fundamental mechanisms: Multiple levels
Stability & Trajectory patterns across:
Each level requires new techniques or tools to capture setting
That we find stability at the macro level despite lots of churn at the local level raises interesting questions:
Positional approaches are less common in health research, but very promising as new ways to conceptualize context effects.
Outline
Social Networks & Health
The unit of interest in a network is the combined set of actors and their relations.
We represent actors with points and relations with lines.
Actors are referred to variously as:
Nodes, vertices or points
Relations are referred to variously as:
Edges, Arcs, Lines, Ties
Example:
[Figure: a graph of five actors (a, b, c, d, e) drawn as points joined by lines]
Social Network Data
In general, a relation can be:
Binary or Valued
Directed or Undirected
[Figure: the same five-node graph (a-e) drawn four ways: undirected, binary; directed, binary; undirected, valued; directed, valued]
Social Network Data
In general, a relation can be: (1) Binary or Valued (2) Directed or Undirected
Social Network Data
Basic Data Elements
The social process of interest will often determine what form your data take. Conceptually, almost all of the techniques and measures we describe can be generalized across data format, but you may have to do some of the coding work yourself….
[Figure: five-node graph (a-e) with directed, multiplex categorical edges]
We can examine networks across multiple levels:
1) Ego-network
- Have data on a respondent (ego) and the people they are connected to (alters).
- May include estimates of connections among alters
2) Partial network
- Ego networks plus some amount of tracing to reach contacts of contacts
- Something less than full account of connections among all pairs of actors in the relevant population
- Example: CDC Contact tracing data for STDs
Social Network Data
Basic Data Elements: Levels of analysis
3) Complete or “Global” data
- Data on all actors within a particular (relevant) boundary
- Never exactly complete (due to missing data), but boundaries are set
We can examine networks across multiple levels:
Social Network Data
Basic Data Elements: Levels of analysis
Ego-Net
Global-Net
Best Friend
Dyad
Primary
Group
Social Network Data
Basic Data Elements: Levels of analysis
2-step
Partial network
Social Network Data
Social network data are substantively divided by the number of modes in the data.
1-mode data represents edges based on direct contact between actors in the network. All the nodes are of the same type (people, organizations, ideas, etc.). Examples: communication, friendship, giving orders, sending email.
This is commonly what people think about when thinking about networks: nodes having direct relations with each other.
Social Network Data
Social network data are substantively divided by the number of modes in the data.
2-mode data represents nodes from two separate classes, where all ties are across classes. Examples:
People as members of groups
People as authors on papers
Words used often by people
Events in the life history of people
The two modes of the data represent a duality: you can project the data as people connected to people through joint membership in a group, or groups to each other through common membership
There may be multiple relations of multiple types connecting your nodes.
Bipartite networks imply a constraint on the mixing, such that ties only cross classes.
Here we see a tie connecting each woman with the party she attended (Davis data)
Social Network Data
Basic Data Elements: Modes
Social Network Data
Basic Data Elements: Modes
Bipartite networks imply a constraint on the mixing, such that ties only cross classes.
Here we see a tie connecting each woman with the party she attended (Davis data)
By projecting the data, one can look at the memberships shared between people or the members common to groups: this is the person-to-person projection of the 2-mode data.
Social Network Data
Basic Data Elements: Modes
Social Network Data
Basic Data Elements: Modes
By projecting the data, one can look at the memberships shared between people or the members common to groups: this is the group-to-group projection of the 2-mode data.
Casalino, Lawrence P., Michael F. Pesko, Andrew M. Ryan, David J. Nyweide, Theodore J. Iwashyna, Xuming Sun, Jayme Mendelsohn and James Moody. “Physician Networks and Ambulatory Care Admissions” Medical Care 53:534-41
Social Network Data
Example of a 2-mode network: Patients & Care Settings
Social Network Data
Multi-layer networks
A generalization of multiplex networks and multi-mode networks is the multi-layer network. A multi-layer network is a network that has qualitatively different classes of nodes (like multi-mode networks) and qualitatively different types of relations (like multiplex networks). Multiplex and multi-mode networks can be subsumed under the multi-layer formalism.
See Kivelä, Mikko, Alex Arenas, Marc Barthelemy, James P. Gleeson, Yamir Moreno, and Mason A. Porter. “Multilayer Networks.” Journal of Complex Networks 2:203-271. https://doi.org/10.1093/comnet/cnu016
From pictures to matrices
[Figure: undirected, binary graph on nodes a-e]
X =
  a b c d e
a 0 1 0 0 0
b 1 0 1 0 0
c 0 1 0 1 1
d 0 0 1 0 1
e 0 0 1 1 0
For an undirected graph, the corresponding matrix is symmetric.
The traditional way to store & represent network data is with an adjacency matrix.
The matrix (X) at right represents an undirected binary network. Each node (a-e) is listed on both the row and the column.
The ith row and the jth column (Xij) records the value of a tie from node i to node j. For example, the line between nodes a and b is represented as an entry in the first row and second column (red at right).
Because the graph is undirected, the ties sent are the same as the ties received, so every entry above the diagonal equals the corresponding entry below the diagonal.
Basic Data Structures
Social Network Data
Directed, binary
[Figure: directed binary graph on nodes a-e with its 5 x 5 adjacency matrix (seven ties)]
Directed graphs, on the other hand, are asymmetrical, and so are their adjacency matrices.
We can see that Xab = 1 and Xba = 1; therefore a “sends” to b and b “sends” to a. However, Xbc = 0 while Xcb = 1; therefore c “sends” to b, but b does not reciprocate.
Basic Data Structures
Social Network Data
[Figure: directed valued adjacency matrix on nodes a-e; cells record tie values rather than 0/1]
Basic Data Structures
Social Network Data
Directed, Valued
a
b
c
e
d
From matrices to lists (binary)
Adjacency Matrix
  a b c d e
a 0 1 0 0 0
b 1 0 1 0 0
c 0 1 0 1 1
d 0 0 1 0 1
e 0 0 1 1 0
Adjacency List
a b
b a c
c b d e
d c e
e c d
Arc List
a b
b a
b c
c b
c d
c e
d c
d e
e c
e d
Social network analysts also use adjacency lists and arc lists to more efficiently store network data.
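The conversion from matrix to lists can be sketched in a few lines. This is an illustrative pure-Python version, not the storage format of any particular package; the helper names are hypothetical, and the matrix is the five-node undirected example implied by the adjacency list on this slide.

```python
nodes = ["a", "b", "c", "d", "e"]
X = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
]

def to_adjacency_list(matrix, labels):
    """One entry per node: the list of nodes it sends a tie to."""
    return {labels[i]: [labels[j] for j, x in enumerate(row) if x]
            for i, row in enumerate(matrix)}

def to_arc_list(matrix, labels):
    """One (sender, receiver, value) triple per nonzero cell."""
    return [(labels[i], labels[j], x)
            for i, row in enumerate(matrix)
            for j, x in enumerate(row) if x]

adj_list = to_adjacency_list(X, nodes)
arc_list = to_arc_list(X, nodes)
print(adj_list)
print(arc_list)
```

The lists store only the ties that exist, which is where the efficiency comes from: 10 arcs here instead of 25 matrix cells, and the gap widens quickly as networks get larger and sparser.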
a
b
c
e
d
Basic Data Structures
Social Network Data
From matrices to lists (valued)
Adjacency Matrix (valued)
  a b c d e
a 0 1 0 0 0
b 1 0 2 0 0
c 0 2 0 3 5
d 0 0 3 0 1
e 0 0 5 1 0
Adjacency List
a b
b a c
c b d e
d c e
e c d
Arc List
a b 1
b a 1
b c 2
c b 2
c d 3
c e 5
d c 3
d e 1
e c 5
e d 1
Social network analysts also use adjacency lists and arc lists to more efficiently store network data.
a
b
c
e
d
Basic Data Structures
Social Network Data
1
2
5
1
3
a 1
b 1 2
c 2 3 1
d 3 1
e 5 1
contact
value
From matrices to lists (valued)
a
b
c
d
e
a
b
c
d
e
1
1
2
3
a b 1 green
b a 1 green
c b 2 green
c b 2 red
c d 3 green
c e 5 red
e c 5 red
e d 1 red
Arc List
Social network analysts also use adjacency lists and arc lists to more efficiently store network data.
Basic Data Structures
Social Network Data
a
b
c
e
d
a
b
c
d
e
2
5
5
1
Set/stack/list of adj mats
Working with two-mode data
A person-to-group adjacency matrix is rectangular, with one mode (persons, say) down rows and the other (groups, say) across columns
A =
  1 2 3 4 5
A 0 0 0 0 1
B 1 0 0 0 0
C 1 1 0 0 0
D 0 1 1 1 1
E 0 0 1 0 0
F 0 0 1 1 0
Each column is a group, each row a person, and the cell = 1 if the person in that row belongs to that group.
You can tell how many groups two people both belong to by comparing the rows: Identify every place that both rows = 1, sum them, and you have the overlap.
Basic Data Structures
Social Network Data
$G = A^{T}A$ = group-to-group projection
$P = AA^{T}$ = person-to-person projection
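A small sketch of the two projections, using the 6-person by 5-group matrix A from this slide. The helper functions are hypothetical and written in plain Python so the arithmetic is visible; any matrix library would do the same job.

```python
# Rows = persons A-F, columns = groups 1-5 (the matrix A shown on the slide).
A = [
    [0, 0, 0, 0, 1],  # A
    [1, 0, 0, 0, 0],  # B
    [1, 1, 0, 0, 0],  # C
    [0, 1, 1, 1, 1],  # D
    [0, 0, 1, 0, 0],  # E
    [0, 0, 1, 1, 0],  # F
]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Plain matrix product: rows of x against columns of y."""
    cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]

P = matmul(A, transpose(A))  # person-to-person: shared group counts
G = matmul(transpose(A), A)  # group-to-group: shared member counts

print(P[3][5])  # groups shared by persons D and F
print(G[0][1])  # members shared by groups 1 and 2
```

The diagonals carry information too: P[i][i] is the number of groups person i belongs to, and G[j][j] is the size of group j.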
Social Network Data
Multi-layer networks
5 waves of Newcomb fraternity data as multi-layer.
Outline
Social Networks & Health
Exploratory analysis: Visualization
One of the first analysis tasks one undertakes is to visualize the network. This gives you a chance to gut-check the data, make sure nothing looks terribly out of place and start exploring key “shape” functions of the setting.
A good visualization can go a long way toward building intuition about what features are shaping the network. This is a very deep field; we’re just going to touch on enough to get you started.
A good network drawing allows viewers to come away from the image with an almost immediate intuition about the underlying structure of the network being displayed.
However, because there are multiple ways to display the same information, and standards for doing so are few, the information content of a network display can be quite variable.
Consider the 4 graphs drawn at right.
After asking yourself what intuition you gain from each graph, click on the screen.
Now trace the actual pattern of ties. You will see that these 4 graphs are exactly the same.
Exploratory analysis: Visualization
Network visualization helps build intuition, but you have to keep the drawing heuristic in mind. Here we show the same graphs with two different techniques:
Tree-Based layouts
Most effective for very sparse, regular graphs. Very useful when relations are strongly directed, such as organization charts, cluster trees or internet connections.
“Force” layouts
Most effective with graphs that have a strong community structure (clustering, etc). Provides a very clear correspondence between social distance and plotted distance
Two images of the same network
(good)
(Fair - poor)
Basic Layout Heuristics
Tree-Based layouts
Two layouts of the same network
(poor)
(good)
Basic Layout Heuristics
“good” = high correlation between spatial distance and network distance.
“Force” layouts
Fixed Coordinate Layouts
Two layouts of the same network
Basic Layout Heuristics
Fit: 0.528
Fit: .682
Fixed Coordinate Layouts
Two layouts of the same network
Basic Layout Heuristics
Fit=0.57
Fit=0.26
Levels of distortion?
Two layouts of the same network
Basic Layout Heuristics
Shrink each cluster to accent separation
Fit=0.75
Fit=0.81
Secondary style elements
Some seemingly innocuous features can shape the effect of network diagrams
Use curves sparingly…most useful when there are multiple edges between the same nodes.
Abstraction helps – consider dense networks
Co-voting similarity, 109th senate
Basic Problem 1: Density/Scale
An alternative is to think of a network trace as the likely number reached from any given node, where “likely” is determined either by an edge transmission p(transfer|dist) or simple distance (geodesic), and follow each starting node.
Trace Curves: Number reachable at each step
The collection of curves is useful, to either compare to a null hypothesis or to contextualize an observed epidemic. Curves are simple to use, as we have good statistical measures of their shape.
But they are still one-step removed from the network itself, so how might we incorporate this information into the network?
Basic Problem 2: Spread over a dynamic network
Tree Based Layout.
Since concurrency is an edge property, we color the edges by the concurrency status of the tie…
Start with Context: Now follow only the diffusion paths
Basic Problem 2: Spread over a dynamic network
Tree Based Layout.
..but since the path carries the infection, we highlight the cumulative effect of concurrency by noting any place where a transmission would have been impossible were it not for an earlier concurrent edge.
This is one way to deal with the importance of network dynamics.
Start with Context: Now follow only the diffusion paths
Basic Problem 2: Spread over a dynamic network
Another way to represent time is by stacking relations against time:
Basic Problem 3: Dynamic network Evolution
Outline
Social Networks & Health
Basic Network Metrics
Much of network “analysis” is measuring a metric on the graph. These often define features that are then used as independent variables in a substantive analysis.
We next describe some of the most common basic elements; there are many alternatives and nuances to each…this is the tip of the iceberg.
Density: Mean of the adjacency matrix
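[Figure: the five-node example graph (a-e) with tie values 1, 2, 5, 1, 3]
Valued adjacency matrix (diagonal excluded):
   a  b  c  d  e
a  -- 1  0  0  0
b  1  -- 2  0  0
c  0  2  -- 3  5
d  0  0  3  -- 1
e  0  0  5  1  --
Binary adjacency matrix (diagonal excluded):
   a  b  c  d  e
a  -- 1  0  0  0
b  1  -- 1  0  0
c  0  1  -- 1  1
d  0  0  1  -- 1
e  0  0  1  1  --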
For most real-life social networks, density is hard to interpret because the denominator scales by the square of population. So average degree is often better.
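To see why the denominator is the problem, here is a hedged sketch comparing density and average degree. The `ring` helper is a hypothetical test network (everyone tied to two neighbours on a circle), used only because its degree is known in advance.

```python
def count_ties(matrix):
    """Number of nonzero off-diagonal cells."""
    n = len(matrix)
    return sum(1 for i in range(n) for j in range(n) if i != j and matrix[i][j])

def density(matrix):
    """Mean of the binary adjacency matrix, ignoring the diagonal."""
    n = len(matrix)
    return count_ties(matrix) / (n * (n - 1))

def mean_degree(matrix):
    """Average number of ties per node."""
    return count_ties(matrix) / len(matrix)

def ring(n):
    """Hypothetical network: node i is tied to i-1 and i+1 around a circle (n >= 3)."""
    return [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)]
            for i in range(n)]

for size in (5, 50, 500):
    net = ring(size)
    print(size, round(density(net), 4), mean_degree(net))
```

Everyone has exactly two ties at every size, so average degree stays at 2.0 while density falls from 0.5 toward zero as the population grows.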
Social Network Metrics: Basic Volume
Degree: Number of links adjacent to a node
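Directed, binary (row totals = ties sent, column totals = ties received):
        a  b  c  d  e | Total
a       0  1  0  0  0 | 1
b       1  0  0  0  0 | 1
c       0  1  0  1  1 | 3
d       0  0  0  0  0 | 0
e       0  0  1  1  0 | 2
Total   1  2  1  2  1 | 7
Directed, valued (row totals = value sent, column totals = value received):
        a  b  c  d  e | Total
a       0  1  0  0  0 | 1
b       1  0  0  0  0 | 1
c       0  2  0  3  1 | 6
d       0  0  0  0  0 | 0
e       0  0  5  1  0 | 6
Total   1  3  5  4  1 | 14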
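Degree falls straight out of the adjacency matrix as row and column sums. The sketch below uses an illustrative five-node directed matrix (an assumption for this example, chosen to agree with the totals on this slide); rows are senders and columns are receivers.

```python
nodes = ["a", "b", "c", "d", "e"]
X = [
    [0, 1, 0, 0, 0],  # a sends to b
    [1, 0, 0, 0, 0],  # b sends to a
    [0, 1, 0, 1, 1],  # c sends to b, d, e
    [0, 0, 0, 0, 0],  # d sends to no one
    [0, 0, 1, 1, 0],  # e sends to c, d
]

out_degree = dict(zip(nodes, (sum(row) for row in X)))        # row sums
in_degree = dict(zip(nodes, (sum(col) for col in zip(*X))))   # column sums

print(out_degree)
print(in_degree)
```

The same two lines work on a valued matrix, where the sums give the total tie strength sent and received rather than a count of partners.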
Social Network Metrics: Basic Volume
Connectivity refers to how actors in one part of the network are connected to actors in another part of the network.
Social Network Metrics: Connectivity
d
e
c
Indirect connections are what make networks systems. One actor can reach another if there is a path in the graph connecting them.
a
b
c
e
d
f
b
f
a
Paths can be directed, leading to a distinction between strong and weak components
Social Network Metrics: Connectivity
Basic elements in connectivity
Example: a → b → c → d
Social Network Metrics: Connectivity
Reachability
If you can trace a sequence of relations from one actor to another, then the two are reachable. If there is at least one path connecting every pair of actors in the graph, the graph is connected and is called a component.
Intuitively, a component is the set of people who are all connected by a chain of relations.
Social Network Metrics: Connectivity
This example contains many components.
Social Network Metrics: Connectivity
Because relations can be directed or undirected, components come in two flavors:
For a graph with any directed edges, there are two types of components:
Strong components consist of the set(s) of all nodes that are mutually reachable
Weak components consist of the set(s) of all nodes where at least one node can reach the other.
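The two definitions can be sketched directly from reachability. This is a deliberately simple version for small graphs (it is not the algorithm any particular package uses), and the arc list is a hypothetical five-node example.

```python
from collections import deque

def reachable(adj, start):
    """All nodes that can be reached from start by following ties."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def components(nodes, arcs, strong=True):
    adj = {n: set() for n in nodes}
    for i, j in arcs:
        adj[i].add(j)
        if not strong:
            adj[j].add(i)  # weak components ignore tie direction
    reach = {n: reachable(adj, n) for n in nodes}
    found, assigned = [], set()
    for n in nodes:
        if n not in assigned:
            comp = {m for m in reach[n] if n in reach[m]}  # mutually reachable
            found.append(comp)
            assigned |= comp
    return found

nodes = ["a", "b", "c", "d", "e"]
arcs = [("a", "b"), ("b", "a"), ("c", "b"), ("c", "d"),
        ("c", "e"), ("e", "c"), ("e", "d")]
strong = components(nodes, arcs, strong=True)
weak = components(nodes, arcs, strong=False)
print(strong)
print(weak)
```

In this example the whole graph is one weak component, but it splits into three strong components because d sends no ties and nobody in {a, b} can reach {c, e}.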
Social Network Metrics: Connectivity
There are only 2 strong components with more than 1 person in this network.
Components are the minimum requirement for social groups. As we will see later, they are necessary but not sufficient
All of the major network analysis software identifies strong and weak components
Social Network Metrics: Connectivity
Many large networks are characterized by a highly skewed distribution of the number of partners (degree)
Social Network Metrics: Connectivity
Degree distributions
Many large networks are characterized by a highly skewed distribution of the number of partners (degree)
Social Network Metrics: Connectivity
Degree distributions
The scale-free model focuses on the distance-reducing capacity of high-degree nodes:
Social Network Metrics: Connectivity
Degree distributions
The scale-free model focuses on the distance-reducing capacity of high-degree nodes:
If a preferential attachment model is active, then high-degree nodes are hubs, that if removed can disconnect the network.
Since hubs are rare, PA networks are robust to random attack, but very fragile to targeted attack.
Social Network Metrics: Connectivity
Degree distributions
Colorado Springs High-Risk
(Sexual contact only)
Social Network Metrics: Connectivity
Degree distributions
a
Geodesic distance is measured by the smallest (weighted) number of relations separating a pair:
Actor “a” is:
1 step from 4
2 steps from 5
3 steps from 4
4 steps from 3
5 steps from 1
a
Social Network Metrics: Connectivity
Distance
a b c d e f g h i j k l m
------------------------------------------
a. . 1 2 . . . . . . . . 2 1
b. 3 . 1 . . . . . . . . 1 2
c. . . . . . . . . . . . . .
d. 4 3 1 . 1 2 1 . 2 . . 2 3
e. 3 2 2 1 . 1 2 . 1 . . 1 2
f. 4 3 3 2 1 . 3 . 2 . . 2 3
g. 5 4 4 3 2 1 . . 3 . . 3 4
h. . . . . . . . . 1 . . . .
i. . . . . . . . . . . . . .
j. . . . . . . . . 1 . . . .
k. . . . . . . . . 1 . . . .
l. 2 1 2 . . . . . . . . . 1
m. 1 2 3 . . . . . . . . 1 .
b
c
d
g
f
e
k
i
j
h
l
m
a
When the graph is directed, distance is also directed (distance to vs distance from), following the direction of the tie.
Social Network Metrics: Connectivity
Distance
Reachability in Colorado Springs
(Sexual contact only)
(Node size = log of degree)
Social Network Metrics: Connectivity
Distance
As a graph statistic, the distribution of distance can tell you a good deal about how close people are to each other (we’ll see this more fully when we get to closeness centrality).
The diameter of a graph is the longest geodesic, giving the maximum distance. We often use l, the mean distance between every pair, to characterize the entire graph.
For example, all else equal, we would expect rumors to travel faster through settings where the average distance is small.
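A minimal sketch of these distance statistics, assuming an unweighted, undirected graph so that distance is a simple step count found by breadth-first search. The five-node edge list is a hypothetical example.

```python
from collections import deque

def geodesics_from(adj, source):
    """Shortest step count from source to every node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]
adj = {}
for i, j in edges:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

all_dist = {s: geodesics_from(adj, s) for s in adj}
pair_dist = [all_dist[s][t] for s in adj for t in all_dist[s] if s != t]

diameter = max(pair_dist)                       # longest geodesic
mean_distance = sum(pair_dist) / len(pair_dist) # l, over reachable pairs
print(diameter, mean_distance)
```

Unreachable pairs are simply left out of the average here; how to treat them is a choice you have to make (and report) when the graph has more than one component.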
Measuring Networks: Connectivity
Distance
For a real network, people’s friends are not random, but clustered. We can modify the random equation by adjusting a, such that some portion of the contacts are random, the rest not. This adjustment is a ‘bias’, i.e. a non-random element in the model, that gives rise to the notion of ‘biased networks’. People have studied (mathematically) biases associated with:
There is still a great deal of work to be done in this area empirically, and it promises to be a good way of studying the structure of very large networks.
Measuring Networks: Connectivity
Redundancy Local
Measuring Networks: Connectivity
Redundancy (Local)
Local redundancy is known as “clustering” or “transitivity” - that one’s friends are friends with each other.
Density is the proportion of pairs tied, excluding ego.
Transitivity is the proportion of two-step ties that are closed (Friend of a Friend is a friend)
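Both measures can be counted by hand on a small graph. The sketch below assumes an undirected binary network stored as a dictionary of neighbour sets; the function names and the example graph are hypothetical.

```python
from itertools import combinations

def ego_density(adj, ego):
    """Proportion of pairs of ego's alters that are tied to each other (ego excluded)."""
    pairs = list(combinations(sorted(adj[ego]), 2))
    if not pairs:
        return 0.0
    return sum(1 for i, j in pairs if j in adj[i]) / len(pairs)

def transitivity(adj):
    """Proportion of two-step paths i - mid - j where i and j are also tied."""
    two_paths = closed = 0
    for mid in adj:
        for i, j in combinations(sorted(adj[mid]), 2):
            two_paths += 1
            if j in adj[i]:
                closed += 1
    return closed / two_paths if two_paths else 0.0

adj = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
print(ego_density(adj, "c"))  # one of c's three alter pairs (d, e) is tied
print(transitivity(adj))      # three of six two-step paths are closed
```

Ego density is a property of one person's neighbourhood, while transitivity here is a single graph-level number, so the two can diverge sharply for any particular ego.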
Density
Transitivity
Transitivity
No ego
0 0 0
0.4 0.71 1.0
1 1 1
0.7 0.78 .64
Node Connectivity
As size of cut-set
0
1
2
3
Structural Cohesion:
A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.
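The definition translates directly into a brute-force search: try every set of 1 actor, then every set of 2, and so on, until removing some set disconnects the graph. This sketch is only practical for very small networks and is not how production software computes node connectivity; the names and example graphs are hypothetical.

```python
from itertools import combinations

def is_connected(adj, removed=frozenset()):
    """True if the nodes left after removal form a single connected piece."""
    nodes = [n for n in adj if n not in removed]
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in removed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)

def structural_cohesion(adj):
    n = len(adj)
    if not is_connected(adj):
        return 0
    for k in range(1, n - 1):          # cut sets that leave at least 2 nodes
        for cut in combinations(adj, k):
            if not is_connected(adj, frozenset(cut)):
                return k
    return n - 1                        # complete graph, by convention

def make_graph(edges):
    adj = {}
    for i, j in edges:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    return adj

chain_plus_triangle = make_graph([("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")])
ring5 = make_graph([(i, (i + 1) % 5) for i in range(5)])
clique4 = make_graph(list(combinations(range(4), 2)))
print(structural_cohesion(chain_plus_triangle),
      structural_cohesion(ring5),
      structural_cohesion(clique4))
```

The first graph falls apart if b (or c) is removed, the ring needs two removals, and the four-person clique cannot be disconnected at all, so it takes the conventional maximum of n - 1.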
Measuring Networks: Connectivity
Redundancy Global: Structural Cohesion
0
1
2
3
Node Connectivity
As number of node-independent paths
Measuring Networks: Connectivity
Redundancy Global: Structural Cohesion
Structural Cohesion:
A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.
1
2
3
4
Nestedness Structure
Cohesive Blocks
Depth
Sociogram
5
1
2
3
4
5
6
7
8
9
Cohesive Blocking
The arrangement of successively more connected sets by branches and depth uniquely characterizes the connectivity structure of a network.
Measuring Networks: Connectivity
Redundancy Global: Structural Cohesion
Distance & Connectivity measures “locate” a node based on particular features of the path structure, but there are many other ways of locating nodes in networks.
Centrality refers to (one dimension of) location, identifying where an actor resides in a network.
As a terminology point, some authors distinguish centrality from prestige based on the directionality of the tie. Since the formulas are the same in every other respect, I stick with “centrality” for simplicity.
Measuring Networks
Centrality
Conceptually, centrality is fairly straightforward: we want to identify which nodes are in the ‘center’ of the network. In practice, identifying exactly what we mean by ‘center’ is somewhat complicated, but substantively we often have reason to believe that people at the center are very important with respect to some pattern of flow/spread on the network.
The standard centrality measures capture a wide range of “importance” in a network:
After discussing these, I will describe measures that combine features of each of them.
Measuring Networks
Centrality
Measuring Networks
Centrality
..and just a teaser of all the elements we are leaving out:
Schoch lists over 100 different measures…
The most intuitive notion of centrality focuses on degree. Degree is the number of direct contacts a person has. The idea is that the actor with the most ties is the most important:
Measuring Networks
Centrality
Degree centrality, however, can be deceiving, because it is a purely local measure.
Measuring Networks
Centrality
If we want to measure the degree to which the graph as a whole is centralized, we look at the dispersion of centrality:
Simple: variance of the individual centrality scores.
Or, using Freeman’s general formula for centralization (which ranges from 0 to 1):
UCINET, SPAN, PAJEK and most other network software will calculate these measures.
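A hedged sketch of both dispersion measures for degree. It assumes the usual degree version of Freeman's index for an undirected graph (sum of differences from the maximum degree, divided by (n-1)(n-2), the value that sum takes in a star) and the population variance; the degree lists are hypothetical 8-node examples.

```python
def degree_centralization(degrees):
    """Freeman-style index: 0 when all degrees are equal, 1 for a star."""
    n = len(degrees)
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

def degree_variance(degrees):
    """Population variance of the individual degree scores."""
    n = len(degrees)
    mean = sum(degrees) / n
    return sum((d - mean) ** 2 for d in degrees) / n

star = [7] + [1] * 7   # one hub tied to seven others
ring = [2] * 8         # everyone tied to two neighbours

for name, degs in (("star", star), ("ring", ring)):
    print(name, degree_centralization(degs), degree_variance(degs))
```

Under these assumptions the star gives a Freeman score of 1.0 with a variance of about 3.9, and the ring gives 0.0 on both, the two extremes of the range.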
Measuring Networks
Centrality
Degree Centralization Scores
Freeman: .07
Variance: .20
Freeman: 1.0
Variance: 3.9
Freeman: .02
Variance: .17
Freeman: 0.0
Variance: 0.0
Measuring Networks
Centrality
A second measure of centrality is closeness centrality. An actor is considered important if he/she is relatively close to all other actors.
Closeness is based on the inverse of the distance of each actor to every other actor in the network.
Closeness Centrality:
Normalized Closeness Centrality
Measuring Networks
Centrality
Distance Closeness normalized
0 1 1 1 1 1 1 1 .143 1.00
1 0 2 2 2 2 2 2 .077 .538
1 2 0 2 2 2 2 2 .077 .538
1 2 2 0 2 2 2 2 .077 .538
1 2 2 2 0 2 2 2 .077 .538
1 2 2 2 2 0 2 2 .077 .538
1 2 2 2 2 2 0 2 .077 .538
1 2 2 2 2 2 2 0 .077 .538
Closeness Centrality in the examples
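The star table above can be reproduced with a short sketch. It assumes a connected, undirected, unweighted graph, takes closeness as 1 over the sum of distances to all others, and normalizes by multiplying by n-1; the function name is hypothetical.

```python
from collections import deque

def closeness(adj, node):
    """Return (raw, normalized) closeness for node in a connected graph."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return 1 / total, (len(adj) - 1) / total

# 8-node star: node 0 is the hub, nodes 1-7 are tied only to the hub.
star = {0: set(range(1, 8))}
star.update({leaf: {0} for leaf in range(1, 8)})

hub_raw, hub_norm = closeness(star, 0)
leaf_raw, leaf_norm = closeness(star, 1)
print(round(hub_raw, 3), round(hub_norm, 3))
print(round(leaf_raw, 3), round(leaf_norm, 3))
```

The hub is one step from everyone (distance sum 7), while each leaf is one step from the hub and two from every other leaf (distance sum 13), which is where the .143 / 1.00 and .077 / .538 rows come from.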
Distance Closeness normalized
0 1 2 3 4 4 3 2 1 .050 .400
1 0 1 2 3 4 4 3 2 .050 .400
2 1 0 1 2 3 4 4 3 .050 .400
3 2 1 0 1 2 3 4 4 .050 .400
4 3 2 1 0 1 2 3 4 .050 .400
4 4 3 2 1 0 1 2 3 .050 .400
3 4 4 3 2 1 0 1 2 .050 .400
2 3 4 4 3 2 1 0 1 .050 .400
1 2 3 4 4 3 2 1 0 .050 .400
Measuring Networks
Centrality
Distance Closeness normalized
0 1 2 3 4 5 6 .048 .286
1 0 1 2 3 4 5 .063 .375
2 1 0 1 2 3 4 .077 .462
3 2 1 0 1 2 3 .083 .500
4 3 2 1 0 1 2 .077 .462
5 4 3 2 1 0 1 .063 .375
6 5 4 3 2 1 0 .048 .286
Closeness Centrality in the examples
Measuring Networks
Centrality
Columns: distance to each actor, then closeness, then normalized closeness
0 1 1 2 3 4 4 5 5 6 5 5 6 .021 .255
1 0 1 1 2 3 3 4 4 5 4 4 5 .027 .324
1 1 0 1 2 3 3 4 4 5 4 4 5 .027 .324
2 1 1 0 1 2 2 3 3 4 3 3 4 .034 .414
3 2 2 1 0 1 1 2 2 3 2 2 3 .042 .500
4 3 3 2 1 0 2 3 3 4 1 1 2 .034 .414
4 3 3 2 1 2 0 1 1 2 3 3 4 .034 .414
5 4 4 3 2 3 1 0 1 1 4 4 5 .027 .324
5 4 4 3 2 3 1 1 0 1 4 4 5 .027 .324
6 5 5 4 3 4 2 1 1 0 5 5 6 .021 .255
5 4 4 3 2 1 3 4 4 5 0 1 1 .027 .324
5 4 4 3 2 1 3 4 4 5 1 0 1 .027 .324
6 5 5 4 3 2 4 5 5 6 1 1 0 .021 .255
Closeness Centrality in the examples
Measuring Networks
Centrality
Betweenness Centrality:
Model based on communication flow: A person who lies on communication paths can control communication flow, and is thus important. Betweenness centrality counts the number of shortest paths between j and k that actor i resides on.
[Figure: example network with nodes a through h]
Measuring Networks
Centrality
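Betweenness Centrality: $C_B(n_i) = \sum_{j<k} g_{jk}(n_i) / g_{jk}$
Where gjk = the number of geodesics connecting jk, and
gjk(ni) = the number that actor i is on.
Usually normalized by: $C'_B(n_i) = C_B(n_i) / [(g-1)(g-2)/2]$, where g is the number of actors.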
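A minimal sketch of the geodesic-counting idea, assuming an undirected, unweighted graph stored as an adjacency dict. For each pair j, k it adds the share of shortest paths that pass through i, then divides by (g-1)(g-2)/2. The function names are hypothetical, and this brute-force version is meant for small teaching examples rather than large networks.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Geodesic distance and number of geodesics from source to each reachable node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every geodesic to u extends to v
    return dist, sigma


def betweenness(adj, normalized=True):
    nodes = list(adj)
    info = {s: bfs_counts(adj, s) for s in nodes}
    score = {i: 0.0 for i in nodes}
    for j, k in combinations(nodes, 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j:
            continue  # j and k are disconnected
        for i in nodes:
            if i in (j, k) or i not in dist_j:
                continue
            dist_i, sigma_i = info[i]
            # i lies on a j-k geodesic when the two legs add up to the full distance
            if dist_j[i] + dist_i[k] == dist_j[k]:
                score[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    if normalized:
        g = len(nodes)
        score = {i: s / ((g - 1) * (g - 2) / 2) for i, s in score.items()}
    return score


star = {0: list(range(1, 8)), **{i: [0] for i in range(1, 8)}}
line = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}

star_scores = betweenness(star)
line_scores = betweenness(line)
```

In the star the hub sits on every geodesic between spokes; in the 7-actor line the middle actor sits on 9 of the 15 possible pairs.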
Measuring Networks
Centrality
Betweenness Centrality (centralization scores for the four example networks, figures not reproduced):
Centralization: 1.0
Centralization: .31
Centralization: .59
Centralization: 0
Measuring Networks
Centrality
Betweenness Centrality (figure not reproduced):
Centralization: .183
Measuring Networks
Centrality
Information Centrality:
It is quite likely that information can flow through paths other than the geodesic. The Information Centrality score uses all paths in the network, and weights them based on their length.
Measuring Networks
Centrality
Comparing across these 3 centrality values
| | Low Degree | Low Closeness | Low Betweenness |
High Degree | | Embedded in cluster that is far from the rest of the network | Ego's connections are redundant - communication bypasses him/her |
High Closeness | Key player tied to important/active alters | | Probably multiple paths in the network, ego is near many people, but so are many others |
High Betweenness | Ego's few ties are crucial for network flow | Very rare cell. Would mean that ego monopolizes the ties from a small number of people to many others. | |
Measuring Networks
Centrality
Bonacich Power Centrality: Actor’s centrality (prestige) is equal to a function of the prestige of those they are connected to. Thus, actors who are tied to very central actors should have higher prestige/ centrality than those who are not.
This is a variant of eigenvector centrality/pagerank.
Measuring Networks
Centrality
Bonacich Power Centrality:
The magnitude of β reflects the radius of power. Small values of β weight local structure, larger values weight global structure.
If β is positive, then ego has higher centrality when tied to people who are central.
If β is negative, then ego has higher centrality when tied to people who are not central.
As β approaches zero, you get degree centrality.
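A rough sketch of how β behaves, assuming the standard Bonacich form c(α, β) = α(I − βR)⁻¹R1 with the scaling constant α left at 1. Rather than inverting a matrix, it iterates c ← R1 + βRc, which converges when |β| is smaller than the reciprocal of the largest eigenvalue of R. The function name and the three-actor line are hypothetical illustrations.

```python
def bonacich_power(adj_matrix, beta, iterations=200):
    """Unscaled Bonacich power centrality by fixed-point iteration (alpha = 1 assumed)."""
    n = len(adj_matrix)
    degree = [sum(row) for row in adj_matrix]
    c = degree[:]
    for _ in range(iterations):
        # own degree plus beta times the summed centrality of one's contacts
        c = [
            degree[i] + beta * sum(adj_matrix[i][j] * c[j] for j in range(n))
            for i in range(n)
        ]
    return c


# Three actors in a line: A - B - C
line3 = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

at_zero = bonacich_power(line3, beta=0.0)    # reduces to degree
positive = bonacich_power(line3, beta=0.3)   # being tied to central B helps A and C
negative = bonacich_power(line3, beta=-0.3)  # being tied to central B hurts A and C
```

With β = 0 the scores are just degree; a positive β raises the end actors relative to B, and a negative β lowers them.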
Measuring Networks
Centrality
Bonacich Power Centrality:
β = 0.23
Measuring Networks
Centrality
A primary interest in Social Network Analysis is the identification of “significant social subgroups” – some smaller collection of nodes in the graph that can be considered, at least in some senses, as a “unit” based on the pattern, strength, or frequency of ties.
Measuring Networks
Community Detection
There are many ways to identify groups. They all insist on a group being in a connected component, but other than that the variation is wide.
We’re covering this in-depth tomorrow!
Modularity is probably the most commonly used contemporary clustering metric
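$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$
Intuitively this compares the observed ties within groups to the ties expected by chance, where:
- $A_{ij}$: Observed ties
- $k_i k_j / 2m$: Expected ties
- $\delta(c_i, c_j)$: Indicator for same group (0,1)
- $\gamma$: Resolution parameter
- $1 / 2m$: Normalizing constant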
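A small sketch of the modularity score, assuming the common Newman-Girvan form with a resolution parameter: for every same-group pair, observed ties minus gamma times the expected ties k_i k_j / 2m, summed and divided by 2m. The function name, edge list, and group labels are hypothetical.

```python
def modularity(edges, membership, gamma=1.0):
    """Modularity Q for an undirected simple graph given as a list of (u, v) edges."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    adjacent = set(edges) | {(v, u) for u, v in edges}
    total = 0.0
    for i in degree:
        for j in degree:
            if membership[i] != membership[j]:
                continue  # indicator for same group is 0
            observed = 1 if (i, j) in adjacent else 0
            expected = gamma * degree[i] * degree[j] / (2 * m)
            total += observed - expected
    return total / (2 * m)  # normalizing constant


# Two triangles joined by a single bridge (2-3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_split = modularity(edges, two_groups)
q_lumped = modularity(edges, one_group)
```

Splitting at the bridge scores 5/14, while putting everyone in one group scores zero, since observed and expected ties then cancel exactly.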
Measuring Networks
Community Detection
Q=0.24
Q=0.51
Q=0.68
Measuring Networks
Community Detection
Elements of a Role:
Examples:
Parent – child
Teacher – student
Lover – lover
Friend – Friend
Husband - Wife
Nadel (Following functional anthropologists and sociologists) defines ‘logical’ types of roles, and then examines how they can be linked together.
Measuring Networks
Community Role Detection
Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Thus, we might represent a family as:
[Figure: family of two parents (P) and three children (C), with ties labeled "Provides food for", "Romantic Love", and "Bickers with"]
(and there are, of course, many other relations inside a family)
White et al: From logical role systems to empirical social structures
Measuring Networks
Community Role Detection
Blockmodeling: basic steps
In any positional analysis, there are 4 basic steps:
1) Identify a definition of equivalence
2) Measure the degree to which pairs of actors are equivalent
3) Develop a representation of the equivalencies
4) Assess the adequacy of the representation
At the end of the day, this is community detection on a role-similarity matrix rather than an adjacency matrix.
Measuring Networks
Community Role Detection
If the model is going to be based on asymmetric or multiple relations, you simply stack the various relations, usually including both “directions” of asymmetric relations:
[Figure: family of two parents (P) and three children (C), with ties labeled "Provides food for", "Romantic Love", and "Bickers with"]
Measuring Networks
Community Role Detection
Sim
1 1 0 0 0
1 1 0 0 0
0 0 1 1 1
0 0 1 1 1
0 0 1 1 1
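The four steps can be sketched for the family example. This is a minimal illustration that assumes a strict structural-equivalence definition (two actors are equivalent when they send and receive identical ties to and from all third parties) and assumes the tie directions: parents provide food for children, parents love each other, and children bicker with each other. All names are hypothetical.

```python
ACTORS = ["P1", "P2", "C1", "C2", "C3"]
PARENTS, CHILDREN = (0, 1), (2, 3, 4)
N = len(ACTORS)


def relation(pairs):
    """Build an N x N 0/1 matrix from (sender, receiver) index pairs."""
    matrix = [[0] * N for _ in range(N)]
    for i, j in pairs:
        matrix[i][j] = 1
    return matrix


food = relation([(p, c) for p in PARENTS for c in CHILDREN])
love = relation([(0, 1), (1, 0)])
bickers = relation([(a, b) for a in CHILDREN for b in CHILDREN if a != b])
relations = [food, love, bickers]


def profile(i, skip):
    """Stack ties sent and received by actor i across all relations, ignoring skip."""
    values = []
    for rel in relations:
        values += [rel[i][k] for k in range(N) if k not in skip]  # ties sent
        values += [rel[k][i] for k in range(N) if k not in skip]  # ties received
    return values


# 1 if the two actors have identical stacked profiles toward third parties, else 0
sim = [
    [1 if profile(i, {i, j}) == profile(j, {i, j}) else 0 for j in range(N)]
    for i in range(N)
]
```

Real data rarely give exact matches, so in practice the 0/1 comparison is usually replaced by a graded measure such as a correlation or distance between profiles.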
Covering this in-depth on WED
Outline
Social Networks & Health
Introduction
Network Research Lifecycle
Models?
- OF networks
  - An explication of why a network looks as it does.
  - Formal or informal: ERGM vs. “Birds of a feather”
- ON networks
  - An explication of a (social) process that uses networks
  - Diffusion of innovations
  - Peer influence
  - Brokerage
We spend a lot of time here on formal statistical models; but substantive models should drive the case.
Network Models: Diffusion
The primary route from networks to health is via diffusion, either of a pathogen or a health-related behavior.
We start by discussing biological diffusion as it’s a clear and mechanistic model that we can build from for social diffusion. The treatment here is *very brief,* aimed at giving a sense of the issues at play.
Network Diffusion & Peer Influence
Basics
Classic (disease) diffusion makes use of compartmental models. Large N and homogeneous mixing allow one to express spread as generalized probability models.
Works very well for highly infectious bits in large populations…
SI(S) model – actors are in only two states, susceptible or infectious.
See: https://wiki.eclipse.org/Introduction_to_Compartment_Models for general introduction.
SEIR(S) model – adds an “exposed” but not yet infectious state and a recovered state.
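To make the compartmental idea concrete, here is a minimal sketch of a deterministic S-I-R model (susceptible, infectious, recovered) under homogeneous mixing, stepped forward with simple Euler updates. The parameter names (beta for the transmission rate, gamma for the recovery rate) and all values are illustrative assumptions, not estimates for any disease.

```python
def simulate_sir(beta, gamma, s0, i0, r0=0.0, steps=1000, dt=0.1):
    """Euler integration of a basic SIR model; returns a list of (S, I, R) tuples."""
    s, i, r = s0, i0, r0
    n = s + i + r
    history = [(s, i, r)]
    for _ in range(steps):
        # homogeneous mixing: every susceptible is equally exposed to every infectious
        new_infections = beta * s * i / n * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        history.append((s, i, r))
    return history


history = simulate_sir(beta=0.5, gamma=0.1, s0=990.0, i0=10.0)
final_s, final_i, final_r = history[-1]
peak_infectious = max(i for _, i, _ in history)
```

Nothing about who is tied to whom enters the model, which is exactly the assumption that network models relax.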
Network Diffusion & Peer Influence
Basics
Network Models
Same basic SI(R, etc.) setup, but connectivity is not assumed random; rather, it is structured by the network contact pattern.
If pij is small or the network is very clustered, these two can yield very different diffusion patterns.*
Real
Random
*These conditions do matter. Compartmental models work surprisingly well if the network is large or dense, or the bit is highly infectious, because most networks have a bit of randomness in them. We are focusing on the elements that are unique/different for network as opposed to general diffusion.
Network Diffusion & Peer Influence
Basics
If 0 < pij < 1
[Figure: example network for the case 0 < pij < 1, annotated with the values 0.01, 0.06, 0.11, 0.26, and 0.46]
In addition to* the dyadic probability that one actor passes something to another (pij), two factors affect flow through a network:
Topology
- Example: one actor cannot pass information to another unless they are either directly or indirectly connected
Time
- the timing of contact matters
- Example: an actor cannot pass information he or she has not yet received
*This is a big conditional! – lots of work on how the dyadic transmission rate may differ across populations.
Key Question: What features of a network contribute most to diffusion potential?
Network Diffusion & Peer Influence
Network diffusion features
Use simulation tools to explore the relative effects of structural connectivity features
[Figure: probability of transfer (0 to 0.4) by path distance (2 to 6)]
Distance and diffusion: p(transfer) = $p_{ij}^{dist}$, here with pij of 0.6
Network Diffusion & Peer Influence
Network diffusion features
We need:
(1) reachability
(2) distance
(3) local clustering
(4) multiple routes
(5) star spreaders
Arcs: 11, Largest component: 12, Clustering: 0
Arcs: 11, Largest component: 8, Clustering: 0.205
Network Diffusion & Peer Influence
Network diffusion features
[Figure: probability of transfer by path distance (2 to 6) and number of non-overlapping paths (1, 2, 5, and 10 paths), assuming a constant pij of 0.6]
Cohesion 🡺 Redundancy 🡺Diffusion
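The arithmetic behind distance and redundancy can be sketched as follows, assuming each step succeeds independently with probability pij, so that a single path of length d carries the bit with probability pij to the power d, and assuming the redundant paths are non-overlapping and all of the same length. The function name is hypothetical.

```python
def p_transfer(p_ij, distance, n_paths=1):
    """Chance the bit arrives over at least one of n_paths independent paths."""
    single_path = p_ij ** distance            # every step on the path must succeed
    return 1 - (1 - single_path) ** n_paths   # complement of all paths failing


one_path = {d: p_transfer(0.6, d) for d in range(2, 7)}
ten_paths = {d: p_transfer(0.6, d, n_paths=10) for d in range(2, 7)}
two_paths_at_distance_2 = p_transfer(0.6, 2, n_paths=2)
```

Along a single path the chance of transfer falls quickly with distance; adding independent routes pulls it back up, which is the sense in which cohesion produces redundancy and redundancy supports diffusion.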
Network Diffusion & Peer Influence
Network diffusion features
[Figure: example networks with node connectivity 0, 1, 2, and 3, measured as the number of node-independent paths]
Structural Cohesion:
A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.
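The definition translates directly into a brute-force check: try removing every set of 1 actor, then 2, and so on, and report the first size that disconnects the graph. This sketch is only practical for small graphs (production tools use flow-based algorithms); the function names are hypothetical, and a complete graph is assigned n - 1 by convention.

```python
from itertools import combinations


def is_connected(adj, removed=frozenset()):
    """True if the graph stays connected after dropping the removed nodes."""
    remaining = [v for v in adj if v not in removed]
    if not remaining:
        return True
    seen, stack = {remaining[0]}, [remaining[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(remaining)


def structural_cohesion(adj):
    """Minimum number of actors whose removal disconnects the network."""
    nodes = list(adj)
    if not is_connected(adj):
        return 0
    for k in range(1, len(nodes) - 1):
        for cut in combinations(nodes, k):
            if not is_connected(adj, set(cut)):
                return k
    return len(nodes) - 1  # complete graph: no cut set exists


line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
clique = {i: [j for j in range(4) if j != i] for i in range(4)}
split = {0: [1], 1: [0], 2: [3], 3: [2]}

cohesion = {
    "line": structural_cohesion(line),
    "ring": structural_cohesion(ring),
    "clique": structural_cohesion(clique),
    "split": structural_cohesion(split),
}
```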
Network Diffusion & Peer Influence
Network diffusion features
STD Transmission danger: sex or drugs?
Structural core more realistic than nominal core
Data from “Project 90,” of a high-risk population in Colorado Springs
Network Diffusion & Peer Influence
Network diffusion features
Assortative mixing:
A more traditional way to think about “star” effects.
[Figure panels: Partner Distribution; Component Size/Shape]
Emergent Connectivity in low-degree networks
Network Diffusion & Peer Influence
A closer look at emerging connectivity
In both distributions, a giant component & reconnected core emerges as density increases, but at very different speeds and ultimate extent.
Network Diffusion & Peer Influence
A closer look at emerging connectivity
In addition to* the dyadic probability that one actor passes something to another (pij), two factors affect flow through a network:
Topology
- Example: one actor cannot pass information to another unless they are either directly or indirectly connected
Time
- the timing of contact matters
- Example: an actor cannot pass information he or she has not yet received
*This is a big conditional! – lots of work on how the dyadic transmission rate may differ across populations.
Key Question: What features of a network contribute most to diffusion potential?
Network Diffusion & Peer Influence
Relational Dynamics
Use simulation tools to explore the relative effects of structural connectivity features
Contact network: Everyone, it is a connected component
Who can “A” reach?
Network Diffusion & Peer Influence
Relational Dynamics
Discussions of network effects on STD spread often speak loosely of “the network.”
There are three relevant networks that are often conflated:
Three relevant networks
Exposure network: here, node “A” could reach up to 8 others
Who can “A” reach?
Network Diffusion & Peer Influence
Relational Dynamics
Transmission network: upper limit is 8 through the exposure links (dark blue). Transmission is path dependent: if no transmission to B, then also none to {K,L,O,J,M}
Who can “A” reach?
Exposable Link (from A’s p.o.v.)
Contact
Network Diffusion & Peer Influence
Relational Dynamics
The mapping between the contact network and the exposure network is based on relational timing. In a dynamic network, edge timing determines if something can flow down a path because things can only be passed forward in time.
Definitions:
Two edges are adjacent if they share a node.
A path is a sequence of adjacent edges (E1, E2, …Ed).
A time-ordered path is a sequence of adjacent edges where, for each pair of edges in the sequence, the start time of the earlier edge is less than or equal to the end time of the later edge: $S(E_1) \le E(E_2)$
Adjacent edges are concurrent if they share a node and have start and end dates that overlap. This occurs if:
S(E2) < E(E1)
Concurrency
Network Diffusion & Peer Influence
Relational Dynamics
[Figure: timeline (time 1 to 10) of edges AB, BC, CD, and CE among nodes A through E, with edge intervals labeled 2 - 7, 1 - 3, 5 - 6, and 8 - 9, and start and end times marked S(ab), E(ab), S(bc), E(bc), S(ce), E(ce)]
Network Diffusion & Peer Influence
Relational Dynamics
The constraints of time-ordered paths change our understanding of the system structure of the network. Paths make a network a system: linking actors together through indirect connections. Relational timing changes how paths cumulate in networks.
Indirect connectivity is no longer transitive:
[Figure: chain A, B, C, D with edge times A-B: 1 - 2, B-C: 3 - 4, C-D: 1 - 2]
Here A can reach C, and C can reach D. But A cannot reach D (nor D A). Why? Because any infection A passes to C would have happened after the relation between C and D ended.
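A minimal sketch of time-ordered reachability for this example, assuming undirected edges given as (node, node, start, end), that the source holds the bit from the beginning, and that a bit can cross an edge at any moment the edge is active once the sender has it. The edge times (A-B at 1-2, B-C at 3-4, C-D at 1-2) are read off the example; the function name is hypothetical.

```python
def reachable_set(edges, source, start_time=float("-inf")):
    """Nodes that source can reach along time-ordered paths."""
    arrival = {source: start_time}  # earliest time each node could hold the bit
    changed = True
    while changed:
        changed = False
        for u, v, start, end in edges:
            for sender, receiver in ((u, v), (v, u)):
                # the sender must have the bit before the edge ends
                if sender in arrival and arrival[sender] <= end:
                    t = max(arrival[sender], start)
                    if t < arrival.get(receiver, float("inf")):
                        arrival[receiver] = t
                        changed = True
    return set(arrival) - {source}


edges = [("A", "B", 1, 2), ("B", "C", 3, 4), ("C", "D", 1, 2)]
reach = {node: reachable_set(edges, node) for node in "ABCD"}
```

Ignoring the time stamps would put all four actors in one mutually reachable component; with them, A and D can never reach each other.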
Network Diffusion & Peer Influence
Relational Dynamics
Edge time structures are characterized by sequence, duration and overlap.
Paths between i and j have length and duration, but these need not be symmetric even if the constituent edges are symmetric.
Network Diffusion & Peer Influence
Relational Dynamics
Implied Contact Network of 8 people in a ring
All relations Concurrent
Reachability = 1.0
Network Diffusion & Peer Influence
Relational Dynamics
Implied Contact Network of 8 people in a ring
Serial Monogamy (1)
[Figure: ring with edges labeled by time: 1, 2, 3, 7, 6, 5, 8, 4]
Reachability = 0.71
Network Diffusion & Peer Influence
Relational Dynamics
Implied Contact Network of 8 people in a ring
Mixed Concurrent
[Figure: ring with edges labeled by time: 2, 2, 1, 1, 2, 2, 3, 3]
Reachability = 0.57
Network Diffusion & Peer Influence
Relational Dynamics
Implied Contact Network of 8 people in a ring
Serial Monogamy (3)
[Figure: ring with edges labeled by time: 1, 2, 1, 1, 2, 1, 2, 2]
Reachability = 0.43
Network Diffusion & Peer Influence
Relational Dynamics
Timing alone can change mean reachability from 1.0 when all ties are concurrent to 0.43.
In general, ignoring time order is equivalent to assuming all relations occur simultaneously – assumes perfect concurrency across all relations.
Network Diffusion & Peer Influence
Relational Dynamics
Resulting infection trace from a simulation (Morris et al, AJPH 2010).
Observed infection paths from 10 seeds in an STD simulation, edges coded for concurrency status.
Network Diffusion & Peer Influence
Relational Dynamics
Timing constrains potential diffusion paths in networks, since bits cannot flow through edges that have ended.
This means that:
Combined, this means that many of our standard path-based network measures will be incorrect on dynamic graphs.
Network Diffusion & Peer Influence
Structural Transmission Dynamics: beyond disease diffusion
Complex Contagion
Thus far we have focused on a “simple” dyadic diffusion parameter, pij, where the probability of passing/receiving the bit depends purely on the discordant status of the dyad. This is sometimes called the “independent cascade model,” and it implies that the chance of receiving the bit rises monotonically with the number of times you are exposed through peers.
High exposure could be due to repeated interaction with one person or weak interaction with many, effectively equating:
Alternative models exist. Under “complex contagion” for example, the likelihood that I accept the bit that flows through the network depends on the proportion of my peers that have the bit.
Network Diffusion & Peer Influence
Structural Transmission Dynamics: beyond disease diffusion
Complex Contagion
If adoption requires k neighbors to have adopted, then transmission can only occur within dense clusters:
For this network under weak complex diffusion (k=2), the maximum risk size reaches 98%.
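A minimal sketch of a k-threshold cascade, assuming deterministic adoption (a node adopts as soon as at least k of its neighbors have adopted) and a connected pair as the seed. The two toy ring networks and the function names are hypothetical and are not the PROSPER school data.

```python
def threshold_cascade(adj, seeds, k=2):
    """Return the set of adopters once no further node meets the k-neighbor threshold."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbors in adj.items():
            if node in adopted:
                continue
            if sum(nb in adopted for nb in neighbors) >= k:
                adopted.add(node)
                changed = True
    return adopted


def ring_lattice(n, reach):
    """Ring where each node is tied to the reach nearest nodes on each side."""
    return {
        i: [(i + step) % n for step in range(-reach, reach + 1) if step != 0]
        for i in range(n)
    }


sparse_ring = ring_lattice(10, reach=1)     # no triangles, no local redundancy
clustered_ring = ring_lattice(10, reach=2)  # overlapping triangles

stalled = threshold_cascade(sparse_ring, seeds={0, 1}, k=2)
spread = threshold_cascade(clustered_ring, seeds={0, 1}, k=2)
```

With the same seeds and threshold, the cascade stalls at the two seeds in the sparse ring and reaches everyone in the clustered ring, which illustrates why local redundancy matters for complex contagion.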
One of the Prosper schools:
Start
Network Diffusion & Peer Influence
Structural Transmission Dynamics: beyond disease diffusion
Complex Contagion
Can lead to widely varying sizes of potential diffusion cascades. Here’s the distribution across all PROSPER schools:
Distribution is largely bimodal (even with a connected pair start)
The governing factors are (a) curved effect of local redundancy and (b) structural cohesion
[Figure: mean cascade size (network average proportion reached) under k=2 complex contagion, shown for structural cohesion values Coh = 0.3, 1.2, 2.2, 3.2, and 4.1]
Background:
This implies that position in a communication network should be related to attitudes.
Craig Rawlings will cover this later in the week!
Network Diffusion & Peer Influence
Peer Influence Dynamics
Statistical Models for Networks
Simple Random Graphs
Long history of model development for networks.
Here we are just hinting at what is here and why useful.
We often want a way to build models that explain the topology in a network. The foundation of these models is Random Graphs.
I will cover this later in the week
Open Problems
Open Problems
Models that allow for real-time feedback & data updates, population dynamics, etc. It's doable now in a compartmental framework but largely ad hoc
Open Problems
Open Problems
[Figure: a Peer and Behavior diagram]
Substantively, peers and behavior co-constitute each other in a naturally endogenous and over-determined way. Notions of distinguishing the causal effect of peers on behavior net of behavior on peers mis-ask the question. We need some radical new thinking on this.
[Figure: one Peer and Behavior diagram "is not equal to" the sum of two separate Peer and Behavior diagrams]
Open Problems
[Figure: family network of two parents and three children]
Positional models are fundamentally under-developed; yet hold the greatest promise of realizing the potential of relational models to provide deep insights into social organization and behavior.
Open Problems
Example: Social Exchange in developing contexts
Open Problems
Example: Social Exchange in developing contexts
Required: probably need to include content of relation in the theory (at least valence, likely more)
Open Problems
Do we know how relations should change over time?
🡪 A 4 year old should not relate the same way to parents as a 14 year old. But what about old friends? Neighbors? Etc.? What is the life-history of a relation?
Open Problems
The real controversy over the Framingham studies turned on social mechanism: how do relations get “inside”?
Current models are largely passive transmission or stress-response; both seem much too simple.
Open Problems
Networks exist within an institutional context; the only way to know that context is to return to communities.
Open Problems
Radio collar studies of people might be a bit much (though talk to Kitts!), but we leave clear digital traces…can we use that smartly?
Open Problems