1 of 124

Networks & Health

Intro & overview

2 of 124

  1. Intro/Big Picture
    1. What are networks?
    2. Connections & Positions
  2. Network Relevance to Health Research
  3. Basic Network Data Elements
    • Types of networks
    • Levels of analysis
    • Data structures

Outline

Social Network Data

3 of 124

Introduction

We live in a connected world:

“To speak of social life is to speak of the association between people – their associating in work and in play, in love and in war, to trade or to worship, to help or to hinder. It is in the social relations men establish that their interests find expression and their desires become realized.”

Peter M. Blau

Exchange and Power in Social Life, 1964

4 of 124

*New York Times, 1933. Moreno claims this work was covered in “all the major papers,” but I can’t find any other clips…

Introduction

We live in a connected world:

"If we ever get to the point of charting a whole city or a whole nation, we would have … a picture of a vast solar system of intangible structures, powerfully influencing conduct, as gravitation does in space. Such an invisible structure underlies society and has its influence in determining the conduct of society as a whole."

J.L. Moreno, New York Times, April 13, 1933

5 of 124

High Schools as Networks

Introduction

6 of 124

Introduction

Countryside High School, by grade

7 of 124

Introduction

Countryside High School, by race

8 of 124

And yet, standard social science analysis methods do not take this space into account.

“For the last thirty years, empirical social research has been dominated by the sample survey. But as usually practiced, …, the survey is a sociological meat grinder, tearing the individual from his social context and guaranteeing that nobody in the study interacts with anyone else in it.”

Allen Barton, 1968 (Quoted in Freeman 2004)

Moreover, the complexity of the relational world makes it impossible to identify social connectivity using only our intuition.

Social Network Analysis (SNA) provides a set of tools to empirically extend our theoretical intuition of the patterns that compose social structure.

Introduction

9 of 124

Social network analysis is:

  • a set of relational methods for systematically understanding and identifying connections among actors. SNA
    • is motivated by a structural intuition based on ties linking social actors
    • is grounded in systematic empirical data
    • draws heavily on graphic imagery
    • relies on the use of mathematical and/or computational models.

  • Social Network Analysis embodies a range of theories relating types of observable social spaces and their relation to individual and group behavior.

Introduction

10 of 124

But scientists are starting to take networks seriously:

“Networks”

Introduction

11 of 124

“Networks”

“Obesity”

Introduction

But scientists are starting to take networks seriously: why?

12 of 124

Introduction

…and NSF is investing heavily in it.

13 of 124

Social Determinants of Health

“…social determinants of health refers to the complex, integrated, and overlapping social structures and economic systems that include social and physical environments and health services.” (CDC, 2010)

WHO Commission on Social Determinants of Health Conceptual Framework

Introduction

14 of 124

Social Determinants of Health

Social effects hold promising multiplier effects:

Introduction

15 of 124

A general embeddedness rubric for network models…

Outcome

  • “A friend of a friend is a friend”
  • “My enemy’s enemy is my friend”

[Figure: two signed triads, (+, +, +) and (−, −, +)]

Balanced

Opposition

Segregation, political polarization, feuds, wars, etc.

  • “People adopt the behavior of trusted friends”

Diffusion of health behavior

Introduction

Key Questions

16 of 124

Connectionist:

Positional:

Networks as pipes

Networks as roles

Ego

Complete

Multiple

- Structural Holes

- Density

- Mixing Models

- Size

- Community Detection

- Reachability

- Homophily

- Degree Distribution

- Social Balance

- ERGM

- Multi-layer networks

- Multi-level models of multiple networks

- Local Roles (Mandel 1983, Mandel & Winship 1984)

- Relational Block Models

- Motifs

Centrality

Cohesive blocking

2 ideas:

  • Patterns in networks

  • Patterns of networks

Connections & Positions: Network Problems

17 of 124

The classic characterization of epidemic spread is with R0.

R0 depends on:

      • β: infectivity, the probability that infection will pass from an infected person (I) to a susceptible person (S),
      • c: the likelihood of contact,
      • D: the duration of infectiousness.

 

The spread of any epidemic depends on the number of secondary cases per infected case, known as the reproductive rate (R0). R0 depends on the probability that a contact will be infected over the duration of contact (β), the likelihood of contact (c), and the duration of infectiousness (D).
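As a rough illustration of how the three ingredients combine: in the classic homogeneous-mixing formulation they multiply, R0 = β · c · D. The sketch below assumes that form and treats c as a contact rate per unit time; the function name and the numeric values are hypothetical, chosen only to show the arithmetic.

```python
def basic_reproduction_number(beta: float, contact_rate: float, duration: float) -> float:
    """Return R0 = beta * c * D, assuming homogeneous (random) mixing."""
    return beta * contact_rate * duration


# Hypothetical values: 5% transmission per contact, 10 contacts per week,
# infectious for 4 weeks.
r0 = basic_reproduction_number(beta=0.05, contact_rate=10.0, duration=4.0)

# An epidemic can grow only when each case produces more than one new case.
epidemic_can_grow = r0 > 1
print(round(r0, 3), epidemic_can_grow)
```

The point of the slides that follow is that this product is only a good summary when contacts are close to random; network structure changes who actually meets whom.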

Why do networks matter?

Two fundamental mechanisms: Connections example

18 of 124

Isolated vision

Why do networks matter?

Two fundamental mechanisms: Connections example

19 of 124

Connected vision

Why do networks matter?

Two fundamental mechanisms: Connections example

20 of 124

The structure of a network captures who is connected to whom:

R0 is knowable if your disease-relevant contact network (nd) is:

  • An Erdős–Rényi random graph
  • Sets of linked random graphs (“compartments”)
  • A static graph that is random except for activity levels (“degree sequence” models)
  • Static and “tree like.”

In all of these cases we can get (near) exact solutions for diffusion.

…but these situations almost never happen in the real world.

 

Why do networks matter?

Two fundamental mechanisms: Connections example

21 of 124

Standard models fail because:

  • Networks change faster than the disease is spreading (STD, parasite lifecycle)
  • The network contains unobserved clustering (or any non-random structuring that affects path length or redundancy)
  • There is active feedback between disease spread and contact rate.
  • Some feature of the contact intensifies or weakens infectivity β (or recovery)

There is fundamental science to be done explicating how nd affects epidemic risk.

The structure of a network captures who is connected to whom:

 

Why do networks matter?

Two fundamental mechanisms: Connections example

22 of 124

 

Why do networks matter?

Two fundamental mechanisms: Connections example

23 of 124

Perhaps more important, nd is conditioned by the social embeddedness of actors in other social systems:

nd = f(beliefs, practices, family, work, politics, …)

nd = f(Human Social Systems)

 

Core systems:

  • Physical embeddedness – where people live, work, play. 🡪 Geography, Kinship, Markets
  • Belief Systems – what people believe and who they listen to 🡪 Peer discussion & influence networks
  • Agentic Capacity – how much control people have over their disease-relevant contacts. 🡪 Power & inequality

We need to study more than nd to understand how disease networks function. nd=f(Human Social Systems)

Why do networks matter?

Two fundamental mechanisms: Connections example

24 of 124

Provides food for

Romantic Love

Bickers with

Why do networks matter?

Two fundamental mechanisms: Positions

Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.

C

P

X

Y

25 of 124

Parent

Parent

Child

Child

Child

Provides food for

Romantic Love

Bickers with

Why do networks matter?

Two fundamental mechanisms: Positions

Positional network mechanisms : Networks matter because of the way they capture role behavior and social exchange. Networks as Roles.

C

P

X

Y

26 of 124

Why do networks matter?

Scope of Social Networks & Health

For those who want a deeper more systematic review:

27 of 124

English-language articles indexed in the Web of Science Social Science Citation Index on: (("health" or "well being" or "medicine") and "network*").

18,572 papers, 2000–2018.

Why do networks matter?

Scope of Social Networks & Health

28 of 124

Bibliographic Similarity Networks: 1-step neighborhood of a single paper

Why do networks matter?

Scope of Social Networks & Health

29 of 124

Bibliographic Similarity Networks: 2-step neighborhood of a single paper

Why do networks matter?

Scope of Social Networks & Health

30 of 124

Since the net is large…

Use a force-directed layout to display the full space & overlay clusters….

Why do networks matter?

Scope of Social Networks & Health

31 of 124

The example paper…

32 of 124

Modularity:

Top-Level: 0.798 @ 32 Clusters

2nd Level: 0.785 @ 150 Clusters

33 of 124

Rogers:

34 of 124

Valente: Various

35 of 124

Christakis & Fowler

36 of 124

Add Health

37 of 124

Pescosolido

38 of 124

39 of 124

40 of 124

41 of 124

Social Networks & Health Grant Landscape

1940 currently active NIH grants that include network elements

42 of 124

Social Networks & Health Grant Landscape

1940 currently active NIH grants that include network elements

43 of 124

Social Networks & Health Grant Landscape

1940 currently active NIH grants that include network elements

Color indicates funding mechanism (brown = R01, which is about 42%)

44 of 124

Social Networks & Health Grant Landscape

1940 currently active NIH grants that include network elements

Topic Map

This workshop!

45 of 124

Introduction

Network Research Lifecycle

46 of 124

47 of 124

Overview

SN&H Program

48 of 124

Overview

SN&H Program

49 of 124

Overview

SN&H Program

50 of 124

Overview

SN&H Program

51 of 124

Overview

SN&H Program

52 of 124

Overview

SN&H Program

53 of 124

Strategy?

SN&H Program

Does it feel like a bit much!?

  • Focus on the ideas & big-picture.

  • Everything is recorded
  • All the slides are posted
  • You can get the details on the 2nd pass when you have time to practice/play at home.

54 of 124

Strategy?

SN&H Program

Code is (reasonably) easy to generate

55 of 124

Strategy?

SN&H Program

Code is (reasonably) easy to generate

56 of 124

Strategy?

SN&H Program

Code is (reasonably) easy to generate.

It’s knowing what question to ask that’s difficult.

57 of 124

…Breathe…

58 of 124

The unit of interest in a network is the combined set of actors and their relations.

We represent actors with points and relations with lines.

Actors are referred to variously as:

Nodes, vertices or points

Relations are referred to variously as:

Edges, Arcs, Lines, Ties

Example: [Figure: sociogram of five actors, labeled a through e, joined by lines]

Social Network Data

59 of 124

In general, a relation can be:

Binary or Valued

Directed or Undirected

[Figure: four sociograms of nodes a through e, one per combination: undirected binary; directed binary; undirected valued; directed valued, with tie values from 1 to 4]

Social Network Data

60 of 124

In general, a relation can be: (1) Binary or Valued (2) Directed or Undirected

Social Network Data

Basic Data Elements

The social process of interest will often determine what form your data take. Conceptually, almost all of the techniques and measures we describe can be generalized across data format, but you may have to do some of the coding work yourself….

[Figure: sociogram of nodes a through e with directed, multiplex categorical edges]

61 of 124

We can examine networks across multiple levels:

1) Ego-network

- May include estimates of connections among alters

2) Partial network

- Ego networks plus some amount of tracing to reach contacts of contacts

Social Network Data

Basic Data Elements: Levels of analysis

3) Complete or “Global” data

- Data on all actors within a particular (relevant) boundary

62 of 124

Ego-Net

Global-Net

Best Friend

Dyad

Primary

Group

Social Network Data

Basic Data Elements: Levels of analysis

2-step

Partial network

63 of 124

Social Network Data

Social network data are substantively divided by the number of modes in the data.

1-mode data represent edges based on direct contact between actors in the network. All the nodes are of the same type (people, organizations, ideas, etc.). Examples: communication, friendship, giving orders, sending email.

This is commonly what people think about when thinking about networks: nodes having direct relations with each other.

64 of 124

Social Network Data

Social network data are substantively divided by the number of modes in the data.

2-mode (bipartite) data represents nodes from two separate classes, where all ties are across classes. Examples:

People as members of groups

People as authors on papers

Words used often by people

Events in the life history of people

The two modes of the data represent a duality: you can project the data as people connected to people through joint membership in a group, or groups to groups through common membership.

There may be multiple relations of multiple types connecting your nodes.

65 of 124

Bipartite networks imply a constraint on the mixing, such that ties only cross classes.

Here we see a tie connecting each woman with the party she attended (Davis data)

Social Network Data

Basic Data Elements: Modes

66 of 124

Social Network Data

Basic Data Elements: Modes

Bipartite networks imply a constraint on the mixing, such that ties only cross classes.

Here we see a tie connecting each woman with the party she attended (Davis data)

Bipartite

Person Projection

Event Projection

67 of 124

Social Network Data

Basic Data Elements: Modes

The number of modes is general:

1-mode (direct): person to person

2-mode (bipartite): person to event

3-mode (tripartite): person to event to event-type

e.g., students, programs, staff

authors, papers, topics

persons, animals, parasites

n-mode

Logically these ideas are clearly extensible. In practice, it’s a question of whether the overhead of analyzing as a mode is better than treating it as metadata on the primary relations of interest.

68 of 124

Social Network Data

Multi-layer networks

A recent generalization of multiplex networks and multi-mode networks is the multi-layer network. A multi-layer network is a network that has qualitatively different classes of nodes (like multi-mode networks) and qualitatively different types of relations (like multiplex). Multiplex and multi-mode networks can be subsumed under the multi-layer formalism.

See Kivelä, Mikko, Alex Arenas, Marc Barthelemy, James P. Gleeson, Yamir Moreno, and Mason A. Porter. 2014. “Multilayer Networks.” Journal of Complex Networks 2:203-271. https://doi.org/10.1093/comnet/cnu016

69 of 124

Social Network Data

Basic Data Elements: summary

70 of 124

Social Network Data

(not so basic) Data Elements

71 of 124

From pictures to matrices

[Figure: sociogram of nodes a through e]

Undirected, binary

     a  b  c  d  e
a    .  1  .  .  .
b    1  .  1  .  .
c    .  1  .  1  1
d    .  .  1  .  1
e    .  .  1  1  .

The matrix corresponding to an undirected graph is symmetric.

The traditional way to store & represent network data is with an adjacency matrix.

The matrix (X) represents an undirected binary network. Each node (a-e) is listed on both the row and the column.

The ith row and the jth column (Xij) records the value of a tie from node i to node j. For example, the line between nodes a and b is represented as an entry in the first row and second column.

Because the graph is undirected, the ties sent are the same as the ties received, so every entry above the diagonal equals the corresponding entry below the diagonal.
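A minimal sketch of building such a matrix from a list of ties, in plain Python. The five ties used here (a-b, b-c, c-d, c-e, d-e) are taken from the arc list shown later in the deck; the variable names are illustrative only.

```python
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]

# Map each node label to its row/column position.
index = {name: i for i, name in enumerate(nodes)}

# Start from an all-zero square matrix, then fill in the ties.
X = [[0] * len(nodes) for _ in nodes]
for i, j in edges:
    X[index[i]][index[j]] = 1
    X[index[j]][index[i]] = 1  # undirected: record the tie in both directions

# Symmetry check: every entry equals its mirror image across the diagonal.
is_symmetric = all(
    X[r][c] == X[c][r] for r in range(len(nodes)) for c in range(len(nodes))
)

for name, row in zip(nodes, X):
    print(name, row)
```

Dropping the mirrored assignment inside the loop is all it takes to store a directed network instead.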

Basic Data Structures

Social Network Data

72 of 124

Directed, binary

[Figure: sociogram of nodes a through e with directed ties]

     a  b  c  d  e
a    .  1  .  .  .
b    1  .  .  .  .
c    .  1  .  1  1
d    .  .  .  .  .
e    .  .  1  1  .

The matrix corresponding to a directed graph is generally asymmetric.

We can see that Xab = 1 and Xba = 1; therefore a “sends” to b and b “sends” to a.

However, Xbc = 0 while Xcb = 1; therefore c “sends” to b, but b does not reciprocate.

Basic Data Structures

Social Network Data

73 of 124

[Figure: adjacency matrix for the directed, valued graph on nodes a through e; the nonzero cells hold the tie values 1, 3, 1, 2, 4, 2, 1]

The matrix corresponding to a directed graph is generally asymmetric.

We can see that Xab = 1 and Xba = 1; therefore a “sends” to b and b “sends” to a.

However, Xbc = 0 while Xcb = 1; therefore c “sends” to b, but b does not reciprocate.

Basic Data Structures

Social Network Data

Directed, Valued

[Figure: sociogram of nodes a through e with valued, directed ties]

74 of 124

From matrices to lists (binary)

     a  b  c  d  e
a    .  1  .  .  .
b    1  .  1  .  .
c    .  1  .  1  1
d    .  .  1  .  1
e    .  .  1  1  .

Adjacency List (each node, followed by its contacts)

a b
b a c
c b d e
d c e
e c d

Arc List (sender, receiver)

a b
b a
b c
c b
c d
c e
d c
d e
e c
e d

Social network analysts also use adjacency lists and arc lists to store network data more efficiently.
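The conversion from a matrix to the two list formats can be sketched in a few lines of plain Python. The matrix below is the undirected binary example from this slide; the variable names are illustrative.

```python
nodes = ["a", "b", "c", "d", "e"]
X = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
]

# Adjacency list: for each node, the contacts found in its row.
adjacency_list = {
    nodes[i]: [nodes[j] for j, value in enumerate(row) if value]
    for i, row in enumerate(X)
}

# Arc list: one (sender, receiver) pair per nonzero cell.
arc_list = [
    (nodes[i], nodes[j])
    for i, row in enumerate(X)
    for j, value in enumerate(row)
    if value
]

print(adjacency_list)
print(arc_list)
```

Both lists store only the nonzero cells, which is why they are more compact than the full matrix when the network is sparse.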

[Figure: sociogram of nodes a through e]

Basic Data Structures

Social Network Data

75 of 124

From matrices to lists (valued)

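     a  b  c  d  e
a    .  1  .  .  .
b    1  .  2  .  .
c    .  2  .  3  5
d    .  .  3  .  1
e    .  .  5  1  .

Adjacency List (each node, followed by its contacts)

a b
b a c
c b d e
d c e
e c d

Arc List (sender, receiver, value)

a b 1
b a 1
b c 2
c b 2
c d 3
c e 5
d c 3
d e 1
e c 5
e d 1

Social network analysts also use adjacency lists and arc lists to store network data more efficiently.

[Figure: sociogram of nodes a through e with tie values 1, 2, 5, 1, 3]

Basic Data Structures

Social Network Data

For valued data, the adjacency list of contacts is paired with a matching list of values:

contact      value
a: b         1
b: a c       1 2
c: b d e     2 3 5
d: c e       3 1
e: c d       5 1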

76 of 124

Working with two-mode data

A person-to-group adjacency matrix is rectangular, with one mode (persons, say) down rows and the other (groups, say) across columns

A =

     1  2  3  4  5
A    0  0  0  0  1
B    1  0  0  0  0
C    1  1  0  0  0
D    0  1  1  1  1
E    0  0  1  0  0
F    0  0  1  1  0

Each column is a group, each row a person, and the cell = 1 if the person in that row belongs to that group.

You can tell how many groups two people both belong to by comparing the rows: Identify every place that both rows = 1, sum them, and you have the overlap.
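The row-comparison rule above is the same as multiplying the matrix by its transpose. A minimal sketch in plain Python, using the person-by-group matrix from this slide (the helper name `overlap` is illustrative):

```python
people = ["A", "B", "C", "D", "E", "F"]
A = [
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
]


def overlap(first, second):
    """Count the positions where both 0/1 vectors equal 1."""
    return sum(x * y for x, y in zip(first, second))


# Person projection (A times A-transpose): shared groups for each pair of people.
person_by_person = [[overlap(ri, rj) for rj in A] for ri in A]

# Group projection (A-transpose times A): shared members for each pair of groups.
group_columns = [list(col) for col in zip(*A)]
group_by_group = [[overlap(ci, cj) for cj in group_columns] for ci in group_columns]

for name, row in zip(people, person_by_person):
    print(name, row)
```

The diagonal of each projection holds a count rather than a tie: the number of groups a person belongs to, or the number of members a group has.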

Basic Data Structures

Social Network Data

77 of 124

[Figure: sociogram of nodes a through e with tie values 1, 2, 5, 1, 3, shown beside its valued adjacency matrix and its binary adjacency matrix; diagonal cells are marked “--” and all remaining empty cells are 0]

 

 

Social Network Metrics:

Volume

 

*In valued graphs, this is sometimes called average tie strength, but I think that’s potentially misleading, as it reads like the average given that the tie is nonzero.

78 of 124

Degree: Number of links adjacent to a node

Social Network Metrics: Basic Volume

Strength: Sum of links adjacent to a node

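Binary (degree); rows send, columns receive:

              a  b  c  d  e   out-degree
a             .  1  .  .  .   1
b             1  .  .  .  .   1
c             .  1  .  1  1   3
d             .  .  .  .  .   0
e             .  .  1  1  .   2
in-degree     1  2  1  2  1   7

Valued (strength); rows send, columns receive:

              a  b  c  d  e   out-strength
a             .  1  .  .  .   1
b             1  .  .  .  .   1
c             .  2  .  3  1   6
d             .  .  .  .  .   0
e             .  .  5  1  .   6
in-strength   1  3  5  4  1   14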
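Degree and strength are just row and column summaries of the matrix. The sketch below uses a valued, directed matrix whose cell placement is a reconstruction chosen to agree with the marginal totals on this slide, so treat the individual cells as an assumption.

```python
nodes = ["a", "b", "c", "d", "e"]

# Rows send, columns receive. Cell placement is an assumed reconstruction.
W = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 2, 0, 3, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 5, 1, 0],
]
n = len(nodes)

# Degree counts nonzero cells; strength sums the cell values.
out_degree = [sum(1 for value in row if value) for row in W]
in_degree = [sum(1 for row in W if row[j]) for j in range(n)]
out_strength = [sum(row) for row in W]
in_strength = [sum(row[j] for row in W) for j in range(n)]

print("out-degree  ", dict(zip(nodes, out_degree)))
print("in-degree   ", dict(zip(nodes, in_degree)))
print("out-strength", dict(zip(nodes, out_strength)))
print("in-strength ", dict(zip(nodes, in_strength)))
```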

Note that in igraph there are different choices:

Mode=

79 of 124

For the central node here:

d = igraph::degree(g, mode = "in")   # central node: 3

d = igraph::degree(g, mode = "out")  # central node: 3

d = igraph::degree(g, mode = "all")  # central node: 6

If you want the number of unique nodes it is adjacent to for “all”, then do:

d = igraph::degree(as.undirected(g), mode = "all")  # central node: 4

If you have multiple relations, each relation counts toward degree. If that’s not the behavior you want, use degree(simplify(g)).

Degree: Number of links adjacent to a node

Social Network Metrics: Basic Volume

While conceptually simple, note some subtle bits in igraph.
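To make the counting rule concrete without depending on any library, here is a plain-Python sketch. It does not call igraph; the edge list is hypothetical, chosen so that the central node c reproduces the counts on this slide (3 in, 3 out, 6 total, 4 unique neighbors).

```python
# Hypothetical directed ties around a central node "c":
# two reciprocated ties (w, x), one sent only (y), one received only (z).
edges = [
    ("c", "w"), ("w", "c"),
    ("c", "x"), ("x", "c"),
    ("c", "y"),
    ("z", "c"),
]
node = "c"

in_degree = sum(1 for sender, receiver in edges if receiver == node)
out_degree = sum(1 for sender, receiver in edges if sender == node)

# Counting "all" ties adds the two, so each reciprocated tie counts twice.
all_degree = in_degree + out_degree

# Collapsing direction first counts each neighbor once.
unique_neighbors = (
    {receiver for sender, receiver in edges if sender == node}
    | {sender for sender, receiver in edges if receiver == node}
)

print(in_degree, out_degree, all_degree, len(unique_neighbors))
```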

80 of 124

Social Network Metrics: Basic Volume

Degree distribution underlies most everything else in the network. Be sure you know your degree distribution and that it’s sensible

81 of 124

Social Network Metrics: Basic Volume

Degree distribution underlies most everything else in the network. Be sure you know your degree distribution and that it’s sensible

82 of 124

Social Network Metrics: Basic Volume

Degree distribution underlies most everything else in the network. Be sure you know your degree distribution and that it’s sensible

83 of 124

Network Building Blocks

Dyad Census & Reciprocity

 

Where g is the number of nodes, M is the number of mutual dyads, L is the number of lines and L2 is the sum of the squared degree distribution.

Rho: 0.38

84 of 124

Network Building Blocks

Dyad Census & reciprocity

We can combine directionality and degree:

A quick-and-simple measure of prominence would be in-degree net of mutual ties

A similar measure of gregariousness would be out-degree net of mutual ties.

A similar measure of tendencies toward intimacy would be the node-level proportion of ties that are mutual.
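A minimal sketch of the dyad census and the two "net of mutual ties" measures, in plain Python. The seven directed ties are a hypothetical five-node example, and the names `prominence` and `gregariousness` simply follow the wording above.

```python
from itertools import combinations

nodes = ["a", "b", "c", "d", "e"]
arcs = {
    ("a", "b"), ("b", "a"),
    ("c", "b"), ("c", "d"),
    ("c", "e"), ("e", "c"),
    ("e", "d"),
}

# Dyad census: classify every unordered pair as Mutual, Asymmetric, or Null.
mutual = asymmetric = null = 0
for i, j in combinations(nodes, 2):
    sends, receives = (i, j) in arcs, (j, i) in arcs
    if sends and receives:
        mutual += 1
    elif sends or receives:
        asymmetric += 1
    else:
        null += 1

# In-degree net of mutual ties: nominations received but not returned.
prominence = {
    n: sum(1 for s, t in arcs if t == n and (n, s) not in arcs) for n in nodes
}

# Out-degree net of mutual ties: nominations sent but not returned.
gregariousness = {
    n: sum(1 for s, t in arcs if s == n and (t, n) not in arcs) for n in nodes
}

print(mutual, asymmetric, null)
print(prominence)
print(gregariousness)
```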

85 of 124

Empirically, we also rarely have symmetric relations (at least on affect), thus we need to identify balance in directed relations. Directed dyads can be in one of three states:

1) Mutual

2) Asymmetric

3) Null

Every triad is composed of 3 dyads, and we can identify triads based on the number of each type, called the MAN label system.

There are 16 possible triads on a binary directed graph:

Network Building Blocks

Triad Census and Hierarchy

86 of 124

[Figure: the 16 triad types, grouped by number of arcs]

(0) 003
(1) 012
(2) 102, 021D, 021U, 021C
(3) 111D, 111U, 030T, 030C
(4) 201, 120D, 120U, 120C
(5) 210
(6) 300

Triad Census: The periodic table of social elements

Network Building Blocks

Triad Census & reciprocity

87 of 124

[Figure: the 16 triad types (003 through 300), grouped by number of arcs from 0 to 6]

Intransitive

Transitive

Mixed

Triad Census: The periodic table of social elements

Network Building Blocks

Triad Census & reciprocity

88 of 124

[Figure: the 16 triad types (003 through 300), grouped by number of arcs from 0 to 6]

16 directed triads

“A friend of a friend is a friend”

Triads also provide a tight coupling between behavior rules and (local) structure

Triad Census: The periodic table of social elements

Network Building Blocks

Triad Distribution

89 of 124

[Figure: the 16 triad types (003 through 300), grouped by number of arcs from 0 to 6]

16 directed triads

“Hierarchical agreement”

Triads also provide a tight coupling between behavior rules and (local) structure

Triad Census: The periodic table of social elements

Network Building Blocks

Triad Distribution

90 of 124

[Figure: the 16 triad types (003 through 300), grouped by number of arcs from 0 to 6]

16 directed triads

“Reciprocity”

Triads also provide a tight coupling between behavior rules and (local) structure

Triad Census: The periodic table of social elements

Network Building Blocks

Triad Distribution

91 of 124

An Example of the triad census

Network Building Blocks

Triad Distribution

92 of 124

Network Building Blocks

Transitivity Scores

Intuitively, “transitivity” and “clustering” are metrics for the tendency to observe closed triads. Transitivity refers specifically to a consistent ordering.

030T

120U

300

Lots of these:

021D

021C

111U

Few of these:

030C

120C

210

These make my head hurt

93 of 124

Network Building Blocks

Transitivity Scores

Watts clustering coefficient:

Classic Transitivity ratio:

transitivity(g, type = "ratio")

transitivity(g, type = "global")

Original (1998) clustering coefficient:

C_i = \frac{2 e_i}{k_i (k_i - 1)}

Where k_i is the degree of node i, and e_i is the number of edges between its neighbors. C_i = ego-network density.

mean(transitivity(g, type = "local"))
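The two ideas can be sketched by hand in plain Python: local clustering is the density among a node's neighbors, and the transitivity ratio is the share of two-step paths that are closed. The five-tie undirected graph is the running a-e example; leaving the score undefined (None) for nodes with fewer than two neighbors is a choice made here, not necessarily what any given package does.

```python
from itertools import combinations

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]

neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)


def closed_pairs(node):
    """Number of neighbor pairs of `node` that are themselves tied."""
    pairs = combinations(sorted(neighbors[node]), 2)
    return sum(1 for u, v in pairs if v in neighbors[u])


def local_clustering(node):
    """C_i = 2 e_i / (k_i (k_i - 1)); None when k_i < 2."""
    k = len(neighbors[node])
    if k < 2:
        return None
    return 2 * closed_pairs(node) / (k * (k - 1))


local = {n: local_clustering(n) for n in sorted(neighbors)}

# Transitivity ratio: closed two-paths divided by all two-paths.
closed = sum(closed_pairs(n) for n in neighbors)
two_paths = sum(len(nb) * (len(nb) - 1) // 2 for nb in neighbors.values())
transitivity_ratio = closed / two_paths

defined = [c for c in local.values() if c is not None]
mean_local = sum(defined) / len(defined)

print(local)
print(transitivity_ratio, mean_local)
```

The two summaries differ because the mean of local scores weights every node equally, while the ratio weights nodes by how many two-paths pass through them.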

94 of 124

Measuring Networks: Connectivity

Redundancy (Local)

Local redundancy is known as “clustering” or “transitivity” - that one’s friends are friends with each other.

Density is the proportion of pairs tied, excluding ego.

Transitivity is the proportion of two-step ties that are closed (Friend of a Friend is a friend)

Density

Transitivity

Transitivity

No ego

0 0 0

0.4 0.71 1.0

1 1 1

0.7 0.78 .64

95 of 124

Structural Indices based on the distribution of triads

The observed distribution of triads can be fit to hypothesized structures using weighting vectors for each type of triad.

\tau(l) = \frac{l^{\top} T - l^{\top} \mu_T}{\sqrt{l^{\top} \Sigma_T \, l}}

Where:

l = 16 element weighting vector for the triad types

T = the observed triad census

μT= the expected value of T

ΣT = the variance-covariance matrix for T

Network Building Blocks

Triad Distribution

96 of 124

Triad Micro-Models:

BA: Balance (Cartwright and Harary, ’56)
CL: Clustering Model (Davis, ’67)
RC: Ranked Cluster (Davis & Leinhardt, ’72)
R2C: Ranked 2-Clusters (Johnsen, ’85)
TR: Transitivity (Davis and Leinhardt, ’71)
HC: Hierarchical Cliques (Johnsen, ’85)
39+: Model that fits D&L’s 742 matrices with N: 39-72
p1-p4: Process Agreement Models (Johnsen, 1986)

[Table: weighting vectors, with one row per triad type (003, 012, 102, 021D, 021U, 021C, 111D, 111U, 030T, 030C, 201, 120D, 120U, 120C, 210, 300) and one column per model (BA, CL, RC, R2C, TR, HC, 39+, p1, p2, p3, p4)]

Network Building Blocks

Weighting vectors

97 of 124

PRC{300,102, 003, 120D, 120U, 030T, 021D, 021U} Ranked Cluster:

[Figure: ranked-cluster blockmodel with blocks labeled M, N*, and A*, shown beside the corresponding 1/0 image matrix]

And many more...

Network Building Blocks

Triad Distribution

98 of 124

Indirect connections are what make networks systems. One actor can reach another if there is a path in the graph connecting them.

[Figure: example sociograms on nodes a through f]

In a directed graph, paths are directed, leading to a distinction between strong and weak reachability.

Social Network Metrics: Connectivity

99 of 124

Reachability

If you can trace a sequence of relations from one actor to another, then the two are reachable. If there is at least one path connecting every pair of actors in the graph, the graph is connected and is called a component.

Intuitively, a component is the set of people who are all connected by a chain of relations.

Social Network Metrics: Connectivity

100 of 124

This example contains many components.

Social Network Metrics: Connectivity

101 of 124

Because relations can be directed or undirected, components come in two flavors. For a graph with any directed edges:

Strong components consist of the set(s) of all nodes that are mutually reachable.

Weak components consist of the set(s) of all nodes that are connected when the direction of the ties is ignored.
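A small plain-Python sketch of both definitions, assuming a hypothetical seven-node directed graph. Strong components are found the slow but transparent way (nodes that can both reach and be reached by a given node); production software uses faster algorithms.

```python
nodes = list("abcdefg")
arcs = [
    ("a", "b"), ("b", "a"),
    ("c", "b"), ("c", "d"),
    ("c", "e"), ("e", "c"),
    ("e", "d"), ("f", "g"),
]

out_nb = {n: set() for n in nodes}
in_nb = {n: set() for n in nodes}
for sender, receiver in arcs:
    out_nb[sender].add(receiver)
    in_nb[receiver].add(sender)


def reachable(start, nb):
    """All nodes reachable from `start` by following the ties in `nb`."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in nb[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Weak components: ignore direction, then trace connectivity.
either_nb = {n: out_nb[n] | in_nb[n] for n in nodes}
weak = {frozenset(reachable(n, either_nb)) for n in nodes}

# Strong components: nodes n can reach AND that can reach n.
strong = {
    frozenset(reachable(n, out_nb) & reachable(n, in_nb)) for n in nodes
}

print(sorted(sorted(c) for c in weak))
print(sorted(sorted(c) for c in strong))
```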

Social Network Metrics: Connectivity

102 of 124

There are only 2 strong components with more than 1 person in this network.

Components are the minimum requirement for social groups. As we will see later, they are necessary but not sufficient

All of the major network analysis software identifies strong and weak components

Social Network Metrics: Connectivity

103 of 124

Partner

Distribution

Component

Size/Shape

Emergent Connectivity in “low-degree” networks

Example: Small local changes can create cohesion cascades

Based on work supported by R21-HD072810 (NICHD, Moody PI), R01 DA012831-05 (NIDA Morris, Martina PI)

104 of 124

Connections: Diffusion

Average degree, degree distribution & connectivity

105 of 124

[Figure: sociogram with actor a highlighted]

Geodesic distance is measured by the smallest (weighted) number of relations separating a pair:

Actor “a” is:

1 step from 4 nodes

2 steps from 5 nodes

3 steps from 4 nodes

4 steps from 3 nodes

5 steps from 1 node

Social Network Metrics: Connectivity

Distance

106 of 124

     a  b  c  d  e  f  g  h  i  j  k  l  m
    --------------------------------------
a.   .  1  2  .  .  .  .  .  .  .  .  2  1
b.   3  .  1  .  .  .  .  .  .  .  .  1  2
c.   .  .  .  .  .  .  .  .  .  .  .  .  .
d.   4  3  1  .  1  2  1  .  2  .  .  2  3
e.   3  2  2  1  .  1  2  .  1  .  .  1  2
f.   4  3  3  2  1  .  3  .  2  .  .  2  3
g.   5  4  4  3  2  1  .  .  3  .  .  3  4
h.   .  .  .  .  .  .  .  .  1  .  .  .  .
i.   .  .  .  .  .  .  .  .  .  .  .  .  .
j.   .  .  .  .  .  .  .  .  1  .  .  .  .
k.   .  .  .  .  .  .  .  .  1  .  .  .  .
l.   2  1  2  .  .  .  .  .  .  .  .  .  1
m.   1  2  3  .  .  .  .  .  .  .  .  1  .

[Figure: directed sociogram of nodes a through m]

When the graph is directed, distance is also directed (distance to vs distance from), following the direction of the tie.

Social Network Metrics: Connectivity

Distance

107 of 124

As a graph statistic, the distribution of distance can tell you a good deal about how close people are to each other (we’ll see this more fully when we get to closeness centrality).

The diameter of a graph is the longest geodesic, giving the maximum distance. We often use l, the mean distance between every pair, to characterize the entire graph.

For example, all else equal, we would expect rumors to travel faster through settings where the average distance is small.

Geodesic distance is just the simplest kind of distance – there are others, and they are interesting!
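For an unweighted graph, geodesic distances come from a breadth-first search. The sketch below uses the small undirected a-e example from earlier in the deck and assumes the graph is connected, so every pair has a finite distance.

```python
from collections import deque
from itertools import combinations

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]

neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)


def bfs_distances(start):
    """Geodesic distance (number of steps) from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


all_dist = {n: bfs_distances(n) for n in neighbors}

# Summaries over every unordered pair (valid here because the graph is connected).
pair_distances = [all_dist[u][v] for u, v in combinations(sorted(neighbors), 2)]
diameter = max(pair_distances)
mean_distance = sum(pair_distances) / len(pair_distances)

print(all_dist["a"])
print(diameter, mean_distance)
```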

Measuring Networks: Connectivity

Distance

108 of 124

Node Connectivity

As size of cut-set

[Figure: four example graphs with node connectivity 0, 1, 2, and 3]

Structural Cohesion:

A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.
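The cut-set definition can be checked directly on small graphs by brute force: try removing every set of k nodes, for increasing k, until the remainder falls apart. This is only a sketch for intuition; it scales very poorly, and the function names and example graphs are illustrative assumptions.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """True if every node in `nodes` can be reached from the first one."""
    nodes = list(nodes)
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nxt in neighbors[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)


def node_connectivity(nodes, edges):
    """Smallest number of nodes whose removal disconnects the graph."""
    for k in range(len(nodes) - 1):
        for removed in combinations(nodes, k):
            kept = [n for n in nodes if n not in removed]
            kept_edges = [
                (u, v) for u, v in edges if u not in removed and v not in removed
            ]
            if not is_connected(kept, kept_edges):
                return k
    # Nothing short of removing all but one node works (a complete graph).
    return len(nodes) - 1


pendant = node_connectivity(
    list("abcde"), [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]
)
ring = node_connectivity(
    list("wxyz"), [("w", "x"), ("x", "y"), ("y", "z"), ("z", "w")]
)
print(pendant, ring)
```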

Measuring Networks: Connectivity

Redundancy Global: Structural Cohesion

109 of 124

Node Connectivity

As number of node-independent paths

[Figure: four example graphs with node connectivity 0, 1, 2, and 3]

Measuring Networks: Connectivity

Redundancy Global: Structural Cohesion

Structural Cohesion:

A network’s structural cohesion is equal to the minimum number of actors who, if removed from the network, would disconnect it.

110 of 124

[Figure: a sociogram shown beside its cohesive blocks and their nestedness structure, arranged by depth]

Cohesive Blocking

The arrangement of successively more connected sets by branches and depth uniquely characterizes the connectivity structure of a network

Measuring Networks: Connectivity

Redundancy Global: Structural Cohesion

111 of 124

Conceptually, centrality identifies nodes in the ‘center’ of the network.

In practice, ‘center’ is complicated, as there are multiple dimensions to be central on.

The standard centrality measures capture a wide range of “importance” in a network:

      • Degree
      • Closeness
      • Betweenness
      • Eigenvector / Power measures

Measuring Networks

Centrality

112 of 124

Measuring Networks

Centrality

…and just a teaser of all the elements we are leaving out:

Schoch lists over 100 different measures…

113 of 124

Measuring Networks

Centrality

Degree Centrality: Number of lines adjacent to each node. Captures local importance.

Strength: Sum of the edges adjacent to each node

In-degree: number of lines pointing at node

Out-degree: number of lines pointing from node

mutual-degree: number of reciprocated ties

Closeness Centrality: inverse of the average number of steps from a node to every other node

Usually geodesic distance, so sensitive to random ties

Betweenness Centrality: Number of times a node sits on the shortest paths between all other nodes.

Captures ability to bridge different parts of the network

Eigenvector/Power Centrality: Degree weighted by the degree (weighted by the degree…) of nodes each node is connected to.

Being central amongst the central.
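Two of these definitions are short enough to compute by hand. The sketch below covers degree and closeness only, on the small undirected a-e example, and assumes a connected, unweighted graph; betweenness and eigenvector centrality need more machinery and are left to the packages.

```python
from collections import deque

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]

neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)


def bfs_distances(start):
    """Number of steps from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Degree centrality: number of lines adjacent to each node.
degree = {n: len(nb) for n, nb in neighbors.items()}

# Closeness centrality: inverse of the average distance to every other node.
closeness = {}
for n in neighbors:
    dist = bfs_distances(n)
    closeness[n] = (len(neighbors) - 1) / sum(dist.values())

most_central = max(closeness, key=closeness.get)
print(degree)
print(closeness, most_central)
```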

114 of 124

Be careful using igraph for centrality on weighted networks. It probably doesn’t do what you think it does.

Measuring Networks

Centrality

In many cases, igraph treats weights as “costs” so walks through the graph cumulate – like miles in a trip.

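[Figure: weighted graph on nodes a, b, c, d; the edges on the a-b-c path have weights 5 and 8, and the edges on the a-d-c path have weights 1 and 2]

The a-b-c path will be length 13, but the a-d-c path will be length 3. If you want to represent who is closer (i.e., weights are strengths), then “13” is “closer” than “3.”

So invert the weights first for path-based measures (betweenness and closeness).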
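The effect of inverting is easy to see with a hand-rolled shortest-path search (Dijkstra's algorithm) in plain Python. This is not igraph code; which edge carries which weight is an assumption (a-b = 5, b-c = 8, a-d = 1, d-c = 2) that matches the path lengths of 13 and 3 above, and 1/weight is only one possible inversion.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm: return (total cost, path), treating edge values as costs."""
    queue = [(0.0, source, [source])]
    done = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for nxt, weight in graph[node].items():
            if nxt not in done:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []


# Tie strengths (assumed placement of the four weights).
strength = {
    "a": {"b": 5, "d": 1},
    "b": {"a": 5, "c": 8},
    "c": {"b": 8, "d": 2},
    "d": {"a": 1, "c": 2},
}

# Read as costs, the weak a-d-c route looks "shortest".
cost_naive, path_naive = shortest_path(strength, "a", "c")

# Invert so that strong ties become short distances.
inverted = {u: {v: 1 / w for v, w in nb.items()} for u, nb in strength.items()}
cost_inverted, path_inverted = shortest_path(inverted, "a", "c")

print(path_naive, cost_naive)
print(path_inverted, round(cost_inverted, 3))
```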

115 of 124

Homophily

116 of 124

Homophily is the tendency for social contacts to be similar.

117 of 124

Homophily is the tendency for social contacts to be similar.

118 of 124

Homophily is the tendency for social contacts to be similar.

Grade Mixing Matrix

Internal ties

External ties

119 of 124

Measuring Homophily

 

A value of -1 indicates perfect homophily (all ties are internal), a value of 1 indicates perfect heterophily (all ties are external), and a value near 0 indicates a balance between homophilous and heterophilous ties.

 

Within-group fraction

E-I Index

Assortativity index

Edge-wise correlation of attributes
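The E-I index described above is (E − I) / (E + I), where E is the number of external (cross-group) ties and I the number of internal (within-group) ties. A minimal sketch, using a hypothetical six-node network with two groups:

```python
def ei_index(edges, group):
    """(E - I) / (E + I): -1 when all ties are internal, +1 when all are external."""
    external = sum(1 for u, v in edges if group[u] != group[v])
    internal = len(edges) - external
    return (external - internal) / (external + internal)


# Hypothetical data: two groups of three, four within-group ties, one bridge.
group = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

within_fraction = sum(1 for u, v in edges if group[u] == group[v]) / len(edges)
ei = ei_index(edges, group)

print(within_fraction, ei)
```

Note that neither number adjusts for group sizes, which is what the segregation index and odds ratios on the following slides address.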

120 of 124

Segregation Index

(Freeman, L. C. 1978. "Segregation in Social Networks." Sociological Methods and Research 6:411-30.)

Freeman asked how we could identify segregation in a social network. Theoretically, he argues, if a given attribute (group label) does not matter for social relations, then relations should be distributed randomly with respect to the attribute. Thus, the difference between the number of cross-group ties expected by chance and the number observed measures segregation.

 

Measuring Homophily

(Analogous to a chi-square test on a contingency table)

Calculation notes in the hidden slides.
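One way to read the verbal definition above is S = (E* − E) / E*, where E is the observed number of cross-group ties and E* is the number expected if ties fell at random across all pairs. The sketch below follows that reading for an undirected network; it is an interpretation of the description, not a transcription of the hidden-slide calculation, and the data are hypothetical.

```python
from itertools import combinations


def segregation_index(edges, group):
    """(expected cross-group ties - observed) / expected, under random mixing."""
    pairs = list(combinations(sorted(group), 2))
    cross_pairs = sum(1 for u, v in pairs if group[u] != group[v])
    # Under random mixing, ties land on cross-group pairs in proportion
    # to the share of all pairs that are cross-group.
    expected_cross = len(edges) * cross_pairs / len(pairs)
    observed_cross = sum(1 for u, v in edges if group[u] != group[v])
    return (expected_cross - observed_cross) / expected_cross


# Hypothetical data: two groups of three, four within-group ties, one bridge.
group = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

s = segregation_index(edges, group)
print(round(s, 3))
```

The index is undefined when no cross-group pairs exist (for example, a single group), since the expected count is then zero.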

121 of 124

One problem with the segregation index is that it is not ‘margin free.’ That is, if you were to change the distribution of the category of interest (say race) by a constant but not the core association between race and friendship choice, you can get a different segregation level.

One antidote to this problem is to use odds ratios. In this case, an odds ratio tells us the relative likelihood that two people in the same category will choose each other as friends. This is familiar from logistic regression.
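A sketch of the calculation for an undirected network: cross-classify every pair by same or different category and by tied or not tied, then take the ratio of the two odds. The function name and data are hypothetical, and the sketch assumes none of the four cells that appear in a denominator is empty.

```python
from itertools import combinations


def homophily_odds_ratio(edges, group):
    """Odds of a tie among same-category pairs over odds among cross-category pairs."""
    tied = {frozenset(edge) for edge in edges}
    same_tied = same_untied = cross_tied = cross_untied = 0
    for u, v in combinations(sorted(group), 2):
        is_tied = frozenset((u, v)) in tied
        if group[u] == group[v]:
            same_tied += is_tied
            same_untied += not is_tied
        else:
            cross_tied += is_tied
            cross_untied += not is_tied
    return (same_tied / same_untied) / (cross_tied / cross_untied)


# Hypothetical data: two groups of three, four within-group ties, one bridge.
group = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

odds_ratio = homophily_odds_ratio(edges, group)
print(odds_ratio)
```

Here same-category pairs have odds of 4 to 2 and cross-category pairs have odds of 1 to 8, so a same-category pair is far more likely to be tied.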

Odds Ratios

Measuring Homophily

122 of 124

We can also measure the extent to which ties fall within groups with the modularity score. This is currently the most common score; it is a linear transform of the segregation index.

Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)

Where:

m is the number of edges

k_i is the degree of node i

A_ij is the edge weight between i and j

δ(c_i, c_j) is 1 if i and j are in the same group (0 otherwise)

γ is the resolution parameter

Modularity of 0 means random mixing. This has become standard for community detection.
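Using the definitions listed above (m, k, A, δ, γ), the standard form of the score is Q = (1/2m) Σ_ij [A_ij − γ k_i k_j / 2m] δ(c_i, c_j). The plain-Python sketch below evaluates that sum directly, assuming an unweighted, undirected graph with no self-loops; the data are hypothetical.

```python
def modularity(edges, community, gamma=1.0):
    """Q = (1/2m) * sum_ij [A_ij - gamma * k_i * k_j / (2m)] * delta(c_i, c_j)."""
    m = len(edges)
    degree = {n: 0 for n in community}
    adjacent = set()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adjacent.add((u, v))
        adjacent.add((v, u))

    total = 0.0
    for i in community:
        for j in community:
            if community[i] == community[j]:  # delta(c_i, c_j) = 1
                a_ij = 1 if (i, j) in adjacent else 0
                total += a_ij - gamma * degree[i] * degree[j] / (2 * m)
    return total / (2 * m)


# Hypothetical data: two groups of three, four within-group ties, one bridge.
community = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

q_two_groups = modularity(edges, community)
q_one_group = modularity(edges, {n: "all" for n in community})
print(round(q_two_groups, 3), round(q_one_group, 3))
```

Putting every node in a single group gives a score of zero, which is the "random mixing" baseline mentioned above.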

Modularity

 

123 of 124

Questions?

124 of 124

DNAC RAs: Madelynn Wellons

Gabriel Varela

DNAC/SN&H Alums: Dana Pasquale

Tom Wolff

Liann Tucker

Rhodes iiD

Kathy Peterson

DUPRI

Bernie Fischer

Linda Simpson

SSRI

NICHD: 5R25HD079352

SN&H Program

Thank you!!