1 of 14

Chapter 7 – Models of Network Formation – Part 1

Based on the textbook “The Little Book of Networks” by Professor Jerry Scripps

2 of 14

Networks are dynamic

Links and nodes are added and removed

Attributes are modified

Groups are continually changing

4 of 14

How did networks evolve?

There are several models
Models gain acceptance based on:

Their simplicity
How well they explain characteristics of physical world networks

5 of 14

Network models are similar to statistical distributions

Both network models and statistical distributions are simplified, idealized mathematical explanations for the observances in the real world.
Network models tell us what to expect in the network.
Statistical distributions are characterized by some key parameters
Similarly, some metrics in a network may help us identify the model that helps us understand how the networks was generated.

6 of 14

We will examine:

The characteristics of the networks that obey the model
The generating functions
Generating functions are mathematical formulas or descriptions for creating networks.
Generating functions are probabilistic.
The generated networks are unique, but they have the characteristics provided to the generating function.

7 of 14

7.1 The Random Model

This is the oldest model.
It was defined by Erdös and Enyi in 1959.
A random network has the characteristics one would expect in a network where the links are placed randomly between nodes.

8 of 14

Generating functions

There are two common generating functions.
The first one:

G(n,m) has two parameters

n: the number of nodes
m: The number of links

The function operates by placing the m links, one at a time, between two nodes chosen uniformly at random.

9 of 14

The second generating function

The second generating function G(n,p) is similar to the first one
Preferred because it has some helpful mathematically provable characteristics.
p here is the probability of link
For each possible link between all n(n-1)/ 2 pairs of nodes, a link is placed with a probability p.
For every pair of nodes:

A random number between 0 and 1 is drawn.
If it is less than p, a links is inserted between that pair of nodes

10 of 14

G(n,p)

Suppose n = 10, 10 nodes

P = 0.1 , predetermined

The function first checks for two nodes v1 and v2

Choose a random number between 1 and 0

suppose you choose 0.03

0.03 < p (0.1), so insert an edge between v1 and v2

And keep doing this for all pairs of nodes

11 of 14

Characteristics of random networks

The mean degree c is (n-1)p
The degree distribution follows a Poisson distribution
The average clustering coefficient

C = c / (n-1)

The diameter is a + (ln n/ln c) where a is a constant

12 of 14

Characteristics of random networks G(n,p)

The mean degree c is (n-1)p
How?

If p = 0.1, that means out of all possible edges,

only 10 percent will be generated

How many edges in total?

(n(n-1) / 2) * p

Let’s call this TotalE

Every edge adds 2 degrees for two nodes at both ends

Total degree is 1+1 = 2

Only 1 edge

So total degree is #total edges * 2 = TotalE * 2 = (n(n-1)/2) * p* 2 = n(n-1)p

Total nodes = n. so average degree is n(n-1)p / n = (n-1)p

13 of 14

Worth noting:

The clustering coefficient is small.
Take a network with n = 100,000 with average degree c = 100
Then the clustering coefficient would be ~ 0.001
The diameter is also quite small
Take a network with 7 billion nodes (the earth’s population)
Assume an average degree of 1,000
The diameter would be approximately 3.33

14 of 14

Not very useful for network miners

Random networks are of interest to mathematicians
But they do not describe well (most) real world networks
Hence they are not popular with network miners or analysts