1 of 14

Chapter 7 – Models of Network Formation – Part 1

Based on the textbook “The Little Book of Networks” by Professor Jerry Scripps

2 of 14

Networks are dynamic

Links and nodes are added and removed

Attributes are modified

Groups are continually changing

3 of 14

4 of 14

How did networks evolve?

  • There are several models
  • Models gain acceptance based on:
    • Their simplicity
    • How well they explain characteristics of physical world networks

5 of 14

Network models are similar to statistical distributions

  • Both network models and statistical distributions are simplified, idealized mathematical explanations for the observances in the real world.
  • Network models tell us what to expect in the network.
  • Statistical distributions are characterized by some key parameters
  • Similarly, some metrics in a network may help us identify the model that helps us understand how the networks was generated.

6 of 14

We will examine:

  • The characteristics of the networks that obey the model
  • The generating functions
  • Generating functions are mathematical formulas or descriptions for creating networks.
  • Generating functions are probabilistic.
  • The generated networks are unique, but they have the characteristics provided to the generating function.

7 of 14

7.1 The Random Model

  • This is the oldest model.
  • It was defined by Erdös and Enyi in 1959.
  • A random network has the characteristics one would expect in a network where the links are placed randomly between nodes.

8 of 14

Generating functions

  • There are two common generating functions.
  • The first one:
    • G(n,m) has two parameters
      • n: the number of nodes
      • m: The number of links
    • The function operates by placing the m links, one at a time, between two nodes chosen uniformly at random.

9 of 14

The second generating function

  • The second generating function G(n,p) is similar to the first one
  • Preferred because it has some helpful mathematically provable characteristics.
  • p here is the probability of link
  • For each possible link between all n(n-1)/ 2 pairs of nodes, a link is placed with a probability p.
  • For every pair of nodes:
    • A random number between 0 and 1 is drawn.
    • If it is less than p, a links is inserted between that pair of nodes

10 of 14

G(n,p)

Suppose n = 10, 10 nodes

P = 0.1 , predetermined

The function first checks for two nodes v1 and v2

Choose a random number between 1 and 0

suppose you choose 0.03

0.03 < p (0.1), so insert an edge between v1 and v2

And keep doing this for all pairs of nodes

11 of 14

Characteristics of random networks

  • The mean degree c is (n-1)p
  • The degree distribution follows a Poisson distribution
  • The average clustering coefficient
    • C = c / (n-1)
  • The diameter is a + (ln n/ln c) where a is a constant

12 of 14

Characteristics of random networks G(n,p)

  • The mean degree c is (n-1)p
  • How?

If p = 0.1, that means out of all possible edges,

only 10 percent will be generated

How many edges in total?

(n(n-1) / 2) * p

Let’s call this TotalE

Every edge adds 2 degrees for two nodes at both ends

Total degree is 1+1 = 2

Only 1 edge

So total degree is #total edges * 2 = TotalE * 2 = (n(n-1)/2) * p* 2 = n(n-1)p

Total nodes = n. so average degree is n(n-1)p / n = (n-1)p

13 of 14

Worth noting:

  • The clustering coefficient is small.
  • Take a network with n = 100,000 with average degree c = 100
  • Then the clustering coefficient would be ~ 0.001
  • The diameter is also quite small
  • Take a network with 7 billion nodes (the earth’s population)
  • Assume an average degree of 1,000
  • The diameter would be approximately 3.33

14 of 14

Not very useful for network miners

  • Random networks are of interest to mathematicians
  • But they do not describe well (most) real world networks
  • Hence they are not popular with network miners or analysts