Peer to Peer Networking

An Exercise In Abstraction

Gabe Appleton

The Goal

I set out to make a set of libraries/modules which:

  • Implement multiple types of peer-to-peer networking
  • Abstract it so that developers don’t need to manage it themselves
  • Have it work in as many languages as possible

2

The Proof Of Concept

For the first try, I decided to make an unorganized network. The idea is that a node will try to stay connected to multiple peers, and follows simple rules about how to propagate messages.

This eliminates pathfinding from consideration, but also means that distributed data storage is off limits without sacrificing efficiency, or making intrusive modifications.

(I have extensions which do both)

3

Similar Networks

4

Interface

I have made two implementations of this so far. One in Python, one in Javascript.

5

Python

>>> from py2p import mesh

Make a socket:

>>> sock = mesh.mesh_socket(‘localhost’, 4444)

Connect to network:

>>> sock.connect(‘localhost’, 5555)

Send a message:

>>> sock.send(‘hello’, ‘are’, ‘you’, ‘there?’)

Receive a message:

>>> msg = sock.recv()
>>> msgs = sock.recv(10)

Javascript

> const mesh = require(‘js2p’).mesh;

Make a socket:

> let sock = new mesh.mesh_socket(‘localhost’, 4444);

Connect to network:

> sock.connect(‘localhost’, 5555);

Send a message:

> sock.send([‘hello’, ‘are’, ‘you’, ‘there?’]);

Receive a message:

> var msg = sock.recv();

> var msgs = sock.recv(10);

Note that the highlighted is one of THREE API differences

What’s Underneath?

  • A wrapper object (mesh_socket)
    • Provides user interface
    • Stores routing, protocol data
    • Has message handlers
  • A daemon object (mesh_daemon)
    • Receives data, new connections
    • Runs message handlers
  • Many connection objects (mesh_connection)
    • Abstracts serialization methods
    • Handles message sending, forwarding
    • Stores connection state
      • Compression state
      • Last message sent

6

I’ll be talking about the Python implementation for the most part. Every time you hear me say “daemon”, replacing that with “callback from connections” would get you the Javascript way.

How It Sets Up

Each node must connect to at least one other node on the network

When a node connects or receives a connection, it sends a handshake. This includes information like: the node’s ID, its supported compression methods, and its peerlist

When you receive a peerlist, connect to nodes until you hit a predefined limit

Do not connect to the same node twice

7

An Example

We’re going to make a network of eight nodes, going clockwise

8

0

4

5

1

6

3

2

7

An Example

First 0 will make a connection to 1

9

0

4

5

1

6

3

2

7

An Example

Then 1 will make a connection to 2

10

0

4

5

1

6

3

2

7

An Example

1 will send its peerlist to 2. 2 will then make a connection to 0.

11

0

4

5

1

6

3

2

7

An Example

The same process will repeat when 2 connects to 3.

12

0

4

5

1

6

3

2

7

An Example

13

0

4

5

1

6

3

2

7

An Example

14

0

4

5

1

6

3

2

7

An Example

6 will run amuck of the outgoing connection limit

15

0

4

5

1

6

3

2

7

An Example

As will 7

16

0

4

5

1

6

3

2

7

The Three Rules of Propagation

  • The first node to broadcast sends the message to all its peers
  • Each node which receives a message relays the message to each peer but the node which sent it
  • Nodes do not relay a message they have seen before

17

Propagation Example

Let’s see a message send on a loosely connected network

18

0

4

5

1

6

3

2

7

Propagation Example

First node 0 sends the message to its peers

19

0

4

5

1

6

3

2

7

Propagation Example

These each relay.

1 -> 2, 3, 7

2 -> 1, 3, 4, 7

20

0

4

5

1

6

3

2

7

Propagation Example

3 -> 1/2, 5

4 -> 3, 5, 6

7 -> 1/2, 6

21

0

4

5

1

6

3

2

7

Each node has at least two viable paths for a message

The Last Example

Let’s say 6 would like to reply to 0’s message privately. How?

22

0

4

5

1

6

3

2

7

The Last Example

6 would send a ‘request’ to its peers for the address of ID 0

23

0

4

5

1

6

3

2

7

The Last Example

1 and 2 both know 0’s address. Let’s assume 2 replies first.

24

0

4

5

1

6

3

2

7

The Last Example

The route back to 6 is then traced out with ‘whisper’s

25

0

4

5

1

6

3

2

7

The Last Example

When 6 receives this, it will connect to 0

26

0

4

5

1

6

3

2

7

The Last Example

Then directly send its message

27

0

4

5

1

6

3

2

7

The Last Example

They also exchange peer lists, strengthening the network

28

0

4

5

1

6

3

2

7

How is This Useful?

  • As an alert network
    • “I finished this job! Move on to the next one.”
    • Useful because it’s reliable
    • Bitcoin uses it like this
  • As a message board
    • More robust than IRC
    • Decentralized
  • For Botnets/Worms
    • Web crawlers are very useful, and it’s good to have them on several systems
  • File and association sharing
    • Gnutella uses it like this

29

Potential Drawbacks

  • Full browser support likely unworkable
    • W3 has been working on TCP in browser for 12 years
    • Websocket version implemented
      • But cannot listen for peers, only reach out
    • Workable as a Chrome, Firefox extension
  • Every broadcast sent generate lots of network traffic
    • But it’s linear on large networks
  • Private replies inflate the number above
    • Increases outward connections beyond normal limits
    • But reduces traffic in the short term

30

Showing Linear Scaling

Terms:

  • n number of nodes on the network
  • ℓ the limit on outward connections
  • m the number of messages per broadcast
  • hmax the maximum number of hops before a broadcast reaches the last node
  • cx the number of connections at a particular node
  • t , the sum of all nodes’ connections

31

Showing Linear Scaling (Cont.)

  • The origin node (n0) sends c0 messages
  • Each subsequent node (nx) sends cx - 1 messages

Rewritten:

m = c0 +

m = t - n + 1

  • The first node to broadcast sends the message to all its peers
  • Each node which receives a message relays the message to each peer but the node which sent it
  • Nodes do not relay a message they have seen before

32

Behold the mathy translation

Showing Linear Scaling (Cont.)

Special Case: Saturated Network

Each node has (n - 1) connections. In other words, it’s connected to all other nodes.

Therefore t = (n - 1)n

m = t - n + 1

m = (n - 1)n - n + 1

m = n2 - n - n + 1

m = n2 - 2n + 1

m = (n - 1)2

m is O(n2)

33

Showing Linear Scaling (Cont.)

Special Case: Limited Network

A limited network is where each node has ℓ outward connections. This is the limit set in software, so a node will not initiate more than ℓ connections on its own.

Because connections must have another end, we can conclude that the number of inward connections per node is also ℓ. Therefore t = 2ℓ * n

m = t - n + 1

m = 2ℓ * n - n + 1

m = (2ℓ - 1)n + 1

m is O(n)

34

Showing Linear Scaling (Cont.)

We have two domains the network trends towards. To figure out which is in control, we can find where the two functions meet. This happens to be where:

n = 2ℓ + 1

When we chart this out, we get this graph:

We can conclude that where n is small

relative to ℓ then m is O(n2), and when n

is large relative to ℓ then m is O(n).

35

Showing Linear Scaling (Cont.)

We defined two h terms earlier. havg and hmax

For hmax the easiest definition is by example. The longest configuration I have found thus far is shown below. Once for ℓ=1, once for ℓ=2. Both shown are n=8. Because this behaves like a skiplist of length n-1, this leads to the formula:

hmax = ⌈max(n-2, 1) / ℓ⌉

where n > 2ℓ + 1

36

6

5

4

3

2

1

0

1

3

3

2

2

1

1

0

1

Possible Next Steps

  • Add Java and C implementations
  • Get security/code review
  • Add more unit tests
  • Further integrate Python with existing C code
  • Optimize network rules

37

Additional Schemas—Sync

A synchronized hash table based on a the basic mesh network

38

0

4

5

1

6

3

2

7

Note log(n) lookup time

Additional Schemas—Chord

A distributed hash table based on a circularly linked skiplist

39

0

4

5

1

6

3

2

7

Note log(n) lookup time

Possible Future Schemas—Kademlia

A distributed hash table based on a binary search tree

40

More Info

  • Networking code in
    • Python
    • Javascript
  • Message parser in
    • Python
    • Javascript
    • Java
    • Golang
    • C
    • C++
    • Smalltalk (partial)

41

Install:

  • pip install py2p
  • npm install js2p
Peer to Peer Networking - Google Slides