1 of 50

Tapestry Gearup

CS1380: Distributed Systems S22

By: March & Arvind

Slides link available on Website

2 of 50

Motivations

  • Distributed key-value store implemented on top of a peer-to-peer routing system
  • Possible use cases: decentralized applications, Interplanetary File System
  • The peer-to-peer routing system is the main thing we want you to learn about

3 of 50

System Overview

[Diagram: Nodes 1, 2, 3, X, Y, Z. The node storing <K, V> calls Publish(k) to register the key with the well-known root of K. Another node calls FindRoot(k) and asks the root "Where is K?"; the root answers "K is here!", so the asker concludes "Node 3 tells me it has K."]

4 of 50

Data Structure Overview

Node A

  • BlobStore
    • Storing bytes of data. → <K, V>
  • LocationMap
    • Key K1 is stored at Node X1
    • Key K2 is stored at Node X2
    • ...
  • Routing Table

  • Backpointers

Node B

  • BlobStore
  • LocationMap
  • Routing Table
  • Backpointers

[Diagram: Node A and Node B, each holding the four structures above]
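
As a rough mental model only (the stencil defines its own types, so every field and constant below is an illustrative assumption), a node's state might look like:

package tapestry

import "time"

// Illustrative constants; the stencil picks its own BASE and DIGITS.
const (
    BASE   = 4
    DIGITS = 4
)

type Digit uint8
type ID [DIGITS]Digit // one base-BASE digit per position

type RemoteNode struct {
    ID      ID
    Address string
}

// RoutingTable: DIGITS levels x BASE slots; each slot holds a few candidate nodes.
type RoutingTable struct {
    localNode RemoteNode
    rows      [DIGITS][BASE][]RemoteNode
}

// Node bundles the four structures named on this slide.
type Node struct {
    node         RemoteNode                          // this node's own ID and address
    blobstore    map[string][]byte                   // BlobStore: raw bytes for the keys stored here
    locationMap  map[string]map[RemoteNode]time.Time // LocationMap: key -> replica nodes and expiry times
    table        RoutingTable                        // Routing Table
    backpointers map[RemoteNode]bool                 // Backpointers: nodes whose tables point at us
}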

5 of 50

Finding Root: FindRoot(id ID, level int32)

Finds the root node of a key (via FindRootRPC); also returns a set of failed nodes to remove. A hedged sketch follows the steps below.

  1. First check local Routing Table for nextHop
  2. If nextHop is self, then local node is the root. Return
  3. Else call FindRootRPC on nextHop
  4. If the FindRootRPC to nextHop times out or errors, remove nextHop from the local Routing Table and add it to the removeNodes set → important for notifying upstream nodes which nodes to remove from their routing tables. GOTO step 1
  5. Remove nodes from Routing Table that are returned from FindRootRPC
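
A hedged sketch of this loop in Go, reusing names from the steps above; the helper methods and return types are assumptions, not the stencil's exact signatures:

// Sketch only: find the root of id, pruning dead nodes along the way.
// FindNextHop, Remove, and FindRootRPC stand in for the stencil's helpers.
func (local *Node) FindRoot(id ID, level int32) (RemoteNode, []RemoteNode, error) {
    removed := []RemoteNode{} // failed nodes to report to our caller
    for {
        nextHop := local.table.FindNextHop(id, level) // 1. check the local Routing Table
        if nextHop == local.node {                    // 2. no better hop: we are the root
            return local.node, removed, nil
        }
        root, alsoRemove, err := nextHop.FindRootRPC(id, level+1) // 3. forward the query
        if err != nil { // 4. nextHop timed out or errored
            local.table.Remove(nextHop)        // drop it locally...
            removed = append(removed, nextHop) // ...and report it upstream
            continue                           // GOTO step 1
        }
        for _, bad := range alsoRemove { // 5. prune nodes reported back by the callee
            local.table.Remove(bad)
            removed = append(removed, bad)
        }
        return root, removed, nil
    }
}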

6 of 50

Finding Root: FindRoot(id ID, level int32)

FindRoot(12, 0) is invoked on the Local Node (Tap 1, ID: 01).

[Diagram: routing tables of the three nodes]
  Tap 1 (ID: 01) → row xx: 01, 21; row 0x: 01
  Tap 2 (ID: 21) → row xx: 01, 21; row 2x: 21, 22
  Tap 3 (ID: 22) → row xx: 01, 22; row 2x: 21, 22

Step: Check local Routing Table for nextHop

7 of 50

Finding Root: FindRoot(id ID, level int32)

FindRoot(12, 0) on the Local Node (Tap 1, ID: 01). [Routing tables as on the previous slide.]

Step: Check local Routing Table for the nextHop of ID 12.

8 of 50

Finding Root: FindRoot(id ID, level int32)

FindRoot(12, 0) on the Local Node (Tap 1, ID: 01). [Routing tables as before.]

Step: Check local Routing Table for ID 12 → nextHop found.

9 of 50

Finding Root: FindRoot(id ID, level int32)

FindRoot(12, 0) on the Local Node (Tap 1, ID: 01). [Routing tables as before.]

Step: FindRootRPC to nextHop → tap2.FindRootRPC(12, 1); Tap 2 runs FindRoot(12, 1).

10 of 50

Finding Root: FindRoot(id ID, level int32)

FindRoot(12, 1) now runs on Tap 2 (ID: 21). [Routing tables as before.]

Step: Check local Routing Table for the nextHop of ID 12.

11 of 50

Finding Root: FindRoot(id ID, level int32)

FindRoot(12, 1) on Tap 2 (ID: 21). [Routing tables as before.]

Step: FindRootRPC to nextHop → tap3.FindRootRPC(12, 2); Tap 3 runs FindRoot(12, 2). Level 2 >= DIGITS, so Tap 3 is the root.

12 of 50

Finding Root: FindRoot(id ID, level int32)

[Routing tables as before.]

Step: Return root → Root = tap3 (ID: 22) is returned back along the call chain to Tap 2 and then to the Local Node.

13 of 50

Storing Object: Store(key string, value []byte)

  1. Publish key
    1. Find root node of key (FindRootRPC)
    2. Register local node and object key with root node (via RegisterRPC(key string, replica RemoteNode))
      1. When root node receives RegisterRPC request, confirm that it is root of key (remember to hash key)
      2. If root node is indeed root of key, register key and replica to its LocationMap (via LocationMap.Register(key string, replica RemoteNode, timeout time.Duration))
      3. Return to caller whether root node was the root of the key

14 of 50

Storing Object: Store(key string, value []byte)

  • Publish key
    • Find root node of key (FindRootRPC)
    • Register local node and object key with root node (via RegisterRPC(key string, replica RemoteNode))
      • When root node receives RegisterRPC request, confirm that it is root of key (remember to hash key)
      • If root node is indeed root of key, register key and replica to its LocationMap (via LocationMap.Register(key string, replica RemoteNode, timeout time.Duration))
      • Return to caller whether root node was the root of the key
    • If the initial publish errors (e.g. cannot contact the root, or the root was not actually the root), propagate the error
    • Periodically republish the key on a ticker in a separate goroutine (see the sketch below)
    • Return to the user a cancel channel that lets the node storing the object stop publishing
  • Store object in local node’s BlobStore
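
A hedged sketch of the publish/republish flow above (building on the earlier type sketch; the republish interval, Hash helper, and return shapes are assumptions):

// publish: register the key at its root, then keep re-registering until cancelled.
func (local *Node) publish(key string) (chan bool, error) {
    if err := local.publishOnce(key); err != nil {
        return nil, err // initial publish failed: propagate the error
    }
    cancel := make(chan bool)
    go func() {
        ticker := time.NewTicker(5 * time.Second) // republish interval: placeholder value
        defer ticker.Stop()
        for {
            select {
            case <-ticker.C:
                local.publishOnce(key) // re-find the root every time, then RegisterRPC
            case <-cancel:
                return
            }
        }
    }()
    return cancel, nil
}

// publishOnce: find the key's root and register ourselves as a replica for it.
func (local *Node) publishOnce(key string) error {
    root, err := local.findRootOnRemoteNode(local.node, Hash(key)) // Hash: assumed key-hashing helper
    if err != nil {
        return err
    }
    isRoot, err := root.RegisterRPC(key, local.node) // assumed to return (bool, error)
    if err != nil {
        return err
    }
    if !isRoot {
        return fmt.Errorf("node %v is not the root of %q", root, key)
    }
    return nil
}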

15 of 50

Storing Object: Store(key string, value []byte)

[Sequence diagram: the Tapestry CLI calls Store(key string, value []byte) on the Local Node; Tap 1, Tap 2, and Tap 3 are remote nodes.]

16 of 50

Storing Object: Store(key string, value []byte)

1. Publish Key: Store(key, value) on the Local Node calls publish(key).

17 of 50

Storing Object: Store(key string, value []byte)

2. Find root of key: publish(key) calls findRootOnRemoteNode(local.node, key).

18 of 50

Storing Object: Store(key string, value []byte)

3. FindRootRPC: the Local Node asks FindNextHop? and forwards a FindRootRPC to the next hop (Tap 1).

19 of 50

Storing Object: Store(key string, value []byte)

3. FindRootRPC (continued): the next hop repeats the check and forwards another FindRootRPC.

20 of 50

Storing Object: Store(key string, value []byte)

3. FindRootRPC (continued): each node asks FindNextHop? and forwards FindRootRPC until a node has no better next hop: "Nope. We are the root!" That node will register the key to its LocationMap.

21 of 50

Storing Object: Store(key string, value []byte)

3. FindRootRPC (result): the root's identity is returned back along the chain, so the Local Node gets root := findRootOnRemoteNode(local.node, key).

22 of 50

Storing Object: Store(key string, value []byte)

4. RegisterRPC: the Local Node calls root.RegisterRPC(key, local.node).

23 of 50

Storing Object: Store(key string, value []byte)

4. RegisterRPC (result): the root confirms "Yup, we're the root of this key" and returns true.

24 of 50

Storing Object: Store(key string, value []byte)

4. Republish: periodically republish the key. Republish = publish → you cannot simply call RegisterRPC again on republish; you must first find the root of the object on every republish and then call RegisterRPC.

25 of 50

Joining Node: Join(otherNode RemoteNode)

  1. Find Root for joining Node
  2. AddNodeRPC to Root node
  3. Joining node receives neighbor set in response to AddNodeRPC
  4. Add neighbor set to joining Node routing table
  5. Populate the rest of the joining node's Routing Table by getting backpointers of the neighbor set (a sketch of Join follows below)
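
A hedged sketch of Join in Go, reusing names from the slides; SharedPrefixLength and traverseBackpointers are assumed helper names (the traversal itself is sketched on a later slide):

// Sketch only: join the Tapestry mesh through an existing node.
func (local *Node) Join(otherNode RemoteNode) error {
    // 1. Find the root for the joining node's own ID, starting from the given node.
    root, err := local.findRootOnRemoteNode(otherNode, local.node.ID)
    if err != nil {
        return err
    }
    // 2.-3. Ask the root to multicast our arrival; it replies with an initial neighbor set.
    neighbors, err := root.AddNodeRPC(local.node)
    if err != nil {
        return err
    }
    // 4. Seed our routing table with that neighbor set.
    for _, n := range neighbors {
        local.addRoute(n)
    }
    // 5. Fill the remaining levels (P down to 0) by walking the neighbors' backpointers.
    level := SharedPrefixLength(local.node.ID, root.ID) // assumed helper
    return local.traverseBackpointers(neighbors, level)
}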

26 of 50

Joining Node: Join(otherNode RemoteNode)

[Sequence diagram: Tap 1 is the joining node; it calls Join(Tap2). Tap 2 and Tap 3 are existing nodes.]

1. Find Root for joining Node

27 of 50

Joining Node: Join(otherNode RemoteNode)

1. Find Root for joining Node: tap2.FindRootRPC(tap1ID, 0) → Tap 2 is not the root, so it forwards tap3.FindRootRPC(tap1ID, 1) → Tap 3 is the root.

28 of 50

Joining Node: Join(otherNode RemoteNode)

1. Find Root for joining Node: Root = tap3 is returned back along the chain, so the joining node gets root := tap3.

29 of 50

Joining Node: Join(otherNode RemoteNode)

2. AddNodeRPC to Root Node: root.AddNodeRPC(local.Node).

30 of 50

Joining Node: Join(otherNode RemoteNode)

The root (Tap 3) receives root.AddNodeRPC(local.Node).

3. AddNodeMulticast

AddNodeMulticast(newNode, level):
    targets = routingtable.get(level)  // must include the local node
    results = []
    for target in targets:
        results.append(target.AddNodeMulticast(newNode, level + 1))
    self.addRoute(newNode)
    transferRelevantObjects(newNode)
    return merge(results, targets)

31 of 50

Joining Node: Join(otherNode RemoteNode)

3. AddNodeMulticast (on the root, Tap 3), triggered by root.AddNodeRPC(local.Node).

[Diagram: the root's routing table, with rows xx…, 1x…, 11x…; row 11x… contains 110… and 112…]

Get the shared prefix with addNode: P := SharedPrefix(self, addNode)

32 of 50

Joining Node: Join(otherNode RemoteNode)

3. AddNodeMulticast: P := SharedPrefix(self, addNode). Send AddNodeMulticastRPC(addNode, P + 1) to all nodes at level P in the local Routing Table (here 110…, 112…, …). Why P + 1 and not P?

33 of 50

Joining Node: Join(otherNode RemoteNode)

3. AddNodeMulticast: repeat AddNodeMulticastRPC for the next level down.

34 of 50

Joining Node: Join(otherNode RemoteNode)

3. AddNodeMulticast: continue AddNodeMulticastRPC until the last level.

35 of 50

Joining Node: Join(otherNode RemoteNode)

3. AddNodeMulticast: reached the end of the routing table. In the last step:

  1. Add addNode to the local routing table
  2. Transfer applicable LocationMap objects from self to addNode

36 of 50

Joining Node: Join(otherNode RemoteNode)

3. AddNodeMulticast: wait for all downstream AddNodeMulticastRPC calls to finish, then return the accumulated neighbors (e.g. 110…, 112…, …).

37 of 50

Joining Node: Join(otherNode RemoteNode)

4. Construct New Routing Table: the joining node (Tap 1) gets neighbors := root.AddNodeRPC(local.Node) and populates its local Routing Table at level P with the initial neighbor set, where P := SharedPrefix(self, addNode).

[Diagram: the joining node's routing table with level P filled from the neighbor set]

38 of 50

Joining Node: Join(otherNode RemoteNode)

5. Populate the rest of the Routing Table (levels P → 0) via backpointer traversal. On the joining node (Tap 1), with neighbors := root.AddNodeRPC(local.Node) and P := SharedPrefix(self, addNode):

  1. From the entire neighbors set, trim to the K closest unique nodes, as determined by Closer() (but do not modify the Routing Table)
  2. For each neighbor after trimming, call neighbor.GetBackpointersRPC(self, P)


39 of 50

Joining Node: Join(otherNode RemoteNode)

5. Populate the rest of the Routing Table (levels P → 0) via backpointer traversal, continued on the joining node (Tap 1); a hedged sketch follows the list below:

  • From the entire neighbors set, trim to the K closest unique nodes, as determined by Closer() (but do not modify the Routing Table)
  • For each neighbor after trimming, call neighbor.GetBackpointersRPC(self, P)
  • Result will be used to populate the whole table
  • Append result to neighbors set
  • Decrement level; GOTO step 1
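
A hedged sketch of this traversal; closestK is an assumed helper that trims with Closer(), and K stands for the stencil's slot-size constant:

// Sketch only: fill routing-table levels P down to 0 from neighbors' backpointers.
func (local *Node) traverseBackpointers(neighbors []RemoteNode, level int32) error {
    for ; level >= 0; level-- {
        // 1. Trim to the K closest unique nodes by Closer(); the Routing Table is not modified here.
        trimmed := closestK(local.node.ID, neighbors, K)
        next := trimmed
        for _, n := range trimmed {
            // 2. Ask each trimmed neighbor for its backpointers at this level.
            bps, err := n.GetBackpointersRPC(local.node, level)
            if err != nil {
                continue // skip unreachable neighbors in this sketch
            }
            for _, bp := range bps {
                local.addRoute(bp) // 3. every result helps populate the whole table
            }
            next = append(next, bps...) // 4. and extends the working neighbor set
        }
        neighbors = next // 5. decrement level (the loop's post statement) and repeat
    }
    return nil
}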


40 of 50

Joining Node: Join(otherNode RemoteNode)


6. Populate Backpointers

During AddNodeMulticast, whenever a remote node added the joining node to its Routing Table, it called joiner.AddBackpointerRPC(self); that is how the joiner's backpointers get populated.

This is implemented in addRoute()

41 of 50

Debugging

  • Delve: basically a GDB for Go
  • Check out Lab 2
  • go tool trace is not covered in Lab 2 but is also helpful; run it the same way as pprof:
    • go test -run <test name> -trace trace.out
    • go tool trace trace.out
  • trace demo

42 of 50

gRPC

  • Lab 3 handout on gRPC will be released early (likely around March 1-2) with the original due date in case you need a reference to gRPC for Tapestry
  • "Context Deadline Exceeded" error in gRPC
    • A generic error meaning the caller timed out waiting for a reply from the callee
    • go tool trace can help with debugging these errors
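
For intuition, the deadline comes from the per-call context. A minimal, hedged illustration (the pb types and FindRoot stub are placeholders for whatever your generated client exposes; uses context, time, fmt, and google.golang.org/grpc/{codes,status}):

// Sketch only: any gRPC call made with a context deadline can fail this way.
func callWithTimeout(remote pb.TapestryRPCClient, req *pb.IdMsg) error {
    ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
    defer cancel()
    _, err := remote.FindRoot(ctx, req) // placeholder generated stub method
    if status.Code(err) == codes.DeadlineExceeded {
        // The callee never replied within 2s: gRPC reports "context deadline exceeded".
        return fmt.Errorf("caller timed out: %w", err)
    }
    return err
}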

43 of 50

Tapestry Paper

  • https://sites.cs.ucsb.edu/~ravenben/classes/papers/tapestry-jsac04.pdf
  • Reading papers is hard
  • Use the three-pass approach (from "How to Read a Paper"):
    • First pass → general idea (~10 minutes)
    • Second pass → key points (~1 hour)
    • Third pass (if needed) → in-depth analysis

44 of 50

Closing Remarks

Expected lines of code: roughly 500-700

→ Start early!

→ Read the handout/paper!

→ Come to TA hours.

→ Project Due: Sunday March 11th, 11:59 PM

45 of 50

FAQs from Years Past

How does FindNextHop(id, level) work?

  1. Jump to row number level
  2. Find the cell corresponding to the id[level]-th digit (0-indexed!)
  3. Scan the row starting from that cell (wrapping around) for the first remote node you can get; if you only find the local node first in that row, jump down to the next row and repeat
  4. If you run out of rows, return the local node

Within a slot, the nodes should already be sorted by Closer(). A hedged sketch follows below.
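
A hedged sketch assuming the RoutingTable layout from the earlier type sketch (rows[level][digit] is a slot kept sorted by Closer()); the stencil's real structure may differ:

// Sketch only: choose the next hop for id, starting at the given level.
func (t *RoutingTable) FindNextHop(id ID, level int32) RemoteNode {
    for ; level < DIGITS; level++ { // 1./4. walk the rows until we run out
        start := int32(id[level]) // 2. the id[level]-th column, 0-indexed
        for i := int32(0); i < BASE; i++ { // 3. scan the row, wrapping around
            slot := t.rows[level][(start+i)%BASE]
            if len(slot) == 0 {
                continue
            }
            best := slot[0] // slots are already sorted by Closer()
            if best != t.localNode {
                return best // first remote node wins
            }
            break // we found only ourselves first: drop to the next row
        }
    }
    return t.localNode // 4. out of rows: the local node is the next hop (the root)
}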

46 of 50

FAQs from Years Past

How does id.IsNewRoute(newId, currentId) work?

From code comment: Given IDs newId and currentId, will id now route to newId?

  1. Let i be an index variable.
  2. Keep comparing newId[i] to currentId[i] until newId[i] != currentId[i]
  3. When newId[i] != currentId[i] (i.e. when their shared prefix is exhausted), then
    1. Compare ((newId[i] - id[i]) modulo BASE) to ((currentId[i] - id[i]) modulo BASE)
    2. Whichever ID has the smaller value wins the route; return true iff newId wins (a hedged sketch follows below)
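
A hedged sketch of those steps, assuming ID is an array of base-BASE digits as in the earlier type sketch:

// Sketch only: would id's routing now prefer newId over currentId?
func (id ID) IsNewRoute(newId ID, currentId ID) bool {
    for i := 0; i < DIGITS; i++ {
        if newId[i] == currentId[i] {
            continue // still inside the shared prefix
        }
        // Shared prefix exhausted: compare each candidate's distance from id's digit, modulo BASE.
        dNew := (int(newId[i]) - int(id[i]) + BASE) % BASE
        dCur := (int(currentId[i]) - int(id[i]) + BASE) % BASE
        return dNew < dCur // the smaller value wins the route
    }
    return false // identical IDs: routing does not change
}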

47 of 50

FAQs from Years Past

From the paper: N uses the neighbor set to fill routing level p, trims the list to the closest k nodes and requests these nodes send their backpointers. What measure of closeness should we use for the k nodes during this backpointer traversal stage?

> Use Closer()!

48 of 50

FAQs from Years Past

Remove/AddBackpointer RPC clarifications:

  • If AddBackpointerRPC fails, remove the new node from the routing table and add the evicted node back (if there is one). Here "evicted" node is the node removed by RoutingTable.Add() when slot capacity is exceeded.

  • When RemoveBackpointerRPC is called on an evicted node, you can ignore the error returned (a hedged sketch of this bookkeeping follows below).
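
A hedged sketch of that bookkeeping; the (added, evicted) return shape of RoutingTable.Add is an assumption:

// Sketch only: add a node to the routing table and keep backpointers consistent.
func (local *Node) addRoute(node RemoteNode) {
    added, evicted := local.table.Add(node) // evicted: node pushed out of a full slot, if any
    if !added {
        return
    }
    if err := node.AddBackpointerRPC(local.node); err != nil {
        // The new node is unreachable: undo the routing-table change.
        local.table.Remove(node)
        if evicted != nil {
            local.table.Add(*evicted)
        }
        return
    }
    if evicted != nil {
        // Best effort: the evicted node may already be gone, so ignore this error.
        evicted.RemoveBackpointerRPC(local.node)
    }
}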

49 of 50

FAQs from Years Past

FindRootOnRemoteNode vs. FindRoot:

  • FindRoot traverses the local node's routing table
  • FindRootOnRemoteNode asks a remote node to traverse its routing table and return the result
    • Implemented with FindRootRPC
    • The level argument to FindRootRPC should be zero
    • This function is used by nodes that have just joined, since they still have an empty routing table (a minimal sketch follows below).
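
A minimal hedged sketch; the return shape of FindRootRPC is an assumption:

// Sketch only: delegate the whole lookup to a remote node, starting at level 0.
func (local *Node) findRootOnRemoteNode(start RemoteNode, id ID) (RemoteNode, error) {
    root, _, err := start.FindRootRPC(id, 0) // the remote node traverses its own routing table
    return root, err
}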

50 of 50

FAQs from Years Past

FindRoot has an argument `toRemove`. When should the list of nodes in `toRemove` be removed in FindRoot?

  • When FindRoot makes recursive calls, all the toRemove sets returned are
    • removed from the local routing table, and
    • merged and returned to the caller.
  • The initial invoker of FindRoot (e.g. in Publish) can disregard toRemove.

tl;dr toRemove sets returned by all subcalls should be merged, removed and returned