1 of 30

Bitcoin

Putting the "pseudo" back in pseudonymous

Presentation by Mathieu Lavoie

2 of 30

What this talk is NOT about

  • Will Bitcoin replace all currencies ?
  • Should I buy a ASIC cluster ?
  • Who is Satoshi Nakamoto?
  • Does MtGox...
  • Is Bitcoin real money?

3 of 30

... What it’s about

  • Address
  • Block and blockchain
  • Transaction
  • Limit of pseudonymity
    • How to associate addresses to an entity
  • Tool release!

4 of 30

Bitcoin 101

5 of 30

Address

  • Simpler representation of a public key.

  • 70 millions addresses seen on the network

6 of 30

Block and Blockchain

  • Block contains transactions.
  • Blocks are chained together
    • The “Blockchain”
  • 1 block is mined every 10 minutes
    • New transactions are confirmed when included in a block

7 of 30

Transaction

  • ∑ amount in Input == ∑ amount in Outputs
  • Input refers to a previous transaction output
  • Signature mechanism for validation
  • ± 65 Millions Tx

8 of 30

Transaction

Output

Input

9 of 30

Recap

10 of 30

Regrouping Nodes

Inputs Addresses in TXs

Addresses of an Entity

1, 2

4,5,6

7,10

2,4,7

1,2,4,5,6,7,10

11 of 30

The Goal

  • Regroup addresses owned by an entity in a graph
    • A Node is an entity
    • A Node contains addresses
    • Edges represent money flows

12 of 30

My assumptions

  • Multiple addresses used in inputs belong to the same entity.
    • Each Input must be signed with its private key
    • Multi-signed transactions are negligible
  • People use the sames addresses in multiples transactions

13 of 30

The tool

The story of a graph problem that became a BigData problem

14 of 30

Works well in Theory...

  • Data is public
    • Will be easy! Right?

  • It’s only nodes and addresses
    • That will totally fit in memory!

15 of 30

Idea #1

  1. Get a block
  2. Take each TX
  3. Take each Input
  4. Get referenced Output
  5. Calculate Bitcoin Address from Hash160
  6. Put it in the graph.
  7. ...Watch this run for 3 months...
  8. (╯°□°)╯︵ ┻━┻

16 of 30

Idea #2

  • Get a block
  • Take each TX
  • Take each Input
  • Decompress ECDSA Public Key
  • Calculate address from the Public Key
    1. ...Wait, those addresses do not exist!
  • (╯°□°)╯︵ ┻━┻

17 of 30

Idea #3

  • Understand what you did wrong in #2
    • No need to uncompress ECDSA before converting to an address.
  • Do good code to graph Bitcoin
  • Find that your merge routine is linear - O(n)
    • It would have still been crawling by now...
  • (╯°□°)╯︵ ┻━┻

18 of 30

Idea #4

  1. Drink alone in the dark
  2. Decide to revert your clean code
  3. Hack crappy code in O(1)
  4. Run the crawler for 2-3 hours
  5. Get an out of memory exception
  6. (╯°□°)╯︵ ┻━┻

19 of 30

Idea #5

  1. Realise that you were running Python 32 bits with an addressing space limit of 4GB (Duh!)
    1. Use Python x64...
  2. Run it for 16 hours and get another memory exception
    • Exausted 16 GB of RAM...
  3. Use a transactional DB to dump data each N block
  4. Watch your transactional DB engine suffer.

20 of 30

Finally

  • Use MongoDB with multiple instances
  • Wait for 24 hours.
  • ...
  • Profit!
  • ┬─┬ ノ( ^_^ノ)

21 of 30

Minimum system requirements

  • 30 GB HDD
  • 32 Gb Ram
  • Python 64 bits
  • 24 hours of free time
    • We are doing it live!!!
  • 4 tables to flip

22 of 30

DEMO!

23 of 30

Real use case

24 of 30

What’s Next!

  • Release the crappy code
    • Github : mathieulavoie
    • Pull requests are welcome
  • DB Available (Thanks Clibre)
  • Follow the money
  • Collaborative tools

25 of 30

Questions?

26 of 30

Specials Thanks

  • Taher Azab, Philippe Arteau and Louis Dion-Marcil
  • Philippe Pépos-Petitclerc and Clibre

27 of 30

References

28 of 30

Annexes

29 of 30

MongoDB Commands

  • db.addresses.aggregate([{$group:{_id:"$n_id", count:{$sum:1}}},{$sort:{count:-1}}],{allowDiskUse:true})
  • db.addresses.find({_id:/^1LGnuv.+/})
  • db.addresses.find({n_id:15292275})

“discovered by iSIGHT Partners, the Australian variant they analyzed asked for Bitcoins to be sent to15aBFwoT5epvRK69Zyq7Z7HMPS7kvBN8Fg. In our case, the Bitcoin address changed to13qm2ezhWSHWzMsGcxtKDhKNnchfP5Sp3X”. - Eset

  • db.addresses.find({_id:"15aBFwoT5epvRK69Zyq7Z7HMPS7kvBN8Fg"})
  • db.addresses.find({_id:"13qm2ezhWSHWzMsGcxtKDhKNnchfP5Sp3X"})
  • db.addresses.find({n_id:1258538}).count()

30 of 30

Transaction Scripts

Output

OP_DUP OP_HASH160 b2a141b4ac548da3ba678ac3377a83e65d0bc25c OP_EQUALVERIFY OP_CHECKSIG

Input

304402206e019e1d3b9f0c48adf95e0c02d421851e419eca1c1d37e0e82d8bef555c940c0220051ce09da3a23009d9642c704ab502885b14959f8a95d36f99804a06374a73b601020ab884f71f11005b04019e9bb91e2cb423edaef3fb3f33d35e8609299f000e69