1 of 25

Proteus

AI Data Provenance

Danny Bessonov, Aayush Gupta, Rohan Sanjay

2 of 25

Agenda

  1. Motivation
  2. Existing techniques
  3. Proteus V1 + Demo
  4. Privacy
  5. Proteus V2
  6. Extensions

3 of 25

Motivation


4 of 25

Provenance is important

  • Where content comes from is critical context
  • Easy to assume viral content is real

5 of 25

Provenance is important

6 of 25

Existing techniques


7 of 25

Image watermarking

  • Encode data into the image
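As a rough illustration of "encode data into the image", the sketch below hides a payload in the least-significant bit of each pixel. LSB embedding is only one simple watermarking approach, chosen here as an assumption for illustration; the slide does not name a specific scheme, and the function names `embed` and `extract` are hypothetical.

```python
from typing import List

def embed(pixels: List[int], payload: bytes) -> List[int]:
    """Write the payload bits, most significant bit first, into pixel LSBs."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for image")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the lowest bit, then set it to the payload bit
    return out

def extract(pixels: List[int], n_bytes: int) -> bytes:
    """Read n_bytes back out of the pixel LSBs."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for i in range(8):
            byte = (byte << 1) | (pixels[b * 8 + i] & 1)
        out.append(byte)
    return bytes(out)

# Synthetic flat list of 64 grayscale pixel values (0-255), for demonstration only.
pixels = [(i * 37) % 256 for i in range(64)]
marked = embed(pixels, b"proteus")
recovered = extract(marked, 7)
```

Each pixel value changes by at most 1, so the mark is invisible, but the payload lives in exactly the bits that re-encoding or resizing tends to disturb.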

8 of 25

C2PA

  • De-facto standard
  • Adds signature to image metadata
  • Trivial to remove or lose this metadata

9 of 25

Proteus V1


10 of 25

Decoupling the process

Track content regardless of modification

Prove that modified content came from the original

11 of 25

Perceptual hashes

A potpourri of perceptual hash algorithms provides robustness to common adversarial attacks.
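To make the idea of a perceptual hash concrete, here is a minimal sketch of one well-known member of the family, the average hash. It is an assumption for illustration: the deck does not list which algorithms Proteus combines, and the synthetic images and the names `average_hash` and `hamming` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels 0-255, row-major

def average_hash(img: Image, size: int = 8) -> int:
    """Block-average down to size x size, then set one bit per cell brighter than the mean."""
    h, w = len(img), len(img[0])
    cells = []
    for r in range(size):
        for c in range(size):
            block = [img[y][x]
                     for y in range(r * h // size, (r + 1) * h // size)
                     for x in range(c * w // size, (c + 1) * w // size)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | int(v > mean)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# Synthetic 32x32 diagonal gradient and two variants.
original = [[4 * x + 4 * y for x in range(32)] for y in range(32)]
brighter = [[v + 7 for v in row] for row in original]    # mild brightness shift
inverted = [[248 - v for v in row] for row in original]  # visually different image

near = hamming(average_hash(original), average_hash(brighter))
far = hamming(average_hash(original), average_hash(inverted))
```

A small Hamming distance (`near`) suggests the same underlying content, a large one (`far`) suggests different content. A single algorithm like this is easy to fool, which is presumably why several are combined.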

12 of 25

Content Generation Phase

13 of 25

Content Consumption Phase

14 of 25

Bayes Optimization

15 of 25

Privacy


16 of 25

Privacy

  • Want to keep parts of image private

  • PII concerns in prompt

17 of 25

Privacy

  • V1:
    • Remove trust dependence on our lookup

Image registry: private images

Public, modified image

Proof: this image exists + can be re-created through transformation

18 of 25

Privacy cont.

Public

Private

Crop: [x=0, y=500, w=400, h=600]

Signature on original image from X

Circuit

1. phash(transform(original)) == phash(modified image)

2. signature of phash exists for org. X

Organization X PK

signature appears on-chain at timestamp Y
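The two circuit statements can be sketched as a plain check, evaluated in the clear. This is only an illustration of the relation, not a zero-knowledge circuit: it does not model which inputs are public or private, nor the on-chain timestamp. Assumptions: the signature covers the phash of the original image, the phash is a simple average hash, and organization X is represented by a toy RSA key built from two known Mersenne primes (insecure, demo only). All function names and the small crop box are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels 0-255, row-major

# Toy RSA key pair standing in for organization X (assumption: demo only, not secure).
P, Q = 2**61 - 1, 2**89 - 1          # two known Mersenne primes
N, E = P * Q, 65537                  # public key
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

def phash(img: Image, size: int = 8) -> int:
    """Average hash: block-average to size x size, one bit per cell above the mean."""
    h, w = len(img), len(img[0])
    cells = []
    for r in range(size):
        for c in range(size):
            block = [img[y][x]
                     for y in range(r * h // size, (r + 1) * h // size)
                     for x in range(c * w // size, (c + 1) * w // size)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | int(v > mean)
    return bits

def crop(img: Image, x: int, y: int, w: int, h: int) -> Image:
    """The transform: keep the w x h window whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in img[y:y + h]]

def sign(digest: int) -> int:
    return pow(digest, D, N)          # done once by organization X

def verify(digest: int, sig: int) -> bool:
    return pow(sig, E, N) == digest   # needs only the public key (N, E)

def relation_holds(original: Image, crop_box: Tuple[int, int, int, int],
                   modified: Image, sig: int) -> bool:
    stmt1 = phash(crop(original, *crop_box)) == phash(modified)  # statement 1
    stmt2 = verify(phash(original), sig)                         # statement 2
    return stmt1 and stmt2

original = [[2 * x + 2 * y for x in range(64)] for y in range(64)]
signature = sign(phash(original))
crop_box = (0, 16, 32, 32)  # x, y, w, h
modified = [[v + 5 for v in row] for row in crop(original, *crop_box)]  # cropped, then brightened
unrelated = [[255 - v for v in row] for row in modified]

ok = relation_holds(original, crop_box, modified, signature)
wrong_image = relation_holds(original, crop_box, unrelated, signature)
wrong_sig = relation_holds(original, crop_box, modified, signature + 1)
```

In the scheme on the slide, a proof would presumably convince a verifier that this relation holds without revealing the original image; the sketch only shows what is being checked.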

19 of 25

Demo


20 of 25

Proteus V2


21 of 25

Content Generation Phase

22 of 25

Probability Bounds

23 of 25

Content Consumption Phase

24 of 25

Appendix


25 of 25

Ideal, real-world scheme

  • ✅ Good actors can trustlessly prove provenance
    • “This is Midjourney”
    • “This isn’t Midjourney”
  • ✅ Provenance is irrespective of real-world modifications
    • e.g. screenshots & social-media sharing

  • ❌ Bad actors cannot falsely claim content came from a source
    • e.g. “This image is real”