Ocean for AI/ML Data Flows
A Hands-On Introduction
Trent McConaghy
June 2, 2023
AI/ML is data all the way down
AI/ML is about making models to make predictions.
Data is at every step of the pipeline:
Data can also be algorithms to build the models.
Data can be dynamically changing, i.e. data streams / data feeds.
Challenges in data
It's all web3! Decentralization, immutability, assets, incentives
This brings new Q's...
The Key: Tokenize Data
How?
The Key: Tokenize the Data
ERC721 & ERC20 support is everywhere! Leverage it for data access control
(Consume datatokens)
(Create data NFTs & datatokens)
Data on-ramp: mint ERC721 data NFTs → mint ERC20 datatokens
Data off-ramp: consume datatokens
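The on-ramp and off-ramp above can be sketched as a toy in-memory model. This is a minimal sketch, not the Ocean contracts or the ocean.py API: the class names, method names, account names, and URL are all hypothetical, and the rule that one access costs one datatoken is an assumption made for the illustration.

```python
# Toy in-memory model of the data on-ramp and off-ramp.
# Hypothetical classes for illustration; NOT the Ocean contracts or ocean.py API.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Datatoken:
    """ERC20-like access token: spending 1 token grants one access (assumed)."""
    symbol: str
    balances: Dict[str, int] = field(default_factory=dict)

    def mint(self, to: str, amount: int) -> None:
        self.balances[to] = self.balances.get(to, 0) + amount

    def transfer(self, sender: str, to: str, amount: int) -> None:
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient datatoken balance")
        self.balances[sender] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount


@dataclass
class DataNFT:
    """ERC721-like token: one NFT per data asset, held by the publisher."""
    owner: str
    data_url: str
    datatoken: Optional[Datatoken] = None

    def create_datatoken(self, caller: str, symbol: str) -> Datatoken:
        # On-ramp, step 2: only the NFT owner may issue datatokens against it.
        if caller != self.owner:
            raise PermissionError("only the data NFT owner can create datatokens")
        self.datatoken = Datatoken(symbol)
        return self.datatoken

    def consume(self, consumer: str) -> str:
        # Off-ramp: the consumer pays 1 datatoken and gets access in return.
        if self.datatoken is None:
            raise ValueError("this data NFT has no datatoken yet")
        self.datatoken.transfer(consumer, self.owner, 1)
        return self.data_url


# On-ramp: Alice mints a data NFT, then a datatoken, and gives Bob 1 token.
nft = DataNFT(owner="alice", data_url="https://example.com/data.csv")
token = nft.create_datatoken("alice", "DT1")
token.mint("bob", 1)

# Off-ramp: Bob consumes his datatoken to access the data.
url = nft.consume("bob")
print(url, token.balances)
```

Because the datatoken behaves like any fungible token, Bob could also have received it through a transfer or an exchange before consuming it, which is the point of reusing ERC20 for access control.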
Enables data assets in Web3 wallets, exchanges, and DAOs
Data asset on-ramp
Data wallets: Data Custody, Data Mgmt
Data Exchanges, IDO Launchpads
Data DAOs: Data Coops, Data Unions
Data Insurance, Data Baskets, Data as Collateral...
Data asset off-ramp
Data Provenance
Atomic → Higher Level Building Blocks
Atomic building blocks: Data NFTs and datatokens
Higher level blocks. The atomic blocks naturally interoperate with
Higher level yet. From this, we can construct many AI/ML data flows:
Ocean for AI/ML Data Flows
Ocean stack solves key goals:
Applies to data at every step of the AI/ML pipeline:
Raw data → cleaned data → trained models → tuning data → tuned models → predictions
What I’ll cover in detail
Installation & Setup
Outline
github.com/oceanprotocol/ocean.py
install.md
setup-local.md
setup-remote.md
On-chain data: Data NFTs
On-chain data (small): Ocean Data NFTs
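A data NFT that holds small on-chain data can be pictured as an owner-writable key-value store. The sketch below is a toy model under that assumption: the class `KeyValueDataNFT`, its methods, and the example key and value are hypothetical and do not come from the Ocean contracts or ocean.py.

```python
# Toy model of a data NFT used as a small on-chain key-value store.
# Hypothetical class for illustration; NOT the Ocean contracts or ocean.py API.
from typing import Dict


class KeyValueDataNFT:
    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._store: Dict[str, bytes] = {}

    def set_data(self, caller: str, key: str, value: str) -> None:
        # Writes are restricted to the NFT owner.
        if caller != self.owner:
            raise PermissionError("only the owner can write to this data NFT")
        self._store[key] = value.encode("utf-8")

    def get_data(self, key: str) -> str:
        # Reads are open: on-chain data is public unless it is encrypted first.
        return self._store[key].decode("utf-8")


profile = KeyValueDataNFT(owner="alice")
profile.set_data("alice", "fav_color", "blue")
print(profile.get_data("fav_color"))  # blue
```

Anyone can read what is stored this way, which is why keeping on-chain values private calls for encrypting them before they are written.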
On-chain data with privacy: Data NFTs with encryption
On-chain data (small): Ocean Data NFTs with Private Data 1/4
On-chain data (small): Ocean Data NFTs with Private Data 2/4
On-chain data (small): Ocean Data NFTs with Private Data 3/4
On-chain data (small): Ocean Data NFTs with Private Data 4/4
Off-chain data: Datatokens
Off-chain data: Ocean datatokens 1/5
Off-chain data: Ocean datatokens 2/5
Off-chain data: Ocean datatokens 3/5
Off-chain data: Ocean datatokens 4/5
Off-chain data: Ocean datatokens 5/5
Off-chain data with privacy: Datatokens + Compute-to-Data
Off-chain data with privacy: Ocean datatokens with Compute-to-Data
[Diagram: Ocean runs a compute script f(x) on private data, which stays on-premise. Steps: run the script, then see the script results.]
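The Compute-to-Data idea can be sketched as a provider that keeps the private data and returns only what a submitted script computes. This is a toy model, not the Ocean C2D implementation: `ComputeProvider`, its `run` method, and the sample numbers are hypothetical, and the single-number check is only a crude stand-in for real output vetting.

```python
# Toy model of Compute-to-Data.
# Hypothetical class for illustration; NOT the Ocean C2D implementation.
from statistics import mean
from typing import Callable, List, Sequence


class ComputeProvider:
    """Holds private data and runs submitted scripts next to it."""

    def __init__(self, private_rows: Sequence[float]) -> None:
        self._private_rows: List[float] = list(private_rows)

    def run(self, script: Callable[[Sequence[float]], float]) -> float:
        # The script sees the data only inside the provider.
        result = script(tuple(self._private_rows))
        # Crude guard (assumption): only a single number may leave the premises.
        if isinstance(result, bool) or not isinstance(result, (int, float)):
            raise TypeError("script must return a single number")
        return float(result)


provider = ComputeProvider([3.0, 5.0, 10.0])
average = provider.run(mean)
print(average)  # 6.0
```

The consumer learns the average but never receives the rows. A production setup would also need to review or sandbox the script itself, since a single number can still leak information.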
C2D Quickstart via Ocean.py: Overview
github.com/oceanprotocol/ocean.py/blob/main/READMEs/c2d-flow.md
Ocean Market: Decentralized data market for algorithms + data
Ocean Market: Splash Page
Ocean Market: Publish Flow, for a "Data NFT Drop"
Example Data Asset
Example Data Asset: A Data Union
Ocean is multi-chain
blog.oceanprotocol.com/ocean-makes-multinetwork-even-simpler-c3ec6c0cbd50
Fine-grained permissions
blog.oceanprotocol.com/fine-grained-permissions-now-supported-in-ocean-protocol-4fe434af24b9
Ocean for dapp developers
Example: Daimler / Acentrik data marketplace
acentrik.io
Example: deltaDAO AI Marketplace for GAIA-X
twitter.com/deltadao
Example: Desights AI Competitions
All user info is on-chain & encrypted. desights.ai
Example: FELT Federated Learning
Powered by Ocean Compute-to-Data. feltlabs.ai
Showcases & business ideas
https://oceanprotocol.com/templates
Open-source Templates
https://oceanprotocol.com/templates
Teams building with Ocean
Conclusion
AI/ML is data all the way down
AI/ML is about making models to make predictions.
Data is at every step of the pipeline:
Data can also be algorithms to build the models.
Data can be dynamically changing, i.e. data streams / data feeds.
The Key: Tokenize the Data
ERC721 & ERC20 support is everywhere! Leverage it for data access control
(Consume datatokens)
(Create data NFTs & datatokens)
Data on-ramp: mint ERC721 data NFTs → mint ERC20 datatokens
Data off-ramp: consume datatokens
Enables data assets in Web3 wallets, exchanges, and DAOs
Data asset on-ramp
Data wallets: Data Custody, Data Mgmt
Data Exchanges, IDO Launchpads
Data DAOs: Data Coops, Data Unions
Data Insurance, Data Baskets, Data as Collateral...
Data asset off-ramp
Data Provenance
Ocean for AI/ML Data Flows
Ocean stack solves key goals:
Applies to data at every step of the AI/ML pipeline:
Raw data → cleaned data → trained models → tuning data → tuned models → predictions
Create your own tokenized AI/ML data flows
How to try out Ocean:
Appendix: Where to store data <> How to share it
Where to store | Where to store: specific medium | How to share (access control) |
Off-chain | Any web2 or web3 service, e.g. S3, Filecoin | Datatokens; add Compute-to-Data for privacy |
On-chain (small data) | Key-value pairs in data NFTs | Public by default; encrypt values for privacy |