An Overview of Decentralised Web Technologies as a Foundation for Future IPFS-centric FDOs
The dPID working group
FAIR DIGITAL OBJECTS FORUM
March 20, Berlin
DECENTRALIZED WEB TECHNOLOGIES
Composable building blocks
Combining these protocols enables new properties
→ Let’s explore how this can be relevant to FDOs
WEB2 vs WEB3 approach
[Figure: WEB2 and WEB3 stacks, each composed of Protocols and Platforms, compared by Importance]
(meta)data
Indexing
Provenance
Timestamps
Compute
…
Data moats
Vendor lock-in
Data loss risk
- G2: FDOs need to generate trust in accurate data survival over long periods of time, assuring researchers, funders, and developers that their significant effort in reusing them will be worthwhile.
→ f(network operators)
Network of “Node” operators
Platform A
WEB3 - OSDN
Platform B
Service A
Metadata incl. provenance
Data
Indexing
Compute routing
Light “UIs” competing for features
DevUX QoL (e.g., REST APIs)
Web3: Open State Data Networks (OSDNs)
Protocols
Platform A
Platform B
Service A
WEB3
FDOR-GR1: A PID, standing for a globally unique, persistent and resolvable identifier, is assumed to be the basis for FAIR Digital Objects. Every FDO is assigned one or more PIDs.
- FDO-PIDR4: The PID system which is used for the identification of FDOs must be global, robust, scalable, and demonstrate persistence.
→ f(network operators)
Credibly neutral infrastructure
→ Deterministic resolution
→ No social contracts
Protocols
Platform A
Service A
WEB3
- FDO-PIDR5: The PID System which is used for the identification of FDOs must support high security capabilities. The owner, or owner-delegated agent, of a PID and its associated attribute-value pairs is the only actor allowed to make changes, to define accessibility to attribute-value pair information, and to request encryption of information.
DID key
Platform B
Direct control (e.g., ORCID:DID)
Platform-mediated (e.g., SciLabs:DID)
DID key
- FDO-PIDR6: Management access to the PID system needs to be secured by a public key infrastructure and if necessary the use of standardised certificates must be possible.
Why this matters
Join the Conversation
Learn More about dPID<>FDO
More Details about dPIDs
ANNEX
dPID: Decentralized persistent identifiers
Minting dPIDs
Updating dPIDs
Tooling: →CodexLib, NodesLib
CODEX Protocol: Collaborative Open Data Exchange
Employing IPFS as a PID system is not a new idea
Deterministic resolution of PIDs through content-addressed networks
Our innovations
- Version-invariant stream IDs (Containers for versioned DAGs)
- Aliasing system to square Zooko’s triangle (Any PID schema ⊆ PID technology)
Prior Art: Merkle DAGs, PIDs from CAS networks
Software heritage foundation
Architecture: Merkle DAG roots as PIDs
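The idea of using a Merkle DAG root as a PID can be illustrated with a minimal sketch. It assumes SHA-256 over a canonical JSON encoding; real IPFS uses CIDs with multihash and dedicated DAG codecs, so `node_id` and the example file names are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def node_id(payload: bytes, links: dict[str, str]) -> str:
    """Hash a node's payload together with the IDs of the nodes it links to."""
    # Sorted keys make the encoding (and therefore the hash) deterministic.
    encoded = json.dumps(
        {"payload": payload.hex(), "links": links}, sort_keys=True
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Two leaf nodes (files) and one root node (a directory linking to them).
readme = node_id(b"hello FDO", {})
data = node_id(b"1,2,3\n", {})
root = node_id(b"", {"README.md": readme, "data.csv": data})

# Changing any leaf changes the root, so the root identifies the whole graph.
data_v2 = node_id(b"1,2,4\n", {})
root_v2 = node_id(b"", {"README.md": readme, "data.csv": data_v2})
```

Because every link is itself a hash, the root commits to all content beneath it. This is also why a new version necessarily yields a new root, which motivates the version-invariant containers listed under the limitations.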
Limitations of past approaches
- The need for version-invariant containers (meta-containers)
- Middleware to support the Data Grid (access, replication, networking, indexing)
- Human-unfriendliness of hash-based namespaces (squaring Zooko’s triangle)
Decentralized identifiers (DID)
W3C approved spec
→ PKI scaling challenges
→ DLT is one DID method (others exist, e.g., DNS-based)
SideTree protocol
Scalable DID infrastructure
→Bundling via a sequencer enables scaling
Protocols and specs developed in the context of supporting a self-sovereign identity model
Prior Art: Decentralized Identifiers, SideTree protocol
Video Demonstrations - dPID Fetch, Code Atomizer, Nodes-Lib
Nodes-Lib
Code Atomizer
dPID Fetch
Social Trust through Attestations - Persistent Badging as a potential starting point for dPID FDO integration
- G2: FDOs need to generate trust in accurate data survival over long periods of time, assuring researchers, funders, and developers that their significant effort in reusing them will be worthwhile.
→ Data Grid approach: a participatory network in which anyone can spin up a CODEX node.
- FDOR-GR1: A PID, standing for a globally unique, persistent and resolvable identifier, is assumed to be the basis for FAIR Digital Objects. Every FDO is assigned one or more PIDs.
→ dPIDs provide a unique, deterministically resolvable path to any resource and its (meta)data within a Merkle-DAG object.
- FDO-GR11: A collection of FDOs is also an FDO. The content of collection FDOs describes its construction using an agreed formal language which specifies the relationships of the constituent members. An FDO may be a member of several collections.
→ The dPID approach enables recomposition of FDOs into arbitrary containers.
- FDO-PIDR4: The PID system which is used for the identification of FDOs must be global, robust, scalable, and demonstrate persistence.
→ dPIDs eliminate social contracts across all dimensions except persistence of the (meta)data itself, which is handled by replication middleware over the CODEX DHT.
- FDO-PIDR5: The PID System which is used for the identification of FDOs must support high security capabilities. The owner, or owner-delegated agent, of a PID and its associated attribute-value pairs is the only actor allowed to make changes, to define accessibility to attribute-value pair information, and to request encryption of information. And FDO-PIDR6: Management access to the PID system needs to be secured by a public key infrastructure and if necessary the use of standardised certificates must be possible.
→ dPID updates are handled by PKI (Public key infrastructure). Only the DID controller is allowed to make changes to the dPID.
FDO Spec Requirement - Selected Alignments
Relevant W3C DID Spec Compliance
8.1.1) A DID method specification MUST define exactly one method-specific DID scheme that is identified by exactly one method name as specified by the method-name rule in 3.1 DID Syntax.
Since DIDs are not governed by a central authority, we can choose any method name. So far nobody has taken the method name "dpid", so we are using it. We are currently working on registering it in the W3C DID specification registry and will list it there with our contact information once the documentation is complete.
8.1.2) The DID method specification MUST specify how to generate the method-specific-id component of a DID.
The dpid method uses a smart contract registry and underlying p2p file storage networks (DID Verifiable Data Registries) to enforce the assignment of a monotonically increasing identifier scoped to a target network.
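The assignment rule described above can be modelled with a toy, in-memory registry. This is only a sketch: `ToyRegistry`, its method names, the starting value of 0, and the placeholder controller strings are all hypothetical and do not reflect the actual smart contract interface.

```python
class ToyRegistry:
    """In-memory stand-in for a registry contract on one target network."""

    def __init__(self, network_id: int) -> None:
        self.network_id = network_id
        self._next_id = 0  # assumption: numbering starts at 0
        # identifier -> (controller, content hash of the registered payload)
        self._entries: dict[int, tuple[str, str]] = {}

    def register(self, controller: str, content_hash: str) -> int:
        """Assign the next identifier; identifiers only ever increase."""
        assigned = self._next_id
        self._entries[assigned] = (controller, content_hash)
        self._next_id += 1
        return assigned

    def lookup(self, identifier: int) -> tuple[str, str]:
        """Return the (controller, content hash) pair for an identifier."""
        return self._entries[identifier]


registry = ToyRegistry(network_id=1)
first = registry.register("controller-alice", "hash-of-payload-a")
second = registry.register("controller-bob", "hash-of-payload-b")
```

Since the counter is the only source of identifiers and is never decremented, two registrations on the same network cannot receive the same value.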
8.1.3) The DID method specification MUST define sensitivity and normalization of the value of the method-specific-id.
Our method-specific-ids are straightforward and produce normalized values because they are decimal-encoded integers scoped by hexadecimal-encoded integers.
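To make the normalization claim concrete, here is a sketch of a normalizer. The layout `did:dpid:<hex network>:<decimal identifier>` is an assumption made purely for illustration; the slides do not state the concrete syntax, separator, or ordering, and `normalize_dpid` is a hypothetical helper.

```python
import re

# ASSUMED layout (not taken from the dPID spec): did:dpid:<hex>:<decimal>
_DPID_PATTERN = re.compile(r"did:dpid:([0-9a-fA-F]+):([0-9]+)")


def normalize_dpid(did: str) -> str:
    """Return a canonical form: lowercase hex scope, no leading zeros."""
    match = _DPID_PATTERN.fullmatch(did)
    if match is None:
        raise ValueError(f"not a dpid DID under the assumed layout: {did!r}")
    network = int(match.group(1), 16)
    identifier = int(match.group(2), 10)
    return f"did:dpid:{network:x}:{identifier}"


canonical = normalize_dpid("did:dpid:00A5:0042")
```

Parsing both components as integers and re-encoding them removes case and leading-zero variation, so equal identifiers compare equal as strings.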
8.1.4) The method-specific-id value MUST be unique within a DID method. The method-specific-id value itself might be globally unique.
The dPID protocol is built specifically to serve as a GUPRI (Globally Unique Persistent Resolvable Identifier) to enable the FAIR principles for research objects. More information is available in our FAIR Implementation Plan, and we are currently working on the next iteration of our docs.
8.1.5) Any DID generated by a DID method MUST be globally unique.
Specifying the identifier and target network is sufficient for the dPID to be globally unique because the DID Verifiable Data Registry enforces uniqueness via smart contract logic. Additionally, the underlying target payloads are identified by cryptographic hashes.
8.2.1) A DID method specification MUST define how authorization is performed to execute all operations, including any necessary cryptographic processes.
Our first implementation of the dPID protocol allows for a did:pkh controller to cryptographically sign payloads.
8.2.2) A DID method specification MUST specify how a DID controller creates a DID and its associated DID document.
The dPID protocol specifies how did:pkh controllers can create cryptographically hashed research object payloads and then register them using the dPID Verifiable Data Registry, which is open and permissionless.
8.2.3) A DID method specification MUST specify how a DID resolver uses a DID to resolve a DID document, including how the DID resolver can verify the authenticity of the response.
The dPID protocol specifies how its method-specific-id can be broken down into target network and identifier in order to resolve the correct payload. Additionally, the authenticity of the response can be verified by 1) computing the cryptographic hash of the target payload, 2) verifying the signature of the resolved data, and 3) verifying the signature of the corresponding blockchain transaction. There are additional ways to verify the authenticity of the response depending on the use case.
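The three verification steps can be sketched as a single check. SHA-256 is assumed as the hash function, and the two signature checks are passed in as callables because the slides do not fix a signature scheme; `verify_resolution` and its parameters are hypothetical names.

```python
import hashlib
from typing import Callable


def verify_resolution(
    payload: bytes,
    registered_hash: str,
    check_data_signature: Callable[[bytes], bool],
    check_tx_signature: Callable[[], bool],
) -> bool:
    """Run the three authenticity checks in order, failing fast."""
    # 1) Recompute the payload hash and compare it with the registered one.
    if hashlib.sha256(payload).hexdigest() != registered_hash:
        return False
    # 2) Verify the signature over the resolved data (scheme supplied by caller).
    if not check_data_signature(payload):
        return False
    # 3) Verify the signature of the corresponding registry transaction.
    return check_tx_signature()


payload = b"research object manifest"
expected = hashlib.sha256(payload).hexdigest()
ok = verify_resolution(payload, expected, lambda data: True, lambda: True)
tampered = verify_resolution(b"altered", expected, lambda data: True, lambda: True)
```

In a real resolver the two callables would wrap the signature verification of whichever key type the controller uses; here they are stubs so the control flow stays visible.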
8.2.4) A DID method specification MUST specify what constitutes an update to a DID document and how a DID controller can update a DID document or state that updates are not possible.
A DID controller cannot update the dPID itself. A DID controller can update what the dPID targets, but every such update is required to produce a cryptographically verifiable, append-only audit log showing the history of updates. The controller may also update the target payload without updating the dPID target if the payload is also a DID subject they control. In the protocol this also results in a cryptographically verifiable append-only audit log; however, in this second DID update model it is possible to withhold updates from the public and optionally reveal them later, which is a feature.
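One common way to obtain a verifiable append-only log is a hash chain, sketched below. This is an illustrative technique, not necessarily the mechanism the dPID protocol uses; `UpdateLog` is a hypothetical class, and a plain string comparison stands in for the cryptographic signature check on the controller.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _entry_hash(prev: str, target: str) -> str:
    body = json.dumps({"prev": prev, "target": target}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class UpdateLog:
    controller: str
    entries: list[dict] = field(default_factory=list)

    def append(self, signer: str, target_hash: str) -> None:
        # Stand-in for signature verification: only the controller may update.
        if signer != self.controller:
            raise PermissionError("only the controller may update the target")
        prev = self.entries[-1]["entry_hash"] if self.entries else ""
        self.entries.append(
            {
                "prev": prev,
                "target": target_hash,
                "entry_hash": _entry_hash(prev, target_hash),
            }
        )

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = ""
        for entry in self.entries:
            expected = _entry_hash(entry["prev"], entry["target"])
            if entry["prev"] != prev or entry["entry_hash"] != expected:
                return False
            prev = entry["entry_hash"]
        return True

    def current_target(self) -> str:
        return self.entries[-1]["target"]
```

Each entry commits to its predecessor, so the latest entry hash summarizes the full update history and earlier versions remain addressable.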
8.2.5) The DID method specification MUST specify how a DID controller can deactivate a DID or state that deactivation is not possible.
Deactivation of a dPID is not possible; the only thing that would cause a resolution failure is a failure of the underlying p2p networks. It is possible to signal via an update that the dPID is no longer "active" in a practical sense, but due to the auditable nature of every update, this would be more of a gesture or formality.
Additional Reading Materials