Draft Indian Standard
Smart Cities - Data Exchange Framework: Part 1 Reference Architecture
© BIS 2019
BUREAU OF INDIAN STANDARDS
MANAK BHAVAN, 9 BAHADUR SHAH ZAFAR MARG
NEW DELHI 110002
Smart Infrastructure Sectional Committee, LITD 28
This is a First Revision of the preliminary draft of the standard reference architecture for data exchange, under review at the Bureau of Indian Standards, Smart Infrastructure Sectional Committee, LITD 28.
This standard has been requested by the Ministry of Housing and Urban Affairs, Government of India.
The next phase of smart city implementations will leverage data empowerment to harness the maximum value from the enormous volume of data that cities generate. Current smart city implementations cannot satisfy this need efficiently because their interfaces and implementations are proprietary and ad hoc. It is therefore difficult, within the current framework, to develop next-generation AI/ML-based applications that provide new solutions and services at scale. The Data Exchange Framework discussed in this document aims to address this gap by creating a reference architecture (part 1) and interface specifications (part 2) for interconnecting the IT systems of different government departments and external organizations.
The data exchange will provide two key services:
Security and Privacy will be incorporated by design in this architecture. This framework should simplify the life of the data custodian as well as the application developer.
The data exchange framework will enable new applications that draw on data from different IT systems to provide novel services. For example, a women’s safety index could calculate the live safety index of any street by combining data from smart streetlights, video analytics from traffic cameras and data from police databases with an analysis of land use. Trip-planning apps could use such an index to determine safe routes, and the city or the police could use it to plan patrols.
By defining the reference architecture and specifying the interfaces and data models, the data exchange framework standards will enable a whole new ecosystem of application developers to provide new, data-driven solutions and services. Additionally, adopting the data exchange framework nationally will enable economies of scale for developers and allow the same applications to run across the country. For data custodians, the data exchange framework will provide a simple way to expose their data, give consent, and audit and track its usage.
This document describes the reference architecture for the data exchange framework, the use cases that are enabled in this ecosystem, and the responsibilities of the various stakeholders and their interactions with one another.
It describes the high level architecture of the three main components of the data exchange services:
a) Catalogue service, that provides a framework to manage meta-information about resources,
b) Authorisation service, that manages authorisation to access the resources,
c) Resource Access Service, that provides a standardized way to access resources.
The current document also gives a high-level definition of the various interfaces. A more detailed specification for the data exchange framework is given in part 2.
The following referenced documents are necessary for the application of the present document.
 IETF RFC 7231: "Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content". Available at https://tools.ietf.org/html/rfc7231.
 IETF RFC 7232: "Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests". Available at https://tools.ietf.org/html/rfc7232.
 IETF RFC 3986: "Uniform Resource Identifier (URI): Generic Syntax". Available at https://tools.ietf.org/html/rfc3986.
 IETF RFC 8288: "Web Linking". Available at https://tools.ietf.org/html/rfc8288.
 IETF RFC 7946: "The GeoJSON Format". Available at https://tools.ietf.org/html/rfc7946.
 IETF RFC 8141: "Uniform Resource Names (URNs)". Available at https://tools.ietf.org/html/rfc8141.
 Open Geospatial Consortium Inc. OGC 06-103r4: "OpenGIS® Implementation Standard for Geographic information - Simple feature access - Part 1: Common architecture". Available at https://portal.opengeospatial.org/files/?artifact_id=25355.
 UN/CEFACT Common Codes for specifying the unit of measurement. Available at http://www.unece.org/fileadmin/DAM/cefact/recommendations/rec20/rec20_Rev9e_2014.xls.
 IETF RFC 7396: "JSON Merge Patch". Available at https://tools.ietf.org/html/rfc7396.
 ISO 8601: 2004: "Data elements and interchange formats -- Information interchange -- Representation of dates and times". Available at http://www.iso.org/iso/catalogue_detail?csnumber=40874.
 IETF RFC 2818: "HTTP Over TLS". Available at https://tools.ietf.org/html/rfc2818.
 IETF RFC 5246: "The Transport Layer Security (TLS) Protocol Version 1.2". Available at https://tools.ietf.org/html/rfc5246.
 IANA Registry of Link Relation Types. Available at https://www.iana.org/assignments/link-relations/.
 ISO/IEC 29100:2011(en) Information technology — Security techniques — Privacy framework. Available at https://www.iso.org/obp/ui/#iso:std:iso-iec:29100:ed-1:v1:en
 The OAuth 2.0 Authorization Framework. Available at https://tools.ietf.org/html/rfc6749
 OAuth 2.0 Token Revocation. Available at https://tools.ietf.org/html/rfc7009
 MQTT 5.0, OASIS Standard. Available at https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html
 ISO/IEC 19464: Information technology — Advanced Message Queuing Protocol (AMQP) v1.0 specification. Available at https://standards.iso.org/ittf/PubliclyAvailableStandards/c064955_ISO_IEC_19464_2014.zip
 IETF RFC 6455, The WebSocket Protocol. Available at http://www.ietf.org/rfc/rfc6455.txt
 IETF RFC 2326, Real Time Streaming Protocol (RTSP). Available at http://www.ietf.org/rfc/rfc2326.txt
 ISO 19119:2016 Geographic Information - Services. Available at https://www.iso.org/standard/59221.html
[i.1] The Personal Data Protection Bill 2018, Govt. of India, http://meity.gov.in/writereaddata/files/Personal_Data_Protection_Bill,2018.pdf
[i.2] Electronic Consent Framework, Technical Specs v1.1, http://dla.gov.in/sites/default/files/pdf/MeitY-Consent-Tech-Framework%20v1.1.pdf
[i.3] National Data Sharing and Accessibility Policy 2012, Govt. of India, https://data.gov.in/sites/default/files/NDSAP.pdf
[i.4] Account Aggregator Technical Standards, Version 1.2, Reserve Bank Information Technology Pvt. Ltd., https://api.rebit.org.in/group
[i.5] Policy on Open Programming Interfaces of Govt. of India, 2015. http://meity.gov.in/writereaddata/files/Open_APIs_19May2015.pdf
[i.6] User-Managed Access (UMA) 2.0, https://docs.kantarainitiative.org/uma/ed/uma-core-2.0-08.html
[i.7] User-Managed Access (UMA) 2.0 Grant for OAuth 2.0 Authorization. Available at https://docs.kantarainitiative.org/uma/wg/rec-oauth-uma-grant-2.0.html
[i.8] Federated Authorization for User-Managed Access (UMA) 2.0. Available at https://docs.kantarainitiative.org/uma/wg/rec-oauth-uma-federated-authz-2.0.html
[i.9] ETSI GS CIM 009 V1.1.1 (2019-01), Context Information Management (CIM); NGSI-LD API. Available at https://www.etsi.org/deliver/etsi_gs/CIM/001_099/009/01.01.01_60/gs_CIM009v010101p.pdf
[i.10] ONVIF Network Interface Specifications. Available at https://www.onvif.org/profiles/specifications/
[i.11] HTTP Live Streaming. Available at https://tools.ietf.org/html/draft-pantos-http-live-streaming-23
[i.12] PAS 212:2016 Automatic resource discovery for the internet of things. Specification. Available at https://shop.bsigroup.com/forms/PASs/PAS-212-2016-download/
[i.13] ISO/IEC 27001, Information technology – Security techniques – Information security management systems – Requirements
[i.14] ISO/IEC 27002, Information technology – Security techniques – Code of practice for information security controls
[i.15] ISO/IEC 27017, Information technology – Security techniques – Code of practice for information security controls based on ISO/IEC 27002 for cloud services
[i.16] ISO/IEC 27018, Information technology – Security techniques – Code of practice for protection of personally identifiable information (PII) in public clouds acting as PII processors
[i.17] ISO/IEC 27031, Information technology – Security techniques – Guidelines for information and communication technology readiness for business continuity
[i.18] ISO/IEC 27033 (all parts), Information technology – Security techniques – Network security
[i.19] ISO/IEC 27034 (all parts), Information technology – Security techniques – Application security
[i.20] ISO/IEC 27035 (all parts), Information technology – Security techniques – Information security incident management
[i.21] ISO/IEC 27040, Information technology – Security techniques – Storage security
ISO/IEC 29100, Information technology – Security techniques – Privacy framework
[i.22] ISO/IEC 29101, Information technology – Security techniques – Privacy architecture framework
[i.23] ISO/IEC 29134:2017, Information technology – Security techniques – Guidelines for privacy impact assessment
[i.24] ISO/IEC 29151, Information technology – Security techniques – Code of practice for personally identifiable information protection
Table 3-1: Terminology
Legal Entity: Human (possibly delegated by an Organization), Organization or an organizational role that has responsibility to provide authorisation to use resources.
Service: Serves resources to authorized Apps/Consumers.
Legal Entity: Human or Organization or an organizational Role that consumes a resource via a web or mobile App.
Application: Software (like a mobile app, web app, device app or server app), that uses resources to provide a service or experience to the Consumer.
Application: An App that enables a Provider to manage the meta-data and access control in the data exchange, for the resources they are responsible for.
Data Exchange Framework
Service: Hosts and manages meta-data about resources and manages authorisation for accessing the resources.
Provider’s freely given, specific and informed agreement to the accessing and processing of specific resources in their responsibility.
A machine-readable electronic document that specifies the parameters and scope of data sharing that a Provider consents to in any data sharing transaction.
Personally Identifiable Information
Any information that (a) can be used to identify the PII principal to whom such information relates, or (b) is or might be directly or indirectly linked to a PII principal
Note 1 to entry: To determine whether a PII principal is identifiable, account should be taken of all the means which can reasonably be used by the privacy stakeholder holding the data, or by any other party, to identify that natural person.
Natural person to whom the personally identifiable information (PII) relates
Note 1 to entry: Depending on the jurisdiction and the particular data protection and privacy legislation, the synonym “data subject” can also be used instead of the term “PII principal”.
A digital entity that is used to present the authorization credentials to the Resource Server.
A registry of meta-data about the resources in the data exchange available for consumption
An entry in the Catalogue that describes the meta-information of the resource that is hosted in an associated Resource Server
DX Authorization Service
Authorization Service of the data exchange
DX Catalogue Service
Catalogue Service of the data exchange
Adapter service in front of a non-DX compliant Resource Server
Legal Entity: Responsible for administering, managing and running the data exchange
DX Certificate Authority
Service: Certificate Authority service run by the DX
Table 3-2: Abbreviations
XML: eXtensible Markup Language
JSON: JavaScript Object Notation
API: Application Programming Interface
PII: Personally Identifiable Information
TLS: Transport Layer Security
CSR: Certificate Signing Request
CCA: Controller of Certifying Authorities
The Data Exchange Framework is a set of services that enables consumption of resources (like data) by a Consumer from one or more Resource Servers, based on explicit Consent obtained from the Provider of the resources.
The implementation guide adheres to the following guidelines:
The following table outlines the roles and responsibilities of the various entities involved in the data exchange ecosystem.
Table 4-1: Entities and their roles
Roles and Responsibilities
Legal Entity: Human or Organization that owns a resource.
Legal Entity: Human or Organization that consumes a resource.
Service: Hosts and manages meta-data about resources and manages authorisation for those.
Service: Serves Provider’s resources to authorized clients.
Application: A Consumer’s application, that consumes resources. Can be a mobile app, web app or server app.
Service: Provides identities for various entities
Service: Provides Digital Certificates. Their trust can be traced to the Root Certificate Authorities.
Application: Helper app used by Providers to manage interactions with the data exchange. Can be a mobile app, web app or server app.
Legal Entity: Develops Apps that consume (or produce) resources.
Data Exchange Provider
Legal Entity: Operates the data exchange.
Figure 4-1: Data Exchange Architecture
Resources, managed by a Provider, are hosted on one or more Resource Servers. They are made available for consumption through a catalogue in the Data Exchange, which describes their meta-information (such as format, Provider, etc.). The Catalogue is both human-readable and machine-readable.
The Provider registers and manages the meta-data of its resources and their associated access control policies via the management interface of the data exchange. The Provider may use a helper application, like the ProviderApp. The meta-data of each resource should make it easier for an app developer to consume the resource and so create useful applications for Consumers.
The App can register with the Data Exchange to get notified about any changes to the meta-data of the resources of interest to the Consumer. The App obtains consent to consume the resources via the authorisation interface by obtaining an Authorization Token.
Any request to a Provider’s resource by a Consumer App will be checked against the existing access control policies. If no decision can be made, the Data Exchange will coordinate between the Provider and the Consumer to complete a consent transaction and generate the Consent Artefact. The Consent Artefact will be used to update the access control policy for those resources, and will be logged by the system to ensure auditability of the consent flows. The coordination between the Provider and the Consumer is outside the scope of this specification and can be built using any available messaging technology, such as SMS, OTP or email. The data licensing terms and conditions are also outside the scope of this specification. However, a reference to the license may be provided in the meta-data for the resource.
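The decision flow described above can be sketched as follows. This is a minimal illustration and not part of the specification: the names `ConsentArtefact`, `PolicyStore`, `authorize` and the `ask_provider` callback are hypothetical, and the out-of-scope coordination with the Provider (SMS, OTP, email) is represented by a plain function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ConsentArtefact:
    """Hypothetical record of one completed consent transaction."""
    provider: str
    consumer: str
    resource: str
    granted: bool


@dataclass
class PolicyStore:
    """Hypothetical in-memory access control policies plus an audit log."""
    decisions: Dict[Tuple[str, str], bool] = field(default_factory=dict)
    audit_log: List[ConsentArtefact] = field(default_factory=list)

    def lookup(self, consumer: str, resource: str) -> Optional[bool]:
        # None means the existing policies cannot reach a decision.
        return self.decisions.get((consumer, resource))

    def apply(self, artefact: ConsentArtefact) -> None:
        # Log the artefact for auditability, then update the policy.
        self.audit_log.append(artefact)
        self.decisions[(artefact.consumer, artefact.resource)] = artefact.granted


def authorize(
    store: PolicyStore,
    provider: str,
    consumer: str,
    resource: str,
    ask_provider: Callable[[str, str], bool],
) -> bool:
    """Check existing policy first; fall back to a consent transaction."""
    decision = store.lookup(consumer, resource)
    if decision is not None:
        return decision
    granted = ask_provider(consumer, resource)  # out-of-band in a real system
    store.apply(ConsentArtefact(provider, consumer, resource, granted))
    return granted
```

In this sketch a second request for the same resource is answered from the updated policy, so the Provider is asked only once.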
In order to provide for a better consumer experience, the App Developer may also enter into a resource licensing agreement with the Provider. In this case, the Consumer can be shielded from having to get consent separately, as long as the App Developer and the App adhere to the licensing terms and conditions of the Provider.
The set of interfaces for this data exchange framework are listed below:
Table 4-2: Data Exchange Interfaces and functionalities
Create, Update, Delete, List, Search and View the items in the catalogue.
Create, Update, Delete and View the access control policies.
List and View information about consumers.
Interface for the Provider to manage the resource meta-info and access-control policies.
List, Search, View and Count items in the catalogue
Search can use complex queries and filters involving Geo, Time and other Attributes
Request, Grant, Revoke, Introspect Access Tokens for resources.
Federated, OAuth style authorization flows.
Get Latest Data, Search for resources, Get Status, Get Counts, Subscribe, Update Subscription, Unsubscribe.
Playback of live and archived media streams, download media files. Start, Pause and Stop playback.
GIS resources access.
Service resources access.
Retrieve latest data and search for resources using complex queries and filters. Search can be on Geo, Time and other Attributes.
Existing international standards for access of specific resource types will be used wherever applicable.
This interface connects with external identity management systems. It can be implemented using OpenID, LDAP or similar systems. Defining it is currently out of scope for this standard.
Get consent from Provider
This interface gets the consent from the Provider for accessing protected, private or confidential data. Since it involves interactions with humans, it is not currently defined as part of this standard. SMS, phone, mail, etc. can be used. Note that in the case of embedded PII, the Provider has the responsibility to get consent from the PII Principals.
These interfaces are specified in detail in Part 2 of this standard.
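As an informal illustration of the token operations listed in Table 4-2 (request, grant, revoke, introspect), the sketch below keeps tokens in memory. It is not the Part 2 interface: the class `TokenService`, its method names, the `allowed` set and the returned dictionary layout are all assumptions made for this example.

```python
import secrets
import time
from typing import Dict, Iterable, Optional, Tuple


class TokenService:
    """Hypothetical in-memory stand-in for the authorisation interface."""

    def __init__(self, allowed: Iterable[Tuple[str, str]], ttl_seconds: float = 3600.0) -> None:
        self._allowed = set(allowed)          # (consumer, resource) pairs permitted by policy
        self._ttl = ttl_seconds
        self._tokens: Dict[str, dict] = {}

    def request(self, consumer: str, resource: str, now: Optional[float] = None) -> Optional[str]:
        """Grant a token if policy allows the pair, otherwise return None."""
        if (consumer, resource) not in self._allowed:
            return None
        issued = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {
            "consumer": consumer,
            "resource": resource,
            "expires_at": issued + self._ttl,
        }
        return token

    def revoke(self, token: str) -> bool:
        """Remove a token; returns False if it was not known."""
        return self._tokens.pop(token, None) is not None

    def introspect(self, token: str, now: Optional[float] = None) -> dict:
        """Report whether a token is currently usable and what it covers."""
        current = time.time() if now is None else now
        info = self._tokens.get(token)
        if info is None or info["expires_at"] <= current:
            return {"active": False}
        return {"active": True, **info}
```

A Resource Server in this sketch would call `introspect` before serving a request and refuse access when `active` is false.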
There are no deployment restrictions imposed by the DX. Data exchange using the said reference architecture and the associated interface specifications, can be deployed in multiple different ways and DX does not favour or impose any specific deployment model.
The identities of Providers and App Developers SHOULD be established via X.509 certificate chains. The identity of Consumers MAY be established via X.509 certificates or ID tokens (OpenID Connect, SAML 2.0 or other industry standards). The provisioning and management of these certificates or tokens is outside the scope of this standard.
The two main services provided by the Data Exchange are the catalogue service, that allows management and search of meta-data about resources, and the Authorisation service, that manages authorisation to access the resources. These are described in more detail next.
On a high level the catalogue service enables the following:
At the core, DX catalogue is a store of meta-information associated with the data assets/resources available with the data exchange. A meta-information object may be related to another meta-information object by providing explicit references to one another. Further, using concepts of linked-data, semantic grounding is provided for the attributes contained in the meta-information objects. The catalogue service layer, built on top of the meta-information store, provides powerful search capabilities to discover resources of interest and their associated meta-information (e.g., data models, api objects etc.). Additionally, the catalogue provides services to build and maintain the meta-information store in a consistent and collaborative fashion.
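The two ideas in this paragraph, explicit references between meta-information objects and search over the store, can be illustrated with a small in-memory sketch. The class `MetaStore`, its methods and the attribute names used in the usage note are hypothetical and are not taken from the catalogue specification.

```python
from typing import Dict, List


class MetaStore:
    """Hypothetical in-memory meta-information store."""

    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}

    def put(self, item: dict) -> None:
        # Each object is keyed by its "id" attribute (an assumption of this sketch).
        self._items[item["id"]] = item

    def search(self, **criteria: object) -> List[dict]:
        # Return every object whose attributes equal all the given criteria.
        return [
            item for item in self._items.values()
            if all(item.get(key) == value for key, value in criteria.items())
        ]

    def resolve(self, item_id: str, attribute: str) -> dict:
        # Follow an explicit reference from one object to another.
        return self._items[self._items[item_id][attribute]]
```

For example, after storing a provider object with id `"p1"` and a resource object whose `provider` attribute is `"p1"`, `resolve("r1", "provider")` returns the provider object, and `search(itemType="resourceItem")` returns only the resource object.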
Details of catalogue information model and various catalogue objects are provided in Section 6.
Data Exchange catalogue exposes services via a set of APIs built on top of the meta-information store. The APIs can be broadly categorised into two sets:
The details of catalogue APIs are provided in part 2 of the data exchange standard.
The main goal of the data exchange framework (DX) is to enable seamless sharing of resources while respecting ownership, privacy and compliance requirements. DX achieves this by defining a set of open standards for authorization, data classification, and policy authoring, and providing sample implementations according to these standards. The standards enable data providers and application developers to target a consistent set of APIs for authoring policies and accessing data across smart city platforms. When DX is used for sharing sensitive data or PII, the standards ensure that PII principals retain control over data shared on the platform in accordance with the strongest privacy regulations.
The authorization service in DX is designed to reduce barriers to adoption. In particular, resource providers should be able to start sharing resources with authorized entities with minimal effort. Towards this end, DX will support mechanisms for plugging existing non-DX-compliant resource and authorization servers into the DX ecosystem with simple extensions. The authorization service also supports a simple-to-understand data classification framework and policy authoring tools to help providers migrate to the DX framework.
The following aspects of resource sharing are out of scope of DX authorization service.
The authorization service in DX should support the following functionalities.
The main actors of the architecture are:
Figure 4-3: Authorization Flow
The following aspects of data sharing are not in scope of DX.
DX shall enable Providers to associate a policy with every resource-item in the catalogue. The policy should govern who has access to the resource described by the item and how entities with access should handle storage of the resource’s data. Associating a policy with attributes within a resource is currently not supported, because the resource is accessed at object granularity in DX. In order to associate policies at a finer granularity, providers may define views over the underlying raw data and associate a policy with every view.
A policy P is a pair of the form (W, H), where
Table 5-1: Authorization policy attributes and values.
Authorization protocol and policy
Specifies protocol that a consumer must use to request access from a resource server
None, OAuth/UMA, OAuth/UMA + XACML Policy, Token/IUDX + Aperture policy language
Specifies requirements on where a consumer is allowed to store data after getting access
Country, State, Organization (Service-based access), None
Specifies requirements on how long consumers may retain data
Fixed period, Up to a certain event or date, None
Specifies in what form a consumer may store data
Encrypted (using adequately protected keys), Encrypted (using keys owned by data owner), Any
Specifies requirements on the purpose for which data is used.
Privacy preserving computation, Anonymization, Computation certified by a third party, Any
Specifies audit requirements on data that the consumer must satisfy
Audit accesses along with time and duration of access, None
This list is by no means complete. For example, the policy framework does not capture requirements around ownership of data. Further additions to the list of attributes or attribute values may be made over time while maintaining backward compatibility.
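A rough sketch of how a policy could be attached to a catalogue item, using the attribute values from Table 5-1, is shown below. The field names, the `attach` and `policy_for` helpers and the item identifiers are hypothetical; only the value strings are taken from the table. The second entry illustrates the view-based approach to finer granularity: the view is a separate item with its own policy.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class HandlingPolicy:
    """Hypothetical container; defaults are the least restrictive table values."""
    protocol: str = "None"           # authorization protocol and policy
    storage_location: str = "None"   # where a consumer may store data
    retention: str = "None"          # how long a consumer may retain data
    storage_form: str = "Any"        # in what form a consumer may store data
    purpose: str = "Any"             # purpose for which data is used
    audit: str = "None"              # audit requirements on the consumer


# Policies attach to whole catalogue items, not to attributes inside them.
policies: Dict[str, HandlingPolicy] = {}


def attach(item_id: str, policy: HandlingPolicy) -> None:
    policies[item_id] = policy


def policy_for(item_id: str) -> Optional[HandlingPolicy]:
    return policies.get(item_id)


# Raw data gets a strict policy (item ids are made up for this example).
attach("traffic-cameras/raw", HandlingPolicy(
    protocol="OAuth/UMA",
    storage_location="Country",
    retention="Fixed period",
    storage_form="Encrypted (using keys owned by data owner)",
    purpose="Privacy preserving computation",
    audit="Audit accesses along with time and duration of access",
))

# A derived view over the same data is a separate item with a looser policy.
attach("traffic-cameras/vehicle-counts", HandlingPolicy(protocol="OAuth/UMA"))
```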
In order to simplify the process of authoring policies, we define a set of policy labels that capture commonly used policy specifications. The policy labels are informative only and may evolve over time.
Table 5-2: Some standard policy Labels
Nature of data
Contains no personal information
Contains anonymized information
May contain personally identifiable information
May contain personally identifiable information and/or other data that is confidential within the organization
Authorization protocol and policy
Requires authorization using IUDX/UMA, no custom auth policy
Requires authorization using IUDX/UMA, custom auth policy specified in a policy language
Requires authorization using IUDX/UMA, custom auth policy specified in a policy language
Requires consent of owners
Requires consent of owners
Configurable or as per regulatory framework
Only service based access
Configurable or as per regulatory framework
Licensed with legal framework
Licensed with legal framework
Not to be monetized
Identity of producers/consumers of data in DX is through:
DX shall honour certificates from any licensed CA in India (certified by the CCA). Also, DX may issue certificates to users based on their email ids. DX shall host a certificate authority (CA) which will grant certificates to:
Individuals, app developers, or employees of an organization who wish to access protected, private, or confidential data must hold a valid certificate.
Individuals and App developers should send a certificate signing request (CSR) as an attachment in an email to the DX Administrator with subject "Certificate request". The DX CA will validate the CSR and respond with a certificate.
Organizations should send a certificate signing request (CSR) as an attachment in an email from their organization domain to the DX Administrator. The DX CA will validate the CSR and respond with a certificate. The certificate provided to an organization can only be used to grant certificates to employees of that organization. Thus, the domain name of the organization must match the domain of the e-mail address of the employee to whom the certificate is granted.
To be able to request certificates from the DX CA, organizations shall register themselves with the DX CA (this may be through an online form). The domain names of all registered organizations shall be added to a white-list, and DX will only honour certificate requests from organizations in the white-list.
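The two checks described here, white-list membership of the organization and a domain match between the organization and the employee's e-mail address, can be sketched as below. The function names and the `.example` domains are hypothetical, the white-list is assumed to hold lower-case domain names, and real CSR validation (signature, key strength, subject fields) is deliberately left out.

```python
from typing import Set


def email_domain(address: str) -> str:
    """Return the lower-cased domain part of an e-mail address."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an e-mail address: {address!r}")
    return domain.lower()


def may_issue_employee_certificate(
    org_domain: str, employee_email: str, whitelist: Set[str]
) -> bool:
    """True only if the organization is registered and the e-mail domain matches it."""
    org = org_domain.lower()
    return org in whitelist and email_domain(employee_email) == org
```

With a white-list of `{"city.example"}`, a request for `asha@city.example` under `city.example` passes, while `asha@other.example` is rejected.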
Organizations while generating a certificate may add more details about the employee such as: organization, organization unit, first name, last name, role of the employee, state, city, etc.
Employees of an organization may send a certificate signing request (CSR) as an attachment in an email to the DX Administrator. The organization, acting as a sub-CA or a registration authority, will validate the CSR and respond with a certificate. The scripts to grant certificates may be provided by the DX to organizations.
DX CA shall issue 5 classes of certificates:
All Interfaces should make statistics available and log all events to allow audit of interactions.
The systems must be designed to be highly available, scalable using a distributed architecture for vertical and horizontal scale, and high on performance. The APIs must have high uptime and a public API Status Page must be provided by the Authorization Service Provider, Catalogue Service Provider and the Resource Server that reports the same for each of the endpoints (along with other open data like average response time, latency, etc). Furthermore, these must also implement the Heartbeat API for reporting their system uptime in real-time.
This section explains how the various failure scenarios must be handled [TBD]:
In this scenario, when the AS or RS is not able to notify the Provider on the status of the consent flow or data flow, a mechanism has to be put in place to notify the Provider at a later stage. This can be achieved by reinitiating the notification message to the Provider or by providing the Provider an option to check the status through an application, or by providing a list of all consent flows and data flows (with status) in the application.
In this scenario, when the response sent by AS does not reach the Consumer, the latter should have a mechanism provided by AS to initiate a request to know the status of the consent flows.
In this scenario, when the response sent by catalogue service does not reach the Consumer, the Consumer should have a mechanism provided by catalogue service to initiate a request to know the status of catalogue services.
In this scenario, when RS is not available to Consumer, Consumer may have a mechanism to re-initiate the request to RS.
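One possible shape for the Consumer-side re-initiation mechanism is a bounded retry loop. The sketch below is an assumption of this example and not a requirement of the standard: the function name `call_with_retry`, the use of `ConnectionError` to signal an unavailable Resource Server, and the exponential back-off schedule are all illustrative choices.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); on ConnectionError wait and try again, up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return request()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up and surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so that the waiting behaviour can be replaced in tests.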
Conceptually, the catalogue is a collection of items with each item containing a set of meta-information attributes along with their values. Each item belongs to an itemType which serves to categorise the type of meta-information contained in it. Each itemType is associated with a ‘schema’ which defines a mandatory set of attributes an item must contain. Other than the mandatory set of attributes, an item may contain additional custom attributes to augment its information content. IUDX catalogue supports the following itemTypes (see Table 6-1): ‘resourceItem’, ‘resourceServer’, ‘provider’, ‘resourceServerGroup’ and ‘catalogueItem’.
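The relationship between an item, its itemType and the mandatory attributes of that itemType can be sketched as a simple check. The mandatory attribute sets below are placeholders invented for the example; the actual sets are defined by the schema of each itemType.

```python
from typing import Dict, List, Set

# Hypothetical mandatory attribute sets, keyed by itemType.
MANDATORY: Dict[str, Set[str]] = {
    "resourceItem": {"id", "itemType", "provider"},
    "provider": {"id", "itemType", "name"},
}


def missing_attributes(item: dict) -> List[str]:
    """Return the mandatory attributes that the item lacks, sorted by name."""
    item_type = item.get("itemType")
    if item_type not in MANDATORY:
        raise ValueError(f"unknown itemType: {item_type!r}")
    # Custom attributes are allowed, so only absences are reported.
    return sorted(MANDATORY[item_type] - set(item))
```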
Each attribute in an item belongs to one of the attribute types chosen from a pre-defined set referred to as IUDX core attribute types. A core attribute type provides semantic context for an attribute and also defines the syntactic structure of that attribute. The syntactic structure is designed to be extensible such that additional information fields may be included apart from the minimal set of fields defined for a given core attribute type. IUDX catalogue supports the following core attribute types: ‘Property’, ‘Relationship’, ‘GeoProperty’, ‘QuantitativeProperty’ and ‘TimeProperty’.
The catalogue items are linked data objects. The mandatory attributes, and preferably the custom attributes as well, of an item are mapped to discoverable universal resource identifiers. That is, the attributes are provided with a context. The context for a given attribute may contain information on how the attribute should be interpreted. An attribute may contain linkages, using linked data primitives, to other attributes from other vocabularies/taxonomies thereby leading to further enhancement in its interpretability by machines (and humans). Further, different attributes from various items can point to the same resource identifier which provides a simple mechanism to harmonise semantically similar yet syntactically different attributes contained in different items.
Figure 6-1 : IUDX Catalogue information model
Figure 6-1 summarizes the catalogue information model. The base layer consists of core attribute types. An attribute will, in general, have a ‘type’ and ‘value’ and may have other meta-properties associated with it to additionally describe the attribute. ‘type’ identifies the core attribute type and must have one of the following values: ‘Property’, ‘Relationship’, ‘GeoProperty’, ‘QuantitativeProperty’ and ‘TimeProperty’. ‘value’ contains the values assigned to the attribute and it may range from simple objects, like strings or numbers, to complex objects, like, ‘GeoJSON’ objects etc. IUDX core attribute types are described in Section 6.2.
Common attributes, which necessarily extend one of the core attribute types, serve to give specific meaning and interpretation to an attribute value. For example, ‘location’, which is of type ‘GeoProperty’, is used to specifically represent a geo-spatial point with just one set of latitude and longitude coordinates. Similarly, ‘createdAt’ is of type ‘TimeProperty’, which is used to represent the time (in a standard format) when a given catalogue item is created. Common attributes have been defined to capture commonly used concepts across IUDX catalogue items. Further, all the mandatory attributes contained in various catalogue items necessarily come from the common attribute set.
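The ‘type’ and ‘value’ structure of an attribute, and the two common attributes named above, might look as follows. The helper `make_attribute`, the coordinates and the timestamp are illustrative assumptions; the coordinate order (longitude first) follows the GeoJSON convention.

```python
CORE_ATTRIBUTE_TYPES = frozenset({
    "Property",
    "Relationship",
    "GeoProperty",
    "QuantitativeProperty",
    "TimeProperty",
})


def make_attribute(core_type: str, value: object, **meta: object) -> dict:
    """Build an attribute object; extra keyword arguments become meta-properties."""
    if core_type not in CORE_ATTRIBUTE_TYPES:
        raise ValueError(f"not a core attribute type: {core_type!r}")
    # Core fields are written last so meta-properties cannot overwrite them.
    return {**meta, "type": core_type, "value": value}


# 'location': a single geo-spatial point as a GeoJSON Point (longitude, latitude).
location = make_attribute(
    "GeoProperty", {"type": "Point", "coordinates": [77.2410, 28.6280]}
)

# 'createdAt': creation time of the item in ISO 8601 format.
created_at = make_attribute("TimeProperty", "2019-06-01T10:00:00+05:30")
```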
An itemType is used to describe and give a semantic interpretation to a collective set of attributes contained in an item. For example, an item of type ‘resourceItem’ is used to describe meta-information associated with an IUDX Resource, like, how to access data from this resource, links to the data model used to describe data from this resource, discovery tags associated with this resource, information about the Provider of this data (who can authorize access) etc. Similarly, a ‘provider’ item captures information about Provider entity. As mentioned before, each itemType specifies a set of mandatory attributes to be included in any item of this itemType and these attributes necessarily belong to the common attribute set.
A data model describes meta-information attributes and their syntactic structure for a given application domain. The catalogue framework allows, using the concepts of linked data, reuse of attributes from other domain-specific vocabularies. Note that defining a specific data model is out of scope for the catalogue specifications. However, for consistency, attributes defined in data models should follow the same general set of rules, specified later, as are followed for IUDX meta-information attributes. The intent here is to specify a robust framework that allows specification, reuse and consensus building for domain-specific data models.
IUDX catalogue is based on JSON-LD and JSON-schema. JSON-LD framework is used to provide linked data encodings to the attributes in the catalogue items. The context mappings link attributes to vocabulary (or ontology) of interest which provides additional meaning and context to it. IUDX catalogue uses JSON schema framework for representational purposes. In particular, JSON schema is used for describing and validating the structure of catalogue items.
All catalogue items are JSON-LD documents and must include an “@context” field. This field contains the JSON-LD context, which maps attributes to IRIs (Internationalized Resource Identifiers as described in [RFC3987]) and thereby provides unambiguous identification of these attributes. IUDX will necessarily provide the context for IUDX core attribute types and IUDX common attributes. For additional attributes, it is recommended that the provider of these items use context from the IUDX vocabulary and/or from existing vocabularies, e.g., schema.org, GeoJSON-LD, etc.
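The requirement that every attribute be grounded through “@context” can be illustrated by a simple (non-normative) check. The Python sketch below assumes that the context is given inline as a JSON object (a context may also be referenced by URL, which this sketch does not resolve); the item content, the attribute ‘vendorNote’ and the example.org IRIs are hypothetical.

```python
# Hypothetical catalogue item; the example.org IRIs are placeholders,
# not values defined by IUDX.
item = {
    "@context": {
        "itemDescription": "https://example.org/vocab#itemDescription",
        "tags": "https://example.org/vocab#tags",
    },
    "itemDescription": "Air quality sensor",
    "tags": ["air quality", "co2"],
    "vendorNote": "no mapping provided for this attribute",
}


def unmapped_attributes(doc):
    """Return the attribute names that the inline "@context" does not map to an IRI."""
    if "@context" not in doc:
        raise ValueError('catalogue item has no "@context" field')
    context = doc["@context"]
    if not isinstance(context, dict):
        # Assumption of this sketch: contexts referenced by URL are not resolved.
        raise TypeError("only inline context objects are handled in this sketch")
    return sorted(
        key for key in doc
        if not key.startswith("@") and key not in context
    )


missing = unmapped_attributes(item)  # attributes lacking an IRI mapping
```

An implementation could run such a check when an item is submitted and ask the provider to supply context for any attribute reported as unmapped.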
JSON schemas are used to specify the structure of catalogue items. Each itemType has an associated JSON schema which defines the syntactic structure of an item belonging to this itemType. The schemas for all itemTypes are collectively referred to as base schemas.
JSON-schemas are also used to specify the domain specific data models. Data models describe attributes that contain meta-information about a data resource.
In addition, the JSON schema ‘definitions’ and/or ‘properties’ which are dereferenceable JSON pointers, and hence valid IRI links, can also be used to provide IRI references to be used in “@id” or “@type” fields from within the JSON-LD objects.
The base schemas and data models are not stored as catalogue items. However, a repository of base schemas and data models will be available to IUDX catalogue implementations.
IUDX core attribute types represent a minimal set of types/classes to which the various meta-information attributes belong. Each core attribute type has a well defined structure which is extensible, so that additional information fields can be included for a given attribute. Since every attribute belongs to one of the core types, a partially known structure is imposed on all attributes, especially the ones that are not a part of the IUDX common attribute set. Such an explicit categorisation also allows for targeted operations on relevant core attribute types, e.g., geo-spatial searches, time-based searches, etc.
IUDX Catalogue information model supports the following core attribute types: Property, Relationship, GeoProperty, TimeProperty, QuantitativeProperty.
An attribute of type ‘Property’ is a JSON object whose key is the attribute name, which is mapped to an IRI using “@context” field of the item containing the attribute, and whose value is a JSON-LD object that includes the following keys:
The ‘Property’ attribute is the most general of the core attribute types. In fact, as will be seen later, GeoProperty, TimeProperty and QuantitativeProperty are specializations of ‘Property’ that impose some additional structure on the “value” field.
An attribute of type ‘Relationship’ is a JSON object whose key is the attribute name, which is mapped to an IRI using “@context” field of the item containing the attribute, and whose value is a JSON-LD object that includes the following keys:
The ‘Relationship’ attribute is useful for establishing relationships amongst different catalogue objects. It can point to other items in the catalogue as well as to external objects to which the items refer, e.g., data models, base schemas, API objects, etc.
An attribute of type ‘GeoProperty’ is a JSON object whose key is the attribute name, which is mapped to an IRI using “@context” field of the item containing the attribute, and whose value is a JSON-LD object that includes the following keys:
Attributes of type ‘GeoProperty’ are used to include geo-spatial information in catalogue items.
An attribute of type ‘TimeProperty’ is a JSON object whose key is the attribute name, which is mapped to an IRI using “@context” field of the item containing the attribute, and whose value is a JSON-LD object that includes the following keys:
Attributes of type ‘TimeProperty’ are used to include time information in catalogue items.
An attribute of type ‘QuantitativeProperty’ is a JSON object whose key is the attribute name, which is mapped to an IRI using “@context” field of the item containing the attribute, and whose value is a JSON-LD object that includes the following keys:
Attributes of type ‘QuantitativeProperty’ are useful to represent observed or measured quantities.
In terms of property graph model the following (non-normative) mappings could be made:
The IUDX catalogue core attribute types are similar to the core meta-model in NGSI-LD, a recent data exchange standard by ETSI ISG CIM, which incorporates linked data concepts in the data served by Resource Servers. In particular, the ‘Property’ and ‘Relationship’ objects are similar between the IUDX catalogue and the NGSI-LD core meta-model. Further, the core attribute types ‘GeoProperty’ and ‘TimeProperty’ are also in consonance with concepts defined in the NGSI-LD common meta-model.
IUDX common attributes define commonly used concepts in IUDX catalogue items. All the common attributes belong to one of the core attribute types and thus follow the structure of the parent attribute type. Informally, the common attribute definition adapts the core attribute type to the specific concept that is being modelled. For example, although ‘location’ and ‘coverageArea’ are both of type ‘GeoProperty’, they model different geo-spatial scenarios. Whereas ‘location’ represents a point by restricting the ‘geometry’ object to be of type ‘GeoJSON point’, ‘coverageArea’ represents a geo-spatial region by restricting the ‘geometry’ object to be of type ‘GeoJSON polygon’.
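The difference between ‘location’ and ‘coverageArea’ can be illustrated with a short (non-normative) Python sketch that checks the GeoJSON geometry type expected for each attribute. The function name and the coordinates are hypothetical, and the sketch receives the ‘geometry’ object directly rather than assuming where it sits inside the ‘GeoProperty’ structure.

```python
# GeoJSON geometry type to which each common attribute restricts its
# 'geometry' object, as described for 'location' and 'coverageArea'.
EXPECTED_GEOMETRY_TYPE = {"location": "Point", "coverageArea": "Polygon"}


def geometry_matches(attribute_name, geometry):
    """Check that a GeoJSON geometry has the type required by the attribute."""
    expected = EXPECTED_GEOMETRY_TYPE.get(attribute_name)
    return expected is not None and geometry.get("type") == expected


# Hypothetical coordinates in GeoJSON order (longitude, latitude).
point = {"type": "Point", "coordinates": [77.59, 12.97]}
region = {
    "type": "Polygon",
    # One closed linear ring: the first and last positions are identical.
    "coordinates": [[[77.5, 12.9], [77.7, 12.9], [77.7, 13.1], [77.5, 12.9]]],
}

location_ok = geometry_matches("location", point)
coverage_ok = geometry_matches("coverageArea", region)
```

In an actual implementation this restriction would normally be expressed in the JSON schema of the common attribute rather than in application code.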
The syntactic structure for common attributes is defined using JSON schemas. As mentioned before, the attribute definitions from within the JSON schema document may be used as the corresponding IRI link for the linked data context.
Table 6-2 lists all the common attributes defined within the IUDX catalogue.
As mentioned before, each catalogue item belongs to an itemType. Further, all itemTypes have associated schemas, collectively referred to as base schemas, which describe the syntactic structure of the items belonging to the respective itemTypes. The base schemas are represented using JSON-schemas. A base schema specifies the template of a given item which includes: specifying the set of common attributes used in an item, specifying a set of mandatory attributes, specifying constraints (if any) on additional attributes etc.
By specifying a set of mandatory attributes, base schemas ensure a level of uniformity in the information contained in an item of a given itemType. For example, for every ‘resourceItem’, a mandatory ‘tags’ field is useful when searching for resources of interest. At the same time, the flexibility of including additional attributes allows an item to carry specific information that may be hard to generalise.
Representing base schemas as JSON schemas brings into play a powerful validation framework that can be leveraged to validate items when they are created or updated in the catalogue store.
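The mandatory-attribute part of such validation can be sketched (non-normatively) in a few lines of Python. The sketch interprets only the JSON schema “required” keyword; a real implementation would typically rely on a complete JSON schema validator. The schema fragment and the candidate item are hypothetical and do not reproduce an actual IUDX base schema, and attribute values are simplified to plain JSON values.

```python
# Hypothetical fragment of a base schema; only the JSON schema "required"
# keyword is interpreted in this sketch.
base_schema = {
    "type": "object",
    "required": ["id", "tags", "itemDescription", "itemType"],
}


def missing_mandatory(item, schema):
    """List the mandatory attributes (schema "required") absent from the item."""
    return [name for name in schema.get("required", []) if name not in item]


# Hypothetical item submitted for creation in the catalogue store.
candidate = {
    "id": "example-item-1",
    "tags": ["air quality"],
    "itemType": "resourceItem",
}

problems = missing_mandatory(candidate, base_schema)
accepted = not problems  # the item would be stored only if nothing is missing
```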
With regard to any additional attribute included in an item, it is recommended that the following general guidelines be followed:
The catalogue contains the following item types:
Table 6-1: Item types in a DX catalogue
Contains meta-information associated with a data resource.
Contains information about a Resource server.
Contains information about a resource server group.
Contains information about the IUDX Provider entity
Contains information about the “catalogue” instance. For example, instance id, end points, Provider etc.
Table 6-3 lists the set of mandatory attributes for each of the above itemTypes. Also see Section 6.4.4 for a discussion on the relationship between the various catalogue itemTypes and other catalogue objects.
The data model object contains a description of the domain specific attributes associated with a resource. These can include attributes that describe the data from the resource, also referred to as data-attributes, as well as attributes that describe other meta-information related to the resource. As an example, let the data from a temperature sensor be available in JSON format and contain the keys “temperature” and “time-stamp”. The data model corresponding to this sensor must contain schemas for the attributes “temperature” and “time-stamp”, along with optional textual descriptions of these attributes. Further, the data model may also include attributes such as the location of the sensor or the make and model of the sensor, which do not directly describe the data and yet contain important meta-information related to the resource.
JSON schema is used to represent data models in IUDX catalogue. The attributes contained in the data model, in a fashion similar to IUDX common attributes, must belong to one of the core attribute types. Also, wherever applicable, data model attributes should use attribute definitions from the IUDX common attribute set.
Data models improve the understanding and interoperability of data from a given resource. Further, since the JSON schema framework is used to represent data models, they can also be used for validation purposes by data consuming applications.
In the IUDX catalogue framework the data model, which is a JSON schema document, also plays a role in enabling linked data. The JSON-LD “@context” field containing the context for all the data model attributes is included in the data model. This enables easy conversion of JSON data from a resource server into JSON-LD data. All that is required is to add an “@context” field, containing a reference to the corresponding data model, to the JSON data from the resource.
Note that by adding the “@context” field the data model serves a dual purpose. It can be referred to as a valid JSON-LD context document, in which case all fields outside “@context” are ignored. Similarly, it remains a valid JSON schema document, and when it is used as a schema document, say for validation purposes, all non JSON schema fields are ignored.
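The conversion of plain JSON data into JSON-LD data described above can be sketched (non-normatively) in Python as follows. The data model URL and the sensor values are hypothetical; the keys “temperature” and “time-stamp” follow the temperature sensor example used earlier.

```python
import json

# Hypothetical location of the data model document for the sensor.
DATA_MODEL_URL = "https://example.org/dataModels/temperatureSensor.json"


def to_json_ld(data, data_model_url):
    """Return a copy of the JSON data with "@context" referring to the data model."""
    linked = dict(data)  # leave the original reading untouched
    linked["@context"] = data_model_url
    return linked


# Hypothetical reading as it might be served by a resource server.
reading = {"temperature": 27.5, "time-stamp": "2019-06-01T10:15:00Z"}

linked_reading = to_json_ld(reading, DATA_MODEL_URL)
serialized = json.dumps(linked_reading)  # JSON-LD document ready to be sent
```

A JSON-LD processor that receives the serialized document can then dereference the “@context” reference to obtain the mappings for “temperature” and “time-stamp”.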
The idea of adding “@context” is further extended in the following way: The data models may include some ‘non JSON schema attributes’ in the schema and a JSON-LD context is provided for these attributes. The JSON-LD parser will expand these to their corresponding IRIs thereby providing the consuming applications with additional context about a given data attribute. This leads to improved human and machine interpretability of these attributes and eventually a better understanding of data itself.
The data models are not stored as catalogue items. However, a repository of base schemas and data models will be available to IUDX catalogue implementations.
As mentioned above, the data models include data attributes which belong to one of the core attribute types. It may not always be possible (and even preferable) to send data according to the template required by the associated core type. For these scenarios, a compact representation is supported where the data attribute and its ‘value’ are represented as a simple key-value pair (thus, excluding the explicit mention of fields ‘type’ and ‘value’). However, one can easily get the ‘type’ of this attribute using the JSON-LD context mappings and/or by examining the data model corresponding to this data resource.
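The relationship between the compact representation and the explicit one can be sketched (non-normatively) in Python. The sketch assumes that a lookup from attribute name to core attribute type has already been derived from the data model or from the JSON-LD context mappings; the lookup table and the values are hypothetical, and only the ‘type’ and ‘value’ fields are reproduced.

```python
# Hypothetical lookup from data attribute name to core attribute type, as
# it might be derived from the data model or the JSON-LD context mappings.
ATTRIBUTE_TYPES = {
    "temperature": "QuantitativeProperty",
    "time-stamp": "TimeProperty",
}


def expand_compact(data, attribute_types):
    """Turn compact key-value pairs into objects with explicit 'type' and 'value'."""
    expanded = {}
    for name, value in data.items():
        if name in attribute_types:
            expanded[name] = {"type": attribute_types[name], "value": value}
        else:
            expanded[name] = value  # unknown keys are passed through unchanged
    return expanded


# Compact data packet, as it might be sent by a resource server.
compact = {"temperature": 27.5, "time-stamp": "2019-06-01T10:15:00Z"}
explicit = expand_compact(compact, ATTRIBUTE_TYPES)
```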
The access object formally describes the methods of accessing data from (and interacting with) the resource. For example, an open-API object can be used to describe API end-points, query parameters, and request and response bodies for REST API based access. References to such objects can be included in catalogue resource items to ease the process of accessing data from a given resource or a set of resources. Other such formal descriptions, e.g., async API objects for MQTT/AMQP messaging based resources, can also be included in the resource items.
An interesting aspect is that in some access objects, e.g., API object, the request/response bodies may be described using JSON-schema references. For such access objects, it is recommended that the access object refer to the corresponding data model (which is a JSON-schema) to describe attributes in the request/response bodies.
The access objects are not stored as catalogue items.
Figure 6-2: Relationship of “resourceItem” with other itemTypes
Figure 6-2 summarizes the relationship between various IUDX catalogue objects.
An item of type ‘resourceItem’ is a key object around which catalogue services have been designed. It consolidates different types of meta-information associated with a resource. For example, links to the associated data model objects, links to associated API objects etc.
In addition to the attributes specified by the base schema, a resource item may contain additional attributes from an associated data model, e.g., location information of a fixed sensor or device model and manufacturer etc.
An item of type ‘resourceItem’ must contain a reference to an item of type ‘provider’. The provider object contains information about the provider of this resource, e.g., identity of the Provider, description, contact information, etc. This information is needed by applications looking to get authorization to access this resource.
Similarly, an item of type ‘resourceItem’ must contain a reference to an item of type ‘resourceServer’ which contains information about the Resource server that is hosting the resource, e.g., identity, description, name, IP address and ports etc.
Additionally, a ‘resourceItem’ may contain a reference to an item of type ‘resourceServerGroup’ which identifies the resource group within a resource server to which the resource belongs. A resource group is a grouping of resources that have the same data model and the same access objects (and hence the same request/response bodies) and belong to the same Resource server. The concept of resource groups allows operations (e.g., get data, status, etc.) to be defined on multiple resources belonging to the same group. Currently, no notion of resource groups exists across resource servers. However, once the data models for certain resource groups have been standardized, it can be envisioned that a resource group may also contain resources from different resource servers. The ‘resourceServerGroup’ item contains information that is applicable to all the resources within the group, and hence this information need not be included in each resource item.
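The grouping criterion stated above (same Resource server, same data model, same access objects) can be sketched (non-normatively) in Python. The resource entries are hypothetical and heavily simplified: each reference is reduced to a plain identifier instead of an attribute of type ‘Relationship’.

```python
from collections import defaultdict

# Hypothetical, simplified resource items: each reference is reduced to a
# plain identifier instead of a 'Relationship' object.
resources = [
    {"id": "sensor-1", "resourceServer": "rs-A", "refDataModel": "airQuality", "accessObject": "aq-api"},
    {"id": "sensor-2", "resourceServer": "rs-A", "refDataModel": "airQuality", "accessObject": "aq-api"},
    {"id": "camera-1", "resourceServer": "rs-A", "refDataModel": "traffic", "accessObject": "cam-api"},
    {"id": "sensor-3", "resourceServer": "rs-B", "refDataModel": "airQuality", "accessObject": "aq-api"},
]


def group_resources(items):
    """Group resource ids by (resource server, data model, access object)."""
    groups = defaultdict(list)
    for item in items:
        key = (item["resourceServer"], item["refDataModel"], item["accessObject"])
        groups[key].append(item["id"])
    return dict(groups)


groups = group_resources(resources)
```

In this sketch ‘sensor-3’ falls into a separate group even though it shares the data model and access object of ‘sensor-1’, because groups do not currently span resource servers.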
Table 6-2 below lists the attributes defined in the IUDX common attribute set. These attributes are used to define base schemas. Table 6-3 lists the common attributes that are mandatory in IUDX base schemas.
Table 6-2: List of IUDX common attributes
id of a catalogue item
id of the resource in the resource server
name of a catalogue item
Time when a resource/item/attribute was created
Time when a resource/item/attribute was modified
Time when a resource/item/attribute was deprecated
Array of keywords describing this item facilitating item discovery.
Type of the resource (see Table 6-4 below)
Attribute whose value is a URI
Status of this catalogue item
Link to the base schema for this itemType
Link to a resourceServer item
Link or an array of links to resourceServerGroup items
Text description of this item.
iudx item type
Reference to the data model for this resource item.
Link to the provider of this resource
Status of an item. Set to either 'active' or 'deprecated'
Describes a geo-spatial location as a geoJSON point
Describes a geo-spatial region as a geoJSON polygon
Information about a given organization (contact info, email, urls etc.)
Information regarding the authorization server that this item uses
Device model information: its make, brand, model, url, etc.
Type of access mechanism. For example, 'openAPI', 'asyncAPI', 'custom'.
URL that points to more information about data access of this resource
Link to an object (OpenAPI 3.0 api JSON object, or a json-schema) to describe access mechanism for this data resource.
Item specific API object variables. The variables and their corresponding values for this resource item are listed as key-value pairs in the value field of this property. The json-object in the value should be treated as a simple json object and not as a json-ld object.
List of access mechanisms available for data associated with this catalog item
Array of fields from the data-model which appear in the data packet. These fields are not necessarily instantiated in the resourceItem
Table 6-3: Mandatory attributes for IUDX base schemas
id, name, tags, refBaseSchema, itemDescription, itemType
id, tags, refBaseSchema, resourceServer, itemDescription, refDataModel, provider, resourceServerGroup, resourceId, itemType
id, name, tags, refBaseSchema, itemDescription, resourceServerHTTPAccessURL(uriLink), resourceServerOrg(organizationInfo), coverageRegion, itemType
id, name, tags, refBaseSchema, resourceServer, itemDescription, refDataModel, provider, itemType
Table 6-4: List of supported resource types
Nature of Resource
Data set as a file
Data set as a table of records
Data as a notification (e.g alert, event)
Data as a stream of messages (e.g. sensor readings)
Media stream (temporally encoded like video or audio)
Figure 7-1: Basic Interaction Scenarios
The minimal set of interaction scenarios supported by the data exchange ecosystem is shown in Figure 7-1. Details of each scenario follow.
The provider needs to obtain a certificate from the DX CA to establish its identity. An example workflow is given below.
Figure 7-2: Provider registration flow
In this use case, a Provider is requesting the Authorisation Service to allow access to use the Catalogue. Once approved, the Provider can create or update an entry in the catalogue.
Figure 7-3: Resource Provider getting access to use Catalogue
The use case allows Providers to update the catalogue
An approval is provided to perform the requested operation
If a resource is not public, then the Provider can request an Authorisation Service to set policies for data access.
The use case allows Providers to set policies for their resources
A policy is set for the provider's resources.
Figure 7-4: Setting up permissions for a resource by the provider
In this use case, a Consumer uses the search APIs of the Catalogue Service to find entities of interest. Once the entities are identified, the Consumer can use a Consumer App to request consent and access their data.
Figure 7-5: Consumer discovering data
This use case allows consumers to search the catalogue using a consumer app
A list of search hits is sent to the consumer app
A consumer application can request access to data from a DX-compliant resource server using any one of the supported APIs. If the requested data does not require authorization (i.e., it does not have an authorization policy), or if the request contains a valid access token, then the resource server serves the request after token validation.
If the data requires authorization and the request does not contain a token, the resource server initiates authorization following an IUDX/UMA 2.0 compliant protocol. The protocol supports a subset of UMA 2.0 workflows.
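The decision taken by the resource server in this scenario can be summarised in a short (non-normative) Python sketch. The function and outcome names are hypothetical, the token validator is a stand-in for the actual validation procedure, and the handling of an invalid token (“reject”) is an assumption of this sketch, since that case is not described above.

```python
def decide(has_policy, token, token_is_valid):
    """Return the action a resource server takes for a data request.

    has_policy     -- True if the requested data has an authorization policy
    token          -- access token carried by the request, or None
    token_is_valid -- callable that validates a token (stand-in in this sketch)
    """
    if not has_policy:
        return "serve"  # data does not require authorization
    if token is None:
        return "start_authorization"  # initiate the UMA 2.0 based flow
    if token_is_valid(token):
        return "serve"
    return "reject"  # assumption: behaviour for invalid tokens is not specified


def demo_validator(token):
    """Hypothetical validator used only for illustration."""
    return token == "valid-token"


public_request = decide(False, None, demo_validator)
first_request = decide(True, None, demo_validator)
token_request = decide(True, "valid-token", demo_validator)
```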
Figure 7-6: Consumer requesting access to a resource
This use case allows the Consumer to access the requested data
The requested data is provided to the Consumer
A provider shall be able to revoke access to a particular resource by calling the /revoke API.
Figure 7-7: Revoking consent flow
This use case allows the Provider to revoke access to data
The consumer is no longer able to access the resource
As described before, all catalogue items are JSON-LD objects and are instances of pre-defined JSON-schemas. The catalogue contains items of different itemTypes, each having an associated base-schema. A base-schema specifies a minimal set of mandatory meta-information attributes to be contained in the catalogue item. Further, a ‘resourceItem’ is associated with a data-model, which is a JSON schema document, and an API access object. In this annexe, we provide examples of various catalogue objects.
Let us take an example of an item with ‘itemType’ ‘resourceItem’ that corresponds to data observed from a physical sensor device measuring ‘CO2’ and ‘TEMPERATURE’.
First, we provide an example of data-model describing the domain specific attributes of the sensor device. One can create a new data model from scratch or one can reuse some existing data models for devices in the same domain.
The context for the attributes in the data model is derived from a JSON document mentioned in the “@context” field of the above, namely “<catalogue-link>/airQuality/airQuality_context.json”.
The above data model describes various attributes of the sensor resource, e.g., ‘NAME’ of the device, ‘location’ of the device etc. Further, for ‘TEMPERATURE_MAX’ and ‘CO2_MAX’, which are data attributes, additional information, e.g., ‘unitCode’, ‘unitText’, ‘minValue’, ‘maxValue’ etc., has been provided.
Note that the above additional information keywords, e.g., ‘unitCode’, ‘unitText’ etc., are not JSON schema keywords and will be ignored by JSON schema tools/validators. However, these keywords have been provided linked data grounding using “@context” field in the data model. If this data-model is passed through a JSON-LD parser, these keywords will expand to the IRIs provided via “@context” and will serve to provide additional context about these properties. A snippet of JSON-LD expanded object for ‘TEMPERATURE_MAX’ is shown below:
Note that ‘maxValue’ has expanded to ‘https://schema.org/maxValue’. This implies that the property ‘maxValue’ referred to in the data schema is the same as ‘schema.org/maxValue’. Applications already aware of that vocabulary will find it very easy to use and interpret this attribute. The above example also illustrates how the IUDX catalogue is able to reuse attribute definitions from external vocabularies using the “@context” keyword.
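The effect of this expansion on the additional (non JSON schema) keywords can be imitated with a simplified, non-normative Python sketch. It is not a JSON-LD processor: it only replaces keywords that have a context entry by their IRIs. Only the mapping of ‘maxValue’ to schema.org is stated above; the remaining mappings and all attribute values are assumptions made for illustration.

```python
# Context entries for the additional keywords. Only the 'maxValue' mapping is
# taken from the text; the other two mappings are assumptions of this sketch.
context = {
    "maxValue": "https://schema.org/maxValue",
    "minValue": "https://schema.org/minValue",
    "unitCode": "https://schema.org/unitCode",
}

# Hypothetical data model entry for 'TEMPERATURE_MAX'.
attribute_schema = {
    "type": "number",      # JSON schema keyword, has no context entry
    "unitCode": "CEL",
    "minValue": -40,
    "maxValue": 125,
}


def expand_annotations(schema, ctx):
    """Replace keywords that have a context entry by their IRIs.

    Keywords without a context entry (e.g. JSON schema keywords) are left out
    of the result. A real JSON-LD processor would additionally wrap each value
    in a value object.
    """
    return {ctx[key]: value for key, value in schema.items() if key in ctx}


expanded = expand_annotations(attribute_schema, context)
```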
Another important aspect to note is that the “@context” field in data model also serves to provide context to the data attributes and hence may be used to convert the JSON data from the resource into JSON-LD data.
We next provide an example of the catalogue item of itemType ‘resourceItem’. This catalogue item summarizes various types of meta-information with regards to this data resource.
The base schema for this item is specified by the field ‘refBaseSchema’, which in this case refers to the schema for an item of type ‘resourceItem’. The above item contains all the mandatory attributes listed by the base schema. Similarly, ‘refDataModel’ points to the ‘data-model’ (described above). Note that links (attributes of type ‘Relationship’) are provided for the ‘Provider’ item (which contains information about the ‘Provider’ entity for this item) and the ‘resourceServer’ item (which provides information about the ‘resource-server’ entity on which this resource is hosted).
The item also contains the ‘tags’ and ‘itemDescription’ attributes, which are very useful for discovery purposes. Another important attribute of type ‘Relationship’ is the ‘accessObject’, which refers to the API object (described below) that describes methods for accessing data from this resource. The attribute ‘accessVariables’ lists the values of the API variables required by the API object. Using these attributes, it becomes very easy for any consuming application to access data from this resource.
We also note that the item contains “@context” attribute which contains mappings for all the attributes in the item. Below we provide an illustrative snippet of the JSON-LD expanded version of the attribute ‘location’ contained in the above item.
exItem context .json
We once again see, from the above expansion, vocabulary reuse (in this case GeoJSON-LD) in IUDX catalogue framework.
Finally, we provide an example of ‘accessObject’. For this example, the data is assumed to be available through REST-APIs hosted on the ‘resource-server’ and hence the ‘accessObject’ can be an ‘openapi’ object as described below:
Note that the API object refers to the data model to describe attributes in its request-response bodies. The same API object variables, e.g. ‘NAME’ in the above example, pertaining to the individual resources should be provided via the ‘accessVariables’ attribute in the corresponding ‘resourceItem’.
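How a consuming application might combine the API object with the ‘accessVariables’ of a resource item can be sketched (non-normatively) in Python. The path template and the variable value are hypothetical; only the variable name ‘NAME’ is taken from the example above, and the `{...}` placeholder syntax follows the path templating convention of OpenAPI.

```python
import re
from urllib.parse import quote


def fill_path(path_template, access_variables):
    """Substitute {variable} placeholders in a path with values from accessVariables."""
    def substitute(match):
        name = match.group(1)
        if name not in access_variables:
            raise KeyError(f"no value provided for API variable {name!r}")
        # Percent-encode the value so that it is safe inside a URL path segment.
        return quote(str(access_variables[name]), safe="")

    return re.sub(r"\{([^{}]+)\}", substitute, path_template)


# Hypothetical path template from the API object and hypothetical
# accessVariables value from the resource item.
path = fill_path("/airQuality/{NAME}/latest", {"NAME": "sensor-1"})
```

Keeping the variable values in the resource item allows a single API object to be shared by all resources that expose the same access mechanism.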
A message packet of a resource coming from a resource server might point to the context for that datamodel supporting dynamic semantic interpretability, for example -
example message packet (JSON)
Including an “@context” field here, which points to the previously mentioned airQuality_dataModel, allows the receiver of this message to interpret the fields “TEMPERATURE_MAX”, etc. without knowing a priori the kind of resource that is sending this message.
Passing this message through a JSON-LD preprocessor yields
example message packet (JSON-LD)
The id of an attribute, <catalogue-link>/airQuality/airQuality_context.json#/CO2_MAX provides for semantic interpretability whereby a user/program can trace the link and understand what exactly CO2_MAX means. The @type provides “type” interpretability where the structure of the attribute can be understood.
 If the value taken by “value” is a JSON object, then it is assumed to be a JSON-LD node object, and it is assumed that its context is either explicitly provided or that the keys used are already mapped via the catalogue item context.
 A numeric quantity represented as a string is acceptable. Any non-numeric characters in the string will not be accepted.
 A resource group defines a grouping of resources that reside within the same Resource server and share the same data model. See Section 6.4.4 for more details.
 Also refer to the IUDX Github repo (https://github.com/iudx/iudx-ld) for latest releases of base schemas and data models.
 Existing data-models can serve as examples or as a starting point for creating newer data models. In future, tools may be provided to ease data-model development.
 See https://github.com/iudx/iudx-ld