BA Bill Anderson
BB Brian Banks
BH Bjorn Hagstrom
BL Beata Lisowska
DB Darren Barnes
DL Deirdre Lee (deirdre@derilinx.com)
GG Gaurav Godhwani
HL Heather Leson
IM Irum Maqsood
JD Joonas Dukpa
JLM Jose Luis Marin
NT Natasha Thamali
PA Phil Ashlock
PC Pyrou Chung
UDM Ulrika Domellof Mattsson
IM: In Canada they published guidelines for use of open standards.
BB: How can you make sure that guidelines can be updated?
IM: Hoping to put them on GitHub, working with OGP, they set up a document. We are looking at data standards and data quality.
JLM: Did you consider the kind of things we discussed, having the right field, etc.?
IM: yes, we have specific guidelines, for example with open contracting this was very important, we recommend ISO, etc.
JLM: many times CSV/XML can be just a container, there’s no definition of what’s in there
IM: yep, another thing is acronyms, there are a ton
PA: we also use DCAT, we strongly type them, and then a central validation process, we check how each agency is performing, we provide support to different agencies on how to implement them. We have common approach to measure against. We have to be proactive about providing support
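The central validation PA describes can be sketched roughly as follows. This is a minimal illustration, not the process any agency actually runs: the field names in REQUIRED_FIELDS are assumptions loosely modelled on common DCAT properties, and validate_record and agency_scores are hypothetical helper names.

```python
from datetime import date

# Hypothetical required fields and expected types (an assumption for
# illustration, not the schema any portal actually enforces).
REQUIRED_FIELDS = {
    "title": str,
    "description": str,
    "keyword": list,
    "modified": str,  # expected as an ISO 8601 date, e.g. "2017-03-01"
    "publisher": str,
    "identifier": str,
}


def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}"
            )
    modified = record.get("modified")
    if isinstance(modified, str):
        try:
            date.fromisoformat(modified)
        except ValueError:
            problems.append("modified is not an ISO 8601 date")
    return problems


def agency_scores(records_by_agency):
    """Share of fully valid records per agency, as a common measure."""
    scores = {}
    for agency, records in records_by_agency.items():
        valid = sum(1 for r in records if not validate_record(r))
        scores[agency] = valid / len(records) if records else 0.0
    return scores
```

A per-agency score like this gives the "common approach to measure against"; the list of problems per record is what a support team could hand back to an agency.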
BA: how one fits the technical and political part together is important. Portals are similar and standardised, but sometimes there are different standards. Is there not a common ground that we can all agree on, sign up to?
PA: I think we do that with DCAT
UDM: we also sign up with DCAT, right now the open data portal is with the national archives, so we hope there will be more work that will be used. It is still difficult to implement an open standard, once you decide to use it
JS: there are approaches, people that don’t understand the standards
BH: we use the DCAT-AP templates to lower the barrier
PA: all the vendors that operate in the US use DCAT (JKAN, etc.), 60 different local authorities use DCAT, we used to provide syndication to data.gov with Socrata. The technology is not important, but you have to use DCAT in order to be syndicated. Data.gov uses JSON-DCAT
BL: to improve the discoverability of data, should the datasets be interoperable?
PC: whatever standards we use they should be machine-readable, because it’s challenging for us to get data from other sources, we’re aggregators, we’re taking data about hydro dams, every dam has a different id, different name and 4 different translations in the same country.
BH: that leads to discovery, that’s the problem, I want to use standards, where are they?
BB: the language is an important point, that’s the interesting point, we want to have the linkages, how to build bridges across the standards. We’re creating new silos
JL: should make a list of important datasets, reference data, repositories of authoritative datasets. Only governments can do that.
PA: we have open registers
DB: they are struggling on that though, register of registers. Struggling to work with public bodies, because if you want that, someone has to maintain it.
GG: more complex because we are dealing with so many languages. We don’t have a fiscal metadata standard. Is anyone working in that area? 2 problems
BH: contracting standard
BA: openbudgets.eu
JL: XBRL? Many of these standards are over engineered
HL: I just spent two months living in the Middle East, and I think we need to drop open, we have to be open and flexible about standards. We should look at data sharing. Looking at new ways to start conversation with people, how to present, and that can be used to move towards standards that can be used.
GG: do you use?
PA: we use DCAT, but we expand that, for example, for APIs, there are Swagger, RAML. We’ll probably be using it to define CSV data dictionary too
IM: We’ll probably be using it to define CSV data dictionary too. We looked at DCAT data dict, it didn’t meet our needs.
NT: how to go about discovering existing data
DB: there’s something going on…
PA: there are initiatives to look at what data is available, e.g. to tell orgs you already publish this. There are also initiatives that require people to provide an inventory of data that exists, also looking at what FOI requests have come in
NT: talking to people about rail data, but getting pushback, and they’re saying that it’s sensitive data. What’s sensitive data?
BH: sensitive means they don’t want to release it
HL: well that depends
BB: there’s a difference between the humanitarian and business domains. With us, there is a reluctance to publish personal phone numbers, anything that will link to an individual
BH: Secrecy, personal information or third-party copyright. If it’s not one of these, there is no reason
HL: we need to understand the hesitancy, there is a socialisation piece, there are different kinds of negotiation, not just here’s why, but maybe an example from your peers
NT: going back to discovery piece, how do you discover it if you
DB: the list of data can be released, even if it is not going to be published as open data
PA: we’re doing that right now
IM: we are going through complete inventory of all data, if data is not released, why is it not released
BB: controlled data, not just open, but what about registration, the conditionality, we want to track analytics. Is that usual?
BH: I hear that a lot, but I tell them it doesn’t work. If data is free, it can be shared. The only benefit is if the API is overused
DB: we have registration on an API exactly for that reason, if someone hammers it. How can we track usage?
BB: we have an optional registration page, it’s a hallway, we get some feedback. We will get some info from that, we
DB: why do you want to know who uses it? Because it’s so broad
PA: we’ve had a lot of discussions around that, any impedance is a barrier, we only have a couple of APIs with registration, we’ve also added a feedback form. But now we’re encouraging citation, because that is
DL: data usage vocabulary
BA: I like the idea of open metadata for closed datasets.
PA: it’s time to change the language for open data to data management
BA: in Africa open data is a great tool for neoliberals, if the data is open, it won’t be African companies using it
BH: talk to your users
PC: that’s always biased thought
HL: there are other ways, community roundtables on how to engage your users, gather qualitative data, thinking of different ways to target, not everyone will be into technical solution
NT: is there a sector that uses open data in an ideal kind of way that we can use to apply to our work
PA: transport is a good example, it’s good to talk to
DB: try Transport for London
JD: building open data portal. There is no visibility on the open data standards that are used, open API standards, there’s no domain specific.
PA: there’s an effort in the US and Canada, the US Data Federation. Thinking of an inventory of data standards, but not just an inventory. Methodology, interested in looking at maturity of standards.
DL: LOV, also says who uses the vocabulary. http://lov.okfn.org/dataset/lov/
BA: such a wide tablet
DL: technical framework, very simple, and top down policy
BA: for budget spend, they keep changing columns
DB: we need more detail on what ‘machine-readable’ means
IM: that’s why we went deeper than that, versioning, provenance, data quality, there is a data quality standard, focus is now on reusability and quality
DB: it’s about reusability
DL: that’s why BPs cover a wide range, provenance, quality, etc
IM: make it simple
New guy: one thing about metadata is about persistence, how long will the data be there
BB: are SLAs common practice?
IM: we have frequency, we don’t have ‘never’ datasets, has to be daily/weekly/annual etc. unless there is historical reason. But they are edge cases
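The frequency rule IM describes can be sketched as a simple staleness check. This is a hedged illustration only: the MAX_AGE thresholds and the is_overdue helper are assumptions, not the rules any portal actually applies, and only the daily/weekly/annual values mentioned above are included.

```python
from datetime import date, timedelta

# Assumed maximum age per declared update frequency (illustrative values).
MAX_AGE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "annual": timedelta(days=366),
}


def is_overdue(frequency, last_modified, today):
    """True if a dataset has not been updated within its declared frequency.

    Rejects unknown values such as "never", mirroring the rule that every
    dataset must declare a real update frequency.
    """
    if frequency not in MAX_AGE:
        raise ValueError(f"unknown or missing frequency: {frequency!r}")
    return today - last_modified > MAX_AGE[frequency]
```

For example, `is_overdue("weekly", date(2024, 1, 1), date(2024, 1, 10))` flags the dataset as overdue, while a historical edge case would need an explicit exemption rather than a "never" frequency.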
JD: data publishers aren’t thinking of users always. They can just delete/remove services. Goes back to how to measure usability
PA: for archiving we look at how to have a distributed file store, in case of political unrest or natural hazards
DL: are we collaborating enough with other communities – geospatial, statistics, library,
UDM: It’s important, especially it can be linked at a legal level, PSI directive
BA: governments are good, but large global institutions are bad at it, e.g. UN, World Bank, etc.
UDM: reference data is also provided
BB: what are grassroots standards?
GG: risk assessments for opening data
JD: political standards
DB: don’t want to adopt new standards, let’s use what we have
HL: we fork standards to be able to serve needs. All of us have the responsibility to reuse
BA: it’s not our business to create a new standard, up to consortiums
PA: grassroots efforts are more focused on domain space, the community of practice that has the need, but they don’t have experience of how to create standards. There’s a mismatch. There can be agility, e.g. via GitHub.
DL: do we need standard bodies?
BB: there’s a mix. We didn’t know where to start. We needed the building blocks for org, geo, location, etc. and we built the niche part. There were huge barriers to working with standards bodies, we didn’t think it was possible. But there is value to domain-specific standards. The standardisation bodies should tell how to bring in new standards. Making it more sustainable. Persistence, governance, membership model. This isn’t always possible without the standards bodies. Is there always a need for grassroots standards that have been created ad hoc?
PC: I think it’s laziness. The other institutions don’t care about standards; they don’t have one.
JD: started at low-level, technical, but with IATI, the easier it becomes, they have resources for the outreach and communication
JL: This is a conversation I often have with the orgs I work with. Then talk to the tech providers. Governments aren’t free to implement the standards; they use private software.
DB: that’s true, move towards open by default. Siloed standards can be built in government because we have budget we need to spend.
JL: Roadblocks if data is not open, governments just say whatever
IM: it has to be the government to make those rules, moving forward, everything in procurement must be open data, if you build/buy software. It is default, federal
JL: needs a lot of political will
HL: privacy by design helped with that
UDM: they sell licences to standards. If you want to create a standard, you have to be a member, in Sweden it’s pointed to as the standards body that should be used
PA: incorporation by reference, if a standard is referenced in legislation, then it has to be free/open
BL: rich countries have lots of open standards, etc. but how about less developed countries? Approaches should be applicable to them too
HL: there should be graduation level, e.g. in Qatar they only publish data on certain social groups, but how to collaborate with orgs. 197 countries designed
PA: the Sustainable Development Goals are an interesting guideline for these