User Identification and Authentication for Geophysical Data Centers: Exploring a Difficult Transition
Florian Haslinger, Jerry Carter, Helle Pedersen, Jonathan Schaeffer, Robert Casey, Javier Quinteros, Angelo Strollo
and further contributions from
Lesley Wyborn, Elisabetta D'Anastasio, Jonathan Hanson, Mark Chadwick, Christos Evangelidis, Jens Klump …
Open, unrestricted, unconstrained anonymous access to (waveform) data and associated metadata is a long-standing paradigm in seismology (and to a large extent in other disciplines, e.g. GNSS) – founded in the realisation that where global observations are needed to do science, open sharing of data is fundamental
The (seismological) world today – paradise, almost…
Increasingly, data centers are asked by funders or other institutional authorities to report more details on 'usage' of their data and services than they currently capture
To comply with that, user identification (authentication) will have to be implemented for (all) data access
The challenge: funders and other authorities want to know more… (I)
EIDA authentication service EAS
ORFEUS has an Authentication/Authorization System (EAS) in production supporting eduGAIN (via B2ACCESS).
Statistical information logged
(anonymized)
Datacentre,
Date,
Seismic Network,
Station Code,
Location,
Channel,
Country,
Cumulative amount of:
Bytes,
Requests,
Successful requests,
Failed Requests.
Usage data collection today: ORFEUS-EIDA Data Centres
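The anonymized statistics listed above can be pictured as cumulative counters keyed by data centre, date, network, station, location, channel and country. The following is a minimal sketch of such an aggregation, not the actual EIDA implementation: the names StatKey, Totals and aggregate, and the layout of the input events (a dict per request with an "n_bytes" size and an "ok" flag), are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass

# Dimensions of the anonymized statistics, as listed on the slide.
KEY_FIELDS = ("datacentre", "date", "network", "station",
              "location", "channel", "country")


@dataclass(frozen=True)
class StatKey:
    """Hypothetical grouping key; contains no user identifier."""
    datacentre: str
    date: str
    network: str
    station: str
    location: str
    channel: str
    country: str


@dataclass
class Totals:
    """Cumulative amounts kept per key."""
    n_bytes: int = 0
    requests: int = 0
    successful: int = 0
    failed: int = 0


def aggregate(events):
    """Sum request events into per-key totals.

    Each event is assumed to be a dict holding the KEY_FIELDS plus
    'n_bytes' (int) and 'ok' (bool). Any other field, for example a
    client address, is ignored and never stored.
    """
    totals = defaultdict(Totals)
    for event in events:
        key = StatKey(**{name: event[name] for name in KEY_FIELDS})
        entry = totals[key]
        entry.n_bytes += event["n_bytes"]
        entry.requests += 1
        if event["ok"]:
            entry.successful += 1
        else:
            entry.failed += 1
    return dict(totals)
```

Because only the grouping key and the counters are retained, nothing in the result can be traced back to an individual request or user, which is the sense in which the slide calls these statistics anonymized.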
Increasingly, data centers are asked by funders or other institutional authorities to report more details on 'usage' of their data and services than they currently capture
To comply with that, user identification (authentication) will have to be implemented for (all) data access
The challenge: funders and other authorities want to know more… (II)
Implementing user identification at data centers raises a number of technical and managerial issues:
The consequences
There are some (apparent) benefits arising from personalized user tracking – aside from fulfilling funder requirements
Hey, but wow … tracking usage may offer benefits for data centers and users
Could these benefits also be realized (more effectively) through other means and activities?
Authentication & authorisation mechanisms are required anyway at our data centers at least for some services
OK, so let's move on …
A general / generic user authentication requirement for everything should be (re)viewed very critically
so let's stay coordinated and develop common solutions – in seismology but also beyond
The authors of this presentation came together in an ad-hoc manner, triggered by IRIS' announcement that they would implement user identification for their data services by summer 2022. We are currently discussing both the technical and the governance & strategic issues.
Technical issues will be further discussed and promoted through FDSN mechanisms (for seismology)
– expect some communication there soon
Governance & strategic issues will be further discussed in other upcoming venues (IUGG 2023, …) and brought to relevant other bodies & initiatives (RDA, CODATA, ISC, …)
If you are interested in joining the discussion, get in touch!
IRIS Data Services will soon be implementing an identity management system to:
Users who download data:
Instead of tracking by IP Address:
Identity Profile (Example):
Name
Institution
Location
User Class
Usage data collection tomorrow – IRIS data services
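To make the contrast with IP-based tracking concrete, the example identity profile above can be modelled as a small record, with usage then reported per user class rather than per person. This is only a sketch under stated assumptions: the four field names follow the slide, while IdentityProfile, downloads_by_user_class and the shape of the download records are hypothetical and do not describe the IRIS system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityProfile:
    """Fields taken from the example profile on the slide."""
    name: str
    institution: str
    location: str
    user_class: str


def downloads_by_user_class(downloads):
    """Sum downloaded bytes per user class.

    `downloads` is assumed to be an iterable of
    (IdentityProfile, n_bytes) pairs. Only the user class is carried
    into the report; name and institution are not.
    """
    totals = Counter()
    for profile, n_bytes in downloads:
        totals[profile.user_class] += n_bytes
    return dict(totals)
```

Reporting at the level of user class is one way a data center could answer funder questions about who uses the data without publishing individual names.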
Further reading (suggestions)
The links below point to documents and other resources that we consider relevant and/or interesting in the context of the topic of this presentation
UNESCO recommendations on Open Science, 2021:
https://unesdoc.unesco.org/ark:/48223/pf0000379949.locale=en
(Federated) identity management, FAIR and open access from a different discipline (Biology):
https://www.fim4l.org/wp-content/uploads/2021/03/Open-Access-and-FIM-v4.pdf
Two complementary reports by OECD/GSF and ICSU/WDS on international research data networks and sustainable research data repositories:
https://doi.org/10.1787/e92fa89e-en
https://doi.org/10.1787/302b12bb-en
A study from Germany / DFG on issues related to data tracking and use of usage data by academic publishers:
https://www.dfg.de/download/pdf/foerderung/programme/lis/datentracking_papier_en.pdf