D1.1 OSPO-RADAR Stakeholder Requirements and Specifications

Project Title

Open Source Program Office Research Assets Dashboard and Archival Resource

Project Acronym

OSPO-RADAR  

Grant Agreement No.

2025-25188

Start Date of Project

2025-06-01

Duration of Project

24 months

Project Website

https://www.softwareheritage.org/2025/04/02/ospo-radar/

Work Package

WP1, Project management, specifications, planning, and testing

Authors

Morane Gruenpeter, Renaud Boyer, Sabrina Granger

Reviewer(s)

Clare Dillon, Christopher Erdman, Bastien Guerry, Daniel S. Katz, Violaine Louvet, Micha Moskovic ADD COMMUNITY REVIEWERS

Date

2025-09-30

Version

V1.0 FOR COMMUNITY REVIEW

Document log

Issue | Date | Comment | Author/Editor/Reviewer
v0.1 | 2025-08-06 | Inception and first structure draft | M. Gruenpeter & R. Boyer
v0.2 | 2025-08-21 | Writing sprint and gaps identification | M. Gruenpeter, R. Boyer & S. Granger
v0.3 | 2025-09-04 | Draft of main sections | M. Gruenpeter, R. Boyer & S. Granger
v0.4 | 2025-09-22 | Added workflows, use cases & personas | M. Gruenpeter, R. Boyer & S. Granger
v1.0 | 2025-09-26 | Version 1.0 shared for community review | M. Gruenpeter, R. Boyer & S. Granger

Abstract

The OSPO-RADAR project will develop the technical tooling to enable Academic Open-Source Programme Offices (OSPOs) to efficiently archive, manage, and showcase their institutions' software outputs. This will elevate research software to a first-class research output and enable an evidence-based approach to software development. Built on integration with Software Heritage, the portal will offer enhanced metadata management, streamlined workflows, and institutional visibility, fostering a sustainable ecosystem for open-source software management.

[CC BY license badge]

Terminology

Terminology/Acronym

Definition

ARDC

Archive, Reference, Describe & Cite

Referencing software means making software artifacts identifiable by assigning SoftWare Hash Identifiers (SWHIDs).

CFF

Citation File Format

CRUD

Create, Read, Update, and Delete 

DMP

Data Management Plan: “...is a living summary document that provides assistance with organising and planning all the phases in the lifecycle of data. It explains, for each dataset, how project data will be managed, from creation or collection to sharing and archiving.” (Université Paris Saclay, 2019)

Software can be tracked in a software management plan (SMP).

FAIR

Principles based on community expectations for research outputs: findable, accessible, interoperable, and reusable.

JATS

Journal Article Tag Suite: XML format used to describe scientific literature published online

MVP

“A Minimum Viable Product is a version of a product with just enough features to be usable by early customers, who can then provide feedback for future product development.”[1]

PID

Persistent identifier: generally expected to be unique, resolvable, and persistent.

RSMD

Research Software MetaData (guidelines)

SIRS

Scholarly infrastructures for research software (report)

SPDX

System Package Data Exchange is an open standard capable of representing systems with digital components as bills of materials.

SWHID

SoftWare Hash Identifiers are designed to identify permanently and intrinsically all the levels of granularity that correspond to concrete software artifacts: snapshots, releases, commits, directories, files and code fragments. SWHID became ISO/IEC international standard 18670 on April 23, 2025.

SWORD

Simple Web-service Offering Repository Deposit is an interoperability standard developed by JISC, extending the Atom Publishing Protocol.


Table of contents

1/ Introduction

1.1/ Objectives

1.2/ Scope and Methodology

2/ Stakeholder groups and personas

2.1/ What is the Open Source Programme Office Scope in Academia?

2.2/ Scholarly Ecosystem stakeholders

2.3/ Personas, as a collective image of a segment of the target audience

3/ Current landscape and associated challenges

3.1/ The existing infrastructures in the ecosystem

3.2/ Building on existing infrastructures and components

3.3/ The Role of CodeMeta in OSPO-RADAR

4/ About the specifications

4.1/ Overview

4.2/ Out of Scope

4.3/ Objectives (aligned with the SIRS pillars)

4.4/ Existing features OSPOs can use

5/ Minimum Viable Product: features and workflows overviews

5.1/ Account creation, login and management

5.2/ Populate the software source code dashboard

5.3/ View dashboard, curate and filter

5.4/ Public view and search capabilities

6/ Non-functional requirements to address

6.1/ Accessibility

6.2/ Performance / compatibility

6.3/ Legal requirements

6.4/ Sustainability & maintainability

7/ What’s next? Interoperability & reusability of the Dashboard.

7.1/ Integration with existing tools

7.2/ Functionalities and capabilities to keep in mind after MVP

8/ The road ahead: a sustainable service model

References

Appendices

Appendix A: Premortem

Appendix B: Persona Academic OSPO Manager

Appendix C: A collection of use cases from the RSMD workshop (2023)

Appendix D: Review grid of the RSAC components specifications

1/ Introduction

1.1/ Objectives

The OSPO-RADAR project (Open Source Program Office Research Assets Dashboard and Archival Resource) addresses a growing need in the research ecosystem: the effective management, preservation, and recognition of software as a first-class research output.

At the highest policy level, such as the UNESCO recommendations on Open Science, the DORA declaration, and the French National Plan for Open Science, software is increasingly acknowledged as a key scholarly product. Yet, unlike articles or datasets, the infrastructures supporting the archival, curation, and metadata management of software remain underdeveloped.

The inherent complexity of software, with its dynamic nature, dependency layers, and varied documentation, poses unique challenges for discoverability, preservation, and attribution. For Open Source Program Offices (OSPOs) within academic and research institutions, these gaps translate into operational difficulties. OSPOs play a critical role in managing and promoting open source practices, yet they face recurring obstacles in:

  • Archiving and tracking software outputs;
  • Ensuring alignment with institutional processes;
  • Managing metadata in ways that foster visibility, compliance, and strategic decision-making.

(C. Dillon, 2025)

Effective metadata management is central to overcoming these challenges. Metadata links software to related publications, datasets, and contributors, thus strengthening its academic recognition and impact. Standards such as CodeMeta, widely adopted by the community and endorsed by EOSC, RDA, and Force11, provide a strong foundation. However, they require further refinement to address OSPO-specific workflows, such as capturing license compatibility or documenting diverse forms of software documentation.

To meet these needs, OSPO-RADAR proposes the development of a Dashboard. Built on the Software Heritage archive and grounded in metadata standards, the dashboard will offer OSPOs a unified, interoperable, and scalable platform to:

  • Manage institutional software assets;
  • Enrich and curate metadata;
  • Support proper software citation and attribution;
  • Track contributions, funding, and reuse;
  • Interconnect with existing infrastructures such as InvenioRDM.

This dashboard aims to simplify workflows, enhance discoverability, and increase the visibility of research software, thereby empowering OSPOs to better support their institutions and contribute to the sustainability of open science.

The SIRS report (Scholarly Infrastructures for Research Software, EOSC 2020) distills four essentials for a healthy software scholarship ecosystem—archive, reference, describe, cite—and couples them with a cross-cutting call for interoperability and researcher education (notably via publishers). Together, these set the north star for OSPO-RADAR: move beyond principles into day-to-day operations that institutions can actually run.

Why SIRS matters now

Despite broad consensus, practice lags: hand-offs between repositories remain brittle, identifiers aren’t used consistently, metadata is shallow or siloed, and citation guidance is fragmented. Infrastructures have the leverage to normalize practice, provided the tooling makes compliance the path of least resistance. OSPO-RADAR operationalizes SIRS recommendations via ready-to-run flows, opinionated defaults, and actionable signals that the OSPO offices can adopt quickly.

Archive

  • Ensure durable, verifiable, provenance-rich preservation of source code and its context.
  • Define institutional deposit flows to Software Heritage (human and API), provenance capture (origin URLs, timestamps, authorship), and retention policies visible in the dashboard.

Reference

  • Make software findable and unambiguously referable over time.
  • Treat PIDs (notably SWHIDs) as first-class data: stored, displayed, exported, and resolvable across UI and API.
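To make the Reference pillar concrete: a content-level SWHID can be computed locally from the file bytes alone, because the `cnt` object type reuses the Git blob hash. The sketch below illustrates the idea in Python (the function name is ours):

```python
import hashlib

def content_swhid(data: bytes) -> str:
    """Compute a content-level SWHID (swh:1:cnt:...) for raw file bytes.

    The hash is the Git "blob" object SHA-1: sha1(b"blob <len>\\0" + data),
    as specified for the `cnt` object type in the SWHID specification.
    """
    header = b"blob %d\x00" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"

# The empty file yields the well-known Git empty-blob hash:
print(content_swhid(b""))
# swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the identifier is intrinsic (derived from the bytes themselves), any party can recompute and verify it independently of the archive, which is what makes SWHIDs suitable as first-class, resolvable references.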

Describe

  • Provide rich, reusable metadata to enable discovery, assessment, and reuse.
  • Adopt CodeMeta profiles, require a minimal metadata core, support mappings to local schemas, and reduce curator burden with assisted extraction (“FAIRifiers”).

Cite

  • Normalize correct software citation across venues.
  • Provide citation-ready exports (e.g., BibLaTeX, APA, CFF, CodeMeta) generated from archived metadata.
1.2/ Scope and Methodology

The scope of D1.1 is to capture the requirements, expectations, and challenges of Open Source Program Offices (OSPOs) in academic and research institutions, in order to shape the functional design of OSPO-RADAR. The methodology combined two complementary approaches: (1) gathering structured feedback through a survey distributed across the OSPO and research software community, and (2) validating and enriching these findings through a community review process, during September and October 2025.

1.2.1/ Survey Results

The OSPO-RADAR survey collected responses from 16 organizations, with a majority representing universities (68.8%), followed by research institutes (25%). The maturity of OSPOs varied: most respondents indicated that their initiatives are just getting started (53.8%), while others reported defined roles and goals (23.1%) or defined policies with limited tooling (15.4%). Only one organization reported advanced workflows with automation.

Key findings included:

  • Curation workflows: Respondents were evenly split between preferring researcher-initiated (50%) and institution-initiated (50%) curation.
  • Functional priorities: The most valuable features identified were tracking reuse, citations, and funding links (27.9%), integration with platforms (23.2%), and public-facing software showcases and citation tools (16.2% each). Automatic archival (SWHIDs) and centralized metadata management also received significant support.
  • Solution preferences: A majority favored integration into existing systems (58.3%), while 41.7% expressed interest in a SaaS solution provided by Software Heritage.
  • Challenges: Common issues included lack of systematic tracking and archiving, difficulty convincing researchers to deposit software, fragmented schemas across groups, and concerns around interoperability, metadata curation, and quality control.
  • Engagement: More than half of the respondents were willing to contribute to OSPO-RADAR specifications, with a preference for online workshops (54.6%) and contributions through shared documents or code repositories.

These results provide a clear picture of the diverse contexts and priorities of OSPOs in academia, confirming both the demand for dedicated tooling and the importance of interoperability with existing platforms.

1.2.2/ Community Review Process

In addition to the survey, a community review process was implemented to validate and refine the preliminary findings. This process combined synchronous and asynchronous exchanges to maximize participation and depth of feedback:

  • Targeted workshop: A small workshop was held with five designated OSPOs, allowing for focused discussions on the survey outcomes and the initial scope of OSPO-RADAR.
  • Asynchronous review: Drafts of this deliverable were circulated for comments over a one-month period, enabling broader community members to provide input at their own pace.
  • Iterative refinement: Feedback from both the workshop and the asynchronous review was consolidated and used to adjust the interpretation of survey results and sharpen the identified priorities.

This combined approach confirmed the survey’s main conclusions, while emphasizing the importance of flexibility to accommodate institutions at different maturity levels. It also reinforced the value of transparency and continued engagement, ensuring that OSPO-RADAR remains shaped by its community.

2/ Stakeholder groups and personas

“One of the most promising and differentiating characteristics of university OSPOs is their ability to support novel forms of impact through new partnerships and engagement.” (Young et al., 2024)

2.1/ What is the Open Source Programme Office Scope in Academia?

Within the academic landscape, Open Source Programme Offices (OSPOs) play a strategic role in bridging institutional research practices with the broader open-source ecosystem. While corporate OSPOs primarily address compliance, risk management, and efficiency, academic OSPOs are distinguished by their focus on advancing Open Science and supporting research software as a first-class scholarly output.

The scope of an academic OSPO typically includes:

  • Governance and Policy
    Establishing clear institutional policies for open-source software development, licensing, intellectual property, and compliance, while aligning with national and international Open Science strategies.
  • Research Software Management
    Supporting the long-term archiving, citation, and discoverability of software outputs; promoting the use of persistent identifiers and metadata standards (e.g., SWHIDs, CodeMeta); and ensuring research reproducibility.
  • Community and Capacity Building
    Serving as a hub that connects researchers, research software engineers (RSEs), librarians, and IT professionals. Academic OSPOs provide training, workshops, and guidance on open-source best practices, while fostering recognition of software contributions in research assessment.
  • Infrastructure and Tools
    Facilitating the integration of institutional systems with external infrastructures such as Software Heritage, Zenodo, or GitHub/GitLab; developing dashboards and monitoring tools to track reuse, impact, and sustainability of research software.
  • Strategic Partnerships
    Acting as the institutional interface with external open-source communities, funders, and international initiatives (e.g., EOSC, RDA, ReSA, SciCodes). Academic OSPOs help position their institutions as active contributors to the global open knowledge ecosystem.

In this way, OSPOs in academia contribute not only to the efficient management of research software assets but also to the cultural shift required for software to be fully recognized as a critical component of scholarly communication.

2.2/ Scholarly Ecosystem stakeholders

The Software Source Code Identification Working Group (SCID WG), a joint effort under the Research Data Alliance (RDA) and FORCE11, produced a landmark output on use cases and identifier schemes for persistent software source code identification. Published in 2020, the report identifies a broad range of stakeholders that are listed below.

Infrastructures and tooling:

  • Archive: Preserves human knowledge, particularly software source code.
    Examples: Software Heritage (SWH), Zenodo
  • Citation Manager: Provides services or tools for managing citations.
    Examples: Zotero, Mendeley, EndNote
  • Collaborative Development Platform / Forge: Hosts collaborative software development in public or private repositories.
    Examples: GitHub, GitLab, Bitbucket
  • Indexer: Aggregates, classifies, and provides access to research outputs, improving findability of software items.
    Examples: ADS, Scopus, Web of Science, Google Scholar
  • Institutional, National or Domain Repository: Preserves intellectual outputs of a specific institution or domain.
    Examples: HAL
  • Journal / Publication Venue / Publication Platform: Disseminates research outputs (articles, data, software) through peer review, journals, or conferences.
    Examples: JOSS, POPL, Episciences
  • Package Manager: Facilitates installation, configuration, and management of software tools.
    Examples: PyPI, NPM
  • Registry: Provides online catalogs describing software projects with metadata.
    Examples: ASCL, swMATH, SciCrunch, Wikidata

Organization types and actors:

  • Funder: Provides financial support for research projects, including those producing software, and evaluates outcomes.
    Examples: NSF, NIH, Wellcome Trust
  • Institution / Research Center / University: Employs researchers, may hold copyright of outputs, and evaluates contributions.
    Examples: MIT, ENS, Inria, Grenoble Alpes University, Delft University of Technology
  • Technology Transfer Office: Has a different scope than an OSPO, but can maximize the impact of open source software outside the academic environment.
  • Library / University Library: Collects and curates resources, and may provide emulation services for legacy software. Research libraries help researchers develop data/software management plans, as these are increasingly mandated by research funding organisations.
    Examples: Stanford Library; The Department of Libraries, Information and Open Science (DiBISO) at Paris Saclay University
  • Curator / Librarian / Digital Archivist: Moderates and curates descriptions of research/software artifacts in archives, repositories, or libraries. Supports scholarly communications (e.g., open access, emerging trends in publishing). Provides training and creates learning materials on open science best practices. Contributes to the institutional open science policy.
  • Policy Maker: Defines institutional, national, or international policies related to research outputs.
    Examples: European Commission, national research committees
  • Researcher as Software User (RSU): Uses software in research without contributing to its creation.
    Examples: Any researcher using existing tools in their work
  • Researcher as Software Author (RSA) / Research Software Engineer: Creates research software, possibly fulfilling roles such as design, architecture, coding, debugging, maintenance, documentation, testing, or management.
  • Software Engineer: Develops and maintains software without necessarily being a researcher, often contributing to creation, maintenance, or dependencies.

Research Data Alliance/FORCE11 Software Source Code Identification WG et al. (2020). Software Source Code Identification: Use cases and identifier schemes for persistent software source code identification (1.1). Zenodo. https://doi.org/10.15497/RDA00053

2.3/ Personas, as a collective image of a segment of the target audience

A persona is an archetype representing a group of people whose behaviors, motivations, and goals are similar. The purpose of personas is not to represent all audiences or address all the needs of a website or product, but to focus on the major needs of the most important user groups. Creating personas helps identify barriers to accessing the product, which leads to asking why users would choose the future product over another.

Sofia — Academic OSPO Manager

Main needs:
  • Build open source awareness across the institution
  • Define and implement OSPO policies and best practices
  • Support researchers in licensing, compliance, and community contributions
  • Connect the institution to European and global OSPO networks (CURIOSS, CHAOSS, etc.)
  • Deploy tools for the ecosystem (legal compliance, metrics, etc.)
  • Keep up to date with available tools for OSPOs

Pain points:
  • Lack of developer capacity
  • Unclear internal policies
  • Balancing compliance with innovation
  • Within the institution, some groups might view an OSPO as a “competitor” rather than a complementary resource (Young et al., 2024)

Christine — Academic library director

Main needs:
  • Provide reliable indicators to the university’s governance, without reinventing the wheel
  • Provide support to develop data/software management plans (DMP/SMP)
  • Coordinate the metadata curation of the institutional outputs
  • Identify the different types of stakeholders within the institution

Pain points:
  • Transitioning from research data management to research software support
  • Capacity building
  • Lack of developer capacity
  • The library is not always identified as a key partner when it comes to software. OSPOs come from industry, while universities are complex places with unique structures, systems, and cultures; libraries do not exist in corporations.

Leo — Infrastructure technical manager

Main needs:
  • Improve the infrastructure code base with the help of his team
  • Ensure the consistency of the developments with the infrastructure roadmap
  • Plan and allocate resources to tasks and projects

Pain points:
  • Complicated adoption of the SWHID, which isn't yet widely recognized in the scholarly ecosystem
  • Users of Leo's infrastructure don't ask for SWHIDs
  • Software citation seems like high effort with low return

Jitin — Researcher

Main needs:
  • Track all his contributions
  • Report back to different instances
  • Showcase the work done, including as a software contributor, in different types of documents: CV, report, grant request, etc.

Pain points:
  • Processes are time-consuming; lack of automation
  • Lack of support for implementing best practices (archiving, describing, etc.)
  • Stakeholders may have conflicting expectations about reported entries
  • Contributions made outside of the formal job scope go unrecognized and uncompensated
  • Difficult to monitor the impact of contributions at an individual level

Mostafa — Lead of a research unit

Main needs:
  • Identify all the pieces of software produced by the research unit
  • Provide a comprehensive list of the software to the local governance as well as to national and international evaluation committees
  • Coordinate and validate the lists produced by the researchers
  • Promote good practices in terms of software authorship and use of the institutional forges

Pain points:
  • Lack of consistency in descriptions (duplicates, inconsistent data quality)
  • The high volume of last-minute deposits in the institutional repository prevents correct curation due to time constraints
  • Too much information exchanged in spreadsheets
  • Evaluation committees have varying expectations for the required data format and content
  • The contribution status for non-authoring roles (e.g., testers or maintainers) is unclear, as there are no clear expectations on what to report. In addition, a single contributor can hold several roles, with contributions spanning variable time extents
  • Many colleagues don’t consider themselves developers, so they don’t list their software production in reports
3/ Current landscape and associated challenges

3.1/ The existing infrastructures in the ecosystem

A first observation from the OSPO-RADAR survey and the SIRS Gap Analysis Report (Azzouz-Thuderoz et al., 2023) highlights both the strengths and the limitations of current infrastructures. For example, a study (Carlin et al., 2023) showed that software is not included as a type of research output in many repositories in the UK.

Figure: “A concerning lack of software stored in institutional repositories in the UK” (Carlin et al., 2023)

While there are multiple services supporting the archiving, referencing, and publishing of research software, OSPOs still lack an integrated view and proper tooling to manage their software assets.

“No OSPO is an Island” (Gilles Mathieu)

Stakeholder feedback illustrates these gaps:

  • Lack of proper tooling: “We started creating a dashboard years ago, but then we couldn't maintain it. Thankfully e-Science and Software Heritage have done a way better work.”
  • Lack of incentives: “At the moment we don't really do any management beyond uploading to GitHub, and some even put it on Zenodo. From a researcher perspective, it’s the time and effort required, as you pointed out in the presentation.”
  • Constraints in sensitive domains: “I manage research software developed at a University Medical Center. We've got constraints on what is allowed in the environments where the software analyzes medical data, which can complicate sharing code (and data).”
  • Knowledge gaps: “Not knowing how to begin archiving software, not knowing best practices or available tools.”

Infrastructure type: Scholarly repositories, institutional repositories
Purpose: Store, manage, and provide access to datasets, enabling their sharing, preservation, and reuse.
Potential challenges for the end-users:
  • Software is not always included as a type of research output within a repository: deposit is not possible
  • Coverage issue in terms of archival: “the vast majority of repositories do not feed the universal archive [SWH] yet” (Azzouz-Thuderoz et al., 2023)
  • Difficult to identify and track the evolution of software pieces
  • Descriptions may omit information relevant to an OSPO, as the extraction of intrinsic metadata, found in the source code itself, is not supported
  • Intrinsic identifiers are not supported by most of the infrastructures

Infrastructure type: Aggregators
Purpose: Integrate, harmonize, and offer access to information originating from different sources, which would otherwise have to be accessed independently; enrich the aggregated content with information that was not available at the sources.

Infrastructure type: Forges
Purpose: Facilitate collaborative work on a software project. A forge contains tools like a versioned source code repository, discussion forums, an automated testing environment, and so on. (French Ministry of Higher Education and Research)
Potential challenges for the end-users:
  • Coverage: even if an institutional forge is implemented, end-users may collaborate on external instances
  • Archival: forges are collaborative tools, but are not designed for long-term access

Infrastructure type: Publishing platforms
Purpose: Infrastructures associated with journals or publishing companies that are used to deposit scholarly publications.
Potential challenges for the end-users:
  • “Only a few publishers have already explicitly integrated this recommendation [the software associated with the publication should be equipped with proper metadata].”
  • Examples of successful implementations: Dagstuhl, IPOL, eLife, JTCAM, and the journals hosted on the Episciences platform

3.2/ Building on existing infrastructures and components

The lessons learned from integrating different infrastructures with Software Heritage define clear requirements for OSPO-RADAR. Policy alignment is critical: shared frameworks such as the SIRS report, the CodeMeta vocabulary, and SoftWare Hash Identifiers (SWHIDs) must guide the dashboard to avoid fragmentation and ensure interoperability. Strong support is equally essential: OSPO-RADAR must offer clear documentation and helpdesk services so institutions can manage metadata practices, identifiers, and APIs without friction.

Advancing software management cannot be left to individual researchers alone; institutional engagement with training materials, onboarding templates, and capacity-building resources should enable librarians, RSEs, and research staff to embed software practices in daily workflows.

Finally, OSPO-RADAR should be modular, interoperable, and reusable, with components that are openly documented and capable of integrating seamlessly with repositories, registries, forges, and publishing platforms. In this way, OSPO-RADAR can become the connective tissue of the scholarly ecosystem, turning lessons learned into durable infrastructure for research software management.

The added value of OSPO-RADAR is to transform a fragmented ecosystem into an actionable, interoperable, and institutionally relevant view of research software, equipping OSPOs with the tools they currently lack to manage, sustain, and showcase their software assets.

3.3/ The Role of CodeMeta in OSPO-RADAR

The issue [...] is in the interoperability of disparate platforms, scientific disciplines and descriptors for research artefacts, i.e., individual repositories cannot communicate in a common language to describe their data to each other.

(Carlin et al., 2023)

3.3.1/ Metadata requirements for OSPO-RADAR

The OSPO-RADAR dashboard relies on machine-actionable metadata to archive, track, and valorize institutional software assets. CodeMeta provides a JSON-LD schema that captures key descriptive, administrative, and provenance metadata, while mapping to external standards such as schema.org, Dublin Core, and DataCite. By adopting CodeMeta as the backbone metadata model, OSPO-RADAR ensures:

  • Interoperability with existing repositories (HAL, Zenodo, InvenioRDM, CRIS systems).
  • Consistency in reporting across institutions and infrastructures.
  • Extensibility to cover OSPO-specific needs while remaining aligned with community practice.
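To make the backbone model concrete, here is a minimal, purely illustrative CodeMeta record built in Python (project name, author, ORCID, and repository URL are all hypothetical placeholders):

```python
import json

# A minimal, hypothetical codemeta.json record for an institutional
# software asset; field names follow the CodeMeta/schema.org vocabulary.
record = {
    "@context": "https://w3id.org/codemeta/3.0",
    "@type": "SoftwareSourceCode",
    "name": "example-analysis-toolkit",  # hypothetical project name
    "description": "Illustrative toolkit description.",
    "author": [{
        "@type": "Person",
        "givenName": "Ada",
        "familyName": "Doe",
        "@id": "https://orcid.org/0000-0000-0000-0000",  # placeholder ORCID
    }],
    "version": "1.2.0",
    "license": "https://spdx.org/licenses/MIT",
    "programmingLanguage": "Python",
    "codeRepository": "https://example.org/forge/example-analysis-toolkit",
}

# Serialize as the codemeta.json file a harvester would read.
print(json.dumps(record, indent=2))
```

Because the record is JSON-LD, the same fields map directly onto schema.org, Dublin Core, and DataCite crosswalks, which is what gives OSPO-RADAR its interoperability across repositories.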

3.3.2/ Core CodeMeta properties for OSPOs

The following CodeMeta properties are central to OSPO-RADAR use cases:

Property | Description | OSPO use case example | Related PID / Standard
name | Human-readable title of the software | Institutional inventory of software assets | schema.org name
description | A description of the item | A research output abstract | schema.org description
author / contributor | People or organizations responsible for the software | Linking to ORCID and ROR for attribution; adding author roles | ORCID, ROR
version | Software version or release | Tracking evolution of institutional outputs | Semantic Versioning
license | Applicable license(s) | Monitoring license compatibility and compliance | SPDX License IDs
programmingLanguage | Primary implementation languages | Reporting language trends in institutional software | schema.org mapping
relatedPublication | Publications describing or citing the software | Linking code to scholarly infrastructure (e.g., HAL) bibliographic records | DOI, HAL ID
funding | Funding sources that supported development | Tracking grants and sponsor contributions | Grant DOIs, funder ROR IDs
readme / documentation (proposed) | Documentation sources (README, build instructions, manuals) | Ensuring reproducibility and usability in reporting | To be added via CodeMeta PR

3.3.3/ Identified gaps and extensions

Initial stakeholder consultations confirm that CodeMeta should be extended for OSPO workflows. The following gaps are prioritized:

  • License compatibility: capturing multiple license expressions and their compatibility (e.g., GPL + MIT).
  • Documentation granularity: distinguishing between different documentation artifacts (README, manual, buildInstructions).
  • Institutional affiliations: ensuring stronger integration with ROR for organizational reporting. Also identifying units and teams within an organization. In HAL, these are collections that are shared between different institutions.
  • Funding and contributions: improving the representation of contributions, with a reference list for grant identification and links between funder and funding.

These requirements will be discussed with the CodeMeta governance community (SciCodes, EOSC, RDA, Force11) to ensure global alignment.

3.3.4/ Role in the SWH Dashboard and Reporting

Within OSPO-RADAR, CodeMeta will serve four core functions:

  1. Harvesting: extraction and enrichment of metadata from source repositories (e.g., GitHub, GitLab, institutional archives).
  2. Archival linkage: mapping CodeMeta identifier fields to SWHIDs, ensuring software preservation and traceability.
  3. Dashboard display: powering interactive views of software portfolios (search, filters, metadata panels).
  4. Report generation: providing standardized, comparable metadata for institutional-level reports and benchmarks.

By adopting and extending CodeMeta, OSPO-RADAR ensures that the dashboard is technically robust and aligned with international standards, supporting OSPOs in managing software as a recognized scholarly output.

4/ About the specifications

4.1/ Overview

The OSPO-RADAR project aims to deliver a Source Code Assets Dashboard (OR Dashboard) that enables academic and research OSPOs to archive, manage, and valorize their software outputs. Built on top of the Software Heritage archive and the CodeMeta standard, it provides tools for metadata curation, reporting, and institutional visibility.

4.2/ Out of Scope

  • Full CRIS integration in phase 1 (requires extended metadata harmonization).
  • Advanced automated discovery of software (beyond “Save Code Now” and repository harvesting).
  • Deep analytics (e.g., dependency analysis, containerization metrics) beyond the MVP.
  • Multilingual metadata curation (planned but resource-intensive).

4.3/ Objectives (aligned with the SIRS pillars)

  • Archive: Ensure long-term preservation of institutional code assets in Software Heritage.
  • Reference: Provide stable, unique identifiers (SWHIDs) for precise referencing of software.
  • Describe: Enable rich, standardized metadata management (CodeMeta v3), both intrinsic and extrinsic.
  • Cite: Support citation-ready exports in BibLaTeX, APA, CFF, and CodeMeta.

4.4/ Existing features OSPOs can use

This section lists the Software Heritage capabilities OSPOs can use today—without waiting for an OSPO-RADAR dashboard. Through web tools, APIs, and a push-based deposit service (SWORD v2), OSPOs can trigger on-demand archiving (Save Code Now), push tarballs or metadata-only records, run bulk save jobs, obtain stable SWHIDs for referencing, and generate publisher-ready citations from intrinsic metadata (CodeMeta/CITATION.cff).

Archiving & Deposit

  • Save Code Now (on-demand archiving) with webhook/plugin integration.[2]
  • Deposit Service (push-based via SWORD v2): create/update/enrich deposits[3]; supports:
      • Archive uploads (.zip/.tar.gz)
      • Metadata-only deposits that reference repo URLs or SWHIDs
  • CLI for automated deposits and scripting.
  • Provenance & version updates in deposits (track new releases/versions).

Bulk Archival at Scale

  • Bulk Save API (/save/bulk/) to submit large lists of origins (CSV/JSON).[4]
  • Status tracking endpoint (/save/bulk/<request_id>/) to monitor progress.[5]
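The bulk-save flow above can be sketched as a small client. Only the /save/bulk/ and /save/bulk/&lt;request_id&gt;/ paths come from this list; the base URL prefix, the JSON payload shape, and the bearer-token header are illustrative assumptions, not confirmed API details:

```python
# Sketch of a client for the Bulk Save API. Only the /save/bulk/ paths
# come from this document; the base URL prefix, payload shape, and
# bearer-token header are assumptions for illustration.
import json
from urllib.request import Request

BASE = "https://archive.softwareheritage.org/api/1"  # assumed prefix

def bulk_save_payload(origins):
    """Build the submission payload for a list of origin URLs (assumed shape)."""
    return [{"origin_url": url, "visit_type": "git"} for url in origins]

def status_url(request_id):
    """Status-tracking endpoint for a submitted batch (/save/bulk/<request_id>/)."""
    return f"{BASE}/save/bulk/{request_id}/"

def build_request(origins, token):
    """Assemble the POST request; sending it requires a valid API token."""
    return Request(
        f"{BASE}/save/bulk/",
        data=json.dumps(bulk_save_payload(origins)).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
```

An OSPO script would submit the batch once, then poll `status_url(request_id)` until all origins are processed.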

Metadata & Citation (no dashboard needed)

  • Metadata query by SWHID or origin URL (intrinsic/extrinsic; raw/indexed).
  • Citation API endpoints:
      • Automatic BibTeX generation from codemeta.json or CITATION.cff
        (with smart entry types based on SWHID: snapshot/release/revision/directory/content).
  • CodeMeta-aware ingestion and mapping (CFF → CodeMeta → BibTeX).
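As an illustration of the CFF → CodeMeta → BibTeX mapping, a toy converter for a handful of CodeMeta terms might look like this. The field selection and the fixed `@software` entry type are simplifying assumptions; the real service also derives entry types from the SWHID:

```python
# Toy illustration of the CodeMeta -> BibTeX mapping performed by the
# citation endpoints. Field choices are simplified assumptions.

def codemeta_to_bibtex(meta, key="software"):
    """Render a minimal @software BibTeX entry from CodeMeta terms."""
    authors = " and ".join(
        f"{a.get('familyName', '')}, {a.get('givenName', '')}"
        for a in meta.get("author", [])
    )
    fields = {
        "author": authors,
        "title": meta.get("name", ""),
        "version": meta.get("version", ""),
        "url": meta.get("codeRepository", ""),
        "license": meta.get("license", ""),
    }
    # Emit only the fields that are actually present in the record.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@software{{{key},\n{body}\n}}"
```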

Discoverability & Referencing

  • Stable SWHIDs (permalinks) at multiple granularity levels (origin/snapshot/release/revision/directory/content) for long-term reference.
  • Public web “Citation” tab under permalinks for quick copy-paste (optional UI use).
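Five of these granularity levels correspond to SWHID core object types (origins are referenced via qualifiers, which this sketch leaves out). A minimal syntax check, following the SWHID v1 specification:

```python
import re

# Core SWHID syntax per the SWHID v1 specification: "swh" prefix,
# schema version 1, one of five object types, and a 40-character hex
# hash. Qualifiers (e.g., ;origin=...) are out of scope for this sketch.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}$")

def is_valid_core_swhid(swhid: str) -> bool:
    """Return True if the string is a syntactically valid core SWHID."""
    return bool(SWHID_RE.fullmatch(swhid))
```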

Integration Support

  • Comprehensive tech docs & user guides for all of the above.
  • Partner deposit admin view (optional web UI for partners; not required to use APIs/CLI).

5/ Minimum Viable Product: features and workflows overviews

This section captures the basic operations that the OSPO-RADAR Dashboard needs to support. They describe how a user interacts with the system, how the Dashboard connects with the Software Heritage archive, and what minimal features are required for a Minimum Viable Product (MVP). The idea is to keep things lightweight and usable, while ensuring that provenance and integration are handled correctly from the start.

The MVP delivers the minimal end-to-end path, including:

  • Secure account/role management;
  • Collection population (batch import, signal a software repository, search/add);
  • Internal view/filter/curate;
  • A read-only public front-shop that can be embedded in institutional websites through an API.

Following the community review of the proposed MVP, a clear boundary will be set for the OSPO-RADAR project (2025-2027) and all other use cases, features and tools will be deferred to the backlog as issues.

5.1/ Account creation, login and management

5.1.1/ Account creation

#W1 Account creation: General description of the feature’s workflow

How new users request access, with validation by an administrator and retrieval of an API token from Software Heritage. This sets up the user profile and organization link.

Main actor: OSPO manager / dashboard admin

Needs or pain-points to be tackled

OSPO-RADAR capabilities

Adding a new OSPO to the dashboard is a process that requires manual validation to confirm the participants' identity and train them on the tool.

  • Create a user (admin) account associated with a collection - manually validated by SWH team

The OSPO must then be autonomous in managing user accounts linked to its organization.

  • Create accounts for other institutional users (contributors) - validated by user (admin)

We should be able to differentiate between two different user levels in an OSPO: a contributor level to manage only the collection & metadata, and a manager level who can also make changes to the organization's settings (description, users, etc.).

  • Organization management (name, logo, description) restricted to Admin/Manager

It is important to minimize the amount of personal data collected during account creation.

  • Data minimization; retention policy for denied/expired requests; visibility into stored personal data

Whenever possible, existing organization/user accounts from Software Heritage's centralized authentication should be reused.

  • If possible, use same account that is used in the archive.softwareheritage.org

User story

As a prospective OSPO collaborator, I want to request an account on the OSPO-RADAR Dashboard, so that I can access my institution’s collection and contribute entries.

Associated use case

A prospective user opens the OSPO-RADAR Dashboard and submits an account request. The Dashboard records the request and immediately notifies the designated Admin for that organization.

A validation phase follows:

  • If the Admin rejects the request: the Admin records the decision in the Dashboard. The Dashboard closes the request and sends a notification to the user informing them that access was not granted.
  • If the Admin accepts the request: the Dashboard creates the user record in the OSPO-RADAR database (and, if needed, creates or links the organization record). The Dashboard then contacts Software Heritage (SWH) to obtain an organization-scoped API token. Once the token is returned, it is stored in the OSPO-RADAR database. Finally, the Dashboard notifies the user that their account has been approved and provides next-step instructions for signing in.

End state: Either the request is closed as rejected (user informed), or the user is active, linked to the correct organization, and the SWH API token is stored for later authenticated operations.

Access roles

Role

Description (+Personas)

Create collection & Bulk addition

Update & Validate entries in collection

View collection

Admin - manager

Full control: can create sub-collections, bulk add, update/validate entries, and view.

Yes

Yes

Yes

Curator - contributor[u]

Can enrich and maintain collections: update/validate entries and view.

No

Yes

Yes

Researcher

Read-only access to consult collections.

No

No

Yes

Large public

General audience with view-only access.

No

No

Yes

Workflow overview: #W1 Account creation for user (contributor)

sequenceDiagram
    actor User as User
    participant Dashboard as OR Dashboard
    actor Admin as Admin
    participant DB as OR DB
    participant SWH as SWH
    User->>Dashboard: Request an account
    critical Account validation process
        Dashboard->>Admin: Notify
    option Rejected ?
        Admin->>Dashboard: Reject
        Dashboard->>User: Notify the user
    option Accepted ?
        Admin->>Dashboard: Accept
        Dashboard->>DB: Create the user and its organization
        Dashboard->>SWH: Fetch an API Token
        SWH->>DB: Store the token
        Dashboard->>User: Notify the user
    end

5.1.2/ Login to Dashboard

#W2 General description of the feature’s workflow

Standard authentication with error handling and password reset. Nothing fancy, but secure and consistent with institutional practices. Ensure that different institution profiles can be accessed through role-based levels of access.

Main actor: User (contributor) / User (admin)

Needs or pain-points to be tackled

OSPO-RADAR capabilities

The authentication mechanism must be secure: requiring complex passwords and recommending two-factor authentication.

  • Enforce password policy; optional TOTP/WebAuthn 2FA via Keycloak

It should be easy to change a password without needing to contact a manager or Software Heritage.

  • Self-service password change & forgot-password flow handled by Keycloak

User story

  • As the OSPO-RADAR operator, I need to log in as an administrator of my collection.

Associated use case

  1. Authenticate on the site to securely manage my data and my organization's data, including managing other users linked to my organization.
  2. Initial account creation mechanism for the OSPO; this account must be linked to a global organization account on the main SWH website to be able to use the SWH APIs.
  3. Login/forgot-password workflows.
  4. Profile management.
  5. Access Control Lists (ACL) depending on user roles (admin, OSPO operator, and OSPO collaborator).
  6. Create, Retrieve, Update, Delete (CRUD) users.
  7. Organization management (name, logo, description, etc.).

Community review for workflows: #W1 & #W2 Account management (creation, login and management)

Name / Anonymous

Comment

+Upvotes

AAI authentication (EduGain)

Feedback summary and response:

Workflow overview: #W2 Account login

sequenceDiagram
    actor User as User
    participant SWHAD as OR Dashboard
    participant DB as Auth provider
    
    User->>SWHAD: Request login
    
    critical Login process
        User->>SWHAD: submit login form
        SWHAD-->DB: verify information
    option Invalid credentials ?
        SWHAD->>User: Reject login
    option Valid credentials ?
        SWHAD->>User: Confirm login
    end

    critical Lost password process
        User->>SWHAD: Enter an email
        
    option Known email ?
        SWHAD->>User: Send an email containing an<br> ephemeral password reset link
    end
    
    User->>SWHAD: Follow the link and update the password
    SWHAD->>DB: Update user profile
    SWHAD->>User: Confirm password change


5.2/ Populate the software source code dashboard

5.2.1/ Batch import

#W3 General description of the feature’s workflow

Allows an OSPO manager to submit a list of software projects/URLs to populate the collection. The workflow calls “Bulk Save Request[6]” and annotates each origin with the institution's information.
OSPOs already use other tools to track projects and software; they want to reuse available data instead of having to refill forms.

Main actor: Admin / OSPO Manager

Needs or pain-points to be tackled

OSPO-RADAR capabilities

The initial import[v] process needs to be simple and not rely on a specific tool. A simple list of software URLs (link to the project on a forge, a package manager, etc.) should work.

  • Import inputs: paste a newline-separated list of URLs; optional CSV upload (single column or header url); API endpoint for programmatic imports. URL validation supports known forges and package registries.
  • Bonus: Add metadata of sub-collections and other tags for filtering.

A detailed report should be available after the import to know what has been added to the collection, archived, etc.

  • Immediately: import report shown in-app with per-row status (Added to collection, Already in collection)
  • After the Bulk Save Request is ingested: import report with:
      • URL,
      • Save request status (if Failed, plus reason),
      • Archived (SWHID) - heuristic: DIR SWHID of the root directory on the master branch.
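The import input handling (a pasted newline-separated list or a one-column CSV with an optional `url` header) can be sketched as follows. Validation here is deliberately shallow (scheme and host only), assuming forge/registry-specific checks happen server-side:

```python
import csv
import io
from urllib.parse import urlparse

def parse_import_list(text: str) -> list[str]:
    """Parse a pasted newline-separated list or a one-column CSV
    (with or without a 'url' header) into a deduplicated URL list.
    Only shallow validation (scheme + host) is done client-side."""
    rows = [r[0].strip() for r in csv.reader(io.StringIO(text)) if r and r[0].strip()]
    if rows and rows[0].lower() == "url":   # optional CSV header
        rows = rows[1:]
    seen, urls = set(), []
    for candidate in rows:
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc and candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return urls
```

The resulting list would then be handed to the Bulk Save Request call, with rejected rows surfaced in the per-row import report.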

User story

As an OSPO, I want to import a list of software to my collection and view a status report of this import, so that I can quickly populate the collection from existing sources, ensure each origin is archived and assigned an SWHID, and act on a clear per-item outcome without manual re-entry.

Associated use case

  1. The Logged-in User submits a list of software origins to the OR Dashboard.
  2. The OR Dashboard validates the payload (basic URL checks, duplicates[w]) and enqueues a batch “Save Code Now” job with the Task Scheduler.
  3. The Task Scheduler processes the job origin by origin:
    3.1) Calls SWH to archive the origin (or retrieve existing archival state).
    3.2) Receives metadata/SWHIDs from SWH.
    3.3) Sends progress updates back to the OR Dashboard, which updates the
    job status and adds/updates items in the User’s organization collection (including institutional annotations).
  4. When all origins are processed, the OR Dashboard finalizes the job and prepares a status report (successes, already archived, failures with reasons, SWHIDs where applicable).
  5. The OR Dashboard notifies the User that the import has completed and provides access to the report (view/download).


Workflow overview: #W3 Batch import

sequenceDiagram
    actor User as Logged-in User
    participant SWHAD as OR Dashboard
    participant DB as OR DB
    participant SWH as SWH Archive
    
    User->>SWHAD: Send a list of software<br>to add to my catalog
    SWHAD->>DB: Create a bulk import progress report
    SWHAD->>SWH: Call the Bulk Save Request API
    SWHAD->>User: Confirm the request
    
    loop Until the import is finished
        SWHAD<<-->>SWH: Poll the Bulk Save Request<br> monitoring API to update the report 
        SWHAD->>DB: Update the progress report
    end
    
    User->>SWHAD: Check the progress report

5.2.2/ Signal software

#W4 General description of the feature’s workflow

For one-off additions. A researcher provides a URL, a curator is notified and validates it. If the project isn’t there yet, a Save Code Now request is triggered before adding it to the collection.

Goal: have software contributions automatically included in institutionally validated exports that can be reused in CVs, annual activity reports, and grant applications.

Main actor: Researcher / Scientist / Research manager

Needs or pain-points to be tackled

OSPO-RADAR capabilities

Avoid friction due to account creation, login / password etc. and at the same time the form to signal software must not be easily discoverable outside the institution.

  • Provide a simple form for the researcher to submit a URL and indicate whether the institution:
      • created or contributed to the software
      • used or depends on the software
  • Without login: ensure the institutional affiliation of the submitter without forcing account creation (lightweight email check[x]).

OSPO staff need simple curation tools/workflows to triage submissions before anything appears on the public “front-shop” view.

  • Provide simple tools/workflows to manage the list of signaled software. A curation has to be done before adding the software to the front-shop view.

If a project isn’t archived, a simple mechanism in the workflow should make it easy to do.

  • Trigger a save code now if needed.

Researchers mention software in articles that are deposited in scholarly repositories (e.g., HAL); to ease the addition of these inputs to an institutional collection, a notification mechanism to the OSPO could facilitate the input.

Optional: COAR Notify from other infrastructures to include software origins to a collection (relates to the workflow of identifying software in articles)

User story

As a researcher, I want my software contributions to appear in the collection.
Reason:
Researchers want their software outputs to be visible and recognized alongside publications and datasets, without having to manually compile scattered records. Institutionally curated exports ensure accuracy, persistence, and proper citation.

Associated use case

  1. (optional) Researcher includes a codemeta.json file in their repository.
  2. Researcher provides the URL and their institutional email in the simple “signal software” form.
  3. An email is sent to the researcher's email account for validation.
  4. If validated, an email is sent to the OSPO to accept the addition request.
  5. OSPO validates the request to integrate into the institutional dashboard
  6. Origin is archived in Software Heritage → generates an SWHID.
  7. The Dashboard ingests metadata and associates the project with the researcher’s ORCID and institutional affiliation.
  8. OSPO manager validates the metadata and adds the record to the institutional collection.

Metadata fields covered:

  • identifier (SWHID, DOI)
  • author (with ORCID)
  • affiliation (institution)
  • name (software title)
  • version
  • dateCreated, dateModified, datePublished
  • relatedPublication
  • funding, funder
  • license
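A hypothetical codemeta.json covering these fields might look like the following; all values are placeholders, and the list's relatedPublication is expressed here with CodeMeta's referencePublication term:

```json
{
  "@context": "https://w3id.org/codemeta/3.0",
  "@type": "SoftwareSourceCode",
  "name": "example-analysis-toolkit",
  "identifier": "swh:1:dir:0000000000000000000000000000000000000000",
  "version": "1.2.0",
  "author": [{
    "@type": "Person",
    "@id": "https://orcid.org/0000-0000-0000-0000",
    "givenName": "Jane",
    "familyName": "Doe",
    "affiliation": {"@type": "Organization", "name": "Example University"}
  }],
  "dateCreated": "2024-01-15",
  "dateModified": "2025-06-01",
  "datePublished": "2024-02-01",
  "license": "https://spdx.org/licenses/MIT",
  "referencePublication": "https://doi.org/10.0000/example",
  "funding": "Grant 0000",
  "funder": {"@type": "Organization", "name": "Example Funder"}
}
```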

Workflow overview: #W4 Signal software source code asset

sequenceDiagram
    actor OSPO as OSPO
    participant SWHAD as OR Dashboard
    participant DB as OR DB
    participant Intranet as Intranet
    actor Curator as Curator
    actor Researcher as Researcher
    
    OSPO->>SWHAD: Request a magic link<br> to add to a catalog
    SWHAD->>DB: Generate a magic link
    DB->>SWHAD: Displays the magic link
    OSPO->>Intranet: Communicate the link
    Intranet->>Researcher: click the link to signal a software
    Researcher->>SWHAD: fill the form
    SWHAD->>DB: Create an add-to-<br>collection request
    
    critical [Validate the request]
        SWHAD->>Curator: Notify of a new request
    option [Request is invalid]
        Curator->>SWHAD: reject the request
    option [Request is valid]
        Curator->>SWHAD: accept the request
        SWHAD->>SWH: add the software to the collection
    end
    SWHAD->>DB: mark request as handled

5.2.3/ Search and add software

#W5 General description of the feature’s workflow

For one-off additions. A curator searches for a software project by name or URL in Software Heritage’s database. If the project isn’t archived, the Curator triggers Save Code Now and then adds it to the collection.

Main actor: Curator

Needs or pain-points to be tackled

OSPO-RADAR capabilities

Provide access to Software Heritage’s search engine.

  • Built-in SWH search integration (name or URL query) with results returned in-app.
  • One-click “Save Code Now + Add to collection” action when no archival record is found.

Provide a simple interface to query the search engine and identify matching projects, to enable one-off additions.

  • “Add to collection” action on any matched result (with collection/sub-collection picker).
  • Duplicate check with inline warning if the item already exists in the org’s collection.

User story

As a Curator, I want the piece of software I’ve identified to appear in the collection after searching by name or URL, so that I can quickly curate the catalog without manual re-entry and ensure an SWHID exists.

Associated use case

  1. Curator opens Search & Add.
  2. Enters name or URL and runs the search against SWH.
  3. Dashboard lists results with origin URL, SWHID (if any), last snapshot date.
    3a) If a correct match exists → Curator clicks Add to collection → selects target collection/sub-collection → item added (with institutional annotation).
    3b) If no archival record exists → Curator clicks Save Code Now + Add → the system archives the origin, then adds it to the collection.
  4. If the item is already in the collection, the UI shows a duplicate notice and prevents double-addition.

Workflow overview: #W5 search and add software

sequenceDiagram
    actor User as Logged-in User
    participant SWHAD as OR Dashboard
    participant SWH as SWH Archive

    critical search process
        User->>SWHAD: use the search form to find a software 
    option query looks like a URL ?
        SWHAD->>SWH: search origin
    option query by name
        SWHAD->>SWH: search by keywords
    end

    SWH->>SWHAD: return results
    
    critical Results match user query ?
        SWHAD->>User: display results
    option No result match ?
        User->>SWHAD: request the archival of the origin
        SWHAD->>SWH: save code now
        SWH->>SWHAD: return the new archived origin
        User->>SWHAD: Add software to the catalog
    option Result match
        User->>SWHAD: Add software to the collection
    end


Community review for workflows: #W3, #W4 & #W5: dashboard setup and data population (batch import, signal software, and search & add)

Name / Anonymous

Comment

+upvoters

Feedback summary and response:

5.3/ View dashboard, curate and filter

5.3.1/ View internal dashboard

#W6 General description of the feature’s workflow

Once the collection is populated, users (OSPO Manager / Curator) need a simple list view of software projects associated with their institution, with basic filters (dates, research teams, licenses, programming languages). The goal is clarity and fast retrieval—not advanced analytics—while laying the groundwork for later reporting.

Main actor: OSPO Manager / Contributor (Curator)

Needs or pain-points to be tackled

OSPO-RADAR capabilities

No reliable baseline inventory of software produced vs. used across a wide, multi-unit community; data scattered in many tools; hard to start a landscape analysis.

“Since we are composed of many of the universities in XXX we are an interesting catalyst and example (for the XXX Research Council). Some initial work has been done with XXX and we expect some analysis in July but also via the service we procure in the Fall.”

“While we are just starting out, XXX, and our organisation (XXX), have particular expertise when it comes to research software but we are starting with a landscape analysis of software produced and used in our community which is [country]-wide in the Life Sciences. The challenge is having a baseline to start so that we can build on with guidance and overall monitoring.”

  • Internal dashboard list view of all projects in the institution’s collection.

Workflow overview: #W6 View internal dashboard (accessible only to institutional accounts)

sequenceDiagram
    actor User as User
    participant Homepage
    participant Login Form
    participant Dashboard
    
    critical Access the website
        User->>Homepage: Access the website
    option Already logged-in
        Homepage->>User: Homepage with a dashboard button
    option Not logged-in
        Homepage->>User: Homepage with a login button
        User->>Login Form: Access the login form
        Login Form->>Homepage: Redirect user after handling<br>the login process 
    end
    User->>Dashboard: Access the dashboard

5.3.2/ Explore the portal

#W7 General description of the feature’s workflow

Institutions need more than a static list of software projects. They need dynamic, up-to-date views that support monitoring, decision-making, and reporting. Filtering allows OSPO managers to quickly identify trends, gaps, or compliance issues without manual data collection.

Main actor: OSPO Manager

Needs or pain-points to be tackled

OSPO-RADAR capabilities

Need to scope the landscape by organizational unit/team.

Licensing posture unclear across the portfolio.

Use filters (e.g., by department, programming language, license type, activity date, contributor) to explore the institutional software portal and generate real-time insights.

  • Basic filters: date ranges, research teams/units, licenses, programming languages.
  • Unit/Team facet (multi-select), sourced from org metadata.
  • License facet with normalization (e.g., SPDX IDs) + “Unknown” bucket.
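The license facet's normalization with an “Unknown” bucket could, for instance, map free-text license strings to SPDX IDs via an alias table. The sample entries below are illustrative only; a production facet would rely on the full SPDX license list:

```python
# Minimal license-normalization sketch for the license facet described
# above. The alias table is a tiny illustrative sample, not an
# exhaustive mapping of free-text forms to SPDX identifiers.
LICENSE_ALIASES = {
    "mit": "MIT",
    "mit license": "MIT",
    "apache 2.0": "Apache-2.0",
    "apache license, version 2.0": "Apache-2.0",
    "gplv3": "GPL-3.0-only",
    "gnu gpl v3": "GPL-3.0-only",
}

def normalize_license(raw):
    """Map a free-text license string to an SPDX ID, else 'Unknown'."""
    if not raw:
        return "Unknown"
    return LICENSE_ALIASES.get(raw.strip().lower(), "Unknown")
```

Records falling into the “Unknown” bucket can then be surfaced as data-quality hints for curators.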

Metadata fields covered

  • affiliation (department / unit)
  • license
  • programmingLanguage
  • dateModified
  • funding
  • identifier - extrinsic (DOI, HAL-ID, RRID, ROR, ORCID)

User story

As an OSPO manager, I can apply filters in the collection view of the OSPO-RADAR Dashboard to instantly see which departments have the most active projects, which licenses are most commonly used[y], or which projects have not been updated recently, so that I can provide accurate reports to leadership and funders, and identify areas needing support.

Associated use case

  1. OSPO manager logs into the Dashboard platform.
  2. Opens the institutional collection view.
  3. Applies filters such as:
      • Department or organizational unit
      • License type (e.g., MIT, GPL)
      • Programming language
      • Date of last modification (dateModified)
      • Funding source
  4. Dashboard dynamically updates and displays:
      • Counts and distributions (charts, tables)
      • Lists of matching software projects
      • Key indicators (e.g., projects with missing metadata, inactive projects).
  5. OSPO manager exports filtered results for further analysis or inclusion in reports.

5.3.3/ Curate entries with enhanced metadata

#W8 General description of the feature’s workflow

Curators open a software record in the internal dashboard, edit/enrich key metadata, validate fields (licenses, teams, ORCID links, funding), and save. Admin/Managers may review and publish changes to the public “front-shop” view. The goal is to improve data quality with minimal friction while keeping provenance clear (what came from the source vs what the institution added).

Main actor: Curator / Contributor

Secondary actor: Admin / Manager (review & publish)

Needs or pain-points to be tackled

OSPO-RADAR capabilities

Missing or inconsistent metadata across records.

Difficulty normalizing licenses, languages, teams, and identifiers.

  • Controlled vocabularies: SPDX license picker, programmingLanguage picker, org unit/team selector; identifier inputs (SWHID/DOI) with format checks.

Capturing funding and related outputs (papers, datasets) minimally.

  • Simple fields: funding/funder, relatedPublication (ID/URL), minimal free-text notes.

Linking people and affiliations reliably.

  • ORCID lookup/link for authors; affiliation selector (institution → unit/team).

Distinguishing what comes from the code repo vs institutional curation.

Provenance badges (Intrinsic from repo / Extrinsic curated); field-level source indicators.

Hard to tell how the institution relates to a piece of software (created vs contributed vs used vs dependent).

Add four explicit checkboxes on submit/curate forms:

  • Own/Authored: Primary development is by our institution (project we own/lead).
  • Contributed to: We made identifiable contributions to an external project (PRs, commits, packages, docs).
  • Used: We use the software in research/ops (a dependency/tool in our work).
  • Referenced: We cite/mention the software in outputs (papers, reports, curricula), but do not necessarily use or contribute.

Quickly spotting incomplete records.

Bonus: Data-quality hints (e.g., “Missing license”, “No ORCID”, “Unknown language”).
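The four relationship flags above could be modelled as a small controlled vocabulary attached to each curated record. The names and data model below are assumptions for illustration, mirroring the form labels:

```python
# Sketch of the institution-to-software relationship flags described
# above. The flag names mirror the checkbox labels; the data model is
# an assumption, not the project's actual schema.
from dataclasses import dataclass, field

RELATIONS = {"own", "contributed", "used", "referenced"}

@dataclass
class InstitutionRelation:
    """Relationship flags an institution asserts about one software record."""
    flags: set = field(default_factory=set)

    def tick(self, relation: str) -> None:
        """Record one checkbox; reject values outside the vocabulary."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.flags.add(relation)
```

Keeping the vocabulary closed makes the flags filterable and reportable, unlike free-text annotations.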

User Story

As a curator, I want to flag external projects our teams Contributed to so we can credit staff contributions and surface them in evaluations.

Associated use case

  1. Find the project
      1.1) Search by name or URL in SWH.
      1.2) If not found, run Save Code Now.
      1.3) Add it to the collection.
  2. View intrinsic metadata
      2.1) The Dashboard shows what SWH already has as intrinsic or otherwise annotated metadata (name, URL, SWHID, license, language).
      2.2) These fields are read-only.
      2.3) Optional: the Dashboard highlights data-quality issues (e.g., missing ORCID).
  3. Annotate
      3.1) Tick Contributed to (and optionally Own/Authored, Used, Referenced).
      3.2) Add contributors (name, ORCID if available), team/unit, and evidence links (PRs, commits, citations).
      3.3) Annotate missing fields (license, funder, keywords) without overwriting intrinsic data.
  4. Save & review
      4.1) Save annotations (as draft).
      4.2) Admin/Manager can review.
      4.3) After review, the annotation is validated and captured locally. Annotations can also be sent to the main archive for storage.

(Guerry, 2025)

5.3.4/ Generate a component inventory with SWHIDs

#W9 General description of the feature’s workflow

Generate a CSV file containing an institutional inventory of software components, their source repository URLs, and their corresponding Software Hash identifiers (SWHIDs).

Reason:
Institutions need a reliable record of their software assets that can be reused for reporting, compliance checks, and integration with other systems (e.g., CRIS, data catalogs, grant reports). By standardizing on SWHIDs, the exported file guarantees long-term traceability and avoids duplication.

Main actor: OSPO Manager / Research Support Staff

Needs or pain-points to be tackled

OSPO-RADAR capabilities

Institutions need certainty that every software component in their inventory is safely archived and has a clear, reproducible SWHID. Without this, exported lists may include unarchived or untraceable code, undermining trust in reporting and compliance.

Metadata fields covered:

  • name (component)
  • identifier (SWHID)
  • codeRepository (URL)

User story

As an OSPO manager, I can export a CSV with three columns: Component/Project Name, URL (origin), SWHID; so that my institution has an authoritative list of its archived software assets, which can be used for internal tracking or shared with funders.

Associated use case

  1. The user selects a set of projects (all institutional projects or a filtered subset).
  2. For each project, the Dashboard checks if the origin is archived in Software Heritage.
      • If yes, retrieve the directory SWHID.
      • If not, trigger “Save Code Now” automatically, then retrieve the new SWHID.
  3. Compile results into a structured CSV with:
      • Component/Project name
      • Repository URL
      • Directory SWHID
  4. Provide the CSV as a downloadable artifact or via API.
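The final compilation step could produce the three-column CSV with a few lines of code; the input record shape (name/url/swhid keys) is an assumed internal model:

```python
import csv
import io

# Sketch of the three-column inventory export described above. The
# record shape (name/url/swhid keys) is an assumption about the
# dashboard's internal model.
def inventory_csv(records) -> str:
    """Serialize (name, origin URL, directory SWHID) rows to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Component/Project Name", "URL", "SWHID"])
    for rec in records:
        writer.writerow([rec["name"], rec["url"], rec["swhid"]])
    return buf.getvalue()
```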

Workflow overview: #W9 Export a list of assets with most recent dir SWHID

sequenceDiagram
    actor User as Logged-in User
    participant SWHAD as OR Dashboard
    participant SWH as SWH Search engine
    User->>SWHAD: Access the collection
    SWHAD->>SWH: Fetch user's organization collection
    SWH->>SWHAD: Returns a list of software
    SWHAD->>User: Displays the collection
    loop Search refinement
        User->>SWHAD: Add / remove a filter or a keyword, sort results
        SWHAD->>SWH: Query user's organization collection
        SWH->>SWHAD: Returns a list of software
        SWHAD->>User: Displays the collection
    end
    User->>SWHAD: Request an export
    SWHAD->>User: Returns a CSV file


Community review for workflows: #W6, #W7, #W8 & W9: internal view and dashboard functionalities

Name / Anonymous

Comment

+upvoters

Feedback summary and response:

5.4/ Public view and search capabilities

5.4.1/ Front-shop view

#W10 General description of the feature’s workflow

Beyond the internal dashboard, institutions often need a public-facing view of their software outputs. The idea is to expose a “front-shop” that displays selected projects from the dashboard, with metadata from intrinsic sources or from institutional annotations.

This is a read-only public portal that surfaces a curated subset of records. It includes:

  • Portal homepage (entry point for all users).
  • Institution-specific OSPO pages[z].
  • Software-specific modals.

The public site is fed directly from the dashboard (source of truth). Institutions control visibility through publish/unpublish, feature/unfeature toggles.

The list of published entries can be displayed on any institutional website, using the dashboard API endpoint.
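As an illustration of such embedding, here is a minimal sketch that renders a JSON payload of published entries into a plain HTML list. The endpoint path (e.g., `/api/v1/collections/<org>/published`) and the field names are assumptions for illustration, not the final API.

```python
import html

def render_published_list(records):
    """Render published entries (dicts with assumed keys "name" and
    "url") as a plain HTML list any institutional website can embed."""
    items = []
    for rec in records:
        label = html.escape(rec["name"])
        href = html.escape(rec["url"], quote=True)
        items.append(f'  <li><a href="{href}">{label}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```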

Main actor: All (public visitors, researchers, funders, OSPO staff)

Needs or pain-points to be tackled

OSPO-RADAR capabilities

Institutions want to showcase software outputs publicly without duplicating data.

The front-shop portal is automatically fed from the curated internal collection; OSPO staff select which records are public. No re-entry required.

Need to control publication while keeping workflows lightweight.

Publish/unpublish and feature/unfeature toggles in the dashboard, with instant sync to the public site.

Make it easier for machines to navigate the pages by implementing Signposting patterns

Public pages include Signposting HTTP link headers and structured links (to SWHID, DOI, related publications), enabling automated harvesting and interoperability.

Institutions need to know impact and reach of their portfolio.[aa][ab][ac]

Built-in basic analytics: counts of page views, downloads, and outbound clicks, aggregated per project and per institution.
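The Signposting capability above amounts to emitting typed links in the HTTP `Link` response header. A minimal sketch, assuming the portal chooses the (target, relation) pairs for each page:

```python
def signposting_link_header(links):
    """Build an HTTP `Link` header value from (target, rel) pairs,
    following the typed-link convention used by Signposting."""
    return ", ".join(f'<{target}>; rel="{rel}"' for target, rel in links)

# Example: a software page pointing to its persistent identifier and
# its machine-readable description (URLs are placeholders).
header = signposting_link_header([
    ("https://example.org/record", "cite-as"),
    ("https://example.org/codemeta.json", "describedby"),
])
```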

User story

  • As an OSPO staff member, I can mark projects as public/featured and provide short descriptions so that visitors can discover our key software.
  • As a public visitor, I can browse/search the portal and open a software page to view core metadata and links.
  • As a researcher, I can find the project’s repository, archival identifier, license, and citation information from a single page.

Associated use case

  1. Browse/search the portal (Public visitor)
  1. Public visitor accesses the portal homepage.
  2. Uses search box or filters (name, tags, domain, license) to browse.
  3. Featured projects appear highlighted on the homepage/OSPO page.
  2. View a software entry
  1. Visitor selects an entry→ public software entry opens.
  2. Page displays:
  • Title / description (from dashboard annotation).
  • Repository URL (link to forge).
  • SWHID (stable archival link) + iframe?
  • License(s) (SPDX), if available (warning if missing)
  • Citation info (e.g., BibTeX/CSL).
  • Related outputs (papers, datasets).
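The "warning if missing" rule for licenses can be sketched as a simple check over a public entry record; the field names below are illustrative, not the final schema:

```python
def license_warning(entry):
    """Return a warning string when the SPDX license field is missing,
    mirroring the "warning if missing" rule for public entries."""
    if not entry.get("license"):
        return "Warning: no SPDX license declared for this entry."
    return None

entry = {  # illustrative record; keys are assumptions, not the schema
    "title": "Example Analysis Toolkit",
    "repository": "https://gitlab.example.org/lab/toolkit",
    "swhid": "swh:1:dir:...",  # stable archival link
    "license": None,           # triggers the warning
}
```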

Workflow overview: #W10 Showcase OSPO’s collection and subcollections

sequenceDiagram
    actor User as User
    participant Homepage
    participant Institution Collection
    participant Software Entry
    User->>Homepage: Access the website
    Homepage->>User: Returns the list of OSPOs
    loop Check an OSPO Collection
        User->>Institution Collection: Use filters to find<br>a specific software
        Institution Collection->>User: Update the list
    end
    User->>Software Entry: Access a software entry

Community review for workflows: #W10 Front-shop view and API usage of collection list

Name / Anonymous

Comment

+upvoters

Feedback summary and response:

6/ Non-functional requirements to address

6.1/ Accessibility

To allow the greatest number of people to access the site's content, we aim for level AA conformance with the Web Content Accessibility Guidelines (WCAG).

6.2/ Performance / compatibility

We will depend on external APIs (search, deposit, etc.), so the interface must display clear loading and error states while waiting for responses.

The public website must:

  • work with JavaScript disabled
  • work on multiple screen sizes (responsive design)
  • work on the latest three versions of major browsers

The dashboard itself will require JavaScript and at least a tablet-sized screen to work properly.

6.3/ Legal requirements

The OSPO-RADAR Dashboard will comply with EU and French law and transparent policies for cookies, privacy, and terms:

  • GDPR (EU): collect only necessary data, minimal cookies (consent only if non-essential).
  • Provide Privacy Notice, Terms of Use, Content Policy, and DPO contact.
  • Host in France/EU[ad]; document data flows and log retention.

6.4/ Sustainability & maintainability

The OSPO-RADAR Dashboard will be developed as an open source project under a permissive license; it can be self-hosted, forked, and extended without lock-in. We will provide:

  • License & governance: permissive license (e.g., MIT/Apache-2.0), CONTRIBUTING.md, CODE_OF_CONDUCT.md, lightweight maintainer model, issue templates, release notes.
  • Architecture: modular services with clear boundaries; stable, versioned public APIs (SemVer); dependency minimization; replaceable adapters for SWH/ID providers.
  • Data portability: full export/import (JSON/CSV), documented schemas, no hidden state; migration scripts between versions.
  • Quality & tests: CI/CD with unit/integration/e2e tests; code coverage targets; static analysis and linting.
  • Docs & training: comprehensive documentation, quick-start recipes, upgrade guides, and periodic refresh of examples.
  • Community: public tickets of features for MVP and next versions, discussions, and periodic community calls; encourage upstreaming of institutional extensions.
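The data-portability commitment above can be illustrated with a minimal export/import round trip; the `ospo-radar/v1` schema tag is a placeholder for the documented schema, not a committed format:

```python
import json

def export_records(records):
    """Serialize dashboard records to a portable JSON document,
    tagged with a schema identifier for later migrations."""
    return json.dumps({"schema": "ospo-radar/v1", "records": records}, indent=2)

def import_records(payload):
    """Re-load an export, checking the schema tag so migration
    scripts can dispatch on it."""
    doc = json.loads(payload)
    if doc.get("schema") != "ospo-radar/v1":
        raise ValueError("unknown export schema")
    return doc["records"]
```

A lossless round trip (`import_records(export_records(r)) == r`) is the property a CI test would pin down.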

7/ What’s next? Interoperability & reusability of the Dashboard.

OSPO-RADAR’s users and partners emphasize different starting points and levels of ambition. Taken together, these inputs point to a modular product that is easy to plug in, easy to ignore if out-of-scope, and easy to reuse elsewhere.

Key signals from partners

  • Some are procuring dashboards/mining systems and want the SWHAD to expose a clean API their tools can call—without forcing a platform switch.
  • Others already track software in the Research Software Directory (RSD) and need lightweight connectors rather than overlapping features.
  • Several use Dataverse and prefer bi-directional sync (metadata + links, ideally files where appropriate).
  • Many want to move beyond static checklists toward interactive, guided workflows (while acknowledging that SWH isn’t yet embedded in current practices).
  • A subset notes that managing researchers’ software isn’t their mandate; for them, the SWHAD should stay low-touch, primarily signposting to external services and repositories.

Design implication: Favor simple UX and a minimal, well-chosen feature set over breadth. Make integration the default path to value.

7.1/ Integration with existing tools

Finally, OSPO-RADAR must be “modular, interoperable, and reusable, with components that are openly documented and capable of integrating seamlessly with repositories, registries, forges, and publishing platforms. In this way, OSPO-RADAR can become the connective tissue of the scholarly ecosystem, turning lessons learned into durable infrastructure for research software management.” (see §3.2)

Evaluate integrations

  • Software Heritage (SWH)
  • Scholarly infrastructures as a source of input and an OSPO-RADAR Dashboard consumer (HAL, Zenodo, the Research Software Directory (RSD), and others)
  • One-way (read) to surface existing software and metadata.
  • Optional write-back: push enriched links (e.g., SWHIDs, citations) when enabled.
  • Bi-directional metadata sync (datasets ↔ software records, related identifiers).
  • Map persistent IDs (DOI, SWHID, ORCID, ROR) and provenance fields.

Interaction pattern

  • API-first: every feature available via REST/GraphQL endpoints.
  • Connectors over code forks: adapters that translate between schemas (CodeMeta, RSD JSON, Dataverse metadata blocks).
  • Progressive enhancement: start with read-only discovery.
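A connector of this kind is essentially a schema mapping. Below is a minimal sketch translating a simplified RSD-style record into a CodeMeta v3 document; the RSD-side field names are illustrative, not the official RSD schema:

```python
# Illustrative mapping from RSD-style fields to CodeMeta properties.
RSD_TO_CODEMETA = {
    "brand_name": "name",
    "description": "description",
    "repository_url": "codeRepository",
    "concept_doi": "identifier",
}

def rsd_to_codemeta(rsd_record):
    """Translate a (simplified) RSD record into a CodeMeta v3 document."""
    doc = {
        "@context": "https://w3id.org/codemeta/3.0",
        "@type": "SoftwareSourceCode",
    }
    for src, dst in RSD_TO_CODEMETA.items():
        if src in rsd_record:
            doc[dst] = rsd_record[src]
    return doc
```

The same table-driven pattern generalizes to the Dataverse and HAL adapters: one declarative mapping per connector, no forked code.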

From checklists to guided flows

  • Replace static “to-do”s with step-by-step wizards (e.g., “Archive → Reference → Describe → Cite”).
  • Offer contextual tips and auto-fill from existing records (RSD/Dataverse) to reduce user effort.
  • Keep a no-ops path for orgs that only want pointers to external services.
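The guided “Archive → Reference → Describe → Cite” flow reduces to tracking which steps are done and surfacing the next one; a minimal sketch:

```python
# The four ARDC steps of the guided flow, in order.
STEPS = ["Archive", "Reference", "Describe", "Cite"]

def next_step(done):
    """Return the first ARDC step not yet completed, or None when
    the guided flow is finished (the no-ops path returns None too)."""
    for step in STEPS:
        if step not in done:
            return step
    return None
```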

Outcomes

  • Orgs with existing stacks gain value via APIs and connectors.
  • Orgs without stacks can still use a simple, focused OSPO-RADAR Dashboard without feature bloat.
  • Researchers benefit from fewer steps, better links, and durable identifiers across the ecosystem.

7.2/ Functionalities and capabilities to keep in mind after MVP

  • Provenance & traceability
  • Capture event logs (who/when: archive, curate, edit, approve).
  • Store verbatim codemeta.json + normalized model; show diffs across versions/snapshots.
  • Identifier ecosystem
  • Intrinsic: SWHIDs (dir/rev/snp/origin).
  • Extrinsic: DOIs (Zenodo), ORCID (people), ROR (orgs), HAL-ids, RRIDs.
  • Deduplication rules across identifiers; merge/split records.
  • Interoperability & connectors
  • Harvesters for GitHub/GitLab/self-hosted forges, HAL, Zenodo, Dataverse, InvenioRDM, CRIS.
  • Webhooks & scheduled jobs; “Save Code Now” integration at import time.
  • Import profiles: CodeMeta v3, CITATION.cff; export profiles: CodeMeta v3, BibLaTeX, CSV, JSON.
  • Metadata quality & policy
  • Validations (SPDX license, dates, ORCID format, ROR match).
  • Required vs optional fields
  • Quality signals: README present, license present, CI badge, test coverage link, container/environment spec.
  • Software quality & reusability
  • Software hygiene
  • Dependency management
  • Containerization
  • Curation workflows
  • Queue with statuses: proposed → needs fix → under review → approved → published.
  • Field-level locks after approval; change requests; audit trail.
  • Bulk actions (approve/assign curator/retire).
  • Search, filters, and analytics
  • Filters: department (ROR), language, license, last activity, funding, project, topic.
  • Saved views; scheduled exports.
  • Indicators: active/inactive, missing metadata, risky licenses, “no release” flag.
  • Institutional KPIs: growth, reuse (downstream deps), citation coverage.
  • Public “Front-Shop”
  • Theming (logo/colors), featured projects, SEO (sitemaps, schema.org).
  • Soft publication rules (auto-publish if policy met; otherwise hold for OSPO).
  • Reporting & exports
  • One-click institutional report (PDF/CSV) + API.
  • Researcher/Group exports for CV/annual review (CSV, PDF, BibLaTeX).
  • Benchmark pack (aggregate, anonymized) — optional, opt-in.
  • Roles & permissions
  • Roles: Viewer, Contributor, Curator, OSPO Manager, Admin.
  • Scopes by org unit (department/lab); SSO (Keycloak/OIDC).
  • Consent flags for public display; PII minimization.
  • Ops & deployment
  • SaaS & on-prem profiles; staging/production; backups & retention policies.
  • Observability (metrics, logs, traces); rate limiting; job retries.
  • Accessibility (WCAG 2.1), i18n for UI + multilingual metadata.
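Several of the capabilities above (deduplication across identifiers, metadata validations) depend on recognizing well-formed SWHIDs. A minimal syntax check following the core `swh:1:<type>:<40 hex>` shape of the SWHID specification (origin SWHIDs use the `ori` tag); qualifiers are simply stripped in this sketch:

```python
import re

# Core SWHID shape: swh:1:<type>:<40 hex digits>; qualifiers
# (";origin=..." etc.) are ignored by this sketch.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp|ori):[0-9a-f]{40}$")

def is_valid_swhid(candidate):
    """Check the core syntax of a SWHID, dropping any qualifiers."""
    core = candidate.split(";", 1)[0]
    return bool(SWHID_RE.match(core))
```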
8/ The road ahead: a sustainable service model

The project delivers the OSPO-RADAR Dashboard as its primary outcome: a standards-based platform that enables OSPOs to manage, archive, and showcase research software. The dashboard can serve as the foundation for a broader suite of institutional products, designed to amplify the visibility and strategic use of research software in academia. A dedicated Open Science helpdesk will support product adoption and effective use of the dashboard.

Once the dashboard is in place, a further opportunity emerges: regular institutional reporting. These reports, drawing on the dashboard’s data, the Software Heritage archive, and large-scale analysis of that archive, could transform raw information into actionable insights. They enable institutions to:

  • Track software outputs across departments and research groups;
  • Assess compliance with Open Science policies and Open Source guidelines;
  • Identify flagship projects, emerging practices, and sustainability gaps;
  • Demonstrate institutional contributions to funders and policymakers.

As Roberto Di Cosmo has often emphasized, university OSPOs are the local execution engines of open-source and open-science policy[ae]: they sit at the interface between researchers and institutional functions (tech transfer, legal, research office, libraries/open science), track national and local policy evolution, and build a coherent, institution-wide view of software production. In practical terms, they coach and equip research teams across the full lifecycle: Archive & Reference (deposit in Software Heritage and, where relevant, HAL; assign stable SWHIDs), Describe & Cite (maintain codemeta.json; generate publisher-ready software citations linked to research outputs and evaluation), Compliance & License (default-open decision flow, compatibility checks, and legal support), and Development & Dissemination (good forge hygiene, CI/testing, packaging, onboarding, and community practices)  (Di Cosmo at UGA).

OSPO-RADAR is designed as the operational gateway for these tasks, giving OSPOs the workflows, metadata, and dashboards needed to make policy actionable at scale.

The pilot report on French academic institutions preserved in Software Heritage (Di Cosmo, July 2025) demonstrates the feasibility of such a product. Using the institutional analysis pipeline, Software Heritage was able to provide:

  • Key numbers (projects, contributions, active developers, activity timespan);
  • Distributions and trends (last activity, first contributions, project lifetimes);
  • Top repositories by contributions, stars, and longevity;
  • Programming languages and domains;
  • Citation practices (CodeMeta and CITATION.cff adoption).

The OSPO-RADAR dashboard, complemented by institutional reports, establishes a product ecosystem that extends beyond the project’s two-year timeline. Reports can be offered as:

  • Membership benefits for OSPO-RADAR institutional partners;
  • Services integrated into national infrastructures (e.g., HAL, EOSC);
  • Benchmarks for international collaborations (SciCodes, ReSA, RDA).

For OSPOs, institutional-level reports represent far more than static analyses: they function as strategic tools that allow institutions to monitor software production across departments and labs, identify flagship projects and emerging communities, benchmark practices such as metadata adoption and project sustainability, demonstrate impact to funders and policy bodies, and support compliance with Open Science mandates, including archiving and citation readiness. By packaging the dashboard’s data into exportable reports, OSPO-RADAR extends its role beyond day-to-day management, providing OSPOs with a means to communicate effectively with leadership, funders, and external stakeholders.

In this way, OSPO-RADAR moves from a one-off project deliverable to a sustainable service model, positioning Software Heritage as both an archival resource and a provider of institutional intelligence for research software and beyond.


References

Description/Link

 https://www.zotero.org/groups/5682994/faircoreeosc_d./library 

Alliez, P., Cosmo, R. D., Guedj, B., Girault, A., Hacid, M.-S., Legrand, A., & Rougier, N. (2020). Attributing and Referencing (Research) Software: Best Practices and Outlook From Inria. Computing in Science & Engineering, 22(1), 39–52. https://doi.org/10.1109/MCSE.2019.2949413

Azzouz-Thuderoz, M., Del Cano, L., Castro, L. J., Dumiszewski, Ł., Garijo, D., Gonzalez Lopez, J. B., Gruenpeter, M., Schubotz, M., & Wolski, M. (2023). SIRS Gap Analysis Report. https://zenodo.org/records/10376006

Bilder, G., Lin, J., & Neylon, C. (2020). The Principles of Open Scholarly Infrastructure. https://doi.org/10.24343/C34W2H

Carlin, D., Rainer, A., & Wilson, D. (2023). Where is all the research software? An analysis of software in UK academic repositories. PeerJ Computer Science, 9, e1546. https://doi.org/10.7717/peerj-cs.1546

Declaration on Research Assessment. (2013). San Francisco Declaration on Research Assessment. DORA. https://sfdora.org/read/

Di Cosmo, R. (2025, September 8). Academic Software Landscape overview through the Software Heritage looking glass. Access to research software: Opportunities and challenges, OECD. Zenodo. https://zenodo.org/doi/10.5281/zenodo.17075792

Di Cosmo, R., Gruenpeter, M., & Zacchiroli, S. (2018). Identifiers for Digital Objects: The Case of Software Source Code Preservation. 1–9. https://doi.org/10.17605/OSF.IO/KDE56

Di Cosmo, R. (2023, March). SWHID specification kickoff meeting. SWHID kick-off meeting, Online Conference. https://hal.science/hal-04121507

Dillon, C. (2025). Academic OSPOs, What are they & Why we need them! 5ème séminaire de l’écosystème Recherche Data Gouv, Lille. https://rdg-seminaire5.sciencesconf.org/program/details

Directorate-General for Research and Innovation (European Commission) & EOSC Executive Board. (2022). Strategic Research and Innovation Agenda (SRIA) of the European Open Science Cloud (EOSC). Publications Office of the European Union. https://data.europa.eu/doi/10.2777/935288

Eglen, S., & Nüst, D. (2019). CODECHECK: An open-science initiative to facilitate sharing of computer programs and results presented in scientific publications. Septentrio Conference Series, 1. https://doi.org/10.7557/5.4910

EOSC Executive Board & EOSC Secretariat. (2020). Scholarly infrastructures for research software. Report from the EOSC Executive Board Working Group (WG) Architecture Task Force (TF) SIRS. European Commission. Directorate General for Research and Innovation. https://data.europa.eu/doi/10.2777/28598

Garijo, D., Arroyo, M., Gonzalez, E., Treude, C., & Tarocco, N. (2024). Bidirectional Paper-Repository Tracing in Software Engineering. Proceedings of the 21st International Conference on Mining Software Repositories, 642–646. https://doi.org/10.1145/3643991.3644876

Granger, S., Gruenpeter, M., Monteil, A., Nivault, E., & Sadowska, J. (2022, October 26). Modérer un dépôt logiciel dans HAL: Dépôt source et dépôt SWHID. Inria ; CCSD ; Software Heritage. https://inria.hal.science/hal-01876705

Gruenpeter, M., Sadowska, J., Nivault, E., & Monteil, A. (2022). Create software deposit in HAL. Inria ; CCSD ; Software Heritage. https://inria.hal.science/hal-01872189

Gruenpeter, M., Granger, S., Monteil, A., Chue Hong, N., Breitmoser, E., Antonioletti, M., Garijo, D., González Guardia, E., Gonzalez Beltran, A., Goble, C., Soiland-Reyes, S., Juty, N., & Mejias, G. (2023). D4.4—Guidelines for recommended metadata standard for research software within EOSC. https://doi.org/10.5281/ZENODO.8199104

Guerry, B. (2025, September 24). Renforcer la visibilité et l’interconnexion entre les OSPOs du secteur public. Inauguration de l’Open Source Program Office de l’Université Grenoble Alpes, Saint Martin d’Hères. https://ospo-uga.sciencesconf.org/data/pages/bg_dinum_uga_2025_v1.0.pdf

Katz, D. S., & Barker, M. (2023). The Research Software Alliance (ReSA). Upstream. https://doi.org/10.54900/zwm7q-vet94

Le Berre, D., Jeannas, J.-Y., Cosmo, R. D., & Pellegrini, F. (2023). Higher Education and Research Forges in France—Definition, uses, limitations encountered and needs analysis [Report]. Comité pour la science ouverte. https://doi.org/10.52949/37

Lopez, M. (2021, February 7). Open Source Program Offices (OSPO) and their role in OSS ecosystems. How having an OSPO might help to open source software ecosystem sustainability. Fosdem 2021, Online Conference. https://fosdem.org/2021/schedule/event/community_devroom_ospo_oss_ecosystems/

Malone, J., Brown, A., Lister, A. L., Ison, J., Hull, D., Parkinson, H., & Stevens, R. (2014). The Software Ontology (SWO): A resource for reproducibility in biomedical data analysis, curation and digital preservation. Journal of Biomedical Semantics, 5(1), 25. https://doi.org/10.1186/2041-1480-5-25

Mayernik, M. S. (2016). Research data and metadata curation as institutional issues. Journal of the Association for Information Science and Technology, 67(4), 973–993. https://doi.org/10.1002/asi.23425

Rios, F. (2018). Incorporating Software Curation into Research Data Management Services: Lessons Learned. International Journal of Digital Curation, 13(1), Article 1. https://doi.org/10.2218/ijdc.v13i1.608

Task Force on Best Practices for Software Registries, Monteil, A., Gonzalez-Beltran, A., Ioannidis, A., Allen, A., Lee, A., Bandrowski, A., Wilson, B. E., Mecum, B., Du, C. F., Robinson, C., Garijo, D., Katz, D. S., Long, D., Milliken, G., Ménager, H., Hausman, J., Spaaks, J. H., Fenlon, K., … Morrell, T. (2020). Nine Best Practices for Research Software Registries and Repositories: A Concise Guide (arXiv:2012.13117). arXiv. http://arxiv.org/abs/2012.13117

Treloar, A., & Wilkinson, R. (2008). Rethinking Metadata Creation and Management in a Data-Driven Research World. 2008 IEEE Fourth International Conference on EScience, 782–789. https://doi.org/10.1109/eScience.2008.41

Université Paris Saclay. (2019, November 29). Introduction to data management plans. Université Paris-Saclay. http://www.universite-paris-saclay.fr/en/recherche/science-ouverte/les-donnees-de-la-recherche/introduction-data-management-plans

van de Sandt, S., Nielsen, L. H., Ioannidis, A., Muench, A., Henneken, E., Accomazzi, A., Bigarella, C., Lopez, J. B. G., & Dallmeier-Tiessen, S. (2019). Practice meets Principle: Tracking Software and Data Citations to Zenodo DOIs (Version 1). arXiv. https://doi.org/10.48550/ARXIV.1911.00295

Young, J., Barba, L. A., Choudhury, S., Flanagan, C., Lippert, D., & Littauer, R. (2024). A Definition of an Academic OSPO. https://doi.org/10.5281/ZENODO.13910683


Appendices

Appendix A: Premortem

Identified Risk / Cause of Failure

Proposed Mitigation Measure

The product doesn’t match functional needs.

Define key functionalities and a clear MVP.

No active users or engagement.

Use an agile, iterative approach with frequent user testing and feedback.

Users themselves don’t know or can’t articulate what they need.

Early engagement, lightweight prototypes, and validation sessions.

Too many divergent use cases → scattered development and feature overload.

Define scope via governance and Product Owner oversight.

Service architecture or delivery model (SaaS, archive integration) is unclear or unmanageable.

Clarify delivery model early; validate feasibility with the technical team.

Operating costs are too high.

Design for efficiency; plan, budget and sustainability model from start.

Users can’t operate the tool autonomously → overload on the helpdesk.

Ensure usability; provide training and clear documentation.

Governance is unclear → difficulty prioritizing features, risk of scope creep.

Establish a governance structure and validation mechanism.

Some partners perceive the product as a competitor → threatens collaboration.

Communicate clearly why Software Heritage is the right actor to deliver this solution.

Not all potential user groups were identified.

Map stakeholders early; revisit and update regularly.

Another team or competitor develops a similar service in parallel.

Monitor the ecosystem and position OSPO-RADAR as complementary.

Insufficient resources for development, deployment, and maintenance.

Secure adequate funding; adjust scope if needed.

Academia and industry have fundamentally different needs.

Keep modular design; allow context-specific configurations.

Negative reputation risk (e.g., concerns over tracking or privacy).

Include privacy protections (e.g., hide personal data when needed).

Appendix B: Persona Academic OSPO Manager

According to C. Dillon (2025), “The overarching goal of an academic OSPO is to maximize the social and economic impact of open source software for research & education.”
(Young et al., 2024)

Bio

Name: Sofia

Age: 38

Location: Lisbon, Portugal

Languages: Portuguese, English, Spanish

Professional life

  • Role: Academic OSPO Manager
  • Skills: Open source governance, policy drafting, community management, basic coding (Python, R), public speaking
  • Experience: 10+ years in research project management, 4 years running the university’s OSPO
  • Decision level: Influencer and coordinator; reports to Vice-Rector for Research

Needs

  • Build open source awareness across the institution
  • Define and implement OSPO policies and best practices
  • Support researchers in licensing, compliance, and community contributions
  • Connect the institution to European and global OSPO networks (CURIOSS, CHAOSS, etc.)
  • Deploy tools for the ecosystem (legal compliance, metrics, etc.)
  • Keep up to date with available tools for OSPOs

Relations

  • Member of OSPO Alliance and European task forces
  • Regular attendee at SciCodes meetings
  • Collaborates with IT, legal, research office, library teams and researchers, technology transfer office

Influences / Information sources

  • Reads OSPO newsletters, EOSC and OpenAIRE updates, and Horizon Europe guidelines
  • Follows the Software Heritage newsletter
  • Follows GitHub discussions on followed repositories
  • Reads European policy reports

Motivation / Drivers

  • Make open source the academic default
  • Help researchers avoid legal and technical pitfalls
  • Create a thriving, FAIR-aligned open source ecosystem within the university

Tech Stack

  • Repositories: GitHub, GitLab, Zenodo, etc.
  • Metadata: CodeMeta, RSMD, DataCite, PublicCode
  • Tools: SPDX, CycloneDX, REUSE, OpenRefine, Jupyter Notebooks, SonarQube
  • Platforms: OSPO.Zone, OpenAIRE, EOSC Portal, OSPO Alliance

Superpowers

  • Translates legal jargon into plain language for researchers
  • Bridges policy and tech communities inside the institution
  • Energizes people around open source and FAIR goals
  • Facilitates the deployment of tools for research software engineers (e.g., SonarQube, CI/CD pipelines)

Daily Routine

09:00 → Morning espresso + email check
10:30 → Meeting with research office on Horizon Europe reporting
12:00 → Lunch with IT lead
14:00 → Internal OSPO training session
15:30 → Review open source policy draft
16:00 → Review SonarQube analysis reports and CI/CD pipeline status
17:00 → OSPO network call
21:00 → Check GitHub PRs on OSPO-RADAR, CodeMeta

Communication Channels

  • Email
  • Slack / Mattermost
  • GitHub
  • Conferences, workshops
  • Newsletters

Motto

“Open is not just about the license, it's a culture, and we are committed to implementing open practices with articles, with data, and with source code.”

SWH profile

  • User status: B2B collaborator, potential deposit partner, interested in metadata integration
  • Engagement level: Moderate; starting integration explorations with Software Heritage
  • Incentives: Ensure long-term preservation and visibility of academic software
  • Pain points: Lack of developer capacity, unclear internal policies, balancing compliance with innovation
  • Information retrieval: Gets info through OSPO networks, SciCodes calls, and Software Heritage outreach
  • Support needed: Clear documentation, example integration workflows, and success stories from peers
Appendix C: A collection of use cases from the RSMD workshop (2023)

Actor - Who?

Action - What?

Reason - Why?

Scenario

metadata

Stakeholder that does the action

Action needed

The goal of the action: why does this actor need this action

As a [actor], I can [action] so that [reason]

CodeRepository

Software developer

open an issue

contribute to an existing tool to improve it

identifier

Software author or responsible party asks for a PID

PID provider mints/calculates a PID

Ask

Assign/Calculate

To make the resource easily identifiable to the world

As an author/responsible party of the software, I ask for a PID (extrinsic) or enable automatic calculation (intrinsic, e.g., by having the software in SWH) so that my software has a unique PID

author

Curator or from code repository/README

Entered by a curator or harvested from another site or within the software

attribution/contact

As a collaborator, I can identify a creator of the code so that they get attribution in a citation

dateModified

Developer / researcher

As a developer / researcher, I can check the modified date so that I know whether the software is up to date (rather than last worked on a long time ago).

BuildInstructions

A user

needs to install or run the software

To use the software locally or on their own hardware (such as an HPC system or the cloud)

DateCreated

Developer of the software

As a developer, I can get credit and attribution for software I worked on.

Readme

A user

Wants to understand if the software is useful for their purpose

To select the software they are going to use

Readme

A user

Wants to understand the requirements of using a given software (see also software requirements)

To decide whether or not the requirements can be met before running the software

Readme

A user

Wants to know who to acknowledge

To add a reference to the software tool in a publication, or a poster or any other scholarly output

datePublished

Developer / Aggregator

Citation

So that people know when it was released.

version

User

refer a specific version of software

To be able to cite it for reproduction of older results from colleagues

As a user I can refer to a specific version of a software so that I can cite it when reproducing older research.

Keyword

Search engine

Help in findability of the resource, but also in the classification of it.

Vocabularies/Ontologies

Resources need to be classified in order to be easily findable

A postdoc looking for software that helps analyse their data

ProgrammingLanguage

User / search engine

search for language specific tools

To enable interoperability with other tools written in the same language

To allow reuse by users who can use that language

A teacher wanting to illustrate a way to solve a problem using a specific language

runtimePlatform

User / Contributor

Use (or develop) the software

To have compatible environment where the software should work as intended

As a user (or contributor) I can use (or develop) the software so that I have a compatible environment where the software works as it should.

license

Authors of the software

Make a decision about a proper license and add (a) LICENSE file(s)

To make a clear statement of the terms and conditions that apply to using the product/software

As a developer, I want to be able to clearly state who is allowed to do what with my software

As a user, I want to know what I am allowed to do with this software (use it, build on it)

developmentStatus

Developers

Description of development status, e.g. Active, inactive, suspended. See repostatus.org

Inform the public whether a software project is live or outdated.

Important information for deciding whether I, as a researcher, want to use or build on this software for my research

As a user, I want to know whether I can use this software for my research.

As a developer, I want to know whether anyone is actively maintaining this software

embargoDate

Authors when publishing software via any sort of repository?

Date when the embargo is over

Some software might be restricted for a period of time. The users need to know when this period of restriction has ended.

OperatingSystem

User/service

Reinstall / reuse

The operating system should be described so the user or service can install or run the tool

An IT user wanting to compare the performance of the operating systems used to run a given piece of software

SoftwareRequirements

user/other software/dependencies

Install before reuse,

Or combine with other tools

To ensure all the prerequisites for the software are available before reuse

To inform about dependencies of the Resource

A user wanting to be sure that the software does not use a dependency which they cannot use in their own environment (incompatibilities)


Appendix D: Review grid of the RSAC components specifications

Table 5: Reviewing grid explanation

Section

Subsection

Suggested usage

The question to answer

Overview

Text from proposal

Objectives

Archive

Archive

Reference

Reference

Describe

Describe

Cite

Cite

Out of Scope

Identify elements in the proposal that are out of scope for this particular subcomponent, and identify other limitations due to resources or feasibility.

What are the limitations of the subcomponent in achieving the four pillar objectives: archive, reference, describe, and cite?

Requirements

User stories

As a ___, I can ___ so that ___

What is the user’s story? Why does this user want to achieve a particular goal?

User requirements

Identifying the needs, goals, and tasks directly from the SIRS report

What does the user need to achieve the goal and reach a happy ending?

Functional requirements

Identification of application requirements: server, database, etc.

What new or improved functionalities can the system / infrastructure provide to obtain the happy ending?

Non-functional requirements

What non-technical additions can the system / infrastructure provide to obtain the happy ending?

Specifications

Architectural design

Sequence diagram

What components are related and what is the workflow to obtain the objectives?

Functional specifications

A breakdown of the implemented function: its capabilities, appearance, and interactions with users, in detail for software developers

(equivalent to an issue / ticket / task that can be resolved with one PR / diff)

What will the subcomponent team implement as part of the infrastructure features/functionalities to address the identified requirements?

Service specifications

How do I test that the service is working properly?

Operational specifications

Integration with EOSC Core components

List the FC4E Core Components with links to each specification and identify the points of integration / interoperability

How does this subcomponent interact with each one of the FC4E Core Components?

External references

Implementation infrastructure

Relevant documentation on infrastructure website

Where can I get more information about the implementation’s current state?

Software Heritage

Relevant documentation on SWH

Where can I get more information about the SWH archive's current state?

OSPO-RADAR has received funding from the Sloan Foundation under Grant Agreement no. 2025-25188.


[1] https://en.wikipedia.org/wiki/Minimum_viable_product 

[2] https://archive.softwareheritage.org/save/ 

[3] https://docs.softwareheritage.org/user/deposit/index.html 

[4] https://archive.softwareheritage.org/api/1/origin/save/bulk/doc/ 

[5] https://archive.softwareheritage.org/api/1/origin/save/bulk/request/doc/ 

[6] https://archive.softwareheritage.org/api/1/origin/save/bulk/doc/

[a]This seems like the abstract for the project, not for the deliverable.


[b]it's also future tense rather than present or past

[c]in a specific context, right?

[d]In the EVERSE project we are developing a dashboard. Software quality pipelines will do assessments using quality indicators (metrics) and the results will be added to the dashboard. Software citation and other metadata will also be included in the framework. I see a big overlap here. Perhaps we should chat about this.

See:

- https://github.com/EVERSE-ResearchSoftware/DashVERSE  

- https://github.com/EVERSE-ResearchSoftware/QualityPipelines

[e]I see my comment was added anonymously :)

I am Faruk Diblen.

[f]should this have a bullet?

[g]I think "unambiguously referable" really means something like "The specific version of software I used in a paper has a stable DOI or other identifier."

[h]Can you elaborate more? What information is missing here?

[i]Findability is, I think, something we can measure. "Unambiguously referable" is not, I think, specific enough for us to know how to measure .... I'm trying therefore to intuit what the author meant, and then translate that into something we can measure. In my own case, referring to specific versions of specific software packages is what I need to attest to the full provenance for data in my research. I do this down to an enumeration of all the versions of all the external libraries as well. I think that is what we are trying to get at.

[j]Here we are separating reference from (citation) credit, so it is not about the extrinsic identifier (DOI) getting you into a metadata record, but a reference to the specific artifact, that can be a directory, a commit or a release and that we are sure that it is this specific thing, in our case using a hash.

[k]I don't see the connection between this and the earlier text in the subsection

[l]it might be worth saying something about what fraction of research organizations have OSPOs and if there is anything unique about this subset compared with the larger set of research organizations.

[m]maybe say that this goal "bridging ..." is one that is common to many universities, but they have different ways of doing this and are at different levels of success. OSPOs are one way, and are the focus of this work (assuming this is true. One question is if this project will benefit the organizations that don't have OSPOs and if that is an intended goal of the project)

[n]Could an “Engineer as Software User” be part of this persona or might it be added as additional persona? IMHO, any potential software user from outside the academic environment matters

[o]Can you explain how this persona is interacting with the Dashboard or with the Software Heritage archive? This persona matters in the ecosystem but it would be useful to understand what is the connection with this infrastructure

[p]A potential software user might discover the software through the Dashboard, and its information might be relevant to understand software authorship, maintenance, etc.

[q]How will they access the dashboard? It is an internal dashboard for an academic OSPO. With a visitor view to browse entities. Not sure this view is adapted to this user. Can you check workflow 10 and see if it is?

[r]We will capture all comments at the end of the review and see how to address globally. I'm asking here in case you have specific user in mind for whom this is the view they will use and explain in depth the use case.

[s]why "thus"?

[t]is a line needed below this in this column as well?

[u]It would be valuable for an OSPO administrator to be able to delegate the ability to update and validate entries for specific projects to project owners or department/lab administrators. This ability to assign edit rights that are limited to specific projects may be similar to this "Curator - contributor" role, but with a more limited scope.

[v]This may not only be an initial import, it could be an on-going workflow. If an OSPO maintains their own registry in parallel, there will need to be the ability to perform batch updates on a periodic basis to keep the data in the two systems synchronized.

[w]I assume this duplicate check will not allow multiple OSPOs to enter the same project URL. If an institution uses an open source library or tool but does not actively contribute to the code, will there be a way to capture the details of that location implementation so that it can be included as part of the institution's dashboard? How would this look if multiple institutions do this for the same open source project?

[x]Implementation detail - If this check relies on matching email domains, ensure multiple domains can be listed and allow OSPO admins to set these values.

[y]It would also be nice to be able to identify repos that do not have an identified license, or README, or other recommended project files.

[z]If these are to be used as an institutional resource, there will need to be the ability to set institutional branding, be adjustable to meet baseline institutional web resource guidelines (e.g. colors, logos, header/footer links) and support the use of a custom domain or subdomain.

[aa]Funders will ask for indicators of impact

[ab]n° of citations/mentions

[ac]This component is being used in.... (dependency metrics)

[ad]Consider if this will be a sticking point for institutions outside of the EU. If so, might there be partners willing to host in other jurisdictions?

[ae]what happens if university OSPOs remain rare or even decrease?