CompF07: Reinterpretation and Long-term Preservation of Data and Code

Conveners: Kyle Cranmer, Mike Hildreth, Matias Carrasco Kind

Functional (Focus) Areas:

  • Public data
    • (comes in many forms … HepData, public likelihoods, CERN OpenData, data for education/outreach)
    • Tools for generating annotated public data and software
    • Tools for sharing data and software
  • Not-yet-public data
    • Tools for generating annotated “private” data and software
  • Tools for combining results across experiments and frontiers
  • Tools for archiving and re-running analyses (RECAST/REANA, … )
    • Internal-to-experiment and external “public” preservation

  • Obvious overlap with all physics groups, as well as other computational areas
  • Will try to join/convene as many joint sessions as possible moving forward

Group Mandate, Activities, Questions:

  • Define the stakeholders and consumers of the data and software
    • What are the needs/requirements of the stakeholders?
      • (probably the most difficult question to answer)
  • What resources are needed?
    • e.g. long-term storage with external access, infrastructure for preserving executable code, etc.
    • metadata infrastructure
  • What technologies are or will be available, and how are these tools expected to evolve?
    • To be discussed in common with CompF5: End User Analysis:
      • version control
      • containers/VMs
      • proprietary software/licenses

Group Mandate, Activities, Questions (cont.):

  • How do/will the stakeholders use these technologies?
  • What are the workflows that are used to combine results across experiments and frontiers?
  • What tools are used/needed by the stakeholders to combine results across experiments and frontiers?
  • What will the technological evolution of these tools look like?
  • How are other science domains handling this topic?
  • What are other science domains using, and what is industry using?

Overall Goals:

  • Raise awareness/visibility of preservation issues across frontiers
  • Communicate current efforts/technologies to other groups/frontiers
  • Mediate incorporation of these concepts and objectives into all reports and guidelines (where appropriate)
  • Produce general guidelines (aspirations?) for preservation of scientific results

Parallel Sessions:

Today:

Tomorrow:

(times in EDT)