2 of 27

Core idea

Putting code or models online is only the first step

Open Science happens when they are versioned, documented, licensed, and easy to cite or reuse.

These are also key steps toward making them FAIR:

Findable Accessible Interoperable Reusable

https://www.reddit.com/r/ProgrammerHumor/comments/122hv4n/when_you_are_tired_of_explaining/

3 of 27

Session outline

60 minutes

Learning outcomes

Why open code and models matter

Concepts: code, models, licensing, citation

Hands-on 1: Git basics

Hands-on 2: README and reusable pipelines

Hands-on 3: model documentation

Wrap-up checklist

4 of 27

Hands-on work

Three practical skills to take away

1. Version control (Git): Create a small version-controlled research code folder.

2. Document: Write a README that helps others understand and run code.

3. Share responsibly: Identify the minimum documentation needed for model reuse.

Focus today: practical minimums, not perfect repositories. Goal: make outputs easier to inspect, cite, and reuse.

5 of 27

“Online” is not the same as “reusable”

Difference between visibility and practical openness

Visible, but hard to reuse

Files uploaded somewhere
No README
No license
Unclear version
Missing dependencies
No citation or contact info

Reusable in practice

Versioned repository
README
License
Run instructions
Release or archive
Limitations documented

6 of 27

This reduces friction for others — and for your future self


7 of 27

What Open Science asks us to do here

Open as possible, closed as necessary

Aim for openness that supports reuse

Make outputs understandable
Make them easier to inspect
Enable reuse where possible
Document what others need to know

Why something may not be fully shareable

Personal or sensitive data
Copyrighted training material
API keys or credentials
Proprietary dependencies
Partner or contract restrictions

Even when full release is impossible, you can often still share metadata, documentation, synthetic examples, or access conditions.

8 of 27

What is code?

Code is a set of instructions that tells a computer how to perform a task (e.g. to run and build a model).

9 of 27

What is a model?

A model is a simplified representation of part of the world, built so we can describe it, explain it, predict outcomes, or simulate processes.

Examples across research

Statistical models
Climate or epidemiological models
Conceptual models in theory-building
Machine learning models

input → model → output

10 of 27

Code and models

Code builds and runs models

Models represent patterns or systems

Code → needed to reproduce how the model was created
Model → needed to reuse or apply it

Sharing only one is often not enough for open science

11 of 27

The minimum shareable code package

project/
├─ README.md
├─ LICENSE
├─ analysis.py
├─ requirements.txt
├─ results/
├─ .gitignore
└─ data/
   └─ README.txt

Why each file matters

README.md: explains what the project is and how to use it
LICENSE: states the conditions for reuse
analysis.py: contains the example code
requirements.txt: lists the software dependencies needed to run the code
results/: holds the outputs the code writes
.gitignore: lists files or folders that Git should not track
data/README.txt: explains the data folder, even if the data cannot be shared
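The layout above can be created by hand, but a short script makes it repeatable. The following is a minimal sketch, not a standard tool: the function name create_project and all placeholder file contents are assumptions made for illustration.

```python
from pathlib import Path

# Placeholder contents; every string here is illustrative, not prescribed by the slides.
SKELETON = {
    "README.md": "# Project name\n\n## Short description\n\n## How to run\n",
    "LICENSE": "Add the full text of your chosen license here.\n",
    "analysis.py": "# Example analysis code goes here.\n",
    "requirements.txt": "# One dependency per line, e.g. pandas\n",
    ".gitignore": "results/\n",
    "data/README.txt": "Describe the data here, even if it cannot be shared.\n",
}


def create_project(root):
    """Create the minimum package layout under root without overwriting existing files."""
    root = Path(root)
    (root / "results").mkdir(parents=True, exist_ok=True)
    created = []
    for relative_path, content in SKELETON.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            # Only write files that are missing, so real work is never replaced.
            target.write_text(content, encoding="utf-8")
            created.append(relative_path)
    return created
```

Calling create_project("project") returns the list of files it wrote. Running it a second time returns an empty list, because existing files are left untouched.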

12 of 27

Sharing and citing code: repository vs archive

A paper needs a stable version, not only the latest state

Development platform

GitHub or GitLab

Note: a repository can be deleted at any time by its owner!

Active work
Latest version
Collaboration and issue tracking

Archive / citation layer

Zenodo
Release a specific version
Archive it, ensuring long-term access
Mint a DOI
Cite the exact version used in the paper

“A repository is where the project grows. An archive is where a version becomes part of the scholarly record.”
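Citing the exact version means the citation itself should carry a version number and the DOI minted for that release. The sketch below shows one possible way to assemble such a citation string. The Release class, its field names, and the citation layout are assumptions for illustration, not a required citation style.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """Metadata for one archived release; all values are supplied by the user."""
    authors: str
    title: str
    version: str
    year: int
    doi: str  # the DOI minted by the archive for this exact version


def format_citation(release):
    """Return a plain-text citation that names the exact version and its DOI."""
    return (
        f"{release.authors} ({release.year}). {release.title} "
        f"(version {release.version}) [Software]. "
        f"https://doi.org/{release.doi}"
    )
```

The DOI has to come from the archive after the release is deposited. It cannot be made up in advance.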

13 of 27

Licensing: visible is not the same as reusable

Keep the legal message simple and practical

No license

People can see the code, but reuse is unclear.

Common families to mention

MIT – very permissive: you can use, modify, and share the code freely, as long as you include the original copyright and license notice.

GPL – copyleft: you can use and modify the code, but if you distribute your version, it must be released under the same license.

Apache 2.0 – permissive: like MIT, but with additional legal clarity, especially regarding patents.

14 of 27

Git and version control

15 of 27

Git and open science - making research more transparent and reusable

Transparency

See what changed, when, and why

Reproducibility

Return to the exact version used in a study - avoid “final_final_v3” confusion

Collaboration

Work together without overwriting each other and track contributions clearly

Sharing

Structure your code for others and connect to platforms (GitHub, Zenodo)

Your future self

Understand your own work later!

16 of 27


17 of 27

Git and GitHub: what is the difference?

Git = version control tool. GitHub = online hosting platform.

Git

A tool for tracking changes in files

Works best with text-based files (e.g. R or Python scripts, SPSS syntax files)

Works on your own computer
Allows you to save versions through commits

GitHub

A web platform for hosting Git repositories online
Used for sharing, collaboration, and visibility
Often used to publish code publicly
Builds on Git, but is not the same thing

You can use Git without GitHub, but GitHub is built around Git.

18 of 27

Hands-on 1: Git basics

Around 15–20 minutes

What to remember

A repository starts in an ordinary folder
A commit is a named snapshot
Short commit messages explain the change
Versioning begins with a few simple commands
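The points above correspond to three Git commands: init, add, and commit. The sketch below drives them from Python so the sequence is visible in one place. It assumes Git is installed and on the PATH. The helper names git and first_commit, and the placeholder author identity, are assumptions for illustration.

```python
import subprocess
from pathlib import Path


def git(folder, *args):
    """Run one git command inside folder and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=folder, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def first_commit(folder, message):
    """Turn an ordinary folder into a repository and record one snapshot of it."""
    folder = Path(folder)
    git(folder, "init")      # the folder becomes a repository
    git(folder, "add", ".")  # stage every file currently in the folder
    # The identity below is a placeholder so the sketch runs without global Git config.
    git(
        folder,
        "-c", "user.name=Example Researcher",
        "-c", "user.email=researcher@example.org",
        "commit", "-m", message,
    )
    # One line per commit: abbreviated hash followed by the commit message.
    return git(folder, "log", "--oneline")
```

In everyday use the same three commands are typed directly in a terminal. The folder needs at least one file before the first commit, otherwise Git reports that there is nothing to commit.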

19 of 27

Link to exercises

http://bit.ly/4dnQYZq

20 of 27

Hands-on 2: Writing a README

If someone reads only one file, it will probably be this one

Example README structure

# Project name

## Short description

## Files included

## How to run

## Dependencies

## Data

## Citation / contact

## License

A good README answers basic questions

What is this project?
What does it do?
How do I run it?
What depends on what?
Is the data included, restricted, synthetic, or available elsewhere?
How should I cite it?
What license applies?
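Whether a README covers these questions can be roughly checked by looking for the section headings from the example structure. The sketch below is one possible check, assuming the README uses "## " headings with those names. The function name missing_sections is an assumption for illustration.

```python
import re

# Section headings taken from the example README structure; adjust to your own template.
EXPECTED_SECTIONS = [
    "Short description",
    "Files included",
    "How to run",
    "Dependencies",
    "Data",
    "Citation / contact",
    "License",
]


def missing_sections(readme_text, expected=EXPECTED_SECTIONS):
    """Return the expected section headings that do not appear as '## ' headings."""
    found = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^##[ \t]+(.+)$", readme_text, flags=re.MULTILINE)
    }
    return [name for name in expected if name.lower() not in found]
```

A heading being present does not mean the section is useful. The check only catches sections that were forgotten entirely.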

21 of 27

Hands-on 2: Example README

Fictional project: Campus Coffee Consumption Analysis

Project Name

Campus Coffee Consumption Analysis

Short description

Analyzes coffee habits from a small campus survey.

Files included

analysis.py, data/README.txt, results/ (generated outputs, ignored).

How to run

Install the dependencies with pip install -r requirements.txt, then run: python analysis.py. The data is described in data/README.txt.

Dependencies

Python 3.10, pandas, matplotlib.

Citation / contact / license

Your Name, 2026. Contact: email@university.se. MIT License.
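The Dependencies entry in the example names its packages without version numbers. Recording the versions actually installed makes a rerun more likely to behave the same way. The sketch below uses the standard library module importlib.metadata to build pinned lines for requirements.txt. The function name pinned_requirements is an assumption for illustration.

```python
from importlib import metadata


def pinned_requirements(packages):
    """Return requirements.txt lines pinning each package to its installed version."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Leave a visible marker rather than guessing a version.
            lines.append(f"# {name} is not installed in this environment")
    return lines
```

For the example project, pinned_requirements(["pandas", "matplotlib"]) would give one line per package, with whatever versions are installed on the machine where it runs.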

22 of 27

Link to exercises

http://bit.ly/4dnQYZq

23 of 27

Models are not just files: they need documentation

Purpose

What is this model for? Who is it meant to help?

Data and evaluation

What data was used?

How was the model evaluated?

Limitations

Where can it fail? What is out of scope?

License and use

What may others do with it, and how do they run it?

There are many domain-specific frameworks (such as DOME for ML), but they all focus on the same core questions: purpose, data, evaluation, and limitations.

24 of 27

Hands-on 3: Documenting models

# Model name

## Version

## Purpose

## Intended use

## Out-of-scope use

## Training/source data

## Evaluation

## Limitations

## License

## How to use
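The template above can also be kept as structured data and rendered to text, so that an empty section is caught before the model is shared. The sketch below is one possible way to do this. The ModelCard class, its field names, and the render function are assumptions for illustration, not part of any model card standard.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelCard:
    """One field per heading in the documentation template."""
    name: str
    version: str
    purpose: str
    intended_use: str
    out_of_scope_use: str
    training_source_data: str
    evaluation: str
    limitations: str
    license: str
    how_to_use: str


# Headings that cannot be derived from the field name by simple capitalisation.
HEADING_OVERRIDES = {
    "out_of_scope_use": "Out-of-scope use",
    "training_source_data": "Training/source data",
}


def render(card):
    """Render the card as Markdown and refuse to produce a card with blank sections."""
    parts = [f"# {card.name}"]
    for field in fields(card):
        if field.name == "name":
            continue
        value = getattr(card, field.name).strip()
        if not value:
            raise ValueError(f"Section '{field.name}' is empty")
        heading = HEADING_OVERRIDES.get(
            field.name, field.name.replace("_", " ").capitalize()
        )
        parts.append(f"## {heading}\n\n{value}")
    return "\n\n".join(parts) + "\n"
```

Raising an error on a blank section is a deliberate choice here: a missing Limitations entry is easier to fix before release than after.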

Around 10 minutes

Before sharing code or a model, ask:

Is it versioned?
Is it licensed?
Is it documented?
Is it runnable or at least understandable?
Have sensitive files, credentials, and restricted data been excluded?
Is there a stable citation point?
Are the limitations clear?

The documentation template above is essentially a simplified version of what machine learning calls a ‘model card’.

25 of 27

Hands-on 3: Example Model Documentation

Fictional model: Coffee Consumption Classifier

Version

v1.0 — the documented model release.

Purpose

Predict high vs low coffee use from survey answers.

Intended use

Teaching and research demonstration of a simple classifier.

Out-of-scope use

Not for health advice, profiling, or decisions about individuals.

Training/source data

Campus survey of staff and students (n=200); no identifiers included.

Evaluation

Tested on held-out survey responses; accuracy around 78%.

Limitations

Small dataset, one university context, limited generalizability.

License

MIT License for example code and documentation.

How to use

Run the prediction function in analysis.py with new survey data.

26 of 27

Link to exercises

http://bit.ly/4dnQYZq

27 of 27

Take-home checklist

Versioned → track changes (Git)

Documented → README + model description

Licensed → clear reuse conditions

Runnable → dependencies + simple instructions

Citable → release or DOI (e.g. Zenodo)

Responsible → no sensitive or restricted data
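Some items on this checklist can be checked automatically by looking at which files exist in the project folder. The sketch below is a rough pre-sharing check under stated assumptions: the file names follow the minimum package shown earlier, and the list of sensitive file names is a short illustrative sample. The function name check_project is an assumption. "Citable" is left out because a release or DOI cannot be verified from local files.

```python
from pathlib import Path

# File names that often hold credentials; this list is illustrative, not exhaustive.
SENSITIVE_NAMES = {".env", "credentials.json", "id_rsa"}


def check_project(root):
    """Return a dict mapping checklist items to True or False for the folder root."""
    root = Path(root)
    relative = [p.relative_to(root) for p in root.rglob("*") if p.is_file()]
    # Ignore Git's own internal files when looking for sensitive names.
    candidates = [p for p in relative if ".git" not in p.parts]
    return {
        "versioned": (root / ".git").is_dir(),
        "documented": (root / "README.md").is_file(),
        "licensed": (root / "LICENSE").is_file(),
        "runnable": (root / "requirements.txt").is_file(),
        "responsible": not any(p.name in SENSITIVE_NAMES for p in candidates),
    }
```

A passing result only shows that the expected files are present. Whether the README is clear, the license is appropriate, and the data is safe to share still needs a human decision.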

Open Science is not about sharing everything — it’s about sharing enough for others to understand, run, and reuse your work.