1 of 47

FAIR Forward 2025

Workshop #4:

Basics of Version Control using Git

September 24, 2025 | biodatasage.com/fair

2 of 47

Where are you zooming in from?

Drop your city/state in the chat!

3 of 47

This is part 4 of a series!

This workshop and hackathon series is supported by an Open Research Community Accelerator grant.

4 of 47

What is FAIR Forward?

  • A hands-on experience that engages participants in analyzing open-source scientific and healthcare data
  • Addresses real-world problems
  • Participants learn how to apply the FAIR principles (Findable, Accessible, Interoperable, and Reusable) to ensure their research outputs contribute to transparent, reproducible science
  • We're prioritizing BIPOC students from HBCUs, HSIs, TCUs, and MSIs in our first recruitment wave, followed by working professionals from BIPOC backgrounds.

5 Preparatory Workshops

Virtual Hackathon

Open-Source Data Science Learning Materials

5 of 47

Today’s focus

What is Git?

Why use Git?

Git vs GitHub

Basic Git Flow + Walkthrough

GitHub Desktop

6 of 47

Disaster strikes when:

Your computer crashes

You accidentally close a script without saving

You want to experiment with your code

You need to collaborate on a group project

7 of 47

Version control to the rescue!

Think Google Docs but for code!

  • Every change made to the code base is tracked
  • Saves snapshots of your code
  • Lets you easily share code with the team you are working with
  • Allows you to collaborate on data projects and know who contributed to the changes made
  • Gives you access to any previous version of your document

8 of 47

Quick example: YRBS data

In our last workshop, we explored data manipulation with R using YRBS data

9 of 47

Previous scripts

What if your script breaks here and you want to go back to step 2 of your analysis?

10 of 47

File Name Chaos

What if you ran additional analyses and updated your script?

11 of 47

With Git

Every version you commit is saved in the project history!
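On the command line, going back to an earlier version might look like the sketch below. The file name analysis.R, its contents, and the commit messages are made up for illustration, and the repository lives in a throwaway temporary folder.

```shell
set -e
# Work in a throwaway folder so nothing on your computer is touched
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, stored for this repository only
git config user.name "Jane Doe"
git config user.email "jane@example.com"

# Save three versions of a (hypothetical) script, one commit each
echo '# step 1: load data' > analysis.R
git add analysis.R
git commit -q -m "Step 1: load data"

echo '# step 2: filter data' >> analysis.R
git commit -q -am "Step 2: filter data"

echo '# step 3: experiment that breaks the script' >> analysis.R
git commit -q -am "Step 3: try an experiment"

# List the saved versions, newest first
git --no-pager log --oneline

# Bring back the file as it was one commit ago (the step 2 version)
git checkout HEAD~1 -- analysis.R
git commit -q -m "Return analysis.R to the step 2 version"
cat analysis.R
```

Nothing is lost by going back: the step 3 version stays in the history and can be restored the same way.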

12 of 47

Git vs GitHub

Git (local)

  • Lives on your computer
  • Tracks your changes
  • Works offline
  • Personal backup

GitHub (cloud)

  • Lives on the internet
  • Stores your Git repositories
  • Share with others
  • Cloud backup

https://github.com

13 of 47

Basic Git Flow

Repository/Repo

a set of files

A folder for your programming project that can contain:

  1. R scripts (or scripts in any other programming language)
  2. Any datasets used (e.g. CSV files, Excel files)
  3. Any outputs generated during your analysis
  4. Any other documents related to your project
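As a rough illustration, any project folder can be turned into a repository with git init. The folder and file names below are hypothetical, and everything is created in a temporary directory.

```shell
set -e
# Hypothetical project folder inside a throwaway temporary directory
project="$(mktemp -d)/yrbs-analysis"
mkdir -p "$project/data"
cd "$project"

# A script and a dataset, standing in for real project files
echo '# analysis steps go here' > analysis.R
echo 'id,age' > data/yrbs_sample.csv

# Turn the folder into a repository; Git keeps its history in the hidden .git folder
git init -q

# Nothing is tracked yet, so both items are listed as untracked (??)
git status --short
```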

14 of 47

Basic Git Flow

Diff = add/save

  • Add files to repo
  • Save any new changes made to files
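In RStudio this step happens in the Git pane. A rough command-line equivalent, sketched here with hypothetical file names, is to review changes with git diff and then stage them with git add.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Jane Doe"            # placeholder identity
git config user.email "jane@example.com"

# Starting point: one script already saved in the repo
echo 'x <- 1' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Edit the script and create a new file
echo 'y <- 2' >> analysis.R
echo 'id,age' > filtered.csv

# Review the line-by-line changes to files Git already tracks
git --no-pager diff

# Add the new file and the edited script to the next snapshot
git add analysis.R filtered.csv

# Summarize what is staged and waiting to be committed
git --no-pager diff --cached --stat
```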

15 of 47

Basic Git Flow

Commit = describe the change

  • Helps convey why the changes were made
  • Confirm the changes you want to make
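A minimal command-line sketch of a commit follows. The file export.R and the message text are invented for the example; the point is that the message says why the change was made.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Jane Doe"            # placeholder identity
git config user.email "jane@example.com"

# Stage a new (hypothetical) script
echo 'write.csv(filtered, "filtered.csv")' > export.R
git add export.R

# Commit: confirm the staged change and describe why it was made
git commit -q -m "Export filtered dataset to CSV for sharing"

# Show the newest commit: short ID followed by the message
git --no-pager log -1 --format='%h %s'

# Prints nothing, because every change is now committed
git status --short
```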

16 of 47

Basic Git Flow

Push = send and store to the cloud

  • Saves the new files on the cloud (GitHub)
  • File changes are now documented
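The push step can be sketched without a GitHub account by using a local bare repository as a stand-in for the cloud copy. That stand-in and the file name are assumptions of this example; with GitHub, the remote would be the repository's URL.

```shell
set -e
work="$(mktemp -d)"
# Local stand-in for an empty GitHub repository
git init -q --bare "$work/cloud.git"

# The repo on your computer, with one commit
git init -q "$work/laptop"
cd "$work/laptop"
git config user.name "Jane Doe"            # placeholder identity
git config user.email "jane@example.com"
echo 'x <- 1' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Tell the local repo where the cloud copy lives, then send the commit there
git remote add origin "$work/cloud.git"
git push -q -u origin HEAD

# The remote now lists the same commit ID as the local repo
git ls-remote origin
git rev-parse HEAD
```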

17 of 47

Basic Git Flow

Pull = refresh your files from the cloud

18 of 47

How do we do all of this in RStudio?

19 of 47

[Screenshot: RStudio window with labeled panes: write notes here; write code here; results here; objects (data) here; written files here]

20 of 47

Git Pane

21 of 47

Diff = add/save

  • Add files to repo
  • Save any new changes made to files

22 of 47

Diff Example

Say we want to export the filtered dataset from our last workshop to a CSV file:

23 of 47

The changed R notebook file and the exported CSV now show up in the Git pane
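On the command line, git status gives a similar view to the Git pane. In this sketch the file names analysis.Rmd and filtered_data.csv are hypothetical stand-ins for the notebook and the exported CSV.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Jane Doe"            # placeholder identity
git config user.email "jane@example.com"

# The notebook as it was at the end of the last workshop
echo 'filtered <- subset(yrbs, age >= 15)' > analysis.Rmd
git add analysis.Rmd
git commit -q -m "Add notebook from the last workshop"

# Add an export step to the notebook, then create the exported file
echo 'write.csv(filtered, "filtered_data.csv")' >> analysis.Rmd
echo 'id,age' > filtered_data.csv

# Short status: " M" marks a modified tracked file, "??" a new untracked file
git status --short
```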

25 of 47

Commit = describe the change

  • Helps convey why the changes were made
  • Confirm the changes you want to make

27 of 47

Commit message

28 of 47

Storing files in the cloud

Push = send and store to the cloud

  • Saves the new files on the cloud (GitHub)
  • File changes are now documented

29 of 47

Refreshing your files from the cloud

Pull

Pulling lets you:

  • Access the latest updates to your files
  • Pick up your project where you left off, from any computer
  • Work with others and bring in any changes your teammates make
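The collaboration case can be sketched with two local copies of one project. The bare repository stands in for GitHub, and the teammate, folder names, and files are all hypothetical.

```shell
set -e
work="$(mktemp -d)"
# Local stand-in for the GitHub copy of the project
git init -q --bare "$work/cloud.git"

# Your computer: create the project and push it
git init -q "$work/laptop"
cd "$work/laptop"
git config user.name "Jane Doe"            # placeholder identity
git config user.email "jane@example.com"
echo 'x <- 1' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
git remote add origin "$work/cloud.git"
git push -q -u origin HEAD

# A teammate clones the project, adds a file, and pushes it
git clone -q "$work/cloud.git" "$work/teammate"
cd "$work/teammate"
git config user.name "Sam Lee"             # placeholder identity
git config user.email "sam@example.com"
echo 'Plot ideas' > notes.md
git add notes.md
git commit -q -m "Add plotting notes"
git push -q

# Back on your computer: pull to bring in the teammate's change
cd "$work/laptop"
git pull -q --ff-only
ls
```

The --ff-only option is a cautious choice for this sketch: the pull succeeds only when your copy has no commits of its own that the cloud copy lacks.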

30 of 47

History in Git

History shows you what changes you’ve made to your project

31 of 47

History in Git

  • You can also check who made each change and when it was made

32 of 47

History in Git

  • You can also see which files were changed, added, or deleted
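The same history is available from the command line through git log. This sketch builds a small made-up history (file names and messages are hypothetical) and then prints who changed what, and when.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Jane Doe"            # placeholder identity
git config user.email "jane@example.com"

# Commit 1: add a script
echo 'x <- 1' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Commit 2: add a dataset and edit the script
echo 'id,age' > data.csv
echo 'y <- 2' >> analysis.R
git add analysis.R data.csv
git commit -q -m "Add dataset and second step"

# Commit 3: delete the dataset
git rm -q data.csv
git commit -q -m "Remove dataset from the repo"

# Who made each change and when: short ID | author | date | message
git --no-pager log --date=short --format='%h | %an | %ad | %s'

# Which files each commit added (A), modified (M), or deleted (D)
git --no-pager log --name-status --format='%n%s'
```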

33 of 47

Recap: Git Flow

Diff = add/save

  • Add files to repo
  • Save any new changes made to files

Commit = describe the change

  • Helps convey why the changes were made
  • Confirm the changes you want to make

Push = send and store to the cloud

  • Saves the new files on the cloud (GitHub)
  • File changes are now documented

Repository/Repo

a set of files

Pull

34 of 47

Uploading your project to GitHub

35 of 47

Register a GitHub account

Simple Rules:

  • Use your real name - like "sarah_johnson" or "mikec"
  • Keep it short - easier to type and remember
  • Make it unique - so no one else has it
  • Keep it professional - something you'd be okay with teachers or future bosses seeing
  • Don't use temporary stuff - avoid your school name or current city
  • Reuse from other places - if you have Twitter or other accounts, use the same name

Remember: You'll use this username for years, so pick something you'll still like later!

36 of 47

Demo: Create a GitHub Repo & Clone Repo in RStudio
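The demo uses the GitHub website and RStudio. For reference, a command-line sketch of the clone step is shown here, with a local bare repository standing in for the GitHub repo; the folder names and README contents are assumptions of this example.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for a repo created on GitHub: a local bare repository with one commit
git init -q "$work/seed"
cd "$work/seed"
git config user.name "Jane Doe"            # placeholder identity
git config user.email "jane@example.com"
echo '# My project' > README.md
git add README.md
git commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$work/cloud.git"

# Clone: download a full copy, including history, into a new folder.
# With GitHub, the first argument would be the repository's URL instead.
git clone -q "$work/cloud.git" "$work/my-project"
cd "$work/my-project"

# The clone remembers where it came from under the name "origin"
git remote -v
git --no-pager log --oneline
```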

37 of 47

Introduce yourself to Git

Two ways:

1. Commands from the terminal

git config --global user.name "Jane Doe"

git config --global user.email "jane@example.com"

git config --global --list

*Substitute your own name and the email associated with your GitHub account

38 of 47

Introduce yourself to Git

2. Using the usethis R package

install.packages("usethis")

library(usethis)

use_git_config(user.name = "Jane Doe", user.email = "jane@example.com")

*Substitute your own name and the email associated with your GitHub account

39 of 47

Demo: Token generation & Save > Commit > Push

40 of 47

GitHub Desktop

  • Gives you a simple interface to replicate the Git flow without commands

42 of 47

GitHub Desktop

For tutorials on how to use GitHub Desktop: https://docs.github.com/en/desktop

43 of 47

Additional Resources

  1. Happy Git with R: https://happygitwithr.com/
  2. GitHub’s Git tutorial: https://docs.github.com/en/get-started/git-basics/set-up-git

44 of 47

Q&A

45 of 47

Let’s continue the conversation!

Join our Slack channel:

bit.ly/fairslack

46 of 47

Coming up next…

Workshop 5

Introduction to Data Visualization in R

October 10th 4pm EST

Register now!

47 of 47

Thank you for attending!
