Project-based workflows with GitHub
Courtney Robichaud and Emma Hudgins
@cdrobich @emmajhudgins
You walk away confident in using Git/GitHub for version control with your (R-based) projects
What are your concerns about using or learning Git/GitHub?
I don’t know how to use it in my research
The learning curve/difficulty for new users
Little coding experience
Never got the flow of it
What we will cover:
The power of projects, Git and GitHub
Your current organization
Could look something like this
Your ideal organization
R project
Commit often
Push to GitHub
R Projects
What is a “project” and why is it better?
devmountain.com
What else can GitHub do?
GitHub basics
Your moves:
Repo(sitory) - one or more folders under Git version control; GitHub repos are stored in the cloud
Push - send changes to the cloud
Pull - get changes from the cloud
Commit - create a named version of a set of one or more changes to the repo
Clone - copy an existing repo into your local github folder such that it communicates with the original repo
Fork - copy an existing repo into your own GitHub account; your copy develops independently, and changes reach the original repo only through a pull request
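Most of these moves map onto single Git commands. The following is a minimal sketch, assuming Git 2.23 or newer is installed. A local bare repository stands in for the GitHub copy, and the folder names and identity are placeholders. Forking happens on the GitHub website, so it is not shown.

```shell
# Hypothetical setup: a local bare repo plays the role of the GitHub repo.
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"

# Clone: copy the repo locally; the copy stays linked to the original ("origin").
git clone --quiet "$work/remote.git" "$work/my-project"
cd "$work/my-project"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

# Commit: record a named version of one or more changes.
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"

# Push: send the new commit to the remote.
git push --quiet origin HEAD

# Pull: fetch and merge anything that changed on the remote in the meantime.
git pull --quiet origin "$(git branch --show-current)"
```

With a real GitHub repo, the only difference is that the clone address is the URL shown under the green button on the repo page.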
OR click the green button in the left pane
github.com
Structure of a repo
Check/change your settings in R:
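The same settings can also be checked from a terminal. This is a sketch under assumptions: the name and email are placeholders, and the values are set for one scratch repo only. Adding `--global` to the `git config` calls would apply them to every repo on the machine. From R, `usethis::use_git_config()` sets the same two fields.

```shell
# Scratch repo so the example does not touch your real configuration.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Set the identity that will be attached to your commits (placeholders).
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"

# Check what Git will use.
git config --get user.name
git config --get user.email
```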
GO TO GITHUB
.gitignore
Choose a template based on your main programming language (the R template ignores files like .Rhistory)
Some examples of files you probably want to ignore:
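As an illustration, a small .gitignore for an R project might look like the one written below. The entries are assumptions about a typical setup, not a required list; adjust them to your own project.

```shell
# Scratch repo for the demonstration.
project="$(mktemp -d)"
git init --quiet "$project"
cd "$project"

# Write an example .gitignore (entries are illustrative).
cat > .gitignore <<'EOF'
# R session files
.Rhistory
.RData
.Rproj.user/
# Operating system clutter
.DS_Store
EOF

# Create one ignored file and one normal file.
touch .Rhistory analysis.R

# .gitignore and analysis.R are listed as untracked; .Rhistory is not listed.
git status --short

# Ask Git which rule causes a file to be ignored.
git check-ignore -v .Rhistory
```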
Choosing the best license
choosealicense.com
Ideal folder structure
Raw Data
Metadata includes date of download or collection, original source and re-use info
(Derived) Data
Data you transformed after downloading/collecting, e.g. merging 2 databases
Scripts
Code (can separate by language)
Output
Figures, tables, results
Every folder should contain a README!
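The folder layout above can be created in one go. This is a minimal sketch; the folder names (`raw_data`, `derived_data`, `scripts`, `output`) and the project location are assumptions chosen for illustration.

```shell
# Hypothetical project root in a temporary location.
project="$(mktemp -d)/my-project"

# One folder per category, each with a stub README to fill in.
for dir in raw_data derived_data scripts output; do
  mkdir -p "$project/$dir"
  printf '# %s\n\nDescribe the contents of this folder here.\n' "$dir" \
    > "$project/$dir/README.md"
done

# Top-level README for the project as a whole.
printf '# my-project\n' > "$project/README.md"

# List every README that was created.
find "$project" -name README.md | sort
```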
Readme/Metadata best practices
File naming
Clean coding
Be proactive
More advanced GitHub
More advanced functionality
Branch - an independent line of version history within a repo. Every repo has a default branch (usually ‘main’); additional branches are used to suggest changes or test out new ideas that may not work.
Pull request - a proposed set of commits (created in another branch or from a fork) that a maintainer of the target repo reviews and approves before it is merged
Pull often, commit after each change
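A branch workflow can be sketched locally as follows, assuming Git 2.23 or newer. The branch name, file, and identity are placeholders. A pull request is a GitHub feature, so the final local merge only stands in for the step where the proposed change is approved.

```shell
# Scratch repo with one commit on the default branch.
work="$(mktemp -d)"
git init --quiet "$work/repo"
cd "$work/repo"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"
main_branch="$(git branch --show-current)"   # 'main' or 'master', depending on setup

# Create a branch to try out an idea without touching the default branch.
git switch --quiet -c try-new-value
echo "x <- 2" > analysis.R
git commit --quiet -am "Try a different starting value"

# Back on the default branch, the file still has its original content.
git switch --quiet "$main_branch"
cat analysis.R

# Bring the change in (on GitHub this happens when a pull request is merged).
git merge --quiet try-new-value
cat analysis.R
```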
Revert changes
Easier pre-commit, but possible post-commit too.
Pre-commit:
In RStudio’s Git pane, right-click a file and select ‘Revert’
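On the command line, both cases look roughly like this. It is a sketch assuming Git 2.23 or newer; the file name and identity are placeholders. `git restore` discards uncommitted edits, while `git revert` undoes a commit by adding a new commit, so the history stays intact.

```shell
# Scratch repo with one committed file.
work="$(mktemp -d)"
git init --quiet "$work/repo"
cd "$work/repo"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "version 1" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# Pre-commit: throw away edits that have not been committed yet.
echo "unwanted edit" >> notes.txt
git restore notes.txt

# Post-commit: undo the most recent commit with a new, opposite commit.
echo "version 2" > notes.txt
git commit --quiet -am "Change notes"
git revert --no-edit HEAD

# The file is back to its first committed state.
cat notes.txt
```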
Releases, Zenodo & DOI creation
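A GitHub release is built on a Git tag. The sketch below creates and pushes a tag, with a local bare repository standing in for GitHub; the version number, file, and identity are placeholders. The release itself is then created from the tag on the GitHub website, and if the repo has been enabled in Zenodo, Zenodo archives each new release and assigns it a DOI.

```shell
# Hypothetical setup: a local bare repo plays the role of the GitHub repo.
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"
git clone --quiet "$work/remote.git" "$work/paper-code"
cd "$work/paper-code"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

# Commit the state of the project you want to archive.
echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Analysis as submitted"
git push --quiet origin HEAD

# Mark that exact commit with an annotated tag and push the tag.
git tag -a v1.0.0 -m "Version used for the submitted manuscript"
git push --quiet origin v1.0.0

# List the tags known locally.
git tag --list
```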
OpenRefine
https://openrefine.org/