1 of 10

Code Review

Melanie Frazier & Julie Lowndes

rOpenSci Community Call

October 16, 2018

2 of 10

OUR SITUATION

Global Ocean Health Index calculated every year (since 2012)!

Work in R

Mostly write code to wrangle and analyze data

3 of 10

OUR SITUATION

We review our code every year (4-6 people)

We’ve had a chance to determine what works

4 of 10

IDEAL vs. REALITY

THE BEST: Have someone independently recreate the code

2nd BEST: External review of all code

BUT: Time is an obstacle!

What to do!?

5 of 10

1. MAKE IT EASY TO SHARE

  • GitHub repos: store all code and data
  • GitHub issues: reports, questions, reference materials, etc.
  • here package: standardize file paths (previously: ~/github/)
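A minimal sketch of the here workflow (the file name below is hypothetical):

```r
# here() builds paths from the project root (located via the .Rproj or .git
# file), so the same call works on every collaborator's machine.
library(here)

# Instead of a machine-specific path like "~/github/...":
path <- here("prep", "data.csv")  # "prep/data.csv" is a hypothetical file
```

Because the path is resolved at run time, scripts no longer break when the repository lives in a different location on someone else's computer.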

6 of 10

2. PROMOTE COMMON PRACTICES

  • Weekly team meetings: discuss coding practices (tidyverse, etc.)

  • Protocols for file organization
  • Put code in Rmds: flow between documentation & code

We prioritize documentation over code optimization (we are sometimes guilty of unnecessary loops, cut-and-paste, etc. … oh well!)
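A minimal sketch of the Rmd style we mean, where prose and code flow together (the section title, chunk name, and object names are hypothetical):

````markdown
## Fisheries data prep

We join the catch data to region names, then check for unmatched regions
(the result should have zero rows).

```{r join-catch}
combined <- left_join(catch, rgn, by = "rgn_id")
filter(combined, is.na(rgn_name))  # should be length 0
```
````

The rendered document doubles as the review artifact: a reader sees the intent, the code, and the check in one place.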

7 of 10

3. ERROR AVOIDANCE TRAINING

  • Break code into manageable chunks (no run-on dplyr chains)

  • Techniques to check data, especially at danger points (e.g., joins, NAs)
  • Write warnings/errors and/or document expectations when checking data (e.g., “# should be length 0”)
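A sketch of the kind of check we mean at a join, using toy data (assumes dplyr is installed; the column names are hypothetical):

```r
library(dplyr)

rgn   <- data.frame(rgn_id = 1:3, rgn_name = c("A", "B", "C"))
catch <- data.frame(rgn_id = c(1, 2, 2, 4), tonnes = c(10, 5, 7, 3))

combined <- left_join(catch, rgn, by = "rgn_id")

# Danger point: a left join silently introduces NAs for unmatched keys.
# Document the expectation and complain loudly if it fails:
missing_rgn <- filter(combined, is.na(rgn_name))  # should be length 0
if (nrow(missing_rgn) > 0) {
  warning("Unmatched regions after join: ",
          paste(unique(missing_rgn$rgn_id), collapse = ", "))
}
```

Here rgn_id 4 has no match, so the warning fires during the run instead of the NA surfacing weeks later in a final score.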

8 of 10

4. REVIEW AS A PROCESS

We post results (in near-real time as well as for final review) to a GitHub issue:

  • Do the results make sense?
  • Explore NAs
  • Explore outliers
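For instance, a quick NA/outlier summary like the following (toy data, base R only) is easy to paste into a GitHub issue for discussion:

```r
scores <- data.frame(
  region = c("A", "B", "C", "D"),
  score  = c(71, NA, 68, 142)  # toy values; 142 is a deliberate outlier
)

# Explore NAs
na_rows <- scores[is.na(scores$score), ]

# Explore outliers (here: anything outside the expected 0-100 score range)
out_rows <- scores[!is.na(scores$score) &
                     (scores$score < 0 | scores$score > 100), ]

summary(scores$score)
```

Posting the flagged rows rather than the whole table keeps the review focused on what actually needs a human judgment call.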

9 of 10

5. STRATEGIC CODE REVIEW

  • Restart R session and rerun code
    • Make sure it still runs!
    • Review warnings
    • Compare results with previous runs (git2r to pull data from previous commits)
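The comparison step can be as simple as joining this year's scores to last year's and flagging large changes. A base-R sketch with toy data (in practice, the old data frame would be read from a previous commit via git2r; the 10-point threshold is an arbitrary example):

```r
old <- data.frame(region = c("A", "B", "C"), score = c(70, 55, 80))
new <- data.frame(region = c("A", "B", "C"), score = c(71, 40, 80))

compare <- merge(old, new, by = "region", suffixes = c("_old", "_new"))
compare$change <- compare$score_new - compare$score_old

# Flag regions whose score moved more than 10 points between runs
flagged <- compare[abs(compare$change) > 10, ]
flagged
```

Large swings are not necessarily errors, but each flagged region should be explainable (new source data, a methods change) before the results are accepted.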

  • External review of some code (~ 50% for us)
    • Complex analyses
    • New team members

10 of 10

LAST THINGS

Work to create an environment where mistakes are expected and OK (Parker 2017)!

A starting point for additional resources:

https://rawgit.com/OHI-Science/ohiprep_v2018/master/Reference/CodeReview/code_review.html