1 of 25

Working with Open-source CV Codebases

CV4Ecology Summer School

Sara Beery and Surya Hari | August 17, 2023

2 of 25

When should you adapt open-source code?

3 of 25

CV/ML frameworks (e.g., PyTorch) have built-in options for simple models

4 of 25

What is built into PyTorch?

7 of 25

What is built into PyTorch?

Not very many options!
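One way to check what ships with the framework is to ask it directly. The sketch below assumes torchvision 0.14 or newer, where `torchvision.models.list_models` is available. The helper name `list_builtin_models` is made up for this example, and it returns an empty list when torchvision is missing or too old.

```python
import importlib.util


def list_builtin_models(submodule=None):
    """Return the model names torchvision registers, optionally for one task.

    `submodule` is a hypothetical convenience argument, e.g. "detection"
    or "segmentation". Returns [] if torchvision is unavailable or predates
    the list_models API (an assumption of this sketch: torchvision >= 0.14).
    """
    if importlib.util.find_spec("torchvision") is None:
        return []
    import torchvision.models as tvm

    if not hasattr(tvm, "list_models"):
        return []
    # Restrict the listing to one task-specific submodule when requested.
    module = getattr(tvm, submodule) if submodule else None
    return list(tvm.list_models(module=module))


if __name__ == "__main__":
    names = list_builtin_models("detection")
    print(len(names), "built-in detection models")
    for name in names:
        print(" ", name)
```

The detection list is much shorter than the classification list, which is the point of the slide: for many tasks the built-in choices run out quickly.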

8 of 25

What if we want to use another model?

15 of 25

Open-source repositories: pros and cons

Pros:

- can have built-in implementations of data augmentation, loss functions, and evaluation (less code to write!)

- (sometimes) vetted implementations

Cons:

- you need to read and understand someone else’s code

- not always well-documented

- (sometimes) heavier-weight than you need

16 of 25

How do we know what code is good?

17 of 25

Let’s say we’ve read the paper and want to try a RetinaNet object detection model on our data.

19 of 25

That’s still a lot of options…

20 of 25

What to watch out for

  • No documentation
  • Data hard-coded into the repo
  • No stars or forks of the repo (nobody else has tested it, so it might have bugs!!)
  • No pretrained models
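The checklist above can be turned into a quick screening script. This is only a sketch: `RepoInfo`, `red_flags`, and `find_hardcoded_paths` are hypothetical names, the fields are filled in by hand after skimming a repo, and the path regex is a rough heuristic that will miss many cases.

```python
import re
from dataclasses import dataclass

# Heuristic (an assumption of this sketch): quoted absolute paths under
# common user or data roots usually point at the author's own machine.
_ABS_PATH = re.compile(r"""["'](/(?:home|Users|mnt|data|scratch)/[^"']+)["']""")


def find_hardcoded_paths(source_text):
    """Return quoted absolute paths that look machine-specific."""
    return _ABS_PATH.findall(source_text)


@dataclass
class RepoInfo:
    """Facts about a candidate repo, collected by hand."""
    has_docs: bool
    stars: int
    forks: int
    has_pretrained_models: bool
    sample_source: str = ""  # e.g. contents of the training script


def red_flags(repo):
    """Return the warning signs from the checklist that apply to `repo`."""
    flags = []
    if not repo.has_docs:
        flags.append("no documentation")
    if find_hardcoded_paths(repo.sample_source):
        flags.append("data hard-coded into the repo")
    if repo.stars == 0 and repo.forks == 0:
        flags.append("no stars or forks")
    if not repo.has_pretrained_models:
        flags.append("no pretrained models")
    return flags


if __name__ == "__main__":
    candidate = RepoInfo(
        has_docs=True,
        stars=0,
        forks=0,
        has_pretrained_models=False,
        sample_source='root = "/home/alice/images"',
    )
    for flag in red_flags(candidate):
        print("watch out:", flag)
```

A repo with one flag can still be usable. The list is a prompt to look more closely, not a verdict.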

21 of 25

There are a few well-established repos that are good places to start

22 of 25

“Model zoos”: curated collections of architectures and pretrained models

transformers (HuggingFace)

timm
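As a rough illustration of how a model zoo is used, the sketch below searches timm for architectures and builds one with a new classification head. `timm.list_models` and `timm.create_model` are real timm calls. The helper names, the `"*resnet*"` pattern, and the class count are assumptions chosen for the example, and the search returns an empty list if timm is not installed.

```python
import importlib.util


def find_zoo_models(pattern="*resnet*", pretrained_only=True):
    """List timm architecture names matching a wildcard pattern.

    With pretrained_only=True, only architectures that have pretrained
    weights available are returned. Returns [] if timm is not installed.
    """
    if importlib.util.find_spec("timm") is None:
        return []
    import timm

    return timm.list_models(pattern, pretrained=pretrained_only)


def build_classifier(name, num_classes):
    """Create a timm model with a fresh head sized for our own classes.

    pretrained=True downloads weights on first use, so this needs
    network access the first time it runs.
    """
    import timm

    return timm.create_model(name, pretrained=True, num_classes=num_classes)


if __name__ == "__main__":
    candidates = find_zoo_models()
    print(len(candidates), "matching architectures with pretrained weights")
    if candidates:
        # 12 is a placeholder for the number of classes in your dataset.
        model = build_classifier(candidates[0], num_classes=12)
        print(type(model).__name__)
```

The same pattern (search the zoo, then load by name with pretrained weights) applies to transformers (HuggingFace), though its API differs.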
