Working with Open-source CV Codebases
CV4Ecology Summer School
Sara Beery and Surya Hari | August 17, 2023
When should you adapt open-source code?
CV/ML frameworks (e.g., PyTorch) have built-in options for simple models
What is built into PyTorch?
Not very many options!
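To check what ships with your own install, here is a minimal sketch. It assumes "built in" refers to the torchvision.models module (a separate package in the PyTorch ecosystem) and that torchvision is version 0.14 or newer, where list_models() is available. The helper name list_builtin_models is made up for this example.

```python
def list_builtin_models():
    """Return the model names torchvision ships with, or [] if unavailable."""
    try:
        from torchvision import models
    except ImportError:
        return []  # torchvision is not installed in this environment
    # list_models() was added in torchvision 0.14; older versions lack it.
    lister = getattr(models, "list_models", None)
    return sorted(lister()) if lister is not None else []


names = list_builtin_models()
print(f"{len(names)} built-in torchvision models found")
# Filter for one family, e.g. detection models with "retinanet" in the name.
print([n for n in names if "retinanet" in n])
```

If the architecture you want is not in that list, that is the point where an external repository becomes worth considering.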
What if we want to use another model?
Open-source repositories: pros and cons
Pros:
-can have built-in implementations of data augmentation, loss functions, and evaluation (less code to write!)
-(sometimes) vetted implementations
Cons:
-you need to read and understand someone else’s code
-not always well-documented
-(sometimes) heavier-weight than you need
How do we know what code is good?
Let’s say we’ve read the paper and want to try a RetinaNet object detection model on our data.
That’s still a lot of options…
What to watch out for
There are a few well-established repos that are a good place to start
“Model zoos”: curated collections of architectures and pretrained models
transformers (HuggingFace)
timm
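As a rough illustration of what a model zoo gives you, the sketch below uses timm to search for architectures that have pretrained weights and to build one with a new classification head. It assumes timm is installed. The helper names find_pretrained and build_classifier, the "*convnext*" pattern, and the 10-class example are all illustrative choices, not part of the original slides.

```python
def find_pretrained(pattern="*resnet*"):
    """List timm architectures matching a wildcard pattern that have pretrained weights."""
    try:
        import timm
    except ImportError:
        return []  # timm is not installed in this environment
    return timm.list_models(pattern, pretrained=True)


def build_classifier(num_classes, arch="resnet50"):
    """Create a timm model whose final layer has `num_classes` outputs."""
    import timm
    # pretrained=False avoids a weight download in this sketch;
    # set pretrained=True to start from the zoo's checkpoint instead.
    return timm.create_model(arch, pretrained=False, num_classes=num_classes)


candidates = find_pretrained("*convnext*")
print(f"{len(candidates)} matching pretrained models")
print(candidates[:5])
```

With timm available, something like build_classifier(10) would return a standard PyTorch module that can be dropped into an existing training loop.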