Byte-Sized
Computer Science for Data People
Part 1: Clean Code
1 | Copyright © 2020 Nick Lind. All rights reserved.
Topics We’ll Cover
1. Clean Code: Writing performant code that others will be excited to reuse
2. System Design: Building systems and products that scale
3. Collaboration: Working productively with other people
Principles of Clean Code
Style
Writing developer-friendly code that encourages others to build on top of and reuse your work
Speed
Ability to handle larger volumes of data without slowing down
Space
Ability to handle larger volumes of data without breaking
Deep-Dive: Style
Hastily writing code that’s easier for a future developer to abandon than to fix
Writing clean code once and having future users thank you forever
Benefits
Big Ideas
Related CS Concepts
*Clean Code recommends keeping functions to <4 lines, <3 parameters
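To make the small-function guideline concrete, here is a minimal sketch of a record-cleaning step split into short, single-purpose functions with one parameter each. The record fields (`name`, `age`), the age limits, and all function names are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def clean_name(raw_name):
    """Collapse extra whitespace and title-case one name."""
    return " ".join(raw_name.split()).title()


def is_valid_age(age):
    """Return True for ages in a plausible range (assumed limits)."""
    return 0 <= age <= 120


def clean_record(record):
    """Return a cleaned copy of one record without mutating the input."""
    return {**record, "name": clean_name(record["name"])}


def clean_records(records):
    """Drop records with invalid ages, then clean the rest."""
    return [clean_record(r) for r in records if is_valid_age(r["age"])]


raw = [
    {"name": "  ada   lovelace ", "age": 36},
    {"name": "bad row", "age": -1},
]
print(clean_records(raw))
```

Because each function does one thing, a future developer can test or reuse `clean_name` or `is_valid_age` without reading the whole pipeline.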
Deep-Dive: Speed
Your modeling pipeline was very fast when running on a small sample of data, but started to hang when you tried the same code on a larger dataframe. What should you do first?
Running nested for-loops on a Spark DataFrame
Using a built-in function on a filtered pandas DataFrame instead
Related CS Concepts
Thinking Exercise
Big Ideas
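One way to reason about the thinking exercise is to compare how the work grows with the data. The sketch below uses plain Python rather than Spark or pandas so it runs anywhere; the `(region, amount)` records and both function names are hypothetical. The nested-loop version rescans every record for every region, while the single-pass version touches each record once.

```python
from collections import defaultdict

# Hypothetical sample data: (region, amount) pairs.
records = [("east", 10.0), ("west", 5.0), ("east", 2.5), ("north", 7.0)]
regions = ["east", "west", "north"]


def totals_nested(records, regions):
    """Nested loops: work grows with len(regions) * len(records)."""
    totals = {}
    for region in regions:
        total = 0.0
        for rec_region, amount in records:
            if rec_region == region:
                total += amount
        totals[region] = total
    return totals


def totals_single_pass(records):
    """One pass with a dictionary: work grows with len(records) only."""
    totals = defaultdict(float)
    for region, amount in records:
        totals[region] += amount
    return dict(totals)


# Both approaches agree on the answer; only the amount of work differs.
print(totals_nested(records, regions) == totals_single_pass(records))
```

On a small sample both versions feel instant, which is why the slowdown only shows up on the larger dataframe. In pandas, a built-in grouped aggregation plays the role of the single-pass version.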
Deep-Dive: Space
Your modeling pipeline was working fine when you filtered on one region, but suddenly throws an out-of-memory error when you include all regions in your dataframe. What steps would you take to solve this problem?
Wasting money on expensive storage and compute clusters
Profiling your code and compressing your data so you don’t have to
Related CS Concepts
Thinking Exercise
Big Ideas
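A reasonable first step for the out-of-memory exercise is to measure peak memory before changing anything. The sketch below uses the standard-library `tracemalloc` module to compare building a full list with streaming the same values through a generator. The helper `peak_bytes` and the two example functions are hypothetical names, and streaming stands in here for the broader idea of holding less data in memory at once; it is not data compression.

```python
import tracemalloc


def squares_list(n):
    """Materialize every value in a list before summing."""
    values = [i * i for i in range(n)]
    return sum(values)


def squares_stream(n):
    """Stream values through a generator; no full list is built."""
    return sum(i * i for i in range(n))


def peak_bytes(func, *args):
    """Return (result, peak traced memory in bytes) for one call."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


list_result, list_peak = peak_bytes(squares_list, 100_000)
stream_result, stream_peak = peak_bytes(squares_stream, 100_000)
print(f"list peak:   {list_peak} bytes")
print(f"stream peak: {stream_peak} bytes")
```

Both functions return the same sum, but the list version has to hold every value at once. With pandas, `DataFrame.memory_usage(deep=True)` gives a comparable per-column measurement to start from.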
Real-World Examples
Book Recommendations
1. Clean Code
2. System Design
3. Collaboration