Right Code
Right Place
Right Time
Tim Hopper
Senior Data Scientist
Cylance, Inc.
🐦 @tdhopper
💻 tdhopper.com
📧 tdhopper@gmail.com
🖥 bit.ly/pydata2018
What 2010 Tim thought I’d do
What my wife thinks I do
What my CEO thinks I do
What my boss thinks I do
What I want people on Twitter to think I do
What I actually do
Install Python dependencies for exploratory analysis
Filter and extract features from data snapshots on S3
Spin up AWS spot instance with enough RAM to load data into Pandas
Configure EC2 instances and VPN to make Jupyter server accessible locally
Translate Scikit results into form that can be re-implemented in production
Extract code from random scripts and notebooks into Python package
Write shell script to bootstrap EC2 machines to reproduce analysis
Move Python environments inside Docker containers
Figure out how to share Docker images with the rest of the team
Build dashboard to monitor performance of model predictions
Schedule reporting tasks to run nightly
Do whatever people do with Kubernetes
Configure AWS permissions
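One item from the list above, translating Scikit results into a form that can be re-implemented in production, might look roughly like the sketch below. It assumes a binary logistic regression, where the fitted coefficients and intercept are all the production side needs (with scikit-learn these would typically come from `model.coef_[0]` and `model.intercept_[0]`). The function names and the JSON layout are hypothetical, chosen only for illustration.

```python
import json
import math


def export_linear_model(feature_names, coefficients, intercept):
    """Serialize a fitted linear model to JSON (hypothetical layout)."""
    return json.dumps({
        "features": list(feature_names),
        "coefficients": [float(c) for c in coefficients],
        "intercept": float(intercept),
    })


def predict_probability(model_json, row):
    """Score one row (a dict of feature name -> value) without scikit-learn."""
    model = json.loads(model_json)
    z = model["intercept"] + sum(
        c * row[name]
        for name, c in zip(model["features"], model["coefficients"])
    )
    # Numerically stable logistic function: avoid exp() of a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

Because the exported file is plain JSON, the scoring half could be rewritten in whatever language the production service uses.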
Data Scientists Solve Problems
https://www.youtube.com/watch?v=Av07QiqmsoA
VPs of Engineering Don’t Want Data Scientists Being Engineers
Production models
Model building and deployment pipelines
Great Data Science
Needs
Great Engineering
Great Engineering
Needs
Infrastructure
and Operations
DevOps Teams
Can Hinder
DevOps Practice
DataSciDevOps?
(MLEngOps?)
Great Machine Learning
Requires
Great Engineering and Great Operations
?
Right Code
Right Place
Right Time
Is my code correct?
Right Code
Are my dependencies available?
Right Code
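A minimal sketch of how a job could answer that question for itself before doing any work. It uses only the standard library's `importlib.metadata`; the helper name `missing_dependencies` is hypothetical, and the sketch checks only that a distribution is installed, not which version.

```python
from importlib import metadata


def missing_dependencies(required):
    """Return the distribution names in `required` that are not installed."""
    missing = []
    for name in required:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

A script might call this at startup and exit with a clear message when the returned list is non-empty, rather than failing on an import halfway through a long run.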
Is my configuration correct?
Right Code
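One way to make that question cheap to answer is to validate configuration once, at startup, and fail fast. The sketch below assumes settings arrive as strings in a mapping such as `os.environ`; the setting names `JOB_BUCKET` and `JOB_N_WORKERS` and the `JobConfig` class are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobConfig:
    bucket: str
    n_workers: int


def load_config(env):
    """Build a JobConfig from a mapping like os.environ, rejecting bad values."""
    missing = [key for key in ("JOB_BUCKET", "JOB_N_WORKERS") if key not in env]
    if missing:
        raise ValueError(f"missing configuration: {', '.join(missing)}")
    try:
        n_workers = int(env["JOB_N_WORKERS"])
    except ValueError:
        raise ValueError("JOB_N_WORKERS must be an integer") from None
    if n_workers < 1:
        raise ValueError("JOB_N_WORKERS must be at least 1")
    return JobConfig(bucket=env["JOB_BUCKET"], n_workers=n_workers)
```

Passing the mapping in as an argument, instead of reading `os.environ` inside the function, keeps the check easy to unit test.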
Are internal libraries readily available to coworkers?
Right Place
Are deployments automated?
Right Place
Is my virtual network correctly configured?
Right Place
Are my configuration and provisioning automated?
Right Place
Can I easily run code on a schedule?
Right Time
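In practice this is usually delegated to cron or a workflow scheduler, but the core calculation behind a nightly schedule is small. The sketch below is only an illustration: `next_run` is a hypothetical helper, the 02:00 default is arbitrary, and it assumes naive datetimes in a single time zone.

```python
from datetime import datetime, time, timedelta


def next_run(now, at=time(2, 0)):
    """Return the next datetime at clock time `at` that is strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate
```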
Do I have visibility into its status and history?
Right Time
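A rough sketch of what status and history can mean at the smallest scale: every run writes one row to a SQLite table, so the last outcome and the full record are both a query away. The table layout and function names are hypothetical; a real scheduler or monitoring system would normally provide this.

```python
import sqlite3
from datetime import datetime, timezone


def open_history(path=":memory:"):
    """Open (or create) the run-history database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs "
        "(job TEXT, started_at TEXT, status TEXT, detail TEXT)"
    )
    return conn


def run_and_record(conn, job_name, func):
    """Run func(), record whether it succeeded, and return the status."""
    started = datetime.now(timezone.utc).isoformat()
    try:
        func()
        status, detail = "ok", ""
    except Exception as exc:  # record the failure instead of losing it
        status, detail = "failed", repr(exc)
    conn.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?)",
        (job_name, started, status, detail),
    )
    conn.commit()
    return status


def last_status(conn, job_name):
    """Return the most recent status for job_name, or None if it never ran."""
    row = conn.execute(
        "SELECT status FROM runs WHERE job = ? ORDER BY rowid DESC LIMIT 1",
        (job_name,),
    ).fetchone()
    return row[0] if row else None
```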
Can I provision infrastructure on-demand for ad-hoc jobs?
Right Time