Kubeflow Explained: NLP Architectures on Kubernetes
Michelle Casbon
YOW!
Melbourne, December 7, 2018
whoami
@texasmichelle
Agenda
1. Problems
2. Goals
3. What's inside
4. Demo
5. Future Direction
ML decision tree
Is this a clearly defined problem?
  No: Move along
  Yes: Can it be solved in a deterministic way?
    Yes: Do that
    No: Dive in
Credit: David Andrzejewski
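Read as code, the decision tree might look like the following minimal sketch. The branch assignment (ill-defined problems are skipped, deterministic solutions are preferred, and ML is the fallback) is an assumption, since the slide's arrows are not preserved in this text. The function and parameter names are hypothetical.

```python
def ml_decision(clearly_defined: bool, deterministic: bool) -> str:
    """Return the slide's recommendation for a candidate problem.

    Assumed reading of the flowchart; the branch order is inferred.
    """
    if not clearly_defined:
        # Assumed branch: an ill-defined problem is not ready for ML.
        return "Move along"
    if deterministic:
        # Assumed branch: prefer the deterministic solution when one exists.
        return "Do that"
    # Well defined but not solvable deterministically: a candidate for ML.
    return "Dive in"


if __name__ == "__main__":
    for defined in (False, True):
        for det in (False, True):
            print(defined, det, "->", ml_decision(defined, det))
```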
Counting things is still really hard.
MACHINE LEARNING
https://github.com/kubeflow/examples/demos
A curated set of compatible tools and artifacts that lays a foundation for running production ML apps
Enables consistency across deployments by providing Kubernetes object templates that bring together disparate components
Infrastructure: GCP
Platform: Kubeflow
Application: Yelp Sentiment
Problems
Production code
Moving from local to production
Portability
Package infrastructure components together
Credit: Jörg Wagner and Stefan Prehn
Complexity
Perception
Credit: Hidden Technical Debt of Machine Learning Systems, D. Sculley, et al.
Reality
Credit: Hidden Technical Debt of Machine Learning Systems, D. Sculley, et al.
Data
Featurization
Training
Application
Platform
Feature Extraction
Data Ingestion
Data Exploration
Data Transformation
Data Validation
Data Analysis
Training Data Segmentation
Model Building
Model Validation
Model Versioning
Model Auditing
Distributed Training
Continuous Training
Process Management
Configuration
Resource Management
Monitoring
Logging
Continuous Delivery
Authentication/Authorization
Serving Infrastructure
UI
Business Logic
Load Balancing
Complexity
Composability
Logical groupings
Reusable components
Maintainability
Composability
Shorten the development lifecycle
Automation
Capacity Planning
Scalability
Kubernetes
Autoprovisioning
Goals
Make it easy for everyone to develop, deploy, and manage portable, scalable ML everywhere
Kubeflow
Composability
Single, unified tool for common processes
Portability
Entire stack
Scalability
Native to k8s
Reduce variability between services & environments
Full product lifecycle
Support specialized hardware, like GPUs & TPUs
Reduce costs
Improve model performance
Kubeflow
https://github.com/kubeflow/kubeflow
Who
Data scientists
ML researchers
Software engineers
Product managers
Why
Because building a platform is too big a problem to tackle alone
What
Portable ML products on k8s
v0.3.4 release
Kubeflow
Kubernetes-native platform for ML
Run wherever k8s runs
Use k8s to manage ML tasks
CRDs for distributed training
Adopt k8s patterns
Microservices
Manage infra declaratively
Package infrastructure components together
Ksonnet
Move between local -> dev -> test -> prod -> onprem
Support multiple ML frameworks
TensorFlow
PyTorch
scikit-learn
XGBoost
Et al.
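To make "CRDs for distributed training" and "Manage infra declaratively" concrete, here is a minimal sketch that builds a TFJob-style custom resource as a plain Python dict and varies only the replica count between environments. The field layout is modeled on the TFJob resource, and the `apiVersion` shown is an assumption about the version current around the v0.3 release. The job name, image, and environment settings are hypothetical, and this is not the ksonnet packaging the talk refers to.

```python
import json


def tfjob_manifest(name, image, workers, ps=1):
    """Build a TFJob-style custom resource as a plain dict (illustrative only)."""

    def replica(count):
        # Each replica type gets a replica count and an ordinary pod template.
        return {
            "replicas": count,
            "template": {
                "spec": {
                    "containers": [{"name": "tensorflow", "image": image}],
                }
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1alpha2",  # assumed API version
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "PS": replica(ps),
                "Worker": replica(workers),
            }
        },
    }


# Hypothetical per-environment settings: the template stays the same,
# only the parameters change as the job moves from dev to prod.
ENVIRONMENTS = {"dev": {"workers": 1}, "prod": {"workers": 3}}

manifests = {
    env: tfjob_manifest(
        name=f"sentiment-{env}",
        image="example.com/sentiment-train:latest",  # hypothetical image
        **settings,
    )
    for env, settings in ENVIRONMENTS.items()
}

if __name__ == "__main__":
    print(json.dumps(manifests["prod"], indent=2))
```

The JSON this prints is a description of desired state rather than a script of steps: submitting it to a cluster where the TFJob CRD is installed would leave it to the operator to create and supervise the pods.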
What's inside
But what is it?
A curated set of compatible tools and artifacts that lays a foundation for running production ML apps
Enables consistency across deployments by providing Kubernetes object templates that bring together disparate components
What's Inside v0.3?
GKE
Ingress (e.g. Ambassador)
Pipelines Controllers
Argo Controllers
Katib HP Tuning Controllers
IAP
Central Dashboard
JupyterHub
TF Job Dashboard
TF Job Operator
Pipelines Dashboard
Argo Dashboard
Katib HP Tuning Dashboard
PyTorch Operator
Click-to-deploy
What's new in v0.3?
Pipelines
Demo
https://github.com/kubeflow/examples/demos
TF Serving
Application UI
Notebook
TF Job: Parameter Server 1, TensorFlow Master, TensorFlow Workers 1, 2, 3
Pipeline
StudyJob
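For the TF Job topology in the diagram (one parameter server, one master, three workers), distributed TensorFlow processes conventionally learn about each other through a JSON value in the `TF_CONFIG` environment variable. The sketch below shows what such a value might look like. The hostnames, port, and job name are hypothetical, and the source does not state exactly what the demo's operator generates.

```python
import json


def tf_config(job, task_type, index, ps=1, workers=3, port=2222):
    """Return a TF_CONFIG-style JSON string for one task in the cluster.

    Hostnames follow a made-up "<job>-<type>-<index>" pattern for illustration.
    """
    cluster = {
        "master": [f"{job}-master-0:{port}"],
        "ps": [f"{job}-ps-{i}:{port}" for i in range(ps)],
        "worker": [f"{job}-worker-{i}:{port}" for i in range(workers)],
    }
    if task_type not in cluster:
        raise ValueError(f"unknown task type: {task_type}")
    if not 0 <= index < len(cluster[task_type]):
        raise ValueError(f"no {task_type} with index {index}")
    # Task indices are zero-based, so "Worker 3" on the slide is index 2 here.
    return json.dumps({"cluster": cluster, "task": {"type": task_type, "index": index}})


if __name__ == "__main__":
    print(json.dumps(json.loads(tf_config("sentiment", "worker", 2)), indent=2))
```

Every process receives the same `cluster` map and a different `task` entry, which is what lets a single container image serve as master, parameter server, or worker.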
Try it Yourself
Future Direction
Roadmap
Just the Beginning
Questions?
@texasmichelle