1 of 40

Kubeflow Explained: NLP Architectures on Kubernetes

Michelle Casbon

YOW!

Melbourne, December 7, 2018

2 of 40

whoami

@texasmichelle

3 of 40

Agenda

  1. Problems
  2. Goals
  3. What's inside
  4. Demo
  5. Future Direction

4 of 40

ML decision tree

Is this a clearly defined problem?

  • No: Move along
  • Yes: Can it be solved in a deterministic way?
    • Yes: Do that
    • No: Dive in

Credit: David Andrzejewski
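The decision tree on this slide can be read as a two-question check. The sketch below is one interpretation of the slide's labels, assuming "Move along" belongs to an ill-defined problem, "Do that" to a deterministic solution, and "Dive in" to the remaining case. The function name `ml_decision` and its parameters are hypothetical.

```python
def ml_decision(clearly_defined: bool, deterministic_solution: bool) -> str:
    """Return the slide's recommendation for a candidate problem.

    Assumed reading of the slide: an ill-defined problem is skipped, a
    deterministic solution is preferred when one exists, and ML is the
    fallback for well-defined problems without one.
    """
    if not clearly_defined:
        return "Move along"
    if deterministic_solution:
        return "Do that"
    return "Dive in"


if __name__ == "__main__":
    # Walk all four combinations of answers.
    for defined in (False, True):
        for deterministic in (False, True):
            print(defined, deterministic, "->", ml_decision(defined, deterministic))
```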


5 of 40

Counting things is still really hard.

MACHINE LEARNING

6 of 40

https://github.com/kubeflow/examples/demos


7 of 40

A curated set of compatible tools and artifacts that lays a foundation for running production ML apps

Enables consistency across deployments by providing Kubernetes object templates that bring together disparate components


8 of 40

Infrastructure: GCP

Application: Yelp Sentiment

Platform: Kubeflow

GCP

Sentiment

Kubeflow


9 of 40

Agenda

  1. Problems
  2. Goals
  3. What's inside
  4. Demo
  5. Future Direction

10 of 40

Production code


11 of 40

Moving from local to production

Portability

Package infrastructure components together

Credit: Jörg Wagner and Stefan Prehn


12 of 40

13 of 40

14 of 40

Complexity


15 of 40

Perception

Credit: Hidden Technical Debt of Machine Learning Systems, D. Sculley, et al.


16 of 40

Reality

Credit: Hidden Technical Debt of Machine Learning Systems, D. Sculley, et al.


17 of 40

Data

Featurization

Training

Application

Platform


Feature Extraction

Data Ingestion

Data Exploration

Data Transformation

Data Validation

Data Analysis

Training Data Segmentation

Model Building

Model Validation

Model Versioning

Model Auditing

Distributed Training

Continuous Training

Process Management

Configuration

Resource Management

Monitoring

Logging

Continuous Delivery

Authentication/Authorization

Serving Infrastructure

UI

Business Logic

Load Balancing


18 of 40

Complexity

Composability

Logical groupings

Reusable components


19 of 40

Maintainability

  • Error resolution, recovery, & prevention
  • Speed of iteration
  • Versioning

Composability

Shorten the development lifecycle

Automation


20 of 40

Capacity Planning

Scalability

Kubernetes

Autoprovisioning

  • Usage patterns
  • Demand spikes
  • Efficient resource usage


21 of 40

Agenda

  1. Problems
  2. Goals
  3. What's inside
  4. Demo
  5. Future Direction

22 of 40

Make it easy for everyone to develop, deploy, and manage portable, scalable ML everywhere


23 of 40

Kubeflow

Composability

Single, unified tool for common processes

Portability

Entire stack

Scalability

Native to k8s

Reduce variability between services & environments

Full product lifecycle

Support specialized hardware, like GPUs & TPUs

Reduce costs

Improve model performance


24 of 40

Kubeflow

https://github.com/kubeflow/kubeflow

Who

Data scientists

ML researchers

Software engineers

Product managers

Why

Because building a platform is too big a problem to tackle alone

What

Portable ML products on k8s

v0.3.4 release


25 of 40

Kubeflow

Kubernetes-native platform for ML

Run wherever k8s runs

Use k8s to manage ML tasks

CRDs for distributed training

Adopt k8s patterns

Microservices

Manage infra declaratively

Package infrastructure components together

Ksonnet

Move between local -> dev -> test -> prod -> on-prem

Support multiple ML frameworks

TensorFlow

PyTorch

scikit-learn

XGBoost

Et al.
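To make "CRDs for distributed training" and "Manage infra declaratively" concrete, here is a minimal sketch that builds a TFJob-style manifest as plain data and prints it as JSON. It is an illustration under assumptions, not the project's reference manifest:

  • The `apiVersion`, the `tfReplicaSpecs` field, and the replica role names (`Chief`, `PS`, `Worker`) follow one recollection of the TFJob schema from around this release. Check them against the operator version you deploy.
  • The helper `tfjob_manifest`, the job name, and the image `example.invalid/trainer:latest` are hypothetical.
  • The default replica counts mirror the demo topology shown later in the deck (one parameter server, three workers).

```python
import json


def tfjob_manifest(name, image, workers=3, ps=1):
    """Build a TFJob-style custom resource as a plain dict.

    Field names are assumptions (see the lead-in); nothing here talks to a
    cluster. The point is that the whole training topology is declared as data.
    """

    def replica(count):
        # One pod template per role; every role runs the same image here.
        return {
            "replicas": count,
            "template": {
                "spec": {
                    "containers": [{"name": "tensorflow", "image": image}],
                    "restartPolicy": "OnFailure",
                }
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1alpha2",  # assumed API version
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Chief": replica(1),  # assumed name for the "master" role
                "PS": replica(ps),
                "Worker": replica(workers),
            }
        },
    }


if __name__ == "__main__":
    manifest = tfjob_manifest("sentiment-train", "example.invalid/trainer:latest")
    print(json.dumps(manifest, indent=2))
```

Because the manifest is plain data, the same definition can be versioned and applied unchanged in each environment, which is the portability argument made on this slide.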


26 of 40

Agenda

  1. Problems
  2. Goals
  3. What's inside
  4. Demo
  5. Future Direction

27 of 40

But what is it?


28 of 40


29 of 40

A curated set of compatible tools and artifacts that lays a foundation for running production ML apps

Enables consistency across deployments by providing Kubernetes object templates that bring together disparate components


30 of 40

What's Inside v0.3?

  • GKE
  • Ingress (e.g. Ambassador)
  • Pipelines Controllers
  • Argo Controllers
  • Katib HP Tuning Controllers
  • IAP
  • Central Dashboard
  • JupyterHub
  • TF Job Dashboard
  • TF Job Operator
  • Pipelines Dashboard
  • Argo Dashboard
  • Katib HP Tuning Dashboard
  • PyTorch Operator


31 of 40

Click-to-deploy


32 of 40

What's new in v0.3?

  • Deploy
    • Click-to-deploy
    • CLI (kfctl)
    • Autoprovisioning
  • Develop
    • Argo
    • PyTorch operator
    • Hyperparameter tuning StudyJob CRD
    • Kubeflow Pipelines
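The hyperparameter tuning item above comes down to running many trials with different settings and keeping the best one. The sketch below shows that loop as a seeded random search in plain Python. It does not use Katib or the StudyJob CRD. The names `random_search` and `toy_loss`, and the search ranges, are hypothetical stand-ins for a real training run.

```python
import random


def random_search(objective, space, trials=20, seed=0):
    """Sample each hyperparameter uniformly and keep the lowest-scoring trial.

    space maps a parameter name to a (low, high) range. A fixed seed keeps
    the sketch reproducible.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_loss(params):
    # Stand-in for a validation loss; a real trial would train a model here.
    return (params["learning_rate"] - 0.1) ** 2 + (params["dropout"] - 0.5) ** 2


if __name__ == "__main__":
    space = {"learning_rate": (0.001, 0.5), "dropout": (0.0, 0.9)}
    best, loss = random_search(toy_loss, space, trials=50)
    print(best, loss)
```

On a cluster, each trial would run as its own job, so the trials can be scheduled in parallel rather than in a single loop.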


33 of 40

Pipelines

  • End-to-end ML workflows
  • Orchestration
  • Service integration
  • Components & sharing
  • Job tracking, experimentation, monitoring
  • Notebook integration
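"End-to-end ML workflows" and "Orchestration" mean running steps in dependency order and passing results downstream. The toy sketch below shows that idea with the standard library's `graphlib` (Python 3.9+). It is not the Kubeflow Pipelines SDK. The function `run_pipeline`, the step names, and the sample data are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies):
    """Run each step after its upstream steps and collect the outputs.

    steps maps a step name to a callable that takes the context dict.
    dependencies maps a step name to the set of steps it depends on.
    """
    graph = {name: dependencies.get(name, set()) for name in steps}
    context, order = {}, []
    for name in TopologicalSorter(graph).static_order():
        context[name] = steps[name](context)  # output is visible downstream
        order.append(name)
    return order, context


# A four-step chain loosely modeled on the stages named in this deck.
steps = {
    "ingest": lambda ctx: ["great food", "slow service"],
    "featurize": lambda ctx: [len(text.split()) for text in ctx["ingest"]],
    "train": lambda ctx: {"mean_tokens": sum(ctx["featurize"]) / len(ctx["featurize"])},
    "serve": lambda ctx: f"serving model {ctx['train']}",
}
dependencies = {"featurize": {"ingest"}, "train": {"featurize"}, "serve": {"train"}}

if __name__ == "__main__":
    order, context = run_pipeline(steps, dependencies)
    print(order)
    print(context["serve"])
```

In a real pipeline each step would run in its own container, and the orchestrator would also handle retries, logging, and run tracking.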


34 of 40

Agenda

  1. Problems
  2. Goals
  3. What's inside
  4. Demo
  5. Future Direction

35 of 40

https://github.com/kubeflow/examples/demos

  • GKE
  • Ingress (e.g. Ambassador)
  • Pipelines Controllers
  • Argo Controllers
  • Katib HP Tuning Controllers
  • IAP
  • Central Dashboard
  • JupyterHub
  • TF Job Dashboard
  • TF Job Operator
  • Pipelines Dashboard
  • Argo Dashboard
  • Katib HP Tuning Dashboard
  • PyTorch Operator
  • TF Serving
  • Application UI
  • Notebook
  • TF Job
    • Parameter Server: 1
    • TensorFlow Master
    • TensorFlow Workers: 1, 2, 3
  • Pipeline
  • StudyJob


36 of 40

Try it Yourself

  • deploy.kubeflow.cloud (Click-to-deploy)
  • codelabs.developers.google.com
    • Intro to Kubeflow on Google Kubernetes Engine: https://goo.gl/192bs7
    • Kubeflow End-to-End: GitHub Issue Summarization: https://goo.gl/qLXUTG
  • Qwiklabs: https://qwiklabs.com
  • Katacoda: https://www.katacoda.com/kubeflow
  • GitHub: https://github.com/kubeflow/examples/tree/master/github_issue_summarization
  • http://gh-demo.kubeflow.org


37 of 40

Agenda

  1. Problems
  2. Goals
  3. What's inside
  4. Demo
  5. Future Direction

38 of 40

Roadmap

  • 0.4 in December
  • Enterprise readiness
    • 1.0 with hardened APIs
    • IAM/RBAC
    • Clean deployments, upgrades
  • Better Jupyter Notebook integration
  • Pipeline experiment comparison
  • Model management
  • Test release infrastructure
  • You tell us! (Or better yet, help!)


39 of 40

Just the Beginning

  • Kubeflow is open
    • Open community
    • Open design
    • Open source
    • Open to ideas
  • You tell us! Get involved
    • github.com/kubeflow
    • kubeflow.slack.com
    • @kubeflow
    • kubeflow-discuss@googlegroups.com
    • Community call Tuesdays alternating 8:30am and 5:30pm Pacific
    • Kubeflow Contributor Summit
      • Q1 2019


40 of 40

Questions?
