1 of 16

2 of 16

Yuan Tang (Akuity) and Andrey Velichkevich (Apple)

Managing Thousands of Automatic Machine Learning Experiments With Argo and Katib

3 of 16

Kubeflow Overview

4 of 16

What is Katib?

  • Production-ready OSS project for AutoML.
  • Supports hyperparameter tuning, neural architecture search, and early stopping.
  • Platform to develop and evaluate custom AutoML algorithms.
  • Can orchestrate any Kubernetes custom resources.
  • Agnostic to ML frameworks and languages.
  • Natively integrated with other Kubeflow components (Training, Notebooks, Pipelines).

5 of 16

Katib Architecture

6 of 16

Challenges in HP Tuning

7 of 16

Argo Workflows

  • Machine learning pipelines
  • Data processing/ETL
  • Infrastructure automation
  • Continuous delivery/integration

8 of 16

Memoization Cache

Step A

Cache Store

If the cache is outdated, re-run the step.

If the cache is still fresh (within a configurable time, e.g. <10s), retrieve the cached result and use it.

Step B

Create the cache entry (if it does not exist yet)

The cache is stored in ConfigMaps
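
The freshness check above can be sketched in a few lines of Python. This is only an illustration of the idea, not Argo Workflows code: the `MemoCache` class, its `run_step` method, and the in-memory dictionary standing in for the ConfigMap-backed cache store are all hypothetical names introduced here.

```python
import time
from typing import Any, Callable, Dict, Tuple


class MemoCache:
    """Hypothetical in-memory stand-in for the ConfigMap-backed cache store."""

    def __init__(self, max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_age_seconds = max_age_seconds  # the configurable freshness window
        self._clock = clock                     # injectable so the sketch is testable
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def run_step(self, key: str, step: Callable[[], Any]) -> Tuple[Any, bool]:
        """Run `step` unless a fresh entry exists. Returns (output, cache_hit)."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            created_at, output = entry
            if now - created_at < self.max_age_seconds:
                # Cache is still fresh: reuse the stored output.
                return output, True
        # Cache is missing or outdated: re-run the step and store the result.
        output = step()
        self._entries[key] = (now, output)
        return output, False


if __name__ == "__main__":
    cache = MemoCache(max_age_seconds=10)
    print(cache.run_step("step-a", lambda: "result of step A"))
    print(cache.run_step("step-a", lambda: "result of step A"))
```

The second call within the window reports a cache hit, so Step B can consume the stored output without Step A running again.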

9 of 16

Memoization Cache - Example

Cache (K8s ConfigMap):

10 of 16

Memoization in ML Workflows

Data ingestion

Model training

Cache store

The data has NOT been updated recently.

The data has already been updated recently.

Triggers an ML pipeline with Argo Workflows

Triggers a Katib Experiment
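
A rough Python sketch of why this matters when many trials share one pipeline: each trial would otherwise repeat data ingestion, but with a memoized ingestion step only the first trial (or the first one after the cache expires) pays for it. The `Pipeline` class, the `max_age` window, the placeholder dataset, and the toy scoring formula are all assumptions made for illustration; they are not part of Argo Workflows or Katib.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Pipeline:
    """Hypothetical pipeline: memoized data ingestion followed by model training."""

    max_age: float  # assumed freshness window for the ingested data, in seconds
    cache: Dict[str, Tuple[float, List[float]]] = field(default_factory=dict)
    ingestion_runs: int = 0

    def ingest(self, now: float) -> List[float]:
        cached = self.cache.get("dataset")
        if cached is not None and now - cached[0] < self.max_age:
            # The data has already been updated recently: skip ingestion.
            return cached[1]
        # The data has NOT been updated recently: ingest again and cache it.
        self.ingestion_runs += 1
        data = [1.0, 2.0, 3.0, 4.0]  # placeholder for a real ingestion job
        self.cache["dataset"] = (now, data)
        return data

    def run_trial(self, learning_rate: float, now: float) -> float:
        """One trial: (possibly cached) ingestion, then a toy training step."""
        data = self.ingest(now)
        return sum(data) * learning_rate  # made-up score, stands in for training


if __name__ == "__main__":
    pipeline = Pipeline(max_age=60.0)
    for i, lr in enumerate([0.1, 0.01, 0.001]):
        pipeline.run_trial(lr, now=float(i))
    print("ingestion runs:", pipeline.ingestion_runs)
```

Three trials inside the freshness window trigger a single ingestion run; a trial started after the window has passed would ingest again.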

11 of 16

Memoization Cache

Sequential steps:

Data ingestion step:

Distributed model training step:

12 of 16

Multi-objective Optimization Pipeline

Data Ingestion

Data Ingestion

Logistic Regression

Accuracy: 68%

Neural Networks

AUC: 76%

Decision Trees

Loss: 90%

Metrics Collection

Hyper-parameters Suggestion

Triggers an ML pipeline with Argo Workflows using the newly suggested hyperparameters
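
The suggest, run, collect loop in this diagram can be sketched as follows. Random search is swapped in here as the simplest possible suggestion algorithm, and the objective is a toy function standing in for a full pipeline run; `suggest`, `run_experiment`, and the search space are hypothetical names for illustration and do not reflect Katib's API.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
Trial = Tuple[Params, float]


def suggest(rng: random.Random, space: Dict[str, Tuple[float, float]]) -> Params:
    """Hyperparameter suggestion: sample each parameter uniformly from its range."""
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}


def run_experiment(objective: Callable[[Params], float],
                   space: Dict[str, Tuple[float, float]],
                   max_trials: int,
                   seed: int = 0,
                   maximize: bool = True) -> Tuple[Trial, List[Trial]]:
    """Repeat suggest -> run pipeline -> collect metric, then pick the best trial."""
    rng = random.Random(seed)
    history: List[Trial] = []
    for _ in range(max_trials):
        params = suggest(rng, space)
        metric = objective(params)        # stands in for one pipeline run
        history.append((params, metric))  # metrics collection
    pick = max if maximize else min
    best = pick(history, key=lambda trial: trial[1])
    return best, history


if __name__ == "__main__":
    # Toy objective: the score peaks when the learning rate is close to 0.1.
    def toy_accuracy(params: Params) -> float:
        return 1.0 - abs(params["learning_rate"] - 0.1)

    best, history = run_experiment(toy_accuracy,
                                   {"learning_rate": (0.001, 1.0)},
                                   max_trials=20)
    print(len(history), "trials, best:", best)
```

The `maximize` flag covers metrics that should go down, such as loss, as well as those that should go up, such as accuracy or AUC.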

13 of 16

Multi-objective Optimization Pipeline

DAG:

Data ingestion steps:

Model training steps:

14 of 16

Demo: Caching

15 of 16

Join Argo Workflows and Katib Community

  • Follow this guide to run the Argo Workflows example in Katib.
  • Join the Argo Workflows and Katib community meetings to get the latest updates.
  • Check the Argo Workflows and Katib GitHub repositories.
  • Join the Argo Workflows and Katib Slack channels.
  • If you are using Katib, please update the Adopters list.
  • Learn more about Katib in the presentation list.
  • Personal Contact:
    • Yuan Tang - @TerryTangYuan
    • Andrey Velichkevich - @andreyvelichk

16 of 16

Thank you!