1 of 6

Multi-user Environments

kam.d.kasravi@intel.com, abolfazl.shahbazi@intel.com

https://shorturl.at/kxPR3

2019-03-12

2 of 6

Multi-user Impetus

  • Data scientists should have strong privacy controls within their workspace
  • Data scientists often need to isolate different workflows in different workspaces
  • Data scientists should have resource quotas per workspace
    • fair scheduling across teams
  • Data scientists often need controlled access to different cloud provider resources
    • Google Cloud Storage (GCS)
    • AWS S3
    • GCR (image registries)

3 of 6

Multi-user Kubeflow ↔ Profiles in Kubeflow 0.5

  • Data scientists use self-serve environments to create their own workspaces with no devops involvement
    • One user, many workspaces*
    • Access control (RBAC) part of the workspace
    • Least privilege model
      • no cluster-wide definitions
      • no privilege escalation to create the workspace
  • Integrated with several types of storage models (NFS, S3, PVs)
  • Part of Kubeflow Notebooks

* implemented as namespaces

4 of 6

Profiles is a CRD

apiVersion: kubeflow.org/v1alpha1
kind: Profile
metadata:
  name: jill
spec:
  owner:
    kind: User
    apiGroup: rbac.authorization.k8s.io
    name: jill@foo.com

apiVersion: kubeflow.org/v1alpha1
kind: Profile
metadata:
  name: john
spec:
  owner:
    kind: ServiceAccount
    namespace: kube-system
    name: john

User is defined in GCP IAM

ServiceAccount is defined in the Kubernetes cluster, in the kube-system namespace

5 of 6

Submitting a Profile CR
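
One way to prepare a Profile CR for submission can be sketched in Python. This is a minimal sketch that only builds manifests shaped like the two Profile examples on the previous slide; the helper name make_profile and its parameters are hypothetical and are not part of Kubeflow.

```python
import json


def make_profile(name, owner_kind, owner_name, owner_namespace=None):
    """Build a Profile manifest as a plain dict (hypothetical helper)."""
    owner = {"kind": owner_kind, "name": owner_name}
    if owner_kind == "User":
        # User subjects carry the RBAC API group, as in the jill example.
        owner["apiGroup"] = "rbac.authorization.k8s.io"
    if owner_namespace is not None:
        # ServiceAccount subjects are namespaced, as in the john example.
        owner["namespace"] = owner_namespace
    return {
        "apiVersion": "kubeflow.org/v1alpha1",
        "kind": "Profile",
        "metadata": {"name": name},
        "spec": {"owner": owner},
    }


jill = make_profile("jill", "User", "jill@foo.com")
john = make_profile("john", "ServiceAccount", "john", owner_namespace="kube-system")

# kubectl accepts JSON manifests as well as YAML, so JSON output is enough here.
print(json.dumps(jill, indent=2))
```

Assuming the Profile CRD is installed and the caller is allowed to create Profile objects, the printed JSON could be saved to a file (for example jill-profile.json, a hypothetical name) and submitted with kubectl apply -f jill-profile.json.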

6 of 6

Profiles features in Kubeflow 0.5+ (what we’re working on)

  • Extend Profiles to allow only users who belong to cloud-provider IAM service accounts
    • IAM service accounts include roles that provide edit or view access to cloud resources
  • Allow additional IAM users in single-user workspaces
  • Workspace owner can add/remove IAM users
  • Delegation and impersonation
    • Kubeflow ServiceAccounts will execute in each workspace on behalf of the owner or member.
  • Utilize Google's Go Cloud Development Kit (Go CDK), since it provides a portable layer across GCP, AWS, and Azure