1 of 14

k8s-hep Meetup

  • wrap-up

k8s-hep Meetup ('Chicagoland')

January 30, 2020

Rob Gardner

Enrico Fermi Institute

University of Chicago

2 of 14

themes from today

3 of 14

Themes

  • deploying k8s on-prem
  • operational surprises
  • managing k8s releases
  • managing storage volumes
  • uniformity between sites and cloud - infrastructure looks the same
  • can start to do multicluster things
  • need a good environment to grow - starting points
  • where to collaborate (e.g. gitlab)
  • sharing images and charts
  • analysis facilities in WLCG need effort
  • JupyterHubs scheduled by k8s, and federated JupyterHubs

4 of 14

Themes, II

  • various use cases - batch, analysis, federated edge service orchestration
  • scoping privileges
  • Helping sys admins - giving them a uniform layer - lining up w/ industry standards
  • what can k8s fix that we had to do before?
  • distributed OS - k8s a platform (analogy with Linux on the node)
  • HTCondor is evolving to k8s and open to input
  • helm deploy htcondor pool
  • noting best practice methods
  • functional interfaces to infrastructure

5 of 14

challenges

6 of 14

Challenges

  • Distributions - the many options in a rapidly growing community
  • Security - registries, image security, charts
  • Why are you federating? k8s federation v1 was a failure; v2 is still in alpha a year later - most meetings are cancelled
  • Load balancing - exists? what if it's out of IPs?
  • Helm templating - lack of documentation
  • PriorityClasses, StorageClasses for pods - understanding wha
  • Security and app curation rules
  • HTCondorCE that submits pilots as pod on a cluster
  • Context switching - root reqs of the CE -
  • Application whitelisting / blacklisting
  • Privileged containers, those with root access; OKD4, OKD3
  • Kubernetes version
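
The PriorityClasses / StorageClasses item above can be made concrete with a small sketch. It builds three manifests as plain Python dicts: a PriorityClass, a PersistentVolumeClaim that names a StorageClass, and a pod that references both. The field names follow the Kubernetes API; the object names (`analysis-high`, `fast-ssd`, `scratch-claim`) and the image are hypothetical placeholders, not anything shown at the meetup.

```python
import json


def priority_class(name, value, description=""):
    """Return a PriorityClass manifest; higher value means higher priority."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,
        "globalDefault": False,
        "description": description,
    }


def volume_claim(name, storage_class, size):
    """Return a PersistentVolumeClaim bound to a named StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


def pod(name, image, priority_class_name, claim_name):
    """Return a pod that uses the priority class and mounts the claim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "priorityClassName": priority_class_name,
            "containers": [{
                "name": "main",
                "image": image,
                "volumeMounts": [{"name": "scratch", "mountPath": "/scratch"}],
            }],
            "volumes": [{
                "name": "scratch",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


# All names below are hypothetical examples.
manifests = [
    priority_class("analysis-high", 1000, "interactive analysis pods"),
    volume_claim("scratch-claim", "fast-ssd", "10Gi"),
    pod("analysis-pod", "example.org/analysis:latest",
        "analysis-high", "scratch-claim"),
]

if __name__ == "__main__":
    print(json.dumps(manifests, indent=2))
```

The printed JSON is accepted by `kubectl apply -f -` in principle, but the `fast-ssd` StorageClass would have to exist on the target cluster first.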

7 of 14

Challenges, II

  • No one is asking for ML as a service, but "federation" can be simple, given ingress on the target
  • U-Fix! by SRE's!
  • Provisioning the resource - the layer between htcondor vs k8s
  • Logs
  • chicken-and-egg: HTCondor & k8s and handling selective eviction - what can k8s do today? - use labels, pods, etc.
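
One way to picture "selective eviction using labels" is the sketch below. It is not how HTCondor or Kubernetes decide evictions; it only illustrates the idea of filtering pods by an equality-based label selector, skipping pods that carry an opt-out label, and ordering the rest from lowest priority up. The label key `hep.example/no-evict` and the pod records are hypothetical.

```python
def matches(labels, selector):
    """True if every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def eviction_candidates(pods, selector, protected_label="hep.example/no-evict"):
    """Pods matching the selector, minus protected ones, lowest priority first."""
    chosen = [
        p for p in pods
        if matches(p["labels"], selector)
        and p["labels"].get(protected_label) != "true"
    ]
    return sorted(chosen, key=lambda p: p["priority"])


# Hypothetical pod records; a real list would come from the k8s API.
pods = [
    {"name": "pilot-a", "priority": 100,
     "labels": {"app": "pilot", "tier": "opportunistic"}},
    {"name": "pilot-b", "priority": 50,
     "labels": {"app": "pilot", "tier": "opportunistic",
                "hep.example/no-evict": "true"}},
    {"name": "pilot-c", "priority": 10,
     "labels": {"app": "pilot", "tier": "opportunistic"}},
    {"name": "hub", "priority": 1000,
     "labels": {"app": "jupyterhub"}},
]

names = [p["name"] for p in
         eviction_candidates(pods, {"app": "pilot", "tier": "opportunistic"})]

if __name__ == "__main__":
    print(names)
```

Here `pilot-b` is spared by its opt-out label and `hub` never matches the selector, so only the two unprotected pilots are returned.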

8 of 14

opportunities

9 of 14

R&D platforms, SSLab, ...

  • Vital for IRIS-HEP and OSG-LHC
  • But also is a place to experiment with new technologies & share k8s tips & tricks
  • Next gen Tier2, Tier3
  • Lessons from Fermilab k8s-organized resources

10 of 14

Tier3

  • Deploying all services via slate/k8s - USCMS effort at ND
  • A blueprint for other Tier3's
  • Find another Tier3 to work with Kenyi

11 of 14

Federated ML platforms

  • Need to clarify the needed capabilities
  • Federating multiple clusters

12 of 14

HTCondor, OSG

Handling k8s targets - submitting pods with base pilots
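
As a rough illustration of what "submitting a pod with a base pilot" could look like on the k8s side, the sketch below builds a run-once, non-root pod manifest as a Python dict. This is an assumption about the shape of such a pod, not the HTCondor or OSG implementation; the function name, the `app: pilot` label and the image are hypothetical.

```python
import json


def pilot_pod(name, image, cpus=1, memory="2Gi"):
    """Return a manifest for a single-container pilot pod (hypothetical layout)."""
    resources = {"cpu": str(cpus), "memory": memory}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": "pilot"}},
        "spec": {
            # A pilot runs to completion; do not restart it on exit.
            "restartPolicy": "Never",
            # Refuse to start if the image would run as root.
            "securityContext": {"runAsNonRoot": True},
            "containers": [{
                "name": "pilot",
                "image": image,
                "resources": {"requests": resources, "limits": resources},
            }],
        },
    }


# The image name is a placeholder, not a real published pilot image.
manifest = pilot_pod("pilot-0001", "example.org/base-pilot:latest", cpus=4)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

Setting requests equal to limits gives the pod a predictable slot size, which is one simple way to line up a pod with the resources a pilot advertises.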

Containerizing services

Image security

Kubernetes as grid type .. HTKondor?

13 of 14

a funcX'ing cool demo

a different programming model directing tasks to agents

compose with other tools - to make an analysis

14 of 14

Help define the future of particle physics with the future of Infrastructure

  • Link
  • July 10-21, 2021
  • 1000 attendees
  • Needed:
    • datasets
    • software environments
    • mix of resources
    • easy to use platforms
      • submit, notebooks, bespoke
    • declarative, flexible & scalable infrastructure to support it all
