Pluggable Resource Management for Kubernetes

Intel Resource Management September 2022

Marlow Weston, Dr. Atanas Atanasov, Adrian Hoban

Intel Confidential

Department or Event Name

Plan for Pluggable Resource Management

Objectives & Goals:

  • Make the kubelet more like a microkernel, allowing specific custom drivers for resources such as CPU and memory
  • Minimize the kubelet codebase and its responsibilities
  • Let users request workload allocation preferences without admin intervention or kubelet restarts
  • Provide a pluggable interface so that managers are easy to deploy
  • Maintain the existing experience through the transition to a plugin architecture (E2E test compatibility)
  • Allow code reuse on other platforms, since plugins are not tied to a specific technology
  • Allow vendor-specific plugins, supported by the vendors, instead of pushing hardware-driven changes into the kubelet
  • Avoid communicating with the API server, and adding cluster traffic, when managing on-node resources (today this is done with operators or informers)
  • Offer a SIG Node-sponsored set of resource plugins
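
As an illustration of the pluggable-interface goal, the sketch below shows one way a kubelet-side registry could dispatch allocation requests to resource plugins. Everything in it is an assumption made for illustration: `ResourcePlugin`, `AllocationRequest`, `Registry`, and the toy `exclusiveCPU` plugin are hypothetical names, not an existing kubelet API or the interface proposed in the RFC.

```go
package main

import (
	"errors"
	"fmt"
)

// AllocationRequest is a hypothetical request handed from the kubelet to a plugin.
type AllocationRequest struct {
	PodUID    string
	Container string
	Resource  string // for example "cpu" or "memory"
	Quantity  int64
}

// ResourcePlugin is a hypothetical plugin contract, not an existing kubelet API.
type ResourcePlugin interface {
	Name() string
	Resources() []string
	Allocate(req AllocationRequest) (string, error)
}

// ErrNoPlugin is returned when no plugin owns the requested resource.
var ErrNoPlugin = errors.New("no plugin registered for resource")

// Registry maps each resource name to the single plugin that owns it.
type Registry struct {
	owners map[string]ResourcePlugin
}

func NewRegistry() *Registry {
	return &Registry{owners: make(map[string]ResourcePlugin)}
}

// Register claims every resource the plugin advertises, or claims none
// of them if any resource already has an owner.
func (r *Registry) Register(p ResourcePlugin) error {
	for _, res := range p.Resources() {
		if cur, ok := r.owners[res]; ok {
			return fmt.Errorf("resource %q already owned by plugin %q", res, cur.Name())
		}
	}
	for _, res := range p.Resources() {
		r.owners[res] = p
	}
	return nil
}

// Allocate forwards the request to the plugin that owns req.Resource.
func (r *Registry) Allocate(req AllocationRequest) (string, error) {
	p, ok := r.owners[req.Resource]
	if !ok {
		return "", ErrNoPlugin
	}
	return p.Allocate(req)
}

// exclusiveCPU is a toy plugin that hands out CPU IDs in ascending order.
type exclusiveCPU struct{ next, total int64 }

func (e *exclusiveCPU) Name() string        { return "example-exclusive-cpu" }
func (e *exclusiveCPU) Resources() []string { return []string{"cpu"} }

func (e *exclusiveCPU) Allocate(req AllocationRequest) (string, error) {
	if req.Quantity <= 0 || e.next+req.Quantity > e.total {
		return "", errors.New("not enough free CPUs")
	}
	first := e.next
	e.next += req.Quantity
	return fmt.Sprintf("cpus %d-%d", first, e.next-1), nil
}

func main() {
	reg := NewRegistry()
	if err := reg.Register(&exclusiveCPU{total: 8}); err != nil {
		panic(err)
	}
	out, err := reg.Allocate(AllocationRequest{PodUID: "pod-a", Container: "app", Resource: "cpu", Quantity: 2})
	fmt.Println(out, err)
}
```

In this sketch the registry itself holds no allocation policy, which matches the goal of keeping policy out of the kubelet; a real design would also need plugin discovery, state checkpointing, and topology hints, none of which are modeled here.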

Current State of Technology

  • The kubelet has five managers for resources: CPU Manager, Memory Manager, Topology Manager, Device Manager, and Dynamic Resource Allocation
  • The first three managers are not extensible, so vendors have a hard time keeping them up to date with new hardware
  • For some users, the current models hurt sustainability and total cost of ownership
  • Users build many custom solutions, all of which must turn off current features and work around them
  • CPU and memory management policies are insufficient, and changing them requires kubelet reconfiguration and a restart. In some environments, this means draining and restarting the node
  • Topology Manager does not support the complete set of topology mappings that workloads need on current and future hardware
  • The kubelet holds all the code for these managers, so it becomes bigger and harder to maintain, performance and quality suffer, and interfaces become unnecessarily large and heavy

Current Kubelet

Current Resource Manager Flow

This architecture has side effects and adds unnecessary complexity

Internals of Resource Management Plugin

Phase 1 – Topology Info Required

Phase 2 – Topology Info becomes Optional

Proposed next steps

  • Publish a KEP for pluggable resource management
  • Submit code covering the plugin mechanism and the first plugins for community discussion
  • Add gatekeeper logic that can enable or disable the pluggable RM and the current Topology Manager, CPU Manager, and Memory Manager
  • Show E2E-node compatibility of the system
  • Show the performance impact of pluggable resource management versus standard Kubernetes
  • Support the current Device Manager alongside the pluggable Resource Manager
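
The gatekeeper step could look roughly like the following sketch. It assumes that the pluggable RM and the built-in Topology, CPU, and Memory Managers are mutually exclusive on a node, which the slides do not state; `GateConfig`, `ResourceMode`, and `SelectMode` are hypothetical names, and none of these fields exist in the kubelet configuration today. The Device Manager is deliberately left out of the gate, since it is expected to keep working alongside either path.

```go
package main

import (
	"errors"
	"fmt"
)

// ResourceMode says which resource management path the node runs.
type ResourceMode int

const (
	BuiltinManagers ResourceMode = iota // current Topology, CPU and Memory Managers
	PluggableRM                         // pluggable resource management
)

// GateConfig is a hypothetical node-level configuration (assumption).
type GateConfig struct {
	EnablePluggableRM     bool
	EnableTopologyManager bool
	EnableCPUManager      bool
	EnableMemoryManager   bool
}

// ErrConflict reports that both paths were requested at the same time.
var ErrConflict = errors.New("pluggable RM cannot be enabled together with the built-in Topology, CPU or Memory Manager")

// SelectMode picks exactly one path and rejects configurations that ask for both.
func SelectMode(c GateConfig) (ResourceMode, error) {
	builtin := c.EnableTopologyManager || c.EnableCPUManager || c.EnableMemoryManager
	if c.EnablePluggableRM && builtin {
		return BuiltinManagers, ErrConflict
	}
	if c.EnablePluggableRM {
		return PluggableRM, nil
	}
	// Default: keep today's behavior so existing clusters are unaffected.
	return BuiltinManagers, nil
}

func main() {
	mode, err := SelectMode(GateConfig{EnablePluggableRM: true})
	fmt.Println(mode == PluggableRM, err)
}
```

Defaulting to the built-in managers keeps the existing experience unchanged unless the pluggable path is explicitly enabled.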

References

  • Issue created: https://github.com/kubernetes/enhancements/issues/3675
  • Initial CPU Management Use Case Doc: https://docs.google.com/document/d/1U4jjRR7kw18Rllh-xpAaNTBcPsK5jl48ZAVo7KRqkJk/edit
  • RFC for Kubelet Plugin: https://docs.google.com/document/d/1O5G4HMhfyC9AdaGai1eV5OJpCugV3vFIW19_FCuMOaY/edit?resourcekey=0-qLkKucnl3Y2wJ_WEfZPRVQ#heading=h.xgjl2srtytjt
