COSI - The Common Operating System Interface
Status | Draft
Authors | Stefano Borrelli <steve@borrelli.org>, Andrew Rynhard <andrew.rynhard@talos-systems.com>
This proposal introduces COSI[a], the Common Operating System Interface. Inspired by the Container Runtime Interface (CRI), Container Network Interface (CNI) and the Container Storage Interface (CSI), COSI will focus on the configuration of the underlying operating system, providing Protocol Buffer definitions and a gRPC API reference implementation for common system configuration settings such as DNS, network and node filesystems.
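As a purely illustrative sketch, settings of this kind could be modeled as Protocol Buffer messages along the following lines. The package, message, and field names here are hypothetical and not part of any published COSI definition:

```protobuf
// Illustrative sketch only: all names are hypothetical, not part of a
// published COSI specification.
syntax = "proto3";

package cosi.v1alpha1;

// DNSConfig models resolver settings that today are typically written
// to /etc/resolv.conf by distribution-specific tooling.
message DNSConfig {
  repeated string servers = 1;        // e.g. "10.96.0.10"
  repeated string search_domains = 2; // e.g. "cluster.local"
}

// SysctlParameter models a single kernel parameter, replacing ad hoc
// writes to /proc/sys or invocations of the sysctl binary.
message SysctlParameter {
  string key = 1;   // e.g. "net.ipv4.ip_forward"
  string value = 2; // e.g. "1"
}
```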
The tight coupling of the kubelet with the Linux operating system is an ongoing challenge for the Kubernetes platform. As part of normal operation, the kubelet must communicate with the underlying operating system to configure pod networking, request storage, write files to the filesystem, manage processes, and perform various other tasks.
Unfortunately, in most Linux systems the standard contract is often not an API, but the execution of a command line utility. This means that the node must have multiple binaries present and that the kubelet must shell out to execute these binaries, reading back unstructured output and exit codes.
An attempt to document all the kubelet dependencies was started in 2016 in issue 26093, Identify, document and maybe remove kubelet dependencies. As of October 2020, this issue is still open. Perhaps the clearest definition of Kubernetes’ Linux operating system dependencies is located in the e2e test suite; these dependencies include Linux kernel features and many individual binaries.
Another example of this coupling is KEP-2000 Graceful Node Shutdown, where the proposed implementation writes systemd files and sends a signal to logind. This proposed solution is not applicable to Kubernetes distributions like Talos and k3OS that do not use systemd.
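For contrast, under a COSI-style contract a graceful shutdown could be requested through an RPC that each conforming distribution implements on top of its own init system. A minimal sketch, with hypothetical names:

```protobuf
// Hypothetical sketch: graceful node shutdown expressed as an RPC
// rather than systemd unit files and logind signals, so that
// systemd-free distributions such as Talos and k3OS can implement it.
syntax = "proto3";

package cosi.v1alpha1;

import "google/protobuf/duration.proto";

service PowerManagement {
  // Request an orderly shutdown of the node.
  rpc Shutdown(ShutdownRequest) returns (ShutdownResponse);
}

message ShutdownRequest {
  // How long the OS should wait for workloads to terminate
  // before powering off.
  google.protobuf.Duration grace_period = 1;
}

message ShutdownResponse {}
```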
These examples highlight issues in the kubelet -> Linux contract, including:
- No stable API: the contract is the execution of command-line utilities and the parsing of their unstructured output and exit codes.
- The implicit requirement that many individual binaries and kernel features be present on every node.
- Assumptions about specific components, such as systemd and logind, that do not hold across all distributions.
Container Linux Distributions
A number of Linux distributions have emerged with a focus on containerized workloads. These distributions tend toward a minimal feature set, immutability, and management from a centralized control plane. However, they diverge from one another in their system management tools and API endpoints.
Below[b][c][d][e] is a table summarizing how various container-oriented Linux distributions are configured and managed:
| | Flatcar Container Linux | Talos | Bottlerocket | k3OS |
| --- | --- | --- | --- | --- |
| Cloud Boot Init System | Ignition, config via transpiling | | | |
| Userdata format | JSON (transpiled from YAML) | | | |
| Init System | systemd (via servicedog) | | | OpenRC (via openrc-run) |
| Network | systemd-networkd via Ignition | networkd.go [f][g][h] | | |
| API Definition | Go: config (init only) | Go: config.go (init only) | | |
| SSH Enabled | Yes | No | Optional (via Admin Container) | Yes |
| HTTP API | No | No | Yes | No |
| gRPC API | No | Yes | No | No |
| Primary programming language | Go (Ignition, config-transpiler) | Go | Rust | Go |
COSI attempts to define a standardized configuration format and API. This has the potential to simplify both the kubelet and the development of new distributions.
Goals
- Define protocol buffer definitions for configuring operating system components (network configuration, MOTD, kernel modules, sysctl, DNS servers, etc.).
- Define a mechanism by which Kubernetes control-plane and node components can communicate with arbitrary operating system daemons (‘osd’), including a plugin architecture that allows different components to be managed independently (see the sketch after this list).
- Develop a reference implementation that provides an API for managing the underlying operating system.
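As one hypothetical illustration of these goals, a generic gRPC service could let an agent discover a node's configurable components and apply typed configuration to each one without knowing how the distribution implements it. All names below are illustrative assumptions, not a proposed interface:

```protobuf
// Hypothetical plugin-facing service: each OS component (network, DNS,
// sysctl, ...) is exposed under a name, and callers apply typed
// configuration without knowing the distribution's implementation.
syntax = "proto3";

package cosi.v1alpha1;

import "google/protobuf/any.proto";

service HostConfiguration {
  // List the configurable components this node's implementation exposes.
  rpc ListComponents(ListComponentsRequest) returns (ListComponentsResponse);
  // Apply configuration to a single named component.
  rpc ApplyConfiguration(ApplyConfigurationRequest) returns (ApplyConfigurationResponse);
}

message ListComponentsRequest {}

message ListComponentsResponse {
  repeated string components = 1; // e.g. "network", "dns", "sysctl"
}

message ApplyConfigurationRequest {
  string component = 1;
  // A typed configuration message, e.g. a DNSConfig.
  google.protobuf.Any config = 2;
}

message ApplyConfigurationResponse {}
```

A design in this shape would keep the interface stable while letting each distribution register only the components it actually supports.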
Non-Goals
- Prescribe implementation details of the operating system configuration software.
Possible Solutions
Given the relative simplicity of container-based operating systems, there is limited room for differentiation. However, we see in these new operating systems the same trends that have plagued Unix and Linux for decades: incompatible APIs and configuration formats, a lack of separation between interface and implementation, and significant effort spent duplicating common functionality.
The Kubernetes community has seen enormous success in defining standard interfaces, including the Container Runtime Interface (CRI), the Container Network Interface (CNI), and the Container Storage Interface (CSI). The power of this pattern of common interfaces with pluggable backends is that it gives users a single method of control while enabling the development of a broad third-party ecosystem.
Defining an interface that container orchestration agents such as the kubelet can consume would eliminate issues like 26093, along with potential incompatibilities between versions of node utilities (find, grep, etc.).
Benefits to the Kubernetes community include a simpler kubelet, and such an interface may also enable other methods of node management, including control nodes communicating directly with worker nodes.
[a]TBH, the name reminds me of the Kubernetes Container Object Storage Interface, https://container-object-storage-interface.github.io/. ;-)
[b]It might also be good to compare how packages are managed in each container-oriented distro. For example, Flatcar does not have a package manager. That is why a bunch of apps and e2e tests fail to work on Flatcar: they simply assume an rpm or deb package manager on the distro.
[c]This is a really good point, and I would love to hear your thoughts on package management.
I avoided it because package managers manage artifacts on shared filesystems, whereas it seems to me that we should have isolated artifacts (thinking of macOS/OS X app bundles).
Anyway, I would love more discussion on this topic.
[d]I wonder if we could assume we target CRI-enabled OSes. Getting a container runtime installed on the OS seems to belong to the image build process, which might be out of scope for COSI; I'd think COSI would address runtime operations rather than configuration management.
[e]+1 on it being out of scope for the spec.
[f]Link is broken
[g]It looks like the Talos docs have been refactored and the link removed. I'll look for another link.
[h]@andrew.rynhard@talos-systems.com I added a link to the networkd.go code; if there is a better link, let me know.
[i]We would use gRPC to define an API and to make use of the extensible nature of gRPC, right? A transport like HTTP/2 should be an implementation detail.