Pods and Core Components in Kubernetes (k8s)

张育鑫(Taylor Zhang)

https://www.linkedin.com/in/yxzh/

Content

  1. Pods
    1. How to define a Pod
    2. Using a ConfigMap in a Pod
    3. Pod lifecycle and probes
    4. Pod scheduling strategies
  2. Components (next)
    • Kubernetes API server
    • Controller Manager
    • Scheduler
    • Kubelet

Pod YAML definition
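
As a rough illustration of what a Pod definition contains, the sketch below builds a minimal manifest as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. The Pod name, label, and image are assumptions chosen for the example; they do not come from the slides.

```python
import json

# Minimal Pod manifest as a plain dict. The name, label, and image are
# illustrative assumptions, not values from the original slides.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-pod", "labels": {"app": "demo"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            }
        ]
    },
}

# Serialize the manifest; the JSON text can be saved to a file and
# applied with `kubectl apply -f <file>`.
manifest_json = json.dumps(pod, indent=2)
print(manifest_json)
```

The four top-level keys (apiVersion, kind, metadata, spec) are the same for a YAML manifest; only the serialization differs.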

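How do we solve configuration distribution?

Scenario: multiple services in one cluster share a gateway. How should they be configured?

ConfigMap

  • Solves the problem of distributing configuration across a distributed system
  • Key-value format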
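
One possible shape for the shared-gateway scenario is sketched below: a single ConfigMap holds the gateway settings, and each service's Pod loads all of its keys as environment variables through envFrom. The ConfigMap name, keys, service names, and images are hypothetical.

```python
import json

# Hypothetical shared gateway settings stored as key-value pairs.
# ConfigMap values under `data` must be strings.
config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "gateway-config"},
    "data": {"GATEWAY_HOST": "gateway.internal", "GATEWAY_PORT": "8080"},
}


def pod_using_config(name, image, config_map_name):
    """Return a Pod manifest whose container loads every ConfigMap key as an env var."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "envFrom": [{"configMapRef": {"name": config_map_name}}],
                }
            ]
        },
    }


# Two hypothetical services that share the same gateway configuration.
pods = [
    pod_using_config(svc, "example/" + svc + ":latest", "gateway-config")
    for svc in ("orders", "billing")
]
print(json.dumps([config_map] + pods, indent=2))
```

Changing the gateway address then means editing one ConfigMap rather than every service's manifest.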

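Question: what about sensitive data such as database credentials?

Secret

  • Can be encrypted at rest (by default, values are only base64-encoded, not encrypted)
  • Dedicated to storing sensitive data

Secret types: https://kubernetes.io/docs/concepts/configuration/secret/#secret-types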
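
A minimal sketch of an Opaque Secret follows. Values under data must be base64-encoded, which is an encoding rather than encryption. The Secret name and the credentials are placeholders invented for the example.

```python
import base64
import json


def make_secret(name, values):
    """Build an Opaque Secret manifest; values under `data` are base64-encoded."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# Placeholder credentials, for illustration only.
secret = make_secret("db-credentials", {"username": "app", "password": "change-me"})
print(json.dumps(secret, indent=2))
```

Because anyone who can read the manifest can decode the values, access control and encryption at rest still matter.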

Container states

| State | Meaning |
| --- | --- |
| Running | The Running status indicates that a container is executing without issues. |
| Waiting | If a container is not in either the Running or Terminated state, it is Waiting. |
| Terminated | A container in the Terminated state began execution and then either ran to completion or failed for some reason. |

Container Restart Policy

  • Always
  • OnFailure
  • Never

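Container probes

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes

| Probe type | Runs for the entire container lifetime? | After failure | Usage |
| --- | --- | --- | --- |
| LivenessProbe | Yes | The container is killed | Checks whether the container is alive and running normally |
| ReadinessProbe | Yes | The Pod's endpoint is removed | Checks whether the container is ready to receive traffic and handle requests |
| StartupProbe | No | The container is killed | Checks whether the container has started successfully (liveness and readiness probes are disabled until it succeeds) |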
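
To make the three probe types concrete, here is a sketch of a container spec that declares all of them. The paths, ports, and timing values are assumptions; in particular, /healthz and /ready are hypothetical endpoints the application would have to serve.

```python
import json

# Container spec with all three probes. Paths and timings are illustrative
# assumptions; the application must actually expose these endpoints.
container = {
    "name": "web",
    "image": "nginx:1.25",
    "startupProbe": {
        "httpGet": {"path": "/healthz", "port": 80},
        "periodSeconds": 5,
        "failureThreshold": 30,
    },
    "livenessProbe": {
        "httpGet": {"path": "/healthz", "port": 80},
        "periodSeconds": 10,
    },
    "readinessProbe": {
        "httpGet": {"path": "/ready", "port": 80},
        "periodSeconds": 5,
    },
}

# The startup probe gives the application periodSeconds * failureThreshold
# seconds to come up before the container is killed.
startup = container["startupProbe"]
startup_budget_seconds = startup["periodSeconds"] * startup["failureThreshold"]

print(json.dumps(container, indent=2))
print("startup budget (s):", startup_budget_seconds)
```

A generous startup budget lets slow-starting applications boot without loosening the liveness probe that runs afterwards.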

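Probe Check Mechanisms

| Mechanism | How it checks | Success condition |
| --- | --- | --- |
| exec | Runs a command inside the container | Exit code == 0 |
| TCPSocketAction | Performs a TCP check against the container's IP address and port | A TCP connection can be established |
| HTTPGetAction | Sends an HTTP GET to the container's IP address, port, and path | Status code >= 200 and < 400 |
| grpc | Makes a gRPC call (introduced as an alpha feature in Kubernetes v1.23, stable since v1.27); the application must implement the gRPC Health Checking Protocol | Response status == SERVING |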
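
The success conditions above can be sketched as small predicates. This is a simplified model of what the kubelet checks, not its actual implementation; the function names are made up for the example.

```python
import socket


def exec_ok(exit_code):
    """exec probe: the command must exit with status 0."""
    return exit_code == 0


def tcp_ok(host, port, timeout=1.0):
    """TCP probe: succeeds if a connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_ok(status_code):
    """HTTP GET probe: any status from 200 up to, but not including, 400."""
    return 200 <= status_code < 400


def grpc_ok(serving_status):
    """gRPC probe: the health check response must report SERVING."""
    return serving_status == "SERVING"


print(http_ok(200), http_ok(302), http_ok(404))
```

Note that a redirect (3xx) counts as success for an HTTP probe, while any 4xx or 5xx response counts as failure.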

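Pod readiness

Pod with 1 container

  • Container ready -> Pod ready

Pod with multiple containers

Each row is a prerequisite; Y marks the Pod conditions that depend on it.

| Prerequisite | PodScheduled | Initialized | ContainersReady | Ready |
| --- | --- | --- | --- | --- |
| Pod scheduled | Y | Y | Y | Y |
| Init containers | | Y | Y | Y |
| Other containers | | | Y | Y |
| Readiness gates | | | | Y |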
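
The dependency chain between the four Pod conditions can be sketched as follows. This is a simplified model under the assumption that each condition requires the previous one plus one extra prerequisite; the function and parameter names are invented for the example.

```python
def pod_conditions(scheduled, init_done, containers_ready, gates_passed):
    """Derive the four Pod conditions from their prerequisites (simplified model)."""
    pod_scheduled = scheduled
    # Init containers can only complete on a scheduled Pod.
    initialized = pod_scheduled and init_done
    # Application containers start only after initialization.
    containers_ready_cond = initialized and containers_ready
    # Readiness gates are the last requirement for the Ready condition.
    ready = containers_ready_cond and gates_passed
    return {
        "PodScheduled": pod_scheduled,
        "Initialized": initialized,
        "ContainersReady": containers_ready_cond,
        "Ready": ready,
    }


# All containers are ready, but a readiness gate has not passed yet.
print(pod_conditions(True, True, True, False))
```

In this model a Pod whose containers are all ready can still be not Ready, which is exactly what readiness gates are for.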

Pod phase (lifecycle)

| Phase | Description |
| --- | --- |
| Pending | The Pod has been accepted by the Kubernetes cluster, but one or more of the containers has not been set up and made ready to run. This includes time a Pod spends waiting to be scheduled as well as the time spent downloading container images over the network. |
| Running | The Pod has been bound to a node, and all of the containers have been created. At least one container is still running, or is in the process of starting or restarting. |
| Succeeded | All containers in the Pod have terminated in success, and will not be restarted. |
| Failed | All containers in the Pod have terminated, and at least one container has terminated in failure. That is, the container either exited with non-zero status or was terminated by the system. |
| Unknown | For some reason the state of the Pod could not be obtained. This phase typically occurs due to an error in communicating with the node where the Pod should be running. |

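Pod and container state transition examples

The last three columns give the resulting Pod phase for each restartPolicy.

| Containers | Pod phase | Event | Always | OnFailure | Never |
| --- | --- | --- | --- | --- | --- |
| 1 container | Running | Container exits normally | Running | Succeeded | Succeeded |
| 1 container | Running | Container exits abnormally | Running | Running | Failed |
| 2 containers | Running | 1 container exits abnormally | Running | Running | Running |
| 2 containers | Running | Container OOM | Running | Running | Failed |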
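
The transitions above can be approximated with a small model. It is a sketch under simplifying assumptions: each container is represented only by its exit code (None while still running), an OOM kill is modeled as exit code 137 for each affected container, and restart back-off is ignored. The function names are invented for the example.

```python
def should_restart(policy, exit_code):
    """Decide whether an exited container is restarted under the given restartPolicy."""
    if policy == "Always":
        return True
    if policy == "OnFailure":
        return exit_code != 0
    if policy == "Never":
        return False
    raise ValueError("unknown restartPolicy: " + policy)


def resulting_phase(policy, exit_codes):
    """Return the Pod phase; exit_codes has one entry per container, None = still running."""
    exited = [code for code in exit_codes if code is not None]
    still_running = len(exited) < len(exit_codes)
    restarting = any(should_restart(policy, code) for code in exited)
    # The Pod stays Running while any container runs or is being restarted.
    if still_running or restarting:
        return "Running"
    # Every container has terminated for good.
    return "Succeeded" if all(code == 0 for code in exited) else "Failed"


for policy in ("Always", "OnFailure", "Never"):
    print(policy, resulting_phase(policy, [0]), resulting_phase(policy, [1]))
```

The two-container case stays Running under every policy because the surviving container keeps the Pod alive even when the failed one is not restarted.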

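Pod scheduling strategies

On which node should a Pod be created, and how? Labels & selectors

  • Deployment or ReplicationController: fully automatic scheduling
  • NodeSelector: directed scheduling
  • NodeAffinity: node affinity scheduling
  • PodAffinity: Pod affinity and anti-affinity scheduling
  • Taints & Tolerations
  • Pod Priority Preemption: priority-based Pod scheduling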

Deployment or ReplicationController: fully automatic scheduling

Automatically creates and maintains 3 Pods
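
A Deployment that keeps 3 Pods running might look like the sketch below. The names, label, and image are assumptions. The helper checks one requirement that is easy to get wrong: the selector must match the labels on the Pod template.

```python
import json

# Deployment that asks for 3 replicas. Names, label, and image are
# illustrative assumptions.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{"name": "web", "image": "nginx:1.25"}]},
        },
    },
}


def selector_matches_template(manifest):
    """Check that every selector label appears on the Pod template."""
    wanted = manifest["spec"]["selector"]["matchLabels"]
    labels = manifest["spec"]["template"]["metadata"]["labels"]
    return all(labels.get(key) == value for key, value in wanted.items())


print(json.dumps(deployment, indent=2))
print("selector matches template:", selector_matches_template(deployment))
```

The scheduler picks a node for each replica on its own; nothing in this manifest says where the Pods should run.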

NodeSelector: directed scheduling
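
A nodeSelector restricts a Pod to nodes that carry every listed label. The matching rule can be sketched as below; the node names and the disktype label are hypothetical.

```python
def node_matches_selector(node_labels, node_selector):
    """A node qualifies only if it carries every key-value pair in nodeSelector."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


# Hypothetical nodes and their labels.
nodes = {
    "node-a": {"disktype": "ssd", "kubernetes.io/hostname": "node-a"},
    "node-b": {"disktype": "hdd", "kubernetes.io/hostname": "node-b"},
}

# Equivalent to `nodeSelector: {disktype: ssd}` in a Pod spec.
node_selector = {"disktype": "ssd"}

eligible = [name for name, labels in nodes.items() if node_matches_selector(labels, node_selector)]
print(eligible)
```

If no node matches, the Pod stays Pending, because nodeSelector is a hard requirement.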

NodeAffinity: node affinity scheduling
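
Node affinity generalizes nodeSelector with operators such as In, NotIn, Exists, and DoesNotExist. The sketch below models how required terms could be evaluated: expressions inside one term are ANDed, and the terms themselves are ORed. It omits the Gt and Lt operators, and the zone and disktype values are hypothetical.

```python
def expr_matches(labels, expr):
    """Evaluate one matchExpressions entry against a node's labels."""
    key, operator = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # A node without the label also satisfies NotIn.
        return labels.get(key) not in values
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError("unsupported operator in this sketch: " + operator)


def node_matches_terms(labels, node_selector_terms):
    """Terms are ORed; the expressions inside each term are ANDed."""
    return any(
        all(expr_matches(labels, expr) for expr in term.get("matchExpressions", []))
        for term in node_selector_terms
    )


# Shape of requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms.
terms = [
    {
        "matchExpressions": [
            {"key": "topology.kubernetes.io/zone", "operator": "In", "values": ["zone-a", "zone-b"]},
            {"key": "disktype", "operator": "Exists"},
        ]
    }
]

print(node_matches_terms({"topology.kubernetes.io/zone": "zone-a", "disktype": "ssd"}, terms))
```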

Topology key

https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/

kubernetes.io/hostname

topology.kubernetes.io/region

topology.kubernetes.io/zone
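
A topology key is a node label key; nodes that share the same value for it form one topology domain. The sketch below groups nodes into domains for a given key. The node names and label values are hypothetical.

```python
from collections import defaultdict


def topology_domains(nodes, topology_key):
    """Group node names by the value of the given topology label."""
    domains = defaultdict(list)
    for name, labels in nodes.items():
        # Nodes without the label belong to no domain for this key.
        if topology_key in labels:
            domains[labels[topology_key]].append(name)
    return dict(domains)


# Hypothetical cluster with two zones.
nodes = {
    "node-1": {"kubernetes.io/hostname": "node-1", "topology.kubernetes.io/zone": "zone-a"},
    "node-2": {"kubernetes.io/hostname": "node-2", "topology.kubernetes.io/zone": "zone-a"},
    "node-3": {"kubernetes.io/hostname": "node-3", "topology.kubernetes.io/zone": "zone-b"},
}

print(topology_domains(nodes, "topology.kubernetes.io/zone"))
print(topology_domains(nodes, "kubernetes.io/hostname"))
```

With kubernetes.io/hostname every node is its own domain, while a zone key groups several nodes together, so the choice of key sets how widely a rule applies.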

PodAffinity: Pod affinity and anti-affinity scheduling
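
As an illustration, the affinity block below combines both directions: a required anti-affinity rule that keeps two Pods labeled app=web off the same node, and a preferred affinity rule that favors the zone where Pods labeled app=cache run. The labels and the weight are assumptions.

```python
import json

# `affinity` section of a Pod spec. Labels and weight are illustrative.
affinity = {
    "podAntiAffinity": {
        # Hard rule: never share a node with another app=web Pod.
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                "labelSelector": {"matchLabels": {"app": "web"}},
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    },
    "podAffinity": {
        # Soft rule: prefer the zone that already runs app=cache Pods.
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {
                "weight": 100,
                "podAffinityTerm": {
                    "labelSelector": {"matchLabels": {"app": "cache"}},
                    "topologyKey": "topology.kubernetes.io/zone",
                },
            }
        ]
    },
}

print(json.dumps({"spec": {"affinity": affinity}}, indent=2))
```

Required rules block scheduling when they cannot be met, whereas preferred rules only influence node scoring.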

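Taints & Tolerations

Scenarios

  1. A node's disk is full
    1. Temporarily taint the node (effect NoExecute) and set an eviction delay (tolerationSeconds on the Pods' tolerations); Pods are evicted when the delay expires
    2. Remove the taint once the disk has been cleaned up
  2. A node has a slow network but plenty of CPU
    • Taint the node permanently
    • Give Pods that mostly do local computation a matching toleration so they can be scheduled onto this node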
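
The matching rule between a toleration and a taint can be sketched as below, using both scenarios. This is a simplified model of the documented rules, and the taint keys under example.com are hypothetical.

```python
def tolerates(toleration, taint):
    """Return True if the toleration matches the taint (simplified model)."""
    operator = toleration.get("operator", "Equal")
    if toleration.get("key"):
        if toleration["key"] != taint["key"]:
            return False
    elif operator != "Exists":
        # An empty key is only valid together with operator Exists.
        return False
    # An empty effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


# Scenario 1: disk full, evict after a delay (hypothetical key).
disk_taint = {"key": "example.com/disk-full", "value": "true", "effect": "NoExecute"}
disk_toleration = {
    "key": "example.com/disk-full",
    "operator": "Exists",
    "effect": "NoExecute",
    "tolerationSeconds": 300,  # the Pod is evicted 300 s after the taint appears
}

# Scenario 2: slow network, permanent taint (hypothetical key).
slow_taint = {"key": "example.com/slow-network", "value": "true", "effect": "NoSchedule"}
compute_toleration = {
    "key": "example.com/slow-network",
    "operator": "Equal",
    "value": "true",
    "effect": "NoSchedule",
}

print(tolerates(disk_toleration, disk_taint), tolerates(compute_toleration, slow_taint))
```

A toleration only permits scheduling onto the tainted node; steering compute-heavy Pods there on purpose would additionally need a nodeSelector or node affinity.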

Pod Priority Preemption: priority-based Pod scheduling
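
Priorities are declared through a PriorityClass and referenced from a Pod by priorityClassName. The sketch below shows both manifests and a simplified view of how pending Pods would be ordered, with higher priority first. The class name, value, and Pod names are assumptions, and real preemption involves more than this ordering.

```python
import json

# Hypothetical PriorityClass; a larger value means higher priority.
priority_class = {
    "apiVersion": "scheduling.k8s.io/v1",
    "kind": "PriorityClass",
    "metadata": {"name": "high-priority"},
    "value": 1000000,
    "globalDefault": False,
    "description": "For latency-critical services (illustrative).",
}

# A Pod opts in by naming the class.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "payments"},
    "spec": {
        "priorityClassName": "high-priority",
        "containers": [{"name": "payments", "image": "example/payments:latest"}],
    },
}

# Simplified scheduling queue: pending Pods sorted by priority, highest first.
pending = [
    {"name": "batch-job", "priority": 0},
    {"name": "payments", "priority": 1000000},
    {"name": "web", "priority": 1000},
]
queue = [p["name"] for p in sorted(pending, key=lambda p: -p["priority"])]

print(json.dumps([priority_class, pod], indent=2))
print(queue)
```

When no node has room for a high-priority Pod, the scheduler may evict lower-priority Pods to make space, which is the preemption part of the feature.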