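Overview:
Taints are key-value attributes defined on a node that let the node repel Pods: a Pod is not scheduled onto the node unless it carries a toleration for the node's taints. Tolerations are key-value attributes defined on a Pod that declare which node taints the Pod can tolerate. The scheduler can place a Pod only on nodes whose taints the Pod tolerates, as shown in the figure.
- Whether a Pod can be scheduled onto a node depends on:
  - whether the node has any taints;
  - if it does, whether the Pod tolerates those taints.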
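Taints and tolerations
Taints are defined in the node's nodeSpec and tolerations in the Pod's podSpec. Both are key-value data, and both additionally support an effect marker. The syntax is key=value:effect, where key and value follow usage and format rules similar to those of resource annotations, and effect defines how strongly the node repels Pods. There are three effects: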
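- NoSchedule
  New Pods that do not tolerate this taint cannot be scheduled onto the node. This is a hard constraint. Pods already running on the node are unaffected.
- PreferNoSchedule
  The soft version of NoSchedule: new Pods that do not tolerate this taint should preferably not be scheduled onto the node, but the node may still accept them when no other node is available. Pods already running on the node are unaffected.
- NoExecute
  New Pods that do not tolerate this taint cannot be scheduled onto the node. This is a hard constraint, and it also applies to running Pods: when a change to the node's taints or to a Pod's tolerations means the Pod no longer matches, the Pod is evicted.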
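The key=value:effect syntax can be illustrated with a small parser. This is only a sketch of the string format described here, not kubectl's own parsing code; the names Taint, parse_taint and VALID_EFFECTS are hypothetical and exist only for this example. It assumes the value part is optional, which matches the key:effect form shown by kubectl describe for the master taint.

```python
from typing import NamedTuple

# The three effects a taint may carry.
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


class Taint(NamedTuple):
    key: str
    value: str   # empty string when the taint has no value
    effect: str


def parse_taint(spec: str) -> Taint:
    """Parse 'key=value:effect' or 'key:effect' into a Taint (illustrative only)."""
    # The effect is whatever follows the last colon.
    body, sep, effect = spec.rpartition(":")
    if not sep or effect not in VALID_EFFECTS:
        raise ValueError(f"invalid or missing effect in {spec!r}")
    # The value is optional, so the '=' may be absent.
    key, _, value = body.partition("=")
    if not key:
        raise ValueError(f"missing key in {spec!r}")
    return Taint(key, value, effect)


print(parse_taint("diskfull=true:NoExecute"))
print(parse_taint("node-role.kubernetes.io/master:NoSchedule"))
```

The second call yields an empty value, which is why such a taint is usually tolerated with the Exists operator.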
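A toleration defined on a Pod supports two operators. Equal is an equality comparison: the toleration and the taint must match exactly on key, value, and effect. Exists is an existence check: the key and effect of the two must match exactly, and the toleration's value field must be left empty.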
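The difference between the two operators can be modelled in a few lines. This is a minimal sketch, not the scheduler's implementation: the Taint and Toleration classes and the tolerates function are hypothetical names for this example. It additionally assumes two behaviours of the Kubernetes API: operator defaults to Equal when omitted, and a toleration with an empty effect matches every effect for its key. The empty-key wildcard is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"  # assumed default when the field is omitted
    value: str = ""
    effect: str = ""         # assumed: empty effect matches any effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if tol.key != taint.key:
        return False
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        return True                      # the value is not compared
    if tol.operator == "Equal":
        return tol.value == taint.value  # the value must match as well
    raise ValueError(f"unknown operator {tol.operator!r}")


master = Taint("node-role.kubernetes.io/master", "", "NoSchedule")
diskfull = Taint("diskfull", "true", "NoExecute")

print(tolerates(Toleration(master.key, "Exists", effect="NoSchedule"), master))  # True
print(tolerates(Toleration("diskfull", "Equal", "false", "NoExecute"), diskfull))  # False
```

The first toleration mirrors the one used in the DaemonSet example in this article; the second fails because Equal also compares the value.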
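Pod scheduling logic
A node can carry multiple taints and a Pod can carry multiple tolerations. When the two are checked against each other, the following logic applies.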
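- First, set aside every taint that has a matching toleration.
- Among the unmatched taints, if any has the NoSchedule effect, the Pod is not scheduled onto the node.
- If none of the unmatched taints has the NoSchedule effect but at least one has PreferNoSchedule, the scheduler tries to avoid placing the Pod on the node.
- If at least one unmatched taint has the NoExecute effect, the Pod is evicted immediately if it is already running on the node, and otherwise it is not scheduled onto the node. In addition, even when a toleration matches a NoExecute taint, if that toleration also sets tolerationSeconds, the Pod is evicted once the time limit expires.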
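The rules above can be sketched as a small decision function. This is a simplified model for illustration, not the scheduler's or the node controller's code. All names (Taint, Toleration, tolerates, evaluate, eviction_delay) and the result strings are hypothetical. It assumes that an empty toleration effect matches any effect, that any operator other than Exists behaves like Equal, and that the first matching toleration decides the tolerationSeconds limit for a taint.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"
    value: str = ""
    effect: str = ""
    toleration_seconds: Optional[int] = None  # None: tolerate without a time limit


def tolerates(tol: Toleration, taint: Taint) -> bool:
    if tol.key != taint.key or (tol.effect and tol.effect != taint.effect):
        return False
    # Simplification: anything other than Exists is treated as Equal.
    return tol.operator == "Exists" or tol.value == taint.value


def evaluate(taints: List[Taint], tolerations: List[Toleration]) -> str:
    """Classify a node for a Pod based on the taints left unmatched."""
    unmatched = {t.effect for t in taints
                 if not any(tolerates(tol, t) for tol in tolerations)}
    if "NoExecute" in unmatched:
        return "evict-or-reject"   # evict if running, otherwise do not schedule
    if "NoSchedule" in unmatched:
        return "reject"
    if "PreferNoSchedule" in unmatched:
        return "avoid"             # soft constraint
    return "allow"


def eviction_delay(taints: List[Taint], tolerations: List[Toleration]) -> Optional[int]:
    """Seconds a running Pod may stay: 0 = evict now, None = may stay indefinitely."""
    delays = []
    for taint in taints:
        if taint.effect != "NoExecute":
            continue
        match = next((tol for tol in tolerations if tolerates(tol, taint)), None)
        if match is None:
            return 0
        if match.toleration_seconds is not None:
            delays.append(match.toleration_seconds)
    return min(delays) if delays else None


master = Taint("node-role.kubernetes.io/master", "", "NoSchedule")
diskfull = Taint("diskfull", "true", "NoExecute")
timed = Toleration("diskfull", "Exists", effect="NoExecute", toleration_seconds=60)

print(evaluate([master], []))              # reject
print(evaluate([diskfull], []))            # evict-or-reject
print(eviction_delay([diskfull], [timed])) # 60
```

Checking NoExecute before NoSchedule is a choice made for this sketch: an unmatched NoExecute taint both blocks scheduling and evicts running Pods, so it is the strongest outcome.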
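In a Kubernetes cluster deployed with kubeadm, the master node is tainted automatically to keep Pods that do not tolerate the taint off it. Pods that users create without deliberately adding a toleration for this taint are therefore not scheduled onto the master.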
Example 1: scheduling Pods onto the master by tolerating its master:NoSchedule taint

    [root@k8s-master Scheduler]# kubectl describe node k8s-master.org    # check the master's taint and effect
    ...
    Taints:             node-role.kubernetes.io/master:NoSchedule
    Unschedulable:      false

    [root@k8s-master Scheduler]# cat tolerations-daemonset-demo.yaml
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: daemonset-demo
      namespace: default
      labels:
        app: prometheus
        component: node-exporter
    spec:
      selector:
        matchLabels:
          app: prometheus
          component: node-exporter
      template:
        metadata:
          name: prometheus-node-exporter
          labels:
            app: prometheus
            component: node-exporter
        spec:
          tolerations:                             # tolerate the master's NoSchedule taint
          - key: node-role.kubernetes.io/master    # taint key
            effect: NoSchedule                     # effect
            operator: Exists                       # the key only has to exist
          containers:
          - image: prom/node-exporter:latest
            name: prometheus-node-exporter
            ports:
            - name: prom-node-exp
              containerPort: 9100
              hostPort: 9100

    [root@k8s-master Scheduler]# kubectl apply -f tolerations-daemonset-demo.yaml
    [root@k8s-master Scheduler]# kubectl get pod -o wide
    NAME                   READY   STATUS    RESTARTS   AGE     IP              NODE             NOMINATED NODE   READINESS GATES
    daemonset-demo-7fgnd   2/2     Running   0          5m15s   10.244.91.106   k8s-node2.org    <none>           <none>
    daemonset-demo-dmd47   2/2     Running   0          5m15s   10.244.70.105   k8s-node1.org    <none>           <none>
    daemonset-demo-jhzwf   2/2     Running   0          5m15s   10.244.42.29    k8s-node3.org    <none>           <none>
    daemonset-demo-rcjmv   2/2     Running   0          5m15s   10.244.59.16    k8s-master.org   <none>           <none>
Example 2: adding a taint with the NoExecute effect to a node to evict all of its Pods

    [root@k8s-master Scheduler]# kubectl taint --help
    Update the taints on one or more nodes.

      * A taint consists of a key, value, and effect. As an argument here, it is expressed as key=value:effect.
      * The key must begin with a letter or number, and may contain letters, numbers, hyphens, dots, and underscores, up to 253 characters.
      * Optionally, the key can begin with a DNS subdomain prefix and a single '/', like example.com/my-app
      * The value is optional. If given, it must begin with a letter or number, and may contain letters, numbers, hyphens, dots, and underscores, up to 63 characters.
      * The effect must be NoSchedule, PreferNoSchedule or NoExecute.
      * Currently taint can only apply to node.

    Examples:
      # Update node 'foo' with a taint with key 'dedicated' and value 'special-user' and effect 'NoSchedule'.
      # If a taint with that key and effect already exists, its value is replaced as specified.
      kubectl taint nodes foo dedicated=special-user:NoSchedule

      # Remove from node 'foo' the taint with key 'dedicated' and effect 'NoSchedule' if one exists.
      kubectl taint nodes foo dedicated:NoSchedule-

      # Remove from node 'foo' all the taints with key 'dedicated'
      kubectl taint nodes foo dedicated-

      # Add a taint with key 'dedicated' on nodes having label mylabel=X
      kubectl taint node -l myLabel=X dedicated=foo:PreferNoSchedule

      # Add to node 'foo' a taint with key 'bar' and no value
      kubectl taint nodes foo bar:NoSchedule

    [root@k8s-master Scheduler]# kubectl get pod -o wide
    NAME                                         READY   STATUS    RESTARTS   AGE   IP               NODE         NOMINATED NODE   READINESS GATES
    daemonset-demo-7ghhd                         1/1     Running   0          23m   192.168.113.35   k8s-node1    <none>           <none>
    daemonset-demo-cjxd5                         1/1     Running   0          23m   192.168.12.35    k8s-node2    <none>           <none>
    daemonset-demo-lhng4                         1/1     Running   0          23m   192.168.237.4    k8s-master   <none>           <none>
    daemonset-demo-x5nhg                         1/1     Running   0          23m   192.168.51.54    k8s-node3    <none>           <none>
    pod-antiaffinity-required-697f7d764d-69vx4   0/1     Pending   0          8s    <none>           <none>       <none>           <none>
    pod-antiaffinity-required-697f7d764d-7cxp2   1/1     Running   0          8s    192.168.51.55    k8s-node3    <none>           <none>
    pod-antiaffinity-required-697f7d764d-rpb5r   1/1     Running   0          8s    192.168.12.36    k8s-node2    <none>           <none>
    pod-antiaffinity-required-697f7d764d-vf2x8   1/1     Running   0          8s    192.168.113.36   k8s-node1    <none>           <none>
- Taint k8s-node3 with the NoExecute effect to evict all Pods on the node

    [root@k8s-master Scheduler]# kubectl taint node k8s-node3 diskfull=true:NoExecute
    node/k8s-node3 tainted
    [root@k8s-master Scheduler]# kubectl describe node k8s-node3
    ...
    CreationTimestamp:  Sun, 29 Aug 2021 22:45:43 +0800
    Taints:             diskfull=true:NoExecute
- All Pods on k8s-node3 have been evicted. Because the pod-antiaffinity-required Pods are defined so that only one Pod of the same type can run on each node, the replacement Pod stays Pending and is not created on another node

    [root@k8s-master Scheduler]# kubectl get pod -o wide
    NAME                                         READY   STATUS    RESTARTS   AGE     IP               NODE         NOMINATED NODE   READINESS GATES
    daemonset-demo-7ghhd                         1/1     Running   0          31m     192.168.113.35   k8s-node1    <none>           <none>
    daemonset-demo-cjxd5                         1/1     Running   0          31m     192.168.12.35    k8s-node2    <none>           <none>
    daemonset-demo-lhng4                         1/1     Running   0          31m     192.168.237.4    k8s-master   <none>           <none>
    pod-antiaffinity-required-697f7d764d-69vx4   0/1     Pending   0          7m45s   <none>           <none>       <none>           <none>
    pod-antiaffinity-required-697f7d764d-l86td   0/1     Pending   0          6m5s    <none>           <none>       <none>           <none>
    pod-antiaffinity-required-697f7d764d-rpb5r   1/1     Running   0          7m45s   192.168.12.36    k8s-node2    <none>           <none>
    pod-antiaffinity-required-697f7d764d-vf2x8   1/1     Running   0          7m45s   192.168.113.36   k8s-node1    <none>           <none>
- Remove the taint; Pods are created on the node again

    [root@k8s-master Scheduler]# kubectl taint node k8s-node3 diskfull-
    node/k8s-node3 untainted
    [root@k8s-master Scheduler]# kubectl get pod -o wide
    NAME                                         READY   STATUS              RESTARTS   AGE    IP               NODE         NOMINATED NODE   READINESS GATES
    daemonset-demo-7ghhd                         1/1     Running             0          34m    192.168.113.35   k8s-node1    <none>           <none>
    daemonset-demo-cjxd5                         1/1     Running             0          34m    192.168.12.35    k8s-node2    <none>           <none>
    daemonset-demo-lhng4                         1/1     Running             0          34m    192.168.237.4    k8s-master   <none>           <none>
    daemonset-demo-m6g26                         0/1     ContainerCreating   0          4s     <none>           k8s-node3    <none>           <none>
    pod-antiaffinity-required-697f7d764d-69vx4   0/1     ContainerCreating   0          10m    <none>           k8s-node3    <none>           <none>
    pod-antiaffinity-required-697f7d764d-l86td   0/1     Pending             0          9m1s   <none>           <none>       <none>           <none>
    pod-antiaffinity-required-697f7d764d-rpb5r   1/1     Running             0          10m    192.168.12.36    k8s-node2    <none>           <none>
    pod-antiaffinity-required-697f7d764d-vf2x8   1/1     Running             0          10m    192.168.113.36   k8s-node1    <none>           <none>
References:
https://www.cnblogs.com/ssgee...