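Background
Application developers needed to change kernel parameters for their pods. The parameters in question are classed as unsafe sysctls, so they only work on nodes whose kubelet lists them in --allowed-unsafe-sysctls, and the pods must also be scheduled onto those modified nodes.
What happens if you forget to set node affinity or a nodeSelector and edit the Deployment directly? The experiment below reproduces the problem.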
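For comparison, the intended setup pairs the kubelet flag with a node label and a matching nodeSelector. The following is a minimal sketch of that pairing, not a prescribed procedure: the label key and value (example.com/unsafe-sysctls=allowed) and the function names are hypothetical, and it assumes the kubelet on the labelled node was already restarted with --allowed-unsafe-sysctls=net.core.somaxconn.

```shell
# Hypothetical label used to mark nodes whose kubelet allows the unsafe sysctl.
SYSCTL_LABEL_KEY="example.com/unsafe-sysctls"
SYSCTL_LABEL_VALUE="allowed"

# Label a node whose kubelet runs with
#   --allowed-unsafe-sysctls=net.core.somaxconn
label_sysctl_node() {
  kubectl label node "$1" "${SYSCTL_LABEL_KEY}=${SYSCTL_LABEL_VALUE}"
}

# Add a matching nodeSelector to the pod template of the given Deployment,
# so its pods are only scheduled onto labelled nodes.
pin_deployment() {
  kubectl patch deployment "$1" --type merge -p \
    "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"${SYSCTL_LABEL_KEY}\":\"${SYSCTL_LABEL_VALUE}\"}}}}}"
}
```

Usage would look like `label_sysctl_node <node-name>` followed by `pin_deployment nginx`. Node affinity would work equally well; nodeSelector is used here only because it is the shorter of the two.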
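Experiment
Since k8s 1.12 the sysctls feature has been beta and enabled by default, which lets users set kernel parameters in a pod's securityContext. The Deployment below sets net.core.somaxconn, which is not in the safe set, and has no nodeSelector: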
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      securityContext:
        sysctls:
        - name: net.core.somaxconn
          value: "1024"
      containers:
      - name: nginx
        image: nginx
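Five minutes after creating the Deployment, the cluster had created more than a thousand pods. The kubelet rejects each pod with SysctlForbidden and the pod is marked Failed, so the ReplicaSet controller keeps creating replacements: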
$ kubectl get pods
NAME                     READY   STATUS            RESTARTS   AGE
nginx-7fbbcfcc7d-4gmrg   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-6dfpm   0/1     SysctlForbidden   0          17s
nginx-7fbbcfcc7d-6jkdn   0/1     SysctlForbidden   0          14s
nginx-7fbbcfcc7d-6mf6z   0/1     SysctlForbidden   0          16s
nginx-7fbbcfcc7d-6p2hs   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-cd759   0/1     SysctlForbidden   0          12s
nginx-7fbbcfcc7d-ckqbl   0/1     SysctlForbidden   0          16s
nginx-7fbbcfcc7d-gtvq4   0/1     SysctlForbidden   0          16s
nginx-7fbbcfcc7d-jbv2p   0/1     SysctlForbidden   0          18s
nginx-7fbbcfcc7d-jdh84   0/1     SysctlForbidden   0          18s
nginx-7fbbcfcc7d-kmd9p   0/1     SysctlForbidden   0          20s
nginx-7fbbcfcc7d-lcp6k   0/1     SysctlForbidden   0          15s
nginx-7fbbcfcc7d-lsdlx   0/1     SysctlForbidden   0          15s
nginx-7fbbcfcc7d-mbd74   0/1     SysctlForbidden   0          19s
nginx-7fbbcfcc7d-mbjnf   0/1     SysctlForbidden   0          18s
nginx-7fbbcfcc7d-mmbj7   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-n2ndn   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-rhjmp   0/1     SysctlForbidden   0          14s
nginx-7fbbcfcc7d-rznhl   0/1     SysctlForbidden   0          13s
nginx-7fbbcfcc7d-sfrl9   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-t9bkk   0/1     SysctlForbidden   0          19s
nginx-7fbbcfcc7d-vd6x8   0/1     SysctlForbidden   0          17s
nginx-7fbbcfcc7d-vt2jh   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-w4l7n   0/1     SysctlForbidden   0          20s
nginx-7fbbcfcc7d-w5sgq   0/1     SysctlForbidden   0          14s
nginx-7fbbcfcc7d-wlf2c   0/1     SysctlForbidden   0          13s
nginx-7fbbcfcc7d-xh22t   0/1     SysctlForbidden   0          21s
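With this many pods, a per-status tally is easier to read than the raw list. A minimal sketch, assuming the default `kubectl get pods` column layout (STATUS in the third column); the function name count_by_status is made up for illustration:

```shell
# Reads `kubectl get pods` output on stdin and prints "<status> <count>"
# for every distinct value in the STATUS column, skipping the header row.
count_by_status() {
  awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}
```

It can be fed directly from the cluster, for example `kubectl get pods -l app=nginx | count_by_status`.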
Solution
Scale the Deployment to zero so that no more pods are created, then delete the failed pods:
kubectl scale deployment --replicas=0 nginx
kubectl delete pods -l app=nginx
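Summary
Before setting kernel parameters on a Deployment, create a temporary pod to verify the setting first, so that a bad change does not produce a flood of failed pods.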
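One way to do that check is sketched below, under stated assumptions: the pod name sysctl-test, the function name verify_sysctl and the VERIFY_WAIT variable are all hypothetical, and the fixed wait is a crude stand-in for proper polling. It relies on the kubelet setting the pod's status.reason to SysctlForbidden when it rejects the pod, which is what the STATUS column above reflects.

```shell
# Create a throwaway pod that requests one sysctl, wait briefly, and report
# whether the kubelet rejected it. Returns 1 on SysctlForbidden, 0 otherwise.
verify_sysctl() {
  vs_name="$1"
  vs_value="$2"
  kubectl apply -f - >/dev/null <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-test
spec:
  restartPolicy: Never
  securityContext:
    sysctls:
    - name: ${vs_name}
      value: "${vs_value}"
  containers:
  - name: test
    image: nginx
EOF
  # Give the scheduler and kubelet time to act (seconds; hypothetical knob).
  sleep "${VERIFY_WAIT:-10}"
  vs_reason="$(kubectl get pod sysctl-test -o jsonpath='{.status.reason}')"
  # Clean up the test pod without blocking on its termination.
  kubectl delete pod sysctl-test --wait=false >/dev/null
  if [ "$vs_reason" = "SysctlForbidden" ]; then
    echo "rejected: ${vs_name}"
    return 1
  fi
  echo "no SysctlForbidden seen: ${vs_name}"
}
```

For example, `verify_sysctl net.core.somaxconn 1024`. The result only describes the node the test pod happened to land on, so the test pod should carry the same nodeSelector or affinity that the Deployment will use.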