Background:
Following https://www.yuque.com/duiniwukenaihe/ehb02i/kdvrku I upgraded the cluster from 1.16.15 to 1.17.17. This time I am upgrading to 1.18.
Cluster configuration
Hostname | OS | IP |
---|---|---|
k8s-vip | slb | 10.0.0.37 |
k8s-master-01 | centos7 | 10.0.0.41 |
k8s-master-02 | centos7 | 10.0.0.34 |
k8s-master-03 | centos7 | 10.0.0.26 |
k8s-node-01 | centos7 | 10.0.0.36 |
k8s-node-02 | centos7 | 10.0.0.83 |
k8s-node-03 | centos7 | 10.0.0.40 |
k8s-node-04 | centos7 | 10.0.0.49 |
k8s-node-05 | centos7 | 10.0.0.45 |
k8s-node-06 | centos7 | 10.0.0.18 |
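1. Consult the official documentation
See: https://kubernetes.io/zh/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
https://v1-17.docs.kubernetes.io/zh/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
The v1-17 page covers upgrading a kubeadm-created Kubernetes cluster from 1.16.x to 1.17.x, and from 1.17.x to 1.17.y where y > x. Now I am continuing on to 1.18. I had misunderstood one thing about the 1.16 to 1.17 upgrade: I thought 1.16.15 could only go to 1.17.15 first. After reading the docs carefully, there is no such rule. So I am going from 1.17.17 straight to the latest 1.18 release!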
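The rule described above (stay within a minor release, or move up exactly one minor) can be sketched as a small shell check. This is only an illustration: `minor_of` and `check_upgrade_step` are hypothetical helper names, the sketch assumes both versions share major version 1, and it does not compare patch levels.

```shell
# Hypothetical helpers, not part of kubeadm.

minor_of() {
  # Strip an optional leading "v", then print the second dot-separated field.
  local v="${1#v}"
  echo "$v" | cut -d. -f2
}

check_upgrade_step() {
  # $1 = current version, $2 = target version (for example 1.17.17 and 1.18.20).
  local from_minor to_minor diff
  from_minor="$(minor_of "$1")"
  to_minor="$(minor_of "$2")"
  diff=$(( to_minor - from_minor ))
  # Same minor (patch upgrade) or exactly one minor up counts as a valid step.
  if [ "$diff" -eq 0 ] || [ "$diff" -eq 1 ]; then
    echo "ok"
  else
    echo "unsupported"
  fi
}
```

With these definitions, `check_upgrade_step 1.17.17 1.18.20` prints `ok`, while `check_upgrade_step 1.16.15 1.18.20` prints `unsupported` because it skips a minor release.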
2. Confirm the available versions and the upgrade plan
yum list --showduplicates kubeadm --disableexcludes=kubernetes
The command above shows that the latest 1.18 release at the moment is 1.18.20-0. There are three master nodes, and as usual I upgrade k8s-master-03 first.
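Instead of scanning the listing by eye, the newest patch release of a minor version can be filtered out of it. A minimal sketch, assuming the usual three-column `yum list` layout (package, version, repo) and a `sort` that supports `-V` (GNU coreutils does); `latest_patch` is a hypothetical helper name.

```shell
# Reads `yum list --showduplicates kubeadm` output on stdin and prints the
# highest package version whose version column starts with "<minor>.".
latest_patch() {
  local minor="$1"
  # Column 2 holds the version-release string, e.g. 1.18.20-0.
  # Header lines have no matching second column and are skipped.
  awk -v m="$minor" 'index($2, m ".") == 1 { print $2 }' | sort -V | tail -n 1
}
```

Usage would be `yum list --showduplicates kubeadm --disableexcludes=kubernetes | latest_patch 1.18`.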
3. Upgrade the control plane on k8s-master-03
Run on k8s-master-03:
1. Upgrade the Kubernetes packages with yum
yum install kubeadm-1.18.20-0 kubelet-1.18.20-0 kubectl-1.18.20-0 --disableexcludes=kubernetes
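2. Drain the node and check whether the cluster can be upgraded
I drained the node on purpose this time (I skipped it for the 1.16.15 to 1.17.17 upgrade, so treat this as a refresher on the drain command).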
kubectl drain k8s-master-03 --ignore-daemonsets
sudo kubeadm upgrade plan
3. Upgrade to 1.18.20
1. A small hiccup
OK, run the upgrade to 1.18.20:
kubeadm upgrade apply 1.18.20
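Hmm, one worker node had never been upgraded and was still on 1.16.15, haha, so kubeadm flagged it. The master nodes should only be backward compatible by one minor version, so first upgrade the test-ubuntu-01 node to 1.17.17.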
[root@k8s-master-01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master-01 Ready master 317d v1.17.17
k8s-master-02 Ready master 317d v1.17.17
k8s-master-03 Ready,SchedulingDisabled master 317d v1.17.17
k8s-node-01 Ready node 569d v1.17.17
k8s-node-02 Ready node 22d v1.17.17
k8s-node-03 Ready node 569d v1.17.17
k8s-node-04 Ready node 567d v1.17.17
k8s-node-05 Ready node 567d v1.17.17
k8s-node-06 Ready node 212d v1.17.17
sh02-node-01 Ready node 16d v1.17.17
test-ubuntu-01 Ready,SchedulingDisabled <none> 22d v1.16.15
tm-node-002 Ready,SchedulingDisabled node 174d v1.17.17
tm-node-003 Ready <none> 119d v1.17.17
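A straggler like test-ubuntu-01 is easy to miss in a long listing. The sketch below filters the same output for nodes whose version differs from the expected one. `lagging_nodes` is a hypothetical helper name, and the sketch assumes the default `kubectl get nodes` layout with a header row and VERSION as the last column.

```shell
# Reads `kubectl get nodes` output on stdin and prints "<name> <version>"
# for every node whose VERSION column is not the expected kubelet version.
lagging_nodes() {
  local want="$1"
  # NR > 1 skips the header row; $NF is the last column (VERSION).
  awk -v want="$want" 'NR > 1 && $NF != want { print $1, $NF }'
}
```

Usage would be `kubectl get nodes | lagging_nodes v1.17.17`. Empty output means every node reports the expected version. Do not combine it with `--no-headers`, since the first row is always skipped.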
Upgrade test-ubuntu-01 first, as follows (run these on the test-ubuntu-01 node itself):
sudo apt-get install -y kubelet=1.17.17-00 kubectl=1.17.17-00 kubeadm=1.17.17-00
sudo kubeadm upgrade node
sudo systemctl daemon-reload
sudo systemctl restart kubelet
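Log in to any master node and confirm that every node is now on 1.17.17 (normally you would just check from k8s-master-03, but I had several xshell windows open and used the 01 node):
2. Upgrade to 1.18.20
Continue the upgrade on k8s-master-03: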
kubeadm upgrade apply v1.18.20
3. Restart kubelet and uncordon the node
[root@k8s-master-03 ~]# sudo systemctl daemon-reload
[root@k8s-master-03 ~]# sudo systemctl restart kubelet
[root@k8s-master-03 ~]# kubectl uncordon k8s-master-03
node/k8s-master-03 uncordoned
4. Upgrade the other control plane nodes (k8s-master-01 k8s-master-02)
Run the following on both k8s-master-01 and k8s-master-02 (I did not drain the nodes here; that is up to you):
yum install kubeadm-1.18.20-0 kubelet-1.18.20-0 kubectl-1.18.20-0 --disableexcludes=kubernetes
kubeadm upgrade node
systemctl daemon-reload
sudo systemctl restart kubelet
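5. Upgrade the worker nodes
I did not drain the nodes (for a routine upgrade it is of course better to drain each node first) and upgraded them directly, as follows: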
yum install kubeadm-1.18.20-0 kubelet-1.18.20-0 kubectl-1.18.20-0 --disableexcludes=kubernetes
kubeadm upgrade node
systemctl daemon-reload
sudo systemctl restart kubelet
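For a routine upgrade that does drain each worker, the per-node sequence could look like the sketch below. It only prints the commands and does not run them. `worker_upgrade_steps` is a hypothetical helper, and the drain and uncordon lines mirror the ones used for k8s-master-03 earlier; this is an assumption about how the worker flow would look, not something that was run here.

```shell
# Prints the command sequence for upgrading one CentOS worker node.
# $1 = node name as shown by kubectl, $2 = package version such as 1.18.20-0.
worker_upgrade_steps() {
  local node="$1" pkg_ver="$2"
  cat <<EOF
# on a machine with kubectl access
kubectl drain ${node} --ignore-daemonsets
# on ${node}
yum install kubeadm-${pkg_ver} kubelet-${pkg_ver} kubectl-${pkg_ver} --disableexcludes=kubernetes
kubeadm upgrade node
systemctl daemon-reload
systemctl restart kubelet
# on a machine with kubectl access
kubectl uncordon ${node}
EOF
}
```

Calling `worker_upgrade_steps k8s-node-01 1.18.20-0` prints the checklist for that node, which can be reviewed before anything is executed.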
6. Verify the upgrade
kubectl get nodes
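Note: ignore test-ubuntu-01.
At a glance the upgrade succeeded. Looking at the system components under kube-system, though, I found that the controller-manager's clusterrole system:kube-controller-manager had permission problems again.
Same fix as for the 1.16.15 to 1.17.17 upgrade: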
kubectl get clusterrole system:kube-controller-manager -o yaml > 1.yaml
kubectl delete clusterrole system:kube-controller-manager
cat <<EOF > kube-controller-manager.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: "2021-03-22T11:29:59Z"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-controller-manager
  resourceVersion: "92"
  uid: 7480dabb-ec0d-4169-bdbd-418d178e2751
rules:
- apiGroups:
  - ""
  - events.k8s.io
  resources:
  - events
  verbs:
  - create
  - patch
  - update
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - create
- apiGroups:
  - coordination.k8s.io
  resourceNames:
  - kube-controller-manager
  resources:
  - leases
  verbs:
  - get
  - update
- apiGroups:
  - ""
  resources:
  - endpoints
  verbs:
  - create
- apiGroups:
  - ""
  resourceNames:
  - kube-controller-manager
  resources:
  - endpoints
  verbs:
  - get
  - update
- apiGroups:
  - ""
  resources:
  - secrets
  - serviceaccounts
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - delete
- apiGroups:
  - ""
  resources:
  - configmaps
  - namespaces
  - secrets
  - serviceaccounts
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - secrets
  - serviceaccounts
  verbs:
  - update
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - serviceaccounts/token
  verbs:
  - create
EOF
kubectl apply -f kube-controller-manager.yaml
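The manifest above still carries `creationTimestamp`, `resourceVersion`, and `uid`, which the API server populates itself. When reusing an exported object as a template, a filter like the one below can drop them first. This is a sketch and not what was run here: `strip_server_fields` is a hypothetical helper, and it matches on field names line by line rather than parsing the YAML.

```shell
# Reads a manifest on stdin and drops server-populated metadata lines.
# Line-based filter: it removes any line whose key is one of the listed
# field names, regardless of indentation.
strip_server_fields() {
  grep -vE '^[[:space:]]*(creationTimestamp|resourceVersion|uid|selfLink):'
}
```

Usage would be `strip_server_fields < 1.yaml > clean.yaml`, where `clean.yaml` is a placeholder file name.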
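It looks fine now. The screenshots of the process are gone: the power went out before I had saved what I wrote last night.
flannel has no problems this time, so there is nothing to look at there. Prometheus is still broken, but I am ignoring that for now because I plan to keep upgrading the cluster. I will sort out Prometheus after the next upgrade.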