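Introduction:
Expanding a cloud disk data volume involves the following parts:
- Expanding the physical capacity of the cloud disk, which is done in the cloud disk console;
- Expanding the file system, which requires attaching the cloud disk to a node and doing the work manually;
- Updating the PV and PVC size, which requires updating the StorageClass and the PVC;
Note: take a snapshot of the cloud disk before expanding it, to guard against data loss if the expansion fails;
A cloud disk volume currently cannot be expanded while it is in use; the application has to be restarted. There are two ways to do this. The following uses a Zookeeper cluster deployed in the Kubernetes cluster as an example to describe both. The Zookeeper cluster looks like this: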
# kubectl get pod
NAME READY STATUS RESTARTS AGE
zookeeper-default-zookeeper-0 1/1 Running 0 2m55s
zookeeper-default-zookeeper-1 1/1 Running 0 2m14s
zookeeper-default-zookeeper-2 1/1 Running 0 94s
# kubectl get pvc| grep zoo
datadir-zookeeper-default-zookeeper-0 Bound d-8vb5teafaoa80ia7affg 20Gi RWO alicloud-disk-efficiency 3m12s
datadir-zookeeper-default-zookeeper-1 Bound d-8vb60faf6epslbctnzka 20Gi RWO alicloud-disk-efficiency 2m31s
datadir-zookeeper-default-zookeeper-2 Bound d-8vbidmq57w4df6k84zem 20Gi RWO alicloud-disk-efficiency 111s
# kubectl get pv| grep zoo
d-8vb5teafaoa80ia7affg 20Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-0 alicloud-disk-efficiency 3m17s
d-8vb60faf6epslbctnzka 20Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-1 alicloud-disk-efficiency 2m33s
d-8vbidmq57w4df6k84zem 20Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-2 alicloud-disk-efficiency 107s
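Stopping the application:
If the application can tolerate downtime, you can stop (delete) it first, manually expand each data disk it depends on, and then start it again. The drawback is that the application is unavailable for a while;
The application above uses three 20Gi cloud disks, one mounted in each of the 3 pods. The goal is to expand all three disks to 30Gi. The main steps are:
- Delete the application workload;
- Expand the cloud disks online in the cloud disk console;
- Attach each cloud disk to a node and expand its file system;
- Update the Size of the PVs and PVCs;
- Restart the application;
1. Delete the application:
Delete the zookeeper StatefulSet object;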
# kubectl delete sts zookeeper-default-zookeeper
# kubectl get pvc | grep zoo
datadir-zookeeper-default-zookeeper-0 Bound d-8vb6ie0kwtyynpf4gu4l 20Gi RWO alicloud-disk-efficiency 22m
datadir-zookeeper-default-zookeeper-1 Bound d-8vbhscszlr47rbot0boc 20Gi RWO alicloud-disk-efficiency 21m
datadir-zookeeper-default-zookeeper-2 Bound d-8vb444t0f8xnicj9c2ov 20Gi RWO alicloud-disk-efficiency 21m
# kubectl get pv | grep zoo
d-8vb444t0f8xnicj9c2ov 20Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-2 alicloud-disk-efficiency 21m
d-8vb6ie0kwtyynpf4gu4l 20Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-0 alicloud-disk-efficiency 22m
d-8vbhscszlr47rbot0boc 20Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-1 alicloud-disk-efficiency 21m
2-3. Expand the cloud disks:
Following the cloud disk documentation, expand each of the 3 cloud disks:
https://help.aliyun.com/document_detail/113316.html
https://help.aliyun.com/document_detail/25452.html
Note: when you expand a cloud disk, you must also expand its file system.
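The linked documents are the authority for both the console step and the file system step. As a rough sketch of the file system part only, the helper below prints the commands that would grow a file system once the disk is attached to a node. It assumes the disk is formatted directly with no partition table (the df output later in this article shows /dev/vdb mounted as a whole device) and that the file system is ext4 or xfs. The function name fs_grow_cmd, its arguments, and the /mnt/disk mount point are hypothetical. Nothing is changed until you run the printed commands yourself.

```shell
# Print (do not run) the commands that grow a file system to fill its device.
# Usage: fs_grow_cmd <fstype> <device> <mountpoint>
fs_grow_cmd() {
  fstype=$1; device=$2; mountpoint=$3
  case "$fstype" in
    ext2|ext3|ext4)
      # An unmounted ext file system must pass a forced check before resize2fs grows it.
      echo "e2fsck -f $device"
      echo "resize2fs $device"
      ;;
    xfs)
      # xfs_growfs operates on a mounted file system and takes the mount point.
      echo "mount $device $mountpoint"
      echo "xfs_growfs $mountpoint"
      ;;
    *)
      echo "unsupported file system: $fstype" >&2
      return 1
      ;;
  esac
}

# Example; the file system type can be read with: blkid -o value -s TYPE /dev/vdb
fs_grow_cmd ext4 /dev/vdb /mnt/disk
```

Printing instead of executing keeps the sketch safe to try on any machine; check the device name on the node before running the output.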
4. Expand the PVs and PVCs
Edit the StorageClass used by the PVCs and PVs and set the field allowVolumeExpansion: true (a top-level StorageClass field, not a label);
# kubectl edit sc alicloud-disk-efficiency
# kubectl get sc alicloud-disk-efficiency -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2019-09-05T12:30:27Z"
  name: alicloud-disk-efficiency
  resourceVersion: "1675896"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/alicloud-disk-efficiency
  uid: f1071bcc-cfd8-11e9-81cd-00163e0804c2
parameters:
  type: cloud_efficiency
provisioner: alicloud/disk
reclaimPolicy: Delete
volumeBindingMode: Immediate
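If you would rather not open an editor, the same field can be set with kubectl patch. This is a minimal sketch; the wrapper name enable_expansion is hypothetical, and it assumes your kubectl context already points at the cluster.

```shell
# Set allowVolumeExpansion on a StorageClass without an interactive edit.
# Usage: enable_expansion <storageclass-name>
enable_expansion() {
  kubectl patch storageclass "$1" -p '{"allowVolumeExpansion": true}'
}

# Example:
#   enable_expansion alicloud-disk-efficiency
# Read the field back afterwards:
#   kubectl get storageclass alicloud-disk-efficiency -o jsonpath='{.allowVolumeExpansion}'
```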
Update the PVC size to 30Gi:
# kubectl patch pvc datadir-zookeeper-default-zookeeper-0 -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'
persistentvolumeclaim/datadir-zookeeper-default-zookeeper-0 patched
# kubectl patch pvc datadir-zookeeper-default-zookeeper-1 -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'
persistentvolumeclaim/datadir-zookeeper-default-zookeeper-1 patched
# kubectl patch pvc datadir-zookeeper-default-zookeeper-2 -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'
persistentvolumeclaim/datadir-zookeeper-default-zookeeper-2 patched
# kubectl get pvc | grep zoo
datadir-zookeeper-default-zookeeper-0 Bound d-8vb6ie0kwtyynpf4gu4l 20Gi RWO alicloud-disk-efficiency 51m
datadir-zookeeper-default-zookeeper-1 Bound d-8vbhscszlr47rbot0boc 20Gi RWO alicloud-disk-efficiency 50m
datadir-zookeeper-default-zookeeper-2 Bound d-8vb444t0f8xnicj9c2ov 20Gi RWO alicloud-disk-efficiency 49m
# kubectl get pv | grep zoo
d-8vb444t0f8xnicj9c2ov 30Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-2 alicloud-disk-efficiency 49m
d-8vb6ie0kwtyynpf4gu4l 30Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-0 alicloud-disk-efficiency 51m
d-8vbhscszlr47rbot0boc 30Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-1 alicloud-disk-efficiency 50m
As shown, the PV size has already been updated to 30Gi; the PVC size is updated automatically once the Pod starts;
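The three patch commands above differ only in the replica ordinal, so they can be written as a loop. The sketch below assumes the PVCs follow the <prefix>-<ordinal> naming seen in this article; the function names pvc_size_patch and expand_pvcs are hypothetical.

```shell
# Build the JSON patch that requests a new PVC size.
pvc_size_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}' "$1"
}

# Patch the data PVC of every replica 0..(replicas-1).
# Usage: expand_pvcs <pvc-prefix> <replicas> <size>
expand_pvcs() {
  prefix=$1; replicas=$2; size=$3
  i=0
  while [ "$i" -lt "$replicas" ]; do
    kubectl patch pvc "${prefix}-${i}" -p "$(pvc_size_patch "$size")"
    i=$((i + 1))
  done
}

# Example:
#   expand_pvcs datadir-zookeeper-default-zookeeper 3 30Gi
```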
5. Recreate the application (redeploy the zookeeper StatefulSet deleted in step 1), then check the result:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
zookeeper-default-zookeeper-0 1/1 Running 0 94s
zookeeper-default-zookeeper-1 1/1 Running 0 64s
zookeeper-default-zookeeper-2 1/1 Running 0 39s
# kubectl exec -ti zookeeper-default-zookeeper-0 -- sh
# df -h |grep zoo
/dev/vdb 30G 45M 30G 1% /var/lib/zookeeper
# kubectl get pvc | grep zoo
datadir-zookeeper-default-zookeeper-0 Bound d-8vb6ie0kwtyynpf4gu4l 30Gi RWO alicloud-disk-efficiency 56m
datadir-zookeeper-default-zookeeper-1 Bound d-8vbhscszlr47rbot0boc 30Gi RWO alicloud-disk-efficiency 56m
datadir-zookeeper-default-zookeeper-2 Bound d-8vb444t0f8xnicj9c2ov 30Gi RWO alicloud-disk-efficiency 55m
# kubectl get pv | grep zoo
d-8vb444t0f8xnicj9c2ov 30Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-2 alicloud-disk-efficiency 55m
d-8vb6ie0kwtyynpf4gu4l 30Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-0 alicloud-disk-efficiency 56m
d-8vbhscszlr47rbot0boc 30Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-1 alicloud-disk-efficiency 56m
The output above shows that the file system has been expanded to 30G, and that the PV and PVC sizes have been expanded to 30Gi;
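The same check can be scripted instead of read off the table. The sketch below reads the capacity that each PVC reports and fails on the first mismatch; the function names pvc_capacity and check_pvcs are hypothetical, and it assumes the PVCs live in the current namespace.

```shell
# Read the capacity that the API server reports for a PVC.
pvc_capacity() {
  kubectl get pvc "$1" -o jsonpath='{.status.capacity.storage}'
}

# Succeed only if every listed PVC reports the expected size.
# Usage: check_pvcs <expected-size> <pvc>...
check_pvcs() {
  expected=$1; shift
  for pvc in "$@"; do
    actual=$(pvc_capacity "$pvc")
    if [ "$actual" != "$expected" ]; then
      echo "$pvc: expected $expected, got ${actual:-<none>}" >&2
      return 1
    fi
  done
}

# Example:
#   check_pvcs 30Gi datadir-zookeeper-default-zookeeper-0 \
#     datadir-zookeeper-default-zookeeper-1 datadir-zookeeper-default-zookeeper-2
```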
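Upgrading Pods one at a time:
For a service that runs as multiple Pods, you can upgrade one pod at a time so that the service is not interrupted; only the number of Running pods drops temporarily.
Again using zookeeper as the example, with three 20Gi cloud disks, one mounted in each of the 3 pods. The goal is to expand all three disks to 40Gi:
Main steps:
- Change the zone label of the PV used by one Pod, then delete the pod;
- Log in to the cloud disk console and expand the cloud disk online (preferably take a snapshot of the disk first, for disaster recovery);
- Attach the cloud disk to a node and expand its file system;
- Update the Size of the PV and PVC, and restore the zone label of the PV;
- Repeat steps 1 - 4 for the other Pods;
1. Change the zone label of the PV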
# kubectl get pod zookeeper-default-zookeeper-0 -oyaml | grep pers -C 1
  - name: datadir
    persistentVolumeClaim:
      claimName: datadir-zookeeper-default-zookeeper-0
# kubectl get pvc datadir-zookeeper-default-zookeeper-0
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
datadir-zookeeper-default-zookeeper-0 Bound d-8vbhscszlr47tgn0eheb 20Gi RWO alicloud-disk-efficiency 21m
The output above shows that zookeeper-default-zookeeper-0 uses the PVC datadir-zookeeper-default-zookeeper-0, and that the corresponding PV is d-8vbhscszlr47tgn0eheb;
Update the PV labels:
If the PV already has the labels below, change the zone value to "<original value>-none", that is, point it at a zone that does not exist so that the pod cannot be scheduled;
If the PV does not have the labels below, add them to the PV;
# kubectl edit pv d-8vbhscszlr47tgn0eheb
labels:
  failure-domain.beta.kubernetes.io/region: cn-zhangjiakou
  failure-domain.beta.kubernetes.io/zone: cn-zhangjiakou-a-none
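The label change can also be made without an editor by using kubectl label. The following is a sketch under a few assumptions: it touches only the zone label (add the region label the same way if the PV lacks it), and the names zone_label, blocked_zone, and block_pv are hypothetical helpers, not part of kubectl.

```shell
zone_label=failure-domain.beta.kubernetes.io/zone

# Turn a real zone into a value that no node carries,
# for example cn-zhangjiakou-a becomes cn-zhangjiakou-a-none.
blocked_zone() {
  echo "$1-none"
}

# Overwrite (or add) the zone label on a PV so that its pod cannot be scheduled.
# Usage: block_pv <pv-name> <current-zone>
block_pv() {
  kubectl label pv "$1" "$zone_label=$(blocked_zone "$2")" --overwrite
}

# Example:
#   block_pv d-8vbhscszlr47tgn0eheb cn-zhangjiakou-a
```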
Delete the Pod zookeeper-default-zookeeper-0:
# kubectl delete pod zookeeper-default-zookeeper-0
The pod that the StatefulSet recreates after the deletion now stays in the Pending state:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
zookeeper-default-zookeeper-0 0/1 Pending 0 9s
zookeeper-default-zookeeper-1 1/1 Running 0 24m
zookeeper-default-zookeeper-2 1/1 Running 0 24m
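2-3. Expand the cloud disk:
Following the cloud disk documentation, expand the cloud disk used by this pod (each of the 3 disks is expanded in turn as you repeat the procedure):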
https://help.aliyun.com/document_detail/113316.html
https://help.aliyun.com/document_detail/25452.html
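Note: when you expand a cloud disk, you must also expand its file system.
4. Expand the PV and PVC
Edit the StorageClass used by the PVC and PV and set the field allowVolumeExpansion: true (a top-level StorageClass field, not a label);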
# kubectl edit sc alicloud-disk-efficiency
# kubectl get sc alicloud-disk-efficiency -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2019-09-05T12:30:27Z"
  name: alicloud-disk-efficiency
  resourceVersion: "1675896"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/alicloud-disk-efficiency
  uid: f1071bcc-cfd8-11e9-81cd-00163e0804c2
parameters:
  type: cloud_efficiency
provisioner: alicloud/disk
reclaimPolicy: Delete
volumeBindingMode: Immediate
Update the PVC size to 40Gi:
# kubectl patch pvc datadir-zookeeper-default-zookeeper-0 -p '{"spec":{"resources":{"requests":{"storage":"40Gi"}}}}'
persistentvolumeclaim/datadir-zookeeper-default-zookeeper-0 patched
# kubectl get pvc | grep datadir-zookeeper-default-zookeeper-0
datadir-zookeeper-default-zookeeper-0 Bound d-8vbhscszlr47tgn0eheb 20Gi RWO alicloud-disk-efficiency 29m
# kubectl get pv | grep datadir-zookeeper-default-zookeeper-0
d-8vbhscszlr47tgn0eheb 40Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-0 alicloud-disk-efficiency 29m
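As shown, the PV size has already been updated to 40Gi; the PVC size is updated automatically once the Pod starts;
Restore the PV labels: set the zone back to its previous value, or delete the labels if the PV did not have them before:
labels:
  failure-domain.beta.kubernetes.io/region: cn-zhangjiakou
  failure-domain.beta.kubernetes.io/zone: cn-zhangjiakou-a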
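Restoring the label can likewise be done with kubectl label. The sketch below assumes the blocked value was formed by appending "-none" to the real zone, as described earlier; the names zone_label, original_zone, and restore_pv are hypothetical.

```shell
zone_label=failure-domain.beta.kubernetes.io/zone

# Strip the "-none" suffix; a value without the suffix is returned unchanged.
original_zone() {
  echo "${1%-none}"
}

# Put the real zone back on the PV so that the pending pod can be scheduled.
# Usage: restore_pv <pv-name> <blocked-zone>
restore_pv() {
  kubectl label pv "$1" "$zone_label=$(original_zone "$2")" --overwrite
}

# Example:
#   restore_pv d-8vbhscszlr47tgn0eheb cn-zhangjiakou-a-none
# If the PV had no zone label before, remove it with a trailing dash instead:
#   kubectl label pv d-8vbhscszlr47tgn0eheb "${zone_label}-"
```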
Check the size of the file system mounted in the container, and the PV and PVC sizes:
# kubectl exec -ti zookeeper-default-zookeeper-0 -- sh
# df -h | grep zoo
/dev/vdb 40G 48M 40G 1% /var/lib/zookeeper
# kubectl get pvc | grep datadir-zookeeper-default-zookeeper-0
datadir-zookeeper-default-zookeeper-0 Bound d-8vbhscszlr47tgn0eheb 40Gi RWO alicloud-disk-efficiency 33m
# kubectl get pv | grep datadir-zookeeper-default-zookeeper-0
d-8vbhscszlr47tgn0eheb 40Gi RWO Delete Bound default/datadir-zookeeper-default-zookeeper-0 alicloud-disk-efficiency 33m
As shown, the file system, the PV, and the PVC have all been expanded;
5. Repeat the expansion above for the PVCs, PVs, and cloud disks of the other pods
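When repeating the procedure, it is safer to confirm that the pod just upgraded is Ready again before blocking the next one, so that only one replica is down at a time. The sketch below is one way to do that with kubectl wait. The helper names pod_name, pvc_name, and wait_ready and the 300s timeout are assumptions, and the names rely on the <claim>-<statefulset>-<ordinal> pattern visible in this article.

```shell
# Derive the pod and PVC names of a replica from the StatefulSet name and ordinal.
pod_name() { echo "$1-$2"; }
pvc_name() { echo "$1-$2-$3"; }

# Block until a pod is Ready again (or the timeout expires).
wait_ready() {
  kubectl wait --for=condition=Ready "pod/$1" --timeout=300s
}

# Outline for the remaining replicas:
#   for i in 1 2; do
#     pod=$(pod_name zookeeper-default-zookeeper "$i")
#     pvc=$(pvc_name datadir zookeeper-default-zookeeper "$i")
#     ... repeat steps 1 - 4 for "$pod" and "$pvc", then:
#     wait_ready "$pod"
#   done
```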
Author: 阚俊宝
This article is original content from the Yunqi Community (云栖社区) and may not be reproduced without permission.