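1. The concept of ephemeral storage
Ephemeral storage is the local storage of the host node.
Early versions of Kubernetes could limit a container's CPU and memory, but not the local storage a container used. Under those conditions, a container could consume a large share of the host's storage capacity and leave the host without enough disk space to run its core components.
Host space used by containers:
- Directories holding logs: /var/lib/kubelet, /var/log/pods
- Directory holding the rootfs: /var/lib/docker
Kubernetes introduced the ephemeral storage resource in version 1.8 to manage the use of local ephemeral storage.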
2. Configuring ephemeral storage
Each container in a Pod can set requests and limits for ephemeral storage:
* spec.containers[].resources.limits.ephemeral-storage
* spec.containers[].resources.requests.ephemeral-storage
Create a Deployment whose container requests 300Mi of ephemeral-storage and is limited to 300Mi:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        imagePullPolicy: IfNotPresent
        command:
        - nginx
        - -g
        - "daemon off;"
        workingDir: /usr/share/nginx/html
        ports:
        - name: http
          containerPort: 80
          protocol: TCP
        resources:
          limits:
            ephemeral-storage: 300Mi
          requests:
            ephemeral-storage: 300Mi
This creates 2 pods:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-774779ddfb-89f5m 1/1 Running 0 12s
nginx-774779ddfb-k8r2r 1/1 Running 0 12s
Write a 400Mi file inside one of the pods:
# kubectl exec -it nginx-774779ddfb-89f5m /bin/sh
# dd if=/dev/zero of=/test bs=4096 count=102400
102400+0 records in
102400+0 records out
419430400 bytes (419 MB, 400 MiB) copied, 16.2403 s, 25.8 MB/s
The pod that wrote the file is now Evicted, and a new pod has been created to replace it:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-774779ddfb-89f5m 0/1 Evicted 0 2m19s
nginx-774779ddfb-cv4zw 1/1 Running 0 4s
nginx-774779ddfb-k8r2r 1/1 Running 0 2m19s
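The eviction follows from simple arithmetic: dd wrote 102400 blocks of 4096 bytes, which is more than the 300Mi limit. A minimal Go sketch of that comparison is shown here; `mib` and `exceedsLimit` are illustrative names chosen for this example, not Kubernetes APIs.

```go
package main

import "fmt"

// mib is the number of bytes in one mebibyte (the "Mi" suffix in Kubernetes quantities).
const mib int64 = 1024 * 1024

// exceedsLimit reports whether usedBytes is strictly greater than limitBytes.
// Hypothetical helper for illustration only.
func exceedsLimit(usedBytes, limitBytes int64) bool {
	return usedBytes > limitBytes
}

func main() {
	written := int64(4096) * 102400 // dd bs=4096 count=102400
	limit := 300 * mib              // ephemeral-storage limit of 300Mi

	fmt.Println(written)                      // 419430400
	fmt.Println(limit)                        // 314572800
	fmt.Println(exceedsLimit(written, limit)) // true
}
```

Usage exactly equal to the limit does not count as exceeding it in this sketch, which matches the strict "limit < usage" comparison in the kubelet code quoted in section 4.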
3. Monitoring ephemeral storage
Ephemeral storage used by a container:
- container_fs_usage_bytes
Ephemeral storage requested by a container:
- kube_pod_container_resource_requests{resource="ephemeral_storage",unit="byte"}
Total ephemeral storage of a node:
- kube_node_status_allocatable{resource="ephemeral_storage",unit="byte"}
Ephemeral storage already allocated on a node:
- sum by (node) (kube_pod_container_resource_requests{resource="ephemeral_storage"})
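Dividing the allocated amount by the node total gives an allocation ratio that is convenient for alerting. The following Go sketch mirrors that calculation on in-memory values; `allocatedRatio` and the sample numbers are hypothetical and stand in for values read from the metrics above.

```go
package main

import "fmt"

// allocatedRatio returns the fraction of a node's allocatable ephemeral
// storage that containers have already requested, i.e. sum(requests) / allocatable.
// It returns 0 when allocatable is not positive, to avoid dividing by zero.
func allocatedRatio(requests []int64, allocatable int64) float64 {
	if allocatable <= 0 {
		return 0
	}
	var sum int64
	for _, r := range requests {
		sum += r
	}
	return float64(sum) / float64(allocatable)
}

func main() {
	const mib = 1024 * 1024
	// Hypothetical sample: two containers requesting 300Mi each
	// on a node with 1200Mi of allocatable ephemeral storage.
	requests := []int64{300 * mib, 300 * mib}
	fmt.Printf("%.2f\n", allocatedRatio(requests, 1200*mib)) // 0.50
}
```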
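4. Ephemeral storage in the source code
The kubelet monitors the ephemeral storage usage of pods and containers; when usage exceeds the limit, the pod is evicted: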
// pkg/kubelet/eviction/eviction_manager.go
func (m *managerImpl) localStorageEviction(pods []*v1.Pod, statsFunc statsFunc) []*v1.Pod {
	evicted := []*v1.Pod{}
	for _, pod := range pods {
		podStats, ok := statsFunc(pod)
		if !ok {
			continue
		}
		...
		if m.podEphemeralStorageLimitEviction(podStats, pod) {
			evicted = append(evicted, pod)
			continue
		}
		if m.containerEphemeralStorageLimitEviction(podStats, pod) {
			evicted = append(evicted, pod)
		}
	}
	return evicted
}
The container-level storage check deserves a closer look:
func (m *managerImpl) containerEphemeralStorageLimitEviction(podStats statsapi.PodStats, pod *v1.Pod) bool {
	// Collect the ephemeral-storage limit of each container.
	thresholdsMap := make(map[string]*resource.Quantity)
	for _, container := range pod.Spec.Containers {
		ephemeralLimit := container.Resources.Limits.StorageEphemeral()
		if ephemeralLimit != nil && ephemeralLimit.Value() != 0 {
			thresholdsMap[container.Name] = ephemeralLimit
		}
	}
	// Iterate over all containers.
	for _, containerStat := range podStats.Containers {
		containerUsed := diskUsage(containerStat.Logs)
		if !*m.dedicatedImageFs {
			containerUsed.Add(*diskUsage(containerStat.Rootfs))
		}
		if ephemeralStorageThreshold, ok := thresholdsMap[containerStat.Name]; ok {
			// If limit < usage, evict the pod.
			if ephemeralStorageThreshold.Cmp(*containerUsed) < 0 {
				if m.evictPod(pod, 0, fmt.Sprintf(containerEphemeralStorageMessageFmt, containerStat.Name, ephemeralStorageThreshold.String()), nil) {
					metrics.Evictions.WithLabelValues(signalEphemeralContainerFsLimit).Inc()
					return true
				}
				return false
			}
		}
	}
	return false
}
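The decision logic of the function above can be modeled with the standard library alone. The sketch below is a simplified stand-in, not the kubelet implementation: the `containerUsage` type, its fields, and `overLimit` are hypothetical, plain int64 byte counts replace resource.Quantity, and the eviction call itself is left out.

```go
package main

import "fmt"

// containerUsage is a simplified stand-in for the per-container stats the
// kubelet reads; the type and field names are hypothetical.
type containerUsage struct {
	Name        string
	LogsBytes   int64
	RootfsBytes int64
}

// overLimit returns the name of the first container whose usage is above its
// limit. Rootfs usage is counted only when images do not live on a dedicated
// filesystem. Containers with a missing or zero limit are skipped.
func overLimit(stats []containerUsage, limits map[string]int64, dedicatedImageFs bool) (string, bool) {
	for _, s := range stats {
		used := s.LogsBytes
		if !dedicatedImageFs {
			used += s.RootfsBytes
		}
		limit, ok := limits[s.Name]
		if !ok || limit == 0 {
			continue
		}
		if limit < used {
			return s.Name, true
		}
	}
	return "", false
}

func main() {
	const mib = 1024 * 1024
	stats := []containerUsage{{Name: "nginx", LogsBytes: 1 * mib, RootfsBytes: 400 * mib}}
	limits := map[string]int64{"nginx": 300 * mib}

	name, evict := overLimit(stats, limits, false)
	fmt.Printf("%q %v\n", name, evict) // "nginx" true

	name, evict = overLimit(stats, limits, true)
	fmt.Printf("%q %v\n", name, evict) // "" false
}
```

The second call shows why the dedicatedImageFs flag matters: when the rootfs is not counted, only log usage is compared against the limit, so the same container stays under 300Mi.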
References:
1. https://ieevee.com/tech/2019/05/23/ephemeral-storage.html
2. https://www.bookstack.cn/read/okd-4.7-en/995d491a59697e3b.md