
Configuring Local Volumes for Applications in a K8S Cluster

About Local Volumes

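In systems such as Hadoop and ES, the data nodes need large amounts of storage and already provide redundancy themselves. There is no need to allocate volumes for them from a storage system; they can use the storage on the local node, just as in a bare-metal deployment. Before local volumes existed, we could use HostPath to mount a volume into a container, but that approach has many limitations: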

The prior mechanism of accessing local storage through hostPath volumes had many challenges. hostPath volumes were difficult to use in production at scale: operators needed to care for local disk management, topology, and scheduling of individual pods when using hostPath volumes, and could not use many Kubernetes features (like StatefulSets). Existing Helm charts that used remote volumes could not be easily ported to use hostPath volumes. The Local Persistent Volumes feature aims to address hostPath volumes' portability, disk accounting, and scheduling challenges.

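Note: local volumes suit only a small set of applications. As with HostPath, the Pod is bound to a specific host; if that host fails, the Pod cannot be scheduled onto another node. Suitable use cases: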

  • Caching of datasets that can leverage data gravity for fast processing
  • Distributed storage systems that shard or replicate data across multiple nodes. Examples include distributed datastores like Cassandra, or distributed file systems like Gluster or Ceph.

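A local volume is similar to hostPath but more flexible and uniform. An application that uses hostPath has to hard-code the path in its YAML file, whereas a local volume is consumed as flexibly as an ordinary PVC; see below for details.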

Manually Configuring Local Volumes

This section describes how to configure local volumes manually, without the local provisioner.

The administrator must create a storage class for local volumes. The following creates a storage class named local-storage:

% kubectl create -f - <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF

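Note

  • This storage class cannot provision storage dynamically; every PV must be created manually;
  • volumeBindingMode: WaitForFirstConsumer: the PVC is not bound to a PV before the Pod is scheduled. Binding waits until the Pod is scheduled, so the scheduler can take the Pod's resource requests and constraints into account, e.g. selectors, affinity and anti-affinity policies.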

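Prepare the directory on each host, i.e. configure the local disk. In this test environment, hosts okd-c01 and okd-c02 both provide local storage at /mnt/test. On an OKD cluster the SELinux context must also be set: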

chcon -R unconfined_u:object_r:svirt_sandbox_file_t:s0 /mnt/test

Manually create two PVs, example-local-pv-1 and example-local-pv-2, one bound to the storage on each host. For example, the PV bound to the local volume on host okd-c01.zyl.io:

% kubectl create -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv-1
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/test
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - okd-c01.zyl.io
EOF
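
The second PV differs from the first only in its name and node. As a minimal sketch, the manifest can be rendered by a small shell function instead of being copied by hand. The function name render_local_pv and its argument order are hypothetical, introduced here for illustration only; the manifest fields are the ones used above.

```shell
# Print a local PersistentVolume manifest for one node.
# Arguments (illustrative): PV name, node hostname, host path, capacity.
render_local_pv() {
  pv_name="$1"; node="$2"; path="$3"; size="$4"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${pv_name}
spec:
  capacity:
    storage: ${size}
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: ${path}
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - ${node}
EOF
}
```

Assuming the second host is registered as okd-c02.zyl.io, the second PV could then be created with: render_local_pv example-local-pv-2 okd-c02.zyl.io /mnt/test 1Gi | kubectl create -f -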

Create the PVC:

% kubectl create -f - <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: example-local-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi 
  storageClassName: local-storage
EOF

Note: the PVC is not bound to a PV immediately. It is bound only when a Pod is scheduled, so for now the PVC stays in Pending status:

% oc describe pvc example-local-claim
...
Events:
  Type    Reason                Age               From                         Message
  ----    ------                ----              ----                         -------
  Normal  WaitForFirstConsumer  5s (x2 over 10s)  persistentvolume-controller  waiting for first consumer to be created before binding

Configure the application to use the storage:

% kubectl create -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: local-volume-test
  name: local-volume-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: local-volume-test
  template:
    metadata:
      labels:
        app: local-volume-test
    spec:
      containers:
      - image: busybox
        name: local-volume-test
        imagePullPolicy: IfNotPresent
        command: ["/bin/sh", "-c", "while true; do sleep 2; echo $(date) $(hostname)>> /mnt/test/test.log; done" ]
        volumeMounts:
        - name: local-data
          mountPath: /mnt/test
      volumes:
      - name: local-data
        persistentVolumeClaim:
          claimName: example-local-claim
EOF

Once the Pod is scheduled, the PVC is bound:

% oc get pvc example-local-claim 
NAME                  STATUS    VOLUME               CAPACITY   ACCESS MODES   STORAGECLASS    AGE
example-local-claim   Bound     example-local-pv-2   1Gi        RWO            local-storage   7m

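Note:

  • If the Pod is deleted now, its replacement is still scheduled onto the host that holds the local storage example-local-pv-2, i.e. okd-c02.zyl.io;
  • After the deployment is deleted, the PVC stays bound to the PV: once the first Pod has been scheduled, the PVC is tied to that PV;
  • After the PVC is deleted, the PV remains in Released status and cannot be reused. The administrator must manually delete and recreate the PV before it can be reused: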

When local volumes are manually created like this, the only supported persistentVolumeReclaimPolicy is “Retain”. When the PersistentVolume is released from the PersistentVolumeClaim, an administrator must manually clean up and set up the local volume again for reuse.
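
The manual recycling step can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name recycle_local_pv and the manifest file name passed to it are hypothetical, and the KUBECTL variable exists solely so the client binary can be swapped (for example for oc). Stale data under the volume path on the node still has to be cleaned up by hand on that host.

```shell
# Client binary to use; defaults to kubectl (override with KUBECTL=oc).
KUBECTL="${KUBECTL:-kubectl}"

# Recycle a Released local PV: delete the PV object, then recreate it
# from its original manifest so that it becomes Available again.
# Arguments (illustrative): PV name, path to the saved PV manifest.
recycle_local_pv() {
  pv_name="$1"
  manifest="$2"
  # Data left on the node (e.g. under /mnt/test) is NOT touched here;
  # clean it up on the host first if the next claim must start empty.
  "$KUBECTL" delete pv "$pv_name" &&
  "$KUBECTL" create -f "$manifest"
}
```

For the Released PV from this walkthrough, and assuming its definition was saved to a file such as pv-2.yaml, the call would be: recycle_local_pv example-local-pv-2 pv-2.yaml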

Automatically Configuring Local Volumes

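The PVs described in Manually Configuring Local Volumes must be created by hand, and once a PVC is deleted its PV cannot be reused until it is recreated. When a system uses many local volumes this adds to the administrative burden, so an external static provisioner can be used to simplify local volume configuration.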

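In current versions (K8S <= 1.12, OKD 3.11), the administrator must manually mount each volume under a specific directory on the host. The external static provisioner scans that directory and creates the PVs automatically; when a PVC is released, it cleans up the directory contents and recreates the PV. This is semi-automatic rather than dynamic provisioning. Later versions are expected to support dynamic provisioning: for example, the administrator would only supply an LVM VG, and the provisioner would handle formatting, mounting, and the remaining steps: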

Dynamic provisioning of local volumes using LVM is under design and an alpha implementation will follow in a future release. This will eliminate the current need for an administrator to pre-partition, format and mount local volumes, as long as the workload’s performance requirements can tolerate sharing disks.

References:

  • Local Persistent Volumes for Kubernetes Goes Beta;
  • kubernetes-sigs/sig-storage-local-static-provisioner: the external static provisioner provided officially by K8S;
  • Configuring Local Volumes: how OKD configures the local provisioner.

The following describes how to use the local provisioner provided by OKD.

Mounting Local Volumes on the Hosts

In this example, local volumes are configured on hosts okd-c0[1-3]. Each volume must be mounted manually under /mnt/local-storage/<storage-class-name>/<volume>, for example:

cat >> /etc/fstab <<EOF
/dev/datavg/d1 /mnt/local-storage/local-storage-1/d1 xfs defaults 0 0
/dev/datavg/d2 /mnt/local-storage/local-storage-1/d2 xfs defaults 0 0
/dev/datavg/d3 /mnt/local-storage/local-storage-2/d3 xfs defaults 0 0
/dev/datavg/d4 /mnt/local-storage/local-storage-2/d4 xfs defaults 0 0
EOF
mkdir -p /mnt/local-storage/local-storage-1/{d1,d2}
mkdir -p /mnt/local-storage/local-storage-2/{d3,d4}
mount -a
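
The fstab entries above all follow the /mnt/local-storage/<storage-class-name>/<volume> convention, so they can be generated rather than typed. A minimal sketch follows; the helper name fstab_entry is hypothetical, and it assumes, as in the example above, that each logical volume already exists, is formatted as xfs, and is mounted under a directory named after the LV.

```shell
# Print one /etc/fstab line for a local volume.
# Arguments (illustrative): volume group, logical volume, storage class name.
fstab_entry() {
  vg="$1"; lv="$2"; class="$3"
  printf '/dev/%s/%s /mnt/local-storage/%s/%s xfs defaults 0 0\n' \
    "$vg" "$lv" "$class" "$lv"
}
```

For instance, fstab_entry datavg d1 local-storage-1 prints the first line of the fstab block above; the output can be appended to /etc/fstab after review.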

On OKD/OpenShift, set the SELinux context:

chcon -R unconfined_u:object_r:svirt_sandbox_file_t:s0 /mnt/local-storage/

Deploying the local provisioner

Optional: deploy the local provisioner in a separate project. Create the project:

oc new-project local-storage

Create a ConfigMap, used by the external provisioner, that describes the storage classes:

% kubectl create -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-volume-config
data:
    storageClassMap: |
        local-storage-1:
            hostDir: /mnt/local-storage/local-storage-1
            mountDir: /mnt/local-storage/local-storage-1
        local-storage-2:
            hostDir: /mnt/local-storage/local-storage-2
            mountDir: /mnt/local-storage/local-storage-2
EOF

Note

  • local-storage-1: the StorageClass name, corresponding to /mnt/local-storage/<storage-class-name>;
  • hostDir: the directory on the host;
  • mountDir: the directory inside the pod where the external provisioner mounts hostDir; keep it the same as hostDir.

The provisioner runs with root privileges, so on an OKD cluster it must be granted access:

oc create serviceaccount local-storage-admin
oc adm policy add-scc-to-user privileged -z local-storage-admin

Install the template on the OKD cluster:

oc create -f https://raw.githubusercontent.com/openshift/origin/release-3.11/examples/storage-examples/local-examples/local-storage-provisioner-template.yaml

Install the application from the template above (PS: it runs as a DaemonSet on all compute nodes):

oc new-app -p CONFIGMAP=local-volume-config \
  -p SERVICE_ACCOUNT=local-storage-admin \
  -p NAMESPACE=local-storage \
  -p PROVISIONER_IMAGE=quay.io/external_storage/local-volume-provisioner:latest \
  local-storage-provisioner

Create the required StorageClasses:

% kubectl create -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
 name: local-storage-1
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF

% kubectl create -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
 name: local-storage-2
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF
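
The two StorageClass manifests differ only in their name, so with more classes it may be easier to render them in a loop. A minimal sketch, where the function name render_storage_classes is hypothetical and the manifest fields are exactly those shown above:

```shell
# Print one StorageClass manifest per argument as a multi-document
# YAML stream (documents separated by '---').
render_storage_classes() {
  for sc in "$@"; do
    cat <<EOF
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ${sc}
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF
  done
}
```

Usage would look like: render_storage_classes local-storage-1 local-storage-2 | kubectl create -f -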

Note: PVs are created automatically only after the StorageClasses have been created.

Next, create a PVC manually or use the storage in a StatefulSet:

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: example-local-claim
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-storage-1
  resources:
    requests:
      storage: 5Gi

---
kind: StatefulSet
...
 volumeClaimTemplates:
  - metadata:
      name: example-local-claim
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: local-storage-1
      resources:
        requests:
          storage: 5Gi