There are piles of redis-cluster deployment guides online, and plenty of them target k8s, but most are copied from one another, are riddled with problems, and share very little actual hands-on experience.
1. Redis startup configuration. Managing it with a ConfigMap is the most convenient approach; redis-config.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
  namespace: default
data:
  update-node.sh: |
    #!/bin/sh
    REDIS_NODES="/data/nodes.conf"
    sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${MY_POD_IP}/" ${REDIS_NODES}
    exec "$@"
  redis.conf: |+
    port 7001
    protected-mode no
    cluster-enabled yes
    cluster-config-file nodes.conf
    cluster-node-timeout 15000
    #cluster-announce-ip ${MY_POD_IP}
    #cluster-announce-port 7001
    #cluster-announce-bus-port 17001
    logfile "/data/redis.log"
Note: the ConfigMap holds two files. update-node.sh is a script executed when redis starts (why such a startup script is needed is explained later); redis.conf is the redis startup configuration file.
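Apply it before anything else references it:

kubectl apply -f redis-config.yaml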
2. Redis persistent storage PVs, backed by an NFS server. The cluster has 6 nodes, so create 6 PVs pointing at 6 different storage paths. The statically created PVs, redis-pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv0
  labels:
    pv: nfs-pv0
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.11.242
    path: /root/nfs/redis-cluster0
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv1
  labels:
    pv: nfs-pv1
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.11.242
    path: /root/nfs/redis-cluster1
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv2
  labels:
    pv: nfs-pv2
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.11.242
    path: /root/nfs/redis-cluster2
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv3
  labels:
    pv: nfs-pv3
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.11.242
    path: /root/nfs/redis-cluster3
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv4
  labels:
    pv: nfs-pv4
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.11.242
    path: /root/nfs/redis-cluster4
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv5
  labels:
    pv: nfs-pv5
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.11.242
    path: /root/nfs/redis-cluster5
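One easy thing to forget: the six export directories must already exist on the NFS server before the PVs can be mounted. A sketch, run on the NFS server itself (the /etc/exports line is an assumption; adapt it to your environment):

# on the NFS server (192.168.11.242)
mkdir -p /root/nfs/redis-cluster{0..5}
# the paths must be exported, e.g. a line like this in /etc/exports:
#   /root/nfs *(rw,sync,no_root_squash)
exportfs -ra    # reload the export table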
3. Create the redis-cluster nodes with a StatefulSet, plus a headless Service:
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: redis-cluster
  name: redis-cluster
  namespace: default
spec:
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  serviceName: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
        - command: ["/bin/bash", "/usr/local/etc/redis/update-node.sh", "redis-server", "/usr/local/etc/redis/redis.conf"]
          #args:
          #  - /usr/local/etc/redis/redis.conf
          #  - --cluster-announce-ip
          #  - "$(MY_POD_IP)"
          env:
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: TZ
              value: Asia/Shanghai
          image: 'redis:6.0.10'
          imagePullPolicy: IfNotPresent
          name: redis
          ports:
            - containerPort: 7001
              hostPort: 7001
              name: redis-port
              protocol: TCP
          volumeMounts:
            - mountPath: /data
              name: redis-cluster-data
              subPath: data
              readOnly: false
            - mountPath: /usr/local/etc/redis
              name: redis-config
              readOnly: false
      dnsPolicy: ClusterFirst
      volumes:
        - name: redis-config
          configMap:
            name: redis-config
  volumeClaimTemplates: # PVC template
    - metadata:
        name: redis-cluster-data
        namespace: default
      spec:
        accessModes: [ "ReadWriteMany" ]
        resources:
          requests:
            storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: redis-cluster
  name: redis-cluster
  namespace: default
spec:
  ports:
    - name: redis-port
      port: 7001
      protocol: TCP
      targetPort: 7001
  selector:
    app: redis-cluster
  type: ClusterIP
  clusterIP: None
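Apply the manifest (assuming it is saved as redis-sts.yaml, a name picked here for illustration) and wait until all six pods are Running:

kubectl apply -f redis-sts.yaml
kubectl get pods -l app=redis-cluster -o wide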
Two points worth noting. First, update-node.sh must be launched through bash in the container command, otherwise the script fails with a permission error at startup; many articles online never mention this.
Second, two env variables are added: MY_POD_IP is used to update the pod's IP when a node restarts, and TZ changes the timezone.
4. Initialize the cluster
Since redis 5.0, the cluster can be initialized directly from redis-cli; installing ruby (for redis-trib.rb) is no longer necessary.
kubectl exec -it redis-cluster-0 -- redis-cli -p 7001 --cluster create --cluster-replicas 1 $(kubectl get pods -l k8s.kuboard.cn/name=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:7001 {end}')
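Note that the label k8s.kuboard.cn/name=redis-cluster comes from my deployment through kuboard; if you applied the StatefulSet YAML above directly, the pods carry app=redis-cluster instead, so the equivalent command should be:

kubectl exec -it redis-cluster-0 -- redis-cli -p 7001 --cluster create --cluster-replicas 1 $(kubectl get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:7001 {end}')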
5. Verify the cluster
kubectl exec -it redis-cluster-0 -- redis-cli -p 7001 cluster info
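A healthy cluster should report something like the following (abridged; 3 masters plus 3 replicas):

cluster_state:ok
cluster_slots_assigned:16384
cluster_known_nodes:6
cluster_size:3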
With that, the redis-cluster deployment is complete.
Pitfall summary
Problem 1: cluster initialization hangs forever at "Waiting for the cluster to join".
Solution: the first deployment attempt was copied over from a docker-compose-based setup, where the redis.conf parameter cluster-announce-ip was set to the host machine's IP. During initialization the cluster nodes need to talk to each other, but inside k8s the host IP no longer applies, so the nodes couldn't communicate and the process stayed stuck there.
Fix: leave cluster-announce-ip disabled (commented out, as in the ConfigMap above).
Problem 2: during cluster creation, [ERR] Node 10.244.7.224:7001 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.
Solution: because the earlier initialization attempt got stuck, some nodes' nodes.conf files had already recorded new node data, which has to be wiped. You can delete the files, or log in to every redis node with
redis-cli -p 7001 -c and reset it with the following commands:
flushdb
cluster reset
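If you'd rather not log in to six pods one by one, a loop like this does the same thing (a sketch; it assumes the StatefulSet pod names redis-cluster-0 through redis-cluster-5, and flushdb may error harmlessly on nodes that are read-only replicas or already empty):

for i in 0 1 2 3 4 5; do
  kubectl exec redis-cluster-$i -- redis-cli -p 7001 flushdb
  kubectl exec redis-cluster-$i -- redis-cli -p 7001 cluster reset
done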
Problem 3: how does a PVC bind one-to-one with a PV?
Explanation: binding can be static or dynamic.
Static first. For a long time I couldn't figure this out: once the PV yaml is written and the PVC is defined in the workload, how do PV and PVC actually get bound? As long as the storage size defined in the PV matches the size requested by the PVC, they get matched and Bound; you can also use labels (with a selector) to pin a PVC to a specific PV.
Also, the cluster has multiple nodes, and their persistence directories obviously must not be the same. If the yaml referenced a PVC directly, only a single name could be given, which can't associate pods with multiple different PVs; hence volumeClaimTemplates (PVC templates), as shown below:
volumeClaimTemplates:
  - metadata:
      name: redis-cluster-data
      namespace: default
    spec:
      accessModes: [ "ReadWriteMany" ]
      resources:
        requests:
          storage: 1Gi
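With the template in place, the StatefulSet creates one PVC per pod, named <template-name>-<pod-name>, and each of them binds to one of the six PVs by the size match described above. You can confirm the pairing with:

kubectl get pvc
# expect six PVCs, redis-cluster-data-redis-cluster-0 through
# redis-cluster-data-redis-cluster-5, each Bound to one of the nfs-pv* volumes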
Dynamic binding: NFS doesn't support dynamic provisioning on its own, so we rely on a StorageClass plus a provisioner plugin that creates PVs automatically for matching PVCs. Four steps: 1, define the StorageClass; 2, set up RBAC, because the provisioner creates PVs through kube-apiserver and needs authorization; 3, deploy nfs-client-provisioner, the service that creates PVs automatically; 4, deploy the redis service.
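For reference, a minimal StorageClass sketch in the style of the classic nfs-client-provisioner setup (the name nfs-storage and the provisioner string are assumptions; the provisioner value must match the PROVISIONER_NAME env of your nfs-client-provisioner deployment, and the PVC template would reference it via storageClassName):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage             # assumed name, referenced by the PVC's storageClassName
provisioner: fuseim.pri/ifs     # must match PROVISIONER_NAME of nfs-client-provisioner
parameters:
  archiveOnDelete: "false"      # drop the data when the PVC is deleted instead of archiving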
Reference link:
https://blog.csdn.net/qq_2561...
Creating PVs dynamically like this really is a good approach; I'll try it out in practice when I find the time.
Of course, if all this feels like too much trouble, kuboard is recommended; clicking through it by hand is a pleasant experience too, and spares you the whole hassle.
Problem 4: when testing a node restart, I found that after the restart the node's own pod IP in nodes.conf is not updated, although the other nodes do pick up the new pod IP through their inter-node communication.
Solution: as mentioned above, add the startup script update-node.sh, which on every start looks up the pod IP and substitutes it for the IP on the "myself" line of nodes.conf. Note: many posts online update the IP by adding the pod IP to the container's startup args (the commented-out --cluster-announce-ip block in the StatefulSet), but in my tests that did not work; people say it works on 5.0, but I tried 5.0 as well and it still didn't. If it works for you, please leave a comment; it may also be that my configuration was wrong.
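To make the fix concrete, here is what that sed does to the "myself" line of nodes.conf (the node ID and IPs are made-up examples; only the line containing "myself" is touched):

# before, after the pod restarted and received the new IP 10.244.7.230:
#   e7d1eecc...9ba9ca 10.244.7.224:7001@17001 myself,master - 0 0 1 connected 0-5460
sed -i -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${MY_POD_IP}/" /data/nodes.conf
# after:
#   e7d1eecc...9ba9ca 10.244.7.230:7001@17001 myself,master - 0 0 1 connected 0-5460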
Finally: the content above isn't much, but every word of it is hand-typed, and I hope it's useful to you. This is also my last article before the new year; comments are welcome.