Redis-4.0.14 Cluster Installation and Deployment

This tutorial was tested on a KVM virtual machine on an internal network behind NAT. To save server resources, everything is demonstrated on a single Linux host. In production, do not deploy multiple Redis instances on one Linux server under heavy traffic (one host running multiple Redis instances is not recommended even at low traffic). On IDC hardware, make sure the NIC can handle the concurrency; on cloud hosts, choose memory-optimized (or memory/IO-optimized) instance types. Redis is, after all, an IO-intensive service.
Tips: for clusters of 3 masters / 3 slaves and larger, disable bgsave on the masters and run it on the slaves instead. This avoids the latency bgsave introduces at traffic peaks and protects IO throughput.
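One way to implement this tip (a sketch, assuming the config layout used below; the cron schedule and the slave port are hypothetical): disable automatic RDB snapshots in each master's redis.conf and trigger BGSAVE on a slave during off-peak hours instead.

# In each master's redis.conf: an empty save rule disables automatic RDB snapshots
save ""
# Crontab entry on a slave (example schedule, 04:00 daily):
0 4 * * * /usr/local/bin/redis-cli -h 192.168.100.214 -p 9704 BGSAVE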
  • Download and install Redis
cd ~
yum install gcc gcc-c++ -y
wget http://download.redis.io/releases/redis-4.0.14.tar.gz
tar -zxf redis-4.0.14.tar.gz
cd redis-4.0.14
cd deps
make hiredis jemalloc linenoise lua geohash-int
cd ..
make PREFIX=/app/coohua/redis-9701 install
make install
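To verify the build before going further:

/app/coohua/redis-9701/bin/redis-server --version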
  • Add redis-cli to the system PATH
[root@redis-nec001 ~]# cp /app/coohua/redis-9701/bin/redis-cli /usr/local/bin/redis-cli
[root@redis-nec001 ~]# redis-cli --version
redis-cli 4.0.14
  • Create multiple copies (in production, give each instance its own server)
cd /app/coohua/
mkdir -p redis-9701/conf
cp ~/redis-4.0.14/redis.conf ./redis-9701/conf/
cp -arp redis-9701 redis-9702
cp -arp redis-9701 redis-9703
cp -arp redis-9701 redis-9704
cp -arp redis-9701 redis-9705
cp -arp redis-9701 redis-9706
cp -arp redis-9701 redis-9707
  • Edit redis.conf and change or enable the following settings
bind 192.168.100.214
port 9701
daemonize yes
pidfile /app/coohua/redis-9701/conf/redis_9701.pid
logfile "/data/coohua/redis-9701/redis.log"
dir /data/coohua/redis-9701/
maxmemory 4096M
cluster-enabled yes
cluster-config-file /app/coohua/redis-9701/conf/nodes-9701.conf
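Every copy made above still carries redis-9701's values, so each instance's port and paths must be rewritten. A minimal sed sketch (assuming the directory layout above, where the instance port appears in every 9701-specific setting):

mkdir -p /data/coohua/redis-9701
for port in 9702 9703 9704 9705 9706 9707; do
  conf=/app/coohua/redis-$port/conf/redis.conf
  sed -i "s/9701/$port/g" $conf        # rewrites port, pidfile, logfile, dir and cluster-config-file in one pass
  mkdir -p /data/coohua/redis-$port    # data/log directory referenced by the config
done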
  • Set up the Ruby environment via RVM
gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB
curl -sSL https://get.rvm.io | bash -s stable
source /etc/profile.d/rvm.sh
rvm install 2.6.4
rvm 2.6.4 --default
gem install redis
cp ~/redis-4.0.14/src/redis-trib.rb /usr/local/bin/
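A quick sanity check that the Ruby toolchain is ready (running redis-trib.rb with no arguments prints its usage):

ruby -v
gem list redis
redis-trib.rb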
  • Set kernel parameters
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo "echo never > /sys/kernel/mm/transparent_hugepage/enabled" >> /etc/rc.local
cat /etc/sysctl.conf
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
kernel.core_uses_pid = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
vm.swappiness = 0
net.ipv4.neigh.default.gc_stale_time=120
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.all.arp_announce=2
net.ipv4.conf.lo.arp_announce=2
net.core.somaxconn = 262144
net.core.netdev_max_backlog = 262144
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_keepalive_time = 30
net.ipv4.tcp_orphan_retries = 2
kernel.core_pattern = /data/coohua/core/core_%e_%p
vm.overcommit_memory = 1
kernel.sysrq = 1
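Reload sysctl so the settings take effect (the transparent hugepage change was already applied by the echo above):

sysctl -p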
  • Start Redis (as the coohua user)
chown -R coohua:coohua /app/coohua/redis* /data/coohua/
su - coohua
/app/coohua/redis-9701/bin/redis-server /app/coohua/redis-9701/conf/redis.conf
/app/coohua/redis-9702/bin/redis-server /app/coohua/redis-9702/conf/redis.conf
/app/coohua/redis-9703/bin/redis-server /app/coohua/redis-9703/conf/redis.conf
/app/coohua/redis-9704/bin/redis-server /app/coohua/redis-9704/conf/redis.conf
/app/coohua/redis-9705/bin/redis-server /app/coohua/redis-9705/conf/redis.conf
/app/coohua/redis-9706/bin/redis-server /app/coohua/redis-9706/conf/redis.conf
/app/coohua/redis-9707/bin/redis-server /app/coohua/redis-9707/conf/redis.conf
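To confirm that all seven instances came up:

ps -ef | grep redis-server | grep -v grep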
  • Create the cluster: 3 masters, or 3 masters + 3 slaves (these are the minimum cluster sizes)
redis-trib.rb create --replicas 0 192.168.100.214:9701 192.168.100.214:9702 192.168.100.214:9703   # 3-master mode: the smallest possible cluster, no high availability
redis-trib.rb create --replicas 1 192.168.100.214:9701 192.168.100.214:9702 192.168.100.214:9703 192.168.100.214:9704 192.168.100.214:9705 192.168.100.214:9706   # master-slave mode: each slave is both its master's standby and a read node
  • Create a minimal 3-master cluster
[root@redis-nec001 bin]# redis-trib.rb create --replicas 0 192.168.100.214:9701 192.168.100.214:9702 192.168.100.214:9703
>>> Creating cluster
>>> Performing hash slots allocation on 3 nodes...
Using 3 masters:
192.168.100.214:9701
192.168.100.214:9702
192.168.100.214:9703
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
   slots:0-5460 (5461 slots) master
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
   slots:5461-10922 (5462 slots) master
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
   slots:10923-16383 (5461 slots) master
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join..
>>> Performing Cluster Check (using node 192.168.100.214:9701)
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
  • Check the cluster status
[root@redis-nec001 bin]# redis-cli -h 192.168.100.214 -p 9701 -c
192.168.100.214:9701> cluster nodes
517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702@19702 master - 0 1568616352578 2 connected 5461-10922
ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703@19703 master - 0 1568616353579 3 connected 10923-16383
fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701@19701 myself,master - 0 1568616352000 1 connected 0-5460
192.168.100.214:9701>
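cluster info gives a quicker health summary; cluster_state should be ok and cluster_slots_assigned should be 16384:

redis-cli -h 192.168.100.214 -p 9701 cluster info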
  • Add slave nodes to the cluster, upgrading to the highly available 3-master/3-slave layout
redis-trib.rb add-node --slave adds a node as a slave: the --slave flag marks the new node as a slave, its master is given via --master-id, followed by the ip:port of the joining slave, and finally the ip:port of an arbitrarily chosen existing master to complete the command format. It is hard to say what the Redis designers were thinking here: that last parameter seems inconsequential, yet it cannot be omitted.
[root@redis-nec001 coohua]# redis-trib.rb add-node --slave --master-id fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9704 192.168.100.214:9701
>>> Adding node 192.168.100.214:9704 to cluster 192.168.100.214:9701
>>> Performing Cluster Check (using node 192.168.100.214:9701)
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.100.214:9704 to make it join the cluster.
Waiting for the cluster to join.
>>> Configure node as replica of 192.168.100.214:9701.
[OK] New node added correctly.
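The remaining two slaves are attached the same way; the master IDs below are taken from the cluster check output later in this section:

redis-trib.rb add-node --slave --master-id 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9705 192.168.100.214:9701
redis-trib.rb add-node --slave --master-id ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9706 192.168.100.214:9701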
  • Adding a master node and allocating slots
    Adding a master is slightly more involved: first add an empty Redis instance to the cluster, then allocate it a 'fair' share of slots.
    If the original cluster has 3 masters and 3 more are added, allocation is easy: each existing node gives half of its slots to one new node.
    That is:
    node 1 gives half of its slots to new node 1
    node 2 gives half of its slots to new node 2
    node 3 gives half of its slots to new node 3
    Adding nodes asymmetrically also works, provided each master's slot count is the same or nearly the same:
    slots per node afterwards = total slots / number of master nodes (including the new ones)
    slots each old node must migrate = its current slot count - the per-node count above
  • Add the master node
redis-trib.rb add-node 192.168.100.214:9707  192.168.100.214:9701
  • Check each master's slots. Apart from the new node, every master holds roughly 5461 slots.

Slots per master after resharding = 16384 / 4 = 4096
Slots each old master must migrate = 5461 - 4096 = 1365

[root@redis-nec001 coohua]# redis-trib.rb check 192.168.100.214:9701
>>> Performing Cluster Check (using node 192.168.100.214:9701)
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 62b3ded1a7545f0931611e837cfdbe6dc6fa580c 192.168.100.214:9704
   slots: (0 slots) slave
   replicates fa820855aeebad6551d09d0cd6063aeaefc8f4f9
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 143e3f136118e62ae6b5a6d64fc21e1fcafee4b4 192.168.100.214:9706
   slots: (0 slots) slave
   replicates ccf082f6516ec23c1aee891358a3daf47d2b5ca7
S: 97147860d71d363059927f21ac92b16b4d17c97e 192.168.100.214:9705
   slots: (0 slots) slave
   replicates 517fd7f65b7e653a91b24aa7a06f1ec360bd8220
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
   slots:10923-16383 (5461 slots) master
M: 2c7eb280218234fb6adbd4718c7e21b128f1a938 192.168.100.214:9707
   slots: (0 slots) master
   0 additional replica(s)
  • Allocate slots
redis-trib.rb reshard 192.168.100.214:9707
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 4096
What is the receiving node ID? 2c7eb280218234fb6adbd4718c7e21b128f1a938
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:fa820855aeebad6551d09d0cd6063aeaefc8f4f9
Source node #2:517fd7f65b7e653a91b24aa7a06f1ec360bd8220
Source node #3:ccf082f6516ec23c1aee891358a3daf47d2b5ca7
Source node #4:done
Moving slot 1334 from fa820855aeebad6551d09d0cd6063aeaefc8f4f9
..............
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Repeat the reshard to move slots over from the other two masters in turn (process omitted).
After resharding, every master holds roughly the same number of slots.
Slot migration can fail partway through; running redis-trib.rb fix repairs the cluster.
Causes of migration errors:
--- The master was handling heavy traffic at the time, or its CPU load was already problematic; avoid peak hours.
--- bgsave was running on the master; disable bgsave on the master and enable it on the slaves instead to relieve the pressure.
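For the fix command, point it at any live node in the cluster:

redis-trib.rb fix 192.168.100.214:9701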