Redis 4.0.14 Cluster Installation and Deployment
This tutorial was run on a KVM virtual machine, on an internal network in NAT mode. To save server resources, everything is demonstrated on a single Linux host. In a high-traffic production environment, do not deploy multiple Redis instances on one Linux server (even under light traffic, several Redis instances per machine is not recommended). On IDC bare-metal machines, make sure the NIC can sustain the concurrency; on cloud hosts, choose a memory-optimized (or memory/IO-optimized) instance type. Redis is, after all, an I/O-heavy service.
Tips: for clusters of 3 masters + 3 slaves or larger, it is recommended to disable bgsave on the masters and run it on the slaves instead, so that bgsave does not add latency and eat into I/O throughput at traffic peaks.
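For example, automatic RDB snapshots can be switched off on a master either in redis.conf or at runtime; a minimal sketch (host and port taken from this tutorial's setup, adjust to yours):
# in the master's redis.conf: disable automatic RDB snapshots
save ""
# or at runtime, without a restart:
redis-cli -h 192.168.100.214 -p 9701 config set save ""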
- Download and install Redis
cd ~
yum install gcc gcc-c++ -y
wget http://download.redis.io/releases/redis-4.0.14.tar.gz
tar -zxf redis-4.0.14.tar.gz
cd redis-4.0.14
cd deps
make hiredis jemalloc linenoise lua geohash-int
cd ..
make PREFIX=/app/coohua/redis-9701 install
- Add redis-cli to the system PATH
[root@redis-nec001 ~]# cp /app/coohua/redis-9701/bin/redis-cli /usr/local/bin/redis-cli
[root@redis-nec001 ~]# redis-cli --version
redis-cli 4.0.14
- Create multiple copies (in production, each instance should get its own server)
cd /app/coohua/
mkdir -p redis-9701/conf
cp ~/redis-4.0.14/redis.conf ./redis-9701/conf/
cp -arp redis-9701 redis-9702
cp -arp redis-9701 redis-9703
cp -arp redis-9701 redis-9704
cp -arp redis-9701 redis-9705
cp -arp redis-9701 redis-9706
cp -arp redis-9701 redis-9707
- Edit the configuration files
In each instance's redis.conf, modify or enable the following (shown here for instance 9701):
bind 192.168.100.214
port 9701
daemonize yes
pidfile /app/coohua/redis-9701/conf/redis_9701.pid
logfile "/data/coohua/redis-9701/redis.log"
dir /data/coohua/redis-9701/
maxmemory 4096M
cluster-enabled yes
cluster-config-file /app/coohua/redis-9701/conf/nodes-9701.conf
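The other six copies need the same settings with their own port number. Since the port appears in every per-instance path, a sed loop is enough; a sketch assuming the layout above (where the string 9701 occurs only in port-related settings), which also pre-creates the data directories referenced by dir and logfile:
for port in 9701 9702 9703 9704 9705 9706 9707; do
    mkdir -p /data/coohua/redis-${port}
done
for port in 9702 9703 9704 9705 9706 9707; do
    sed -i "s/9701/${port}/g" /app/coohua/redis-${port}/conf/redis.conf
done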
- Set up the Ruby environment (via RVM)
redis-trib.rb is a Ruby script, so it needs Ruby and the redis gem.
gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB
curl -sSL https://get.rvm.io | bash -s stable
source /etc/profile.d/rvm.sh
rvm install 2.6.4
rvm 2.6.4 --default
gem install redis
cp ~/redis-4.0.14/src/redis-trib.rb /usr/local/bin/
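A quick sanity check that the toolchain is in place (redis-trib.rb prints its usage when invoked without arguments):
ruby -v
gem list redis
redis-trib.rb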
- Set system kernel parameters
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo "echo never > /sys/kernel/mm/transparent_hugepage/enabled" >> /etc/rc.local
cat /etc/sysctl.conf
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
kernel.core_uses_pid = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
vm.swappiness = 0
net.ipv4.neigh.default.gc_stale_time=120
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.all.arp_announce=2
net.ipv4.conf.lo.arp_announce=2
net.core.somaxconn = 262144
net.core.netdev_max_backlog = 262144
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_keepalive_time = 30
net.ipv4.tcp_orphan_retries = 2
kernel.core_pattern = /data/coohua/core/core_%e_%p
vm.overcommit_memory = 1
kernel.sysrq = 1
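Load the settings with sysctl -p. One caveat: net.ipv4.tcp_tw_recycle is known to break connections from clients behind NAT (this demo itself runs in NAT mode), and the parameter was removed entirely in Linux 4.12, so consider leaving it at 0.
sysctl -p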
- Start Redis (as the coohua user)
chown -R coohua:coohua /app/coohua/redis* /data/coohua/
su - coohua
/app/coohua/redis-9701/bin/redis-server /app/coohua/redis-9701/conf/redis.conf
/app/coohua/redis-9702/bin/redis-server /app/coohua/redis-9702/conf/redis.conf
/app/coohua/redis-9703/bin/redis-server /app/coohua/redis-9703/conf/redis.conf
/app/coohua/redis-9704/bin/redis-server /app/coohua/redis-9704/conf/redis.conf
/app/coohua/redis-9705/bin/redis-server /app/coohua/redis-9705/conf/redis.conf
/app/coohua/redis-9706/bin/redis-server /app/coohua/redis-9706/conf/redis.conf
/app/coohua/redis-9707/bin/redis-server /app/coohua/redis-9707/conf/redis.conf
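A quick loop to confirm that all seven instances answer (host and ports as configured above); each should reply PONG:
for port in $(seq 9701 9707); do
    redis-cli -h 192.168.100.214 -p ${port} ping
done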
- Creating the cluster: 3 masters only, or 3 masters + 3 slaves (3 masters is the minimum cluster size)
redis-trib.rb create --replicas 0 192.168.100.214:9701 192.168.100.214:9702 192.168.100.214:9703 # 3-master mode: the smallest possible cluster, no high availability
redis-trib.rb create --replicas 1 192.168.100.214:9701 192.168.100.214:9702 192.168.100.214:9703 192.168.100.214:9704 192.168.100.214:9705 192.168.100.214:9706 # master-slave mode: each slave is both its master's standby and a node reads can be sent to
- Create a minimal 3-master cluster
[root@redis-nec001 bin]# redis-trib.rb create --replicas 0 192.168.100.214:9701 192.168.100.214:9702 192.168.100.214:9703
>>> Creating cluster
>>> Performing hash slots allocation on 3 nodes...
Using 3 masters:
192.168.100.214:9701
192.168.100.214:9702
192.168.100.214:9703
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
slots:0-5460 (5461 slots) master
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
slots:5461-10922 (5462 slots) master
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
slots:10923-16383 (5461 slots) master
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join..
>>> Performing Cluster Check (using node 192.168.100.214:9701)
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
slots:0-5460 (5461 slots) master
0 additional replica(s)
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
slots:5461-10922 (5462 slots) master
0 additional replica(s)
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
slots:10923-16383 (5461 slots) master
0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
- Check the cluster state
[root@redis-nec001 bin]# redis-cli -h 192.168.100.214 -p 9701 -c
192.168.100.214:9701> cluster nodes
517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702@19702 master - 0 1568616352578 2 connected 5461-10922
ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703@19703 master - 0 1568616353579 3 connected 10923-16383
fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701@19701 myself,master - 0 1568616352000 1 connected 0-5460
192.168.100.214:9701>
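cluster info gives a compact health summary; once all 16384 slots are covered it should report cluster_state:ok:
redis-cli -h 192.168.100.214 -p 9701 cluster info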
- Add slave nodes to the cluster, upgrading it to a highly available 3-master/3-slave layout
The command is redis-trib.rb add-node --slave --master-id <master-id> <new-slave-ip:port> <existing-node-ip:port>: --slave marks the node being added as a slave, --master-id names the master it will replicate, then comes the ip:port of the joining slave, and finally the ip:port of any node already in the cluster, which redis-trib uses as its entry point. It is hard to say what the Redis designers were thinking here; that last argument looks incidental, yet it cannot be omitted.
[root@redis-nec001 coohua]# redis-trib.rb add-node --slave --master-id fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9704 192.168.100.214:9701
>>> Adding node 192.168.100.214:9704 to cluster 192.168.100.214:9701
>>> Performing Cluster Check (using node 192.168.100.214:9701)
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
slots:0-5460 (5461 slots) master
0 additional replica(s)
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
slots:5461-10922 (5462 slots) master
0 additional replica(s)
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
slots:10923-16383 (5461 slots) master
0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.100.214:9704 to make it join the cluster.
Waiting for the cluster to join.
>>> Configure node as replica of 192.168.100.214:9701.
[OK] New node added correctly.
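The remaining two slaves are attached the same way. These commands use the master IDs from the cluster nodes output above and reproduce the replica layout shown in the check below:
redis-trib.rb add-node --slave --master-id 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9705 192.168.100.214:9701
redis-trib.rb add-node --slave --master-id ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9706 192.168.100.214:9701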
- Adding a master node and assigning it slots
Adding a master is a bit more involved: first add an empty Redis instance to the cluster, then give it a 'reasonable' share of the slots.
If the cluster has 3 masters and 3 more are added, the split is easy: each existing node hands half of its slots to one new node, i.e.:
node 1 gives half its slots to new node 1
node 2 gives half its slots to new node 2
node 3 gives half its slots to new node 3
For an asymmetric addition (this assumes the masters currently hold equal, or nearly equal, numbers of slots):
slots per node after resharding = total slots / number of masters (including the new ones)
slots each old node must migrate = its current slot count - the post-reshard slot count
Add the new master node:
redis-trib.rb add-node 192.168.100.214:9707 192.168.100.214:9701
- Check each master's slots. Apart from the new node, every master holds roughly 5461 slots.
slots per node after resharding = 16384 / 4 = 4096
slots each old master must migrate = 5461 - 4096 = 1365
[root@redis-nec001 coohua]# redis-trib.rb check 192.168.100.214:9701
>>> Performing Cluster Check (using node 192.168.100.214:9701)
M: fa820855aeebad6551d09d0cd6063aeaefc8f4f9 192.168.100.214:9701
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 62b3ded1a7545f0931611e837cfdbe6dc6fa580c 192.168.100.214:9704
slots: (0 slots) slave
replicates fa820855aeebad6551d09d0cd6063aeaefc8f4f9
M: 517fd7f65b7e653a91b24aa7a06f1ec360bd8220 192.168.100.214:9702
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: 143e3f136118e62ae6b5a6d64fc21e1fcafee4b4 192.168.100.214:9706
slots: (0 slots) slave
replicates ccf082f6516ec23c1aee891358a3daf47d2b5ca7
S: 97147860d71d363059927f21ac92b16b4d17c97e 192.168.100.214:9705
slots: (0 slots) slave
replicates 517fd7f65b7e653a91b24aa7a06f1ec360bd8220
M: ccf082f6516ec23c1aee891358a3daf47d2b5ca7 192.168.100.214:9703
slots:10923-16383 (5461 slots) master
M: 2c7eb280218234fb6adbd4718c7e21b128f1a938 192.168.100.214:9707
slots: (0 slots) master
0 additional replica(s)
- Reassign the slots
redis-trib.rb reshard 192.168.100.214:9707
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 4096
What is the receiving node ID? 2c7eb280218234fb6adbd4718c7e21b128f1a938
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:fa820855aeebad6551d09d0cd6063aeaefc8f4f9
Source node #2:517fd7f65b7e653a91b24aa7a06f1ec360bd8220
Source node #3:ccf082f6516ec23c1aee891358a3daf47d2b5ca7
Source node #4:done
Moving slot 1334 from fa820855aeebad6551d09d0cd6063aeaefc8f4f9
.......
.......
Do you want to proceed with the proposed reshard plan (yes/no)? yes
With all three masters entered as source nodes, the 4096 slots are drawn from each of them in a single pass (if you reshard from one source node at a time instead, repeat the command for the other two masters; output omitted). After the reshard, every master holds roughly the same number of slots.
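Verify the new distribution; each master should now hold about 4096 slots:
redis-trib.rb check 192.168.100.214:9701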
Slot migration can fail partway through. If it does, running redis-trib.rb fix repairs the cluster.
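For this cluster, pointing fix at any live node works:
redis-trib.rb fix 192.168.100.214:9701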
Common causes of migration errors:
- The master was handling too much traffic at the time and its CPU was already overloaded; avoid peak hours.
- bgsave was running on the master; disable bgsave there and enable it on the slaves instead to relieve the master.