Incident Description

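Some time ago I used docker-compose to set up a Redis test cluster on a test server. It ran for a long time without any problems.
Unfortunately, the data center had an incident and the server was rebooted unexpectedly. The Redis cluster then restarted without any visible errors, but commands such as get and set failed with the error in the title.
Here is the error output: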

127.0.0.1:6378> set ceshi 123
(error) CLUSTERDOWN Hash slot not served
127.0.0.1:6378> get ceshi
(error) CLUSTERDOWN The cluster is down
127.0.0.1:6378> cluster info
cluster_state:fail
cluster_slots_assigned:16289
cluster_slots_ok:16289
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:170
cluster_my_epoch:106
cluster_stats_messages_ping_sent:1587
cluster_stats_messages_pong_sent:1589
cluster_stats_messages_sent:3176
cluster_stats_messages_ping_received:1589
cluster_stats_messages_pong_received:1587
cluster_stats_messages_received:3176
127.0.0.1:6378> cluster slots
 1) 1) (integer) 5461
    2) (integer) 5488
    3) 1) "127.0.0.1"
       2) (integer) 6373
       3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
    4) 1) "127.0.0.1"
       2) (integer) 6377
       3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
 2) 1) (integer) 5490
    2) (integer) 5491
    3) 1) "127.0.0.1"
       2) (integer) 6373
       3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
    4) 1) "127.0.0.1"
       2) (integer) 6377
       3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
 3) 1) (integer) 5493
    2) (integer) 5590
    3) 1) "127.0.0.1"
       2) (integer) 6373
       3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
    4) 1) "127.0.0.1"
       2) (integer) 6377
       3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
 4) 1) (integer) 5592
    2) (integer) 5648
    3) 1) "127.0.0.1"
       2) (integer) 6373
       3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
    4) 1) "127.0.0.1"
       2) (integer) 6377
       3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
 5) 1) (integer) 5650
    2) (integer) 5657
    3) 1) "127.0.0.1"
       2) (integer) 6373
       3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
    4) 1) "127.0.0.1"
       2) (integer) 6377
       3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
 6) 1) (integer) 5659
    2) (integer) 5755
    3) 1) "127.0.0.1"
       2) (integer) 6373
       3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
    4) 1) "127.0.0.1"
       2) (integer) 6377
       3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
 7) 1) (integer) 5757
    2) (integer) 5769
    3) 1) "127.0.0.1"
       2) (integer) 6373
       3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
    4) 1) "127.0.0.1"
       2) (integer) 6377
       3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
 ...... (many more entries here)
97) ...

The output makes the problem fairly obvious: the cluster only works when all 16384 hash slots are assigned, and here only 16289 of them are.
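To see which slots are actually missing instead of guessing, the ranges reported by cluster slots can be checked against the full slot space. The following is only a minimal sketch: the function name missingSlots and the sample ranges in main are assumptions made for illustration (the first three inner ranges mirror the output above, the outer two are made up), and in practice the ranges would have to be collected from the real cluster slots output.

```go
package main

import "fmt"

// totalSlots is the fixed number of hash slots in a Redis Cluster.
const totalSlots = 16384

// missingSlots returns, in ascending order, every slot in [0, totalSlots)
// that is not covered by any of the inclusive [start, end] ranges.
func missingSlots(ranges [][2]int) []int {
	covered := make([]bool, totalSlots)
	for _, r := range ranges {
		for s := r[0]; s <= r[1] && s < totalSlots; s++ {
			if s >= 0 {
				covered[s] = true
			}
		}
	}
	var missing []int
	for s, ok := range covered {
		if !ok {
			missing = append(missing, s)
		}
	}
	return missing
}

func main() {
	// Hypothetical input: {5461,5488}, {5490,5491} and {5493,5590} mirror
	// the output above; the two outer ranges are invented for the example.
	ranges := [][2]int{
		{0, 5460}, {5461, 5488}, {5490, 5491}, {5493, 5590}, {5591, 16383},
	}
	fmt.Println(missingSlots(ranges)) // prints [5489 5492]
}
```

With the gaps known, only those slots need to be added back, rather than all 16384.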

Solving the Problem

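The error output above shows that some slots are missing. Because there is a lot of data, checking by hand which slots are missing is tedious, so I simply ran a script (whether or not a slot is already assigned, I add it anyway and am done with it).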

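The command to add a single slot:
redis-cli -h 127.0.0.1 -p 6376 cluster addslots 0
redis-cli -h 127.0.0.1 -p 6376 cluster addslots 1
redis-cli -h 127.0.0.1 -p 6376 cluster addslots 2
redis-cli -h 127.0.0.1 -p 6376 cluster addslots 3
......
Note: the trailing number is the slot; there are 16384 slots in total, numbered 0-16383. Writing them all out this way is far too tedious.
So here is the script: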
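// Go code that generates the script file (put it in a *_test.go file and run it with go test).
package main

import (
	"fmt"
	"os"
	"strings"
	"testing"
)

func TestShell(t *testing.T) {
	var sb strings.Builder
	// One addslots command per slot, 0 through 16383.
	for i := 0; i < 16384; i++ {
		sb.WriteString(fmt.Sprintf("redis-cli -h 127.0.0.1 -p 6378 cluster addslots %d\n", i))
	}
	// Write the commands to s6378.sh and fail the test if that does not work.
	if err := os.WriteFile("s6378.sh", []byte(sb.String()), 0o755); err != nil {
		t.Fatal(err)
	}
}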
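The script above starts redis-cli once per slot. CLUSTER ADDSLOTS accepts several slot numbers in one call, so a shorter script is possible. The following is a rough sketch under assumptions: the helper name addSlotsCommands, the batch size, and the slot list in main are all hypothetical. As far as I know, a call is rejected as a whole if any slot in it is already assigned from the receiving node's point of view, so batching fits best when the list contains only slots known to be unassigned.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// addSlotsCommands builds redis-cli command lines that assign the given
// slots to the node at host:port, passing up to batch slots per command.
func addSlotsCommands(host string, port int, slots []int, batch int) []string {
	if batch < 1 {
		batch = 1
	}
	var cmds []string
	for start := 0; start < len(slots); start += batch {
		end := start + batch
		if end > len(slots) {
			end = len(slots)
		}
		args := make([]string, 0, end-start)
		for _, s := range slots[start:end] {
			args = append(args, strconv.Itoa(s))
		}
		cmds = append(cmds, fmt.Sprintf("redis-cli -h %s -p %d cluster addslots %s",
			host, port, strings.Join(args, " ")))
	}
	return cmds
}

func main() {
	// Hypothetical list of unassigned slots, two slots per command.
	slots := []int{5489, 5492, 5591}
	for _, c := range addSlotsCommands("127.0.0.1", 6378, slots, 2) {
		fmt.Println(c)
	}
}
```

The printed lines can be redirected into a .sh file in the same way as the generated script above.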

After running the script, restart Redis:

docker-compose restart
// Check again
127.0.0.1:6378> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:170
cluster_my_epoch:106
cluster_stats_messages_ping_sent:369
cluster_stats_messages_pong_sent:382
cluster_stats_messages_sent:751
cluster_stats_messages_ping_received:382
cluster_stats_messages_pong_received:369
cluster_stats_messages_received:751

cluster_state:fail => cluster_state:ok
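The same before/after check can be automated. This is a small sketch, assuming the cluster info output has already been captured as a string (for example from redis-cli); the function names parseClusterInfo and clusterHealthy are made up for this example, and the two sample strings contain only the two fields the check looks at.

```go
package main

import (
	"fmt"
	"strings"
)

// parseClusterInfo turns the "key:value" lines printed by CLUSTER INFO
// into a map. Lines without a colon are ignored.
func parseClusterInfo(out string) map[string]string {
	info := make(map[string]string)
	for _, line := range strings.Split(out, "\n") {
		key, value, found := strings.Cut(strings.TrimSpace(line), ":")
		if found {
			info[key] = value
		}
	}
	return info
}

// clusterHealthy reports whether the state is ok and all 16384 slots
// are assigned.
func clusterHealthy(info map[string]string) bool {
	return info["cluster_state"] == "ok" && info["cluster_slots_assigned"] == "16384"
}

func main() {
	before := "cluster_state:fail\ncluster_slots_assigned:16289\n"
	after := "cluster_state:ok\ncluster_slots_assigned:16384\n"
	fmt.Println(clusterHealthy(parseClusterInfo(before))) // prints false
	fmt.Println(clusterHealthy(parseClusterInfo(after)))  // prints true
}
```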

127.0.0.1:6378> cluster slots
1) 1) (integer) 5461
   2) (integer) 10922
   3) 1) "127.0.0.1"
      2) (integer) 6373
      3) "33961d33e3f9aca7e38670602878b89c1cee00a4"
   4) 1) "127.0.0.1"
      2) (integer) 6377
      3) "fd2f54ae6e078a35b228fd1524d24640b63df464"
2) 1) (integer) 0
   2) (integer) 5460
   3) 1) "127.0.0.1"
      2) (integer) 6378
      3) "2d65cfd4af71fd7c99d60ee2b75be371b705097f"
   4) 1) "127.0.0.1"
      2) (integer) 6374
      3) "3de92e59c2e1a45fe1013caa73696c9b1d1d62b8"
3) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 6376
      3) "939f8e1da6b8a687e0b7876875b38f29da063c48"
   4) 1) "127.0.0.1"
      2) (integer) 6375
      3) "275d32d018acf42c06ab0fb5cd7036b5c6f41acf"

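The output above shows that Redis is back to normal. (Note: the cluster here was set up with three masters and three replicas; masters on 6378, 6377, 6376 and replicas on 6375, 6374, 6373.)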

Verification

127.0.0.1:6378> set ceshi 123
-> Redirected to slot [11469] located at 127.0.0.1:6376
OK
127.0.0.1:6376> get ceshi
"123"
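The redirect happens because each key maps to exactly one slot: Redis Cluster takes the CRC16 of the key (the XMODEM variant, polynomial 0x1021) modulo 16384, and if the key contains a non-empty {...} hash tag, only the tag is hashed. The sketch below is my own illustration of that rule, not code from Redis; the function names crc16 and keySlot are chosen for this example. On a live cluster, CLUSTER KEYSLOT returns the authoritative value.

```go
package main

import (
	"fmt"
	"strings"
)

// crc16 computes CRC16/XMODEM (polynomial 0x1021, initial value 0),
// the checksum the Redis Cluster specification uses for key hashing.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = (crc << 1) ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// keySlot maps a key to one of the 16384 hash slots. If the key contains
// "{...}" with at least one character inside, only that part is hashed.
func keySlot(key string) int {
	if open := strings.IndexByte(key, '{'); open >= 0 {
		if n := strings.IndexByte(key[open+1:], '}'); n > 0 {
			key = key[open+1 : open+1+n]
		}
	}
	return int(crc16([]byte(key)) % 16384)
}

func main() {
	// "123456789" is the check input from the cluster specification,
	// whose CRC16 is 0x31C3, which is slot 12739.
	fmt.Println(keySlot("123456789"))
	// The key used in the session above.
	fmt.Println(keySlot("ceshi"))
}
```

A key whose slot belongs to a different master than the one the client is connected to triggers a redirect like the one shown in the session above.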

At this point, the Redis cluster problem is solved.

Note

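My script only adds slots to the master node on port 6378.
The other master nodes may still report errors,
so run addslots for the slots (0-16383) against the other master nodes as well.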

The docker-compose.yml file is attached below:

version: '3'
services:
    master-1:
        container_name: master-1
        image: redis
        command: redis-server /etc/usr/local/redis.conf
        network_mode: "host"
        volumes:
            - ./redis/master1/redis.conf:/etc/usr/local/redis.conf
            - ./redis/master1/redis.log:/usr/local/redis/logs/redis-server.log
    master-2:
        container_name: master-2
        image: redis
        command: redis-server /etc/usr/local/redis.conf
        network_mode: "host"
        volumes:
            - ./redis/master2/redis.conf:/etc/usr/local/redis.conf
            - ./redis/master2/redis.log:/usr/local/redis/logs/redis-server.log
    master-3:
        container_name: master-3
        image: redis
        command: redis-server /etc/usr/local/redis.conf
        network_mode: "host"
        volumes:
            - ./redis/master3/redis.conf:/etc/usr/local/redis.conf
            - ./redis/master3/redis.log:/usr/local/redis/logs/redis-server.log
    slave-1:
        container_name: slave-1
        image: redis
        command: redis-server /etc/usr/local/redis.conf
        network_mode: "host"
        volumes:
            - ./redis/slave1/redis.conf:/etc/usr/local/redis.conf
            - ./redis/slave1/redis.log:/usr/local/redis/logs/redis-server.log
    slave-2:
        container_name: slave-2
        image: redis
        command: redis-server /etc/usr/local/redis.conf
        network_mode: "host"
        volumes:
            - ./redis/slave2/redis.conf:/etc/usr/local/redis.conf
            - ./redis/slave2/redis.log:/usr/local/redis/logs/redis-server.log
    slave-3:
        container_name: slave-3
        image: redis
        command: redis-server /etc/usr/local/redis.conf
        network_mode: "host"
        volumes:
            - ./redis/slave3/redis.conf:/etc/usr/local/redis.conf
            - ./redis/slave3/redis.log:/usr/local/redis/logs/redis-server.log