Redis Cluster: Creation Errors and Common Operation Commands

Common problems when setting up a Redis Cluster

Connecting to a password-protected cluster with redis-trib.rb

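The configurations in the earlier documents all set a password.
If Redis has a password set, creating the cluster with the redis-trib.rb script fails with an error like this:
[ERR] Sorry, can't connect to node 192.168..:7001
Cause:
The redis-trib.rb script does not pass a password when it connects to Redis, which is an easy trap to fall into. My fix was to modify the line in the script that opens the Redis connection, as follows: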

Around line 99, change the following line

@r = Redis.new(:host => @info[:host], :port => @info[:port], :timeout => 60)

to:

@r = Redis.new(:host => @info[:host], :port => @info[:port], :timeout => 60, :password => "123456")

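Redis 5.0 and later do not have this problem: cluster creation moved from redis-trib.rb into redis-cli (redis-cli --cluster create), which accepts the password through the -a option.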
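As a rough illustration of the Redis 5.0+ route, the sketch below assembles the argument list for redis-cli --cluster create with a password. The helper name build_cluster_create_command is hypothetical and not part of Redis; the node addresses and the password 123456 are simply the values used elsewhere in this article.

```python
from typing import List


def build_cluster_create_command(nodes: List[str], replicas: int, password: str) -> List[str]:
    """Build the argv for `redis-cli --cluster create` (hypothetical helper).

    nodes    -- "host:port" strings, one per Redis instance
    replicas -- number of replicas per master (--cluster-replicas)
    password -- the requirepass value, passed through -a
    """
    if not nodes:
        raise ValueError("at least one host:port node is required")
    return (
        ["redis-cli", "--cluster", "create"]
        + list(nodes)
        + ["--cluster-replicas", str(replicas), "-a", password]
    )


if __name__ == "__main__":
    argv = build_cluster_create_command(
        ["192.168.65.171:7001", "192.168.65.175:7003", "192.168.65.176:7005"],
        replicas=0,
        password="123456",
    )
    # Print the command instead of running it, so nothing touches a real cluster.
    print(" ".join(argv))
```

The list form can be handed to subprocess.run as-is, which avoids shell quoting problems if the password contains special characters.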

Redis cluster deployment hangs at Waiting for the cluster to join …

[root@localhost conf]# redis-trib.rb create --replicas 1 xxxxxxxxxxx
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.65.171:7001
192.168.65.175:7003
192.168.65.176:7005
Adding replica 192.168.65.175:7004 to 192.168.65.171:7001
Adding replica 192.168.65.176:7006 to 192.168.65.175:7003
Adding replica 192.168.65.171:7002 to 192.168.65.176:7005
M: 9808b9c852bc3e1dbbbd8f7c2e892217370c801d 192.168.65.171:7001
   slots:0-5460 (5461 slots) master
S: 8085e19aac1485f2804adb6b86f2cce2d045bf16 192.168.65.171:7002
   replicates 10b93027efc04877339a3030ecdf09c46fc86e7b
M: fd3874bc2ccf78a37dec0ec897650e9c3a0e93fd 192.168.65.175:7003
   slots:5461-10922 (5462 slots) master
S: 49fbb8e2cae0d9d899f3e3a8ceec474aee692283 192.168.65.175:7004
   replicates 9808b9c852bc3e1dbbbd8f7c2e892217370c801d
M: 10b93027efc04877339a3030ecdf09c46fc86e7b 192.168.65.176:7005
   slots:10923-16383 (5461 slots) master
S: 9c9a22f730882bff0b829755f26845e19fff1bee 192.168.65.176:7006
   replicates fd3874bc2ccf78a37dec0ec897650e9c3a0e93fd
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join.............

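At this point the prompt Waiting for the cluster to join… appears and the installation simply hangs there with no response, however long you wait. Look at the line just above that prompt: Sending CLUSTER MEET messages to join the cluster.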

The usual fix:
Use redis-cli -c -h xxx.xxx.xxx.xxx -p 700* to open a client session on each Redis node, then enter cluster meet xxx.xxx.xxx.xxx 700* for the other nodes one by one ......

Back on Server1, the cluster creation has now completed.
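The manual step above can be sketched as a small generator of CLUSTER MEET commands. This is only an illustration: the function name cluster_meet_commands is hypothetical, and it assumes that having one seed node meet every other node is enough, since cluster membership then spreads to the rest through gossip.

```python
from typing import List, Tuple


def cluster_meet_commands(nodes: List[Tuple[str, int]]) -> List[str]:
    """Return the CLUSTER MEET commands to type on the first node in `nodes`.

    Each entry in `nodes` is an (ip, client_port) pair. The first entry is
    treated as the seed node you are logged in to; it meets all the others.
    """
    if len(nodes) < 2:
        return []  # nothing to meet
    _seed, *others = nodes
    return [f"CLUSTER MEET {ip} {port}" for ip, port in others]


if __name__ == "__main__":
    for command in cluster_meet_commands(
        [("192.168.65.171", 7001), ("192.168.65.175", 7003), ("192.168.65.176", 7005)]
    ):
        print(command)
```

If the commands still have no effect, the cause is more likely the firewall than the commands themselves.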

The cluster meet ip port command has no effect

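This is the problem that troubled me the longest: the cluster meet ip port command had no effect. Few blog posts mention the Redis cluster bus; they just tell you to turn off the firewall. Who would actually do that in production? Production systems generally keep the firewall on.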

First, one concept needs to be clear: the Redis cluster bus.

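The Redis cluster bus port is the client port plus 10000. For example, if 6379 is the port Redis uses for client traffic, then 16379 is the cluster bus port.
It follows that iptables or firewalld must also allow the cluster bus port. If you have turned both of them off, you can skip the rest of this section: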

For iptables and firewalld configuration, see this blog post: https://www.code404.icu/213.html

Open both the cluster bus port and the client port of every Redis instance.
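To make the port arithmetic concrete, here is a minimal sketch that lists every port the firewall has to allow for a set of instances. The function name ports_to_open is hypothetical, and the sketch assumes the plus-10000 offset described above.

```python
from typing import Iterable, List

BUS_PORT_OFFSET = 10000  # cluster bus port = client port + 10000
MAX_PORT = 65535


def ports_to_open(client_ports: Iterable[int]) -> List[int]:
    """Return the sorted client ports plus their cluster bus ports."""
    ports = set()
    for port in client_ports:
        bus_port = port + BUS_PORT_OFFSET
        if port < 1 or bus_port > MAX_PORT:
            raise ValueError(f"client port {port} has no valid cluster bus port")
        ports.add(port)
        ports.add(bus_port)
    return sorted(ports)


if __name__ == "__main__":
    # The two instances on 192.168.65.171 in the example cluster.
    print(ports_to_open([7001, 7002]))
```

For the six-node example in this article that means 7001 to 7006 and 17001 to 17006, which matches the @17001 style suffixes shown in the CLUSTER NODES output.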

Redis cluster operations

Log in to the Redis cluster

[root@localhost conf]# redis-cli -c -h 192.168.65.176 -p 7001 -a 123456
192.168.65.176:7001>

View Redis cluster information

192.168.65.176:7001> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:60631
cluster_stats_messages_pong_sent:60631
cluster_stats_messages_sent:121262
cluster_stats_messages_ping_received:60626
cluster_stats_messages_pong_received:60631
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:121262
192.168.65.176:7001> 
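When this check needs to be scripted, the CLUSTER INFO reply is easy to parse because every line is a key:value pair. The sketch below is one possible approach; parse_cluster_info and is_cluster_healthy are hypothetical names, and the health rule (state ok and all 16384 slots assigned) is an assumption about what "healthy" should mean here.

```python
from typing import Dict, Union

TOTAL_SLOTS = 16384  # a Redis Cluster always has 16384 hash slots


def parse_cluster_info(reply: str) -> Dict[str, Union[int, str]]:
    """Parse the raw CLUSTER INFO reply (without the prompt line) into a dict."""
    info: Dict[str, Union[int, str]] = {}
    for line in reply.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        info[key] = int(value) if value.isdigit() else value
    return info


def is_cluster_healthy(info: Dict[str, Union[int, str]]) -> bool:
    """Assumed rule: state is ok and every slot is assigned."""
    return (
        info.get("cluster_state") == "ok"
        and info.get("cluster_slots_assigned") == TOTAL_SLOTS
    )


if __name__ == "__main__":
    sample = "cluster_state:ok\ncluster_slots_assigned:16384\ncluster_known_nodes:6\n"
    print(is_cluster_healthy(parse_cluster_info(sample)))
```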

List the Redis cluster nodes

192.168.65.176:7001> CLUSTER NODES
2a4dddc7aa5c124adc0d62bd4ed55d7ce71679bf 192.168.65.176:7004@17004 slave 5cc4d8633f243f1cde83098e87a052f477c2d189 0 1621327172067 4 connected
0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 myself,master - 0 1621327172000 1 connected 0-5460
13ca1d78a69771be772f9dc2c49bf72291de779e 192.168.65.176:7002@17002 master - 0 1621327172067 2 connected 5461-10922
bec54e69084ba31da7f9f64f362229c92458ced5 192.168.65.176:7006@17006 slave 13ca1d78a69771be772f9dc2c49bf72291de779e 0 1621327172067 6 connected
7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621327172067 5 connected
5cc4d8633f243f1cde83098e87a052f477c2d189 192.168.65.176:7003@17003 master - 0 1621327172067 3 connected 10923-16383
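Each line of this output has the fields: node ID, address (ip:port@bus-port), flags, master ID (or - for a master), ping-sent, pong-received, config epoch, link state, and then any slot ranges. A minimal parsing sketch under that reading follows; the ClusterNode class and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterNode:
    node_id: str
    host: str
    port: int
    bus_port: Optional[int]
    flags: List[str]
    master_id: Optional[str]  # None when the node is itself a master
    link_state: str
    slots: List[str]


def parse_cluster_nodes_line(line: str) -> ClusterNode:
    """Parse one line of CLUSTER NODES output."""
    parts = line.split()
    if len(parts) < 8:
        raise ValueError(f"unexpected CLUSTER NODES line: {line!r}")
    host_port, _, bus = parts[1].partition("@")
    host, _, port = host_port.rpartition(":")
    bus = bus.split(",")[0]  # tolerate a trailing ",hostname" after the bus port
    return ClusterNode(
        node_id=parts[0],
        host=host,
        port=int(port),
        bus_port=int(bus) if bus else None,
        flags=parts[2].split(","),
        master_id=None if parts[3] == "-" else parts[3],
        link_state=parts[7],
        slots=parts[8:],
    )


if __name__ == "__main__":
    node = parse_cluster_nodes_line(
        "0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 "
        "myself,master - 0 1621327172000 1 connected 0-5460"
    )
    print(node.port, node.bus_port, node.flags, node.slots)
```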

Add a node to the Redis cluster

Create a new Redis node on the server

[root@localhost conf]# cp redis_7006.conf redis_7007.conf
[root@localhost conf]# vim redis_7007.conf 

bind 192.168.65.176
port 7007
daemonize yes
pidfile "/data/redis/redis_7007.pid"
logfile "/data/redis/logs/redis_7007.log"
dir "/data/redis/data/redis_7007"
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes_7007.conf
cluster-node-timeout 60

[root@localhost conf]# mkdir /data/redis/data/redis_7007
[root@localhost conf]# redis-server /data/redis/conf/redis_7007.conf

Add the Redis node to the cluster

192.168.65.176:7001> CLUSTER MEET 192.168.65.176 7007
OK
192.168.65.176:7001> CLUSTER NODES
048083c0ffc5309c87994402b33db74537eb45e3 192.168.65.176:7007@17007 master - 0 1621327679230 0 connected
2a4dddc7aa5c124adc0d62bd4ed55d7ce71679bf 192.168.65.176:7004@17004 slave 5cc4d8633f243f1cde83098e87a052f477c2d189 0 1621327679230 4 connected
0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 myself,master - 0 1621327679000 1 connected 0-5460
13ca1d78a69771be772f9dc2c49bf72291de779e 192.168.65.176:7002@17002 master - 0 1621327679230 2 connected 5461-10922
bec54e69084ba31da7f9f64f362229c92458ced5 192.168.65.176:7006@17006 slave 13ca1d78a69771be772f9dc2c49bf72291de779e 0 1621327679230 6 connected
7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621327679230 5 connected
5cc4d8633f243f1cde83098e87a052f477c2d189 192.168.65.176:7003@17003 master - 0 1621327679230 3 connected 10923-16383

As you can see, a newly added node joins the cluster as a master.

Change a node's role:

Make the newly added node 192.168.65.176:7007 a slave of 192.168.65.176:7001

[root@localhost conf]# redis-cli -c -h 192.168.65.176 -p 7007 -a 123456
192.168.65.176:7007> CLUSTER REPLICATE 0a856e5007487002cfb70f6c6789223a4bfc05e7
OK
192.168.65.176:7007> CLUSTER NODES
bec54e69084ba31da7f9f64f362229c92458ced5 192.168.65.176:7006@17006 slave 13ca1d78a69771be772f9dc2c49bf72291de779e 0 1621327910379 2 connected
0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 master - 0 1621327910379 1 connected 0-5460
2a4dddc7aa5c124adc0d62bd4ed55d7ce71679bf 192.168.65.176:7004@17004 slave 5cc4d8633f243f1cde83098e87a052f477c2d189 0 1621327910379 3 connected
13ca1d78a69771be772f9dc2c49bf72291de779e 192.168.65.176:7002@17002 master - 0 1621327910379 2 connected 5461-10922
5cc4d8633f243f1cde83098e87a052f477c2d189 192.168.65.176:7003@17003 master - 0 1621327910379 3 connected 10923-16383
7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621327910379 1 connected
048083c0ffc5309c87994402b33db74537eb45e3 192.168.65.176:7007@17007 myself,slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621327910000 0 connected
192.168.65.176:7007> 

Remove a node from the Redis cluster

Rules:

A node cannot forget itself, that is, the node you are logged in to.
A slave cannot forget its own master.

192.168.65.176:7007> cluster nodes
bec54e69084ba31da7f9f64f362229c92458ced5 192.168.65.176:7006@17006 slave 13ca1d78a69771be772f9dc2c49bf72291de779e 0 1621328097777 2 connected
0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 master - 0 1621328097777 1 connected 0-5460
2a4dddc7aa5c124adc0d62bd4ed55d7ce71679bf 192.168.65.176:7004@17004 slave 5cc4d8633f243f1cde83098e87a052f477c2d189 0 1621328097777 3 connected
13ca1d78a69771be772f9dc2c49bf72291de779e 192.168.65.176:7002@17002 master - 0 1621328097777 2 connected 5461-10922
5cc4d8633f243f1cde83098e87a052f477c2d189 192.168.65.176:7003@17003 master - 0 1621328097777 3 connected 10923-16383
7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621328097777 1 connected
048083c0ffc5309c87994402b33db74537eb45e3 192.168.65.176:7007@17007 myself,slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621328097000 0 connected

192.168.65.176:7007> CLUSTER FORGET 048083c0ffc5309c87994402b33db74537eb45e3
(error) ERR I tried hard but I can't forget myself...
192.168.65.176:7007> CLUSTER FORGET 0a856e5007487002cfb70f6c6789223a4bfc05e7
(error) ERR Can't forget my master!
192.168.65.176:7007> 

192.168.65.176:7007> CLUSTER FORGET bec54e69084ba31da7f9f64f362229c92458ced5
OK
192.168.65.176:7007> CLUSTER FORGET 13ca1d78a69771be772f9dc2c49bf72291de779e
OK
192.168.65.176:7007> CLUSTER NODES
0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 master - 0 1621328660369 1 connected 0-5460
2a4dddc7aa5c124adc0d62bd4ed55d7ce71679bf 192.168.65.176:7004@17004 slave 5cc4d8633f243f1cde83098e87a052f477c2d189 0 1621328660369 3 connected
5cc4d8633f243f1cde83098e87a052f477c2d189 192.168.65.176:7003@17003 master - 0 1621328660369 3 connected 10923-16383
7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621328660369 1 connected
048083c0ffc5309c87994402b33db74537eb45e3 192.168.65.176:7007@17007 myself,slave 0a856e5007487002cfb70f6c6789223a4bfc05e7 0 1621328660000 0 connected
192.168.65.176:7007> 

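The forgotten nodes disappear from this node's view here, but they come back later (7002 and 7006 are listed again in the CLUSTER NODES output in the next section). I attribute this to the corresponding configuration files not having been deleted. Note also that CLUSTER FORGET only affects the node that receives it and only blocks re-adding the forgotten node for 60 seconds; the other nodes still know the forgotten node and reintroduce it through gossip unless the command is sent to every node within that window.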
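Because CLUSTER FORGET acts only on the node that receives it, removing a node for good means sending the command to every remaining node within the 60-second window. The sketch below only builds that plan and does not contact any server; forget_plan is a hypothetical name, and the input is assumed to be (node_id, host, port) tuples taken from CLUSTER NODES.

```python
from typing import List, Tuple

NodeEntry = Tuple[str, str, int]  # (node_id, host, client_port)


def forget_plan(nodes: List[NodeEntry], target_id: str) -> List[Tuple[str, int, str]]:
    """Return (host, port, command) for every node that must forget `target_id`."""
    if all(node_id != target_id for node_id, _, _ in nodes):
        raise ValueError(f"node {target_id} is not in the node list")
    command = f"CLUSTER FORGET {target_id}"
    return [
        (host, port, command)
        for node_id, host, port in nodes
        if node_id != target_id  # a node cannot forget itself
    ]


if __name__ == "__main__":
    nodes = [
        ("0a856e5007487002cfb70f6c6789223a4bfc05e7", "192.168.65.176", 7001),
        ("7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4", "192.168.65.176", 7005),
        ("048083c0ffc5309c87994402b33db74537eb45e3", "192.168.65.176", 7007),
    ]
    for host, port, command in forget_plan(nodes, "048083c0ffc5309c87994402b33db74537eb45e3"):
        print(f"{host}:{port} -> {command}")
```

The plan does not handle the second rule above: a slave of the target will refuse with "Can't forget my master!", so such slaves have to be reassigned or removed first.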

Simulate a master node failure

[root@localhost conf]# netstat -nptl|grep 7001
tcp        0      0 192.168.65.176:7001     0.0.0.0:*               LISTEN      908/redis-server 19 
tcp        0      0 192.168.65.176:17001    0.0.0.0:*               LISTEN      908/redis-server 19 
[root@localhost conf]# kill -9 908
[root@localhost conf]# redis-cli -c -h 192.168.65.176 -p 7007 -a 123456
192.168.65.176:7007> 
192.168.65.176:7007> CLUSTER NODES
bec54e69084ba31da7f9f64f362229c92458ced5 192.168.65.176:7006@17006 slave 13ca1d78a69771be772f9dc2c49bf72291de779e 0 1621329155995 2 connected
0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 master,fail - 1621329083066 1621329082965 1 disconnected
13ca1d78a69771be772f9dc2c49bf72291de779e 192.168.65.176:7002@17002 master - 0 1621329155995 2 connected 5461-10922
2a4dddc7aa5c124adc0d62bd4ed55d7ce71679bf 192.168.65.176:7004@17004 slave 5cc4d8633f243f1cde83098e87a052f477c2d189 0 1621329155995 3 connected
5cc4d8633f243f1cde83098e87a052f477c2d189 192.168.65.176:7003@17003 master - 0 1621329155995 3 connected 10923-16383
7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 master - 0 1621329155995 7 connected 0-5460
048083c0ffc5309c87994402b33db74537eb45e3 192.168.65.176:7007@17007 myself,slave 7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 0 1621329155000 0 connected

In the 7001 row the flags are master,fail and the link state is disconnected, while in the 7005 row the former slave has become a master.

Restart the node on port 7001

[root@eureka176 conf]# redis-server redis_7001.conf 
[root@eureka176 conf]# redis-cli -c -h 192.168.65.176 -p 7007 -a 123456
192.168.65.176:7007> CLUSTER NODES 
bec54e69084ba31da7f9f64f362229c92458ced5 192.168.65.176:7006@17006 slave 13ca1d78a69771be772f9dc2c49bf72291de779e 0 1621329255956 2 connected
0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 slave 7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 0 1621329255956 7 connected
13ca1d78a69771be772f9dc2c49bf72291de779e 192.168.65.176:7002@17002 master - 0 1621329255956 2 connected 5461-10922
2a4dddc7aa5c124adc0d62bd4ed55d7ce71679bf 192.168.65.176:7004@17004 slave 5cc4d8633f243f1cde83098e87a052f477c2d189 0 1621329255956 3 connected
5cc4d8633f243f1cde83098e87a052f477c2d189 192.168.65.176:7003@17003 master - 0 1621329255956 3 connected 10923-16383
7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 master - 0 1621329255956 7 connected 0-5460
048083c0ffc5309c87994402b33db74537eb45e3 192.168.65.176:7007@17007 myself,slave 7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 0 1621329255000 0 connected
192.168.65.176:7007> 

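As you can see, node 7001 comes back as a slave after the restart, specifically a slave of 7005. In other words, when a master goes down, its slave is promoted to be the new master and keeps serving requests; when the old master restarts, it becomes a slave of the new master.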
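A failover like this can be spotted automatically by scanning CLUSTER NODES output for masters that carry the fail flag. The following is a minimal sketch with a hypothetical function name; it relies only on the flags column shown in the transcripts above.

```python
from typing import List


def failed_masters(cluster_nodes_output: str) -> List[str]:
    """Return the ip:port of every node flagged as both master and fail."""
    failed = []
    for line in cluster_nodes_output.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # skip prompts, blank lines, and anything malformed
        flags = parts[2].split(",")
        # "fail" is the confirmed failure flag; "fail?" only marks a suspected failure.
        if "master" in flags and "fail" in flags:
            failed.append(parts[1].split("@")[0])
    return failed


if __name__ == "__main__":
    output = (
        "0a856e5007487002cfb70f6c6789223a4bfc05e7 192.168.65.176:7001@17001 "
        "master,fail - 1621329083066 1621329082965 1 disconnected\n"
        "7e8d7e7ab1d54b859709ff1b3850b624bfd9e9c4 192.168.65.176:7005@17005 "
        "master - 0 1621329155995 7 connected 0-5460\n"
    )
    print(failed_masters(output))
```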

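Copyright notice: this is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.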

Original article by 老C. If you repost it, please credit the source: https://www.code404.icu/625.html
