About Databases: Building High Availability for MySQL Router


  1. Installation overview
  2. High-availability setup
  3. High availability and load balancing tests
  4. Troubleshooting

1. Installation overview

1.1 Purpose

MySQL officially provides InnoDB Cluster, which consists of MySQL MGR and MySQL Router. MySQL MGR provides automated high availability at the database layer, while MySQL Router acts as the access proxy. Once deployed, MySQL Router itself becomes a single point of failure: if it goes down, the availability of the whole database cluster is affected. To improve the availability of the database system, a high-availability scheme for MySQL Router therefore has to be built.

1.2 Components of the MySQL Router HA solution

The high-availability solution in this article is implemented mainly with two open-source projects, Corosync and Pacemaker, which together provide the communication, synchronization, resource-management and failover services a high-availability cluster needs.

1.2.1 corosync

Corosync is an open-source cluster communication and synchronization service. It handles communication and data synchronization between cluster nodes and provides reliable messaging and membership management, ensuring stable operation of the cluster in a distributed environment. Corosync communicates over a reliable UDP multicast protocol and exposes a pluggable protocol-stack interface, so it can support multiple protocols and network environments. It also provides an API that lets other applications use its communication and synchronization services.

1.2.2 pacemaker

Pacemaker is an open-source cluster resource manager and failover tool. It automatically manages resources (such as virtual IPs, file systems and databases) across cluster nodes and migrates them automatically when a node or resource fails, keeping the whole system highly available and continuous. Pacemaker supports multiple resource-management policies that can be configured for different requirements, and offers a flexible plugin framework that supports various cluster environments and use cases such as virtualization and cloud computing.

Combining Corosync and Pacemaker yields a complete high-availability cluster solution: Corosync handles communication and synchronization between the cluster nodes, while Pacemaker handles resource management and failover, keeping the whole system highly available and continuous. Together they provide reliable communication, synchronization, resource management and failover for a high-availability cluster, and form a solid foundation for building reliable, efficient distributed systems.

1.2.3 ldirectord

ldirectord is a load-balancing tool for Linux. It manages services across multiple servers and distributes client requests to one or more of them, improving service availability and performance. ldirectord is usually used together with cluster software such as Heartbeat or Keepalived to provide high availability and load balancing. Its main capabilities include:

  • Load balancing: ldirectord distributes requests according to a configurable scheduling algorithm, such as round robin, weighted round robin, least connections or source-address hashing, spreading client requests across one or more backend servers.
  • Health checks: ldirectord periodically checks the availability of the backend servers and removes unavailable ones from the server pool, keeping the service highly available and stable.
  • Session persistence: based on identifiers such as the client IP address or a cookie, ldirectord can route requests from the same client to the same backend server, so existing connections are not interrupted.
  • Dynamic configuration: backend servers and services can be added, removed or modified dynamically through the command line or the configuration file.

ldirectord was written specifically for LVS monitoring: it watches the state of the real servers in an LVS server pool. It runs as a daemon on the IPVS node and probes every real server in the pool. If a server does not respond, ldirectord considers it unavailable and removes it from the IPVS table by running ipvsadm; once a later check gets a response again, it is added back the same way, roughly as sketched below.
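For illustration only: with the addresses from the plan in section 1.3, the ipvsadm operations ldirectord performs for the 6446 service amount to something like the following (ldirectord issues these automatically; you do not run them by hand):

ipvsadm -A -t 172.17.129.1:6446 -s rr                      # define the virtual service with round-robin scheduling
ipvsadm -a -t 172.17.129.1:6446 -r 172.17.140.25:6446 -g   # add a real server in direct-routing (gate) mode
ipvsadm -a -t 172.17.129.1:6446 -r 172.17.140.24:6446 -g
ipvsadm -a -t 172.17.129.1:6446 -r 172.17.139.164:6446 -g
ipvsadm -d -t 172.17.129.1:6446 -r 172.17.139.164:6446     # what ldirectord does when a health check on gdb3 fails
ipvsadm -Ln                                                # inspect the current IPVS table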

1.3 Installation plan

Both MySQL and MySQL Router are version 8.0.32.

| IP | Hostname | Installed components | Ports used |
| --- | --- | --- | --- |
| 172.17.140.25 | gdb1 | MySQL, MySQL Router, ipvsadm, ldirectord, pcs, pacemaker, corosync | MySQL: 3309; MySQL Router: 6446, 6447; pcs TCP: 13314; pcs UDP: 13315 |
| 172.17.140.24 | gdb2 | MySQL, MySQL Router, ipvsadm, ldirectord, pcs, pacemaker, corosync | MySQL: 3309; MySQL Router: 6446, 6447; pcs TCP: 13314; pcs UDP: 13315 |
| 172.17.139.164 | gdb3 | MySQL, MySQL Router, ipvsadm, ldirectord, pcs, pacemaker, corosync | MySQL: 3309; MySQL Router: 6446, 6447; pcs TCP: 13314; pcs UDP: 13315 |
| 172.17.129.1 | VIP | | 6446, 6447 |
| 172.17.139.62 | MySQL client | | |

The installation proceeds roughly in the order of the sections below.

2. High-availability setup

2.1 Base environment setup (perform on all three servers)

  1. Set the hostname on each of the three servers according to the plan
hostnamectl set-hostname gdb1
hostnamectl set-hostname gdb2
hostnamectl set-hostname gdb3
  2. Append the following entries to the /etc/hosts file on all three servers
172.17.140.25   gdb1
172.17.140.24   gdb2
172.17.139.164  gdb3
  3. Disable the firewall on all three servers
systemctl stop firewalld
systemctl disable firewalld
  4. Disable SELinux on all three servers. If SELinux has not been disabled yet, the change to its configuration file only takes effect after the server is rebooted.

getenforce reporting Disabled indicates SELinux is fully off; the commands are sketched below.
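A minimal sketch of the usual commands (an assumption for a CentOS/RHEL 7 host; the original article does not show them):

setenforce 0                                                          # switch the running system to Permissive immediately
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config   # persist the change; fully applied after a reboot
getenforce                                                            # verify: Permissive now, Disabled after the reboot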

  5. Run the commands below on each of the three servers to set up mutual SSH trust

Mutual trust is only for convenience when transferring files between the servers; it is not a prerequisite for building the cluster.

ssh-keygen -t dsa
ssh-copy-id gdb1
ssh-copy-id gdb2
ssh-copy-id gdb3

The execution looks like this:

[#19#root@gdb1 ~ 16:16:54]19 ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):         ## just press Enter
/root/.ssh/id_dsa already exists.
Overwrite (y/n)? y                                               ## if an old key already exists, enter y to overwrite it
Enter passphrase (empty for no passphrase):                      ## just press Enter
Enter same passphrase again:                                     ## just press Enter
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:qwJXgfN13+N1U5qvn9fC8pyhA29iuXvQVhCupExzgTc root@gdb1
The key's randomart image is:
+---[DSA 1024]----+
|     .   .. ..   |
|    o . o Eo.   .|
|     o ooooo.o o.|
|      oo = .. *.o|
|     .  S .. o +o|
|  . .    .o o . .|
|   o    .  * ....|
|    .  .  + *o+o+|
|     ..  .o*.+++o|
+----[SHA256]-----+
[#20#root@gdb1 ~ 16:17:08]20 ssh-copy-id gdb1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_dsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@gdb1's password:                                            ## enter the root password of server gdb1

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'gdb1'"
and check to make sure that only the key(s) you wanted were added.

[#21#root@gdb1 ~ 16:17:22]21 ssh-copy-id gdb2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_dsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@gdb2's password:                                             ## enter the root password of server gdb2

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'gdb2'"
and check to make sure that only the key(s) you wanted were added.

[#22#root@gdb1 ~ 16:17:41]22 ssh-copy-id gdb3
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_dsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@gdb3's password:                                             ## enter the root password of server gdb3

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'gdb3'"
and check to make sure that only the key(s) you wanted were added.

[#23#root@gdb1 ~ 16:17:44]23

If you can switch between any of the servers without being asked for a password, mutual trust has been established successfully.

[#24#root@gdb1 ~ 16:21:16]24 ssh gdb1
Last login: Tue Feb 21 16:21:05 2023 from 172.17.140.25
[#1#root@gdb1 ~ 16:21:19]1 logout
Connection to gdb1 closed.
[#25#root@gdb1 ~ 16:21:19]25 ssh gdb2
Last login: Tue Feb 21 16:21:09 2023 from 172.17.140.25
[#1#root@gdb2 ~ 16:21:21]1 logout
Connection to gdb2 closed.
[#26#root@gdb1 ~ 16:21:21]26 ssh gdb3
Last login: Tue Feb 21 10:53:47 2023
[#1#root@gdb3 ~ 16:21:22]1 logout
Connection to gdb3 closed.
[#27#root@gdb1 ~ 16:21:24]27 
  6. Synchronize the clocks. Time synchronization is essential for both distributed and centralized clusters; inconsistent time leads to all kinds of anomalies (a periodic sync job is sketched below).
yum -y install ntpdate    // install the ntpdate client
ntpdate ntp1.aliyun.com   // with internet access you can use the Aliyun NTP server, or point to an internal NTP server
hwclock -w                // write the time back to the hardware (BIOS) clock
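Optionally, the sync can be repeated on a schedule so the clocks stay aligned (an assumption, not part of the original steps); add the line with crontab -e on each server:

*/30 * * * * /usr/sbin/ntpdate ntp1.aliyun.com >/dev/null 2>&1 && /usr/sbin/hwclock -w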

2.2 Build a read/write-splitting MGR cluster with MySQL Router

For details, refer to https://gitee.com/GreatSQL/GreatSQL-Doc/blob/master/deep-dive…
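In broad strokes, and as a sketch only (the linked article is the authoritative walkthrough; the account, password and port 3309 follow the plan in section 1.3, everything else should be adapted):

# Validate and configure each instance for InnoDB Cluster (MySQL Shell, JS mode); repeat for gdb2 and gdb3
mysqlsh --uri root@gdb1:3309 -e "dba.configureInstance()"

# On the primary (gdb1), create the cluster named gdbCluster and add the other members
mysqlsh --uri root@gdb1:3309 -e "dba.createCluster('gdbCluster')"
mysqlsh --uri root@gdb1:3309 -e "var c = dba.getCluster('gdbCluster'); c.addInstance('root@gdb2:3309'); c.addInstance('root@gdb3:3309')"

# On each of the three servers, bootstrap MySQL Router against the cluster
# (this generates a configuration like the one in section 2.3) and start it
mysqlrouter --bootstrap root@gdb1:3309 --user=root
mysqlrouter &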

2.3 Deploy and start MySQL Router on each of the three servers; the MySQL Router configuration file is as follows

# File automatically generated during MySQL Router bootstrap
[DEFAULT]
name=system
user=root
keyring_path=/opt/software/mysql-router-8.0.32-linux-glibc2.17-x86_64-minimal/var/lib/mysqlrouter/keyring
master_key_path=/opt/software/mysql-router-8.0.32-linux-glibc2.17-x86_64-minimal/mysqlrouter.key
connect_timeout=5
read_timeout=30
dynamic_state=/opt/software/mysql-router-8.0.32-linux-glibc2.17-x86_64-minimal/bin/../var/lib/mysqlrouter/state.json
client_ssl_cert=/opt/software/mysql-router-8.0.32-linux-glibc2.17-x86_64-minimal/var/lib/mysqlrouter/router-cert.pem
client_ssl_key=/opt/software/mysql-router-8.0.32-linux-glibc2.17-x86_64-minimal/var/lib/mysqlrouter/router-key.pem
client_ssl_mode=DISABLED
server_ssl_mode=AS_CLIENT
server_ssl_verify=DISABLED
unknown_config_option=error

[logger]
level=INFO

[metadata_cache:bootstrap]
cluster_type=gr
router_id=1
user=mysql_router1_g9c62rk29lcn
metadata_cluster=gdbCluster
ttl=0.5
auth_cache_ttl=-1
auth_cache_refresh_interval=2
use_gr_notifications=0

[routing:bootstrap_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://gdbCluster/?role=PRIMARY
routing_strategy=first-available
protocol=classic

[routing:bootstrap_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://gdbCluster/?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=classic

[routing:bootstrap_x_rw]
bind_address=0.0.0.0
bind_port=6448
destinations=metadata-cache://gdbCluster/?role=PRIMARY
routing_strategy=first-available
protocol=x

[routing:bootstrap_x_ro]
bind_address=0.0.0.0
bind_port=6449
destinations=metadata-cache://gdbCluster/?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=x

[http_server]
port=8443
ssl=1
ssl_cert=/opt/software/mysql-router-8.0.32-linux-glibc2.17-x86_64-minimal/var/lib/mysqlrouter/router-cert.pem
ssl_key=/opt/software/mysql-router-8.0.32-linux-glibc2.17-x86_64-minimal/var/lib/mysqlrouter/router-key.pem

[http_auth_realm:default_auth_realm]
backend=default_auth_backend
method=basic
name=default_realm

[rest_router]
require_realm=default_auth_realm

[rest_api]

[http_auth_backend:default_auth_backend]
backend=metadata_cache

[rest_routing]
require_realm=default_auth_realm

[rest_metadata_cache]
require_realm=default_auth_realm

2.4 Verify connectivity through all three MySQL Router instances

[#12#root@gdb2 ~ 14:12:45]12 mysql -uroot -pAbc1234567* -h172.17.140.25 -P6446 -N -e 'select now()' 2> /dev/null
+---------------------+
| 2023-03-17 14:12:46 |
+---------------------+
[#13#root@gdb2 ~ 14:12:46]13 mysql -uroot -pAbc1234567* -h172.17.140.25 -P6447 -N -e 'select now()' 2> /dev/null
+---------------------+
| 2023-03-17 14:12:49 |
+---------------------+
[#14#root@gdb2 ~ 14:12:49]14 mysql -uroot -pAbc1234567* -h172.17.140.24 -P6446 -N -e 'select now()' 2> /dev/null
+---------------------+
| 2023-03-17 14:12:52 |
+---------------------+
[#15#root@gdb2 ~ 14:12:52]15 mysql -uroot -pAbc1234567* -h172.17.140.24 -P6447 -N -e 'select now()' 2> /dev/null
+---------------------+
| 2023-03-17 14:12:55 |
+---------------------+
[#16#root@gdb2 ~ 14:12:55]16 mysql -uroot -pAbc1234567* -h172.17.139.164 -P6446 -N -e 'select now()' 2> /dev/null
+---------------------+
| 2023-03-17 14:12:58 |
+---------------------+
[#17#root@gdb2 ~ 14:12:58]17 mysql -uroot -pAbc1234567* -h172.17.139.164 -P6447 -N -e 'select now()' 2> /dev/null
+---------------------+
| 2023-03-17 14:13:01 |
+---------------------+
[#18#root@gdb2 ~ 14:13:01]18

2.5 Install Pacemaker

  1. Install pacemaker

Installing pacemaker pulls in corosync as a dependency, so installing the single pacemaker package is enough.

[#1#root@gdb1 ~ 10:05:55]1 yum -y install pacemaker
  2. Install the pcs management tool
[#1#root@gdb1 ~ 10:05:55]1 yum -y install pcs
  3. Set up the OS user used for cluster authentication: user hacluster, password abc123
[#13#root@gdb1 ~ 10:54:13]13 echo abc123 | passwd --stdin hacluster
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
  4. Start pcsd and enable it to start at boot
[#16#root@gdb1 ~ 10:55:30]16 systemctl enable pcsd
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
[#17#root@gdb1 ~ 10:56:03]17 systemctl start pcsd
[#18#root@gdb1 ~ 10:56:08]18 systemctl status pcsd
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
   Active: active (running) since 三 2023-02-22 10:56:08 CST; 6s ago
     Docs: man:pcsd(8)
           man:pcs(8)
 Main PID: 27677 (pcsd)
    Tasks: 4
   Memory: 29.9M
   CGroup: /system.slice/pcsd.service
           └─27677 /usr/bin/ruby /usr/lib/pcsd/pcsd

2 月 22 10:56:07 gdb1 systemd[1]: Starting PCS GUI and remote configuration interface...
2 月 22 10:56:08 gdb1 systemd[1]: Started PCS GUI and remote configuration interface.
[#19#root@gdb1 ~ 10:56:14]19
  5. Change the pcsd TCP port to the planned 13314
sed -i '/#PCSD_PORT=2224/a\
PCSD_PORT=13314' /etc/sysconfig/pcsd

Restart the pcsd service so the new port takes effect

[#23#root@gdb1 ~ 11:23:20]23 systemctl restart pcsd
[#24#root@gdb1 ~ 11:23:39]24 systemctl status pcsd
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
   Active: active (running) since 三 2023-02-22 11:23:39 CST; 5s ago
     Docs: man:pcsd(8)
           man:pcs(8)
 Main PID: 30041 (pcsd)
    Tasks: 4
   Memory: 27.3M
   CGroup: /system.slice/pcsd.service
           └─30041 /usr/bin/ruby /usr/lib/pcsd/pcsd

2 月 22 11:23:38 gdb1 systemd[1]: Starting PCS GUI and remote configuration interface...
2 月 22 11:23:39 gdb1 systemd[1]: Started PCS GUI and remote configuration interface.
[#25#root@gdb1 ~ 11:23:45]25
  6. Configure the cluster authentication, authenticating with the OS user hacluster
[#27#root@gdb1 ~ 11:31:43]27 cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
[#28#root@gdb1 ~ 11:32:15]28 pcs cluster auth gdb1:13314 gdb2:13314 gdb3:13314 -u hacluster -p 'abc123'
gdb1: Authorized
gdb2: Authorized
gdb3: Authorized
[#29#root@gdb1 ~ 11:33:18]29
  7. Create the cluster (run on any one node)
## cluster name gdb_ha, UDP transport with multicast port 13315, addr0 (mask bits) 24, cluster members gdb1, gdb2 and gdb3
[#31#root@gdb1 ~ 11:41:48]31 pcs cluster setup --force --name gdb_ha --transport=udp --addr0 24 --mcastport0 13315 gdb1 gdb2 gdb3
Destroying cluster on nodes: gdb1, gdb2, gdb3...
gdb1: Stopping Cluster (pacemaker)...
gdb2: Stopping Cluster (pacemaker)...
gdb3: Stopping Cluster (pacemaker)...
gdb2: Successfully destroyed cluster
gdb1: Successfully destroyed cluster
gdb3: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'gdb1', 'gdb2', 'gdb3'
gdb2: successful distribution of the file 'pacemaker_remote authkey'
gdb3: successful distribution of the file 'pacemaker_remote authkey'
gdb1: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
gdb1: Succeeded
gdb2: Succeeded
gdb3: Succeeded

Synchronizing pcsd certificates on nodes gdb1, gdb2, gdb3...
gdb1: Success
gdb2: Success
gdb3: Success
Restarting pcsd on the nodes in order to reload the certificates...
gdb1: Success
gdb2: Success
gdb3: Success
  8. Confirm the complete cluster configuration (check on any node)
[#21#root@gdb2 ~ 11:33:18]21 more /etc/corosync/corosync.conf
totem {
    version: 2
    cluster_name: gdb_ha
    secauth: off
    transport: udp
    rrp_mode: passive

    interface {
        ringnumber: 0
        bindnetaddr: 24
        mcastaddr: 239.255.1.1
        mcastport: 13315
    }
}

nodelist {
    node {
        ring0_addr: gdb1
        nodeid: 1
    }

    node {
        ring0_addr: gdb2
        nodeid: 2
    }

    node {
        ring0_addr: gdb3
        nodeid: 3
    }
}

quorum {
    provider: corosync_votequorum
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}
[#22#root@gdb2 ~ 14:23:50]22
  9. Start the Pacemaker-related services on all cluster nodes (run on any one node)
[#35#root@gdb1 ~ 15:30:51]35 pcs cluster start --all
gdb1: Starting Cluster (corosync)...
gdb2: Starting Cluster (corosync)...
gdb3: Starting Cluster (corosync)...
gdb3: Starting Cluster (pacemaker)...
gdb1: Starting Cluster (pacemaker)...
gdb2: Starting Cluster (pacemaker)...

To stop the services, use pcs cluster stop --all, or pcs cluster stop <server> to stop a single node.

  10. On every node, enable the Pacemaker-related services to start at boot
[#35#root@gdb1 ~ 15:30:51]35 systemctl enable pcsd corosync pacemaker
[#36#root@gdb1 ~ 15:30:53]36 pcs cluster enable --all
  11. When there is no STONITH device, disable the STONITH component

With STONITH disabled, the distributed lock manager (DLM) and every service that depends on it, such as cLVM2, GFS2 and OCFS2, cannot be started; if STONITH is left enabled without a device, error messages appear instead.

pcs property set stonith-enabled=false

The complete command sequence looks like this:

[#32#root@gdb1 ~ 15:48:20]32 systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since 三 2023-02-22 15:35:48 CST; 1min 54s ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
 Main PID: 25661 (pacemakerd)
    Tasks: 7
   Memory: 51.1M
   CGroup: /system.slice/pacemaker.service
           ├─25661 /usr/sbin/pacemakerd -f
           ├─25662 /usr/libexec/pacemaker/cib
           ├─25663 /usr/libexec/pacemaker/stonithd
           ├─25664 /usr/libexec/pacemaker/lrmd
           ├─25665 /usr/libexec/pacemaker/attrd
           ├─25666 /usr/libexec/pacemaker/pengine
           └─25667 /usr/libexec/pacemaker/crmd

2 月 22 15:35:52 gdb1 crmd[25667]:   notice: Fencer successfully connected
2 月 22 15:36:11 gdb1 crmd[25667]:   notice: State transition S_ELECTION -> S_INTEGRATION
2 月 22 15:36:12 gdb1 pengine[25666]:    error: Resource start-up disabled since no STONITH resources have been defined
2 月 22 15:36:12 gdb1 pengine[25666]:    error: Either configure some or disable STONITH with the stonith-enabled option
2 月 22 15:36:12 gdb1 pengine[25666]:    error: NOTE: Clusters with shared data need STONITH to ensure data integrity
2 月 22 15:36:12 gdb1 pengine[25666]:   notice: Delaying fencing operations until there are resources to manage
2 月 22 15:36:12 gdb1 pengine[25666]:   notice: Calculated transition 0, saving inputs in /var/lib/pacemaker/pengine/pe-input-0.bz2
2 月 22 15:36:12 gdb1 pengine[25666]:   notice: Configuration ERRORs found during PE processing.  Please run "crm_verify -L" to identify issues.
2 月 22 15:36:12 gdb1 crmd[25667]:   notice: Transition 0 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-0.bz2): Complete
2 月 22 15:36:12 gdb1 crmd[25667]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
[#33#root@gdb1 ~ 15:37:43]33 pcs property set stonith-enabled=false
[#34#root@gdb1 ~ 15:48:20]34 systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since 三 2023-02-22 15:35:48 CST; 12min ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
 Main PID: 25661 (pacemakerd)
    Tasks: 7
   Memory: 51.7M
   CGroup: /system.slice/pacemaker.service
           ├─25661 /usr/sbin/pacemakerd -f
           ├─25662 /usr/libexec/pacemaker/cib
           ├─25663 /usr/libexec/pacemaker/stonithd
           ├─25664 /usr/libexec/pacemaker/lrmd
           ├─25665 /usr/libexec/pacemaker/attrd
           ├─25666 /usr/libexec/pacemaker/pengine
           └─25667 /usr/libexec/pacemaker/crmd

2 月 22 15:36:12 gdb1 pengine[25666]:   notice: Calculated transition 0, saving inputs in /var/lib/pacemaker/pengine/pe-input-0.bz2
2 月 22 15:36:12 gdb1 pengine[25666]:   notice: Configuration ERRORs found during PE processing.  Please run "crm_verify -L" to identify issues.
2 月 22 15:36:12 gdb1 crmd[25667]:   notice: Transition 0 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-0.bz2): Complete
2 月 22 15:36:12 gdb1 crmd[25667]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
2 月 22 15:48:20 gdb1 crmd[25667]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
2 月 22 15:48:21 gdb1 pengine[25666]:  warning: Blind faith: not fencing unseen nodes
2 月 22 15:48:21 gdb1 pengine[25666]:   notice: Delaying fencing operations until there are resources to manage
2 月 22 15:48:21 gdb1 pengine[25666]:   notice: Calculated transition 1, saving inputs in /var/lib/pacemaker/pengine/pe-input-1.bz2
2 月 22 15:48:21 gdb1 crmd[25667]:   notice: Transition 1 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1.bz2): Complete
2 月 22 15:48:21 gdb1 crmd[25667]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
[#35#root@gdb1 ~ 15:48:31]35 
  12. Verify that the pcs cluster state is healthy and no error output appears
[#35#root@gdb1 ~ 15:48:31]35 crm_verify -L
[#36#root@gdb1 ~ 17:33:31]36

2.6 Install ldirectord (on all three servers)

  1. Download ldirectord

Download URL: https://rpm.pbone.net/info_idpl_23860919_distro_centos_6_com_…

Open it in a new tab to get the direct download link; any download tool (e.g. Thunder) will do.

  2. Download the dependency package ipvsadm
[#10#root@gdb1 ~ 19:51:20]10 wget http://mirror.centos.org/altarch/7/os/aarch64/Packages/ipvsadm-1.27-8.el7.aarch64.rpm
  3. Run the installation; if further dependencies are needed during installation, resolve them yourself
[#11#root@gdb1 ~ 19:51:29]11 yum -y install ldirectord-3.9.5-3.1.x86_64.rpm ipvsadm-1.27-8.el7.aarch64.rpm
  4. Create the configuration file /etc/ha.d/ldirectord.cf with the following content
checktimeout=3
checkinterval=1
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=no
virtual=172.17.129.1:6446
        real=172.17.140.25:6446 gate
        real=172.17.140.24:6446 gate
        real=172.17.139.164:6446 gate
        scheduler=rr
        service=mysql
        protocol=tcp
        checkport=6446
        checktype=connect
        login="root"
        passwd="Abc1234567*"
        database="information_schema"
        request="SELECT 1"
virtual=172.17.129.1:6447
        real=172.17.140.25:6447 gate
        real=172.17.140.24:6447 gate
        real=172.17.139.164:6447 gate
        scheduler=rr
        service=mysql
        protocol=tcp
        checkport=6447
        checktype=connect
        login="root"
        passwd="Abc1234567*"
        database="information_schema"
        request="SELECT 1"

Parameter notes

  • checktimeout=3: how long to wait for a backend health check before the server is considered down
  • checkinterval=1: interval between two consecutive checks
  • autoreload=yes: automatically reload the configuration when it changes, so real servers can be added or removed on the fly
  • logfile="/var/log/ldirectord.log": full path of the log file
  • quiescent=no: when a server fails, remove it from the IPVS table outright, which also breaks its existing connections
  • virtual=172.17.129.1:6446: the VIP (virtual service)
  • real=172.17.140.25:6446 gate: a real server, served in direct-routing (gate) mode
  • scheduler=rr: the scheduling algorithm; rr is round robin, wrr is weighted round robin
  • service=mysql: the service ldirectord uses when health-checking the real servers
  • protocol=tcp: the service protocol
  • checktype=connect: the method the ldirectord daemon uses to monitor the real servers
  • checkport=6446: the port used for the health check (6447 for the read-only virtual service)
  • login="root": the user name used for the health check
  • passwd="Abc1234567*": the password used for the health check
  • database="information_schema": the default database used by the health check
  • request="SELECT 1": the statement executed by the health check

Distribute the finished configuration file to the other two servers

[#22#root@gdb1 ~ 20:51:57]22 cd /etc/ha.d/
[#23#root@gdb1 /etc/ha.d 20:52:17]23 scp ldirectord.cf gdb2:`pwd`
ldirectord.cf                                                                                                                                                100% 1300     1.1MB/s   00:00    
[#24#root@gdb1 /etc/ha.d 20:52:26]24 scp ldirectord.cf gdb3:`pwd`
ldirectord.cf                                                                                                                                                100% 1300     1.4MB/s   00:00    
[#25#root@gdb1 /etc/ha.d 20:52:29]25

2.7 Configure the VIP on the loopback interface (on all three servers)

This step is needed for the load balancing to work: the VIP is configured on the lo interface of each real server so that it accepts traffic addressed to the VIP (LVS direct routing) while not answering ARP for it. Without it, load balancing does not work. The script vip.sh is shown below; placing it in the mysql_bin directory is sufficient.

#!/bin/bash 
. /etc/init.d/functions
SNS_VIP=172.17.129.1
  
case "$1" in
start) 
      ifconfig lo:0 $SNS_VIP netmask 255.255.240.0 broadcast $SNS_VIP 
#      /sbin/route add -host $SNS_VIP dev lo:0 
      echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
      echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
      echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
      echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
      sysctl -p >/dev/null 2>&1 
      echo "RealServer Start OK"
      ;; 
stop) 
     ifconfig lo:0 down 
#      route del $SNS_VIP >/dev/null 2>&1 
      echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
      echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
      echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
      echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
      echo "RealServer Stoped"
      ;; 
*) 
      echo "Usage: $0 {start|stop}"
      exit 1 
esac
exit 0

Start it:

# sh vip.sh start

Stop it:

# sh vip.sh stop

2.8 Add cluster resources (run on any one node)

  1. Add the vip resource in pcs
[#6#root@gdb1 ~ 11:27:30]6 pcs resource create vip --disabled ocf:heartbeat:IPaddr nic=eth0 ip=172.17.129.1 cidr_netmask=24 broadcast=172.17.143.255 op monitor interval=5s timeout=20s

Command breakdown

  • pcs resource create: the pcs command that creates a resource object
  • vip: the name of the virtual IP (VIP) resource object; it can be named as needed
  • --disabled: create the resource in disabled state, so that Pacemaker does not start using it before it is fully configured
  • ocf:heartbeat:IPaddr: tells Pacemaker to manage this VIP with the IPaddr agent from the heartbeat OCF provider
  • nic=eth0: the network interface the VIP is bound to
  • ip=172.17.129.1: the IP address assigned to the VIP
  • cidr_netmask=24: the VIP's netmask in CIDR notation; 24 corresponds to 255.255.255.0
  • broadcast=172.17.143.255: the broadcast address
  • op monitor interval=5s timeout=20s: the monitor operation for this resource; Pacemaker checks its state every 5 seconds and waits up to 20 seconds for a response, after which the resource is considered unavailable
  2. Add the lvs resource in pcs
[#7#root@gdb1 ~ 11:34:50]7 pcs resource create lvs --disabled ocf:heartbeat:ldirectord op monitor interval=10s timeout=10s

Command breakdown

  • pcs resource create: the pcs command that creates a resource object
  • lvs: the name of the LVS director (ldirectord) resource object; it can be named as needed
  • --disabled: create the resource in disabled state, so that Pacemaker does not start using it before it is fully configured
  • ocf:heartbeat:ldirectord: tells Pacemaker to manage the LVS load balancer with the ldirectord agent from the heartbeat OCF provider, using the /etc/ha.d/ldirectord.cf configured above
  • op monitor interval=10s timeout=10s: the monitor operation for this resource; Pacemaker checks its state every 10 seconds and waits up to 10 seconds for a response, after which the resource is considered unavailable
  3. After creation, check the resource status
[#9#root@gdb1 ~ 11:35:42]9 pcs resource show
 vip        (ocf::heartbeat:IPaddr):        Stopped (disabled)
 lvs        (ocf::heartbeat:ldirectord):        Stopped (disabled)
[#10#root@gdb1 ~ 11:35:48]10
  4. Create a resource group and add the resources to it
[#10#root@gdb1 ~ 11:37:36]10 pcs resource group add dbservice vip
[#11#root@gdb1 ~ 11:37:40]11 pcs resource group add dbservice lvs
[#12#root@gdb1 ~ 11:37:44]12 

2.9 Starting and stopping the cluster

Starting the cluster

  1. Enable the resources
# pcs resource enable vip lvs      # or: pcs resource enable dbservice

If there were earlier failures, clear them with the commands below and then enable the resources again

# pcs resource cleanup vip
# pcs resource cleanup lvs
  2. Confirm the startup state with pcs status
[#54#root@gdb1 /etc/ha.d 15:54:22]54 pcs status
Cluster name: gdb_ha
Stack: corosync
Current DC: gdb1 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Thu Feb 23 15:55:27 2023
Last change: Thu Feb 23 15:53:55 2023 by hacluster via crmd on gdb2

3 nodes configured
2 resource instances configured

Online: [gdb1 gdb2 gdb3]

Full list of resources:

 Resource Group: dbservice
     lvs        (ocf::heartbeat:ldirectord):        Started gdb2
     vip        (ocf::heartbeat:IPaddr):            Started gdb3

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[#55#root@gdb1 /etc/ha.d 15:55:27]55

Explanation of the output

Cluster name: gdb_ha: the cluster is named gdb_ha.

Stack: corosync: the cluster uses corosync as its communication stack.

Current DC: gdb1 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum: the current Designated Controller (DC) is gdb1, running version 1.1.23-1.el7_9.1-9acf116022, and the partition this node belongs to has quorum.

Last updated: Thu Feb 23 15:55:27 2023: the cluster status information was last refreshed at 2023-02-23 15:55:27.

Last change: Thu Feb 23 15:53:55 2023 by hacluster via crmd on gdb2: the cluster configuration was last changed at 2023-02-23 15:53:55 by user hacluster via crmd on node gdb2.

3 nodes configured: the cluster has 3 configured nodes.

2 resource instances configured: 2 resource instances are configured in the cluster.

Online: [gdb1 gdb2 gdb3]: nodes gdb1, gdb2 and gdb3 are currently online.

Full list of resources: lists all resources in the cluster, including their names, types, the nodes they run on, and their current state. Here dbservice is the resource group, lvs is a resource of type ocf:heartbeat:ldirectord, and vip is a resource of type ocf:heartbeat:IPaddr.

Daemon Status: the state of the Pacemaker-related daemons. corosync, pacemaker and pcsd are all active/enabled, meaning they are running and enabled at boot.

  3. On the gdb3 server, where the pcs status output above shows vip Started gdb3, start the ldirectord service
[#19#root@gdb3 ~ 11:50:51]19 systemctl start ldirectord
[#20#root@gdb3 ~ 11:50:58]20 
[#20#root@gdb3 ~ 11:50:59]20 systemctl status ldirectord
● ldirectord.service - LSB: Control Linux Virtual Server via ldirectord on non-heartbeat systems
   Loaded: loaded (/etc/rc.d/init.d/ldirectord; bad; vendor preset: disabled)
   Active: active (running) since 四 2023-02-23 11:50:58 CST; 2s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1472 ExecStop=/etc/rc.d/init.d/ldirectord stop (code=exited, status=0/SUCCESS)
  Process: 1479 ExecStart=/etc/rc.d/init.d/ldirectord start (code=exited, status=0/SUCCESS)
    Tasks: 1
   Memory: 15.8M
   CGroup: /system.slice/ldirectord.service
           └─1484 /usr/bin/perl -w /usr/sbin/ldirectord start

2 月 23 11:50:58 gdb3 ldirectord[1479]: at /usr/sbin/ldirectord line 838.
2 月 23 11:50:58 gdb3 ldirectord[1479]: Subroutine main::unpack_sockaddr_in6 redefined at /usr/share/perl5/vendor_perl/Exporter.pm line 66.
2 月 23 11:50:58 gdb3 ldirectord[1479]: at /usr/sbin/ldirectord line 838.
2 月 23 11:50:58 gdb3 ldirectord[1479]: Subroutine main::sockaddr_in6 redefined at /usr/share/perl5/vendor_perl/Exporter.pm line 66.
2 月 23 11:50:58 gdb3 ldirectord[1479]: at /usr/sbin/ldirectord line 838.
2 月 23 11:50:58 gdb3 ldirectord[1479]: Subroutine main::pack_sockaddr_in6 redefined at /usr/sbin/ldirectord line 3078.
2 月 23 11:50:58 gdb3 ldirectord[1479]: Subroutine main::unpack_sockaddr_in6 redefined at /usr/sbin/ldirectord line 3078.
2 月 23 11:50:58 gdb3 ldirectord[1479]: Subroutine main::sockaddr_in6 redefined at /usr/sbin/ldirectord line 3078.
2 月 23 11:50:58 gdb3 ldirectord[1479]: success
2 月 23 11:50:58 gdb3 systemd[1]: Started LSB: Control Linux Virtual Server via ldirectord on non-heartbeat systems.
[#21#root@gdb3 ~ 11:51:01]21 

With the steps above, the cluster startup is complete.

Stopping the cluster

  1. Disable the resources
# pcs resource disable vip lvs     # or: pcs resource disable dbservice
# systemctl stop corosync pacemaker pcsd ldirectord

Tearing down the cluster

# pcs cluster stop
# pcs cluster destroy
# systemctl stop pcsd pacemaker corosync ldirectord
# systemctl disable pcsd pacemaker corosync ldirectord
# yum remove -y pacemaker corosync pcs ldirectord
# rm -rf /var/lib/pcsd/* /var/lib/corosync/*
# rm -f /etc/ha.d/ldirectord.cf

3. High availability and load balancing tests

  1. On 172.17.139.62, access the VIP in a for loop and observe the load balancing

Note: the VIP cannot be accessed from the real servers themselves, so a fourth machine is needed for this verification.

# for x in {1..100}; do mysql -uroot -pAbc1234567* -h172.17.129.1 -P6446 -N -e 'select sleep(60)' 2> /dev/null & done

On the server where the pcs lvs resource is running, execute ipvsadm -Ln

[#26#root@gdb1 ~ 15:52:28]26 ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.17.129.1:6446 rr
  -> 172.17.139.164:6446          Route   1      33         0         
  -> 172.17.140.24:6446           Route   1      34         0         
  -> 172.17.140.25:6446           Route   1      33         0         
TCP  172.17.129.1:6447 rr
  -> 172.17.139.164:6447          Route   1      0          0         
  -> 172.17.140.24:6447           Route   1      0          0         
  -> 172.17.140.25:6447           Route   1      0          0         
[#27#root@gdb1 ~ 15:52:29]27 

You can see that the requests are evenly balanced across the servers.

On each server, confirm the connections with netstat -alntp | grep 172.17.139.62, where 172.17.139.62 is the IP address that initiated the requests.

[#28#root@gdb1 ~ 15:53:10]28 netstat -alntp| grep 172.17.139.62 | grep 6446
tcp        0      0 172.17.129.1:6446       172.17.139.62:54444     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54606     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54592     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54492     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54580     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54432     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54586     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54552     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54404     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54566     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54516     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54560     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54450     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54480     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54540     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54522     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54462     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54528     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54534     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54598     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54498     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54426     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54510     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54504     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54412     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54612     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54456     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54468     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54474     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54486     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54574     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54438     ESTABLISHED 1902/./mysqlrouter  
tcp        0      0 172.17.129.1:6446       172.17.139.62:54546     ESTABLISHED 1902/./mysqlrouter  
[#29#root@gdb1 ~ 15:53:13]29
  2. Stop MySQL Router on gdb3, issue 100 new requests, and observe how they are routed
[#29#root@gdb1 ~ 15:55:02]29 ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.17.129.1:6446 rr
  -> 172.17.140.24:6446           Route   1      0          34        
  -> 172.17.140.25:6446           Route   1      0          33        
TCP  172.17.129.1:6447 rr
  -> 172.17.140.24:6447           Route   1      0          0         
  -> 172.17.140.25:6447           Route   1      0          0         
[#30#root@gdb1 ~ 15:55:03]30 
[#30#root@gdb1 ~ 15:55:21]30 ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.17.129.1:6446 rr
  -> 172.17.140.24:6446           Route   1      0          34        
  -> 172.17.140.25:6446           Route   1      0          33        
TCP  172.17.129.1:6447 rr
  -> 172.17.140.24:6447           Route   1      50         0         
  -> 172.17.140.25:6447           Route   1      50         0         
[#31#root@gdb1 ~ 15:55:21]31

The output above shows that after MySQL Router on gdb3 was stopped, its forwarding rule was removed from the pool, and the 100 newly issued requests were evenly distributed across the remaining two servers, which matches the expected behaviour.

4. Troubleshooting

  1. pcs cluster fails to start
# pcs cluster start --all 
Error: unable to connect to [node], try setting higher timeout in --request-timeout option

Add the timeout option and start again

# pcs cluster start --all --request-timeout 120000
# pcs cluster enable --all

It may also be that the pcsd service on another node did not start successfully; start pcsd on the other nodes first and then start the pcs cluster, for example as sketched below.
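A quick check and start across all nodes (a convenience sketch that relies on the SSH trust set up in section 2.1):

for h in gdb1 gdb2 gdb3; do
    ssh $h 'hostname; systemctl is-active pcsd || systemctl start pcsd'
done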

  2. A two-node pcs cluster needs the quorum (voting) requirement disabled
# pcs property set no-quorum-policy=ignore
  3. Check the log files: if startup or operation fails, the following two logs help analyze the root cause
# tail -n 30 /var/log/ldirectord.log
# tail -n 30 /var/log/pacemaker.log
  4. pcs status shows an OFFLINE node
[#4#root@gdb1 ~ 11:21:23]4 pcs status
Cluster name: db_ha_lvs
Stack: corosync
Current DC: gdb2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Thu Mar  2 11:21:27 2023
Last change: Wed Mar  1 16:01:56 2023 by root via cibadmin on gdb1

3 nodes configured
2 resource instances configured (2 DISABLED)

Online: [gdb1 gdb2]
OFFLINE: [gdb3]

Full list of resources:

 Resource Group: dbservice
     vip        (ocf::heartbeat:IPaddr):        Stopped (disabled)
     lvs        (ocf::heartbeat:ldirectord):        Stopped (disabled)

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[#5#root@gdb1 ~ 11:21:27]5
  5. A node drops out of the cluster after running sh vip.sh start
[#28#root@gdb3 /data/dbscale/lvs 10:06:10]28 ifconfig -a
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.139.164  netmask 255.255.240.0  broadcast 172.17.143.255
        inet6 fe80::216:3eff:fe07:3778  prefixlen 64  scopeid 0x20<link>
        ether 00:16:3e:07:37:78  txqueuelen 1000  (Ethernet)
        RX packets 17967625  bytes 2013372790 (1.8 GiB)
        RX errors 0  dropped 13  overruns 0  frame 0
        TX packets 11997866  bytes 7616182902 (7.0 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 177401  bytes 16941285 (16.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 177401  bytes 16941285 (16.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255
        ether 52:54:00:96:cf:dd  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

virbr0-nic: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 52:54:00:96:cf:dd  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

However, the real servers are 172.17.140.24, 172.17.140.25 and 172.17.139.164, and with netmask 255.255.240.0 on lo:0 they could not communicate; after changing it to 255.255.0.0 and starting again, access worked normally. The corrected script:

#!/bin/bash 
. /etc/init.d/functions
SNS_VIP=172.17.129.1
  
case "$1" in
start) 
      ifconfig lo:0 $SNS_VIP netmask 255.255.0.0 broadcast $SNS_VIP 
#      /sbin/route add -host $SNS_VIP dev lo:0 
      echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
      echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
      echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
      echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
      sysctl -p >/dev/null 2>&1 
      echo "RealServer Start OK"
      ;; 
stop) 
     ifconfig lo:0 down 
#      route del $SNS_VIP >/dev/null 2>&1 
      echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
      echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
      echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
      echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
      echo "RealServer Stoped"
      ;; 
*) 
      echo "Usage: $0 {start|stop}"
      exit 1 
esac
exit 0

Enjoy GreatSQL :)

## About GreatSQL

GreatSQL is a MySQL branch maintained by Wanli Database, focused on improving the reliability and performance of MGR. It supports the InnoDB parallel query feature and is a MySQL branch suited to financial-grade applications.

Related links: GreatSQL community on Gitee, GitHub, Bilibili

GreatSQL community:

Details of the community blog call for contributions (with rewards): https://greatsql.cn/thread-100-1-1.html

Technical discussion group:

WeChat: scan the QR code to add the GreatSQL community assistant as a friend and send the verification message to join the group
