- GreatSQL社区原创内容未经受权不得随便应用,转载请分割小编并注明起源。
1、问题形容
在做MGR测试的时候偶然遇到gtid_executed事务ID不间断的问题,然而并不影响数据库的失常运行。景象如下
GreatDB Cluster[sysbench]> select @@gtid_executed;+-----------------------------------------------------------------+| @@gtid_executed |+-----------------------------------------------------------------+| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-1286:2052-2290:3052-3277 |+-----------------------------------------------------------------+1 row in set (0.00 sec)
2、确认起因
查看官网文档发现以下参数:
group_replication_gtid_assignment_block_size
以下是对官网文档的翻译和了解
group_replication_gtid_assignment_block_size为每个成员保留的间断GTID数。每个成员从中进行耗费,并在其须要的时候获取更多的GTID数(相似于分布式事务中的全局序列,该零碎变量设置的值示意每个成员每一次从全局序列中获取多大范畴的间断数字范畴来作为本身写事务的GTID号)。
该零碎变量是组范畴的配置设置,它必须在所有组成员上设置雷同的值,在组复制运行时不容许批改,要使得批改值失效,须要齐全从新疏导组(应用零碎变量group_replication_bootstrap_group= on来从新疏导组)。
全局变量,动静变量,整型类型,默认值为1000000,取值范畴:32位平台为1~4294967295,64位平台为1~9223372036854775807,MySQL 5.7.17版本引入。
官网文档地址:https://dev.mysql.com/doc/ref...
通过对文档理解到MGR会为每个实例节点调配一段间断的GTID值,所以狐疑是MGR产生了主从切换,从而导致GTID的事务ID不间断
2.1、复现问题
为不便测试批改group_replication_gtid_assignment_block_size为1000,并重启MGR
GreatDB Cluster[(none)]> set persist group_replication_gtid_assignment_block_size=1000;Query OK, 0 rows affected (0.00 sec)GreatDB Cluster[(none)]> set global group_replication_bootstrap_group=on;Query OK, 0 rows affected (0.00 sec)GreatDB Cluster[(none)]> start group_replication;Query OK, 0 rows affected (2.45 sec)GreatDB Cluster[(none)]> set global group_replication_bootstrap_group=off;Query OK, 0 rows affected (0.00 sec)
把82切换为主节点而后通过sysbench模仿业务
GreatDB Cluster[sysbench]> select @@gtid_executed;+---------------------------------------------+| @@gtid_executed |+---------------------------------------------+| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-5445 |+---------------------------------------------+1 row in set (0.00 sec)GreatDB Cluster[sysbench]> select * from performance_schema.replication_group_members;+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+| group_replication_applier | cf43d5d7-7354-11ec-af9a-080027fa2d35 | 172.16.50.81 | 4444 | ONLINE | PRIMARY | 8.0.25 || group_replication_applier | cf520b3b-7354-11ec-b785-08002792d155 | 172.16.50.82 | 4444 | ONLINE | SECONDARY | 8.0.25 || group_replication_applier | cf85763c-7354-11ec-898d-0800276e4bea | 172.16.50.83 | 4444 | ONLINE | SECONDARY | 8.0.25 |+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+3 rows in set (0.00 sec)GreatDB Cluster[sysbench]> SELECT group_replication_set_as_primary('cf520b3b-7354-11ec-b785-08002792d155');+--------------------------------------------------------------------------+| group_replication_set_as_primary('cf520b3b-7354-11ec-b785-08002792d155') |+--------------------------------------------------------------------------+| Primary server switched to: cf520b3b-7354-11ec-b785-08002792d155 |+--------------------------------------------------------------------------+1 row in set (1.38 sec)GreatDB Cluster[sysbench]> select * from performance_schema.replication_group_members;+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+| group_replication_applier | cf43d5d7-7354-11ec-af9a-080027fa2d35 | 172.16.50.81 | 4444 | ONLINE | SECONDARY | 8.0.25 || group_replication_applier | cf520b3b-7354-11ec-b785-08002792d155 | 172.16.50.82 | 4444 | ONLINE | PRIMARY | 8.0.25 || group_replication_applier | cf85763c-7354-11ec-898d-0800276e4bea | 172.16.50.83 | 4444 | ONLINE | SECONDARY | 8.0.25 |+---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+3 rows in set (0.00 sec)
模仿业务在新的主节点写业务
# sysbench ./oltp_read_write.lua --mysql-db=sysbench --mysql-host=172.16.50.82 --mysql-port=4444 --mysql-user=greatdb --mysql-password=greatdb --tables=3 --table_size=1000000 --report-interval=2 --threads=3 --db-driver=mysql --skip-trx=off --db-ps-mode=disable --create-secondary=off --time=10 --mysql-ignore-errors=9001,9002,9000,1062,8530,8532 run
查看以后的gtid_executed
GreatDB Cluster[sysbench]> select @@gtid_executed;+-------------------------------------------------------+| @@gtid_executed |+-------------------------------------------------------+| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-5445:6053-6268 |+-------------------------------------------------------+1 row in set (0.00 sec)
2.2、空洞对理论应用的影响
模仿业务继续在82节点写业务
sysbench ./oltp_read_write.lua --mysql-db=sysbench --mysql-host=172.16.50.82 --mysql-port=4444 --mysql-user=greatdb --mysql-password=greatdb --tables=3 --table_size=1000000 --report-interval=2 --threads=3 --db-driver=mysql --skip-trx=off --db-ps-mode=disable --create-secondary=off --time=100 --mysql-ignore-errors=9001,9002,9000,1062,8530,8532 run
查看以后的gtid_executed,一段时间后gtid_executed复原间断
GreatDB Cluster[sysbench]> select @@gtid_executed;+-------------------------------------------------------+| @@gtid_executed |+-------------------------------------------------------+| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-5451:6053-6659 |+-------------------------------------------------------+1 row in set (0.00 sec)GreatDB Cluster[sysbench]> select @@gtid_executed;+---------------------------------------------+| @@gtid_executed |+---------------------------------------------+| 5cd1a64d-7358-11ec-b349-080027fa2d35:1-6665 |+---------------------------------------------+1 row in set (0.01 sec)
解析binlog,剖析gtid生成工夫
SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6268'/*!*/;# at 173167217#220112 15:12:29 server id 1 end_log_pos 173167287 Query thread_id=19 exec_time=0 error_code=0SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6269'/*!*/;# at 173169472#220112 15:14:29 server id 1 end_log_pos 173169542 Query thread_id=19 exec_time=1 error_code=0SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6659'/*!*/;# at 174048922#220112 15:14:50 server id 1 end_log_pos 174048992 Query thread_id=19 exec_time=0 error_code=0SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:6660'/*!*/;# at 175419962#220112 15:15:04 server id 1 end_log_pos 175420032 Query thread_id=19 exec_time=0 error_code=0SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:5445'/*!*/;# at 172680137#220112 15:06:15 server id 1 end_log_pos 172680207 Query thread_id=19 SET @@SESSION.GTID_NEXT= '5cd1a64d-7358-11ec-b349-080027fa2d35:5446'/*!*/;# at 174051177#220112 15:14:50 server id 1 end_log_pos 174051247 Query thread_id=19
剖析发现事务id先从6268增长到6659而后开始补空洞补完后持续从6659开始增长
3、总结
group_replication_gtid_assignment_block_size为每个成员保留的间断GTID数。每个成员从中进行耗费,并在其须要的时候获取更多的GTID数(相似于分布式事务中的全局序列,该零碎变量设置的值示意每个成员每一次从全局序列中获取多大范畴的间断数字范畴来作为本身写事务的GTID号)。
举个例子,集群中有2个节点,group_replication_gtid_assignment_block_size为1000,那么为节点A调配的Gtid_set为group_name:1-1000,节点B调配的Gtid_set为group_name:1001-2000。
则group_name:1-1000和group_name:1001-2000别离作为Gtid_set保留在member_gtids上。A节点的事务T1认证通过后,调配gtid为group_name:1,接着A节点事务T2调配group_name:2,而后B节点事务进入认证模块,认证通过后,为其调配group_name:1001,每调配一次gtid则gtids_assigned_in_blocks_counter增一。当产生主从切换时候,节点B会从1001开始记录gtid,所以会造成MGR的gtid_executed有时是不间断的多段,如aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-2:1001-1005若调配次数gtids_assigned_in_blocks_counter已达到gtid_assignment_block_size,则须要compute_group_available_gtid_intervals()从新计算。基于member_uuid找到该成员可用的gtid区间,若还没为该成员调配gtid,则调用reserve_gtid_block()进行调配。须要留神的是,reserve_gtid_block()是最多调配而不是肯定调配block_size大小的gtid序列,是否等于block_size依赖于group_available_gtid_intervals的第一个可用的间断gtid序列大小是否等于或大于block_size。
Enjoy GreatSQL :)
文章举荐:
GreatSQL MGR FAQ
https://mp.weixin.qq.com/s/J6...
万答#12,MGR整个集群挂掉后,如何能力主动选主,不必手动干涉
https://mp.weixin.qq.com/s/07...
『2021数据技术嘉年华·ON LINE』:《MySQL高可用架构演进及实际》
https://mp.weixin.qq.com/s/u7...
一条sql语句慢在哪之抓包剖析
https://mp.weixin.qq.com/s/AY...
万答#15,都有哪些状况可能导致MGR服务无奈启动
https://mp.weixin.qq.com/s/in...
技术分享 | 为什么MGR一致性模式不举荐AFTER
https://mp.weixin.qq.com/s/rN...
对于 GreatSQL
GreatSQL是由万里数据库保护的MySQL分支,专一于晋升MGR可靠性及性能,反对InnoDB并行查问个性,是实用于金融级利用的MySQL分支版本。
Gitee:
https://gitee.com/GreatSQL/Gr...
GitHub:
https://github.com/GreatSQL/G...
Bilibili:
https://space.bilibili.com/13...
微信&QQ群:
可搜寻增加GreatSQL社区助手微信好友,发送验证信息“加群”退出GreatSQL/MGR交换微信群
QQ群:533341697
微信小助手:wanlidbc
本文由博客一文多发平台 OpenWrite 公布!