共计 16071 个字符,预计需要花费 41 分钟才能阅读完成。
本文解说了 MySQL 8 在复制观测性上带来更丰盛的观测信息。
作者:Frederic Descamps MySQL 社区经理
本文起源:Oracle MySQL 官网博客
- 爱可生开源社区出品。
许多经验丰富的 MySQL DBA 都应用过 SHOW REPLICA STATUS
输入中的 Seconds_Behind_Source
来判断(异步)复制的运行状态。
留神:这里应用新的术语 REPLICA。我置信所有人都应用过旧的术语。
然而,MySQL 复制机制曾经有很大提高,复制性能团队也做了很多工作,可能为 MySQL 中所有可用的复制模式提供的更丰盛观测信息。
例如,咱们减少了并行复制,组复制等,然而这些信息都无奈从原来的 SHOW REPLICA STATUS
输入中看到。
零碎库 Performance_Schema
提供了比 SHOW REPLICA STATUS
更好的监控和察看复制过程的形式。
以后,Performance_Schema
中有 15 张表用于记录复制信息量化:
+------------------------------------------------------+
| Tables_in_performance_schema (replication%) |
+------------------------------------------------------+
| replication_applier_configuration |
| replication_applier_filters |
| replication_applier_global_filters |
| replication_applier_status |
| replication_applier_status_by_coordinator |
| replication_applier_status_by_worker |
| replication_asynchronous_connection_failover |
| replication_asynchronous_connection_failover_managed |
| replication_connection_configuration |
| replication_connection_status |
| replication_group_communication_information |
| replication_group_configuration_version |
| replication_group_member_actions |
| replication_group_member_stats |
| replication_group_members |
+------------------------------------------------------+
15 rows in set (0.0038 sec)
但不容置疑,了解这些指标的含意并找出对咱们 MySQL DBA 真正有意义的信息也不总是那么容易:例如,正本与数据源是否有提早?
我筹备了一些视图,能够装置在 sys schema
中,利用这些指标为咱们 DBA 提供相干信息。
咱们来更认真地看一下这些视图。
复制提早
select * from sys.replication_lag;
+---------------------------+-----------------------+------------------------+
| channel_name | max_lag_from_original | max_lag_from_immediate |
+---------------------------+-----------------------+------------------------+
| clusterset_replication | 00:00:04.963223 | 00:00:04.940782 |
| group_replication_applier | 0 | 0 |
+---------------------------+-----------------------+------------------------+
从下面的输入中,咱们能够看到此实例是一个异步复制的正本,但它也是组复制集群的一部分。
事实上,这是 InnoDB ClusterSet 中 DR 集群的次要成员。
咱们还能够看到这个正本的早退差不多有 5 秒。
而后,咱们会看到复制通道的名称,以及与原始提交者和间接源(在级联复制的状况下)的最大提早(因为在并行复制的状况下可能有多个工作线程)。
在组复制集群(InnoDB Cluster)的 Secondary 节点上,咱们能够看到以下输入:
select * from sys.replication_lag;
+----------------------------+-----------------------+------------------------+
| channel_name | max_lag_from_original | max_lag_from_immediate |
+----------------------------+-----------------------+------------------------+
| group_replication_recovery | null | null |
| group_replication_applier | 00:00:02.733008 | 00:00:02.733008 |
+----------------------------+-----------------------+------------------------+
咱们能够看到,用于复原的通道(当节点退出组时读取失落的二进制日志事件、事务)没有被应用,而组复制的应用程序绝对源节点稍有滞后。
复制状态
这个视图更残缺,每个工作线程都有一行。
以咱们 DR 站点 InnoDB ClusterSet 中的 Primary 节点为例:
select * from replication_status;
+-------------------------------+----------+----------+---------+-------------------+--------------------+
| channel | io_state | co_state | w_state | lag_from_original | lag_from_immediate |
+-------------------------------+----------+----------+---------+-------------------+--------------------+
| group_replication_applier (1) | ON | ON | ON | none | none |
| group_replication_applier (2) | ON | ON | ON | none | none |
| group_replication_applier (3) | ON | ON | ON | none | none |
| group_replication_applier (4) | ON | ON | ON | none | none |
| clusterset_replication (1) | ON | ON | ON | 00:00:15.395870 | 00:00:15.380884 |
| clusterset_replication (2) | ON | ON | ON | 00:00:15.395686 | 00:00:15.380874 |
| clusterset_replication (3) | ON | ON | ON | 00:00:15.411204 | 00:00:15.388451 |
| clusterset_replication (4) | ON | ON | ON | 00:00:15.406154 | 00:00:15.388434 |
+-------------------------------+----------+----------+---------+-------------------+--------------------+
能够看到,集群并行(异步)复制应用 4 个并行工作线程。
你可能留神到有 3 种状态(所有都为 ON)。但应用 SHOW REPLICA STATUS
咱们只能看到:
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
通过并行复制,咱们在利用 binlog 事件期间有另一个线程参加复制:coordinator 线程。
残缺复制状态信息
当然,咱们也能够获取更多对于复制的详细信息。
来看一个后果例子:
select * from sys.replication_status_full\G
*************************** 1. row ***************************
channel: group_replication_applier (1)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 784.00 us
last_applied_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 2. row ***************************
channel: group_replication_applier (2)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 0 ps
last_applied_transaction:
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 3. row ***************************
channel: group_replication_applier (3)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 0 ps
last_applied_transaction:
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 4. row ***************************
channel: group_replication_applier (4)
host: <NULL>
port: 0
user:
source_uuid: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
group_name: 7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e
last_heartbeat_timestamp: 0000-00-00 00:00:00.000000
heartbeat_interval: 30
io_state: ON
io_thread_state: NULL
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Replica has read all relay log; waiting for more updates
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: Waiting for an event from Coordinator
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 03:36:40.474223
applier_busy_state: IDLE
lag_from_original: none
lag_from_immediate: none
transport_time: 1.80 us
time_to_relay_log: 12.00 us
apply_time: 0 ps
last_applied_transaction:
last_queued_transaction: 7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:3
queued_gtid_set_to_apply:
*************************** 5. row ***************************
channel: clusterset_replication (1)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.799071
lag_from_immediate: 00:00:01.783404
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 14.63 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105180
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
*************************** 6. row ***************************
channel: clusterset_replication (2)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.797743
lag_from_immediate: 00:00:01.783390
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 21.47 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105181
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
*************************** 7. row ***************************
channel: clusterset_replication (3)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.786087
lag_from_immediate: 00:00:01.767563
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 21.58 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105182
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
*************************** 8. row ***************************
channel: clusterset_replication (4)
host: 127.0.0.1
port: 3310
user: mysql_innodb_cs_b0adbc6c
source_uuid: 2cb77a02-40eb-11ee-83f4-c8cb9e32df8e
group_name:
last_heartbeat_timestamp: 2023-08-22 18:48:41.037817
heartbeat_interval: 30
io_state: ON
io_thread_state: Waiting for source to send event
io_errno: 0
io_errmsg:
io_errtime: 0000-00-00 00:00:00.000000
co_state: ON
co_thread_state: Waiting for replica workers to process their queues
co_errno: 0
co_errmsg:
co_errtime: 0000-00-00 00:00:00.000000
w_state: ON
w_thread_state: waiting for handler commit
w_errno: 0
w_errmsg:
w_errtime: 0000-00-00 00:00:00.000000
time_since_last_message: 00:00:00.001134
applier_busy_state: APPLYING
lag_from_original: 00:00:01.785881
lag_from_immediate: 00:00:01.767550
transport_time: 2.26 ms
time_to_relay_log: 19.00 us
apply_time: 29.59 ms
last_applied_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105183
last_queued_transaction: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105547
queued_gtid_set_to_apply: 54d83026-40eb-11ee-a5d3-c8cb9e32df8e:105184-105547
通过这个视图,咱们能够取得更多详细信息,例如复制心跳。咱们还对 GTID(排队、利用等)进行了概述。
咱们也能够看到传输工夫(网络)、写入中继日志的工夫以及最终利用的工夫等信息。
当然,您能够依据须要来抉择视图的显示内容,例如:
InnoDB 集群,ClusterSet,只读正本
如果应用由 Admin API 和 MySQL Shell 治理的集成解决方案,那么所有这些信息曾经通过 status()
办法能够获知。
status()
办法能够扩大 3 个值:
- 返回集群元数据版本、组协定版本、组名称、集群成员 UUID、成员角色和状态(由组复制报告),及被隔离的零碎变量列表。
- 返回每个连贯和利用程序处理的事务信息。
- 返回每个集群成员复制机制更具体的统计信息。
上面来看一下扩大选项 3 的 ClusterSet 示例:
JS> cs.status({extended:3})
{
"clusters": {
"cluster2": {
"clusterRole": "REPLICA",
"clusterSetReplication": {
"applierQueuedTransactionSet": "54d83026-40eb-11ee-a5d3-c8cb9e32df8e:137385-138500",
"applierQueuedTransactionSetSize": 1116,
"applierState": "ON",
"applierStatus": "APPLYING",
"applierThreadState": "waiting for handler commit",
"applierWorkerThreads": 4,
"coordinatorState": "ON",
"coordinatorThreadState": "Waiting for replica workers to process their queues",
"options": {
"connectRetry": 3,
"delay": 0,
"heartbeatPeriod": 30,
"retryCount": 10
},
"receiver": "127.0.0.1:4420",
"receiverStatus": "ON",
"receiverThreadState": "Waiting for source to send event",
"receiverTimeSinceLastMessage": "00:00:00.002737",
"replicationSsl": null,
"source": "127.0.0.1:3310"
},
"clusterSetReplicationStatus": "OK",
"communicationStack": "MYSQL",
"globalStatus": "OK",
"groupName": "7b6bf13d-40ed-11ee-bfdd-c8cb9e32df8e",
"groupViewChangeUuid": "7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e",
"paxosSingleLeader": "OFF",
"receivedTransactionSet": "54d83026-40eb-11ee-a5d3-c8cb9e32df8e:129-138500",
"ssl": "REQUIRED",
"status": "OK_NO_TOLERANCE",
"statusText": "Cluster is NOT tolerant to any failures.",
"topology": {
"127.0.0.1:4420": {
"address": "127.0.0.1:4420",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "c3d726ac-40ec-11ee-ab38-c8cb9e32df8e",
"memberRole": "PRIMARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:05.420247",
"replicationLagFromOriginalSource": "00:00:05.433548",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
},
"127.0.0.1:4430": {
"address": "127.0.0.1:4430",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "709b15ea-40ed-11ee-a9b3-c8cb9e32df8e",
"memberRole": "SECONDARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:00.038075",
"replicationLagFromOriginalSource": "00:00:05.432536",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
}
},
"transactionSet": "2cb77a02-40eb-11ee-83f4-c8cb9e32df8e:1-4,54d83026-40eb-11ee-a5d3-c8cb9e32df8e:1-137384,54d8329c-40eb-11ee-a5d3-c8cb9e32df8e:1-5,7b6bf4f0-40ed-11ee-bfdd-c8cb9e32df8e:1-3",
"transactionSetConsistencyStatus": "OK",
"transactionSetErrantGtidSet": "","transactionSetMissingGtidSet":"54d83026-40eb-11ee-a5d3-c8cb9e32df8e:137385-138552"},"myCluster": {"clusterRole":"PRIMARY","communicationStack":"MYSQL","globalStatus":"OK","groupName":"54d83026-40eb-11ee-a5d3-c8cb9e32df8e","groupViewChangeUuid":"54d8329c-40eb-11ee-a5d3-c8cb9e32df8e","paxosSingleLeader":"OFF","primary":"127.0.0.1:3310","ssl":"REQUIRED","status":"OK","statusText":"Cluster is ONLINE and can tolerate up to ONE failure.","topology": {"127.0.0.1:3310": {"address":"127.0.0.1:3310","applierWorkerThreads": 4,"fenceSysVars": [],"memberId":"2cb77a02-40eb-11ee-83f4-c8cb9e32df8e","memberRole":"PRIMARY","memberState":"ONLINE","mode":"R/W","readReplicas": {"127.0.0.1:4410": {"address":"127.0.0.1:4410","applierStatus":"APPLYING","applierThreadState":"waiting for handler commit","applierWorkerThreads": 4,"receiverStatus":"ON","receiverThreadState":"Waiting for source to send event","replicationSources": ["PRIMARY"],"replicationSsl":"TLS_AES_256_GCM_SHA384 TLSv1.3","role":"READ_REPLICA","status":"ONLINE","version":"8.1.0"
}
},
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
},
"127.0.0.1:3320": {
"address": "127.0.0.1:3320",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "327cb102-40eb-11ee-9904-c8cb9e32df8e",
"memberRole": "SECONDARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:04.536190",
"replicationLagFromOriginalSource": "00:00:04.536190",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
},
"127.0.0.1:3330": {
"address": "127.0.0.1:3330",
"applierWorkerThreads": 4,
"fenceSysVars": [
"read_only",
"super_read_only"
],
"memberId": "3d141d7e-40eb-11ee-933b-c8cb9e32df8e",
"memberRole": "SECONDARY",
"memberState": "ONLINE",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "00:00:04.652745",
"replicationLagFromOriginalSource": "00:00:04.652745",
"role": "HA",
"status": "ONLINE",
"version": "8.1.0"
}
},
"transactionSet": "2cb77a02-40eb-11ee-83f4-c8cb9e32df8e:1-4,54d83026-40eb-11ee-a5d3-c8cb9e32df8e:1-138552,54d8329c-40eb-11ee-a5d3-c8cb9e32df8e:1-5"
}
},
"domainName": "myClusterSet",
"globalPrimaryInstance": "127.0.0.1:3310",
"metadataServer": "127.0.0.1:3310",
"primaryCluster": "myCluster",
"status": "HEALTHY",
"statusText": "All Clusters available."
}
论断
复制局部的可察看性十分具体,并通过 MySQL 8 提供了大量信息。兴许当初是扭转查看或监督复制形式的好时机。