MySQL 8.1.0: Introducing InnoDB Cluster Read Replicas

A full look at a new capability in the 8.1.0 release: working with InnoDB Cluster Read Replicas.

Authors: Miguel Araújo, Senior Software Engineer / Kenny Gryp, MySQL Product Director

Source: the official Oracle MySQL blog

* Published by the ActionTech open source community.

Introduction

The first MySQL Innovation release, 8.1.0, is out, and with it we are introducing MySQL InnoDB Cluster Read Replicas.

In a MySQL InnoDB Cluster, the main purpose of the secondary nodes is to stand in as replacements should the primary fail (high availability). This is handled by the MySQL Group Replication (MGR) plugin. Another use of secondaries is to offload read-only workload from the primary.

Now, asynchronous replicas can be added to the database topology. These read replicas can be used to:

  1. Offload read traffic from the primary and/or the secondaries.
  2. Set up instances dedicated to reads.
  3. Set up instances dedicated to specific purposes, such as reporting.
  4. Scale reads beyond what the existing cluster members can handle, by adding multiple read replicas.

MySQL InnoDB Cluster manages the operation of these read replicas, including:

  • Automatic traffic redirection by MySQL Router.
  • Configuration and initial provisioning (using the InnoDB Clone plugin) by MySQL Shell.
  • Replication configuration and any desired topology changes.

Read replicas can also be used with MySQL InnoDB ClusterSet; in that case, they can be added to the primary cluster or to replica clusters.

MySQL Router is aware of these read replicas and can redirect read traffic to them, depending on a few configuration options. Read replicas replicate either from the primary or from a secondary; if replication breaks or the membership changes (for example, a new primary is elected), they automatically reconnect to another member.

Creating a Read Replica

Adding a read replica to a cluster is very similar to adding a secondary. The look and feel of the AdminAPI stays the same, ensuring the best possible user experience.

Assuming you already have a running cluster, you can add a read replica with the new command:

<Cluster>.addReplicaInstance(instance[, options])

mysqlsh-js> cluster.addReplicaInstance("rome4:3306")

Setting up 'rome4:3306' as a Read Replica of Cluster 'rome'.
Validating instance configuration at rome4:3306...
This instance reports its own address as rome4:3306
Instance configuration is suitable.
* Checking transaction state of the instance...

(output omitted...)

* Waiting for clone to finish...
NOTE: rome4:3306 is being cloned from rome1:3306
** Stage DROP DATA: Completed
** Clone Transfer  
    FILE COPY  ############################################################  100%  Completed
    PAGE COPY  ############################################################  100%  Completed
    REDO COPY  ############################################################  100%  Completed

NOTE: rome4:3306 is shutting down...

* Waiting for server restart... ready 
* rome4:3306 has restarted, waiting for clone to finish...
** Stage RESTART: Completed
* Clone process has finished: 4.30 GB transferred in 4 sec (1.08 GB/s)
* Configuring Read-Replica managed replication channel...
** Changing replication source of rome4:3306 to rome1:3306
* Waiting for Read-Replica 'rome4:3306' to synchronize with Cluster...
** Transactions replicated  ############################################################  100% 

'rome4:3306' successfully added as a Read-Replica of Cluster 'rome'.

It is as simple as that: a new read replica has been added to the cluster. Just like regular cluster members, read replicas support provisioning via Clone or incremental recovery.
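
If you prefer to skip the interactive prompt, the provisioning method can be chosen up front through the recoveryMethod option (the same option the log output above refers to). A minimal sketch, using a hypothetical spare instance rome6:3306:

mysqlsh-js> cluster.addReplicaInstance("rome6:3306", {recoveryMethod: "clone"})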

As usual, you can check the state of the cluster with the <Cluster>.status() command:

mysqlsh-js> cluster.status()
{
    "clusterName": "rome", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "rome1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "rome1:3306": {
                "address": "rome1:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {
                    "rome4:3306": {
                        "address": "rome4:3306", 
                        "role": "READ_REPLICA", 
                        "status": "ONLINE", 
                        "version": "8.1.0"
                    }
                }, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.1.0"
            }, 
            "rome2:3306": {
                "address": "rome2:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.1.0"
            }, 
            "rome3:3306": {
                "address": "rome3:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.1.0"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "rome1:3306"
}

You may have noticed that the new read replica, running at rome4:3306, is placed under the cluster's primary, rome1:3306. This indicates that it uses rome1:3306 as its replication source, i.e., it replicates data from rome1:3306.

Replicating from the primary has some advantages, such as reducing potential replication lag; however, it can also add load to the primary. To provide flexibility and fit different use cases, we have made this behavior configurable.

How It Works: Source Selection

As shown above, a read replica replicates from the primary by default; if a switchover or failover happens, it automatically reconnects to the cluster's new primary. In other words, it always follows the primary and stays in sync with the cluster.

This is achieved with the Asynchronous Replication Connection Failover feature of MySQL replication. The replica monitors the Group Replication state and keeps a list of potential source servers; if the current source goes offline, it selects a new source from that list.

The source list can be managed manually or automatically, the latter supporting Group Replication topologies. In that case, the failover mechanism monitors membership changes and adds or removes candidate sources accordingly; it can also distinguish primary from secondary members. This is the same mechanism used by replica clusters to replicate from the primary cluster in MySQL InnoDB ClusterSet.
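
The source list maintained by the failover mechanism can be inspected directly in Performance Schema. An illustrative check (not from the original article), run from a Shell session connected to the read replica:

mysqlsh-js> session.runSql("SELECT channel_name, host, port, weight FROM performance_schema.replication_asynchronous_connection_failover").fetchAll()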

With such a flexible framework in place, the AdminAPI can manage and configure read replicas according to each user's needs.

How It Works: Configuring Read Replicas

Choosing Failover Candidates

Replicating from the primary has the advantage of lower lag, but may add load to the primary.

Knowing how the automatic failover mechanism works, let's look at the available configuration options:

  • Replicate from the primary, by preference
  • Replicate from one of the secondaries, by preference
  • Use a fixed list of candidate sources

These can be defined when the replica is added, or changed at any time on an existing replica. In the diagram below, the read replicas in the Rome datacenter use a secondary as their source, while the Brussels read replica uses the primary.

As an example, let's add a new read replica to the cluster that uses a specific instance as its source, together with a predefined list of failover candidates:

mysqlsh-js> cluster.addReplicaInstance("rome5:3306", {replicationSources: ["rome2:3306", "rome3:3306"]})
Setting up 'rome5:3306' as a Read Replica of Cluster 'rome'.

Validating instance configuration at rome5:3306...

This instance reports its own address as rome5:3306

Instance configuration is suitable.
* Checking transaction state of the instance...
NOTE: A GTID set check of the MySQL instance at 'rome5:3306' determined that it is missing transactions that were purged from all cluster members.
NOTE: The target instance 'rome5:3306' has not been pre-provisioned (GTID set is empty). The Shell is unable to determine whether the instance has pre-existing data that would be overwritten with clone based recovery.
The safest and most convenient way to provision a new instance is through automatic clone provisioning, which will completely overwrite the state of 'rome5:3306' with a physical snapshot from an existing cluster member. To use this method by default, set the 'recoveryMethod' option to 'clone'.


Please select a recovery method [C]lone/[A]bort (default Clone): c
* Waiting for the donor to synchronize with PRIMARY...
** Transactions replicated  ############################################################  100% 


Monitoring Clone based state recovery of the new member. Press ^C to abort the operation.
Clone based state recovery is now in progress.

NOTE: A server restart is expected to happen as part of the clone process. If the
server does not support the RESTART command or does not come back after a
while, you may need to manually start it back.

* Waiting for clone to finish...
NOTE: rome5:3306 is being cloned from rome2:3306
** Stage DROP DATA: Completed
** Clone Transfer  
    FILE COPY  ############################################################  100%  Completed
    PAGE COPY  ############################################################  100%  Completed
    REDO COPY  ############################################################  100%  Completed

NOTE: rome5:3306 is shutting down...

* Waiting for server restart... ready 
* rome5:3306 has restarted, waiting for clone to finish...
** Stage RESTART: Completed
* Clone process has finished: 4.30 GB transferred in 6 sec (717.27 MB/s)

* Configuring Read-Replica managed replication channel...
** Changing replication source of rome5:3306 to rome2:3306

* Waiting for Read-Replica 'rome5:3306' to synchronize with Cluster...
** Transactions replicated  ############################################################  100% 

'rome5:3306' successfully added as a Read-Replica of Cluster 'rome'.

And that's it: a new read replica running at rome5:3306 has been added to the cluster.

This replica uses a fixed list of failover candidates, consisting of rome2:3306 and rome3:3306, with rome2:3306 being the current active source. The remaining members of the list are potential failover candidates, and their order determines their weight: the earlier an entry appears in the list, the higher its priority.
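
Since order encodes priority, the candidates can be re-prioritized later simply by passing the list in a different order. A hedged sketch, assuming Cluster.setInstanceOption() accepts the same list form that addReplicaInstance() does:

mysqlsh-js> cluster.setInstanceOption("rome5:3306", "replicationSources", ["rome3:3306", "rome2:3306"])
mysqlsh-js> cluster.rejoinInstance("rome5:3306")  // required for the new source list to take effect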

Let's check the cluster description to get a clearer picture of the current topology:

mysqlsh-js> cluster.describe()
{
    "clusterName": "rome", 
    "defaultReplicaSet": {
        "name": "default", 
        "topology": [
            {
                "address": "rome1:3306", 
                "label": "rome1:3306", 
                "role": "HA"
            }, 
            {
                "address": "rome2:3306", 
                "label": "rome2:3306", 
                "role": "HA"
            }, 
            {
                "address": "rome3:3306", 
                "label": "rome3:3306", 
                "role": "HA"
            }, 
            {
                "address": "rome4:3306", 
                "label": "rome4:3306", 
                "replicationSources": ["PRIMARY"], 
                "role": "READ_REPLICA"
            }, 
            {
                "address": "rome5:3306", 
                "label": "rome5:3306", 
                "replicationSources": [
                    "rome2:3306", 
                    "rome3:3306"
                ], 
                "role": "READ_REPLICA"
            }
        ], 
        "topologyMode": "Single-Primary"
    }
}

Likewise, with the extended status we can see more information about the current topology:

mysqlsh-js> cluster.status({extended:1})
{
    "clusterName": "rome", 
    "defaultReplicaSet": {
        "GRProtocolVersion": "8.0.27", 
        "communicationStack": "MYSQL", 
        "groupName": "33cfdab9-3469-11ee-9f3b-d08e7912e4ee", 
        "groupViewChangeUuid": "33cfe2b0-3469-11ee-9f3b-d08e7912e4ee", 
        "groupViewId": "16913336945761559:7", 
        "name": "default", 
        "paxosSingleLeader": "OFF", 
        "primary": "rome1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "rome1:3306": {
                "address": "rome1:3306", 
                "applierWorkerThreads": 4, 
                "fenceSysVars": [], 
                "memberId": "e304af5d-3466-11ee-8d97-d08e7912e4ee", 
                "memberRole": "PRIMARY", 
                "memberState": "ONLINE", 
                "mode": "R/W", 
                "readReplicas": {
                    "rome4:3306": {
                        "address": "rome4:3306", 
                        "applierStatus": "APPLIED_ALL", 
                        "applierThreadState": "Waiting for an event from Coordinator", 
                        "applierWorkerThreads": 4, 
                        "receiverStatus": "ON", 
                        "receiverThreadState": "Waiting for source to send event", 
                        "replicationLag": null, 
                        "replicationSources": ["PRIMARY"], 
                        "replicationSsl": "TLS_AES_256_GCM_SHA384 TLSv1.3", 
                        "role": "READ_REPLICA", 
                        "status": "ONLINE", 
                        "version": "8.1.0"
                    }
                }, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.1.0"
            }, 
            "rome2:3306": {
                "address": "rome2:3306", 
                "applierWorkerThreads": 4, 
                "fenceSysVars": [
                    "read_only", 
                    "super_read_only"
                ], 
                "memberId": "e6eb91c6-3466-11ee-aca6-d08e7912e4ee", 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/O", 
                "readReplicas": {
                    "rome5:3306": {
                        "address": "rome5:3306", 
                        "applierStatus": "APPLIED_ALL", 
                        "applierThreadState": "Waiting for an event from Coordinator", 
                        "applierWorkerThreads": 4, 
                        "receiverStatus": "ON", 
                        "receiverThreadState": "Waiting for source to send event", 
                        "replicationLag": null, 
                        "replicationSources": [
                            "rome2:3306", 
                            "rome3:3306"
                        ], 
                        "role": "READ_REPLICA", 
                        "status": "ONLINE", 
                        "version": "8.1.0"
                    }
                }, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.1.0"
            }, 
            "rome3:3306": {
                "address": "rome3:3306", 
                "applierWorkerThreads": 4, 
                "fenceSysVars": [
                    "read_only", 
                    "super_read_only"
                ], 
                "memberId": "ea08833f-3466-11ee-b87c-d08e7912e4ee", 
                "memberRole": "SECONDARY", 
                "memberState": "ONLINE", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.1.0"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "rome1:3306", 
    "metadataVersion": "2.2.0"
}

Changing Failover Candidates

Changing the source configuration is as simple as changing any other instance option: Cluster.setInstanceOption() has been extended with a new option, replicationSources, that allows exactly that.

Let's change the configuration of the first read replica so that it prefers replicating from a secondary rather than from the primary:

mysqlsh-js> cluster.setInstanceOption("rome4:3306", "replicationSources", "secondary")
Setting the value of 'replicationSources' to 'secondary' in the instance: 'rome4:3306' ...

WARNING: To update the replication channel with the changes the Read-Replica must be reconfigured using Cluster.rejoinInstance().
Successfully set the value of 'replicationSources' to 'secondary' in the cluster member: 'rome4:3306'.

For the new setting to take effect immediately, we need to force the instance to rejoin the cluster, using the well-known AdminAPI command Cluster.rejoinInstance():

mysqlsh-js> cluster.rejoinInstance("rome4:3306")
Rejoining Read-Replica 'rome4:3306' to Cluster 'rome'...

* Checking transaction state of the instance...
The safest and most convenient way to provision a new instance is through automatic clone provisioning, which will completely overwrite the state of 'rome4:3306' with a physical snapshot from an existing cluster member. To use this method by default, set the 'recoveryMethod' option to 'clone'.

The incremental state recovery may be safely used if you are sure all updates ever executed in the cluster were done with GTIDs enabled, there are no purged transactions and the new instance contains the same GTID set as the cluster or a subset of it. To use this method by default, set the 'recoveryMethod' option to 'incremental'.

Incremental state recovery was selected because it seems to be safely usable.

NOTE: User 'mysql_innodb_replica_2506922964'@'%' already existed at instance 'rome1:3306'. It will be deleted and created again with a new password.
** Changing replication source of rome4:3306 to rome2:3306

* Waiting for Read-Replica 'rome4:3306' to synchronize with Cluster...
** Transactions replicated  ############################################################  100% 

Read-Replica 'rome4:3306' successfully rejoined to the Cluster 'rome'.

Note: just like regular instances, read replicas can be removed from the cluster with Cluster.removeInstance(), as shown below.
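
For example, to remove the second read replica (illustrative):

mysqlsh-js> cluster.removeInstance("rome5:3306")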

Routing Read Requests to Read Replicas

MySQL Router plays a crucial role in InnoDB Cluster, so it is now fully aware of read replicas. Router sits between applications and the cluster, directing client traffic to the right destinations. It can use read replicas for read-only requests; but secondaries can serve those too, so how does it choose?

Configuring Router's Target Pools

By default, Router's behavior is unchanged: read traffic is directed to the cluster's secondaries. However, this is now configurable.

The read_only_targets mode accepts the following possible behaviors:

  • secondaries: only the target cluster's secondary members are used for read-only traffic (default)
  • read_replicas: only the target cluster's read replicas are used for read-only traffic
  • all: all of the target cluster's read replicas are used for read-only traffic, together with the secondary members

The mode is configured with another familiar command, .setRoutingOption(), which is available at the Router, Cluster, or ClusterSet level.

For example:

mysqlsh-js> cluster.setRoutingOption("read_only_targets", "all")
Routing option 'read_only_targets' successfully updated.
mysqlsh-js> cluster.routingOptions()
{
    "clusterName": "rome", 
    "global": {
        "read_only_targets": "all", 
        "stats_updates_frequency": null, 
        "tags": {}}, 
    "routers": {"domus::": {}
    }
}

The diagram below shows a more complex topology: a ClusterSet composed of a primary cluster (Rome) and a replica cluster (Brussels), with multiple Routers deployed.

The primary cluster is a 3-member cluster holding 3 read replicas. The read replicas use sourcesList: "secondary", so they replicate from the cluster's secondary members.

The two Routers deployed in that datacenter use the primary cluster as their target cluster and are configured to use all read-only targets, i.e., read_only_targets: all.

The replica cluster in the other datacenter is also a 3-member cluster, holding 3 read replicas that are configured to replicate from the cluster's primary member. Its two Routers are configured differently: the first uses the primary cluster as its target cluster and only read replicas as read-only targets, i.e., read_only_targets: read_replicas; the other uses Brussels as its target cluster and only secondary members as read-only targets, i.e., read_only_targets: secondaries.
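
In a ClusterSet, such per-Router differences are applied with ClusterSet.setRoutingOption(), naming the Router to configure. A sketch with hypothetical Router names, assuming a handle obtained via dba.getClusterSet():

mysqlsh-js> clusterset = dba.getClusterSet()
mysqlsh-js> clusterset.setRoutingOption("brussels-r1::", "read_only_targets", "read_replicas")
mysqlsh-js> clusterset.setRoutingOption("brussels-r2::", "target_cluster", "brussels")
mysqlsh-js> clusterset.setRoutingOption("brussels-r2::", "read_only_targets", "secondaries")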

Health Checks and Quarantine

MySQL Router is a stateless service, so it relies on the InnoDB Cluster membership information for accurate routing. Group Replication, built on a Paxos implementation, provides the cluster membership service that defines which servers are online and participating in the group. By relying on that information, Router avoids having to connect to every member to check its status.

However, read replicas are not part of the group, so this information is not available in the group membership data and Router cannot rely on that mechanism for them. Also, the availability that cluster members see among themselves can differ from Router's own vantage point, so the membership information may not be accurate for routing purposes.

To overcome these challenges, Router implements a built-in quarantine mechanism.

In short, when Router fails to connect to a destination endpoint while handling a new client connection, it puts that destination in quarantine. A destination does not stay quarantined forever: after a timeout, Router performs a health check on the server to determine whether it can be removed from the quarantine pool.

The quarantine mechanism is configurable on two fronts:

  • The quarantine threshold: how many failed connection attempts are tolerated before a server is put in quarantine.
  • How frequently quarantined servers are health-checked.

By default, both settings have a value of 1, meaning a single failed connection puts the target instance in quarantine, and quarantined instances are health-checked every second to determine whether they can be removed from the quarantine pool.
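
These two knobs live in MySQL Router's configuration file rather than in the Shell. A hedged sketch of a mysqlrouter.conf fragment, assuming the destination-status option names used by recent Router versions (verify against your version's documentation):

[destination_status]
# quarantine after 3 consecutive failed connection attempts (assumed option name)
error_quarantine_threshold=3
# health-check quarantined destinations every 5 seconds (assumed option name)
error_quarantine_interval=5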

Replication Lag

Replication lag is a common concern with asynchronous replication and needs to be taken into account. If an instance is lagging badly, the DBA may want to hide it, wait until it catches up, and only then expose it to applications again.

Beyond that, a DBA may also want to hide a read replica in order to:

  • Perform a rolling upgrade without affecting incoming traffic
  • Perform maintenance operations or configuration changes without having to stop MySQL
  • Exclude it from traffic while taking backups or generating reports, so other queries are not affected
  • Configure it as a backup server, excluded from any read traffic

As with regular cluster members, this can be achieved by instructing Router not to use the instance, by means of tags.

Hiding a Read Replica

To hide an instance from Router traffic, use the built-in _hidden tag, easily set with the .setInstanceOption() command:

mysqlsh-js> cluster.setInstanceOption("rome5:3306", "tag:_hidden", true)

This capability is not limited to read replicas: secondary instances can also be tagged as hidden.
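
To bring a hidden instance back into Router's rotation, clear the tag with the same command:

mysqlsh-js> cluster.setInstanceOption("rome5:3306", "tag:_hidden", false)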

Summary

Read replicas allow you to scale out read-intensive workloads, reduce the load on the other cluster members, and provide additional data redundancy.

Thank you for using MySQL!

For more technical articles, visit: https://opensource.actionsky.com/

About SQLE

SQLE, from the ActionTech open source community, is a SQL audit tool for database users and administrators. It supports multi-scenario auditing and standardized release workflows, natively supports MySQL auditing, and is extensible to other database types.

Getting SQLE

  • Repository: https://github.com/actiontech/sqle
  • Documentation: https://actiontech.github.io/sqle-docs/
  • Release notes: https://github.com/actiontech/sqle/releases
  • Data audit plugin development docs: https://actiontech.github.io/sqle-docs/docs/dev-manual/plugin…