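Summary: This article shows how the multi-cluster application distribution feature of ACK One helps enterprises manage multi-cluster environments. The multi-cluster hub instance provides a unified entry point for delivering applications and supports distribution strategies such as multi-cluster distribution, differentiated configuration, and workflow management. Combined with GTM (Global Traffic Manager), it lets you quickly build and manage an application disaster recovery system with two sites and three centers.
Authors: 宇汇, 壮怀, 先河
Overview
"Two sites, three centers" means deploying three business processing centers in two cities: a production center, a same-city disaster recovery center, and a remote disaster recovery center. Two environments are deployed in one city to form a same-city dual-center setup; both handle business traffic at the same time, synchronize data over high-speed links, and can take over from each other. One environment is deployed in another city as the remote disaster recovery center and keeps a backup of the data; when both same-city centers fail at the same time, the remote center can take over business processing. This disaster recovery design goes a long way toward keeping the business running without interruption.
With the application distribution feature of ACK One multi-cluster management, an enterprise can manage 3 K8s clusters in a unified way, deploy and upgrade applications quickly across the 3 clusters, and apply differentiated configuration to the application in each cluster. Together with GTM (Global Traffic Manager), business traffic can be switched automatically among the 3 K8s clusters when a failure occurs. Data replication at the RDS data layer is not covered in detail in this practice; see DTS (Data Transmission Service).
Solution architecture
Prerequisites
Enable the multi-cluster management hub instance [1]
Add 3 K8s clusters to the hub instance by managing associated clusters [2] to build the two-site, three-center setup. As an example, this practice deploys 2 K8s clusters in Beijing (cluster1-beijing and cluster2-beijing) and 1 K8s cluster in Hangzhou (cluster1-hangzhou).
Create a GTM instance [3]
Application deployment
Use the application distribution feature [4] of the ACK One hub instance to distribute the application to the 3 K8s clusters. Compared with traditional script-based deployment, ACK One application distribution provides the following benefits.
In this practice, the sample application is a web application made up of K8s Deployment, Service, Ingress, and ConfigMap resources. The Service and Ingress expose the application externally, and the Deployment reads configuration parameters from the ConfigMap. By creating application distribution rules, the application is distributed to 3 K8s clusters, 2 in Beijing and 1 in Hangzhou, to achieve two sites and three centers. During distribution, the Deployment and ConfigMap resources are configured differently to suit clusters in different locations, and the rollout is gated by manual approval to limit the blast radius of errors.
- Run the following command to create the namespace demo.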
kubectl create namespace demo
- Create a file named app-meta.yaml with the following content.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web-demo
  name: web-demo
  namespace: demo
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web-demo
  template:
    metadata:
      labels:
        app: web-demo
    spec:
      containers:
        - image: acr-multiple-clusters-registry.cn-hangzhou.cr.aliyuncs.com/ack-multiple-clusters/web-demo:0.4.0
          name: web-demo
          env:
            - name: ENV_NAME
              value: cluster1-beijing
          volumeMounts:
            - name: config-file
              mountPath: "/config-file"
              readOnly: true
      volumes:
        - name: config-file
          configMap:
            items:
              - key: config.json
                path: config.json
            name: web-demo
---
apiVersion: v1
kind: Service
metadata:
  name: web-demo
  namespace: demo
  labels:
    app: web-demo
spec:
  selector:
    app: web-demo
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-demo
  namespace: demo
  labels:
    app: web-demo
spec:
  rules:
    - host: web-demo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-demo
                port:
                  number: 80
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-demo
  namespace: demo
  labels:
    app: web-demo
data:
  config.json: |
    {database-host: "beijing-db.pg.aliyun.com"}
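- Run the following command to deploy the application web-demo on the hub instance. Note: kube resources created on the hub instance are not delivered to the member clusters. They serve as metadata that is referenced by the Application created later (step 4b).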
kubectl apply -f app-meta.yaml
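- Create the application distribution rules.
a. Run the following command to list the associated clusters managed by the hub instance and determine the distribution targets.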
kubectl amc get managedcluster
Expected output:
Name Alias HubAccepted
managedcluster-cxxx cluster1-hangzhou true
managedcluster-cxxx cluster2-beijing true
managedcluster-cxxx cluster1-beijing true
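b. Create the application distribution rule file app.yaml with the following content. Replace managedcluster-cxxx in the example with the actual names of the clusters to publish to. Best practices for defining distribution rules are explained in the comments.
app.yaml contains the following resource types: Policy (type: topology) for distribution targets, Policy (type: override) for differentiated rules, Workflow, and Application. For details, see Application replication and distribution [5], Differentiated configuration for application distribution [6], and Canary distribution of applications across clusters [7].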
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: cluster1-beijing
  namespace: demo
type: topology
properties:
  clusters: ["<managedcluster-cxxx>"] # distribution target cluster 1: cluster1-beijing
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: cluster2-beijing
  namespace: demo
type: topology
properties:
  clusters: ["<managedcluster-cxxx>"] # distribution target cluster 2: cluster2-beijing
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: cluster1-hangzhou
  namespace: demo
type: topology
properties:
  clusters: ["<managedcluster-cxxx>"] # distribution target cluster 3: cluster1-hangzhou
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-env-cluster2-beijing
  namespace: demo
type: override
properties:
  components:
    - name: "deployment"
      traits:
        - type: env
          properties:
            containerName: web-demo
            env:
              ENV_NAME: cluster2-beijing # override the Deployment environment variable for cluster2-beijing
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-env-cluster1-hangzhou
  namespace: demo
type: override
properties:
  components:
    - name: "deployment"
      traits:
        - type: env
          properties:
            containerName: web-demo
            env:
              ENV_NAME: cluster1-hangzhou # override the Deployment environment variable for cluster1-hangzhou
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-replic-cluster1-hangzhou
  namespace: demo
type: override
properties:
  components:
    - name: "deployment"
      traits:
        - type: scaler
          properties:
            replicas: 1 # override the Deployment replica count for cluster1-hangzhou
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-configmap-cluster1-hangzhou
  namespace: demo
type: override
properties:
  components:
    - name: "configmap"
      traits:
        - type: json-merge-patch # override the ConfigMap content for cluster1-hangzhou
          properties:
            data:
              config.json: |
                {database-address: "hangzhou-db.pg.aliyun.com"}
---
apiVersion: core.oam.dev/v1alpha1
kind: Workflow
metadata:
  name: deploy-demo
  namespace: demo
steps: # deploy cluster1-beijing, cluster2-beijing, and cluster1-hangzhou in order
  - type: deploy
    name: deploy-cluster1-beijing
    properties:
      policies: ["cluster1-beijing"]
  - type: deploy
    name: deploy-cluster2-beijing
    properties:
      auto: false # manual approval is required before deploying cluster2-beijing
      policies: ["override-env-cluster2-beijing", "cluster2-beijing"] # apply the environment variable override when deploying cluster2-beijing
  - type: deploy
    name: deploy-cluster1-hangzhou
    properties:
      policies: ["override-env-cluster1-hangzhou", "override-replic-cluster1-hangzhou", "override-configmap-cluster1-hangzhou", "cluster1-hangzhou"]
      # apply the environment variable, replica count, and ConfigMap overrides when deploying cluster1-hangzhou
---
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  annotations:
    app.oam.dev/publishVersion: version8
  name: web-demo
  namespace: demo
spec:
  components:
    - name: deployment # referenced on its own so it can be configured per cluster
      type: ref-objects
      properties:
        objects:
          - apiVersion: apps/v1
            kind: Deployment
            name: web-demo
    - name: configmap # referenced on its own so it can be configured per cluster
      type: ref-objects
      properties:
        objects:
          - apiVersion: v1
            kind: ConfigMap
            name: web-demo
    - name: same-resource # no differentiated configuration
      type: ref-objects
      properties:
        objects:
          - apiVersion: v1
            kind: Service
            name: web-demo
          - apiVersion: networking.k8s.io/v1
            kind: Ingress
            name: web-demo
  workflow:
    ref: deploy-demo
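One detail of the override-configmap-cluster1-hangzhou policy is worth illustrating: config.json is a single string value inside the ConfigMap, so a JSON merge patch replaces it as a whole rather than merging keys inside it. The Python sketch below implements the merge rules of RFC 7396 to show this. It is only an illustration and assumes that the json-merge-patch trait follows those rules; the function name json_merge_patch and the dictionaries are hypothetical and are not part of ACK One.

```python
def json_merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7396 rules) and return the patched value."""
    # A non-object patch replaces the target entirely.
    if not isinstance(patch, dict):
        return patch
    # Patching a non-object target starts from an empty object.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in the patch removes the key from the target.
            result.pop(key, None)
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result


# Hypothetical in-memory copies of the ConfigMap and the Hangzhou override.
base_configmap = {
    "metadata": {"name": "web-demo", "namespace": "demo"},
    "data": {"config.json": '{database-host: "beijing-db.pg.aliyun.com"}\n'},
}
hangzhou_patch = {
    "data": {"config.json": '{database-address: "hangzhou-db.pg.aliyun.com"}\n'},
}

patched = json_merge_patch(base_configmap, hangzhou_patch)
print(patched["data"]["config.json"])
```

Under these assumptions the Hangzhou copy of config.json contains only the database-address entry, and the untouched metadata is carried over unchanged.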
- Run the following command to deploy the distribution rules in app.yaml on the hub instance.
kubectl apply -f app.yaml
- Check the deployment status of the application.
kubectl get app web-demo -n demo
Expected output. workflowSuspending indicates that the deployment is paused:
NAME COMPONENT TYPE PHASE HEALTHY STATUS AGE
web-demo deployment ref-objects workflowSuspending true 47h
- Check the running status of the application in each cluster.
kubectl amc get deployment web-demo -n demo -m all
Expected output:
Run on ManagedCluster managedcluster-cxxx (cluster1-hangzhou)
No resources found in demo namespace #First deployment of the application: the workflow has not yet started deploying cluster1-hangzhou
Run on ManagedCluster managedcluster-cxxx (cluster2-beijing)
No resources found in demo namespace #First deployment of the application: the workflow has not yet started deploying cluster2-beijing and is waiting for manual approval
Run on ManagedCluster managedcluster-cxxx (cluster1-beijing)
NAME READY UP-TO-DATE AVAILABLE AGE
web-demo 5/5 5 5 47h #The Deployment is running normally in cluster1-beijing
- After manual approval, resume the workflow to deploy clusters cluster2-beijing and cluster1-hangzhou.
kubectl amc workflow resume web-demo -n demo
Successfully resume workflow: web-demo
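The pause and resume behaviour seen so far can be pictured with a small Python model of the three workflow steps. This is a simplified sketch of the behaviour described in this article, not the actual workflow engine; the function run_workflow, its parameters, and the STEPS list are hypothetical names introduced for illustration.

```python
# Hypothetical model of the deploy-demo workflow steps from app.yaml.
STEPS = [
    {"name": "deploy-cluster1-beijing", "auto": True},
    {"name": "deploy-cluster2-beijing", "auto": False},  # needs manual approval
    {"name": "deploy-cluster1-hangzhou", "auto": True},
]


def run_workflow(steps, start=0, approved=False):
    """Run steps in order from index `start`.

    Returns (deployed step names, phase, index to resume from).
    """
    deployed = []
    for index in range(start, len(steps)):
        step = steps[index]
        if not step["auto"] and not approved:
            # Stop in front of the gated step until someone resumes the workflow.
            return deployed, "workflowSuspending", index
        approved = False  # one approval unlocks one gated step only
        deployed.append(step["name"])
    return deployed, "running", len(steps)


first_run, phase, resume_at = run_workflow(STEPS)
print(first_run, phase)

# Resuming corresponds to the manual approval given with "workflow resume".
second_run, phase_after_resume, _ = run_workflow(STEPS, start=resume_at, approved=True)
print(second_run, phase_after_resume)
```

In this model the first run stops after cluster1-beijing, and a single resume then deploys cluster2-beijing and cluster1-hangzhou in order, matching the sequence described in the steps above.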
- Check the deployment status of the application.
kubectl get app web-demo -n demo
Expected output. running indicates that the application is running normally:
NAME COMPONENT TYPE PHASE HEALTHY STATUS AGE
web-demo deployment ref-objects running true 47h
- Check the running status of the application in each cluster.
kubectl amc get deployment web-demo -n demo -m all
Expected output:
Run on ManagedCluster managedcluster-cxxx (cluster1-hangzhou)
NAME READY UP-TO-DATE AVAILABLE AGE
web-demo 1/1 1 1 47h
Run on ManagedCluster managedcluster-cxxx (cluster2-beijing)
NAME READY UP-TO-DATE AVAILABLE AGE
web-demo 5/5 5 5 2d
Run on ManagedCluster managedcluster-cxxx (cluster1-beijing)
NAME READY UP-TO-DATE AVAILABLE AGE
web-demo 5/5 5 5 47h
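The replica counts above follow from the policies attached to each workflow step. As a rough illustration, the Python sketch below models how the env and scaler overrides from app.yaml change the base Deployment settings per cluster. It only mirrors the outcome; it is not how ACK One applies overrides internally, and the names BASE, STEP_OVERRIDES, and effective_settings are assumptions made for this example.

```python
# Base values taken from the Deployment in app-meta.yaml.
BASE = {"env": {"ENV_NAME": "cluster1-beijing"}, "replicas": 5}

# Overrides attached to each workflow step, mirroring app.yaml.
STEP_OVERRIDES = {
    "cluster1-beijing": [],
    "cluster2-beijing": [{"env": {"ENV_NAME": "cluster2-beijing"}}],
    "cluster1-hangzhou": [
        {"env": {"ENV_NAME": "cluster1-hangzhou"}},
        {"replicas": 1},
    ],
}


def effective_settings(cluster):
    """Return the Deployment settings a cluster ends up with in this model."""
    result = {"env": dict(BASE["env"]), "replicas": BASE["replicas"]}
    for override in STEP_OVERRIDES[cluster]:
        result["env"].update(override.get("env", {}))
        if "replicas" in override:
            result["replicas"] = override["replicas"]
    return result


for name in STEP_OVERRIDES:
    print(name, effective_settings(name))
```

The model gives 5 replicas for both Beijing clusters and 1 replica for cluster1-hangzhou, in line with the READY column in the output above.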
- Check the Ingress status of the application in each cluster.
kubectl amc get ingress -n demo -m all
Expected result: the Ingress in each cluster is running normally and has been assigned a public IP address.
Run on ManagedCluster managedcluster-cxxx (cluster1-hangzhou)
NAME CLASS HOSTS ADDRESS PORTS AGE
web-demo nginx web-demo.example.com 47.xxx.xxx.xxx 80 47h
Run on ManagedCluster managedcluster-cxxx (cluster2-beijing)
NAME CLASS HOSTS ADDRESS PORTS AGE
web-demo nginx web-demo.example.com 123.xxx.xxx.xxx 80 2d
Run on ManagedCluster managedcluster-cxxx (cluster1-beijing)
NAME CLASS HOSTS ADDRESS PORTS AGE
web-demo nginx web-demo.example.com 182.xxx.xxx.xxx 80 2d
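Traffic management
With Global Traffic Manager configured, the running status of the application is checked automatically, and when an exception occurs, traffic is switched automatically to a healthy cluster.
- Configure the GTM instance. web-demo.example.com is the domain name of the sample application; replace it with the actual domain name of your application, and set its DNS record to resolve to the CNAME access domain name of GTM.
- In the GTM instance created earlier, create 2 address pools:
pool-beijing: contains the Ingress IP addresses of the 2 Beijing clusters. The load balancing policy is to return all addresses, which balances the load across the 2 Beijing clusters. The Ingress IP addresses can be obtained by running "kubectl amc get ingress -n demo -m all" on the hub instance.
pool-hangzhou: contains the Ingress IP address of the 1 Hangzhou cluster.
- Enable health checks on the address pools. An address that fails the check is removed from its pool and no longer receives traffic.
- Configure the access policy: set the Beijing address pool as the primary pool and the Hangzhou address pool as the backup pool. Under normal conditions, all traffic is handled by the applications in the Beijing clusters. When the applications in all Beijing clusters are unavailable, traffic switches automatically to the application in the Hangzhou cluster.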
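The combined effect of the address pools, health checks, and access policy can be sketched as a small selection function in Python. This is a simplified model of the behaviour described above, not GTM's implementation: it ignores health check intervals and DNS caching, uses cluster names in place of Ingress IP addresses, and the names POOLS and resolve are hypothetical.

```python
# Address pools as configured above; cluster names stand in for Ingress IPs.
POOLS = {
    "pool-beijing": ["cluster1-beijing", "cluster2-beijing"],
    "pool-hangzhou": ["cluster1-hangzhou"],
}
PRIMARY, BACKUP = "pool-beijing", "pool-hangzhou"


def resolve(healthy):
    """Return the addresses that receive traffic, given the set of healthy ones."""
    for pool in (PRIMARY, BACKUP):
        # Addresses that fail the health check are dropped from the pool.
        alive = [address for address in POOLS[pool] if address in healthy]
        if alive:
            # "Return all addresses": every healthy address in the pool is used.
            return alive
    return []


everything_up = {"cluster1-beijing", "cluster2-beijing", "cluster1-hangzhou"}
print(resolve(everything_up))
print(resolve(everything_up - {"cluster1-beijing"}))
print(resolve({"cluster1-hangzhou"}))
```

The three calls correspond to the three situations checked in the verification section: both Beijing clusters healthy, one Beijing cluster down, and both Beijing clusters down.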
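Deployment verification
- Under normal conditions, all traffic is handled by the applications in the 2 Beijing clusters, with each cluster handling 50% of the traffic.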
for i in {1..50}; do curl web-demo.example.com; sleep 3; done
This is env cluster1-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster1-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster2-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster1-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster2-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster2-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
- When the application in cluster cluster1-beijing becomes abnormal, GTM routes all traffic to the cluster2-beijing cluster.
for i in {1..50}; do curl web-demo.example.com; sleep 3; done
...
<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>
This is env cluster2-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster2-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster2-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
This is env cluster2-beijing !
Config file is {database-host: "beijing-db.pg.aliyun.com"}
- When the applications in clusters cluster1-beijing and cluster2-beijing are abnormal at the same time, GTM routes traffic to the cluster1-hangzhou cluster.
for i in {1..50}; do curl web-demo.example.com; sleep 3; done
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>
<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>
This is env cluster1-hangzhou !
Config file is {database-address: "hangzhou-db.pg.aliyun.com"}
This is env cluster1-hangzhou !
Config file is {database-address: "hangzhou-db.pg.aliyun.com"}
This is env cluster1-hangzhou !
Config file is {database-address: "hangzhou-db.pg.aliyun.com"}
This is env cluster1-hangzhou !
Config file is {database-address: "hangzhou-db.pg.aliyun.com"}
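Reading 50 responses by eye is error-prone, so it can help to count which cluster answered each request. The Python sketch below tallies the "This is env ..." lines and the 503 pages in captured curl output. It assumes the response format shown above; the function name tally and the sample text are hypothetical.

```python
import re
from collections import Counter

# Matches the first line of a successful response from the sample application.
ENV_LINE = re.compile(r"^This is env (\S+) !$")


def tally(output):
    """Count responses per cluster, plus 503 error pages, in curl output."""
    counts = Counter()
    for line in output.splitlines():
        line = line.strip()
        match = ENV_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
        elif "<title>503 Service Temporarily Unavailable</title>" in line:
            # Each 503 page has exactly one <title> line, so count that line.
            counts["503"] += 1
    return counts


sample = """<head><title>503 Service Temporarily Unavailable</title></head>
<center><h1>503 Service Temporarily Unavailable</h1></center>
This is env cluster1-hangzhou !
Config file is {database-address: "hangzhou-db.pg.aliyun.com"}
This is env cluster1-hangzhou !
Config file is {database-address: "hangzhou-db.pg.aliyun.com"}
"""
print(dict(tally(sample)))
```

To use it on a real run, save the output of the curl loop to a file and pass the file contents to tally.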
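Summary
This article showed how the multi-cluster application distribution feature of ACK One helps enterprises manage multi-cluster environments. The multi-cluster hub instance provides a unified entry point for delivering applications and supports distribution strategies such as multi-cluster distribution, differentiated configuration, and workflow management. Combined with GTM (Global Traffic Manager), it lets you quickly build and manage an application disaster recovery system with two sites and three centers.
Beyond multi-cluster application distribution, ACK One can connect to and manage Kubernetes clusters in any region and on any infrastructure. It provides consistent management and community-compatible APIs, and supports unified operations and control of compute, network, storage, security, monitoring, logging, jobs, applications, and traffic. Alibaba Cloud Distributed Cloud Container Platform (ACK One) is an enterprise-grade cloud-native platform for hybrid cloud, multi-cluster, distributed computing, and disaster recovery scenarios. For more information, see the product page Distributed Cloud Container Platform ACK One [8].
Related links
[1] Enable the multi-cluster management hub instance:
https://help.aliyun.com/docum…
[2] Manage associated clusters:
https://help.aliyun.com/docum…
[3] Create a GTM instance:
https://dns.console.aliyun.co…
[4] Application distribution feature:
https://help.aliyun.com/docum…
[5] Application replication and distribution:
https://help.aliyun.com/docum…
[6] Differentiated configuration for application distribution:
https://help.aliyun.com/docum…
[7] Canary distribution of applications across clusters:
https://help.aliyun.com/docum…
[8] Distributed Cloud Container Platform ACK One:
https://www.aliyun.com/produc…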
This article is original content from Alibaba Cloud and may not be reproduced without permission.