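In the past, unfortunate canaries were used as test subjects to measure methane levels in coal mines. A cage holding a canary was lowered into the shaft on a rope, left there for a while, and pulled back up. If the canary was still alive, the mine was safe to work; if it had died, the mine could not be worked. Today this method has long been abandoned because it is inhumane to the animals.
Canaries were also kept close to the miners as they worked: if the bird stopped singing, that was the signal that the miners had to leave the mine.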
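In a canary deployment, two versions of an application run side by side. The new version starts out small and handles only a small share of the traffic. As the new deployment is analyzed, requests are gradually shifted to the new version, and the old version is eventually removed.
It is commonly believed that managing traffic for these deployments requires a service mesh. For inbound traffic, however, it is enough to set annotations on an Ingress served by the nginx ingress controller: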
nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: <num>
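To show where these annotations live, here is a minimal sketch of a canary Ingress. The Ingress name, host, ingress class, and the weight of 10 are assumptions made for illustration; only the namespace, the canary Service name, and port 8080 come from the manifests later in this article.

```yaml
# Hypothetical canary Ingress: name, host, class, and weight are illustrative assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pregap-canary                 # assumed name
  namespace: pregap
  annotations:
    # Marks this Ingress as the canary counterpart of the main Ingress.
    nginx.ingress.kubernetes.io/canary: "true"
    # Percentage of requests sent to the canary backend (annotation values are strings).
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  ingressClassName: nginx             # assumed class name
  rules:
  - host: pregap.example.com          # assumed; must match the host of the main Ingress
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: rollouts-pregap-canary
            port:
              number: 8080
```

With a weight of 10, roughly one request in ten for that host goes to the canary Service, and the rest continue to the backend of the main Ingress.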
The drawback of this approach is that the weights have to be managed by hand. To automate it, we can use Argo Rollouts (https://argoproj.github.io/ar...).
Running Argo Rollouts
Add the Helm repo: https://argoproj.github.io/ar...
argo-rollouts chart:
Helm values:
installCRDs: true
Modifying the Deployment and running the Rollouts CRD
Scale down the Deployment by setting its replicas to 0:
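As a rough sketch, the change amounts to a single field in the existing Deployment. The Deployment name below is an assumption (the article only shows the app: test2-pregap label), and everything other than replicas is meant to stay as it already is.

```yaml
# Fragment of the existing Deployment; only spec.replicas changes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test2-pregap        # assumed name, inferred from the app label
  namespace: pregap
spec:
  replicas: 0               # the Rollout will own the running pods from now on
  # selector and template remain unchanged
```

The Deployment is kept rather than deleted because the Rollout will reference its pod template.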
Create the Services
apiVersion: v1
kind: Service
metadata:
  annotations:
    argo-rollouts.argoproj.io/managed-by-rollouts: rollout-pregap
  name: rollouts-pregap-canary
  namespace: pregap
spec:
  clusterIP: 10.43.139.197
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: test2-pregap
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    argo-rollouts.argoproj.io/managed-by-rollouts: rollout-pregap
spec:
  clusterIP: 10.43.61.221
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: test2-pregap
  sessionAffinity: None
  type: ClusterIP
Running the Rollouts CRD
Because we do not want to change the Deployment itself, we reference it from the Rollout manifest: workloadRef.kind: Deployment, workloadRef.name
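A minimal sketch of such a Rollout follows, assuming the nginx traffic router is used. The Rollout name and namespace are taken from the Service annotations above; the Deployment name, the stable Service name, the stable Ingress name, the replica count, and the step weights are all assumptions for illustration.

```yaml
# Hypothetical Rollout that reuses an existing Deployment's pod template via workloadRef.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-pregap
  namespace: pregap
spec:
  replicas: 2                                # assumed
  workloadRef:                               # points at the Deployment instead of embedding a template
    apiVersion: apps/v1
    kind: Deployment
    name: test2-pregap                       # assumed Deployment name
  strategy:
    canary:
      canaryService: rollouts-pregap-canary
      stableService: rollouts-pregap-stable  # assumed name
      trafficRouting:
        nginx:
          stableIngress: pregap-ingress      # assumed name of the existing Ingress
      steps:                                 # illustrative weights
      - setWeight: 20
      - pause: {}                            # wait for a manual promotion
      - setWeight: 50
      - pause: {duration: 10m}
```

An empty pause step holds the rollout at the current weight until it is promoted manually, which is why the pipeline needs a separate promotion step.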
Applying the manifest creates an additional ingress:
Argo Rollouts dashboard
Other steps in the CD pipeline
Add the promotion steps to .drone.yml:
- name: promote-release-dr
  image: plugins/docker
  settings:
    repo: 172.16.77.115:5000/pregap
    registry: 172.16.77.115:5000
    insecure: true
    dockerfile: Dockerfile.multistage
    tags:
    - latest
    - ${DRONE_TAG##v}
  when:
    event:
    - promote
    target:
    - production
- name: promote-release-prod
  image: plugins/webhook
  settings:
    username: admin
    password: admin
    urls: http://172.16.77.118:9300/v1/webhooks/native
    debug: true
    content_type: application/json
    template: |
      {
        "name": "172.16.77.115:5000/pregap",
        "tag": "${DRONE_TAG##v}"
      }
  when:
    event:
    - promote
    target:
    - production
Add Keel approval:
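The original configuration is not shown here, so the following is only a sketch of what a Keel approval setup might look like, assuming the annotations are placed on the Deployment that the Rollout references. The Deployment name, the policy, and the approval count are assumptions.

```yaml
# Hypothetical Keel annotations on the referenced Deployment (metadata fragment only).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test2-pregap            # assumed name
  namespace: pregap
  annotations:
    keel.sh/policy: all         # assumed: accept any newer semver tag
    keel.sh/approvals: "1"      # assumed: one approval is required before Keel applies the update
```

Under this assumption, the webhook sent by the promote-release-prod step would create a pending update in Keel, and the new image would be applied only after it is approved.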
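Conclusion
Canary or blue/green deployments are not hard to set up. They make the production environment more reliable and limit the affected area when a design error slips through. In the future I plan to add RAM to the server, possibly enable Prometheus monitoring and Istio, and try out the analysis and experiment stages to make fuller use of Argo Rollouts.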