In the previous article, we deployed a production Consul cluster on Kubernetes. Today we continue by deploying a production Vault cluster on Kubernetes.
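Vault can run in high availability (HA) mode, which protects against outages by running multiple Vault servers. Vault is typically bound by the I/O limits of its storage backend rather than by compute requirements. Some storage backends, such as Consul, provide the additional coordination functions that allow Vault to run in an HA configuration, while others provide a more robust backup and restore process.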
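When running in HA mode, Vault servers have two additional states: standby and active. Within a Vault cluster, only one instance is active and handles all requests (reads and writes); all standby nodes redirect requests to the active node.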
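The two states can also be observed from outside a node. As a minimal sketch, assuming Vault's default response codes for the /v1/sys/health endpoint (200 for an active node, 429 for an unsealed standby, 501 for an uninitialized node, 503 for a sealed node), the hypothetical helper below maps an HTTP status code to a state name. The function name health_state is ours, not part of Vault.

```shell
#!/bin/sh
# health_state <http_code>: translate the HTTP status code returned by
# Vault's /v1/sys/health endpoint (default codes assumed) into a state name.
health_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

# Possible usage inside a Vault pod (requires curl and a reachable Vault):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8200/v1/sys/health)
#   health_state "$code"
```

Keeping the mapping separate from the HTTP call makes it easy to reuse the same helper against any node address.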
Deployment
We reuse the Consul cluster deployed in the previous article.
The Vault configuration file, server.hcl, is as follows:
    listener "tcp" {
      address         = "0.0.0.0:8200"
      cluster_address = "POD_IP:8201"
      tls_disable     = "true"
    }

    storage "consul" {
      address = "127.0.0.1:8500"
      path    = "vault/"
    }

    api_addr     = "http://POD_IP:8200"
    cluster_addr = "https://POD_IP:8201"
Next, create the ConfigMap:
    kubectl create configmap vault --from-file=server.hcl
Note the POD_IP placeholder in the configuration file: when the container starts, sed replaces it with the pod's actual IP.
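The substitution can be tried locally before deploying. The sketch below uses the same sed expression as the container start command; the function name render_config, the temporary template, and the address 10.0.0.12 are made up for illustration.

```shell
#!/bin/sh
# render_config <template> <output>: replace every POD_IP placeholder in the
# template with the value of $POD_IP. The ${POD_IP?} form aborts with an
# error if the variable is unset, so an unrendered file is never produced.
render_config() {
  sed -E "s/POD_IP/${POD_IP?}/g" "$1" > "$2"
}

# Demo with a throwaway template and a hypothetical pod address.
tmpdir=$(mktemp -d)
cat > "$tmpdir/server.hcl" <<'EOF'
api_addr     = "http://POD_IP:8200"
cluster_addr = "https://POD_IP:8201"
EOF

POD_IP=10.0.0.12
render_config "$tmpdir/server.hcl" "$tmpdir/rendered.hcl"
cat "$tmpdir/rendered.hcl"
```

Writing the rendered file to a separate path mirrors the deployment, where the ConfigMap mount is read-only and the result goes to /tmp/server.hcl.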
We deploy a two-node Vault cluster as a StatefulSet. Each pod runs a Consul client agent alongside Vault as a sidecar.
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: vault
      labels:
        app: vault
    spec:
      serviceName: vault
      podManagementPolicy: Parallel
      # Two Vault nodes, matching the text and the pod listing in this article.
      replicas: 2
      updateStrategy:
        type: OnDelete
      selector:
        matchLabels:
          app: vault
      template:
        metadata:
          labels:
            app: vault
        spec:
          affinity:
            # A single podAntiAffinity block with two terms (a duplicated
            # podAntiAffinity key would not keep both rules): keep Vault off
            # nodes running Consul, and keep Vault pods apart from each other.
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - consul
                  topologyKey: kubernetes.io/hostname
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - vault
                  topologyKey: kubernetes.io/hostname
          containers:
            - name: vault
              command:
                - "/bin/sh"
                - "-ec"
              args:
                - |
                  sed -E "s/POD_IP/${POD_IP?}/g" /vault/config/server.hcl > /tmp/server.hcl;
                  /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/server.hcl
              image: "vault:1.4.2"
              imagePullPolicy: IfNotPresent
              securityContext:
                capabilities:
                  add:
                    - IPC_LOCK
              env:
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
                - name: VAULT_ADDR
                  value: "http://127.0.0.1:8200"
                - name: VAULT_API_ADDR
                  value: "http://$(POD_IP):8200"
                - name: SKIP_CHOWN
                  value: "true"
              volumeMounts:
                - name: vault-config
                  mountPath: /vault/config/server.hcl
                  subPath: server.hcl
              ports:
                - containerPort: 8200
                  name: vault-port
                  protocol: TCP
                - containerPort: 8201
                  name: cluster-port
                  protocol: TCP
              readinessProbe:
                # Check status; unsealed vault servers return 0
                # The exit code reflects the seal status:
                #   0 - unsealed
                #   1 - error
                #   2 - sealed
                exec:
                  command: ["/bin/sh", "-ec", "vault status -tls-skip-verify"]
                failureThreshold: 2
                initialDelaySeconds: 5
                periodSeconds: 3
                successThreshold: 1
                timeoutSeconds: 5
              lifecycle:
                # Vault container doesn't receive SIGTERM from Kubernetes
                # and after the grace period ends, Kube sends SIGKILL. This
                # causes issues with graceful shutdowns such as deregistering itself
                # from Consul (zombie services).
                preStop:
                  exec:
                    command:
                      - "/bin/sh"
                      - "-c"
                      # Adding a sleep here to give the pod eviction a
                      # chance to propagate, so requests will not be made
                      # to this pod while it's terminating
                      - "sleep 5 && kill -SIGTERM $(pidof vault)"
            - name: consul-client
              image: consul:1.7.4
              env:
                - name: GOSSIP_ENCRYPTION_KEY
                  valueFrom:
                    secretKeyRef:
                      name: consul
                      key: gossip-encryption-key
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
              args:
                - "agent"
                - "-advertise=$(POD_IP)"
                - "-config-file=/etc/consul/config/client.json"
                - "-encrypt=$(GOSSIP_ENCRYPTION_KEY)"
              volumeMounts:
                - name: consul-config
                  mountPath: /etc/consul/config
                - name: consul-tls
                  mountPath: /etc/tls
              lifecycle:
                preStop:
                  exec:
                    command:
                      - /bin/sh
                      - -c
                      # Assumption: the original listing omits the command string;
                      # a graceful "consul leave" is the usual choice for a client agent.
                      - consul leave
          volumes:
            - name: vault-config
              configMap:
                defaultMode: 420
                name: vault
            - name: consul-config
              configMap:
                defaultMode: 420
                name: consul-client
            - name: consul-tls
              secret:
                secretName: consul
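If the pod network of your Kubernetes cluster is flat, so that pods and hosts in the VPC can reach each other directly, the configuration above is sufficient. Otherwise, set hostNetwork: true on the pod.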
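For the non-flat case, one possible way to switch an existing StatefulSet to host networking is a merge patch. This is only a sketch: the function name enable_host_network and the KUBECTL override (handy for a dry run with KUBECTL=echo) are our own conventions, and other settings that host networking may require, such as free host ports, are not covered.

```shell
#!/bin/sh
# KUBECTL can be overridden, e.g. KUBECTL=echo, to print the command
# instead of running it.
KUBECTL=${KUBECTL:-kubectl}

# enable_host_network [statefulset]: set hostNetwork: true in the pod
# template of the given StatefulSet (default name: vault).
enable_host_network() {
  $KUBECTL patch statefulset "${1:-vault}" --type merge \
    -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'
}

# Possible usage:
#   enable_host_network vault
```

Because the StatefulSet uses the OnDelete update strategy, existing pods keep their old spec until they are deleted and recreated.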
Check the deployment:
    kubectl get pods -l app=vault
    NAME      READY   STATUS    RESTARTS   AGE
    vault-0   2/2     Running   0          3m3s
    vault-1   2/2     Running   0          3m3s
For completeness, here is the configuration file of the Consul client agent:
    {
      "bind_addr": "0.0.0.0",
      "client_addr": "0.0.0.0",
      "ca_file": "/etc/tls/ca.pem",
      "cert_file": "/etc/tls/consul.pem",
      "key_file": "/etc/tls/consul-key.pem",
      "data_dir": "/consul/data",
      "datacenter": "dc1",
      "domain": "cluster.consul",
      "server": false,
      "verify_incoming": true,
      "verify_outgoing": true,
      "verify_server_hostname": true,
      "retry_join": [
        "prod.discovery-01.xx.sg2.consul",
        "prod.discovery-02.xx.sg2.consul",
        "prod.discovery-03.xx.sg2.consul"
      ]
    }
prod.discovery-01.xx.sg2.consul and the other two retry_join entries are our private domain names; they resolve to the three Consul instances deployed earlier.
Next, Vault has to be initialized, and every Vault instance has to be unsealed.
First, exec into one of the Vault instances (StatefulSet pods are named vault-0, vault-1, and so on):
    kubectl exec -it vault-0 -c vault -- sh
Run:
    vault operator init
    Unseal Key 1: 4uyvFnGT8WxM7OXXvFJh0ich8W/4yDh27MBBj
    Unseal Key 2: RzbrhGbV4hA+MlxkzwtPRP7aGXA3UaK95+5eb
    Unseal Key 3: hBIv4GiVkMvrWMDnxoW7m4MAYZqgX/xvwF1KS
    Unseal Key 4: +KyBJREqU+1p4qao1red/i7EX0ASmzWP2Ch79
    Unseal Key 5: 8v0Q3ZHvMi7QwsJxmH3ay8h7KrJAE3ESgh+qK

    Initial Root Token: s.mbHbP3WOWGEpaCT8zaoVl

    Vault initialized with 5 key shares and a key threshold of 3. Please securely
    distribute the key shares printed above. When the Vault is re-sealed,
    restarted, or stopped, you must supply at least 3 of these keys to unseal it
    before it can start servicing requests.

    Vault does not store the generated master key. Without at least 3 key to
    reconstruct the master key, Vault will remain permanently sealed!

    It is possible to generate new unseal keys, provided you have a quorum of
    existing unseal keys shares. See "vault operator rekey" for more information.
Then run the unseal command three times, each time with a different one of the unseal keys generated above (the threshold is 3):
    vault operator unseal <unseal_key_1>
    Key                Value
    ---                -----
    Seal Type          shamir
    Initialized        true
    Sealed             true
    Total Shares       5
    Threshold          3
    Unseal Progress    1/3
    Unseal Nonce       3b5933b9-4120-5dcb-40df-afc8ab9e6563
    Version            1.4.2
    HA Enabled         true

    vault operator unseal <unseal_key_2>
    Key                Value
    ---                -----
    Seal Type          shamir
    Initialized        true
    Sealed             true
    Total Shares       5
    Threshold          3
    Unseal Progress    2/3
    Unseal Nonce       3b5933b9-4120-5dcb-40df-afc8ab9e6563
    Version            1.4.2
    HA Enabled         true

    vault operator unseal <unseal_key_3>
    Key                    Value
    ---                    -----
    Seal Type              shamir
    Initialized            true
    Sealed                 false
    Total Shares           5
    Threshold              3
    Version                1.4.2
    Cluster Name           vault-cluster-b9554129
    Cluster ID             e6cedfdd-07d2-520a-9a7c-c4e857803c7e
    HA Enabled             true
    HA Cluster             n/a
    HA Mode                standby
    Active Node Address    <none>
Now check the status:
    vault status
    Key             Value
    ---             -----
    Seal Type       shamir
    Initialized     true
    Sealed          false
    Total Shares    5
    Threshold       3
    Version         1.4.2
    Cluster Name    vault-cluster-b9554129
    Cluster ID      e6cedfdd-07d2-520a-9a7c-c4e857803c7e
    HA Enabled      true
    HA Cluster      https://10.xx.xx.229:8201
    HA Mode         active
Next, repeat the process on the other instance: unseal it three times with the same keys.
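Repeating the unseal on each remaining pod can be wrapped in a small helper. This is a minimal sketch rather than a recommended procedure: the function name unseal_pod and the KUBECTL override are hypothetical, and passing key shares as command-line arguments exposes them to shell history and process listings. Running vault operator unseal without an argument, so that it prompts for the key, avoids that.

```shell
#!/bin/sh
# KUBECTL can be overridden, e.g. KUBECTL=echo, to print the commands
# instead of running them.
KUBECTL=${KUBECTL:-kubectl}

# unseal_pod <pod> <key>...: submit each given key share to the vault
# container of <pod>, stopping at the first failure.
unseal_pod() {
  pod=$1
  shift
  for key in "$@"; do
    $KUBECTL exec "$pod" -c vault -- vault operator unseal "$key" || return 1
  done
}

# Possible usage (placeholders stand for three real key shares):
#   unseal_pod vault-1 "<unseal_key_1>" "<unseal_key_2>" "<unseal_key_3>"
```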
Finally, check its status:
    vault status
    Key                    Value
    ---                    -----
    Seal Type              shamir
    Initialized            true
    Sealed                 false
    Total Shares           5
    Threshold              3
    Version                1.4.2
    Cluster Name           vault-cluster-b9554129
    Cluster ID             e6cedfdd-07d2-520a-9a7c-c4e857803c7e
    HA Enabled             true
    HA Cluster             https://10.xx.3.229:8201
    HA Mode                standby
    Active Node Address    http://10.xx.3.229:8200
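To compare the roles of the instances at a glance, the HA Mode row can be pulled out of the status table. The helper below is a small sketch that assumes the tabular layout shown above; the name ha_mode is ours.

```shell
#!/bin/sh
# ha_mode: read "vault status" table output on stdin and print the value
# of the "HA Mode" row (for example "active" or "standby").
ha_mode() {
  awk '$1 == "HA" && $2 == "Mode" { print $3 }'
}

# Demo on an excerpt of the status output shown above.
ha_mode <<'EOF'
HA Enabled             true
HA Cluster             https://10.xx.3.229:8201
HA Mode                standby
Active Node Address    http://10.xx.3.229:8200
EOF

# Possible usage against a running pod:
#   kubectl exec vault-0 -c vault -- vault status | ha_mode
```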
Finally, create the Service:
    apiVersion: v1
    kind: Service
    metadata:
      name: vault
      labels:
        app: vault
    spec:
      type: ClusterIP
      ports:
        - port: 8200
          targetPort: 8200
          protocol: TCP
          name: vault
      selector:
        app: vault
Summary
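- Highly available deployments need anti-affinity rules. Here we set anti-affinity between the Vault pods themselves, and between the Vault pods and the Consul pods.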
- Because PID 1 in the container is sh, we have to implement graceful shutdown ourselves with a preStop hook.