Kubernetes HPA: Controllable Autoscaling Based on Prometheus Custom Metrics

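The article "Are You Using Kubernetes Autoscaling Correctly?" explained in detail how to use autoscaling in Kubernetes. Kubernetes has three main kinds of autoscaling: HPA, VPA, and CA. They are not covered again here; interested readers can refer to that article. This article focuses on horizontal Pod autoscaling, HPA.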

With the release of Kubernetes v1.23, the HPA API reached its stable version, autoscaling/v2, which supports:

Scaling on custom metrics
Scaling on multiple metrics
Configurable scaling behavior

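From the initial v1 HPA, which supported scaling on CPU utilization only, to the later support for memory, custom metrics, and the aggregation layer API, and then to the configurable scaling behavior added in v1.18, HPA has become steadily more usable and reliable.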

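Scaling on CPU or memory metrics does not suit every system, and it is not always reliable. For most web backend systems, autoscaling based on RPS (requests per second) is a more dependable way to handle traffic bursts.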

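Prometheus is a popular open-source monitoring system, and it can provide real-time traffic load metrics for a system. In this article we try out autoscaling based on Prometheus custom metrics.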

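Note: scaling an HPA down to 0 (scale to 0) currently requires enabling the alpha HPAScaleToZero feature gate and configuring an object or external metric. Even with it enabled, scaling from 0 to 1 involves scheduling, IP allocation, image pulling, and so on, which carries some overhead. How to reduce that overhead is left for a follow-up article.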

All the code used in this article can be downloaded here.

Overall architecture

For HPA to read metric data from Prometheus, we introduce the Prometheus Adapter component. Prometheus Adapter implements the resource metrics, custom metrics, and external metrics APIs, and supports autoscaling/v2 HPA.

Once the metric data is available, HPA adjusts the replica count of the workload according to predefined rules.

Environment setup

K3s

We use the latest 1.23 release of K3s as the Kubernetes environment.

export INSTALL_K3S_VERSION=v1.23.1+k3s2
curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644 --write-kubeconfig ~/.kube/config

Sample application

We prepare a simple web application that counts requests and exposes the Prometheus-format metric http_requests_total on the /metrics endpoint.

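package main

import (
    "log"
    "net/http"
    "strconv"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    // Counter of handled requests, labelled by HTTP status code.
    metrics := prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "http_requests_total",
            Help: "Number of total http requests",
        },
        []string{"status"},
    )
    prometheus.MustRegister(metrics)
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        path := r.URL.Path
        statusCode := 200
        switch path {
        case "/metrics":
            promhttp.Handler().ServeHTTP(w, r)
        default:
            w.WriteHeader(statusCode)
            w.Write([]byte("Hello World!"))
        }
        // Every request is counted, including Prometheus scrapes of /metrics.
        metrics.WithLabelValues(strconv.Itoa(statusCode)).Inc()
    })
    log.Fatal(http.ListenAndServe(":3000", nil))
}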

Deploy the application to the cluster:

kubectl apply -f kubernetes/sample-httpserver-deployment.yaml

Prometheus

Install Prometheus with Helm. First add the prometheus chart repository:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

This test only needs prometheus-server, so the other components are disabled at install time. To make the demo respond promptly, the scrape interval is set to 10s.

# install prometheus with some components disabled
# set scrape interval to 10s
helm install prometheus prometheus-community/prometheus -n default --set alertmanager.enabled=false,pushgateway.enabled=false,nodeExporter.enabled=false,kubeStateMetrics.enabled=false,server.global.scrape_interval=10s

With port forwarding, the Prometheus web UI can be opened in a browser.

# port forward
kubectl port-forward svc/prometheus-server 9090:80 -n default

The per-Pod RPS can be queried with the expression sum(rate(http_requests_total[30s])) by (pod).
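
As a rough illustration of what rate(http_requests_total[30s]) computes, the Go sketch below derives a per-second rate from the counter samples inside one window. It is a simplification: the real PromQL rate() also extrapolates to the window boundaries, which is omitted here. The sample type and the simpleRate function are hypothetical names used only for this example.

```go
package main

import "fmt"

// sample is a hypothetical (timestamp, value) pair scraped from a counter.
type sample struct {
	tsSeconds float64
	value     float64
}

// simpleRate returns the average per-second increase of a counter over the
// given samples. A drop in value is treated as a counter reset.
func simpleRate(samples []sample) float64 {
	if len(samples) < 2 {
		return 0
	}
	increase := 0.0
	for i := 1; i < len(samples); i++ {
		delta := samples[i].value - samples[i-1].value
		if delta < 0 {
			// Counter reset (for example a pod restart): count from zero again.
			delta = samples[i].value
		}
		increase += delta
	}
	elapsed := samples[len(samples)-1].tsSeconds - samples[0].tsSeconds
	if elapsed <= 0 {
		return 0
	}
	return increase / elapsed
}

func main() {
	// Samples taken 10s apart, matching the 10s scrape interval set above.
	window := []sample{{0, 100}, {10, 600}, {20, 1100}, {30, 1600}}
	fmt.Printf("%.1f requests/s\n", simpleRate(window)) // prints "50.0 requests/s"
}
```

Grouping by pod, as the query does, simply means this calculation is carried out per pod label value.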

Prometheus Adapter

Prometheus Adapter is also installed with Helm, and it needs some extra configuration.

helm install prometheus-adapter prometheus-community/prometheus-adapter -n default -f kubernetes/values-adapter.yaml

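Besides configuring how to reach the Prometheus server, we also need to configure the rule for computing the custom metric, which tells the adapter how to fetch series from Prometheus and derive the metric we need: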

rules:
  default: false
  custom:
    - seriesQuery: '{__name__=~"^http_requests.*_total$",container!="POD",namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: { resource: "namespace" }
          pod: { resource: "pod" }
      name:
        matches: "(.*)_total"
        as: "${1}_qps"
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[30s])) by (<<.GroupBy>>)
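
To make the rule easier to read, here is a minimal Go sketch of its two transformations: renaming the series with the name.matches / name.as pair, and expanding the metricsQuery template. It only imitates the behavior with the standard library and is not the adapter's actual code. The helper names (renameMetric, renderQuery, queryArgs) and the label matcher string are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"text/template"
)

// renameMetric imitates name.matches "(.*)_total" with name.as "${1}_qps".
// The braces in ${1} matter: "$1_qps" would refer to a group named "1_qps".
func renameMetric(series string) string {
	re := regexp.MustCompile(`(.*)_total`)
	return re.ReplaceAllString(series, "${1}_qps")
}

// queryArgs holds the three fields referenced by the metricsQuery template.
type queryArgs struct {
	Series        string
	LabelMatchers string
	GroupBy       string
}

// renderQuery expands the metricsQuery template, which uses << >> delimiters
// instead of the text/template default {{ }}.
func renderQuery(args queryArgs) (string, error) {
	const metricsQuery = `sum(rate(<<.Series>>{<<.LabelMatchers>>}[30s])) by (<<.GroupBy>>)`
	tmpl, err := template.New("metricsQuery").Delims("<<", ">>").Parse(metricsQuery)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, args); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	fmt.Println(renameMetric("http_requests_total")) // prints "http_requests_qps"

	q, err := renderQuery(queryArgs{
		Series:        "http_requests_total",
		LabelMatchers: `namespace="default",pod=~"sample-httpserver-.*"`, // hypothetical matchers
		GroupBy:       "pod",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
}
```

The rendered query has the same shape as the expression used earlier in the Prometheus UI, restricted to the pods the HPA asks about.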

See the detailed Adapter configuration for reference.

Once the prometheus-adapter pod is running, the custom.metrics.k8s.io API can be queried:

kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_qps' | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/http_requests_qps"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "sample-httpserver-64c495844f-b58pl",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests_qps",
      "timestamp": "2022-01-18T03:32:51Z",
      "value": "100m",
      "selector": null
    }
  ]
}
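
If the response needs to be consumed from a program rather than through jq, it can be decoded with the standard library. The sketch below models only the fields used here. The type and function names (metricValueList, podMetricValues) are hypothetical, and the JSON literal is a trimmed copy of the response above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metricValueList models only the fields of the response used in this sketch.
type metricValueList struct {
	Kind  string `json:"kind"`
	Items []struct {
		DescribedObject struct {
			Kind      string `json:"kind"`
			Namespace string `json:"namespace"`
			Name      string `json:"name"`
		} `json:"describedObject"`
		MetricName string `json:"metricName"`
		Value      string `json:"value"`
	} `json:"items"`
}

// podMetricValues maps each pod name to its raw metric value string.
func podMetricValues(raw []byte) (map[string]string, error) {
	var list metricValueList
	if err := json.Unmarshal(raw, &list); err != nil {
		return nil, err
	}
	values := make(map[string]string, len(list.Items))
	for _, item := range list.Items {
		values[item.DescribedObject.Name] = item.Value
	}
	return values, nil
}

func main() {
	raw := []byte(`{
	  "kind": "MetricValueList",
	  "items": [{
	    "describedObject": {"kind": "Pod", "namespace": "default", "name": "sample-httpserver-64c495844f-b58pl"},
	    "metricName": "http_requests_qps",
	    "value": "100m"
	  }]
	}`)
	values, err := podMetricValues(raw)
	if err != nil {
		panic(err)
	}
	for pod, value := range values {
		fmt.Printf("%s: %s\n", pod, value)
	}
}
```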

Note: the value here is 100m. The suffix "m" stands for milli-requests per second, so 100m means 0.1 requests per second.
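
The unit conversion can be sketched in a few lines of Go. This is a deliberately simplified parser written for this article: it handles only plain numbers and the "m" suffix, whereas the real Kubernetes quantity format supports more suffixes. The function name parseMilliQuantity is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMilliQuantity converts a value such as "100m" or "50" to a float.
// Only plain numbers and the milli suffix "m" are handled in this sketch.
func parseMilliQuantity(s string) (float64, error) {
	if strings.HasSuffix(s, "m") {
		milli, err := strconv.ParseFloat(strings.TrimSuffix(s, "m"), 64)
		if err != nil {
			return 0, err
		}
		return milli / 1000, nil
	}
	return strconv.ParseFloat(s, 64)
}

func main() {
	for _, q := range []string{"100m", "50000m", "50"} {
		v, err := parseMilliQuantity(q)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s = %g requests/s\n", q, v)
	}
}
```

Under this reading, 50000m and 50 denote the same rate of 50 requests per second.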

HPA

The last step is the HPA configuration:

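Set the minimum and maximum replica counts to 1 and 10
Configure the scaling behavior (behavior) so that the test reacts quickly
Specify the metric http_requests_qps, the type Pods, and the target value 50000m, which means an average of 50 RPS per pod. For example, at 300 RPS the replica count is 300/50 = 6.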
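
The 300/50 = 6 arithmetic follows from the replica formula in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The Go sketch below applies that formula and clamps the result to the min/max range. It is an illustration only: the real controller also applies a tolerance around the target and skips pods that are not ready, both omitted here. The function name and parameters are chosen for this example.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies ceil(currentReplicas * currentAvg / targetAvg)
// and clamps the result to [minReplicas, maxReplicas].
func desiredReplicas(currentReplicas int, currentAvg, targetAvg float64, minReplicas, maxReplicas int) int {
	desired := int(math.Ceil(float64(currentReplicas) * currentAvg / targetAvg))
	if desired < minReplicas {
		desired = minReplicas
	}
	if desired > maxReplicas {
		desired = maxReplicas
	}
	return desired
}

func main() {
	// With a single replica, the per-pod average equals the total request rate.
	for _, rps := range []float64{300, 240, 120, 40} {
		fmt.Printf("%3.0f RPS -> %d replicas\n", rps, desiredReplicas(1, rps, 50, 1, 10))
	}
}
```

With a target of 50 RPS per pod, the sketch yields 6, 5, 3, and 1 replicas for 300, 240, 120, and 40 RPS respectively.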

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2
metadata:
  name: sample-httpserver
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-httpserver
  minReplicas: 1
  maxReplicas: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 30
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_qps
        target:
          type: AverageValue
          averageValue: 50000m
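
The behavior section limits how fast the replica count may change. As a hedged sketch of the scaleUp policy above (Percent, value 100, periodSeconds 15), the Go code below assumes the limit for one period is the replica count at the start of the period multiplied by (1 + percent/100), rounded up, and that there is one scaling step per period. The real controller tracks recent scaling events in more detail, so treat the numbers as an approximation. Both function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// scaleUpLimit returns the largest replica count a Percent policy allows
// within one period, given the replica count at the start of that period.
func scaleUpLimit(periodStartReplicas int, percent float64) int {
	return int(math.Ceil(float64(periodStartReplicas) * (1 + percent/100)))
}

// periodsToReach counts how many policy periods are needed to grow from
// start to desired, assuming one scaling step per period.
// It returns -1 if the policy allows no growth.
func periodsToReach(start, desired int, percent float64) int {
	periods := 0
	current := start
	for current < desired {
		limit := scaleUpLimit(current, percent)
		if limit <= current {
			return -1 // the policy allows no growth, so the target is unreachable
		}
		if limit > desired {
			limit = desired
		}
		current = limit
		periods++
	}
	return periods
}

func main() {
	// A 100% policy at most doubles the replica count per period: 1 -> 2 -> 4 -> 6.
	n := periodsToReach(1, 6, 100)
	fmt.Printf("1 -> 6 replicas takes %d periods (about %ds)\n", n, n*15)
}
```

Under these assumptions, growing from 1 to 6 replicas takes three 15-second periods, which is why a lower percent value or a longer period makes scale-up more gradual.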

Testing

We use vegeta as the load-testing tool because it lets us specify the RPS.

First create a NodePort service for the application:

kubectl expose deploy sample-httpserver --name sample-httpserver-host --type NodePort --target-port 3000

kubectl get svc sample-httpserver-host
NAME                     TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
sample-httpserver-host   NodePort   10.43.66.206   <none>        3000:31617/TCP   12h

Send requests at 240, 120, and 40 RPS in turn:

# 240
echo "GET http://192.168.1.92:31617" | vegeta attack -duration 60s -connections 10 -rate 240 | vegeta report
# 120
echo "GET http://192.168.1.92:31617" | vegeta attack -duration 60s -connections 10 -rate 120 | vegeta report
# 40
echo "GET http://192.168.1.92:31617" | vegeta attack -duration 60s -connections 10 -rate 40 | vegeta report

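The changes in request volume and replica count can be observed in the Prometheus web UI, and the scaling events show up in the HPA description: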

kubectl describe hpa sample-httpserver
Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler
Name:                           sample-httpserver
Namespace:                      default
Labels:                         <none>
Annotations:                    <none>
CreationTimestamp:              Mon, 17 Jan 2022 23:18:46 +0800
Reference:                      Deployment/sample-httpserver
Metrics:                        ( current / target )
  "http_requests_qps" on pods:  100m / 50
Min replicas:                   1
Max replicas:                   10
Behavior:
  Scale Up:
    Stabilization Window: 0 seconds
    Select Policy: Max
    Policies:
      - Type: Percent  Value: 100  Period: 15 seconds
  Scale Down:
    Stabilization Window: 30 seconds
    Select Policy: Max
    Policies:
      - Type: Percent  Value: 100  Period: 15 seconds
Deployment pods:       1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric http_requests_qps
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age                  From                       Message
  ----    ------             ----                 ----                       -------
  Normal  SuccessfulRescale  25m                  horizontal-pod-autoscaler  New size: 6; reason: pods metric http_requests_qps above target
  Normal  SuccessfulRescale  19m                  horizontal-pod-autoscaler  New size: 4; reason: All metrics below target
  Normal  SuccessfulRescale  12m (x2 over 9h)     horizontal-pod-autoscaler  New size: 4; reason: pods metric http_requests_qps above target
  Normal  SuccessfulRescale  11m                  horizontal-pod-autoscaler  New size: 5; reason: pods metric http_requests_qps above target
  Normal  SuccessfulRescale  9m40s (x2 over 12m)  horizontal-pod-autoscaler  New size: 2; reason: pods metric http_requests_qps above target
  Normal  SuccessfulRescale  9m24s (x4 over 10h)  horizontal-pod-autoscaler  New size: 3; reason: pods metric http_requests_qps above target
  Normal  SuccessfulRescale  7m54s (x3 over 9h)   horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal  SuccessfulRescale  7m39s (x4 over 9h)   horizontal-pod-autoscaler  New size: 1; reason: All metrics below target

Summary

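Scaling an application horizontally on a custom metric such as requests per second is more dependable than relying on CPU or memory, and it suits most web systems. It allows fast scale-out during traffic bursts, and controlling the scaling behavior reduces flapping of the replica count. Prometheus, as a widely used monitoring system, can serve as the source of scaling metrics with the support of the Adapter and the aggregation API.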

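HPA scale to 0 is still in alpha, and the time it takes to go from 0 to N replicas also needs attention. If the minimum replica count is greater than 0, some services will occupy resources even when idle. Next, we will try to address the performance of scaling from 0 to N as well as the resource usage problem.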

Articles are published on the WeChat official account 云原生指北.