Kubernetes v1.18 added the behavior field to HPA v2beta2, which allows fine-grained control over scaling behavior.

If the behavior field is not specified, scaling follows the default behavior.

1. Default behavior

The default behavior:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300   # cooldown window = 5min
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0     # no cooldown window
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15               # double the replicas
    - type: Pods
      value: 4
      periodSeconds: 15               # add 4 replicas
    selectPolicy: Max                 # take the larger result

Scale-up is fast:

  • Historical recommendations are ignored (stabilizationWindowSeconds=0): as soon as reconcile() detects a metric change and computes the new replica count, it scales immediately;
  • Every 15 seconds, at most:

    • double the replicas (add currentReplicas*100% more replicas) or add 4 replicas, whichever is larger;
    • i.e. the per-period ceiling is max(2*currentReplicas, 4)
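The per-period ceiling above can be sketched as a pure function (a minimal sketch; the function name is ours, but the constants match the default policy):

```go
package main

import "fmt"

// scaleUpLimit returns the most replicas the HPA may reach in one period
// under the default policy: double the current count, but never less than 4.
func scaleUpLimit(currentReplicas int32) int32 {
	if limit := 2 * currentReplicas; limit > 4 {
		return limit
	}
	return 4
}

func main() {
	fmt.Println(scaleUpLimit(1)) // 4  (the Pods=4 policy wins)
	fmt.Println(scaleUpLimit(3)) // 6  (the Percent=100 policy wins)
	fmt.Println(scaleUpLimit(8)) // 16
}
```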

Scale-down is slow:

  • The final replica count after scaling down must not be lower than the maximum recommendation of the past 5 minutes (stabilizationWindowSeconds=300), i.e. the cooldown window = 300s;
  • Every 15 seconds, at most currentReplicas*100% replicas may be removed;

2. Demo

Scale-up: spike the metric (1 --> 13):

  • first, scale from 1 replica to 4;
  • then, from 4 replicas to 8;
  • finally, from 8 replicas to the metric-computed target of 13;
# kubectl describe hpa
Name:                    sample-app
Namespace:               default
Labels:                  <none>
Annotations:             <none>
Reference:               Deployment/sample-app
Metrics:                 ( current / target )
  "metric_hpa" on pods:  1 / 1
Min replicas:            1
Max replicas:            15
Deployment pods:         13 current / 13 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric metric_hpa
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age    From                       Message
  ----    ------             ----   ----                       -------
  Normal  SuccessfulRescale  2m14s  horizontal-pod-autoscaler  New size: 4; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  119s   horizontal-pod-autoscaler  New size: 8; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  103s   horizontal-pod-autoscaler  New size: 13; reason: pods metric metric_hpa above target
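The three rescale events (1 → 4 → 8 → 13) follow directly from the default policy; a short simulation (the helper name is ours; the min/max values are taken from this HPA) reproduces the sequence:

```go
package main

import "fmt"

// simulateScaleUp applies the default scale-up rule repeatedly:
// each period the HPA may grow to min(maxReplicas, max(2*current, 4), desired).
func simulateScaleUp(current, desired, maxReplicas int32) []int32 {
	var steps []int32
	for current < desired {
		next := 2 * current // Percent=100 policy
		if next < 4 {       // Pods=4 policy, selectPolicy: Max
			next = 4
		}
		if next > desired {
			next = desired
		}
		if next > maxReplicas {
			next = maxReplicas
		}
		steps = append(steps, next)
		current = next
	}
	return steps
}

func main() {
	fmt.Println(simulateScaleUp(1, 13, 15)) // [4 8 13], matching the events above
}
```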

Scale-down: drop the metric (13 --> 1):

  • after 5 minutes, the HPA scales replicas directly down to minReplicas=1;
# kubectl describe hpa
Name:                    sample-app
Namespace:               default
Labels:                  <none>
Annotations:             <none>
Reference:               Deployment/sample-app
Metrics:                 ( current / target )
  "metric_hpa" on pods:  1 / 1
Min replicas:            1
Max replicas:            15
Deployment pods:         1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric metric_hpa
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  11m   horizontal-pod-autoscaler  New size: 4; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  11m   horizontal-pod-autoscaler  New size: 8; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  11m   horizontal-pod-autoscaler  New size: 13; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  98s   horizontal-pod-autoscaler  New size: 1; reason: All metrics below target

3. Source code analysis

Based on the v1.20 source.

The code entry point:

  • Input parameter: prenormalizedDesiredReplicas = the replica count computed from the metrics;
  • First, the stabilizedRecommendation replica count is computed using the default cooldown window;
  • Then, the final replica count is computed from stabilizedRecommendation plus the default scaling policies;
// pkg/controller/podautoscaler/horizontal.go
func (a *HorizontalController) normalizeDesiredReplicas(hpa *autoscalingv2.HorizontalPodAutoscaler, key string, currentReplicas int32, prenormalizedDesiredReplicas int32, minReplicas int32) int32 {
    // cooldown window
    stabilizedRecommendation := a.stabilizeRecommendation(key, prenormalizedDesiredReplicas)
    ...
    // scaling policies
    desiredReplicas, condition, reason := convertDesiredReplicasWithRules(currentReplicas, stabilizedRecommendation, minReplicas, hpa.Spec.MaxReplicas)
    ...
    return desiredReplicas
}

First, the cooldown window:

  • Compute maxRecommendation:

    • initial value = prenormalizedDesiredReplicas, the replica count computed from the metrics;
  • Iterate over the historical recommendation records:

    • if a record falls within the last 5 minutes && its replica count > maxRecommendation, maxRecommendation is overwritten;
  • From this logic:

    • on scale-up, the recent oldReplicas < newReplicas, so the return value = prenormalizedDesiredReplicas, i.e. scale up immediately;
    • on scale-down, the result is max(oldReplicas within 5min, newReplicas); since oldReplicas > newReplicas during scale-down, the scale-down only takes effect after waiting 5 minutes;
// pkg/controller/podautoscaler/horizontal.go
func (a *HorizontalController) stabilizeRecommendation(key string, prenormalizedDesiredReplicas int32) int32 {
    maxRecommendation := prenormalizedDesiredReplicas
    foundOldSample := false
    oldSampleIndex := 0
    cutoff := time.Now().Add(-a.downscaleStabilisationWindow)    // 300s
    for i, rec := range a.recommendations[key] {
        if rec.timestamp.Before(cutoff) {
            foundOldSample = true
            oldSampleIndex = i
        } else if rec.recommendation > maxRecommendation {
            maxRecommendation = rec.recommendation
        }
    }
    if foundOldSample {
        a.recommendations[key][oldSampleIndex] = timestampedRecommendation{prenormalizedDesiredReplicas, time.Now()}
    } else {
        a.recommendations[key] = append(a.recommendations[key], timestampedRecommendation{prenormalizedDesiredReplicas, time.Now()})
    }
    return maxRecommendation
}

Now the scaling policies:

  • On scale-up, final replicas = min(hpaMaxReplicas, max(2*currentReplicas, 4), desiredReplicas);

    • that is, if the computed desiredReplicas is too large, it is capped by max(2*currentReplicas, 4);
  • On scale-down, final replicas = max(desiredReplicas, hpaMinReplicas);
// pkg/controller/podautoscaler/horizontal.go
func convertDesiredReplicasWithRules(currentReplicas, desiredReplicas, hpaMinReplicas, hpaMaxReplicas int32) (int32, string, string) {
    var minimumAllowedReplicas int32
    var maximumAllowedReplicas int32
    var possibleLimitingCondition string
    var possibleLimitingReason string
    minimumAllowedReplicas = hpaMinReplicas
    // Do not upscale too much to prevent incorrect rapid increase of the number of master replicas caused by
    // bogus CPU usage report from heapster/kubelet (like in issue #32304).
    scaleUpLimit := calculateScaleUpLimit(currentReplicas)
    if hpaMaxReplicas > scaleUpLimit {
        maximumAllowedReplicas = scaleUpLimit
        possibleLimitingCondition = "ScaleUpLimit"
        possibleLimitingReason = "the desired replica count is increasing faster than the maximum scale rate"
    } else {
        maximumAllowedReplicas = hpaMaxReplicas
        possibleLimitingCondition = "TooManyReplicas"
        possibleLimitingReason = "the desired replica count is more than the maximum replica count"
    }
    if desiredReplicas < minimumAllowedReplicas {
        possibleLimitingCondition = "TooFewReplicas"
        possibleLimitingReason = "the desired replica count is less than the minimum replica count"
        return minimumAllowedReplicas, possibleLimitingCondition, possibleLimitingReason
    } else if desiredReplicas > maximumAllowedReplicas {
        return maximumAllowedReplicas, possibleLimitingCondition, possibleLimitingReason
    }
    return desiredReplicas, "DesiredWithinRange", "the desired count is within the acceptable range"
}
func calculateScaleUpLimit(currentReplicas int32) int32 {
    return int32(math.Max(scaleUpLimitFactor*float64(currentReplicas), scaleUpLimitMinimum))    // scaleUpLimitFactor=2, scaleUpLimitMinimum=4
}
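Putting the two pieces together: desired is raised to at least hpaMinReplicas, otherwise capped at min(hpaMaxReplicas, max(2*currentReplicas, 4)). A condensed reimplementation (a sketch with the same constants, scaleUpLimitFactor=2 and scaleUpLimitMinimum=4; the function name is ours) makes this checkable:

```go
package main

import "fmt"

// clamp condenses convertDesiredReplicasWithRules plus calculateScaleUpLimit:
// desired is raised to at least hpaMin, otherwise capped at
// min(hpaMax, max(2*current, 4)).
func clamp(current, desired, hpaMin, hpaMax int32) int32 {
	scaleUpLimit := 2 * current // scaleUpLimitFactor = 2
	if scaleUpLimit < 4 {       // scaleUpLimitMinimum = 4
		scaleUpLimit = 4
	}
	maxAllowed := hpaMax
	if scaleUpLimit < maxAllowed {
		maxAllowed = scaleUpLimit
	}
	switch {
	case desired < hpaMin:
		return hpaMin
	case desired > maxAllowed:
		return maxAllowed
	default:
		return desired
	}
}

func main() {
	fmt.Println(clamp(1, 13, 1, 15)) // 4: capped by max(2*1, 4)
	fmt.Println(clamp(8, 13, 1, 15)) // 13: within the allowed range
	fmt.Println(clamp(13, 1, 1, 15)) // 1: scale-down only bounded by hpaMin
}
```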
