Kubernetes: the default HPA scaling policy

Kubernetes v1.18 added the behavior field to the HPA v2beta2 API, which allows fine-grained control over scaling behavior.

If the behavior field is not specified, scaling follows the default behavior.

1. Default behavior

The default behavior:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300            # cooldown = 5min
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0            # no cooldown
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15                        # double the replicas
    - type: Pods
      value: 4
      periodSeconds: 15                        # add 4 replicas
    selectPolicy: Max                        # pick the policy that allows the largest change

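Scale-up is fast:

  • No historical recommendations are considered (stabilizationWindowSeconds=0): when reconcile() sees the metric change, it computes the new replicas and scales immediately;
  • Every 15 seconds, at most:

    • double the replicas (add currentReplicas*100% replicas), or add 4 replicas, whichever is larger;
    • in the v1.20 code analysed in section 3, this works out to a ceiling of max(2*currentReplicas, 4).

Scale-down is slow:

  • The final replica count after scale-down may not be lower than the highest value recommended during the past 5 minutes (stabilizationWindowSeconds=300), i.e. the cooldown is 300s;
  • Every 15 seconds, at most currentReplicas*100% replicas may be removed.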

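2. Demo

Scale-up: the metric surges (1 -> 13):

  • first, the Deployment scales from 1 replica to 4 replicas;
  • then, from 4 replicas to 8 replicas;
  • finally, from 8 replicas to the 13 replicas computed from the metric;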
# kubectl describe hpa
Name:                    sample-app
Namespace:               default
Labels:                  <none>
Annotations:             <none>
Reference:               Deployment/sample-app
Metrics:                 (current / target)
  "metric_hpa" on pods:  1 / 1
Min replicas:            1
Max replicas:            15
Deployment pods:         13 current / 13 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric metric_hpa
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age    From                       Message
  ----    ------             ----   ----                       -------
  Normal  SuccessfulRescale  2m14s  horizontal-pod-autoscaler  New size: 4; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  119s   horizontal-pod-autoscaler  New size: 8; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  103s   horizontal-pod-autoscaler  New size: 13; reason: pods metric metric_hpa above target
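
The three rescale events above can be reproduced with a few lines of Go. This is only a sketch under the article's own rule that one step may reach at most max(2*currentReplicas, 4); scaleUpCeiling and simulateScaleUp are hypothetical helper names, not Kubernetes functions, and the loop ignores timing between reconcile rounds.

package main

import "fmt"

// scaleUpCeiling applies the rule quoted in this article: after one
// scaling step there may be at most max(2*current, 4) replicas.
func scaleUpCeiling(current int32) int32 {
	if doubled := 2 * current; doubled > 4 {
		return doubled
	}
	return 4
}

// simulateScaleUp (hypothetical helper) repeatedly applies
// min(maxReplicas, ceiling, desired) and records every intermediate size
// until the desired count is reached or maxReplicas blocks further progress.
func simulateScaleUp(current, desired, maxReplicas int32) []int32 {
	var steps []int32
	for current < desired {
		next := desired
		if c := scaleUpCeiling(current); next > c {
			next = c
		}
		if next > maxReplicas {
			next = maxReplicas
		}
		if next <= current {
			break // capped by maxReplicas, nothing left to do
		}
		current = next
		steps = append(steps, current)
	}
	return steps
}

func main() {
	// Same numbers as the demo: 1 replica, the metric asks for 13, maxReplicas=15.
	fmt.Println(simulateScaleUp(1, 13, 15)) // [4 8 13]
}

The output matches the "New size" values in the events: 4, then 8, then 13.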

Scale-down: the metric drops sharply (13 -> 1):

  • after 5 minutes, the HPA scales replicas directly down to minReplicas=1;
# kubectl describe hpa
Name:                    sample-app
Namespace:               default
Labels:                  <none>
Annotations:             <none>
Reference:               Deployment/sample-app
Metrics:                 (current / target)
  "metric_hpa" on pods:  1 / 1
Min replicas:            1
Max replicas:            15
Deployment pods:         1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric metric_hpa
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  11m   horizontal-pod-autoscaler  New size: 4; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  11m   horizontal-pod-autoscaler  New size: 8; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  11m   horizontal-pod-autoscaler  New size: 13; reason: pods metric metric_hpa above target
  Normal  SuccessfulRescale  98s   horizontal-pod-autoscaler  New size: 1; reason: All metrics below target

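3. Source code analysis

Based on the v1.20 source.

Entry point:

  • Input parameter: prenormalizedDesiredReplicas = the replica count computed from the metrics;
  • First, the default cooldown is applied to obtain the stabilizedRecommendation replica count;
  • Then, the final replica count is computed from stabilizedRecommendation plus the default scaling policy;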
// pkg/controller/podautoscaler/horizontal.go
func (a *HorizontalController) normalizeDesiredReplicas(hpa *autoscalingv2.HorizontalPodAutoscaler, key string, currentReplicas int32, prenormalizedDesiredReplicas int32, minReplicas int32) int32 {
    // cooldown (stabilization window)
    stabilizedRecommendation := a.stabilizeRecommendation(key, prenormalizedDesiredReplicas)
    ...
    // scaling policy
    desiredReplicas, condition, reason := convertDesiredReplicasWithRules(currentReplicas, stabilizedRecommendation, minReplicas, hpa.Spec.MaxReplicas)
    ...
    return desiredReplicas
}

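First, the cooldown:

  • Compute maxRecommendation:

    • it starts as prenormalizedDesiredReplicas, i.e. the replicas computed from the metrics;
  • Iterate over the recorded recommendations:

    • if a record falls within the last 5 minutes and its replicas > maxRecommendation, it overwrites maxRecommendation;
  • Following this logic:

    • when scaling up, the recent old recommendations < the new one, so the return value = prenormalizedDesiredReplicas, i.e. scale-up happens immediately;
    • when scaling down, the result is max(recommendations within the last 5 minutes, new recommendation); because the old recommendations > the new one, the scale-down only takes effect after 5 minutes;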
// pkg/controller/podautoscaler/horizontal.go
func (a *HorizontalController) stabilizeRecommendation(key string, prenormalizedDesiredReplicas int32) int32 {
    maxRecommendation := prenormalizedDesiredReplicas
    foundOldSample := false
    oldSampleIndex := 0
    cutoff := time.Now().Add(-a.downscaleStabilisationWindow)    // 300s
    for i, rec := range a.recommendations[key] {
        if rec.timestamp.Before(cutoff) {
            foundOldSample = true
            oldSampleIndex = i
        } else if rec.recommendation > maxRecommendation {
            maxRecommendation = rec.recommendation
        }
    }
    if foundOldSample {
        a.recommendations[key][oldSampleIndex] = timestampedRecommendation{prenormalizedDesiredReplicas, time.Now()}
    } else {
        a.recommendations[key] = append(a.recommendations[key], timestampedRecommendation{prenormalizedDesiredReplicas, time.Now()})
    }
    return maxRecommendation
}
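
To see why a scale-down is held back for 5 minutes, the same idea can be run against a simulated clock. The sketch below is a simplified stand-in, not the controller code: the stabilizer type, recommend and demoTimeline are hypothetical names, the current time is passed in as a parameter, expired samples are dropped instead of being overwritten in place, and the start date is arbitrary.

package main

import (
	"fmt"
	"time"
)

// sample is one recorded recommendation.
type sample struct {
	replicas int32
	at       time.Time
}

// stabilizer is a hypothetical, simplified history for a single HPA.
type stabilizer struct {
	window  time.Duration
	history []sample
}

// recommend returns the highest recommendation inside the window,
// including the new one, and then records the new one.
func (s *stabilizer) recommend(desired int32, now time.Time) int32 {
	best := desired
	cutoff := now.Add(-s.window)
	kept := s.history[:0]
	for _, h := range s.history {
		if h.at.Before(cutoff) {
			continue // older than the window: no longer holds the count up
		}
		kept = append(kept, h)
		if h.replicas > best {
			best = h.replicas
		}
	}
	s.history = append(kept, sample{desired, now})
	return best
}

// demoTimeline replays: 13 recommended at t0, then 1 at t0+1min and t0+6min.
func demoTimeline() []int32 {
	s := &stabilizer{window: 5 * time.Minute}
	t0 := time.Date(2021, 1, 1, 0, 0, 0, 0, time.UTC) // arbitrary start
	return []int32{
		s.recommend(13, t0),                  // scale-up passes straight through
		s.recommend(1, t0.Add(1*time.Minute)), // held at 13 by the sample from t0
		s.recommend(1, t0.Add(6*time.Minute)), // the 13 has expired
	}
}

func main() {
	fmt.Println(demoTimeline()) // [13 13 1]
}

The middle value shows the cooldown at work: the metric already asks for 1 replica, but the 13 recorded a minute earlier is still inside the window.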

Next, the scaling policy:

  • On scale-up, final replicas = min(hpaMaxReplicas, max(2*currentReplicas, 4), desiredReplicas);

    • in other words, if the computed desiredReplicas is too large, it is capped by max(2*currentReplicas, 4);
  • On scale-down, final replicas = max(desiredReplicas, hpaMinReplicas);
// pkg/controller/podautoscaler/horizontal.go
func convertDesiredReplicasWithRules(currentReplicas, desiredReplicas, hpaMinReplicas, hpaMaxReplicas int32) (int32, string, string) {
    var minimumAllowedReplicas int32
    var maximumAllowedReplicas int32
    var possibleLimitingCondition string
    var possibleLimitingReason string
    minimumAllowedReplicas = hpaMinReplicas
    // Do not upscale too much to prevent incorrect rapid increase of the number of master replicas caused by
    // bogus CPU usage report from heapster/kubelet (like in issue #32304).
    scaleUpLimit := calculateScaleUpLimit(currentReplicas)
    if hpaMaxReplicas > scaleUpLimit {
        maximumAllowedReplicas = scaleUpLimit
        possibleLimitingCondition = "ScaleUpLimit"
        possibleLimitingReason = "the desired replica count is increasing faster than the maximum scale rate"
    } else {
        maximumAllowedReplicas = hpaMaxReplicas
        possibleLimitingCondition = "TooManyReplicas"
        possibleLimitingReason = "the desired replica count is more than the maximum replica count"
    }
    if desiredReplicas < minimumAllowedReplicas {
        possibleLimitingCondition = "TooFewReplicas"
        possibleLimitingReason = "the desired replica count is less than the minimum replica count"
        return minimumAllowedReplicas, possibleLimitingCondition, possibleLimitingReason
    } else if desiredReplicas > maximumAllowedReplicas {
        return maximumAllowedReplicas, possibleLimitingCondition, possibleLimitingReason
    }
    return desiredReplicas, "DesiredWithinRange", "the desired count is within the acceptable range"
}

func calculateScaleUpLimit(currentReplicas int32) int32 {
    // scaleUpLimitFactor=2, scaleUpLimitMinimum=4
    return int32(math.Max(scaleUpLimitFactor*float64(currentReplicas), scaleUpLimitMinimum))
}
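
As a quick check of which limiting condition applies in which case, here is a condensed, runnable restatement of the function above. It is a sketch only: clamp is a hypothetical name, it returns just the replica count and the condition name, and the two constants are taken from the comment in the listing (factor 2, minimum 4).

package main

import (
	"fmt"
	"math"
)

const (
	scaleUpLimitFactor  = 2.0
	scaleUpLimitMinimum = 4.0
)

func calculateScaleUpLimit(currentReplicas int32) int32 {
	return int32(math.Max(scaleUpLimitFactor*float64(currentReplicas), scaleUpLimitMinimum))
}

// clamp (hypothetical) condenses convertDesiredReplicasWithRules to the
// final replica count and the name of the condition that decided it.
func clamp(current, desired, hpaMin, hpaMax int32) (int32, string) {
	maxAllowed, limit := hpaMax, "TooManyReplicas"
	if up := calculateScaleUpLimit(current); hpaMax > up {
		maxAllowed, limit = up, "ScaleUpLimit"
	}
	switch {
	case desired < hpaMin:
		return hpaMin, "TooFewReplicas"
	case desired > maxAllowed:
		return maxAllowed, limit
	}
	return desired, "DesiredWithinRange"
}

func main() {
	fmt.Println(clamp(1, 13, 1, 15)) // 4 ScaleUpLimit
	fmt.Println(clamp(8, 13, 1, 15)) // 13 DesiredWithinRange
	fmt.Println(clamp(8, 20, 1, 15)) // 15 TooManyReplicas
	fmt.Println(clamp(13, 0, 1, 15)) // 1 TooFewReplicas
}

Note that the scale-up limit only wins when it is lower than hpaMaxReplicas; otherwise hpaMaxReplicas is the effective upper bound.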

References:

1. https://zhuanlan.zhihu.com/p/245208287
