On monitoring tools: parsing alarm-judgment expressions in Openfalcon-judge

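When the judge component evaluates an alarm, it parses the configured alarm strategy into an fn, and fn.Compute() decides whether the alarm is triggered. For example: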

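  • all(#3)>90 triggers when each of the last 3 data points is > 90;
  • max(#3)>90 triggers when the maximum of the last 3 data points is > 90;
  • min(#3)<10 triggers when the minimum of the last 3 data points is < 10;
  • avg(#3)>90 triggers when the average of the last 3 data points is > 90;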

This article covers how fn is generated and how fn is computed.
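To make the four rules concrete, here is a minimal sketch that evaluates them against one sample window. The helper names (allAbove, maxOf, minOf, avgOf) and the sample values are illustrative assumptions; they are not part of the judge code base, which works on its own history types rather than a plain slice.

```go
package main

import "fmt"

// allAbove reports whether every sample is strictly greater than threshold.
func allAbove(vs []float64, threshold float64) bool {
	for _, v := range vs {
		if !(v > threshold) {
			return false
		}
	}
	return true
}

// maxOf returns the largest sample; vs must be non-empty.
func maxOf(vs []float64) float64 {
	m := vs[0]
	for _, v := range vs[1:] {
		if v > m {
			m = v
		}
	}
	return m
}

// minOf returns the smallest sample; vs must be non-empty.
func minOf(vs []float64) float64 {
	m := vs[0]
	for _, v := range vs[1:] {
		if v < m {
			m = v
		}
	}
	return m
}

// avgOf returns the arithmetic mean; vs must be non-empty.
func avgOf(vs []float64) float64 {
	sum := 0.0
	for _, v := range vs {
		sum += v
	}
	return sum / float64(len(vs))
}

func main() {
	recent := []float64{95, 88, 93} // hypothetical last 3 data points

	fmt.Println(allAbove(recent, 90)) // all(#3)>90 -> false (88 is not > 90)
	fmt.Println(maxOf(recent) > 90)   // max(#3)>90 -> true  (max is 95)
	fmt.Println(minOf(recent) < 10)   // min(#3)<10 -> false (min is 88)
	fmt.Println(avgOf(recent) > 90)   // avg(#3)>90 -> true  (avg is 92)
}
```

The same window triggers max and avg but not all, which is why the choice of function matters as much as the threshold.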

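1. Alarm judgment

The entry point for alarm judgment was introduced in the earlier walkthrough of the judge code:

  • ParseFuncFromString() generates fn;
  • fn.Compute(L) computes over the most recent data and decides whether the alarm is triggered;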
// modules/judge/store/judge.go
func judgeItemWithStrategy(L *SafeLinkedList, strategy model.Strategy, firstItem *model.JudgeItem, now int64) {
    fn, err := ParseFuncFromString(strategy.Func, strategy.Operator, strategy.RightValue)
    if err != nil {
        // The strategy could not be parsed, so there is nothing to evaluate
        return
    }
    historyData, leftValue, isTriggered, isEnough := fn.Compute(L)
    // Too few data points so far to make an alarm judgment
    if !isEnough {
        return
    }
    ......
}

2. How fn is generated

fn is generated from the string configured in the alarm strategy. For example, the configuration all(#3) > 90 is passed in as:

  • str = all(#3)
  • operator = >
  • rightValue = 90
// modules/judge/store/func.go
// @str: e.g. all(#3) sum(#3) avg(#10)
// @operator: > <
func ParseFuncFromString(str string, operator string, rightValue float64) (fn Function, err error) {
    if str == "" {
        return nil, fmt.Errorf("func can not be null!")
    }
    // The text between '#' and the closing ')' is the argument list
    idx := strings.Index(str, "#")
    args, err := atois(str[idx+1 : len(str)-1])
    if err != nil {
        return nil, err
    }

    // The text before "(#" is the function name
    switch str[:idx-1] {
    case "max":
        fn = &MaxFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "min":
        fn = &MinFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "all":
        fn = &AllFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "sum":
        fn = &SumFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "avg":
        fn = &AvgFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    ......
    default:
        err = fmt.Errorf("not_supported_method")
    }

    return
}
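The slicing in ParseFuncFromString can be tried in isolation. The sketch below is a standalone illustration, not the judge source: splitFunc and parseInts are hypothetical names, parseInts stands in for atois (whose body is not shown in this article) under the assumption that it parses a comma-separated list of integers, and the extra input validation is an addition of this sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitFunc mirrors the slicing used above: "all(#3)" yields the name "all"
// and the raw argument text "3". Unlike the original, it rejects inputs
// that the slice expressions could not handle safely.
func splitFunc(str string) (name string, rawArgs string, err error) {
	idx := strings.Index(str, "#")
	if idx < 2 || str[idx-1] != '(' || !strings.HasSuffix(str, ")") {
		return "", "", fmt.Errorf("malformed func: %q", str)
	}
	return str[:idx-1], str[idx+1 : len(str)-1], nil
}

// parseInts is a stand-in for atois (assumed behavior): it converts
// a comma-separated string such as "3" or "2,3" into a slice of ints.
func parseInts(s string) ([]int, error) {
	parts := strings.Split(s, ",")
	out := make([]int, 0, len(parts))
	for _, p := range parts {
		n, err := strconv.Atoi(strings.TrimSpace(p))
		if err != nil {
			return nil, err
		}
		out = append(out, n)
	}
	return out, nil
}

func main() {
	name, raw, err := splitFunc("avg(#10)")
	if err != nil {
		fmt.Println(err)
		return
	}
	args, err := parseInts(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(name, args) // prints: avg [10]
}
```

Note that the original code slices with idx-1 without first checking that '#' was found, so the guard in splitFunc is a defensive choice for this sketch rather than a description of what judge does.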

The returned Function is an interface type; AllFunction, AvgFunction and the other function types all implement it:

// modules/judge/store/func.go
type Function interface {
    Compute(L *SafeLinkedList) (vs []*model.HistoryData, leftValue float64, isTriggered bool, isEnough bool)
}
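The benefit of the interface is that the caller never needs to know which function type it holds. A simplified sketch of that pattern follows; the rule interface, the sumRule type, and the use of a plain []float64 in place of *SafeLinkedList are all assumptions made to keep the example self-contained, and the comparison is fixed to ">" for brevity.

```go
package main

import "fmt"

// rule is a simplified stand-in for Function: it takes a plain slice
// of recent values instead of *SafeLinkedList.
type rule interface {
	compute(points []float64) (leftValue float64, isTriggered bool, isEnough bool)
}

// sumRule (hypothetical) triggers when the sum of the first `limit`
// points is greater than `threshold`.
type sumRule struct {
	limit     int
	threshold float64
}

func (r sumRule) compute(points []float64) (leftValue float64, isTriggered bool, isEnough bool) {
	// Not enough data: report isEnough=false and let the caller skip judgment.
	if r.limit <= 0 || len(points) < r.limit {
		return 0, false, false
	}
	for _, v := range points[:r.limit] {
		leftValue += v
	}
	return leftValue, leftValue > r.threshold, true
}

func main() {
	var r rule = sumRule{limit: 3, threshold: 100}

	_, _, enough := r.compute([]float64{50, 40})
	fmt.Println(enough) // false: only 2 of the 3 required points

	left, triggered, enough := r.compute([]float64{50, 40, 30})
	fmt.Println(left, triggered, enough) // 120 true true
}
```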

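3. How fn is computed

Each fn's computation lives in the Compute() method it implements.

AllFunction requires every one of the most recent points to satisfy the condition: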

func (this AllFunction) Compute(L *SafeLinkedList) (vs []*model.HistoryData, leftValue float64, isTriggered bool, isEnough bool) {
    vs, isEnough = L.HistoryData(this.Limit)
    if !isEnough {
        return
    }
    isTriggered = true
    for i := 0; i < this.Limit; i++ {
        isTriggered = checkIsTriggered(vs[i].Value, this.Operator, this.RightValue)
        if !isTriggered {
            break
        }
    }

    leftValue = vs[0].Value
    return
}

checkIsTriggered() is just a simple numeric comparison:

// modules/judge/store/func.go
func checkIsTriggered(leftValue float64, operator string, rightValue float64) (isTriggered bool) {
    switch operator {
    case "=", "==":
        isTriggered = math.Abs(leftValue-rightValue) < 0.0001
    case "!=":
        isTriggered = math.Abs(leftValue-rightValue) > 0.0001
    case "<":
        isTriggered = leftValue < rightValue
    case "<=":
        isTriggered = leftValue <= rightValue
    case ">":
        isTriggered = leftValue > rightValue
    case ">=":
        isTriggered = leftValue >= rightValue
    }

    return
}
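The "=" and "!=" branches compare against a tolerance of 0.0001 rather than using == directly. A short sketch of why that matters for float64 values; approxEqual and epsilon are names chosen for this illustration only.

```go
package main

import (
	"fmt"
	"math"
)

// epsilon matches the tolerance used by checkIsTriggered above.
const epsilon = 0.0001

// approxEqual treats two values as equal when they differ by less than epsilon.
func approxEqual(a, b float64) bool {
	return math.Abs(a-b) < epsilon
}

func main() {
	a, b := 0.1, 0.2
	sum := a + b // float64 arithmetic, not exact decimal arithmetic

	fmt.Println(sum == 0.3)            // false: rounding error in the sum
	fmt.Println(approxEqual(sum, 0.3)) // true: within the tolerance
}
```

Two details follow directly from the code of checkIsTriggered: a difference of exactly 0.0001 satisfies neither "=" nor "!=", and an operator outside the listed cases leaves isTriggered at its zero value, false.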

MaxFunction requires the maximum of the most recent N points to satisfy the threshold:

func (this MaxFunction) Compute(L *SafeLinkedList) (vs []*model.HistoryData, leftValue float64, isTriggered bool, isEnough bool) {
    vs, isEnough = L.HistoryData(this.Limit)
    if !isEnough {
        return
    }
    // First compute the maximum
    max := vs[0].Value
    for i := 1; i < this.Limit; i++ {
        if max < vs[i].Value {
            max = vs[i].Value
        }
    }

    leftValue = max
    // Check whether the maximum satisfies the threshold condition
    isTriggered = checkIsTriggered(leftValue, this.Operator, this.RightValue)
    return
}
