Prometheus: how Prometheus handles metric timestamps


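A sample stored by Prometheus can take its timestamp from one of two sources:

  • the time at which Prometheus performs the scrape, i.e. the server-side time: time.Now();
  • the timestamp that an exporter's /metrics endpoint may return in addition to the metric and its value, for example: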
# HELP container_cpu_user_seconds_total Cumulative user cpu time consumed in seconds.
# TYPE container_cpu_user_seconds_total counter
container_cpu_user_seconds_total{container="",id="/",image="",name="",namespace="",pod=""} 788250.59 1692241058502
container_cpu_user_seconds_total{container="",id="/kubepods.slice",image="",name="",namespace="",pod=""} 378238.54 1692241058529

I. Prometheus configuration

Prometheus decides which of the two timestamps above to use based on the following configuration.

1. Configuration

# honor_timestamps controls whether Prometheus respects the timestamps present
# in scraped data.
#
# If honor_timestamps is set to "true", the timestamps of the metrics exposed
# by the target will be used.
#
# If honor_timestamps is set to "false", the timestamps of the metrics exposed
# by the target will be ignored.
[honor_timestamps: <boolean> | default = true]

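The honor_timestamps setting determines which timestamp is stored for scraped samples:

  • honor_timestamps defaults to true;
  • when honor_timestamps=true:

    • the timestamp returned by the exporter in its /metrics response is used;
    • if the exporter returns no timestamp, the time at which Prometheus performed the scrape (the server-side time) is used;
  • when honor_timestamps=false:

    • the time at which Prometheus performed the scrape (the server-side time) is always used;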

2. Source code

// scrape/scrape.go
func (sl *scrapeLoop) append(app storage.Appender, b []byte, contentType string, ts time.Time) (total, added, seriesAdded int, err error) {
    var (
        p       = textparse.New(b, contentType)
        defTime = timestamp.FromTime(ts) // ts is the scrape time, i.e. time.Now() on the Prometheus server
    )
    ...
    for {
        var (
            et          textparse.Entry
            sampleAdded bool
        )
        if et, err = p.Next(); err != nil {
            if err == io.EOF {
                err = nil
            }
            break
        }
        ...
        t := defTime             // default: the scrape time
        met, tp, v := p.Series() // parsed from the scraped line: met=metric, tp=timestamp, v=value
        if !sl.honorTimestamps { // honor_timestamps=false: discard the exposed timestamp so the scrape time is used
            tp = nil
        }
        if tp != nil { // a timestamp was parsed: use it; otherwise keep the scrape time
            t = *tp
        }
        ...
        err = app.AddFast(ce.ref, t, v) // write t, v and the series to the TSDB
        ...
    }
    ...
}

II. Timestamps in exporters

1. Without a timestamp

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000112409
go_gc_duration_seconds{quantile="0.25"} 0.000435099
go_gc_duration_seconds{quantile="0.5"} 0.000530901
go_gc_duration_seconds{quantile="0.75"} 0.000681327
go_gc_duration_seconds{quantile="1"} 0.00163155
go_gc_duration_seconds_sum 11.546457813
go_gc_duration_seconds_count 2331

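When developing an exporter with client_golang, note that a Gauge's SetToCurrentTime() sets the gauge's value to the current Unix time in seconds; it does not attach a timestamp to the sample. To make /metrics return a timestamp, wrap the metric with prometheus.NewMetricWithTimestamp(), as cAdvisor does in the next section.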

2. With a timestamp (e.g. cAdvisor)

cAdvisor's /metrics endpoint returns a timestamp immediately after the value:

# HELP container_cpu_user_seconds_total Cumulative user cpu time consumed in seconds.
# TYPE container_cpu_user_seconds_total counter
container_cpu_user_seconds_total{container="",id="/",image="",name="",namespace="",pod=""} 788250.59 1692241058502
container_cpu_user_seconds_total{container="",id="/kubepods.slice",image="",name="",namespace="",pod=""} 378238.54 1692241058529
container_cpu_user_seconds_total{container="",id="/kubepods.slice/kubepods-besteffort.slice",image="",name="",namespace="",pod=""} 17053.5 1692241041865
container_cpu_user_seconds_total{container="",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod6cc5ddd5_aa45_4889_9106_20fcae0951e8.slice",image="",name="",namespace="kube-system",pod="kube-proxy-f6pjd"} 9980.23 1692241057892
container_cpu_user_seconds_total{container="",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-poddabdd3fc_2893_4a90_99b8_464462a5ab6a.slice",image="",name="",namespace="kube-system",pod="calico-kube-controllers-75ddb95444-gv7j7"} 7073.3 1692241058973

How cAdvisor sets the timestamp when it constructs its Prometheus metrics:

// cadvisor/metrics/prometheus.go

{
   name:      "container_cpu_user_seconds_total",
   help:      "Cumulative user cpu time consumed in seconds.",
   valueType: prometheus.CounterValue,
   getValues: func(s *info.ContainerStats) metricValues {
      return metricValues{
         {
            value:     float64(s.Cpu.Usage.User) / float64(time.Second),
            timestamp: s.Timestamp,
         },
      }
   },
}

As shown, cAdvisor records the timestamp at collection time and passes that same timestamp on to Prometheus:

for _, metricValue := range cm.getValues(stats) {
   ch <- prometheus.NewMetricWithTimestamp(
      metricValue.timestamp,
      prometheus.MustNewConstMetric(desc, cm.valueType, float64(metricValue.value), append(values, metricValue.labels...)...),
   )
}

References:

1. https://prometheus.io/docs/prometheus/latest/configuration/co…
2. https://mp.weixin.qq.com/s/kxHgNN_d83nT2LTNQyj6tg
