About prometheus: an analysis of the preferLocalStorage logic in Prometheus remote read

Once remote-read is configured, Prometheus can read from remote TSDB storage:

remote_read:
  - url: "http://storage01:9090/api/v1/read"
    read_recent: true
  - url: "http://storage02:9090/api/v1/read"
    read_recent: true

When Prometheus executes a query, how does it choose between the local TSDB and the remote TSDB?

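Conclusion:

First, check the remote-read read_recent setting (readRecent in the code), which defaults to false;
If readRecent=true, the local TSDB and the remote TSDB are both queried, and the merged result is returned to the client;
If readRecent=false, then:
- If Prometheus already has blocks, a query whose start time is later than localStartTime (the first block's minTime + 4 hours) reads only the local TSDB and does not query the remote TSDB;
- Otherwise, both the local TSDB and the remote TSDB are queried (the remote query is limited to data no newer than localStartTime), and the merged result is returned to the client;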

In other words, once Prometheus has been running for a while, it reduces API query latency by serving whatever time range the local data can cover from the local TSDB.

For each remote-read configuration, one SampleAndChunkQueryableClient is created, containing one HTTP client:

// storage/remote/storage.go
func (s *Storage) ApplyConfig(conf *config.Config) error {
    ...
    readHashes := make(map[string]struct{})
    queryables := make([]storage.SampleAndChunkQueryable, 0, len(conf.RemoteReadConfigs))
    for _, rrConf := range conf.RemoteReadConfigs {
        hash, err := toHash(rrConf)
        ...
        readHashes[hash] = struct{}{}
        ...
        name := hash[:6]
        // HTTP client for this remote-read endpoint
        c, err := newReadClient(name, &ClientConfig{
            URL:              rrConf.URL,
            Timeout:          rrConf.RemoteTimeout,
            HTTPClientConfig: rrConf.HTTPClientConfig,
        })
        ...
        queryables = append(queryables, NewSampleAndChunkQueryableClient(
            c,
            conf.GlobalConfig.ExternalLabels,
            labelsToEqualityMatchers(rrConf.RequiredMatchers),
            rrConf.ReadRecent,                // the readRecent setting
            s.localStartTimeCallback,
        ))
    }
    s.queryables = queryables
    return nil
}

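Remote queries are performed through the object returned by sampleAndChunkQueryableClient.Querier();

When readRecent=true, the optimization is skipped and the remote TSDB is queried directly;
When readRecent=false, whether the remote TSDB is queried is determined by the return value of c.preferLocalStorage():
- If the returned noop=true, the remote querier is storage.NoopQuerier(), i.e. the remote TSDB is not queried;
- Otherwise, the remote TSDB query is executed;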

// storage/remote/read.go
func (c *sampleAndChunkQueryableClient) Querier(ctx context.Context, mint, maxt int64) (storage.Querier, error) {
    q := &querier{
        ctx:              ctx,
        mint:             mint,
        maxt:             maxt,
        client:           c.client,
        externalLabels:   c.externalLabels,
        requiredMatchers: c.requiredMatchers,
    }
    // When readRecent=true, skip the optimization and query the remote TSDB directly.
    if c.readRecent {
        return q, nil
    }
    var (
        noop bool
        err  error
    )
    q.maxt, noop, err = c.preferLocalStorage(mint, maxt)
    if err != nil {
        return nil, err
    }
    // If noop=true, the remote TSDB is not queried.
    if noop {
        return storage.NoopQuerier(), nil
    }
    return q, nil
}

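The implementation of c.preferLocalStorage():

mint and maxt are the time bounds passed in by the query request;
For the query time range mint~maxt:
- If the range lies entirely after localStartTime (mint > localStartTime), the remote TSDB is not queried; the local TSDB alone is enough;
- Otherwise, both the local TSDB and the remote TSDB are queried and the results are merged; if maxt > localStartTime, the remote query's maxt is clamped to localStartTime;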

// storage/remote/read.go
func (c *sampleAndChunkQueryableClient) preferLocalStorage(mint, maxt int64) (cmaxt int64, noop bool, err error) {
    localStartTime, err := c.callback()
    if err != nil {
        return 0, false, err
    }
    cmaxt = maxt

    // Avoid queries whose time range is later than the first timestamp in local DB.
    if mint > localStartTime {
        return 0, true, nil
    }
    // Query only samples older than the first timestamp in local DB.
    if maxt > localStartTime {
        cmaxt = localStartTime
    }
    return cmaxt, false, nil
}
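
The combined effect of Querier() and preferLocalStorage() can be modeled in a small standalone program. This is only a sketch under stated assumptions: remoteRange and the timestamps are hypothetical names and values chosen for illustration, not Prometheus code, and error handling from the callback is left out.

```go
package main

import "fmt"

// remoteRange is a hypothetical helper that models the decision described
// above. It returns the time range that would be sent to the remote
// endpoint, or skip=true when the remote endpoint is not queried at all.
func remoteRange(readRecent bool, mint, maxt, localStartTime int64) (rmint, rmaxt int64, skip bool) {
	if readRecent {
		// read_recent: true bypasses the optimization entirely.
		return mint, maxt, false
	}
	if mint > localStartTime {
		// The whole range is newer than localStartTime: local TSDB only.
		return 0, 0, true
	}
	if maxt > localStartTime {
		// Only the part up to localStartTime goes to the remote endpoint.
		maxt = localStartTime
	}
	return mint, maxt, false
}

func main() {
	const localStartTime = 1000 // hypothetical value, in milliseconds

	cases := []struct {
		name       string
		readRecent bool
		mint, maxt int64
	}{
		{"entirely after localStartTime", false, 1500, 2000},
		{"straddles localStartTime", false, 500, 2000},
		{"entirely before localStartTime", false, 100, 900},
		{"read_recent enabled", true, 1500, 2000},
	}
	for _, c := range cases {
		rmint, rmaxt, skip := remoteRange(c.readRecent, c.mint, c.maxt, localStartTime)
		fmt.Printf("%s: skip=%v remote=[%d, %d]\n", c.name, skip, rmint, rmaxt)
	}
}
```

Only the first case skips the remote endpoint; the straddling case still queries it, but with maxt cut back to localStartTime.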

How localStartTime is computed:

If blocks exist, localStartTime = block0.minTime + 4 hours;
- 4 hours = startTimeMargin
Otherwise, localStartTime = time.Now() + 4 hours:

// cmd/prometheus/main.go
func (s *readyStorage) StartTime() (int64, error) {
    if x := s.get(); x != nil {
        var startTime int64
        if len(x.Blocks()) > 0 {
            startTime = x.Blocks()[0].Meta().MinTime
        } else {
            startTime = time.Now().Unix() * 1000
        }
        // Add a safety margin as it may take a few minutes for everything to spin up.
        // s.startTimeMargin = 4 hours
        return startTime + s.startTimeMargin, nil
    }
    return math.MaxInt64, tsdb.ErrNotReady
}

How startTimeMargin is computed (4 hours when MinBlockDuration is 2 hours):

// cfg.tsdb.MinBlockDuration = 2hour
startTimeMargin := int64(2 * time.Duration(cfg.tsdb.MinBlockDuration).Seconds() * 1000)

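In summary, the logic of c.preferLocalStorage() is as shown in the figure, where:

mint/maxt are the start and end times passed in by the query;
localStartTime is the time computed in the code above;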

Posted in: prometheus

2023-01-29