Once remote-read is configured, Prometheus can read from remote TSDB storage:
    remote_read:
      - url: "http://storage01:9090/api/v1/read"
        read_recent: true
      - url: "http://storage02:9090/api/v1/read"
        read_recent: true
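When Prometheus executes a query, how does it choose between the local TSDB and the remote TSDB?
Conclusion:
- First, Prometheus checks the remote-read read_recent setting (readRecent in the code), which defaults to false.
- If readRecent=true, the local TSDB and the remote TSDB are both queried, and the merged result is returned to the client.
- If readRecent=false:
  - If Prometheus has already produced blocks, a query whose start time is later than the first block's minTime + 4 hours is served from the local TSDB only; the remote TSDB is not queried.
  - In all other cases, both the local TSDB and the remote TSDB are queried, and the merged result is returned to the client.
In other words, once Prometheus has been running for a while, it applies an optimization to reduce API query latency: data that the local TSDB can cover is read from the local TSDB whenever possible.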
I. Loading the remote read configuration
For each remote-read entry in the configuration, one SampleAndChunkQueryableClient is created, and it holds one HTTP client:
    // storage/remote/storage.go
    func (s *Storage) ApplyConfig(conf *config.Config) error {
        ...
        readHashes := make(map[string]struct{})
        queryables := make([]storage.SampleAndChunkQueryable, 0, len(conf.RemoteReadConfigs))
        for _, rrConf := range conf.RemoteReadConfigs {
            hash, err := toHash(rrConf)
            ...
            readHashes[hash] = struct{}{}
            ...
            name := hash[:6]
            // HTTP client used for this remote-read endpoint
            c, err := newReadClient(name, &ClientConfig{
                URL:              rrConf.URL,
                Timeout:          rrConf.RemoteTimeout,
                HTTPClientConfig: rrConf.HTTPClientConfig,
            })
            ...
            queryables = append(queryables, NewSampleAndChunkQueryableClient(
                c,
                conf.GlobalConfig.ExternalLabels,
                labelsToEqualityMatchers(rrConf.RequiredMatchers),
                rrConf.ReadRecent, // the readRecent parameter
                s.localStartTimeCallback,
            ))
        }
        s.queryables = queryables
        return nil
    }
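II. The preferLocalStorage logic
Remote queries go through the object returned by sampleAndChunkQueryableClient.Querier():
- When readRecent=true, the optimization is skipped and the remote TSDB is queried directly.
- When readRecent=false, whether the remote TSDB is queried depends on the return value of c.preferLocalStorage():
  - If the returned noop is true, the remote querier is storage.NoopQuerier(), so the remote TSDB is not queried.
  - Otherwise, the query against the remote TSDB is executed.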
    // storage/remote/read.go
    func (c *sampleAndChunkQueryableClient) Querier(ctx context.Context, mint, maxt int64) (storage.Querier, error) {
        q := &querier{
            ctx:              ctx,
            mint:             mint,
            maxt:             maxt,
            client:           c.client,
            externalLabels:   c.externalLabels,
            requiredMatchers: c.requiredMatchers,
        }
        // When readRecent=true, skip the optimization and query the remote TSDB directly.
        if c.readRecent {
            return q, nil
        }
        var (
            noop bool
            err  error
        )
        q.maxt, noop, err = c.preferLocalStorage(mint, maxt)
        if err != nil {
            return nil, err
        }
        // If noop=true, the remote TSDB is not queried.
        if noop {
            return storage.NoopQuerier(), nil
        }
        return q, nil
    }
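The implementation of c.preferLocalStorage():
- mint and maxt are the time bounds passed in by the query request.
For the query time range mint~maxt:
- If the whole range lies after localStartTime (mint > localStartTime), the remote TSDB does not need to be queried; only the local TSDB is queried.
- Otherwise, both the local TSDB and the remote TSDB are queried and the results are merged; when maxt > localStartTime, the upper bound of the remote query is capped at localStartTime.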
    // storage/remote/read.go
    func (c *sampleAndChunkQueryableClient) preferLocalStorage(mint, maxt int64) (cmaxt int64, noop bool, err error) {
        localStartTime, err := c.callback()
        if err != nil {
            return 0, false, err
        }
        cmaxt = maxt
        // Avoid queries whose time range is later than the first timestamp in local DB.
        if mint > localStartTime {
            return 0, true, nil
        }
        // Query only samples older than the first timestamp in local DB.
        if maxt > localStartTime {
            cmaxt = localStartTime
        }
        return cmaxt, false, nil
    }
How localStartTime is computed:
- If blocks exist, localStartTime = block0.minTime + 4 hours;
  - 4 hours = startTimeMargin
- Otherwise, localStartTime = time.Now() + 4 hours:
    // cmd/prometheus/main.go
    func (s *readyStorage) StartTime() (int64, error) {
        if x := s.get(); x != nil {
            var startTime int64
            if len(x.Blocks()) > 0 {
                startTime = x.Blocks()[0].Meta().MinTime
            } else {
                startTime = time.Now().Unix() * 1000
            }
            // Add a safety margin as it may take a few minutes for everything to spin up.
            // s.startTimeMargin = 4 hours
            return startTime + s.startTimeMargin, nil
        }
        return math.MaxInt64, tsdb.ErrNotReady
    }
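The branch in StartTime() can be tried out in isolation with a small, self-contained sketch. It is not the Prometheus code: localStartTime, blockMinTimes, nowMs, and marginMs are hypothetical names that stand in for x.Blocks(), time.Now(), and s.startTimeMargin, and all values are in milliseconds.

```go
package main

import (
	"fmt"
	"time"
)

// localStartTime is a hypothetical pure-function version of the logic in StartTime().
// blockMinTimes holds the MinTime (ms) of each persisted block, oldest first;
// nowMs is the current time in ms; marginMs plays the role of startTimeMargin.
func localStartTime(blockMinTimes []int64, nowMs, marginMs int64) int64 {
	start := nowMs // no blocks yet: fall back to the current time
	if len(blockMinTimes) > 0 {
		start = blockMinTimes[0] // oldest block's MinTime
	}
	return start + marginMs
}

func main() {
	const marginMs = int64(4 * time.Hour / time.Millisecond) // 14400000 ms
	nowMs := int64(1_700_000_000_000)                        // arbitrary fixed "now" for the example

	// With blocks: the oldest block's MinTime drives the result.
	fmt.Println(localStartTime([]int64{nowMs - 86_400_000}, nowMs, marginMs))

	// Without blocks: the result lies 4 hours in the future, so mint > localStartTime
	// cannot hold for ordinary queries and the remote TSDB keeps being queried.
	fmt.Println(localStartTime(nil, nowMs, marginMs))
}
```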
How startTimeMargin is computed (4 hours, i.e. twice MinBlockDuration):
    // cfg.tsdb.MinBlockDuration = 2 hours
    startTimeMargin := int64(2 * time.Duration(cfg.tsdb.MinBlockDuration).Seconds() * 1000)
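As a quick check of the arithmetic, the expression above can be evaluated with the 2-hour value substituted in. The names startTimeMarginMillis and defaultMinBlockDuration are made up for this sketch, and the function takes a plain time.Duration instead of the config type used in Prometheus.

```go
package main

import (
	"fmt"
	"time"
)

// defaultMinBlockDuration is the 2h value mentioned in the comment above
// (hypothetical constant, for illustration only).
const defaultMinBlockDuration = 2 * time.Hour

// startTimeMarginMillis repeats the expression from the snippet above:
// twice the minimum block duration, expressed in milliseconds.
func startTimeMarginMillis(minBlockDuration time.Duration) int64 {
	return int64(2 * minBlockDuration.Seconds() * 1000)
}

func main() {
	m := startTimeMarginMillis(defaultMinBlockDuration)
	fmt.Println(m)                                   // 14400000
	fmt.Println(time.Duration(m) * time.Millisecond) // 4h0m0s
}
```

2 × 7200 s × 1000 = 14,400,000 ms, which is the 4-hour margin used throughout this article.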
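In summary, the logic of c.preferLocalStorage() is as shown in the figure:
- mint/maxt are the start and end times passed in with the query;
- localStartTime is the time computed in the code;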
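The three possible positions of mint/maxt relative to localStartTime can also be walked through with a minimal sketch. remoteRange is a hypothetical pure-function restatement of preferLocalStorage (without the callback and error handling), and the timestamps are small made-up numbers chosen for readability.

```go
package main

import "fmt"

// remoteRange is a hypothetical restatement of the preferLocalStorage decision.
// It returns the upper bound to use for the remote query and whether the
// remote query can be skipped entirely (noop).
func remoteRange(mint, maxt, localStartTime int64) (cmaxt int64, noop bool) {
	if mint > localStartTime {
		return 0, true // whole range is covered locally
	}
	if maxt > localStartTime {
		return localStartTime, false // remote only serves the older part
	}
	return maxt, false // whole range is older than localStartTime
}

func main() {
	const localStartTime = 1000
	cases := []struct {
		name       string
		mint, maxt int64
	}{
		{"range after localStartTime", 1500, 2000},
		{"range straddling localStartTime", 500, 2000},
		{"range before localStartTime", 100, 800},
	}
	for _, c := range cases {
		cmaxt, noop := remoteRange(c.mint, c.maxt, localStartTime)
		if noop {
			fmt.Printf("%s: remote skipped, local TSDB only\n", c.name)
			continue
		}
		fmt.Printf("%s: remote queried for [%d, %d]\n", c.name, c.mint, cmaxt)
	}
}
```

In the straddling case the remote query is capped at localStartTime, while the local TSDB is still queried and the two results are merged.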