Writing data
When data is written, the points are first partitioned by shard, and the points that belong to the same shard are written together:
//tsdb/store.go
// WriteToShard writes a list of points to a shard identified by its ID.
func (s *Store) WriteToShard(shardID uint64, points []models.Point) error {
	sh := s.shards[shardID]
	return sh.WritePoints(points)
}

//tsdb/shard.go
// WritePoints will write the raw data points and any new metadata to the index in the shard.
func (s *Shard) WritePoints(points []models.Point) error {
	.....
	// Write to the engine.
	err := engine.WritePoints(points)
	.....
}
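To make the "partition by shard first" step concrete, here is a minimal sketch of grouping points before handing each group to its shard. It assumes each shard owns a half-open time range, which is consistent with the later remark that a time predicate selects the shard. The types `point` and `shardRange` and the function `groupByShard` are hypothetical and are not part of the influxdb code base.

```go
package main

import "fmt"

// point is a simplified, hypothetical stand-in for models.Point.
type point struct {
	seriesKey string
	unixNano  int64
}

// shardRange is a hypothetical shard descriptor: the shard owns [start, end).
type shardRange struct {
	id         uint64
	start, end int64
}

// groupByShard buckets points by the ID of the shard whose time range
// contains them. Points that fall into no shard are returned separately.
func groupByShard(shards []shardRange, points []point) (map[uint64][]point, []point) {
	grouped := make(map[uint64][]point)
	var unmatched []point
	for _, p := range points {
		found := false
		for _, s := range shards {
			if p.unixNano >= s.start && p.unixNano < s.end {
				grouped[s.id] = append(grouped[s.id], p)
				found = true
				break
			}
		}
		if !found {
			unmatched = append(unmatched, p)
		}
	}
	return grouped, unmatched
}

func main() {
	shards := []shardRange{{id: 1, start: 0, end: 100}, {id: 2, start: 100, end: 200}}
	points := []point{
		{"cpu_usage,host=server01", 10},
		{"cpu_usage,host=server02", 150},
		{"cpu_usage,host=server01", 20},
	}
	grouped, unmatched := groupByShard(shards, points)
	// Each group would then be passed to one WriteToShard-style call.
	for _, s := range shards {
		fmt.Printf("shard %d: %d points\n", s.id, len(grouped[s.id]))
	}
	fmt.Printf("unmatched: %d\n", len(unmatched))
}
```

Grouping up front means each shard receives one batch rather than one call per point.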
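tsm1.Engine is responsible for writing the points:
- First, build the data: from the points, construct values = map[string][]Value, where key = seriesKey + separator + fieldName and each value is a []Value whose elements are {timestamp, fieldValue};
- Then, write values to the cache;
- Finally, write values to the WAL;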
//tsdb/engine/tsm1/engine.go
// WritePoints writes metadata and point data into the engine.
// It returns an error if new points are added to an existing key.
func (e *Engine) WritePoints(points []models.Point) error {
	values := make(map[string][]Value, len(points))
	var (
		keyBuf    []byte
		baseLen   int
		seriesErr error
	)

	for _, p := range points {
		// Build the "seriesKey + separator" prefix once per point.
		keyBuf = append(keyBuf[:0], p.Key()...)
		keyBuf = append(keyBuf, keyFieldSeparator...)
		baseLen = len(keyBuf)

		// A single Point may contain several fields.
		iter := p.FieldIterator()
		t := p.Time().UnixNano()
		for iter.Next() {
			// Reuse the prefix and append the field name to form the full key.
			keyBuf = append(keyBuf[:baseLen], iter.FieldKey()...)

			var v Value
			switch iter.Type() {
			case models.Float:
				fv, err := iter.FloatValue()
				if err != nil {
					return err
				}
				v = NewFloatValue(t, fv)
			......
			}
			values[string(keyBuf)] = append(values[string(keyBuf)], v)
		}
	}

	// First write to the cache.
	if err := e.Cache.WriteMulti(values); err != nil {
		return err
	}

	// Then write to the WAL.
	if e.WALEnabled {
		if _, err := e.WAL.WriteMulti(values); err != nil {
			return err
		}
	}
	return seriesErr
}
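The key construction is the part that is easiest to miss, so here is a small standalone sketch of it. It only illustrates how one point with several fields fans out into several map entries. The `point` and `value` types are simplified stand-ins, only float fields are handled, and the literal used for `keyFieldSeparator` is an assumption made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// keyFieldSeparator plays the role of the separator in the engine code;
// the literal value here is an assumption for illustration.
const keyFieldSeparator = "#!~#"

// value is a simplified {timestamp, fieldValue} pair (float fields only).
type value struct {
	unixNano int64
	v        float64
}

// point is a simplified, hypothetical stand-in for models.Point.
type point struct {
	seriesKey string
	unixNano  int64
	fields    map[string]float64
}

// buildValues groups field values under "seriesKey + separator + fieldName".
func buildValues(points []point) map[string][]value {
	values := make(map[string][]value, len(points))
	for _, p := range points {
		for name, fv := range p.fields {
			key := p.seriesKey + keyFieldSeparator + name
			values[key] = append(values[key], value{p.unixNano, fv})
		}
	}
	return values
}

func main() {
	points := []point{
		{seriesKey: "cpu_usage,host=server01", unixNano: 1, fields: map[string]float64{"value": 0.5, "idle": 0.2}},
		{seriesKey: "cpu_usage,host=server01", unixNano: 2, fields: map[string]float64{"value": 0.7}},
	}
	values := buildValues(points)

	// Sort the keys so the printed order is stable.
	keys := make([]string, 0, len(values))
	for k := range values {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s -> %d value(s)\n", k, len(values[k]))
	}
}
```

Two points of the same series therefore produce one entry per field, not one entry per point.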
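Deleting data
Like an LSM-Tree, InfluxDB deletes by marking: the data is only physically removed later, when compaction runs.
In the data directory there are .tombstone files, which record the time ranges whose data must be deleted:
- At query time, the query result is compared against the .tombstone contents and the records marked for deletion are dropped;
- At compaction time, the .tombstone contents are consulted and the data is actually deleted;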
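The query-time half of this can be sketched as a simple filter. The code below assumes a tombstone is a key plus an inclusive time range; the on-disk .tombstone format is not described here, so `tombstone`, `sample`, and `filterDeleted` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// tombstone marks samples of one key inside [min, max] as deleted.
// This in-memory shape is an assumption, not the real file format.
type tombstone struct {
	key      string
	min, max int64
}

// sample is a simplified {timestamp, value} pair.
type sample struct {
	unixNano int64
	v        float64
}

// filterDeleted returns the samples of key that no tombstone covers.
func filterDeleted(key string, samples []sample, tombstones []tombstone) []sample {
	kept := make([]sample, 0, len(samples))
	for _, s := range samples {
		deleted := false
		for _, t := range tombstones {
			if t.key == key && s.unixNano >= t.min && s.unixNano <= t.max {
				deleted = true
				break
			}
		}
		if !deleted {
			kept = append(kept, s)
		}
	}
	return kept
}

func main() {
	samples := []sample{{10, 1.0}, {20, 2.0}, {30, 3.0}}
	tombstones := []tombstone{{key: "cpu", min: 15, max: 25}}
	for _, s := range filterDeleted("cpu", samples, tombstones) {
		fmt.Println(s.unixNano, s.v)
	}
}
```

Compaction can apply the same predicate while rewriting files, after which the tombstone itself is no longer needed.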
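Data queries and index structure
An LSM-Tree has good write performance but weak query performance. The TSM-Tree builds on the LSM-Tree and optimizes queries with indexes and Bloom filters; this section focuses on the indexes.
InfluxDB has two kinds of index: the metadata index and the TSM File index.
Metadata index
Metadata means the measurement and series information. Each database has an Index structure that stores the metadata index for that database: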
//tsdb/store.go
type Store struct {
	path string

	// shared per-database indexes, only if using "inmem".
	indexes map[string]interface{} // key = databaseName; the value is actually an *Index
	....
}
The internal structure of the metadata index:
type Index struct {
	// name --> *measurement for this database
	measurements map[string]*measurement // measurement name to object and index
	// seriesKey --> *series for this database
	series map[string]*series // map series key to the Series object
	// database name
	database string
}

type measurement struct {
	Database string
	Name     string `json:"name,omitempty"`

	fieldNames map[string]struct{}

	// in-memory index fields
	// seriesId --> *series
	seriesByID map[uint64]*series // lookup table for series by their id
	// tagKey --> tagValue --> []seriesId
	// At query time, the tagKey leads to the seriesIds, and from there to the matching series.
	seriesByTagKeyValue map[string]*tagKeyValue // map from tag key to value to sorted set of series ids

	sortedSeriesIDs seriesIDs // sorted list of series IDs in this measurement
}

type tagKeyValue struct {
	mu      sync.RWMutex
	entries map[string]*tagKeyValueEntry
}

type tagKeyValueEntry struct {
	m map[uint64]struct{} // series id set
}
For a metadata query statement:
show tag values from "cpu_usage" with key="host"
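This statement is evaluated as follows:
- Use "cpu_usage" to find the measurement object;
- Inside the measurement object, use tagKey="host" to find the corresponding tagValue + []seriesId entries;
For an ordinary query statement: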
select value from "cpu_usage" where host='server01' and time > now() - 1h
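This statement is evaluated as follows:
- Use the time condition, time > now() - 1h, to find the data shard;
- Inside the shard, use "cpu_usage" to find the measurement object;
- Inside the measurement object, use tagKey="host" and tagValue="server01" to find the corresponding []seriesId;
- Iterate over []seriesId to get the []series objects, then use the TSM File index to locate the TSM File and read the TSM File blocks to get the result;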
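Both lookups described above boil down to walking nested maps. The sketch below mirrors the tagKey --> tagValue --> series ID set layout with plain maps; it leaves out locking, shard selection, and the series objects, and the method names `tagValues` and `seriesIDs` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// seriesIDSet is a set of series IDs.
type seriesIDSet map[uint64]struct{}

// measurement keeps tagKey -> tagValue -> set of series IDs (simplified).
type measurement struct {
	seriesByTagKeyValue map[string]map[string]seriesIDSet
}

// index keeps measurement name -> measurement (simplified).
type index struct {
	measurements map[string]*measurement
}

// tagValues answers a "show tag values ... with key=..." style lookup.
func (idx *index) tagValues(name, tagKey string) []string {
	m, ok := idx.measurements[name]
	if !ok {
		return nil
	}
	vals := make([]string, 0, len(m.seriesByTagKeyValue[tagKey]))
	for v := range m.seriesByTagKeyValue[tagKey] {
		vals = append(vals, v)
	}
	sort.Strings(vals)
	return vals
}

// seriesIDs resolves a "where tagKey = tagValue" predicate to sorted series IDs.
func (idx *index) seriesIDs(name, tagKey, tagValue string) []uint64 {
	m, ok := idx.measurements[name]
	if !ok {
		return nil
	}
	set := m.seriesByTagKeyValue[tagKey][tagValue]
	ids := make([]uint64, 0, len(set))
	for id := range set {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	idx := &index{measurements: map[string]*measurement{
		"cpu_usage": {seriesByTagKeyValue: map[string]map[string]seriesIDSet{
			"host": {
				"server01": {1: {}, 3: {}},
				"server02": {2: {}},
			},
		}},
	}}
	fmt.Println(idx.tagValues("cpu_usage", "host"))
	fmt.Println(idx.seriesIDs("cpu_usage", "host", "server01"))
}
```

The series IDs returned by the second lookup are what the final step feeds into the TSM File index.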
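TSM File index
A single TSM File contains block data and index data:
The Blocks hold the compressed timestamp/value data.
The Index holds the index of the Blocks. The Index is loaded into memory as an indirect index so that lookups are fast.
The data structure of the indirect index: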
//tsdb/engine/tsm1/reader.go
type indirectIndex struct {
	b                []byte // the contents of the Index
	offsets          []byte
	minKey, maxKey   []byte // smallest / largest key
	minTime, maxTime int64  // smallest / largest time
}
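The TSM File lookup process:
- Using the seriesKey, binary-search []offset, comparing against the key stored in the Index at each offset, to obtain the matching offset;
- Read the []byte contents at that offset to obtain the indexEntries;
- From the indexEntries, obtain the offset within the TSM File, then read the file contents to get the result;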
//tsdb/engine/tsm1/reader.go
type indexEntries struct {
	Type    byte
	entries []IndexEntry
}

// IndexEntry is the index information for a given block in a TSM file.
type IndexEntry struct {
	// The min and max time of all points stored in the block.
	MinTime, MaxTime int64

	// The absolute position in the file where this block is located.
	Offset int64 // offset within the TSM file

	// The size in bytes of the block in the file.
	Size uint32
}
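The lookup steps can be sketched with an ordinary sorted slice in place of the packed byte layout. In the sketch below, `fileIndex` and `lookup` are hypothetical: keys are kept in sorted order so `sort.Search` can do the binary search, and the overlap test on minTime/maxTime is an assumed way of picking the blocks worth reading for a time range.

```go
package main

import (
	"fmt"
	"sort"
)

// indexEntry mirrors the fields of IndexEntry in simplified form.
type indexEntry struct {
	minTime, maxTime int64
	offset           int64
	size             uint32
}

// keyEntries pairs one key with the entries of its blocks.
type keyEntries struct {
	key     string
	entries []indexEntry
}

// fileIndex is a stand-in for the indirect index: it must be sorted by key.
type fileIndex []keyEntries

// lookup binary-searches for key, then keeps the entries whose time span
// overlaps [tmin, tmax]. It returns nil when the key is absent.
func (fi fileIndex) lookup(key string, tmin, tmax int64) []indexEntry {
	i := sort.Search(len(fi), func(i int) bool { return fi[i].key >= key })
	if i == len(fi) || fi[i].key != key {
		return nil
	}
	var out []indexEntry
	for _, e := range fi[i].entries {
		if e.maxTime >= tmin && e.minTime <= tmax {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	fi := fileIndex{
		{key: "cpu,host=a#value", entries: []indexEntry{
			{minTime: 0, maxTime: 99, offset: 5, size: 40},
			{minTime: 100, maxTime: 199, offset: 45, size: 38},
		}},
		{key: "cpu,host=b#value", entries: []indexEntry{
			{minTime: 0, maxTime: 199, offset: 83, size: 60},
		}},
	}
	for _, e := range fi.lookup("cpu,host=a#value", 120, 150) {
		// A reader would now read e.size bytes at e.offset and decode the block.
		fmt.Printf("read %d bytes at offset %d\n", e.size, e.offset)
	}
}
```

Only the index is searched in memory; the file itself is touched once per selected block.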
References
1.http://blog.fatedier.com/2016...