原本打算只用一篇文章来解说Redis中的list,在理论写作过程中发现Redis中有多种list的实现,所以筹备拆成多篇文章,本文次要讲ziplist,ziplist也是quicklist的根底。另外还有skiplist,skiplist尽管是list,当次要和set命令相干,所以会放到前面。
本文次要波及到的源码在ziplist.c

何为ziplist?咱们能够在ziplist.c源码头部找到一段Redis作者的一段介绍。

The ziplist is a specially encoded dually linked list that is designed to be very memory efficient. It stores both strings and integer values, where integers are encoded as actual integers instead of a series of characters. It allows push and pop operations on either side of the list in O(1) time.However, because every operation requires a reallocation of the memory used by the ziplist, the actual complexity is related to the amount of memory used by the ziplist.

ziplist是为了进步存储效率而设计的一种非凡编码的双向链表。它能够存储字符串或者整数,存储整数时是采纳整数的二进制而不是字符串模式存储。他能在O(1)的工夫复杂度下实现list两端的push和pop操作。然而因为每次操作都须要重新分配ziplist的内存,所以理论复杂度和ziplist的内存使用量相干。

前半句还好了解,但每次操作都须要从新分配内存…… 就有点回味无穷了。别急,你看完ziplist的具体实现就懂了。

ziplist在逻辑上是个双向链表,但它是存储在一大块间断的内存空间上的。与其说ziplist是个数据结构,倒不如说他是Redis中双向链表的序列化存储形式。

ziplist构造

整个ziplist在内存中的存储格局如下:

ziplist次要有这么几个局部:

  • zlbytes: 32位无符号整型,示意整个ziplist所占的空间大小,蕴含了zlbytes所占的4个字节。
  • 这个字段能够在重置整个ziplist大小时不须要遍历整个list来确定大小,空间换工夫。
  • zltail: 32位无符号整型,示意整个list中最初一项所在的偏移量,不便在尾部做pop操作。
  • zllen: 16位,示意ziplist中所存储的entry数量,然而留神,这里最多示意$2^{16} -2$个entry, 如果是$2^{16}-1$有非凡含意,$2^{16}-1$示意存储数量超过了$2^{16}-2$个,但具体是多少个得遍历一次能力晓得。
  • zlend: 8位,ziplist的开端示意,值固定是255.
  • entry: 不定长,可能有多个,list中具体的数据项,上面会具体介绍。

entry

这里最外围的就是entry的数据格式,entry还真有些简单,从上图中能够看出它次要有三个局部。

  • prelen: 前一个entry的存储大小,次要是为了不便从后往前遍历。
  • encoding: 数据的编码模式(字符串还是数字,长度是多少)
  • data: 理论存储的数据

比较复杂的是Redis为了节俭内存空间,对下面三个字段设计了一套比较复杂的编码方式,实质上就是一套变长的编码协定,具体规定如下:

prelen

如果prelen数值小于254,那就只用一个字节来示意长度,如果长度大于等于254就用5个字节,第一个字节是固定值254(FE)来标识这是个非凡的数据,剩下的4个字节来示意理论的长度。

encoding

encoding的具体值取决于entry中具体的内容,当entry是个string时,encoding的前两字节存储了字符串的长度。当entry是一个整数的时候,前两字节默认都是1,前面两字节标识出前面存的是哪种类型的整数,第一个字节就足够判断出entry是什么类型了。不同的encoding类型示例如下:

  • |00pppppp| - 1字节
长度小于或者等于63的String类型,'pppppp'无符号6位数标识string长度。
  • |01pppppp|qqqqqqqq| - 2字节
长度小于或者等于16383的String类型(14位),留神:14位'pppppp'采纳大端的形式存储
  • |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5字节
长度大于等于16384的String类型,第二字节开始的qqrrsstt都是用来存储字符串长度的二进制位,可示意的字符串长度最大2^32-1,第一字节的低6位没有用,所以都是0。
留神: 32位数采纳大端的形式存储
  • |11000000| - 3字节
存储int16_t (2字节).
  • |11010000| - 5字节
存储int32_t (4字节).
  • |11100000| - 9字节
存储int64_t (8字节).
  • |11110000| - 4字节
24位有符号类型整数 (3字节).
  • |11111110| - 2字节
8位有符号类型整数 (1字节).
  • |1111xxxx| - (xxxx在0001和1101之间) 4位无符号整数.
0到12的无符号整数.编码值实际上是从1到13,因为0000和1111不能应用,要留出一位示意0,所以应该从编码值中减去1才是精确值

在某些比拟小的数值下,具体值能够间接存储到encoding字段里。

ziplist的API

ziplist.c代码也较多,双链表操作很多代码在ziplist中比拟多,其实实质上都是它这简单的存储格局导致的,实际上了解了它的编码格局,具体代码不难理解。这里我只列出几个我认为比拟重要的API,其余能够参考源码ziplist.c。

ziplist其实只是一种双向队列的序列化形式,是在内存中的存储格局,实际上并不能间接拿过去用,用户看到的ziplist只是一个char *指针,其中每个entry在理论应用中还须要反序列化成zlentry不便调用。

typedef struct zlentry {    unsigned int prevrawlensize; /* 内存中编码后的prevrawlen用了多少字节 */    unsigned int prevrawlen;     /* 前一个entry占用的长度,次要是为了entry之间跳转 */    unsigned int lensize;        /* 内存中编码后的len用了多少字节 */    unsigned int len;            /* 以后entry的长度,如果是string则示意string的长度,如果是整数,则len依赖于具体数值大小。*/    unsigned int headersize;     /* prevrawlensize + lensize. entry的head局部用了多少字节 */    unsigned char encoding;      /* 以后entry的编码格局 */    unsigned char *p;            /* 指向数据域的指针 */} zlentry;

另外有一点,ziplist在内存中是高度紧凑的间断存储,这意味着它起始对批改并不敌对,如果要对ziplist做批改类的操作,那就需重新分配新的内存来存储新的ziplist,代价很大,具体插入和删除的代码如下。

/* 在p地位插入数据 *s. */unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;    unsigned int prevlensize, prevlen = 0;    size_t offset;    int nextdiff = 0;    unsigned char encoding = 0;    long long value = 123456789; /* initialized to avoid warning. Using a value                                    that is easy to see if for some reason                                    we use it uninitialized. */    zlentry tail;    /* 找到前一个节点计算出prevlensize和prevlen */    if (p[0] != ZIP_END) {        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);    } else {        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);        if (ptail[0] != ZIP_END) {            prevlen = zipRawEntryLength(ptail);        }    }    /* See if the entry can be encoded */    if (zipTryEncoding(s,slen,&value,&encoding)) {        /* 'encoding' is set to the appropriate integer encoding */        reqlen = zipIntSize(encoding);    } else {        /* 'encoding' is untouched, however zipStoreEntryEncoding will use the         * string length to figure out how to encode it. */        reqlen = slen;    }    /* We need space for both the length of the previous entry and     * the length of the payload. */    reqlen += zipStorePrevEntryLength(NULL,prevlen);    reqlen += zipStoreEntryEncoding(NULL,encoding,slen);    /* When the insert position is not equal to the tail, we need to     * make sure that the next entry can hold this entry's length in     * its prevlen field. */    int forcelarge = 0;    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;    if (nextdiff == -4 && reqlen < 4) {        nextdiff = 0;        forcelarge = 1;    }    /* Store offset because a realloc may change the address of zl. */    offset = p-zl;    // 计算出须要的内存容量,而后从新生成一个新大小的zl替换掉原来的zl。    zl = ziplistResize(zl,curlen+reqlen+nextdiff);    p = zl+offset;    /* 迁徙数据,而后更新tail的offset */    if (p[0] != ZIP_END) {        /* Subtract one because of the ZIP_END bytes */        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);        /* Encode this entry's raw length in the next entry. */        if (forcelarge)            zipStorePrevEntryLengthLarge(p+reqlen,reqlen);        else            zipStorePrevEntryLength(p+reqlen,reqlen);        /* Update offset for tail */        ZIPLIST_TAIL_OFFSET(zl) =            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);        /* When the tail contains more than one entry, we need to take         * "nextdiff" in account as well. Otherwise, a change in the         * size of prevlen doesn't have an effect on the *tail* offset. */        zipEntry(p+reqlen, &tail);        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {            ZIPLIST_TAIL_OFFSET(zl) =                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);        }    } else {        /* This element will be the new tail. */        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);    }    /* When nextdiff != 0, the raw length of the next entry has changed, so     * we need to cascade the update throughout the ziplist */    if (nextdiff != 0) {        offset = p-zl;        zl = __ziplistCascadeUpdate(zl,p+reqlen);        p = zl+offset;    }    /* 写入数据 */    p += zipStorePrevEntryLength(p,prevlen);    p += zipStoreEntryEncoding(p,encoding,slen);    if (ZIP_IS_STR(encoding)) {        memcpy(p,s,slen);    } else {        zipSaveInteger(p,value,encoding);    }    ZIPLIST_INCR_LENGTH(zl,1);    return zl;}

ziplist节点删除

unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {    unsigned int i, totlen, deleted = 0;    size_t offset;    int nextdiff = 0;    zlentry first, tail;    zipEntry(p, &first);    for (i = 0; p[0] != ZIP_END && i < num; i++) {        p += zipRawEntryLength(p);        deleted++;    }    totlen = p-first.p; /* 删除元素后缩小的内存空间(字节) */    if (totlen > 0) {        if (p[0] != ZIP_END) {            /* Storing `prevrawlen` in this entry may increase or decrease the             * number of bytes required compare to the current `prevrawlen`.             * There always is room to store this, because it was previously             * stored by an entry that is now being deleted. */            nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);            /* Note that there is always space when p jumps backward: if             * the new previous entry is large, one of the deleted elements             * had a 5 bytes prevlen header, so there is for sure at least             * 5 bytes free and we need just 4. */            p -= nextdiff;            zipStorePrevEntryLength(p,first.prevrawlen);            /* Update offset for tail */            ZIPLIST_TAIL_OFFSET(zl) =                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);            /* When the tail contains more than one entry, we need to take             * "nextdiff" in account as well. Otherwise, a change in the             * size of prevlen doesn't have an effect on the *tail* offset. */            zipEntry(p, &tail);            if (p[tail.headersize+tail.len] != ZIP_END) {                ZIPLIST_TAIL_OFFSET(zl) =                   intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);            }            /* 把tail挪动到ziplist的后面*/            memmove(first.p,p,                intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);        } else {            /* The entire tail was deleted. No need to move memory. */            ZIPLIST_TAIL_OFFSET(zl) =                intrev32ifbe((first.p-zl)-first.prevrawlen);        }        /* 更新ziplist大小 */        offset = first.p-zl;        zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);        ZIPLIST_INCR_LENGTH(zl,-deleted); // 更新zllen         p = zl+offset;        /* When nextdiff != 0, the raw length of the next entry has changed, so         * we need to cascade the update throughout the ziplist */        if (nextdiff != 0)            zl = __ziplistCascadeUpdate(zl,p);    }    return zl;}

插入删除的根本逻辑都是相似,先定位,而后算插入/删除后所需的内存空间变动,依据计算出来新的空间大小对zl做ziplistResize(),而后更新zl的元信息。
除了插入删除外,像ziplistPush ziplistMerge,这这种带改变的API,最初都调用了 ziplistResizeziplistResize代码如下:

unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {    zl = zrealloc(zl,len);    ZIPLIST_BYTES(zl) = intrev32ifbe(len);    zl[len-1] = ZIP_END;    return zl;}

看起来很简短,其实大量的逻辑都在zrealloc中,zrealloc是个宏定义(忽然感觉c的宏定义很骚),其实次要逻辑就是申请一块长度为len的空间,而后开释原来zl所指向的空间。这里能够看出 ziplist批改的代价是很高的 ,如果在应用中有频繁更新list的操作,倡议对list相干的配置做些优化。

其余API

具体API定义列表见源码ziplist.h

unsigned char *ziplistNew(void);  // 新建ziplistunsigned char *ziplistMerge(unsigned char **first, unsigned char **second);  // 合并两个ziplist unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where); // 在ziplist头部或者尾部push一个节点 unsigned char *ziplistIndex(unsigned char *zl, int index); // 找到某个下标的节点  unsigned char *ziplistNext(unsigned char *zl, unsigned char *p);  // 找到p节点的下一个节点 unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p);  // 找到p节点的前一个节点  unsigned int ziplistGet(unsigned char *p, unsigned char **sval, unsigned int *slen, long long *lval);  // 获取entry中存储的具体内容unsigned char *ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen);  // 插入unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p); // 删除  unsigned char *ziplistDeleteRange(unsigned char *zl, int index, unsigned int num); // 删除某个下标区间内的节点 unsigned int ziplistCompare(unsigned char *p, unsigned char *s, unsigned int slen);  // 比拟两个节点的大小 unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip); // 找到某个特定值的节点unsigned int ziplistLen(unsigned char *zl);  // ziplist的长度  size_t ziplistBlobLen(unsigned char *zl);  // ziplist的存储空间大小 void ziplistRepr(unsigned char *zl);   // 

结语

ziplist其实是一个逻辑上的双向链表,能够疾速找到头节点和尾节点,而后每个节点(entry)中也蕴含指向前/后节点的"指针",但作者为了将内存节俭到极致,摒弃了传统的链表设计(前后指针须要16字节的空间,而且会导致内存碎片化重大),设计出了内存十分紧凑的存储格局。内存是省下来了,但操作复杂性也更新的复杂度上来了,当然Redis作者也思考了这点,所以也设计出了ziplist和传统双向链表的折中——quicklist,咱们将在下一篇博文中具体介绍quicklist。

本文是Redis源码分析系列博文,同时也有与之对应的Redis中文正文版,有想深刻学习Redis的同学,欢送star和关注。
Redis中文注解版仓库:https://github.com/xindoo/Redis
Redis源码分析专栏:https://zxs.io/s/1h