Redis: Why the LRU Algorithm's Principle Differs from Its Code Implementation

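(1) What eviction policies does Redis provide for cached data?

Eviction policy	Description	Notes
noeviction	Does not evict any data
allkeys-random	Evicts randomly chosen keys among all keys
allkeys-lru	Evicts the least recently used keys among all keys
allkeys-lfu	Evicts the least frequently used keys among all keys	Added in Redis 4.0
volatile-ttl	Among keys with an expiration time, evicts those that expire soonest
volatile-random	Evicts randomly chosen keys among keys with an expiration time
volatile-lru	Evicts the least recently used keys among keys with an expiration time
volatile-lfu	Evicts the least frequently used keys among keys with an expiration time	Added in Redis 4.0

lru (Least Recently Used): evict what has gone unused for the longest time
lfu (Least Frequently Used): evict what has been used least often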

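Before Redis 3.0 the default eviction policy was volatile-lru; from Redis 3.0 onward (3.0 included) the default is noeviction.

So in 3.0 and later, with the default policy, Redis does not evict any data when its memory usage exceeds maxmemory.

For a Redis cache this means that once the cache is full, Redis stops serving new write requests that need more memory and returns an error directly.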

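(1.1) How Redis memory eviction is enabled

How does Redis's memory eviction mechanism enable the approximate LRU algorithm?

It depends on two configuration parameters in the Redis configuration file redis.conf:

maxmemory: sets the maximum amount of memory the Redis server may use. Once the server's actual memory usage exceeds this threshold, the server performs eviction according to the policy defined by maxmemory-policy;

maxmemory-policy: sets the server's eviction policy. The main choices are approximate LRU, LFU, eviction by TTL value, and random eviction.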
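As an illustration, a redis.conf fragment that turns on approximate LRU eviction could look like the following. The 100mb limit is an arbitrary example value chosen here, not a recommendation; maxmemory-samples is the sampling size used by the approximate algorithm, shown with its default of 5.

```conf
# Cap the memory Redis may use (example value)
maxmemory 100mb
# Evict the least recently used keys among all keys
maxmemory-policy allkeys-lru
# Number of keys sampled per eviction round
maxmemory-samples 5
```

The same options can also be changed on a running server with CONFIG SET, for example CONFIG SET maxmemory-policy allkeys-lru.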

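(2) Cache eviction / page replacement algorithms: the principle

(2.1) LRU

The idea behind LRU is very simple: data that was just accessed is assumed to be very likely to be accessed again.
It therefore evicts the data that has gone unused for the longest time.

Advantages:
Simple and efficient.
Disadvantages:
It can cause cache pollution.

Cache pollution: in some workloads, certain data is accessed very rarely, perhaps only once. If such data stays in the cache after serving that request, it does nothing but waste cache space.

Typical scenario: a full table scan reads all the data once. Every item is read, so every item looks recently used and enters the cache, pushing out data that is actually hot.

(2.2) LFU

LFU records how often each item is accessed and evicts the item that has been used least frequently in the recent past.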

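(3) How Redis implements the eviction algorithms

(3.1) Redis-LRU

A textbook LRU implementation needs a linked list to manage all cached data, which costs extra space.

Moreover, whenever an item is accessed it has to be moved to the MRU end of the list. When many items are being accessed, this produces many list moves, which takes time and in turn lowers Redis performance.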
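To make that cost concrete, here is a minimal sketch of the list bookkeeping an exact LRU would need. The names lru_node, lru_list, lru_unlink, lru_push_front and lru_touch are hypothetical and are not Redis structures or functions; the point is only that every single access pays for an unlink plus a relink, and every cached item carries two extra pointers.

```c
#include <stddef.h>

/* Hypothetical node of an exact LRU list; not a Redis structure. */
typedef struct lru_node {
    struct lru_node *prev, *next;  /* two extra pointers per cached item */
    int key;
} lru_node;

typedef struct lru_list {
    lru_node *head;  /* MRU end */
    lru_node *tail;  /* LRU end: the next eviction victim */
} lru_list;

/* Detach n from the list. */
void lru_unlink(lru_list *l, lru_node *n) {
    if (n->prev) n->prev->next = n->next; else l->head = n->next;
    if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
    n->prev = n->next = NULL;
}

/* Insert n at the MRU end. */
void lru_push_front(lru_list *l, lru_node *n) {
    n->prev = NULL;
    n->next = l->head;
    if (l->head) l->head->prev = n; else l->tail = n;
    l->head = n;
}

/* Called on every access: move n to the MRU end. */
void lru_touch(lru_list *l, lru_node *n) {
    if (l->head == n) return;
    lru_unlink(l, n);
    lru_push_front(l, n);
}
```

With three nodes pushed in the order a, b, c, the tail is a; after lru_touch on a, the tail (the next victim) becomes b.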

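Redis therefore simplifies LRU to reduce the impact of eviction on cache performance.

Redis does not maintain one global linked list for all data. Instead, it randomly samples a small number of keys (controlled by maxmemory-samples, 5 by default), places them in a candidate set, and then chooses the victim from the candidate set by comparing the values of their lru fields.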
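The sampling idea can be sketched as follows. This is a simplified illustration, not the Redis code: pick_lru_victim is a hypothetical helper, the sampled indices are passed in by the caller (Redis draws them at random), clock wraparound is ignored, and nsample is assumed to be at least 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: among the sampled indices only, return the index of
 * the entry whose last-access clock value is the oldest (smallest). */
size_t pick_lru_victim(const uint32_t *last_access,
                       const size_t *sample, size_t nsample) {
    size_t victim = sample[0];
    for (size_t i = 1; i < nsample; i++) {
        if (last_access[sample[i]] < last_access[victim])
            victim = sample[i];
    }
    return victim;
}
```

Because only the sample is inspected, the chosen key is the oldest of the sample, not necessarily the oldest key overall. That is the sense in which the algorithm is approximate.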

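(3.2) Redis-LFU

The LFU policy builds on LRU by adding a counter to each item that tracks how many times the item has been accessed.

When LFU chooses data to evict, it first compares access counts and evicts the item with the lowest count.
If two items have the same access count, LFU then compares how recently they were accessed and evicts the one whose last access is further in the past.

To implement LFU, Redis simply splits the existing 24-bit lru field into two parts:
ldt: the upper 16 bits of the lru field, holding the access timestamp (in minutes);
counter: the lower 8 bits of the lru field, holding the access count.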
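The split of the 24-bit field can be sketched with a few bit operations. The helper names lfu_pack, lfu_ldt and lfu_counter and the macro LFU_COUNTER_BITS are hypothetical, introduced only for this illustration; the layout itself (16-bit timestamp in the high bits, 8-bit counter in the low bits) is the one described above.

```c
#include <stdint.h>

#define LFU_COUNTER_BITS 8  /* hypothetical name: width of the counter part */

/* Combine a 16-bit timestamp and an 8-bit counter into one 24-bit value. */
uint32_t lfu_pack(uint16_t ldt, uint8_t counter) {
    return ((uint32_t)ldt << LFU_COUNTER_BITS) | counter;
}

/* Extract the upper 16 bits: the access timestamp. */
uint16_t lfu_ldt(uint32_t lru) {
    return (uint16_t)((lru >> LFU_COUNTER_BITS) & 0xFFFF);
}

/* Extract the lower 8 bits: the access counter. */
uint8_t lfu_counter(uint32_t lru) {
    return (uint8_t)(lru & 0xFF);
}
```

An 8-bit counter can only hold values from 0 to 255, which is why a plain "add 1 per access" rule would saturate almost immediately for hot keys.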

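When implementing LFU, Redis does not simply add 1 to the counter on every access. It uses a more refined counting rule.

The rule is: each time an item is accessed, Redis first takes a base value derived from the counter's current value (baseval in the code), multiplies it by the lfu_log_factor option, adds 1, and takes the reciprocal to obtain a value p. It then compares p with a random number r between 0 and 1, and increments the counter only when p is greater than r. The larger the counter already is, the smaller p becomes, so the counter grows more and more slowly.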

    double r = (double)rand()/RAND_MAX;
    ...
    double p = 1.0/(baseval*server.lfu_log_factor+1);
    if (r < p) counter++;
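The rule can be tried out in isolation with the sketch below, which follows the shape of LFULogIncr in src/evict.c. Two things are assumptions made for this example: the random draw r is passed in as a parameter so the function is deterministic, and lfu_log_factor is a plain argument instead of a server setting. The base value is taken as the counter minus LFU_INIT_VAL (5, the initial value seen in createObject), floored at 0.

```c
#include <stdint.h>

#define LFU_INIT_VAL 5  /* initial counter value given to new objects */

/* Probabilistic increment: returns the new counter value.
 * r is a random number between 0 and 1 supplied by the caller. */
uint8_t lfu_log_incr(uint8_t counter, int lfu_log_factor, double r) {
    if (counter == 255) return 255;  /* the 8-bit counter saturates */
    double baseval = (double)counter - LFU_INIT_VAL;
    if (baseval < 0) baseval = 0;
    double p = 1.0 / (baseval * lfu_log_factor + 1);
    if (r < p) counter++;            /* increment with probability p */
    return counter;
}
```

With lfu_log_factor set to 10, a counter of 5 has p = 1 and is always incremented, while a counter of 15 has p = 1/101, so roughly one access in a hundred bumps it.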

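(4) Reading the source code

(4.1) Computing the global LRU clock value

LRU needs to know when each item was last accessed, so Redis maintains an LRU clock that supplies the timestamp recorded on each access.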

    // file: src/server.h
    /*
     * Redis object
     */
    typedef struct redisObject {
        unsigned type:4;        // data type (string/list/hash/set/zset, etc.)
        unsigned encoding:4;    // encoding
        unsigned lru:LRU_BITS;  // LRU time (relative to the global lru_clock), or
                                // LFU data (low 8 bits: frequency,
                                // high 16 bits: access time).
                                // LRU_BITS is 24 bits
        int refcount;           // reference count, 4 bytes
        void *ptr;              // pointer to the object's value, 8 bytes
    } robj;

    // file: src/server.c
    void initServerConfig(void) {
        // compute the global LRU clock value
        server.lruclock = getLRUClock();
    }

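    // file: src/evict.c
    /*
     * Return the LRU clock, based on the clock resolution.
     * This is a time in a reduced-bits format that can be used to set and
     * check the object->lru field of the redisObject structure.
     */
    unsigned int getLRUClock(void) {
        // mstime() is a millisecond timestamp
        // mstime()/1000 is a timestamp in seconds
        // the bitwise AND keeps the value <= LRU_CLOCK_MAX
        return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
    }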

As the code shows, the LRU clock has a resolution of 1000 milliseconds, i.e. 1 second.

    #define LRU_BITS 24
    // maximum value of obj->lru
    // LRU_CLOCK_MAX = 2^24 - 1
    #define LRU_CLOCK_MAX ((1<<LRU_BITS)-1) /* Max value of obj->lru */
    // LRU clock resolution (milliseconds)
    #define LRU_CLOCK_RESOLUTION 1000 /* LRU clock resolution in ms */
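The effect of the division and the mask can be checked with a small stand-alone version of the computation. Here lru_clock_from_ms is a hypothetical helper that takes the millisecond timestamp as an argument instead of calling mstime(); the three macros are copied from the definitions above.

```c
#include <stdint.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1<<LRU_BITS)-1)
#define LRU_CLOCK_RESOLUTION 1000

/* Convert a millisecond timestamp to a 24-bit LRU clock value:
 * truncate to whole seconds, then keep only the low 24 bits. */
uint32_t lru_clock_from_ms(long long ms) {
    return (uint32_t)((ms / LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX);
}
```

Since only 24 bits are kept at one-second resolution, the clock wraps back to 0 after 2^24 seconds, which is about 194 days.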

    // file: src/server.c
    /*
     * Return the UNIX time in milliseconds
     */
    mstime_t mstime(void) {
        return ustime()/1000;
    }

    // file: src/server.c
    /*
     * Return the UNIX time in microseconds
     */
    long long ustime(void) {
        struct timeval tv;
        long long ust;

        gettimeofday(&tv, NULL);
        ust = ((long long)tv.tv_sec)*1000000;
        ust += tv.tv_usec;
        return ust;
    }

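(4.2) How the LRU clock value is updated at run time

This is tied to the serverCron function, the handler of the time event that the Redis server runs periodically within its event-driven framework.

As the callback of a time event, serverCron runs periodically at a fixed frequency, which is determined by the hz option in the Redis configuration file redis.conf.

hz defaults to 10, which means serverCron runs once every 100 milliseconds (1 second / 10 = 100 milliseconds).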

    // file: src/server.c
    /* This is our timer interrupt, called server.hz times per second.
     * Here is where we do a number of things that need to be done asynchronously.
     * For instance:
     *
     * - Active expired keys collection (it is also performed in a lazy way on
     *   lookup).
     * - Software watchdog.
     * - Update some statistic.
     * - Incremental rehashing of the DBs hash tables.
     * - Triggering BGSAVE / AOF rewrite, and handling of terminated children.
     * - Clients timeout of different kinds.
     * - Replication reconnection.
     * - Many more...
     *
     * Everything directly called here will be called server.hz times per second,
     * so in order to throttle execution of things we want to do less frequently
     * a macro is used: run_with_period(milliseconds) { .... }
     */
    int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
        /* We have just LRU_BITS bits per object for LRU information.
         * So we use an (eventually wrapping) LRU clock.
         *
         * Note that even if the counter wraps it's not a big problem,
         * everything will still work but some object will appear younger
         * to Redis. However for this to happen a given object should never be
         * touched for all the time needed to the counter to wrap, which is
         * not likely.
         *
         * Note that you can change the resolution altering the
         * LRU_CLOCK_RESOLUTION define. */
        // By default, getLRUClock is called every 100 ms to refresh the global LRU clock value
        server.lruclock = getLRUClock();
        // (excerpt: the rest of serverCron is not shown)
    }

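This way, every key-value pair can take its latest access timestamp from the global LRU clock.

(4.3) Initializing and updating a key-value pair's LRU clock value

(4.3.1) Initializing a key's LRU clock

A key-value pair's LRU clock value is first set when the pair is created. This initialization is done in the createObject function.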

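    // file: src/object.c
    /*
     * Create a redisObject
     *
     * @param type  type of the redisObject
     * @param *ptr  pointer to the value
     */
    robj *createObject(int type, void *ptr) {
        // allocate memory for the redisObject structure
        robj *o = zmalloc(sizeof(*o));

        // part of the code omitted

        // Set the lru field to the current lruclock (minutes resolution),
        // or alternatively to the LFU counter.
        // Check the memory eviction policy
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            // LFU case
            // LFU_INIT_VAL=5, binary 0101
            // bitwise OR: timestamp in the high 16 bits, counter in the low 8 bits
            o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
        } else {
            // LRU case
            o->lru = LRU_CLOCK();
        }
        return o;
    }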

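(4.3.2) Updating a key's LRU clock

Whenever a key is accessed, its LRU clock value is updated (unless a child process is active or the LOOKUP_NOTOUCH flag is set, as the code shows). Every access to a key-value pair eventually calls the lookupKey function.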

    // file: src/db.c
    /*
     * Low-level key lookup API.
     * It is not actually called directly from the implementations of commands,
     * which should instead rely on lookupKeyRead(), lookupKeyWrite() and
     * lookupKeyReadWithFlags().
     */
    robj *lookupKey(redisDb *db, robj *key, int flags) {
        dictEntry *de = dictFind(db->dict,key->ptr);
        // if the entry exists
        if (de) {
            // get the redisObject from the entry
            robj *val = dictGetVal(de);

            /*
             * Update the access time for the ageing algorithm.
             * Don't do it if we have a saving child, as this will trigger
             * a copy-on-write madness.
             */
            // no active child process, and LOOKUP_NOTOUCH is not set
            if (!hasActiveChildProcess() && !(flags & LOOKUP_NOTOUCH)){
                if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                    // update LFU data
                    updateLFU(val);
                } else {
                    // update LRU time
                    val->lru = LRU_CLOCK();
                }
            }
            return val;
        } else {
            return NULL;
        }
    }
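The timestamp stored in val->lru only becomes useful once it is compared with the current clock. The sketch below shows one way to turn the two values into an idle time while tolerating a single wrap of the 24-bit clock. It is modelled on estimateObjectIdleTime in src/evict.c, which the eviction code calls; idle_ms is a hypothetical name, and both clock values are passed in as parameters here.

```c
#include <stdint.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1<<LRU_BITS)-1)
#define LRU_CLOCK_RESOLUTION 1000

/* Idle time in milliseconds of an object whose lru field is obj_lru,
 * given the current LRU clock. If the clock is smaller than the stored
 * value, assume the clock wrapped around exactly once. */
unsigned long long idle_ms(uint32_t lruclock, uint32_t obj_lru) {
    if (lruclock >= obj_lru) {
        return (unsigned long long)(lruclock - obj_lru) * LRU_CLOCK_RESOLUTION;
    }
    return ((unsigned long long)lruclock + (LRU_CLOCK_MAX - obj_lru))
           * LRU_CLOCK_RESOLUTION;
}
```

An object untouched for longer than one full wrap would look younger than it is, which is the caveat mentioned in the serverCron comment.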

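(4.4) Executing the approximate LRU algorithm

Redis implements an approximate LRU algorithm in order to reduce the overhead in memory and in operation time.
When is the algorithm triggered?
How exactly does it run?

(4.4.1) When it is triggered

The main logic of approximate LRU is implemented in the freeMemoryIfNeeded function, which is reached through this call chain:

processCommand -> freeMemoryIfNeededAndSafe -> freeMemoryIfNeeded

(4.4.2) Running the approximate LRU algorithm

There are three main steps:

Check the current memory usage: getMaxmemoryState
Update the set of candidate key-value pairs for eviction: evictionPoolPopulate
Choose the key-value pair to evict and delete it: freeMemoryIfNeeded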

    // file: src/evict.c
    /* This function is periodically called to see if there is memory to free
     * according to the current "maxmemory" settings. In case we are over the
     * memory limit, the function will try to free some memory to return back
     * under the limit.
     *
     * The function returns C_OK if we are under the memory limit or if we
     * were over the limit, but the attempt to free memory was successful.
     * Otherwise if we are over the memory limit, but not enough memory
     * was freed to return back under the limit, the function returns C_ERR. */
    int freeMemoryIfNeeded(void) {
        int keys_freed = 0;
        /* By default replicas should ignore maxmemory
         * and just be masters exact copies. */
        if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK;

        size_t mem_reported, mem_tofree, mem_freed;
        mstime_t latency, eviction_latency, lazyfree_latency;
        long long delta;
        int slaves = listLength(server.slaves);
        int result = C_ERR;

        /* When clients are paused the dataset should be static not just from the
         * POV of clients not being able to write, but also from the POV of
         * expires and evictions of keys not being performed. */
        if (clientsArePaused()) return C_OK;
        // if current memory usage does not exceed maxmemory, return
        if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
            return C_OK;

        mem_freed = 0;

        latencyStartMonitor(latency);
        if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
            goto cant_free; /* We need to free memory, but policy forbids. */

        while (mem_freed < mem_tofree) {
            int j, k, i;
            static unsigned int next_db = 0;
            sds bestkey = NULL;
            int bestdbid;
            redisDb *db;
            dict *dict;
            dictEntry *de;

            if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
                server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
            {
                struct evictionPoolEntry *pool = EvictionPoolLRU;

                while(bestkey == NULL) {
                    unsigned long total_keys = 0, keys;

                    /* We don't want to make local-db choices when expiring keys,
                     * so to start populate the eviction pool sampling keys from
                     * every DB. */
                    for (i = 0; i < server.dbnum; i++) {
                        db = server.db+i;
                        dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                                db->dict : db->expires;
                        if ((keys = dictSize(dict)) != 0) {
                            evictionPoolPopulate(i, dict, db->dict, pool);
                            total_keys += keys;
                        }
                    }
                    if (!total_keys) break; /* No keys to evict. */

                    /* Go backward from best to worst element to evict. */
                    for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                        if (pool[k].key == NULL) continue;
                        bestdbid = pool[k].dbid;

                        if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                            de = dictFind(server.db[pool[k].dbid].dict,
                                pool[k].key);
                        } else {
                            de = dictFind(server.db[pool[k].dbid].expires,
                                pool[k].key);
                        }

                        /* Remove the entry from the pool. */
                        if (pool[k].key != pool[k].cached)
                            sdsfree(pool[k].key);
                        pool[k].key = NULL;
                        pool[k].idle = 0;

                        /* If the key exists, is our pick. Otherwise it is
                         * a ghost and we need to try the next element. */
                        if (de) {
                            bestkey = dictGetKey(de);
                            break;
                        } else {
                            /* Ghost... Iterate again. */
                        }
                    }
                }
            }

            /* volatile-random and allkeys-random policy */
            else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                     server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
            {
                /* When evicting a random key, we try to evict a key for
                 * each DB, so we use the static 'next_db' variable to
                 * incrementally visit all DBs. */
                for (i = 0; i < server.dbnum; i++) {
                    j = (++next_db) % server.dbnum;
                    db = server.db+j;
                    dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                            db->dict : db->expires;
                    if (dictSize(dict) != 0) {
                        de = dictGetRandomKey(dict);
                        bestkey = dictGetKey(de);
                        bestdbid = j;
                        break;
                    }
                }
            }

            /* Finally remove the selected key. */
            if (bestkey) {
                db = server.db+bestdbid;
                robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
                propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
                /* We compute the amount of memory freed by db*Delete() alone.
                 * It is possible that actually the memory needed to propagate
                 * the DEL in AOF and replication link is greater than the one
                 * we are freeing removing the key, but we can't account for
                 * that otherwise we would never exit the loop.
                 *
                 * Same for CSC invalidation messages generated by signalModifiedKey.
                 *
                 * AOF and Output buffer memory will be freed eventually so
                 * we only care about memory used by the key space. */
                delta = (long long) zmalloc_used_memory();
                latencyStartMonitor(eviction_latency);
                if (server.lazyfree_lazy_eviction)
                    dbAsyncDelete(db,keyobj);
                else
                    dbSyncDelete(db,keyobj);
                latencyEndMonitor(eviction_latency);
                latencyAddSampleIfNeeded("eviction-del",eviction_latency);
                delta -= (long long) zmalloc_used_memory();
                mem_freed += delta;
                server.stat_evictedkeys++;
                signalModifiedKey(NULL,db,keyobj);
                notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                    keyobj, db->id);
                decrRefCount(keyobj);
                keys_freed++;

                /* When the memory to free starts to be big enough, we may
                 * start spending so much time here that is impossible to
                 * deliver data to the slaves fast enough, so we force the
                 * transmission here inside the loop. */
                if (slaves) flushSlavesOutputBuffers();

                /* Normally our stop condition is the ability to release
                 * a fixed, pre-computed amount of memory. However when we
                 * are deleting objects in another thread, it's better to
                 * check, from time to time, if we already reached our target
                 * memory, since the "mem_freed" amount is computed only
                 * across the dbAsyncDelete() call, while the thread can
                 * release the memory all the time. */
                if (server.lazyfree_lazy_eviction && !(keys_freed % 16)) {
                    if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                        /* Let's satisfy our stop condition. */
                        mem_freed = mem_tofree;
                    }
                }
            } else {
                goto cant_free; /* nothing to free... */
            }
        }
        result = C_OK;

    cant_free:
        /* We are here if we are not able to reclaim memory. There is only one
         * last thing we can try: check if the lazyfree thread has jobs in queue
         * and wait... */
        if (result != C_OK) {
            latencyStartMonitor(lazyfree_latency);
            while(bioPendingJobsOfType(BIO_LAZY_FREE)) {
                if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                    result = C_OK;
                    break;
                }
                usleep(1000);
            }
            latencyEndMonitor(lazyfree_latency);
            latencyAddSampleIfNeeded("eviction-lazyfree",lazyfree_latency);
        }
        latencyEndMonitor(latency);
        latencyAddSampleIfNeeded("eviction-cycle",latency);
        return result;
    }

(4.4.2.1) Checking the current memory usage: getMaxmemoryState

    // file: src/evict.c
    /* Get the memory status from the point of view of the maxmemory directive:
     * if the memory used is under the maxmemory setting then C_OK is returned.
     * Otherwise, if we are over the memory limit, the function returns
     * C_ERR.
     *
     * The function may return additional info via reference, only if the
     * pointers to the respective arguments is not NULL. Certain fields are
     * populated only when C_ERR is returned:
     *
     *  'total'     total amount of bytes used.
     *              (Populated both for C_ERR and C_OK)
     *
     *  'logical'   the amount of memory used minus the slaves/AOF buffers.
     *              (Populated when C_ERR is returned)
     *
     *  'tofree'    the amount of memory that should be released
     *              in order to return back into the memory limits.
     *              (Populated when C_ERR is returned)
     *
     *  'level'     this usually ranges from 0 to 1, and reports the amount of
     *              memory currently used. May be > 1 if we are over the memory
     *              limit.
     *              (Populated both for C_ERR and C_OK)
     */
    int getMaxmemoryState(size_t *total, size_t *logical, size_t *tofree, float *level) {
        size_t mem_reported, mem_used, mem_tofree;

        /* Check if we are over the memory usage limit. If we are not, no need
         * to subtract the slaves output buffers. We can just return ASAP. */
        mem_reported = zmalloc_used_memory();
        if (total) *total = mem_reported;

        /* We may return ASAP if there is no need to compute the level. */
        int return_ok_asap = !server.maxmemory || mem_reported <= server.maxmemory;
        if (return_ok_asap && !level) return C_OK;

        /* Remove the size of slaves output buffers and AOF buffer from the
         * count of used memory. */
        mem_used = mem_reported;
        size_t overhead = freeMemoryGetNotCountedMemory();
        mem_used = (mem_used > overhead) ? mem_used-overhead : 0;

        /* Compute the ratio of memory usage. */
        if (level) {
            if (!server.maxmemory) {
                *level = 0;
            } else {
                *level = (float)mem_used / (float)server.maxmemory;
            }
        }

        if (return_ok_asap) return C_OK;

        /* Check if we are still over the memory limit. */
        if (mem_used <= server.maxmemory) return C_OK;

        /* Compute how much memory we need to free. */
        mem_tofree = mem_used - server.maxmemory;

        if (logical) *logical = mem_used;
        if (tofree) *tofree = mem_tofree;

        return C_ERR;
    }
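Stripped of its optional output parameters, the decision made by getMaxmemoryState boils down to a little arithmetic. The sketch below is a simplification under stated assumptions: mem_to_free is a hypothetical helper, all three quantities are passed in as plain numbers, and a return value of 0 stands for the C_OK case (nothing needs to be freed).

```c
#include <stddef.h>

/* reported:    total bytes reported by the allocator
 * not_counted: bytes excluded from the limit (replica output buffers, AOF buffer)
 * maxmemory:   configured limit, 0 meaning "no limit"
 * Returns the number of bytes that must be freed, or 0 if none. */
size_t mem_to_free(size_t reported, size_t not_counted, size_t maxmemory) {
    /* Fast path: no limit configured, or clearly under it. */
    if (maxmemory == 0 || reported <= maxmemory) return 0;
    /* Subtract the buffers that do not count toward the limit. */
    size_t used = (reported > not_counted) ? reported - not_counted : 0;
    if (used <= maxmemory) return 0;
    return used - maxmemory;
}
```

For example, with a limit of 1000 bytes, 1200 reported bytes of which 300 are uncounted buffers is still within the limit, while 1200 reported bytes with 50 uncounted leaves 150 bytes to free.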

(4.4.2.2) Updating the set of candidate key-value pairs for eviction: evictionPoolPopulate

    // file: src/evict.c
    /* This is an helper function for freeMemoryIfNeeded(), it is used in order
     * to populate the evictionPool with a few entries every time we want to
     * expire a key. Keys with idle time smaller than one of the current
     * keys are added. Keys are always added if there are free entries.
     *
     * We insert keys on place in ascending order, so keys with the smaller
     * idle time are on the left, and keys with the higher idle time on the
     * right. */
    void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
        int j, k, count;
        dictEntry *samples[server.maxmemory_samples];

        count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);
        for (j = 0; j < count; j++) {
            unsigned long long idle;
            sds key;
            robj *o;
            dictEntry *de;

            de = samples[j];
            key = dictGetKey(de);

            /* If the dictionary we are sampling from is not the main
             * dictionary (but the expires one) we need to lookup the key
             * again in the key dictionary to obtain the value object. */
            if (server.maxmemory_policy != MAXMEMORY_VOLATILE_TTL) {
                if (sampledict != keydict) de = dictFind(keydict, key);
                o = dictGetVal(de);
            }

            /* Calculate the idle time according to the policy. This is called
             * idle just because the code initially handled LRU, but is in fact
             * just a score where an higher score means better candidate. */
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
                idle = estimateObjectIdleTime(o);
            } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                /* When we use an LRU policy, we sort the keys by idle time
                 * so that we expire keys starting from greater idle time.
                 * However when the policy is an LFU one, we have a frequency
                 * estimation, and we want to evict keys with lower frequency
                 * first. So inside the pool we put objects using the inverted
                 * frequency subtracting the actual frequency to the maximum
                 * frequency of 255. */
                idle = 255-LFUDecrAndReturn(o);
            } else if (server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) {
                /* In this case the sooner the expire the better. */
                idle = ULLONG_MAX - (long)dictGetVal(de);
            } else {
                serverPanic("Unknown eviction policy in evictionPoolPopulate()");
            }

            /* Insert the element inside the pool.
             * First, find the first empty bucket or the first populated
             * bucket that has an idle time smaller than our idle time. */
            k = 0;
            while (k < EVPOOL_SIZE &&
                   pool[k].key &&
                   pool[k].idle < idle) k++;
            if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) {
                /* Can't insert if the element is < the worst element we have
                 * and there are no empty buckets. */
                continue;
            } else if (k < EVPOOL_SIZE && pool[k].key == NULL) {
                /* Inserting into empty position. No setup needed before insert. */
            } else {
                /* Inserting in the middle. Now k points to the first element
                 * greater than the element to insert.  */
                if (pool[EVPOOL_SIZE-1].key == NULL) {
                    /* Free space on the right? Insert at k shifting
                     * all the elements from k to end to the right. */

                    /* Save SDS before overwriting. */
                    sds cached = pool[EVPOOL_SIZE-1].cached;
                    memmove(pool+k+1,pool+k,
                        sizeof(pool[0])*(EVPOOL_SIZE-k-1));
                    pool[k].cached = cached;
                } else {
                    /* No free space on right? Insert at k-1 */
                    k--;
                    /* Shift all elements on the left of k (included) to the
                     * left, so we discard the element with smaller idle time. */
                    sds cached = pool[0].cached; /* Save SDS before overwriting. */
                    if (pool[0].key != pool[0].cached) sdsfree(pool[0].key);
                    memmove(pool,pool+1,sizeof(pool[0])*k);
                    pool[k].cached = cached;
                }
            }

            /* Try to reuse the cached SDS string allocated in the pool entry,
             * because allocating and deallocating this object is costly
             * (according to the profiler, not my fantasy. Remember:
             * premature optimization bla bla bla. */
            int klen = sdslen(key);
            if (klen > EVPOOL_CACHED_SDS_SIZE) {
                pool[k].key = sdsdup(key);
            } else {
                memcpy(pool[k].cached,key,klen+1);
                sdssetlen(pool[k].cached,klen);
                pool[k].key = pool[k].cached;
            }
            pool[k].idle = idle;
            pool[k].dbid = dbid;
        }
    }
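The insertion logic in the middle of evictionPoolPopulate is easier to follow on a reduced model. The sketch below keeps only the ordering behaviour: a fixed-size pool sorted by ascending idle score, with the best eviction candidate on the right. It is an illustration, not the Redis code: pool_entry, pool_insert and POOL_SIZE are hypothetical names, keys are plain integers, a used flag replaces the NULL key check, and the cached SDS handling is left out.

```c
#include <string.h>

#define POOL_SIZE 4  /* Redis uses EVPOOL_SIZE; 4 keeps the example small */

typedef struct {
    int used;                 /* 0 marks an empty slot */
    unsigned long long idle;  /* score: higher means better candidate */
    int key;
} pool_entry;

/* Insert a candidate, keeping the pool sorted by ascending idle.
 * Returns 1 if the candidate was stored, 0 if it was rejected. */
int pool_insert(pool_entry *pool, unsigned long long idle, int key) {
    int k = 0;
    /* Find the first empty slot, or the first entry with idle >= ours. */
    while (k < POOL_SIZE && pool[k].used && pool[k].idle < idle) k++;
    if (k == 0 && pool[POOL_SIZE-1].used) {
        return 0;  /* pool is full and the candidate beats nothing */
    } else if (k < POOL_SIZE && !pool[k].used) {
        /* Empty slot: nothing to move. */
    } else if (!pool[POOL_SIZE-1].used) {
        /* Room on the right: shift entries from k onward right by one. */
        memmove(pool+k+1, pool+k, sizeof(pool[0])*(POOL_SIZE-k-1));
    } else {
        /* Pool full: drop the leftmost (smallest idle) and shift left. */
        k--;
        memmove(pool, pool+1, sizeof(pool[0])*k);
    }
    pool[k].used = 1;
    pool[k].idle = idle;
    pool[k].key = key;
    return 1;
}
```

Starting from a zero-initialized pool, inserting scores 30, 10, 50, 20 leaves the pool ordered as 10, 20, 30, 50. A further candidate with score 5 is rejected, while one with score 40 pushes out the entry with score 10.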

(4.4.2.3) Choosing the key-value pair to evict and deleting it: freeMemoryIfNeeded

This step is the second half of freeMemoryIfNeeded, whose full listing appears under (4.4.2) above: it scans the pool from right to left (k = EVPOOL_SIZE-1 down to 0), takes the first key that still exists as bestkey, and deletes it with dbAsyncDelete or dbSyncDelete depending on server.lazyfree_lazy_eviction.

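References

https://weikeqin.com/tags/redis/

Redis source code analysis and practice, study notes, Day 15: 15 | Why does the LRU algorithm's principle differ from its code implementation?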
https://time.geekbang.org/col...