在 Redis 中，允許用戶設(shè)置最大使用內(nèi)存大小 server.maxmemory，在內(nèi)存限定的情況下是很有用的。譬如，在一臺(tái) 8G 機(jī)子上部署了 4 個(gè) Redis 服務(wù)點(diǎn)，每一個(gè)服務(wù)點(diǎn)分配 1G 的內(nèi)存大小，減少內(nèi)存緊張的情況，由此獲取更為穩(wěn)健的服務(wù)。Redis 內(nèi)存數(shù)據(jù)集大小上升到一定大小的時(shí)候，就會(huì)施行數(shù)據(jù)淘汰策略。Redis 提供 6 種數(shù)據(jù)淘汰策略：

volatile-lru：從已設(shè)置過期時(shí)間的數(shù)據(jù)集（server.db[i].expires）中挑選最近最少使用的數(shù)據(jù)淘汰
volatile-ttl：從已設(shè)置過期時(shí)間的數(shù)據(jù)集（server.db[i].expires）中挑選將要過期的數(shù) 據(jù)淘汰
volatile-random：從已設(shè)置過期時(shí)間的數(shù)據(jù)集（server.db[i].expires）中任意選擇數(shù)據(jù) 淘汰
allkeys-lru：從數(shù)據(jù)集（server.db[i].dict）中挑選最近最少使用的數(shù)據(jù)淘汰
allkeys-random：從數(shù)據(jù)集（server.db[i].dict）中任意選擇數(shù)據(jù)淘汰
no-enviction（驅(qū)逐）：禁止驅(qū)逐數(shù)據(jù)

Redis 確定驅(qū)逐某個(gè)鍵值對(duì)后，會(huì)刪除這個(gè)數(shù)據(jù)，并將這個(gè)數(shù)據(jù)變更消息發(fā)布到本地（AOF 持久化）和從機(jī)（主從連接）。

LRU 數(shù)據(jù)淘汰機(jī)制

在服務(wù)器配置中保存了 lru 計(jì)數(shù)器 server.lrulock，會(huì)定時(shí)（Redis 定時(shí)程序serverCorn()）更新，server.lrulock 的值是根據(jù) server.unixtime 計(jì)算出來的。

// redisServer 保存了lru 計(jì)數(shù)器
struct redisServer {
...
unsigned lruclock:22; /* Clock incrementing every minute, for LRU */
...
};

另外，從 struct redisObject 中可以發(fā)現(xiàn)，每一個(gè) Redis 對(duì)象都會(huì)設(shè)置相應(yīng)的 lru，即最近訪問的時(shí)間?？梢韵胂蟮氖牵恳淮卧L問數(shù)據(jù)的時(shí)候，會(huì)更新 redisObject.lru。

LRU 數(shù)據(jù)淘汰機(jī)制是這樣的：在數(shù)據(jù)集中隨機(jī)挑選幾個(gè)鍵值對(duì)，取出其中 lru 最大的鍵值對(duì)淘汰。所以，你會(huì)發(fā)現(xiàn)，Redis 并不是保證取得所有數(shù)據(jù)集中最近最少使用（LRU）的鍵值對(duì)，而只是隨機(jī)挑選的幾個(gè)鍵值對(duì)中的。

// 每一個(gè)redis 對(duì)象都保存了lru
#define REDIS_LRU_CLOCK_MAX ((1<<21)-1) /* Max value of obj->lru */
#define REDIS_LRU_CLOCK_RESOLUTION 10 /* LRU clock resolution in seconds */
typedef struct redisObject {
    // 剛剛好32 bits
    // 對(duì)象的類型，字符串/列表/集合/哈希表
    unsigned type:4;
    // 未使用的兩個(gè)位
    unsigned notused:2; /* Not used */
    // 編碼的方式，redis 為了節(jié)省空間，提供多種方式來保存一個(gè)數(shù)據(jù)
    // 譬如：“123456789” 會(huì)被存儲(chǔ)為整數(shù)123456789
    unsigned encoding:4;
    unsigned lru:22; /* lru time (relative to server.lruclock) */
    // 引用數(shù)
    int refcount;
    // 數(shù)據(jù)指針
    void *ptr;
} robj;

// redis 定時(shí)執(zhí)行程序。聯(lián)想：linux cron
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    ......
    /* We have just 22 bits per object for LRU information.
    * So we use an (eventually wrapping) LRU clock with 10 seconds resolution.
    * 2^22 bits with 10 seconds resolution is more or less 1.5 years.
    **
    Note that even if this will wrap after 1.5 years it's not a problem,
    * everything will still work but just some object will appear younger
    * to Redis. But for this to happen a given object should never be touched
    * for 1.5 years.
    **
    Note that you can change the resolution altering the
    * REDIS_LRU_CLOCK_RESOLUTION define.
    */
    updateLRUClock();
    ......
}
    // 更新服務(wù)器的lru 計(jì)數(shù)器
void updateLRUClock(void) {
    server.lruclock = (server.unixtime/REDIS_LRU_CLOCK_RESOLUTION) &
    REDIS_LRU_CLOCK_MAX;
}

TTL 數(shù)據(jù)淘汰機(jī)制

Redis 數(shù)據(jù)集數(shù)據(jù)結(jié)構(gòu)中保存了鍵值對(duì)過期時(shí)間的表，即 redisDb.expires，在使用 SET 命令的時(shí)候，就有一個(gè)鍵值對(duì)超時(shí)時(shí)間的選項(xiàng)。和 LRU 數(shù)據(jù)淘汰機(jī)制類似，TTL 數(shù)據(jù)淘汰機(jī)制是這樣的：從過期時(shí)間 redisDB.expires 表中隨機(jī)挑選幾個(gè)鍵值對(duì)，取出其中 ttl 最大的鍵值對(duì)淘汰。同樣你會(huì)發(fā)現(xiàn)，Redis 并不是保證取得所有過期時(shí)間的表中最快過期的鍵值對(duì)，而只是隨機(jī)挑選的幾個(gè)鍵值對(duì)中的。

無論是什么機(jī)制，都是從所有的鍵值對(duì)中挑選合適的淘汰。

在哪里開始淘汰數(shù)據(jù)

Redis 每服務(wù)客戶端執(zhí)行一個(gè)命令的時(shí)候，會(huì)檢測使用的內(nèi)存是否超額。如果超額，即進(jìn)行數(shù)據(jù)淘汰。

// 執(zhí)行命令
int processCommand(redisClient *c) {
        ......
        // 內(nèi)存超額
        /* Handle the maxmemory directive.
        **
        First we try to free some memory if possible (if there are volatile
        * keys in the dataset). If there are not the only thing we can do
        * is returning an error. */
        if (server.maxmemory) {
                int retval = freeMemoryIfNeeded();
        if ((c->cmd->flags & REDIS_CMD_DENYOOM) && retval == REDIS_ERR) {
                flagTransaction(c);
                addReply(c, shared.oomerr);
                return REDIS_OK;
        }
    }
    ......
}

這是我們之前講述過的命令處理函數(shù)。在處理命令處理函數(shù)的過程，會(huì)涉及到內(nèi)存使用量的檢測，如果檢測到內(nèi)存使用超額，會(huì)觸發(fā)數(shù)據(jù)淘汰機(jī)制。我們來看看淘汰機(jī)制觸發(fā)的函數(shù) freeMemoryIfNeeded() 里面發(fā)生了什么。

// 如果需要，是否一些內(nèi)存
int freeMemoryIfNeeded(void) {
    size_t mem_used, mem_tofree, mem_freed;
    int slaves = listLength(server.slaves);
    // redis 從機(jī)回復(fù)空間和AOF 內(nèi)存大小不計(jì)算入redis 內(nèi)存大小
    // 關(guān)于已使用內(nèi)存大小是如何統(tǒng)計(jì)的，我們會(huì)其他章節(jié)講解，這里先忽略這個(gè)細(xì)節(jié)
    /* Remove the size of slaves output buffers and AOF buffer from the
    * count of used memory. */
    mem_used = zmalloc_used_memory();
    // 從機(jī)回復(fù)空間大小
    if (slaves) {
        listIter li;
        listNode *ln;
        listRewind(server.slaves,&li);
    while((ln = listNext(&li))) {
        redisClient *slave = listNodeValue(ln);
    unsigned long obuf_bytes = getClientOutputBufferMemoryUsage(slave);
    if (obuf_bytes > mem_used)
        mem_used = 0;
    else
        mem_used -= obuf_bytes;
    }
}
    // server.aof_buf && server.aof_rewrite_buf_blocks
    if (server.aof_state != REDIS_AOF_OFF) {
        mem_used -= sdslen(server.aof_buf);
        mem_used -= aofRewriteBufferSize();
    }
    // 內(nèi)存是否超過設(shè)置大小
    /* Check if we are over the memory limit. */
    if (mem_used <= server.maxmemory) return REDIS_OK;
    // redis 中可以設(shè)置內(nèi)存超額策略
    if (server.maxmemory_policy == REDIS_MAXMEMORY_NO_EVICTION)
        return REDIS_ERR; /* We need to free memory, but policy forbids. */
    /* Compute how much memory we need to free. */
        mem_tofree = mem_used - server.maxmemory;
        mem_freed = 0;
    while (mem_freed < mem_tofree) {
        int j, k, keys_freed = 0;
        // 遍歷所有數(shù)據(jù)集
        for (j = 0; j < server.dbnum; j++) {
            long bestval = 0; /* just to prevent warning */
            sds bestkey = NULL;
            struct dictEntry *de;
            redisDb *db = server.db+j;
            dict *dict;
        // 不同的策略，選擇的數(shù)據(jù)集不一樣
        if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
            server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM)
        {
            dict = server.db[j].dict;
        } else {
            dict = server.db[j].expires;
        }
        // 數(shù)據(jù)集為空，繼續(xù)下一個(gè)數(shù)據(jù)集
        if (dictSize(dict) == 0) continue;
        // 隨機(jī)淘汰隨機(jī)策略：隨機(jī)挑選
        /* volatile-random and allkeys-random policy */
        if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM ||
            server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_RANDOM)
        {
            de = dictGetRandomKey(dict);
            bestkey = dictGetKey(de);
        }
    // LRU 策略：挑選最近最少使用的數(shù)據(jù)
    /* volatile-lru and allkeys-lru policy */
        else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
            server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
        {
        // server.maxmemory_samples 為隨機(jī)挑選鍵值對(duì)次數(shù)
        // 隨機(jī)挑選server.maxmemory_samples 個(gè)鍵值對(duì)，驅(qū)逐最近最少使用的數(shù)據(jù)
        for (k = 0; k < server.maxmemory_samples; k++) {
            sds thiskey;
            long thisval;
            robj *o;
        // 隨機(jī)挑選鍵值對(duì)
            de = dictGetRandomKey(dict);
        // 獲取鍵
            thiskey = dictGetKey(de);
        /* When policy is volatile-lru we need an additional lookup
        * to locate the real key, as dict is set to db->expires. */
        if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
            de = dictFind(db->dict, thiskey);
            o = dictGetVal(de);
            // 計(jì)算數(shù)據(jù)的空閑時(shí)間
            thisval = estimateObjectIdleTime(o);
            // 當(dāng)前鍵值空閑時(shí)間更長，則記錄
            /* Higher idle time is better candidate for deletion */
        if (bestkey == NULL || thisval > bestval) {
            bestkey = thiskey;
            bestval = thisval;
            }
        }
    }
    // TTL 策略：挑選將要過期的數(shù)據(jù)
    /* volatile-ttl */
    else if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_TTL) {
    // server.maxmemory_samples 為隨機(jī)挑選鍵值對(duì)次數(shù)
    // 隨機(jī)挑選server.maxmemory_samples 個(gè)鍵值對(duì)，驅(qū)逐最快要過期的數(shù)據(jù)
    for (k = 0; k < server.maxmemory_samples; k++) {
    sds thiskey;
    long thisval;
    de = dictGetRandomKey(dict);
    thiskey = dictGetKey(de);
    thisval = (long) dictGetVal(de);
    /* Expire sooner (minor expire unix timestamp) is better
    * candidate for deletion */
    if (bestkey == NULL || thisval < bestval) {
        bestkey = thiskey;
        bestval = thisval;
        }
    }
}
    // 刪除選定的鍵值對(duì)
    /* Finally remove the selected key. */
    if (bestkey) {
        long long delta;
    robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
    // 發(fā)布數(shù)據(jù)更新消息，主要是AOF 持久化和從機(jī)
    propagateExpire(db,keyobj);
    // 注意， propagateExpire() 可能會(huì)導(dǎo)致內(nèi)存的分配，
    // propagateExpire() 提前執(zhí)行就是因?yàn)閞edis 只計(jì)算
    // dbDelete() 釋放的內(nèi)存大小。倘若同時(shí)計(jì)算dbDelete()
    // 釋放的內(nèi)存和propagateExpire() 分配空間的大小，與此
    // 同時(shí)假設(shè)分配空間大于釋放空間，就有可能永遠(yuǎn)退不出這個(gè)循環(huán)。
    // 下面的代碼會(huì)同時(shí)計(jì)算dbDelete() 釋放的內(nèi)存和propagateExpire() 分配空間的大小// propagateExpire(db,keyobj);
    // delta = (long long) zmalloc_used_memory();
    // dbDelete(db,keyobj);
    // delta -= (long long) zmalloc_used_memory();
    // mem_freed += delta;
    /////////////////////////////////////////
/* We compute the amount of memory freed by dbDelete() alone.
* It is possible that actually the memory needed to propagate
* the DEL in AOF and replication link is greater than the one
* we are freeing removing the key, but we can't account for
* that otherwise we would never exit the loop.
**
AOF and Output buffer memory will be freed eventually so
* we only care about memory used by the key space. */
// 只計(jì)算dbDelete() 釋放內(nèi)存的大小
    delta = (long long) zmalloc_used_memory();
    dbDelete(db,keyobj);
    delta -= (long long) zmalloc_used_memory();
    mem_freed += delta;
    server.stat_evictedkeys++;
    // 將數(shù)據(jù)的刪除通知所有的訂閱客戶端
    notifyKeyspaceEvent(REDIS_NOTIFY_EVICTED, "evicted",
    keyobj, db->id);
    decrRefCount(keyobj);
    keys_freed++;
    // 將從機(jī)回復(fù)空間中的數(shù)據(jù)及時(shí)發(fā)送給從機(jī)
/* When the memory to free starts to be big enough, we may
* start spending so much time here that is impossible to
* deliver data to the slaves fast enough, so we force the
* transmission here inside the loop. */
    if (slaves) flushSlavesOutputBuffers();
    }
}
    // 未能釋放空間，且此時(shí)redis 使用的內(nèi)存大小依舊超額，失敗返回
    if (!keys_freed) return REDIS_ERR; /* nothing to free... */
    }
    return REDIS_OK;
}

上一篇：Redis 數(shù)據(jù)結(jié)構(gòu) ziplist下一篇：Redis 與 Lua 腳本