前言

近段时间在理解分布式时,常常绕不开一个算法: 一致性哈希算法。于是在理解并实际这个算法后,就有了此文章。

算法间的比照

在分布式分片中,存在着几种算法: 取模,分段,一致性 hash。

取模分段一致性哈希
下层是否感知
迁徙老本低,只波及相邻节点
单点故障影响低,只影响相邻节点
算法复杂度
热点数据存在存在存在

一致性哈希次要解决问题

从上述比照可知,一致性哈希次要升高节点高低线中带来的数据迁徙老本,同时节点数量的变动与分片准则于下层无感,使下层更专一于畛域内逻辑的编写,使整体架构更加灵便。

一致性 hash 原理

  1. 根本数据结构

根本数据类型因人而已,惯例的哈希取模采纳大多采纳将数据 hash 到 2^32 - 1的空间中,一致性哈希通常在此基础上将空间变成一个环。如下图所示。

本次实现采纳的是 key 按大小排列的哈希表。原打算应用数组承接数据,但排序老本随 key 的增多而加大,遂放弃。

  1. 数据存储

    数据存储与哈希取模算法大致相同,都是通过计算存入数据的哈希值放入对应的哈希槽上。但一致性哈希差别之处在于当计算 hash 不在环上,数据存入首个 hash 槽中。

    场景假如: 现已上线 4 节点(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有5个数据(hash1 ~ 5)于存入节点中,后果如下图所示。

本次实现采纳的思路是

1. 计算存入数据的 hash 值2. 寻找最近的(比数据 hash 值大的最小的节点 hash)节点并写入3. 若 2 中未能寻找服务器,则写入第一个(hash 最小)节点中
  1. 节点上线

新节点退出一致性哈希环中,原理是通过计算节点所代表的 hash 值,并依据计算值将节点映射在环上,最初迁徙相邻节点数据到新节点上。

场景假如: 现已上线 4 台服务器(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有一个新节点(hash5)节点上线到环上。后果如下图所示。

本次实现采纳的思路是

1. 计算上线节点 hash 值2. 计算上线节点所新增的虚构节点的 hash 值(若初始化指定虚构节点数量)3. 寻找最近的(比上线节点与虚构节点 hash 值大的最小的节点 hash)节点,取出节点数据4. 将1 2点节点退出到 hash 环中5. 将 3 中取出的数据从新放入到 hash 环上
  1. 节点下线

已有节点下线,原理是通过计算节点所代表的 hash 值,取出节点所含数据,下线节点,将取出数据从新放入 hash 环上。

场景假如: 现已上线 5 台服务器(server1 ~ 5),对应 hash 值为 hash1 ~ 5。现节点 server4 下线。后果如下图所示。

本次实现采纳的思路是

1. 计算下线节点 hash 值2. 取出下线节点以及虚构节点(若初始化指定虚构节点数量)存储数据3. 将下线节点以及虚构节点(若初始化指定虚构节点数量)从 hash 环上移除4. 将 2 中数据从新放入到环上

代码实现

一致性哈希分为两个计划: 不带虚构节点与带虚构节点。而两个计划实现相似,所以本次实现将两种计划合在一起实现。实现如下。

package org.CommonAlgorithms.ConsistentHash;import org.CommonAlgorithms.HashAlgorithm.HashService;import java.util.List;/** * consistentHashing interface * @author cartoon * @since  10/01/2021 * @version 1.1 */public interface ConsistentHashing {    /**     * put data to hash loop     * @param data data list     * @return put result     */    boolean putData(List<String> data);    /**     * put data to hash loop     * @param data data     * @return put result     */    boolean putData(String data);    /**     * remove node from hash loop     * @param node removing node     * @return remove result     */    boolean removeNode(String node);    /**     * add node to hash loop     * @param node adding node     * @return add result     */    boolean addNode(String node);    /**     * inject hash method to hash loop     * @param hashService hash method     * @throws UnsupportedOperationException if loop already has node     */    void setHashMethod(HashService hashService);    /**     * print all data in loop according ascending hash value with nodes     */    void printAllData();}
package org.CommonAlgorithms.ConsistentHash;import org.CommonAlgorithms.HashAlgorithm.HashService;import org.slf4j.Logger;import org.slf4j.LoggerFactory;import java.util.*;/** * consistent hash achieve * @author cartoon * @since  2021/01/17 */public class ConsistentHashingImpl implements ConsistentHashing {    private static final Logger log = LoggerFactory.getLogger(ConsistentHashingImpl.class);    /**     * virtual node name template     */    private static final String virtualNodeFormat = "%s&&%d";    /**     * real node and its virtual node mapping     */    private SortedMap<String, List<String>> realNodeToVirtualNode;    /**     * hash and its node mapping     */    private SortedMap<Integer, String> hashToNodes;    /**     * node and its data mapping     */    private Map<String, List<String>> nodeToData;    /**     * determine virtual node's number of each node     */    private int virtualNodeNum;    /**     * inject hash method, if null, use loop default hash method     */    private HashService hashService;    public ConsistentHashingImpl() {        this(0, new String[0]);    }    public ConsistentHashingImpl(String... nodes) {        this(0, nodes);    }    public ConsistentHashingImpl(int virtualNodeNum) {        this(virtualNodeNum, new String[0]);    }    public ConsistentHashingImpl(int virtualNodeNum, String... nodes) {        //1. intercept virtual num smaller than 0        if(virtualNodeNum < 0){            log.error("virtual num is not allow smaller than 0");            throw new IllegalArgumentException();        }        //2. initialize loop member attributes        this.virtualNodeNum = virtualNodeNum;        realNodeToVirtualNode = new TreeMap<>();        hashToNodes = new TreeMap<>();        nodeToData = new HashMap<>();        for(String server : nodes){            hashToNodes.put(getHash(server), server);            nodeToData.put(server, new LinkedList<>());        }        //3. if virtual node number bigger than 0, add virtual node        if(virtualNodeNum > 0){            for(String server : nodes){                addVirtualNode(server);            }        }    }    @Override    public boolean putData(List<String> data) {        //1. circulate call put data method to add data to loop        for(String incomingData : data){            if(!putData(incomingData)){                return false;            }        }        return true;    }    @Override    public boolean putData(String data) {        if(hashToNodes.isEmpty()){            log.error("put data, usable server is empty");            return false;        }        //1. calculate data's hash value        int currentHash = getHash(data);        //2. get usual node(node's hash value is bigger than data's hash value), if usual node list is empty, get first node in loop        SortedMap<Integer, String> usableNodes = hashToNodes.tailMap(currentHash);        String node = usableNodes.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : usableNodes.get(usableNodes.firstKey());        //3. add data to node        List<String> dataList = nodeToData.get(node);        dataList.add(data);        log.info("put data, data {} is placed to server {}, hash: {}", data, node, currentHash);        return true;    }    @Override    public boolean removeNode(String node) {        //1. calculate hash value of removing node        int removeServerHash = getHash(node);        if(!hashToNodes.containsKey(removeServerHash)){            log.error("remove server, current server is not in server list, please check server ip");            return false;        }        //2. get data from removing node        List<String> removeServerData = nodeToData.get(node);        //3. get removing node's virtual node data, remove all virtual node with removing node        if(virtualNodeNum != 0){            for(String virtualNode : realNodeToVirtualNode.get(node)){                removeServerData.addAll(nodeToData.get(virtualNode));                hashToNodes.remove(getHash(virtualNode));                nodeToData.remove(virtualNode);            }        }        //4. remove node from hash loop        hashToNodes.remove(removeServerHash);        nodeToData.remove(node);        if(hashToNodes.size() == 0){            log.info("remove server, after remove, server list is empty");            return true;        }        //5. put data to loop by call put data method        putData(removeServerData);        log.info("remove server, remove server {} success", node);        return true;    }    @Override    public boolean addNode(String node) {        //1, calculate adding node's hash value        int addServerHash = getHash(node);        //2. add node and migrate data        if(hashToNodes.isEmpty()){            //2.1 add node and its virtual node to loop directly when current loop is empty            hashToNodes.put(addServerHash, node);            nodeToData.put(node, new LinkedList<>());            if(virtualNodeNum > 0){                addVirtualNode(node);            }        } else{            //2.2.1 get data to be migrated from loop            SortedMap<Integer, String> greatServers = hashToNodes.tailMap(addServerHash);            String greatServer = greatServers.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : greatServers.get(greatServers.firstKey());            List<String> firstGreatServerData = new LinkedList<>(nodeToData.get(greatServer));            //2.2.2 add node and its virtual node to loop            hashToNodes.put(addServerHash, node);            nodeToData.put(greatServer, new LinkedList<>());            nodeToData.put(node, new LinkedList<>());            if(virtualNodeNum != 0){                addVirtualNode(node);            }            //2.2.3 migrate 2.2.1 data to loop by call put data method            putData(firstGreatServerData);        }        log.info("add server, server {} has been added", node);        return true;    }    @Override    public void printAllData() {        nodeToData.forEach((server, data) -> log.info("server {} contains data {}", server, data));    }    @Override    public void setHashMethod(HashService hashService) {        if(!hashToNodes.isEmpty()){            throw new UnsupportedOperationException();        }        this.hashService = hashService;    }    private void addVirtualNode(String realNode){        if(virtualNodeNum > 0){            List<String> virtualNodeList = new LinkedList<>();            for(int cnt = 0; cnt < this.virtualNodeNum; cnt++){                //1. generate virtual node name by default format                String virtualNodeName = String.format(virtualNodeFormat, realNode, cnt);                //2. calculate each virtual node's hash value                int virtualNodeHash = getHash(virtualNodeName);                //3. current node already exist in loop, continue                if(hashToNodes.containsKey(virtualNodeHash)){                    continue;                }                //4. add node to loop                virtualNodeList.add(virtualNodeName);                hashToNodes.put(virtualNodeHash, virtualNodeName);                nodeToData.put(virtualNodeName, new LinkedList<>());            }            //5. map virtual node to real node            realNodeToVirtualNode.put(realNode, virtualNodeList);        }    }    private int getHash(String data){        return hashService == null ? defaultGetHash(data) : hashService.getHash(data);    }    private int defaultGetHash(String data){        int res = 0;        for(char tempChar : data.toCharArray()){            if(tempChar >= '0' && tempChar <= '9'){                res += tempChar;            }        }        return res;    }}

测试后果

不带虚构节点的一致性哈希
测试代码
package ConsistentHash;import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;import org.junit.Assert;import org.junit.Before;import org.junit.Test;import org.slf4j.Logger;import org.slf4j.LoggerFactory;/** * @author cartoon * @date 2020/12/27 */public class ConsistentHashingWithoutVirtualNodeTest {    private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithoutVirtualNodeTest.class);    private ConsistentHashing consistentHashing;    private String[] servers;    private String[] data;    @Before    public void before(){        servers = new String[]{"000", "111", "222", "333", "555"};        consistentHashing = new ConsistentHashingImpl(servers);        data = new String[]{"000", "111", "222", "333", "555"};    }    @Test    public void testConsistentHashing(){        for(String str : data){            Assert.assertTrue(consistentHashing.putData(str));        }        consistentHashing.removeNode("333");        consistentHashing.addNode("444");        consistentHashing.putData("444");        consistentHashing.printAllData();    }}
测试后果
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]
含虚构节点的一致性哈希测试
测试代码
package ConsistentHash;import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;import org.junit.Assert;import org.junit.Before;import org.junit.Test;import org.slf4j.Logger;import org.slf4j.LoggerFactory;/** * @author cartoon * @date 2021/01/17 */public class ConsistentHashingWithVirtualNodeTest {    private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithVirtualNodeTest.class);    private ConsistentHashing consistentHashing;    private String[] servers;    private String[] data;    @Before    public void before(){        servers = new String[]{"000", "111", "222", "333", "555"};        consistentHashing = new ConsistentHashingImpl(3, servers);        data = new String[]{"000", "111", "222", "333", "555"};    }    @Test    public void testConsistentHashing(){        for(String str : data){            Assert.assertTrue(consistentHashing.putData(str));        }        consistentHashing.removeNode("333");        consistentHashing.addNode("444");        consistentHashing.putData("444");        consistentHashing.putData("555&&0");        consistentHashing.printAllData();    }}
测试后果
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555&&0 is placed to server 555&&0, hash: 207[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&0 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&0 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&0 contains data [555&&0][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&0 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&0 contains data []

实现存在的缺点

1. 哈希算法过于简略,哈希抵触概率较大2. 实在节点含有虚构节点的数量不均3. 节点上线时实在节点与已存在的虚构节点的程序抵触尚未解决

后记

本次实现的所有代码已全副上传到 github,我的项目次要蕴含一些罕用的算法,如排序算法,限流算法的简略实现,欢送提 issue。