前言
近段时间在理解分布式时,常常绕不开一个算法: 一致性哈希算法。于是在理解并实际这个算法后,就有了此文章。
算法间的比照
在分布式分片中,存在着几种算法: 取模,分段,一致性 hash。
取模 | 分段 | 一致性哈希 | |
---|---|---|---|
下层是否感知 | 是 | 是 | 否 |
迁徙老本 | 高 | 高 | 低,只波及相邻节点 |
单点故障影响 | 高 | 高 | 低,只影响相邻节点 |
算法复杂度 | 低 | 低 | 高 |
热点数据 | 存在 | 存在 | 存在 |
一致性哈希次要解决问题
从上述比照可知,一致性哈希次要升高节点高低线中带来的数据迁徙老本,同时节点数量的变动与分片准则于下层无感,使下层更专一于畛域内逻辑的编写,使整体架构更加灵便。
一致性 hash 原理
- 根本数据结构
根本数据类型因人而已,惯例的哈希取模采纳大多采纳将数据 hash 到 2^32 - 1的空间中,一致性哈希通常在此基础上将空间变成一个环。如下图所示。
本次实现采纳的是 key 按大小排列的哈希表。原打算应用数组承接数据,但排序老本随 key 的增多而加大,遂放弃。
- 数据存储
数据存储与哈希取模算法大致相同,都是通过计算存入数据的哈希值放入对应的哈希槽上。但一致性哈希差别之处在于当计算 hash 不在环上,数据存入首个 hash 槽中。
场景假如: 现已上线 4 节点(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有5个数据(hash1 ~ 5)于存入节点中,后果如下图所示。
本次实现采纳的思路是
1. 计算存入数据的 hash 值2. 寻找最近的(比数据 hash 值大的最小的节点 hash)节点并写入3. 若 2 中未能寻找服务器,则写入第一个(hash 最小)节点中
- 节点上线
新节点退出一致性哈希环中,原理是通过计算节点所代表的 hash 值,并依据计算值将节点映射在环上,最初迁徙相邻节点数据到新节点上。
场景假如: 现已上线 4 台服务器(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有一个新节点(hash5)节点上线到环上。后果如下图所示。
本次实现采纳的思路是
1. 计算上线节点 hash 值2. 计算上线节点所新增的虚构节点的 hash 值(若初始化指定虚构节点数量)3. 寻找最近的(比上线节点与虚构节点 hash 值大的最小的节点 hash)节点,取出节点数据4. 将1 2点节点退出到 hash 环中5. 将 3 中取出的数据从新放入到 hash 环上
- 节点下线
已有节点下线,原理是通过计算节点所代表的 hash 值,取出节点所含数据,下线节点,将取出数据从新放入 hash 环上。
场景假如: 现已上线 5 台服务器(server1 ~ 5),对应 hash 值为 hash1 ~ 5。现节点 server4 下线。后果如下图所示。
本次实现采纳的思路是
1. 计算下线节点 hash 值2. 取出下线节点以及虚构节点(若初始化指定虚构节点数量)存储数据3. 将下线节点以及虚构节点(若初始化指定虚构节点数量)从 hash 环上移除4. 将 2 中数据从新放入到环上
代码实现
一致性哈希分为两个计划: 不带虚构节点与带虚构节点。而两个计划实现相似,所以本次实现将两种计划合在一起实现。实现如下。
package org.CommonAlgorithms.ConsistentHash;import org.CommonAlgorithms.HashAlgorithm.HashService;import java.util.List;/** * consistentHashing interface * @author cartoon * @since 10/01/2021 * @version 1.1 */public interface ConsistentHashing { /** * put data to hash loop * @param data data list * @return put result */ boolean putData(List<String> data); /** * put data to hash loop * @param data data * @return put result */ boolean putData(String data); /** * remove node from hash loop * @param node removing node * @return remove result */ boolean removeNode(String node); /** * add node to hash loop * @param node adding node * @return add result */ boolean addNode(String node); /** * inject hash method to hash loop * @param hashService hash method * @throws UnsupportedOperationException if loop already has node */ void setHashMethod(HashService hashService); /** * print all data in loop according ascending hash value with nodes */ void printAllData();}
package org.CommonAlgorithms.ConsistentHash;import org.CommonAlgorithms.HashAlgorithm.HashService;import org.slf4j.Logger;import org.slf4j.LoggerFactory;import java.util.*;/** * consistent hash achieve * @author cartoon * @since 2021/01/17 */public class ConsistentHashingImpl implements ConsistentHashing { private static final Logger log = LoggerFactory.getLogger(ConsistentHashingImpl.class); /** * virtual node name template */ private static final String virtualNodeFormat = "%s&&%d"; /** * real node and its virtual node mapping */ private SortedMap<String, List<String>> realNodeToVirtualNode; /** * hash and its node mapping */ private SortedMap<Integer, String> hashToNodes; /** * node and its data mapping */ private Map<String, List<String>> nodeToData; /** * determine virtual node's number of each node */ private int virtualNodeNum; /** * inject hash method, if null, use loop default hash method */ private HashService hashService; public ConsistentHashingImpl() { this(0, new String[0]); } public ConsistentHashingImpl(String... nodes) { this(0, nodes); } public ConsistentHashingImpl(int virtualNodeNum) { this(virtualNodeNum, new String[0]); } public ConsistentHashingImpl(int virtualNodeNum, String... nodes) { //1. intercept virtual num smaller than 0 if(virtualNodeNum < 0){ log.error("virtual num is not allow smaller than 0"); throw new IllegalArgumentException(); } //2. initialize loop member attributes this.virtualNodeNum = virtualNodeNum; realNodeToVirtualNode = new TreeMap<>(); hashToNodes = new TreeMap<>(); nodeToData = new HashMap<>(); for(String server : nodes){ hashToNodes.put(getHash(server), server); nodeToData.put(server, new LinkedList<>()); } //3. if virtual node number bigger than 0, add virtual node if(virtualNodeNum > 0){ for(String server : nodes){ addVirtualNode(server); } } } @Override public boolean putData(List<String> data) { //1. circulate call put data method to add data to loop for(String incomingData : data){ if(!putData(incomingData)){ return false; } } return true; } @Override public boolean putData(String data) { if(hashToNodes.isEmpty()){ log.error("put data, usable server is empty"); return false; } //1. calculate data's hash value int currentHash = getHash(data); //2. get usual node(node's hash value is bigger than data's hash value), if usual node list is empty, get first node in loop SortedMap<Integer, String> usableNodes = hashToNodes.tailMap(currentHash); String node = usableNodes.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : usableNodes.get(usableNodes.firstKey()); //3. add data to node List<String> dataList = nodeToData.get(node); dataList.add(data); log.info("put data, data {} is placed to server {}, hash: {}", data, node, currentHash); return true; } @Override public boolean removeNode(String node) { //1. calculate hash value of removing node int removeServerHash = getHash(node); if(!hashToNodes.containsKey(removeServerHash)){ log.error("remove server, current server is not in server list, please check server ip"); return false; } //2. get data from removing node List<String> removeServerData = nodeToData.get(node); //3. get removing node's virtual node data, remove all virtual node with removing node if(virtualNodeNum != 0){ for(String virtualNode : realNodeToVirtualNode.get(node)){ removeServerData.addAll(nodeToData.get(virtualNode)); hashToNodes.remove(getHash(virtualNode)); nodeToData.remove(virtualNode); } } //4. remove node from hash loop hashToNodes.remove(removeServerHash); nodeToData.remove(node); if(hashToNodes.size() == 0){ log.info("remove server, after remove, server list is empty"); return true; } //5. put data to loop by call put data method putData(removeServerData); log.info("remove server, remove server {} success", node); return true; } @Override public boolean addNode(String node) { //1, calculate adding node's hash value int addServerHash = getHash(node); //2. add node and migrate data if(hashToNodes.isEmpty()){ //2.1 add node and its virtual node to loop directly when current loop is empty hashToNodes.put(addServerHash, node); nodeToData.put(node, new LinkedList<>()); if(virtualNodeNum > 0){ addVirtualNode(node); } } else{ //2.2.1 get data to be migrated from loop SortedMap<Integer, String> greatServers = hashToNodes.tailMap(addServerHash); String greatServer = greatServers.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : greatServers.get(greatServers.firstKey()); List<String> firstGreatServerData = new LinkedList<>(nodeToData.get(greatServer)); //2.2.2 add node and its virtual node to loop hashToNodes.put(addServerHash, node); nodeToData.put(greatServer, new LinkedList<>()); nodeToData.put(node, new LinkedList<>()); if(virtualNodeNum != 0){ addVirtualNode(node); } //2.2.3 migrate 2.2.1 data to loop by call put data method putData(firstGreatServerData); } log.info("add server, server {} has been added", node); return true; } @Override public void printAllData() { nodeToData.forEach((server, data) -> log.info("server {} contains data {}", server, data)); } @Override public void setHashMethod(HashService hashService) { if(!hashToNodes.isEmpty()){ throw new UnsupportedOperationException(); } this.hashService = hashService; } private void addVirtualNode(String realNode){ if(virtualNodeNum > 0){ List<String> virtualNodeList = new LinkedList<>(); for(int cnt = 0; cnt < this.virtualNodeNum; cnt++){ //1. generate virtual node name by default format String virtualNodeName = String.format(virtualNodeFormat, realNode, cnt); //2. calculate each virtual node's hash value int virtualNodeHash = getHash(virtualNodeName); //3. current node already exist in loop, continue if(hashToNodes.containsKey(virtualNodeHash)){ continue; } //4. add node to loop virtualNodeList.add(virtualNodeName); hashToNodes.put(virtualNodeHash, virtualNodeName); nodeToData.put(virtualNodeName, new LinkedList<>()); } //5. map virtual node to real node realNodeToVirtualNode.put(realNode, virtualNodeList); } } private int getHash(String data){ return hashService == null ? defaultGetHash(data) : hashService.getHash(data); } private int defaultGetHash(String data){ int res = 0; for(char tempChar : data.toCharArray()){ if(tempChar >= '0' && tempChar <= '9'){ res += tempChar; } } return res; }}
测试后果
不带虚构节点的一致性哈希
测试代码
package ConsistentHash;import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;import org.junit.Assert;import org.junit.Before;import org.junit.Test;import org.slf4j.Logger;import org.slf4j.LoggerFactory;/** * @author cartoon * @date 2020/12/27 */public class ConsistentHashingWithoutVirtualNodeTest { private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithoutVirtualNodeTest.class); private ConsistentHashing consistentHashing; private String[] servers; private String[] data; @Before public void before(){ servers = new String[]{"000", "111", "222", "333", "555"}; consistentHashing = new ConsistentHashingImpl(servers); data = new String[]{"000", "111", "222", "333", "555"}; } @Test public void testConsistentHashing(){ for(String str : data){ Assert.assertTrue(consistentHashing.putData(str)); } consistentHashing.removeNode("333"); consistentHashing.addNode("444"); consistentHashing.putData("444"); consistentHashing.printAllData(); }}
测试后果
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]
含虚构节点的一致性哈希测试
测试代码
package ConsistentHash;import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;import org.junit.Assert;import org.junit.Before;import org.junit.Test;import org.slf4j.Logger;import org.slf4j.LoggerFactory;/** * @author cartoon * @date 2021/01/17 */public class ConsistentHashingWithVirtualNodeTest { private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithVirtualNodeTest.class); private ConsistentHashing consistentHashing; private String[] servers; private String[] data; @Before public void before(){ servers = new String[]{"000", "111", "222", "333", "555"}; consistentHashing = new ConsistentHashingImpl(3, servers); data = new String[]{"000", "111", "222", "333", "555"}; } @Test public void testConsistentHashing(){ for(String str : data){ Assert.assertTrue(consistentHashing.putData(str)); } consistentHashing.removeNode("333"); consistentHashing.addNode("444"); consistentHashing.putData("444"); consistentHashing.putData("555&&0"); consistentHashing.printAllData(); }}
测试后果
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555&&0 is placed to server 555&&0, hash: 207[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&0 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&0 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&2 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&0 contains data [555&&0][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&0 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&1 contains data [][main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&0 contains data []
实现存在的缺点
1. 哈希算法过于简略,哈希抵触概率较大2. 实在节点含有虚构节点的数量不均3. 节点上线时实在节点与已存在的虚构节点的程序抵触尚未解决
后记
本次实现的所有代码已全副上传到 github,我的项目次要蕴含一些罕用的算法,如排序算法,限流算法的简略实现,欢送提 issue。