Author: Wu Haisheng, engineer at 视线金服
First published on the Nebula Graph forum: https://discuss.nebula-graph.com.cn/t/topic/1388
1. Preface
This article describes how to deploy a Nebula Graph cluster with Docker Swarm, together with client-side load balancing and high availability.
2. Setting Up the Nebula Cluster
2.1 Environment Preparation
Machines:
192.168.1.166 (KF2-DATA-166)
192.168.1.167 (KF2-DATA-167)
192.168.1.168 (KF2-DATA-168)
Make sure Docker is installed on all machines before starting.
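A quick sanity check before initializing the Swarm (a minimal sketch, not from the original post; it only assumes the Docker CLI is on the PATH):

# Run on each of 192.168.1.166 / 192.168.1.167 / 192.168.1.168
docker --version                                                            # the nodes in this setup run 18.06.1-ce
docker info --format '{{.ServerVersion}} swarm={{.Swarm.LocalNodeState}}'   # daemon reachable; Swarm state is "inactive" before section 2.2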
2.2 Initialize the Swarm Cluster
Run the following on 192.168.1.166:
$ docker swarm init --advertise-addr 192.168.1.166
Swarm initialized: current node (dxn1zf6l61qsb1josjja83ngz) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join \
    --token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c \
    192.168.1.166:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
2.3 Join the Worker Nodes
Following the hint printed by the init command, join the Swarm as workers by running the following on 192.168.1.167 and 192.168.1.168 respectively:
docker swarm join \
  --token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c \
  192.168.1.166:2377
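If the join token from the init output is no longer at hand, it can be printed again at any time; this is a standard Docker Swarm command, not specific to this deployment:

# On the manager node 192.168.1.166
docker swarm join-token worker    # re-prints the full "docker swarm join ..." command for workers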
2.4 Verify the Cluster
docker node ls

ID                            HOSTNAME       STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
h0az2wzqetpwhl9ybu76yxaen *   KF2-DATA-166   Ready    Active         Reachable        18.06.1-ce
q6jripaolxsl7xqv3cmv5pxji     KF2-DATA-167   Ready    Active         Leader           18.06.1-ce
h1iql1uvm7123h3gon9so69dy     KF2-DATA-168   Ready    Active                          18.06.1-ce
2.5 Configure the Docker Stack
vi docker-stack.yml
Add the following content:
version: '3.6'
services:
  metad0:
    image: vesoft/nebula-metad:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --local_ip=192.168.1.166
      - --ws_ip=192.168.1.166
      - --port=45500
      - --data_path=/data/meta
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-166
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.166:11000/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 11000
        published: 11000
        protocol: tcp
        mode: host
      - target: 11002
        published: 11002
        protocol: tcp
        mode: host
      - target: 45500
        published: 45500
        protocol: tcp
        mode: host
    volumes:
      - data-metad0:/data/meta
      - logs-metad0:/logs
    networks:
      - nebula-net

  metad1:
    image: vesoft/nebula-metad:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --local_ip=192.168.1.167
      - --ws_ip=192.168.1.167
      - --port=45500
      - --data_path=/data/meta
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-167
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.167:11000/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 11000
        published: 11000
        protocol: tcp
        mode: host
      - target: 11002
        published: 11002
        protocol: tcp
        mode: host
      - target: 45500
        published: 45500
        protocol: tcp
        mode: host
    volumes:
      - data-metad1:/data/meta
      - logs-metad1:/logs
    networks:
      - nebula-net

  metad2:
    image: vesoft/nebula-metad:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --local_ip=192.168.1.168
      - --ws_ip=192.168.1.168
      - --port=45500
      - --data_path=/data/meta
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-168
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.168:11000/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 11000
        published: 11000
        protocol: tcp
        mode: host
      - target: 11002
        published: 11002
        protocol: tcp
        mode: host
      - target: 45500
        published: 45500
        protocol: tcp
        mode: host
    volumes:
      - data-metad2:/data/meta
      - logs-metad2:/logs
    networks:
      - nebula-net

  storaged0:
    image: vesoft/nebula-storaged:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --local_ip=192.168.1.166
      - --ws_ip=192.168.1.166
      - --port=44500
      - --data_path=/data/storage
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-166
    depends_on:
      - metad0
      - metad1
      - metad2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.166:12000/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 12000
        published: 12000
        protocol: tcp
        mode: host
      - target: 12002
        published: 12002
        protocol: tcp
        mode: host
    volumes:
      - data-storaged0:/data/storage
      - logs-storaged0:/logs
    networks:
      - nebula-net

  storaged1:
    image: vesoft/nebula-storaged:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --local_ip=192.168.1.167
      - --ws_ip=192.168.1.167
      - --port=44500
      - --data_path=/data/storage
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-167
    depends_on:
      - metad0
      - metad1
      - metad2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.167:12000/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 12000
        published: 12000
        protocol: tcp
        mode: host
      - target: 12002
        published: 12004
        protocol: tcp
        mode: host
    volumes:
      - data-storaged1:/data/storage
      - logs-storaged1:/logs
    networks:
      - nebula-net

  storaged2:
    image: vesoft/nebula-storaged:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --local_ip=192.168.1.168
      - --ws_ip=192.168.1.168
      - --port=44500
      - --data_path=/data/storage
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-168
    depends_on:
      - metad0
      - metad1
      - metad2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.168:12000/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 12000
        published: 12000
        protocol: tcp
        mode: host
      - target: 12002
        published: 12006
        protocol: tcp
        mode: host
    volumes:
      - data-storaged2:/data/storage
      - logs-storaged2:/logs
    networks:
      - nebula-net

  graphd1:
    image: vesoft/nebula-graphd:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --port=3699
      - --ws_ip=192.168.1.166
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-166
    depends_on:
      - metad0
      - metad1
      - metad2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.166:13000/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 3699
        published: 3699
        protocol: tcp
        mode: host
      - target: 13000
        published: 13000
        protocol: tcp
#        mode: host
      - target: 13002
        published: 13002
        protocol: tcp
        mode: host
    volumes:
      - logs-graphd:/logs
    networks:
      - nebula-net

  graphd2:
    image: vesoft/nebula-graphd:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --port=3699
      - --ws_ip=192.168.1.167
      - --log_dir=/logs
      - --v=2
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-167
    depends_on:
      - metad0
      - metad1
      - metad2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.167:13001/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 3699
        published: 3640
        protocol: tcp
        mode: host
      - target: 13000
        published: 13001
        protocol: tcp
        mode: host
      - target: 13002
        published: 13003
        protocol: tcp
#        mode: host
    volumes:
      - logs-graphd2:/logs
    networks:
      - nebula-net

  graphd3:
    image: vesoft/nebula-graphd:nightly
    env_file:
      - ./nebula.env
    command:
      - --meta_server_addrs=192.168.1.166:45500,192.168.1.167:45500,192.168.1.168:45500
      - --port=3699
      - --ws_ip=192.168.1.168
      - --log_dir=/logs
      - --v=0
      - --minloglevel=2
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == KF2-DATA-168
    depends_on:
      - metad0
      - metad1
      - metad2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://192.168.1.168:13002/status"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    ports:
      - target: 3699
        published: 3641
        protocol: tcp
        mode: host
      - target: 13000
        published: 13002
        protocol: tcp
#        mode: host
      - target: 13002
        published: 13004
        protocol: tcp
        mode: host
    volumes:
      - logs-graphd3:/logs
    networks:
      - nebula-net

networks:
  nebula-net:
    external: true
    attachable: true
    name: host

volumes:
  data-metad0:
  logs-metad0:
  data-metad1:
  logs-metad1:
  data-metad2:
  logs-metad2:
  data-storaged0:
  logs-storaged0:
  data-storaged1:
  logs-storaged1:
  data-storaged2:
  logs-storaged2:
  logs-graphd:
  logs-graphd2:
  logs-graphd3:
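Every service above is pinned with a node.hostname constraint, so it is worth confirming that the Swarm really reports the hostnames KF2-DATA-166/167/168 before deploying; a mismatch leaves tasks stuck in Pending. A minimal check (assumes nothing beyond the cluster built in sections 2.2-2.4):

# On the manager node
docker node ls --format 'ID={{.ID}}  Hostname={{.Hostname}}  Status={{.Status}}'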
Edit nebula.env and add the following content:
TZ=UTC
USER=root
2.6 Start the Nebula Cluster
docker stack deploy nebula -c docker-stack.yml
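Once deployed, the stack can be watched with the usual Swarm commands, and the HTTP status ports wired into the healthchecks above (11000 for metad, 12000 for storaged, 13000 for graphd) give a quick liveness probe. A sketch:

docker stack services nebula                 # REPLICAS should read 1/1 for all nine services
docker stack ps nebula                       # per-task state and the node each task landed on
curl -s http://192.168.1.166:11000/status    # metad0 status endpoint (same URL as its healthcheck)
curl -s http://192.168.1.166:13000/status    # graphd1 status endpoint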
3. Cluster Load Balancing and High Availability
The Nebula Graph client (as of 1.x) provides no load balancing of its own; it simply picks a graphd at random to connect to. For production use you therefore need to add load balancing and high availability yourself.
(Figure 3.1: the three-layer deployment architecture)
The deployment is split into three layers, as shown in Figure 3.1: a data service layer, a load balancing layer and a high availability layer.
Data service layer: the Nebula Graph cluster deployed above.
Load balancing layer: balances client requests and distributes them to the data service layer below.
High availability layer: makes HAProxy itself highly available, keeping the load balancing layer up and therefore the whole cluster serving normally.
3.1 Load Balancing Configuration
HAProxy is set up with docker-compose. Create the following three files.
Dockerfile, with the following content:
FROM haproxy:1.7
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg
EXPOSE 3640
docker-compose.yml, with the following content:
version: "3.2"services: haproxy: container_name: haproxy build: . volumes: - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg ports: - 3640:3640 restart: always networks: - app_netnetworks: app_net: external: true
haproxy.cfg, with the following content:
global
    daemon
    maxconn 30000
    log 127.0.0.1 local0 info
    log 127.0.0.1 local1 warning

defaults
    log-format %hr\ %ST\ %B\ %Ts
    log global
    mode http
    option http-keep-alive
    timeout connect 5000ms
    timeout client 10000ms
    timeout server 50000ms
    timeout http-request 20000ms

# custom your own frontends && backends && listen conf
# CUSTOM
listen graphd-cluster
    bind *:3640
    mode tcp
    maxconn 300
    balance roundrobin
    server server1 192.168.1.166:3699 maxconn 300 check
    server server2 192.168.1.167:3699 maxconn 300 check
    server server3 192.168.1.168:3699 maxconn 300 check

listen stats
    bind *:1080
    stats refresh 30s
    stats uri /stats
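Before building the image, it can help to run HAProxy's own syntax check against this file, using the same haproxy:1.7 image as the Dockerfile (the bind-mount path here is just this example's choice):

docker run --rm \
  -v $(pwd)/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro \
  haproxy:1.7 haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg
# prints "Configuration file is valid" when the check passes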
3.2 Start HAProxy
Run the following on each of the three machines (the keepalived health check in 3.3 looks for a local haproxy process):
docker-compose up -d
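To confirm HAProxy is up and listening, check the container and the published port (a minimal check that assumes netcat is installed; the stats page on 1080 is not published by the docker-compose.yml above, so it is only reachable from inside the container network):

docker-compose ps                     # the haproxy container should be Up
docker-compose logs --tail=20 haproxy
nc -zv 192.168.1.166 3640             # adjust the IP to the machine you are checking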
3.3 高可用配置
注:配置 keepalive 需事后筹备好 vip(虚构 ip),在以下配置中 192.168.1.99 便为虚构 ip
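An optional sanity check before handing the VIP to keepalived is to make sure nothing on the subnet already answers on 192.168.1.99 (expect 100% packet loss if the address is free):

ping -c 2 192.168.1.99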
Apply the following configuration on each of 192.168.1.166, 192.168.1.167 and 192.168.1.168.
Install keepalived:
apt-get update && apt-get upgrade && apt-get install keepalived -y
Edit the keepalived configuration file /etc/keepalived/keepalived.conf
(Configure all three machines as follows; priority must be set to a different value on each to establish precedence.)
Configuration on 192.168.1.166:
global_defs {
    router_id lb01          # an identifier, just a name
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 52
    priority 999
    # interval, in seconds, between sync checks of the MASTER and BACKUP load balancers
    advert_int 1
    # authentication type and password
    authentication {
        # authentication type, mainly PASS or AH
        auth_type PASS
        # authentication password; within the same vrrp_instance, MASTER and BACKUP must use the same password to communicate
        auth_pass amber1
    }
    virtual_ipaddress {
        # virtual IP 192.168.1.99/24, bound to interface ens160, alias ens160:1, identical on master and backups
        192.168.1.99/24 dev ens160 label ens160:1
    }
    track_script {
        chk_haproxy
    }
}
Configuration on 192.168.1.167:

global_defs {
    router_id lb01          # an identifier, just a name
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 52
    priority 888
    # interval, in seconds, between sync checks of the MASTER and BACKUP load balancers
    advert_int 1
    # authentication type and password
    authentication {
        # authentication type, mainly PASS or AH
        auth_type PASS
        # authentication password; within the same vrrp_instance, MASTER and BACKUP must use the same password to communicate
        auth_pass amber1
    }
    virtual_ipaddress {
        # virtual IP 192.168.1.99/24, bound to interface ens160, alias ens160:1, identical on master and backups
        192.168.1.99/24 dev ens160 label ens160:1
    }
    track_script {
        chk_haproxy
    }
}
Configuration on 192.168.1.168:

global_defs {
    router_id lb01          # an identifier, just a name
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 52
    priority 777
    # interval, in seconds, between sync checks of the MASTER and BACKUP load balancers
    advert_int 1
    # authentication type and password
    authentication {
        # authentication type, mainly PASS or AH
        auth_type PASS
        # authentication password; within the same vrrp_instance, MASTER and BACKUP must use the same password to communicate
        auth_pass amber1
    }
    virtual_ipaddress {
        # virtual IP 192.168.1.99/24, bound to interface ens160, alias ens160:1, identical on master and backups
        192.168.1.99/24 dev ens160 label ens160:1
    }
    track_script {
        chk_haproxy
    }
}
Keepalived commands:

# start keepalived
systemctl start keepalived
# enable keepalived at boot
systemctl enable keepalived
# restart keepalived
systemctl restart keepalived
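With keepalived running on all three machines, the VIP should sit on the node with the highest priority (192.168.1.166, priority 999) and move when its HAProxy disappears. A minimal verification, assuming the ens160 interface name used in the configurations above:

ip addr show ens160        # on the MASTER, the ens160:1 alias should carry 192.168.1.99/24
nc -zv 192.168.1.99 3640   # the VIP should answer on HAProxy's port from any machine

# Simulate a failure on the MASTER and watch the VIP fail over:
docker stop haproxy        # chk_haproxy ("killall -0 haproxy") starts failing on this node
ip addr show ens160        # run on 192.168.1.167: the VIP should now be here
docker start haproxy       # restore; with default preemption the VIP returns to the higher-priority node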
4. Miscellaneous
How do you deploy offline? Just point the images at a private registry and you are done. If you run into problems, feel free to get in touch.
If you have any questions about this article, feel free to discuss them with the original author on the forum. Original post: https://discuss.nebula-graph.com.cn/t/topic/1388