
TiDB Operator in Practice

K8s and TiDB are both active open-source projects, and TiDB Operator is a project for orchestrating and managing TiDB clusters on K8s. This article records in detail the process of deploying K8s and installing TiDB Operator, and will hopefully help readers who are just getting started.

1. Environment

Ubuntu 16.04
K8s 1.14.1

2. Installing K8s with Kubespray

Configure passwordless SSH login

apt-get install -y expect
  • vi /tmp/autocopy.exp
#!/usr/bin/expect

# usage: autocopy.exp user@host password
set timeout 30
set user_hostname [lindex $argv 0]
set password [lindex $argv 1]
spawn ssh-copy-id $user_hostname
expect {
    "(yes/no)?"
    {
        send "yes\n"
        expect "*assword:" {send "$password\n"}
    }
    "*assword:"
    {
        send "$password\n"
    }
}
expect eof
# add a newly added node's host key to known_hosts (replace addedip with the node IP)
ssh-keyscan addedip >> ~/.ssh/known_hosts

# generate a key pair without a passphrase
ssh-keygen -t rsa -P ''

for i in 10.0.0.{31,32,33,40,10,20,50}; do ssh-keyscan $i >> ~/.ssh/known_hosts; done

# copy the public key to each node; the second argument is that node's root password
/tmp/autocopy.exp root@addedip yourpassword
ssh-copy-id addedip

/tmp/autocopy.exp root@10.0.0.31 yourpassword
/tmp/autocopy.exp root@10.0.0.32 yourpassword
/tmp/autocopy.exp root@10.0.0.33 yourpassword
/tmp/autocopy.exp root@10.0.0.40 yourpassword
/tmp/autocopy.exp root@10.0.0.10 yourpassword
/tmp/autocopy.exp root@10.0.0.20 yourpassword
/tmp/autocopy.exp root@10.0.0.50 yourpassword
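Before handing the nodes over to Kubespray, it helps to confirm that key-based login actually works everywhere. A small check loop (a sketch, assuming the same seven node IPs as above; `BatchMode=yes` makes ssh fail instead of prompting):

```shell
# Confirm key-based login to every node; any node still asking for a
# password shows up as FAILED instead of hanging on a prompt.
nodes="10.0.0.31 10.0.0.32 10.0.0.33 10.0.0.40 10.0.0.10 10.0.0.20 10.0.0.50"
for i in $nodes; do
  if ssh -o BatchMode=yes -o ConnectTimeout=2 "root@$i" true 2>/dev/null; then
    echo "$i: key-based login OK"
  else
    echo "$i: key-based login FAILED"
  fi
done
```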

Configure Kubespray

pip install -r requirements.txt   # run inside the kubespray repository
cp -rfp inventory/sample inventory/mycluster
  • Edit inventory/mycluster/inventory.ini
# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
# ## We should set etcd_member_name for etcd cluster. The node that is not a etcd member do not need to set the value, or can set the empty string value.
[all]
# node1 ansible_host=95.54.0.12  # ip=10.3.0.1 etcd_member_name=etcd1
# node2 ansible_host=95.54.0.13  # ip=10.3.0.2 etcd_member_name=etcd2
# node3 ansible_host=95.54.0.14  # ip=10.3.0.3 etcd_member_name=etcd3
etcd1 ansible_host=10.0.0.31 etcd_member_name=etcd1
etcd2 ansible_host=10.0.0.32 etcd_member_name=etcd2
etcd3 ansible_host=10.0.0.33 etcd_member_name=etcd3
master1 ansible_host=10.0.0.40
node1 ansible_host=10.0.0.10
node2 ansible_host=10.0.0.20
node3 ansible_host=10.0.0.50

# ## configure a bastion host if your nodes are not directly reachable
# bastion ansible_host=x.x.x.x ansible_user=some_user

[kube-master]
master1

[etcd]
etcd1
etcd2
etcd3

[kube-node]
node1
node2
node3

[k8s-cluster:children]
kube-master
kube-node

Files and images required by the nodes

Some images cannot be pulled from inside mainland China, so they must first be downloaded through a proxy, pushed to a local registry or DockerHub, and the configuration files updated accordingly. A few components are hosted on https://storage.googleapis.com, so a new Nginx server is needed to distribute those files.

Set up an Nginx server

  • Install Docker and Docker Compose
  • Create ~/distribution/docker-compose.yml
  • Create the file directory and the Nginx configuration directory ~/distribution/conf.d/open_distribute.conf
  • Start the server
  • Download and upload the required files; for the exact version numbers, see the kubeadm_version, kube_version, and image_arch parameters in roles/download/defaults/main.yml
apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

add-apt-repository \
    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) \
    stable"

apt-get update

apt-get install docker-ce docker-ce-cli containerd.io

# download docker-compose first, then make it executable
sudo curl -L "https://github.com/docker/compose/releases/download/1.24.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
  • Create the Nginx docker-compose.yml
mkdir ~/distribution
vi ~/distribution/docker-compose.yml
# distribute
version: '2'
services:
    distribute:
        image: nginx:1.15.12
        volumes:
            - ./conf.d:/etc/nginx/conf.d
            - ./distributedfiles:/usr/share/nginx/html
        network_mode: "host"
        container_name: nginx_distribute
mkdir -p ~/distribution/distributedfiles ~/distribution/conf.d
vi ~/distribution/conf.d/open_distribute.conf
# open_distribute.conf

server {
    #server_name distribute.search.leju.com;
    listen 8888;

    root /usr/share/nginx/html;

    add_header Access-Control-Allow-Origin *;
    add_header Access-Control-Allow-Headers X-Requested-With;
    add_header Access-Control-Allow-Methods GET,POST,OPTIONS;

    location / {
    #    index index.html;
        autoindex on;
    }
    expires off;
    location ~ .*\.(gif|jpg|jpeg|png|bmp|swf|eot|ttf|woff|woff2|svg)$ {
        expires -1;
    }

    location ~ .*\.(js|css)?$ {
        expires -1;
    }
} # end of public static files domain : [distribute.search.leju.com]
cd ~/distribution && docker-compose up -d
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.1/bin/linux/amd64/kubeadm -O /tmp/kubeadm

scp /tmp/kubeadm 10.0.0.60:/root/distribution/distributedfiles

wget https://storage.googleapis.com/kubernetes-release/release/v1.14.1/bin/linux/amd64/hyperkube -O /tmp/hyperkube

scp /tmp/hyperkube 10.0.0.60:/root/distribution/distributedfiles
  • Images that need to be pulled and pushed to the private registry
docker pull k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.4.0
docker tag k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.4.0 jiashiwen/cluster-proportional-autoscaler-amd64:1.4.0
docker push jiashiwen/cluster-proportional-autoscaler-amd64:1.4.0

docker pull k8s.gcr.io/k8s-dns-node-cache:1.15.1
docker tag k8s.gcr.io/k8s-dns-node-cache:1.15.1 jiashiwen/k8s-dns-node-cache:1.15.1
docker push jiashiwen/k8s-dns-node-cache:1.15.1

docker pull gcr.io/google_containers/pause-amd64:3.1
docker tag gcr.io/google_containers/pause-amd64:3.1 jiashiwen/pause-amd64:3.1
docker push jiashiwen/pause-amd64:3.1

docker pull gcr.io/google_containers/kubernetes-dashboard-amd64:v1.10.1
docker tag gcr.io/google_containers/kubernetes-dashboard-amd64:v1.10.1 jiashiwen/kubernetes-dashboard-amd64:v1.10.1
docker push jiashiwen/kubernetes-dashboard-amd64:v1.10.1

docker pull gcr.io/google_containers/kube-apiserver:v1.14.1
docker tag gcr.io/google_containers/kube-apiserver:v1.14.1 jiashiwen/kube-apiserver:v1.14.1
docker push jiashiwen/kube-apiserver:v1.14.1

docker pull gcr.io/google_containers/kube-controller-manager:v1.14.1
docker tag gcr.io/google_containers/kube-controller-manager:v1.14.1 jiashiwen/kube-controller-manager:v1.14.1
docker push jiashiwen/kube-controller-manager:v1.14.1

docker pull gcr.io/google_containers/kube-scheduler:v1.14.1
docker tag gcr.io/google_containers/kube-scheduler:v1.14.1 jiashiwen/kube-scheduler:v1.14.1
docker push jiashiwen/kube-scheduler:v1.14.1

docker pull gcr.io/google_containers/kube-proxy:v1.14.1
docker tag gcr.io/google_containers/kube-proxy:v1.14.1 jiashiwen/kube-proxy:v1.14.1
docker push jiashiwen/kube-proxy:v1.14.1

docker pull gcr.io/google_containers/pause:3.1
docker tag gcr.io/google_containers/pause:3.1 jiashiwen/pause:3.1
docker push jiashiwen/pause:3.1

docker pull gcr.io/google_containers/coredns:1.3.1
docker tag gcr.io/google_containers/coredns:1.3.1 jiashiwen/coredns:1.3.1
docker push jiashiwen/coredns:1.3.1
  • Script to pull and push the images above
#!/bin/bash

privaterepo=jiashiwen

k8sgcrimages=(
cluster-proportional-autoscaler-amd64:1.4.0
k8s-dns-node-cache:1.15.1
)

gcrimages=(
pause-amd64:3.1
kubernetes-dashboard-amd64:v1.10.1
kube-apiserver:v1.14.1
kube-controller-manager:v1.14.1
kube-scheduler:v1.14.1
kube-proxy:v1.14.1
pause:3.1
coredns:1.3.1
)

for k8sgcrimageName in "${k8sgcrimages[@]}"; do
    echo $k8sgcrimageName
    docker pull k8s.gcr.io/$k8sgcrimageName
    docker tag k8s.gcr.io/$k8sgcrimageName $privaterepo/$k8sgcrimageName
    docker push $privaterepo/$k8sgcrimageName
done

for gcrimageName in "${gcrimages[@]}"; do
    echo $gcrimageName
    docker pull gcr.io/google_containers/$gcrimageName
    docker tag gcr.io/google_containers/$gcrimageName $privaterepo/$gcrimageName
    docker push $privaterepo/$gcrimageName
done
  • Edit inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml to change the K8s image repository
# kube_image_repo: "gcr.io/google-containers"
kube_image_repo: "jiashiwen"
  • Edit roles/download/defaults/main.yml
#dnsautoscaler_image_repo: "k8s.gcr.io/cluster-proportional-autoscaler-{{image_arch}}"
dnsautoscaler_image_repo: "jiashiwen/cluster-proportional-autoscaler-{{image_arch}}"

#kube_image_repo: "gcr.io/google-containers"
kube_image_repo: "jiashiwen"

#pod_infra_image_repo: "gcr.io/google_containers/pause-{{image_arch}}"
pod_infra_image_repo: "jiashiwen/pause-{{image_arch}}"

#dashboard_image_repo: "gcr.io/google_containers/kubernetes-dashboard-{{image_arch}}"
dashboard_image_repo: "jiashiwen/kubernetes-dashboard-{{image_arch}}"

#nodelocaldns_image_repo: "k8s.gcr.io/k8s-dns-node-cache"
nodelocaldns_image_repo: "jiashiwen/k8s-dns-node-cache"

#kubeadm_download_url: "https://storage.googleapis.com/kubernetes-release/release/{{kubeadm_version}}/bin/linux/{{image_arch}}/kubeadm"
kubeadm_download_url: "http://10.0.0.60:8888/kubeadm"

#hyperkube_download_url: "https://storage.googleapis.com/kubernetes-release/release/{{kube_version}}/bin/linux/{{image_arch}}/hyperkube"
hyperkube_download_url: "http://10.0.0.60:8888/hyperkube"

3. Run the Installation

  • Install command
ansible-playbook -i inventory/mycluster/inventory.ini cluster.yml
  • Reset command (tears the cluster down)
ansible-playbook -i inventory/mycluster/inventory.ini reset.yml

4. Verify the K8s Cluster

Install kubectl

  • Download the kubectl binary matching the cluster version through a browser or proxy: https://storage.googleapis.com/kubernetes-release/release/v1.14.1/bin/linux/amd64/kubectl
  • Upload the downloaded kubectl to the client machine
scp /tmp/kubectl root@xxx:/root
  • Make it executable and move it into the PATH
chmod +x ./kubectl
mv ./kubectl /usr/local/bin/kubectl
  • Alternatively, on Ubuntu:
sudo snap install kubectl --classic

Copy the ~/.kube/config file from the master node to any client that needs to access the cluster:

scp 10.0.0.40:/root/.kube/config ~/.kube/config

Run commands to verify the cluster

kubectl get nodes
kubectl cluster-info

5. TiDB Operator Deployment

Install Helm

https://blog.csdn.net/bbwangj…

  • Install Helm
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get > get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh
  • Check the Helm version
helm version
  • Initialize, pulling Tiller and the stable chart repo from an Aliyun mirror
helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.13.1 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts

Provide local volumes for K8s

  • Reference: https://github.com/kubernetes…
    When tidb-operator starts the cluster it binds PVs for pd and tikv, so multiple directories must be created under the discovery directory.
  • Format and mount the disk
mkfs.ext4 /dev/vdb
DISK_UUID=$(blkid -s UUID -o value /dev/vdb)
mkdir /mnt/$DISK_UUID
mount -t ext4 /dev/vdb /mnt/$DISK_UUID
  • Persist the mount in /etc/fstab
echo UUID=`sudo blkid -s UUID -o value /dev/vdb` /mnt/$DISK_UUID ext4 defaults 0 2 | sudo tee -a /etc/fstab
  • Create multiple directories and bind-mount them into the discovery directory
for i in $(seq 1 10); do
    sudo mkdir -p /mnt/${DISK_UUID}/vol${i} /mnt/disks/${DISK_UUID}_vol${i}
    sudo mount --bind /mnt/${DISK_UUID}/vol${i} /mnt/disks/${DISK_UUID}_vol${i}
done
  • Persist the bind mounts in /etc/fstab
for i in $(seq 1 10); do
    echo /mnt/${DISK_UUID}/vol${i} /mnt/disks/${DISK_UUID}_vol${i} none bind 0 0 | sudo tee -a /etc/fstab
done
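The loops above can be dry-run locally to see the layout the provisioner expects: one volN directory per PV, mirrored under /mnt/disks. This sketch substitutes a temp directory for / so it needs no real disk and no root privileges (the mount --bind step is skipped here):

```shell
# Simulate the discovery-directory layout: ten volN dirs under the disk
# mount point, each with a matching entry under /mnt/disks.
DISK_UUID=demo-uuid            # placeholder for the real blkid value
root=$(mktemp -d)              # stands in for / in this dry run
for i in $(seq 1 10); do
  mkdir -p "$root/mnt/$DISK_UUID/vol$i" "$root/mnt/disks/${DISK_UUID}_vol$i"
done
echo "discovery directories: $(ls "$root/mnt/disks" | wc -l)"   # expect 10
```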
  • Create the local-volume-provisioner for tidb-operator
kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/manifests/local-dind/local-volume-provisioner.yaml
kubectl get po -n kube-system -l app=local-volume-provisioner
kubectl get pv --all-namespaces | grep local-storage

6. Install TiDB Operator

  • The project uses gcr.io/google-containers/hyperkube, which is unreachable from mainland China. The simple fix is to push the image to DockerHub and then edit charts/tidb-operator/values.yaml:
scheduler:
  # With rbac.create=false, the user is responsible for creating this account
  # With rbac.create=true, this service account will be created
  # Also see rbac.create and clusterScoped
  serviceAccount: tidb-scheduler
  logLevel: 2
  replicas: 1
  schedulerName: tidb-scheduler
  resources:
    limits:
      cpu: 250m
      memory: 150Mi
    requests:
      cpu: 80m
      memory: 50Mi
  # kubeSchedulerImageName: gcr.io/google-containers/hyperkube
  kubeSchedulerImageName: yourrepo/hyperkube
  # This will default to matching your kubernetes version
  # kubeSchedulerImageTag: latest
  • TiDB Operator extends Kubernetes with CRDs, so before using it, create the TidbCluster custom resource type:
kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/manifests/crd.yaml
kubectl get crd tidbclusters.pingcap.com
  • Install TiDB Operator
git clone https://github.com/pingcap/tidb-operator.git
cd tidb-operator
helm install charts/tidb-operator --name=tidb-operator --namespace=tidb-admin
kubectl get pods --namespace tidb-admin -l app.kubernetes.io/instance=tidb-operator

7. Deploy TiDB

helm install charts/tidb-cluster --name=demo --namespace=tidb
watch kubectl get pods --namespace tidb -l app.kubernetes.io/instance=demo -o wide

8. Verification

Install a MySQL client

  • Reference: https://dev.mysql.com/doc/ref…
  • CentOS installation
wget https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm
yum localinstall mysql80-community-release-el7-3.noarch.rpm -y
yum repolist all | grep mysql
yum-config-manager --disable mysql80-community
yum-config-manager --enable mysql57-community
yum install mysql-community-client
  • Ubuntu installation
wget https://dev.mysql.com/get/mysql-apt-config_0.8.13-1_all.deb
dpkg -i mysql-apt-config_0.8.13-1_all.deb
apt update

# choose the MySQL version
dpkg-reconfigure mysql-apt-config
apt install mysql-client -y

9. Map the TiDB Port

  • Check the TiDB service
kubectl get svc --all-namespaces
  • Map the TiDB port
# local access only
kubectl port-forward svc/demo-tidb 4000:4000 --namespace=tidb

# access from other hosts
kubectl port-forward --address 0.0.0.0 svc/demo-tidb 4000:4000 --namespace=tidb
  • Log in to MySQL for the first time
mysql -h 127.0.0.1 -P 4000 -u root -D test
  • Change the TiDB root password
SET PASSWORD FOR 'root'@'%' = 'wD3cLpyO5M'; FLUSH PRIVILEGES;
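Rather than typing a literal password into shell history, you can generate a random one first. A small sketch (the generated value replaces the wD3cLpyO5M example above; the statement is printed, not executed):

```shell
# Generate a random 12-character alphanumeric password and print the
# SQL statement to paste into the MySQL session.
newpass=$(dd if=/dev/urandom bs=64 count=1 2>/dev/null | base64 | tr -dc 'A-Za-z0-9' | cut -c1-12)
echo "SET PASSWORD FOR 'root'@'%' = '$newpass'; FLUSH PRIVILEGES;"
```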

Pitfalls

1. Installing K8s from mainland China

Most K8s images live on gcr.io, which is unreachable from mainland China. The standard approach is to import the images into DockerHub or a private registry; this is covered in detail in the K8s deployment section above, so it is not repeated here.

2. TiDB Operator local storage configuration

When the Operator starts a cluster, pd and TiKV need to bind local storage. If there are not enough mount points, pods cannot find a PV to bind during startup and stay in the Pending or Creating state. For details, see the "Sharing a disk filesystem by multiple filesystem PVs" section of https://github.com/kubernetes… : bind one disk to multiple mount directories to give the Operator enough PVs to bind.

3. MySQL client version issue

TiDB currently only supports MySQL 5.7 clients; an 8.0 client fails with ERROR 1105 (HY000): Unknown charset id 255. (Collation id 255, utf8mb4_0900_ai_ci, was added in MySQL 8.0; if only an 8.0 client is available, starting it with --default-character-set=utf8 is a commonly cited workaround.)



This article is reproduced from the WeChat official account "北京 IT 爷们儿".
