Preface
I'm a Tencent Cloud user and started out on Tencent's managed TKE 1.10. For various reasons I chose to build my own cluster instead: the production cluster is Kubernetes 1.16 built with kubeadm (patch-upgraded to 1.16.15), a highly available setup of kubeadm + haproxy + slb + flannel with IPVS enabled. External services go through an SLB that binds Traefik's TCP 80 and 443 ports (a historical leftover: back then Tencent Cloud SLB did not support mounting multiple certificates, which also meant the SLB log-shipping feature could not be used; SLB now supports multiple certificates, so plain HTTP/HTTPS listeners would work). Back when the production environment and its registry were set up, Tencent Cloud's CBS block storage wasn't used; it was just local disks plus NFS shared storage. A load test a few days ago sent the IO of an NFS-backed project straight through the roof, so I don't recommend NFS for production.
This time the plan is to install Kubernetes 1.20 and network it with Cilium, trying Hubble and the eBPF-based kube-proxy replacement. I'm also going straight to containerd, since the dockershim path really does waste resources; dropping it should cut overhead and speed up deployments. In short, it's a chance to try out all the latest features:
Image referenced from: https://blog.kelu.org/tech/2020/10/09/the-diff-between-docker-containerd-runc-docker-shim.html
Environment preparation:
Hostname | IP | OS | Kernel |
---|---|---|---|
sh-master-01 | 10.3.2.5 | centos8 | 4.18.0-240.15.1.el8_3.x86_64 |
sh-master-02 | 10.3.2.13 | centos8 | 4.18.0-240.15.1.el8_3.x86_64 |
sh-master-03 | 10.3.2.16 | centos8 | 4.18.0-240.15.1.el8_3.x86_64 |
sh-work-01 | 10.3.2.2 | centos8 | 4.18.0-240.15.1.el8_3.x86_64 |
sh-work-02 | 10.3.2.2 | centos8 | 4.18.0-240.15.1.el8_3.x86_64 |
sh-work-03 | 10.3.2.4 | centos8 | 4.18.0-240.15.1.el8_3.x86_64 |
Note: CentOS 8 was picked to be lazy about upgrading the kernel; the 3.10 kernel in CentOS 7 really is showing its age. That said, there is no Kubernetes yum repo for CentOS 8, so the CentOS 7 repo has to be used.
VIP (SLB) address: 10.3.2.12. (There was no need for internal domain names, so a classic internal load balancer is used directly. To make the SLB's mapped port match the local port, a layer of HAProxy was added in between: HAProxy listens on 8443 and proxies the local 6443, and the SLB then maps its 6443 listener to that 8443.)
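In other words, my reading of the setup above is that API-server traffic flows roughly like this:
kubectl/client --> SLB 10.3.2.12:6443 --> HAProxy *:8443 (on each master) --> kube-apiserver :6443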
1. System initialization:
Note: since the environment runs on public cloud, I took the lazy route: one server was initialized first and the rest were built by cloning it.
1. Change the hostname
hostnamectl set-hostname sh-master-01
cat /etc/hosts
That's just an example. My hosts file was only populated on the three master nodes; the work nodes never got it …….
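For reference, a minimal sketch of the /etc/hosts entries on the masters, built from the address table above:
cat >> /etc/hosts <<EOF
10.3.2.5  sh-master-01
10.3.2.13 sh-master-02
10.3.2.16 sh-master-03
EOF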
2. Turn off the swap partition
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
3. Disable SELinux
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/sysconfig/selinux
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/sysconfig/selinux
sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/selinux/config
4. Disable the firewall
systemctl disable --now firewalld
chkconfig firewalld off
5. Adjust open-file limits and related settings
cat> /etc/security/limits.conf <<EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF
Of course, the cleaner way is to create a new file under the /etc/security/limits.d directory instead of touching the main config file; that is also the recommended approach.
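For example, the same limits written as a drop-in (the 90-k8s.conf file name is arbitrary):
cat > /etc/security/limits.d/90-k8s.conf <<EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF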
6. yum update. To each their own here; install whatever packages you are used to
yum update
yum -y install gcc gcc-c++ bc make cmake autoconf automake libtool libtool-ltdl-devel* elfutils-libelf-devel ncurses ncurses-devel openssl openssl-devel pcre pcre-devel zlib* libxml* libmcrypt* jemalloc-devel flex* bison* tcl vim unzip wget lrzsz bash-completion ipvsadm ipset jq sysstat conntrack conntrack-tools libseccomp socat curl git psmisc nfs-utils tree net-tools crontabs iftop nload strace bind-utils tcpdump htop telnet lsof
7. Add the IPVS modules (CentOS 8 defaults to kernel 4.18; for kernels below 4.19, not including 4.19, use this)
:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
br_netfilter
)
for kernel_module in ${module[@]};do
/sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done
For kernels 4.19 and above:
:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
)
for kernel_module in ${module[@]};do
/sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done
Honestly, I don't think it matters much whether IPVS is enabled here: the network plugin is Cilium with Hubble and the datapath is eBPF, so neither iptables nor IPVS should be involved. Configuring IPVS is just a habit carried over from earlier deployments.
Load the IPVS modules
systemctl daemon-reload
systemctl enable --now systemd-modules-load.service
Check whether the IPVS modules are loaded
# lsmod | grep ip_vs
ip_vs_sh 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs 172032 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 172032 6 xt_conntrack,nf_nat,xt_state,ipt_MASQUERADE,xt_CT,ip_vs
nf_defrag_ipv6 20480 4 nf_conntrack,xt_socket,xt_TPROXY,ip_vs
libcrc32c 16384 3 nf_conntrack,nf_nat,ip_vs
8. Tune system parameters (not necessarily optimal; take what you need)
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.ip_forward = 1
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
# Make bridged traffic visible to iptables (bridge-nf-call-*)
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.netfilter.nf_conntrack_max = 2310720
fs.inotify.max_user_watches=89100
fs.may_detach_mounts = 1
fs.file-max = 52706963
fs.nr_open = 52706963
vm.overcommit_memory=1
vm.panic_on_oom=0
vm.swappiness = 0
EOF
sysctl --system
9. Install containerd
dnf vs. yum is one of the CentOS 8 changes; look up the details yourself, for our purposes they behave much the same …….
dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml
# Replace containerd's default sandbox (pause) image by editing /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"
# Restart containerd
systemctl daemon-reload
systemctl restart containerd
The other config changes are enabling SystemdCgroup and adding a private image registry with its credentials (I use the Tencent Cloud registry directly).
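For reference, the relevant parts of /etc/containerd/config.toml might look roughly like this (a sketch against containerd 1.4's v2 config format; registry.example.com and the credentials are placeholders for the Tencent Cloud registry actually used):
# enable the systemd cgroup driver for the runc runtime
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
# private registry mirror plus credentials
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.example.com"]
  endpoint = ["https://registry.example.com"]
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth]
  username = "your-user"
  password = "your-password"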
10. Configure the CRI client crictl
cat <<EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
# Verify it works (and take the chance to verify the private registry, too)
crictl pull nginx:alpine
crictl rmi nginx:alpine
crictl images
11. Install kubeadm (there is no yum repo for CentOS 8, so use Aliyun's CentOS 7 repo)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Remove old versions, if any were installed
yum remove kubeadm kubectl kubelet kubernetes-cni cri-tools socat
# List all installable versions
# yum list --showduplicates kubeadm --disableexcludes=kubernetes
# Either of the two commands below works; to install a pinned version:
# yum -y install kubeadm-1.20.5 kubectl-1.20.5 kubelet-1.20.5
# or
# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# By default the latest stable version is installed, currently 1.20.5
yum install kubeadm
# Enable at boot
systemctl enable kubelet.service
12. Adjust the kubelet configuration
vi /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS= --cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock
13. Journal-related tweaks: avoid collecting logs twice to save system resources, raise the default open-file limits for services started by systemd, disable reverse DNS lookups for SSH, and cap the journal at 200 MB (adjust to your own needs)
sed -ri 's/^\$ModLoad imjournal/#&/' /etc/rsyslog.conf
sed -ri 's/^\$IMJournalStateFile/#&/' /etc/rsyslog.conf
sed -ri 's/^#(DefaultLimitCORE)=/\1=100000/' /etc/systemd/system.conf
sed -ri 's/^#(DefaultLimitNOFILE)=/\1=100000/' /etc/systemd/system.conf
sed -ri 's/^#(UseDNS).*/\1 no/' /etc/ssh/sshd_config
journalctl --vacuum-size=200M
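These changes only take effect once the corresponding services pick them up (or after a reboot); a minimal sketch:
systemctl restart rsyslog sshd   # apply the rsyslog and sshd edits
systemctl daemon-reexec          # have systemd re-read system.conf (DefaultLimit*)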
2. Steps on the master nodes
1. Install HAProxy
yum install haproxy
cat <<EOF > /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode tcp
log global
option tcplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes
bind *:8443 # listen on port 8443
mode tcp
default_backend kubernetes
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend kubernetes # backend servers: requests hitting 10.3.2.12:6443 are forwarded to the three masters below, which gives us the load balancing
balance roundrobin
server master1 10.3.2.5:6443 check maxconn 2000
server master2 10.3.2.13:6443 check maxconn 2000
server master3 10.3.2.16:6443 check maxconn 2000
EOF
systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy
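Before moving on to the SLB, a quick check that HAProxy is actually listening on 8443:
ss -lntp | grep 8443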
And then bind the port on the SLB.
2. Initialize the sh-master-01 node
1. Generate the config file
kubeadm config print init-defaults > config.yaml
The figure below is just an example …….
2. Modify the kubeadm init file
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.3.2.5
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  name: sh-master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - sh-master-01
  - sh-master-02
  - sh-master-03
  - sh-master.k8s.io
  - localhost
  - 127.0.0.1
  - 10.3.2.5
  - 10.3.2.13
  - 10.3.2.16
  - 10.3.2.12
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "10.3.2.12:6443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.5
networking:
  dnsDomain: xx.daemon
  serviceSubnet: 172.254.0.0/16
  podSubnet: 172.3.0.0/16
scheduler: {}
The spots I changed are marked in the figure below.
3. kubeadm init on the master-01 node (skipping kube-proxy).
kubeadm init --skip-phases=addon/kube-proxy --config=config.yaml
I'll skip the successful-install screenshot; these notes were written afterwards and the screenshots weren't kept. The success output includes:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Following the output, join the sh-master-02 and sh-master-03 nodes to the cluster.
Pack up ca.*, sa.*, front-proxy-ca.* and etcd/ca.* from the /etc/kubernetes/pki directory on sh-master-01 and distribute them into /etc/kubernetes/pki on sh-master-02 and sh-master-03, for example as sketched below.
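One way to do that (run on sh-master-01; this assumes root SSH access to the other two masters, and tar/scp are just one option):
cd /etc/kubernetes/pki
tar czf /tmp/k8s-ca.tar.gz ca.* sa.* front-proxy-ca.* etcd/ca.*
for host in 10.3.2.13 10.3.2.16; do
  scp /tmp/k8s-ca.tar.gz root@${host}:/tmp/
  ssh root@${host} "mkdir -p /etc/kubernetes/pki && tar xzf /tmp/k8s-ca.tar.gz -C /etc/kubernetes/pki"
done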
kubeadm join 10.3.2.12:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:eb0fe00b59fa27f82c62c91def14ba294f838cd0731c91d0d9c619fe781286b6 --control-plane
Then run the same commands as on sh-master-01:
mkdir -p $HOME/.kube
sudo \cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
3. Install Helm and deploy Cilium and Hubble (Helm 3 is the default these days)
1. Download and install Helm
Note: due to network conditions the Helm package kept failing to download, so I grabbed the release tarball from GitHub directly.
tar zxvf helm-v3.5.3-linux-amd64.tar.gz
cp linux-amd64/helm /usr/bin/
2. Install Cilium and Hubble with Helm
In earlier versions Cilium and Hubble were installed separately; now they seem to be integrated, so it's all one pass:
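If the Cilium chart repository hasn't been added on this machine yet, it presumably needs registering first (repo URL as documented by Cilium):
helm repo add cilium https://helm.cilium.io/
helm repo update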
helm install cilium cilium/cilium --version 1.9.5 \
  --namespace kube-system \
  --set nodeinit.enabled=true \
  --set externalIPs.enabled=true \
  --set nodePort.enabled=true \
  --set hostPort.enabled=true \
  --set pullPolicy=IfNotPresent \
  --set config.ipam=cluster-pool \
  --set hubble.enabled=true \
  --set hubble.listenAddress=":4244" \
  --set hubble.relay.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
  --set prometheus.enabled=true \
  --set operatorPrometheus.enabled=true \
  --set hubble.ui.enabled=true \
  --set kubeProxyReplacement=strict \
  --set k8sServiceHost=10.3.2.12 \
  --set k8sServicePort=6443
A successful deployment looks like this:
Yep, no kube-proxy anywhere (the screenshot was taken after the work nodes had joined, hence 6 node-init and 6 cilium pods).
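To double-check that the kube-proxy replacement is really active, something along these lines should do (the k8s-app=cilium label is what the chart puts on the cilium DaemonSet pods):
kubectl -n kube-system get ds kube-proxy    # expected: NotFound
CILIUM_POD=$(kubectl -n kube-system get pod -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system exec ${CILIUM_POD} -- cilium status | grep KubeProxyReplacement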
4. Deploy the work nodes
Join the sh-work-01, sh-work-02 and sh-work-03 nodes to the cluster
kubeadm join 10.3.2.12:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:eb0fe00b59fa27f82c62c91def14ba294f838cd0731c91d0d9c619fe781286b6
5. Verify from a master node
Any master node will do; master-01 by default.
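The basic checks are enough here, for instance (output omitted):
kubectl get nodes -o wide
kubectl -n kube-system get pods -o wide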
Places where it's easy to trip up
- About the SLB binding: binding just one server and then running kubeadm init tends to go wrong when the SLB port is the same as the host port, because a backend apparently can't reach itself through the LB. I don't fully understand why; after several tries I ended up binding all three masters and starting HAProxy on all of them first.
- Cilium relies on BPF, so first confirm that the BPF filesystem is mounted (on my systems it was mounted by default):
[root@sh-master-01 manifests]# mount | grep bpf
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
- For the Kubernetes side, the cgroup driver is set to systemd for both the kubelet and containerd; remember to double-check:
KUBELET_EXTRA_ARGS= --cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock
- Enable PodCIDR allocation in kube-controller-manager by adding the following to the controller-manager config (a manifest sketch follows this list):
--allocate-node-cidrs=true
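On a kubeadm-built cluster the controller-manager flags live in its static pod manifest; here is my own rough sketch of the relevant excerpt (the --cluster-cidr line mirrors the podSubnet configured earlier and is my addition, not from the original notes):
# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --cluster-cidr=172.3.0.0/16
    # ...the rest of the existing flags stay as they are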
6. Miscellaneous
1. Verify Hubble and the Hubble UI
kubectl edit svc hubble-ui -n kube-system
Change the service type to NodePort to test it first; later it will be fronted by Traefik.
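Equivalently, instead of editing the Service interactively, a one-line patch does the same thing:
kubectl -n kube-system patch svc hubble-ui -p '{"spec":{"type":"NodePort"}}'
kubectl -n kube-system get svc hubble-ui   # note the NodePort that was assigned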
Then access it from any work or master node's public IP plus the NodePort.
2. Put the etcdctl tool outside the container
Having to exec into the etcd container every time etcdctl is needed is a hassle, so extract etcdctl straight onto the master-01 node. Docker has a cp command; I haven't figured out the containerd equivalent, so I just downloaded etcdctl from the GitHub releases.
tar zxvf etcd-v3.4.15-linux-amd64.tar.gz
cd etcd-v3.4.15-linux-amd64/
cp etcdctl /usr/local/bin/etcdctl
cat >/etc/profile.d/etcd.sh<<'EOF'
ETCD_CERET_DIR=/etc/kubernetes/pki/etcd/
ETCD_CA_FILE=ca.crt
ETCD_KEY_FILE=healthcheck-client.key
ETCD_CERT_FILE=healthcheck-client.crt
ETCD_EP=https://10.3.2.5:2379,https://10.3.2.13:2379,https://10.3.2.16:2379
alias etcd_v3="ETCDCTL_API=3 \
etcdctl \
--cert ${ETCD_CERET_DIR}/${ETCD_CERT_FILE} \
--key ${ETCD_CERET_DIR}/${ETCD_KEY_FILE} \
--cacert ${ETCD_CERET_DIR}/${ETCD_CA_FILE} \
--endpoints $ETCD_EP"
EOF
source /etc/profile.d/etcd.sh
Verify etcd
etcd_v3 endpoint status --write-out=table
Summary
Putting it all together, the base environment is now installed. Since this article was written after the fact, some parts may not be spelled out clearly; I'll add to it as things come back to me.