Background:
Network environment: a Cloud Connect Network (CCN) test setup with two VPCs, one in Shanghai and one in Beijing. The servers are distributed as follows:
Why TencentOS Server 3.1 (TK4)? Mainly because CentOS 8 no longer receives long-term maintenance, and it is also a chance to try Tencent Cloud's open-source TencentOS. Details are on the official site: https://cloud.tencent.com/document/product/213/38027. Since it is CentOS 8 compatible, I will follow the usual CentOS 8 Kubernetes setup procedure and see whether a cross-region cluster is feasible.
Basic layout:
Note: spreading nodes across multiple zones also gives a degree of high availability.
ip | hostname | zone
---|---|---
10.10.2.8 | sh-master-01 | Shanghai Zone 2
10.10.2.10 | sh-master-02 | Shanghai Zone 2
10.10.5.4 | sh-master-03 | Shanghai Zone 5
10.10.4.7 | sh-work-01 | Shanghai Zone 4
10.10.4.14 | sh-work-02 | Shanghai Zone 4
10.10.12.9 | bj-work-01 | Beijing Zone 5
Create an internal load balancer (CLB) to act as the VIP for the apiserver. I always used the classic type in the past, but now only the application type is available.
System initialization
Note: steps 1-12 are executed on all nodes.
1. Change the hostname
Note: set the hostname on any node whose hostname has not been initialized yet.
[root@VM-2-8-centos ~]# hostnamectl set-hostname sh-master-01
[root@VM-2-8-centos ~]# cat /etc/hostname
sh-master-01
Do the same on the other machines.
2. Disable the swap partition
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
3. Disable SELinux
[root@sh-master-01 ~]# setenforce 0
setenforce: SELinux is disabled
[root@sh-master-01 ~]# sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/sysconfig/selinux
[root@sh-master-01 ~]# sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
[root@sh-master-01 ~]# sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/sysconfig/selinux
[root@sh-master-01 ~]# sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/selinux/config
4. Disable the firewall
systemctl disable --now firewalld
chkconfig firewalld off
Note: if firewalld and iptables are not installed at all, this step can be skipped.
5. Raise open file and process limits
cat> /etc/security/limits.conf <<EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF
Note that TencentOS already ships an 80-nofile.conf under /etc/security/limits.d/; limit overrides can all go there instead, which avoids touching the main file.
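For example, a minimal drop-in could look like this (a sketch; 80-nofile.conf is simply the filename TencentOS already ships, and the values mirror the block above):
cat > /etc/security/limits.d/80-nofile.conf <<EOF
* soft nofile 1000000
* hard nofile 1000000
EOF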
6. yum update
yum update
yum -y install gcc bc gcc-c++ ncurses ncurses-devel cmake elfutils-libelf-devel openssl-devel flex* bison* autoconf automake zlib* libxml* ncurses-devel libmcrypt* libtool-ltdl-devel* make cmake pcre pcre-devel openssl openssl-devel jemalloc-devel tcl libtool vim unzip wget lrzsz bash-comp* ipvsadm ipset jq sysstat conntrack libseccomp conntrack-tools socat curl wget git conntrack-tools psmisc nfs-utils tree bash-completion conntrack libseccomp net-tools crontabs sysstat iftop nload strace bind-utils tcpdump htop telnet lsof
I actually skipped this here; I usually run the oneinstack script to initialize my CVM instances.
7. Load the IPVS modules
The TencentOS kernel is 5.4.119, so nf_conntrack is loaded rather than the nf_conntrack_ipv4 module needed on older 3.10-era kernels.
:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
)
for kernel_module in ${module[@]};do
/sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done
systemctl daemon-reload
systemctl enable --now systemd-modules-load.service
Verify that the IPVS modules loaded successfully:
# lsmod | grep ip_vs
ip_vs_sh 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 5
ip_vs 151552 11 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 114688 5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
nf_defrag_ipv6 20480 2 nf_conntrack,ip_vs
8. Tune kernel parameters (not necessarily optimal; take what suits you)
These are the defaults from the oneinstack initialization; I will leave them as they are for now and revisit them if problems turn up.
cat /etc/sysctl.d/99-sysctl.conf
fs.file-max=1000000
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
net.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
9. Install containerd
CentOS 8 swaps yum for dnf; look up the details yourself, they behave much the same. Out of habit, add the Aliyun repo:
dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml
# Replace containerd's default sandbox image by editing /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"
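A quick way to make that edit from the shell (a sketch; the default image tag depends on the containerd version, hence checking it first):
grep sandbox_image /etc/containerd/config.toml
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"#' /etc/containerd/config.toml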
# Restart containerd
$ systemctl daemon-reload
$ systemctl restart containerd
Still no luck: the package versions do not match this system. What now?
Try the Tencent mirror instead; remove the Aliyun repo first:
rm -rf /etc/yum.repos.d/docker-ce.repo
yum clean all
https://mirrors.cloud.tencent.com/docker-ce/linux/centos/
dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.cloud.tencent.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml
# Replace containerd's default sandbox image by editing /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"
# Restart containerd
$ systemctl daemon-reload
$ systemctl restart containerd
Same result: the repo does not match the OS release on its own. So what now, edit it by hand?
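The post does not show the exact manual fix; one plausible approach (an assumption on my part) is to pin $releasever in the repo file so the CentOS 8 package paths are used:
# Assumption: TencentOS reports its own $releasever, so force the CentOS 8 repo paths
sed -i 's/\$releasever/8/g' /etc/yum.repos.d/docker-ce.repo
yum clean all && yum makecache
yum install -y containerd.io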
That worked. I do hope TencentOS will eventually support the common yum repos natively instead of requiring manual conversion.
containerd config default > /etc/containerd/config.toml
# Restart containerd
systemctl daemon-reload
systemctl restart containerd
systemctl status containerd
10. Configure the CRI client crictl
Note: the crictl version should roughly match the Kubernetes version.
VERSION="v1.22.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz
The download may stall; if so, fetch it from GitHub on your desktop and upload it manually.
cat <<EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
# Verify it works (and, while at it, test a private registry if you have one)
crictl pull nginx:alpine
crictl rmi nginx:alpine
crictl images
Then edit /etc/containerd/config.toml: set endpoint under [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"] to an Aliyun accelerator address (any other registry mirror works too), and add SystemdCgroup = true under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options].
The endpoint is replaced with the Aliyun accelerator address: https://2lefsjdg.mirror.aliyuncs.com
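Put together, the relevant fragments of /etc/containerd/config.toml look roughly like this (a sketch; the plugin section paths are the defaults emitted by containerd config default on containerd 1.4/1.5):
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true

[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
  endpoint = ["https://2lefsjdg.mirror.aliyuncs.com"]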
Restart containerd and pull an image again to verify:
systemctl restart containerd.service
crictl pull nginx:alpine
OK
11. Install kubeadm (CentOS 8 has no dedicated yum repo, so use the CentOS 7 Aliyun repo)
Note: why version 1.21.3? Because my production cluster also runs 1.21.3, and this will let me test an upgrade later.
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Remove old versions, if installed
yum remove kubeadm kubectl kubelet kubernetes-cni cri-tools socat
# List all installable versions; either of the two install options below works
# yum list --showduplicates kubeadm --disableexcludes=kubernetes
# Install a specific version with:
# yum -y install kubeadm-1.21.3 kubectl-1.21.3 kubelet-1.21.3
or
# Or install the latest stable version (currently 1.22.4):
#yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Enable on boot
systemctl enable kubelet.service
Of course, the Tencent Cloud mirror could just as well be used here; the idea is the same, as sketched below.
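For reference, a sketch of the equivalent repo file using the Tencent mirror; I am assuming it follows the same directory layout as the Aliyun repo above, so verify the baseurl exists before relying on it:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.cloud.tencent.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
EOF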
12. Adjust the kubelet configuration
vi /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock"
Additional steps on the master nodes:
1. Install haproxy
Note: haproxy and its configuration must be set up on all three master nodes.
yum install haproxy
cat <<EOF > /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode tcp
log global
option tcplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes
bind *:8443    # listen on port 8443
mode tcp
default_backend kubernetes
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend kubernetes    # backend pool: requests hitting the apiserver VIP on 6443 are forwarded to the three masters, which is what provides the load balancing
balance roundrobin
server master1 10.10.2.8:6443 check maxconn 2000
server master2 10.10.2.10:6443 check maxconn 2000
server master3 10.10.5.4:6443 check maxconn 2000
EOF
systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy
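A quick sanity check that haproxy is actually listening on 8443 on each master (ss comes from iproute, present by default):
ss -lntp | grep 8443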
Log in to the Tencent Cloud CLB console (https://console.cloud.tencent.com/clb), create a TCP listener named k8s on port 6443, and bind the three master nodes on port 8443 as the backend (default weight 10, unchanged).
2. Generate the configuration file on the sh-master-01 node
Note: sh-master-02 or sh-master-03 would do just as well.
kubeadm config print init-defaults > config.yaml
Edit the configuration file as follows:
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.2.8
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  name: sh-master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - sh-master-01
  - sh-master-02
  - sh-master-03
  - sh-master.k8s.io
  - localhost
  - 127.0.0.1
  - 10.10.2.8
  - 10.10.2.10
  - 10.10.5.4
  - 10.10.2.4
  - xx.xx.xx.xx
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "10.10.2.4:6443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.21.3
networking:
  dnsDomain: cluster.local
  serviceSubnet: 172.31.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 15s
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s
This adds the IPVS configuration, sets the service subnet, and points imageRepository at a domestic mirror. xx.xx.xx.xx is an IP I reserved in the certSANs (reserving IPs makes it easier to add control-plane nodes later).
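Once the cluster is up, a quick way to confirm kube-proxy really runs in IPVS mode (ipvsadm was installed back in step 6; the label below is the one kubeadm puts on the kube-proxy DaemonSet):
# kube-proxy should report the ipvs proxier in its logs
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50 | grep -i ipvs
# and the IPVS virtual server table should contain the service VIPs
ipvsadm -Ln | head -20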
3. Initialize the cluster on sh-master-01 with kubeadm
kubeadm init --config /root/config.yaml
Note: the screenshots below do not match the command above, because I initially wanted to install Cilium and failed, so I am going with Calico first.
One thing I missed when tuning the kernel parameters was net.ipv4.ip_forward. Keep in mind that sysctl -w only applies until reboot:
sysctl -w net.ipv4.ip_forward=1
To make it permanent, add it to the config file as well:
cat <<EOF > /etc/sysctl.d/99-sysctl.conf
fs.file-max=1000000
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
net.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
EOF
sysctl --system
Note: apply the sysctl settings above on all nodes. Then run the init on sh-master-01:
kubeadm init --config /root/config.yaml
4. Join sh-master-02 and sh-master-03 as control-plane nodes
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Following the kubeadm init output, join sh-master-02 and sh-master-03 to the cluster.
Bundle ca.*, sa.*, front-proxy-ca.* and etcd/ca.* from /etc/kubernetes/pki on sh-master-01 and distribute them to /etc/kubernetes/pki on sh-master-02 and sh-master-03.
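A minimal sketch of that distribution step, assuming direct root SSH access between the masters (hostnames from the table at the top):
cd /etc/kubernetes/pki
tar czf /tmp/k8s-pki.tar.gz ca.* sa.* front-proxy-ca.* etcd/ca.*
for host in sh-master-02 sh-master-03; do
  scp /tmp/k8s-pki.tar.gz $host:/tmp/
  ssh $host "mkdir -p /etc/kubernetes/pki && tar xzf /tmp/k8s-pki.tar.gz -C /etc/kubernetes/pki"
done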
kubeadm join 10.10.2.4:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:ccfd4e2b85a6a07fde8580422769c9e14113e8f05e95272e51cca2f13b0eb8c3 --control-plane
Then run the same commands as on sh-master-01:
mkdir -p $HOME/.kube
sudo \cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes
All nodes show NotReady because no CNI network plugin has been installed yet.
Join the worker nodes to the cluster
kubeadm join 10.10.2.4:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:ccfd4e2b85a6a07fde8580422769c9e14113e8f05e95272e51cca2f13b0eb8c3
First, in the CCN console I purchased 1 Mbps of cross-region bandwidth; this is only a test after all:
Install the CNI network plugin
For now I am going with plain Calico (I could not get Flannel or Cilium working at first; get one running first and study the rest later):
curl https://docs.projectcalico.org/v3.11/manifests/calico.yaml -O
sed -i -e "s?192.168.0.0/16?172.31.0.0/16?g" calico.yaml
kubectl apply -f calico.yaml
kubectl get pods -n kube-system -o wide
Note: I also added a secondary CIDR in the Tencent Cloud VPC console, wondering whether that would let the container network reach other regions directly as well. I have not tested it yet; it just occurred to me to add it.
Run a few simple ping tests:
1. Deploy two pods in the Shanghai zone
cat<<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:alpine
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28.4
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF
Both landed in the Shanghai zone:
[root@sh-master-01 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 14 14h 172.31.45.132 sh-work-01 <none> <none>
nginx-7fb7fd49b4-zrg77 1/1 Running 0 14h 172.31.45.131 sh-work-01 <none> <none>
2. Schedule a pod in the Beijing zone with a nodeSelector
I also want a pod running in the Beijing zone. How? Take the lazy route: label the node and schedule with a nodeSelector.
kubectl label node bj-work-01 zone=beijing
cat nginx1.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx1
  name: nginx1
spec:
  nodeSelector:        # schedule the pod onto the node labeled zone=beijing
    zone: "beijing"
  containers:
  - image: nginx
    name: nginx1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
kubectl apply -f nginx1.yaml
[root@sh-master-01 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 14 14h 172.31.45.132 sh-work-01 <none> <none>
nginx-7fb7fd49b4-zrg77 1/1 Running 0 14h 172.31.45.131 sh-work-01 <none> <none>
nginx1 1/1 Running 0 14h 172.31.89.194 bj-work-01 <none> <none>
3. Ping tests
From sh-master-02, compare the ping latency to the Beijing pod and the Shanghai pod.
Exec into the Shanghai pod and compare its ping latency to the Shanghai and Beijing pods.
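A minimal sketch of those checks (pod IPs taken from the kubectl get pods output above):
# From sh-master-02: ping the Shanghai pod and the Beijing pod
ping -c 4 172.31.45.131
ping -c 4 172.31.89.194
# From inside the Shanghai busybox pod
kubectl exec -it busybox -- ping -c 4 172.31.45.131
kubectl exec -it busybox -- ping -c 4 172.31.89.194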
The results are roughly the same across the board. The main goal was to verify that building a Kubernetes cluster across region VPCs is feasible; I have not yet worked out how to properly measure network quality, so treat this as a starting point. Doing it in the cloud removes a lot of the hassle: at the very least, the BGP-style configuration is largely taken care of. If you are building a cross-region Kubernetes cluster in the cloud, this may serve as a reference.