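Preface
After a week of intensive deployment work and countless pitfalls, I finally got a cluster running across Huawei Cloud and Alibaba Cloud. Many thanks to @chen645800876 for the article "云服务器-异地部署集群服务" (cloud servers: deploying cluster services across sites), which let me deploy fairly smoothly and avoid many pitfalls. This write-up is based on that article. I dropped the steps I did not use and recorded the pitfalls I hit, both as a backup for myself and in the hope that it helps others.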
Installation
Adjust kernel parameters
cat > k8s.conf <<EOF
# enable bridge mode
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
# enable IP forwarding
net.ipv4.ip_forward = 1
# disable ipv6
net.ipv6.conf.all.disable_ipv6=1
EOF
cp k8s.conf /etc/sysctl.d/k8s.conf
sysctl -p /etc/sysctl.d/k8s.conf
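As a quick sanity check before loading the file, a sketch like the following can confirm that a sysctl file sets the expected keys. The helper name `check_sysctl_file` is hypothetical and not part of the original article. It only parses the file and does not query or change the running kernel.

```shell
#!/bin/bash
# check_sysctl_file FILE KEY...
# Hypothetical helper: succeeds only if every KEY is set to 1 in FILE.
# It parses the file only; it does not touch the running kernel.
check_sysctl_file() {
    local file=$1 key value
    shift
    for key in "$@"; do
        # Take the last assignment of the key, ignoring comment lines.
        value=$(awk -F= -v k="$key" '
            /^[[:space:]]*#/ { next }
            {
                name = $1; gsub(/[[:space:]]/, "", name)
                if (name == k) { v = $2; gsub(/[[:space:]]/, "", v) }
            }
            END { print v }' "$file")
        if [ "$value" != "1" ]; then
            echo "missing or not 1: $key" >&2
            return 1
        fi
    done
}
```

Usage, assuming the file from the step above: `check_sysctl_file /etc/sysctl.d/k8s.conf net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward`.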
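Prepare the IPVS prerequisites
# step1
modprobe br_netfilter
# step2
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF
# step3
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack
One thing to watch here: the nf_conntrack_ipv4 module used in the original article is no longer available, so nf_conntrack is loaded instead, following the approach proposed in the link below. This step matters a lot. If it fails with an error, IPVS forwarding will have problems later.
https://github.com/easzlab/ku...
Disable the swap partition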
swapoff -a
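Note that `swapoff -a` only lasts until the next reboot; swap entries in `/etc/fstab` are activated again at boot. A minimal sketch for making the change persistent follows. The helper name `disable_swap_in_fstab` is an assumption of mine, not something from the original article.

```shell
#!/bin/bash
# disable_swap_in_fstab FILE
# Hypothetical helper: comments out every active swap entry in FILE
# (normally /etc/fstab) so that swap stays off after a reboot.
# A backup copy is written to FILE.bak first.
disable_swap_in_fstab() {
    local file=$1
    cp "$file" "$file.bak" || return 1
    # Leave comments alone; prefix "#" to lines whose third field (fs type) is "swap".
    awk '
        /^[[:space:]]*#/ { print; next }
        $3 == "swap"     { print "#" $0; next }
        { print }' "$file.bak" > "$file"
}
```

Usage would be `disable_swap_in_fstab /etc/fstab`, run as root. Check the result against the backup before rebooting.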
Install kubeadm, kubelet, and kubectl
# add the repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
#baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/
# If the CPU is not x86_64 (amd64), use the other repo: Huawei Kunpeng instances are aarch64, while the Alibaba ones are x86_64
# A small beginner's pitfall: Docker images are architecture-specific too. An image built on x86_64 will not run on aarch64 Docker, so if a pod fails to deploy this may be the cause
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# disable selinux
setenforce 0
# install kubelet, kubeadm, kubectl
yum install -y kubelet kubeadm kubectl
# start kubelet on boot
systemctl enable kubelet
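Since the two clouds use different CPU architectures, the repo URL can be chosen from `uname -m` instead of being edited by hand on each node. The sketch below assumes only the two mirror URLs shown above. The helper name `k8s_repo_baseurl` is hypothetical.

```shell
#!/bin/bash
# k8s_repo_baseurl [ARCH]
# Hypothetical helper: prints the Aliyun mirror URL for the given architecture
# (defaults to the output of `uname -m`). Fails for anything else.
k8s_repo_baseurl() {
    local arch=${1:-$(uname -m)}
    local base=https://mirrors.aliyun.com/kubernetes/yum/repos
    case "$arch" in
        x86_64)  echo "$base/kubernetes-el7-x86_64/" ;;
        aarch64) echo "$base/kubernetes-el7-aarch64/" ;;
        *)       echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
}
```

In the repo file, the `baseurl=` line could then be written as `baseurl=$(k8s_repo_baseurl)` inside the unquoted heredoc.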
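Create a virtual network interface
# step1: replace the placeholder with your public IP
cat > /etc/sysconfig/network-scripts/ifcfg-eth0:1 <<EOF
BOOTPROTO=static
DEVICE=eth0:1
IPADDR=<your public IP>
PREFIX=32
TYPE=Ethernet
USERCTL=no
ONBOOT=yes
EOF
# step2: on CentOS 8 a reboot is needed (I recommend switching to CentOS 7; network configuration on CentOS 8 is more complicated)
# Huawei Cloud servers ship with eth1-eth5 configured by default; remove all of these defaults, otherwise the network restart fails with an error
systemctl restart network
# step3: check that the new IP shows up
ip addr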
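To avoid pasting a malformed address into the interface file, the content can be generated by a small function that validates the IP first. This is a sketch under my own assumptions: `is_ipv4` and `render_ifcfg_alias` are hypothetical names, and the check only covers dotted-quad IPv4 syntax.

```shell
#!/bin/bash
# is_ipv4 ADDRESS
# Hypothetical helper: succeeds if ADDRESS is four dot-separated numbers, each 0-255.
is_ipv4() {
    echo "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }'
}

# render_ifcfg_alias PUBLIC_IP [DEVICE]
# Hypothetical helper: prints the ifcfg content for an alias interface
# that carries the public IP (DEVICE defaults to eth0:1).
render_ifcfg_alias() {
    local ip=$1 dev=${2:-eth0:1}
    is_ipv4 "$ip" || { echo "not an IPv4 address: $ip" >&2; return 1; }
    cat <<EOF
BOOTPROTO=static
DEVICE=$dev
IPADDR=$ip
PREFIX=32
TYPE=Ethernet
USERCTL=no
ONBOOT=yes
EOF
}
```

Usage: `render_ifcfg_alias xx.xx.xx.xx > /etc/sysconfig/network-scripts/ifcfg-eth0:1`, with your own public IP in place of the placeholder.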
Modify the kubelet startup arguments (important: do this on every node)
# This file exists once kubeadm is installed
vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: this step is important. If you skip it, the node still registers with the cluster using its private IP
# Append the argument --node-ip=<public IP> at the end of the last ExecStart line
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --node-ip=xx.xx.xx.xx
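The comments in the drop-in file point to `/etc/sysconfig/kubelet` and `KUBELET_EXTRA_ARGS` as the place for user overrides. As an alternative to editing the `ExecStart` line, `--node-ip` could be set there. This is not what I did in this deployment, only a sketch of that option, and the helper name `set_kubelet_node_ip` is hypothetical.

```shell
#!/bin/bash
# set_kubelet_node_ip FILE PUBLIC_IP
# Hypothetical helper: writes KUBELET_EXTRA_ARGS=--node-ip=<PUBLIC_IP> into FILE
# (normally /etc/sysconfig/kubelet), replacing any existing KUBELET_EXTRA_ARGS line.
# Caution: other extra args already set on that line are discarded.
set_kubelet_node_ip() {
    local file=$1 ip=$2 tmp
    tmp=$(mktemp) || return 1
    # Keep every line except an existing KUBELET_EXTRA_ARGS assignment.
    if [ -f "$file" ]; then
        grep -v '^KUBELET_EXTRA_ARGS=' "$file" > "$tmp"
    fi
    echo "KUBELET_EXTRA_ARGS=--node-ip=$ip" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Whichever way the argument is set, the kubelet has to be restarted to pick it up (`systemctl restart kubelet`). After editing the drop-in file itself, run `systemctl daemon-reload` first.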
Initialize the master node with kubeadm. Two scripts are provided, one to download the images and one to clean them up.
#!/bin/bash
images=(
    kube-apiserver:v1.21.1
    kube-controller-manager:v1.21.1
    kube-scheduler:v1.21.1
    kube-proxy:v1.21.1
    pause:3.4.1
    etcd:3.4.13-0
    # coredns/coredns: pull this one directly from Docker Hub
)
# Pull each image from the Aliyun mirror, retag it as k8s.gcr.io, then drop the mirror tag
for imageName in "${images[@]}" ; do
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/${imageName}
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/${imageName} k8s.gcr.io/${imageName}
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/${imageName}
done

#!/bin/bash
# Remove every local image whose repository name contains k8s.gcr
images=$(docker images | grep k8s.gcr | awk '{print $3}')
for image in ${images}
do
    echo $image
    docker rmi $image
done
# step1: add the config file; remember to replace the IPs below
cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.21.1
apiServer:
  certSANs: # list the hostname, IP, and VIP of every kube-apiserver node
  - master # replace with the hostname
  - xx.xx.xx.xx # replace with the public IP
  - yy.yy.yy.yy # replace with the private IP
  - 10.96.0.1 # do not replace; this is the API's in-cluster address, which some services use
controlPlaneEndpoint: xx.xx.xx.xx:6443 # replace with the public IP
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
---
# switch the default proxy mode to ipvs (IPVS mode has been GA since v1.11, so no feature gate is needed)
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF
# step2: on a machine with 1 core or 1 GB of memory, append --ignore-preflight-errors=all, otherwise initialization fails
# Also note that when this step succeeds it prints two important pieces of information
kubeadm init --config=kubeadm-config.yaml
# Info 1: after a successful init, a kubeconfig file is generated for talking to the API server; run the following
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Info 2: this command is used later to join worker nodes to the master
kubeadm join xx.xx.xx.xx:6443 --token sdfs.dsfsdfsdfijdth \
    --discovery-token-ca-cert-hash sha256:sdfsdfsdfsdfsdfsdfsdfsdfg9a460f44b118050091245c1d
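Because the public IP appears in several places in the config, filling the placeholders from variables avoids a mismatch between `certSANs` and `controlPlaneEndpoint`. The following is a minimal sketch, not a drop-in replacement for the step above. The function name `render_kubeadm_config` is hypothetical, and the kube-proxy section uses the `kubeproxy.config.k8s.io/v1alpha1` group, matching the kube-proxy ConfigMap shown later in this article.

```shell
#!/bin/bash
# render_kubeadm_config HOSTNAME PUBLIC_IP PRIVATE_IP
# Hypothetical helper: prints a kubeadm config with the placeholders filled in.
# Versions and subnets are the ones used in this article.
render_kubeadm_config() {
    local host=$1 public_ip=$2 private_ip=$3
    cat <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.21.1
apiServer:
  certSANs:
  - ${host}
  - ${public_ip}
  - ${private_ip}
  - 10.96.0.1
controlPlaneEndpoint: ${public_ip}:6443
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF
}
```

Usage: `render_kubeadm_config master xx.xx.xx.xx yy.yy.yy.yy > kubeadm-config.yaml`, with your own hostname and addresses.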
Modify the kube-apiserver arguments (master node)
# Change three things: add --bind-address, change --advertise-address, and add --feature-gates=RemoveSelfLink=false
# (the IPs and image tag below come from a sample manifest; yours will differ)
vim /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 47.74.22.13:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    # Add the next argument if you use an NFS-backed StorageClass. Since k8s 1.20, selfLink is disabled by default, so the gate has to be set manually
    # The fix comes from https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner/issues/25
    - --feature-gates=RemoveSelfLink=false
    - --advertise-address=47.74.22.13 # change to the public IP
    - --bind-address=0.0.0.0 # add this argument
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: k8s.gcr.io/kube-apiserver:v1.18.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 175.24.19.12
        path: /healthz
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-apiserver
    resources:
      requests:
        cpu: 250m
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
status: {}
Modify the flannel manifest and install it (master node)
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --public-ip=$(PUBLIC_IP) # add this argument to declare the public IP
        - --iface=eth0 # add this argument to bind the network interface
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: PUBLIC_IP # add this environment variable
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
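The two added `flanneld` arguments can also be inserted with a small script rather than by hand, which helps when the downloaded manifest contains several DaemonSets. This is a sketch under my own assumptions: the helper name `add_flannel_public_ip_args` is hypothetical, and it assumes each `--ip-masq` argument sits on its own list line, as in the manifest above. The `PUBLIC_IP` environment variable still has to be added to the container separately.

```shell
#!/bin/bash
# add_flannel_public_ip_args FILE [IFACE]
# Hypothetical helper: prints FILE (a kube-flannel.yml) with --public-ip and
# --iface inserted after every "- --ip-masq" argument, keeping the indentation.
# It does not add the PUBLIC_IP environment variable.
add_flannel_public_ip_args() {
    local file=$1 iface=${2:-eth0}
    awk -v iface="$iface" '
        { print }
        /^[[:space:]]*- --ip-masq[[:space:]]*$/ {
            # Reuse the leading whitespace of the matched line.
            indent = $0
            sub(/- --ip-masq.*$/, "", indent)
            print indent "- --public-ip=$(PUBLIC_IP)"
            print indent "- --iface=" iface
        }' "$file"
}
```

Usage: `add_flannel_public_ip_args kube-flannel.yml eth0 > kube-flannel-public.yml`, then review the result before applying it.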
Manually enable IPVS forwarding mode (master node)
# Everything above succeeded, but sometimes `IPVS` mode is still not enabled by default. In that case change it by hand; only one field needs editing
# If the change does not take effect promptly, delete the kube-proxy pods; they are recreated automatically. Then run ipvsadm -Ln to check whether it took effect
# If ipvsadm is not installed, install it with yum install ipvsadm
kubectl edit configmaps -n kube-system kube-proxy
---
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 0
    clusterCIDR: 10.244.0.0/16
    configSyncPeriod: 0s
    conntrack:
      maxPerCore: null
      min: null
      tcpCloseWaitTimeout: null
      tcpEstablishedTimeout: null
    detectLocalMode: ""
    enableProfiling: false
    healthzBindAddress: ""
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: null
      minSyncPeriod: 0s
      syncPeriod: 0s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 0s
      tcpFinTimeout: 0s
      tcpTimeout: 0s
      udpTimeout: 0s
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    mode: "ipvs" # if this is empty, set it to `ipvs`
    nodePortAddresses: null
    oomScoreAdj: null
    portRange: ""
    showHiddenMetricsForVersion: ""
    udpIdleTimeout: 0s
    winkernel:
      enableDSR: false
      networkName: ""
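To check the configured mode without opening an editor, the `mode:` field can be read out of the ConfigMap. A minimal sketch follows. The helper name `proxy_mode_from_config` is hypothetical, and it assumes `mode:` appears as a top-level key of `config.conf`, as in the listing above.

```shell
#!/bin/bash
# proxy_mode_from_config
# Hypothetical helper: reads a KubeProxyConfiguration from stdin and prints
# the value of its top-level "mode:" field without quotes (empty if unset).
proxy_mode_from_config() {
    awk '
        /^mode:/ {
            value = $0
            sub(/^mode:[[:space:]]*/, "", value)
            sub(/[[:space:]]*#.*$/, "", value)   # drop a trailing comment
            gsub(/"/, "", value)                 # drop surrounding quotes
            print value
            exit
        }'
}
```

On the master this could be fed with `kubectl get configmap kube-proxy -n kube-system -o jsonpath='{.data.config\.conf}' | proxy_mode_from_config`. If the mode had to be changed, the kube-proxy pods can be deleted with `kubectl -n kube-system delete pod -l k8s-app=kube-proxy`, assuming the default label that kubeadm puts on them.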