Test environment

Cloud side:

  • OS: Ubuntu Server 20.04.1 LTS 64bit
  • Kubernetes: v1.19.8
  • CNI plugin: calico v3.16.3
  • Cloudcore: kubeedge/cloudcore:v1.6.1

Edge side:

  • OS: Ubuntu Server 18.04.5 LTS 64bit
  • EdgeCore: v1.19.3-kubeedge-v1.6.1
  • docker:

    • version: 20.10.7
    • cgroupDriver: systemd

Edge node registration QuickStart:

References:
https://docs.kubeedge.io/en/d...
https://docs.kubeedge.io/en/d...
  1. Get the token from cloudcore

    kubectl get secret -nkubeedge tokensecret -o=jsonpath='{.data.tokendata}' | base64 -d
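    For convenience, the token can also be captured into a shell variable on the master (a small sketch; the TOKEN variable name is just illustrative) so it can be pasted into edgecore.yaml or passed to keadm in the next step:

    # Sketch: store the token for reuse instead of copying it by hand
    TOKEN=$(kubectl get secret -n kubeedge tokensecret -o=jsonpath='{.data.tokendata}' | base64 -d)
    echo "$TOKEN"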
  2. Configure edgecore
    If you install from the binary, first generate the initial, minimal edgecore configuration file:

    edgecore --minconfig > edgecore.yaml

    This is the most stripped-down configuration and a good starting point for anyone new to KubeEdge.
    Edit the key settings (only the key settings are listed here):

    ...
    modules:
      edgeHub:
        ...
        httpServer: https://<cloudcore HttpServer address>:<port>   # default port 10002
        token: <token string obtained in step 1>
        websocket:
          ...
          server: <cloudcore address>:<port>   # default port 10000
        ...
      edged:
        cgroupDriver: systemd        # keep consistent with docker's native.cgroupdriver
        ...
        hostnameOverride: edge01     # the name this node registers to cloudcore with
        nodeIP: <IP of this node>    # defaults to the local IP; check carefully on multi-NIC hosts
        ...
      eventBus:
        mqttMode: 0                  # use the internal mqtt service
        ...

    If you deploy with keadm instead, run:

    keadm join --cloudcore-ipport=<cloudcore IP address>:<port, default 10002> --token=<token string from step 1>

    After the command runs, edgecore is managed by systemctl, added to the boot startup items, and started immediately; at this point the edgecore service is not necessarily healthy yet.
    As above, review and edit the configuration file, which is generated automatically at /etc/kubeedge/config/edgecore.yaml.
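    Before starting the service, the key fields can be spot-checked quickly (a sketch; extend the pattern list as needed for your setup):

    # Sketch: verify the key settings listed above in the generated config
    grep -nE 'httpServer|token|server:|cgroupDriver|hostnameOverride|nodeIP|mqttMode' /etc/kubeedge/config/edgecore.yaml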

  3. Start the edgecore service
    For the binary install:

    nohup ./edgecore --config edgecore.yaml > edgecore.log 2>&1 &

    For the keadm install:

    systemctl restart edgecore
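    Either way, it is worth confirming the service state before moving on (a sketch; journalctl only applies when edgecore runs as a systemd unit, i.e. the keadm install — for the binary install, tail edgecore.log instead):

    # Sketch: check that edgecore is running and look for connection errors
    systemctl status edgecore --no-pager
    journalctl -u edgecore -n 50 --no-pager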
  4. Verify the node has joined
    On a cloud-side master node, run:

    root@master01:/home/ubuntu# kubectl get nodes
    NAME       STATUS   ROLES        AGE   VERSION
    edge01     Ready    agent,edge   10h   v1.19.3-kubeedge-v1.6.1
    master01   Ready    master       53d   v1.19.8
    master02   Ready    master       53d   v1.19.8
    master03   Ready    master       53d   v1.19.8
    node01     Ready    worker       53d   v1.19.8
    node02     Ready    worker       53d   v1.19.8

    Automatic registration is enabled on my cloudcore, so the edge node shows up as already registered.
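    The agent,edge roles come from node labels; the node-role.kubernetes.io/edge label is what the affinity rules later in this post key on, and it can be confirmed with (a small sketch):

    # Sketch: list the edge node's labels, including node-role.kubernetes.io/edge
    kubectl get node edge01 --show-labels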

Troubleshooting

However, when checking the pods running on the edge node, I found that it had automatically started calico, kube-proxy and nodelocaldns pods:

root@master01:/home/ubuntu# kubectl get pod -A -o wide | grep edge01
kube-system   calico-node-l2h8l    0/1   Init:Error   2   52s     172.31.100.15   edge01   <none>   <none>
kube-system   kube-proxy-m6rbk     1/1   Running      0   2m22s   172.31.100.15   edge01   <none>   <none>
kube-system   nodelocaldns-hr7fk   0/1   Error        2   30s     172.31.100.15   edge01   <none>   <none>

Where:

  • calico fails during init with an Error
  • nodelocaldns shows Error, reason: ContainersNotReady
  • kube-proxy deployed successfully
Note: Some other articles online say that a failed CNI plugin deployment does not affect the use of the edge node. In this test environment it actually does: I pushed a deployment to run nginx, and its Pod stayed in the Pending state.
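To see why such a Pod stays Pending, its events can be inspected (a sketch; the namespace and pod name are placeholders for whatever you deployed):

# Sketch: check the scheduling/creation events of the stuck pod
kubectl -n <namespace> describe pod <pod-name> | grep -A 10 Events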
  1. Cause analysis:

    • calico init error: an issue from December 2020 says CNI support is still under development and not yet available.
    • nodelocaldns: still to be investigated; presumably a compatibility or network plugin problem.
    • kube-proxy must not run on an edge node: if it is installed, starting edgecore logs the error Failed to check the running environment: Kube-proxy should not running on edge node when running edgecore
    https://github.com/kubeedge/k...
  2. Checking further, these pods are deployed by daemonsets:

    root@master01:/home/ubuntu# kubectl get daemonset -A
    NAMESPACE     NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
    kube-system   calico-node    5         5         5       5            5           kubernetes.io/os=linux   53d
    kube-system   kube-proxy     5         5         5       5            5           kubernetes.io/os=linux   53d
    kube-system   nodelocaldns   5         5         5       5            5           <none>                   53d
  3. Edit their yaml:

    kubectl edit daemonset -n kube-system calico-node
    kubectl edit daemonset -n kube-system kube-proxy
    kubectl edit daemonset -n kube-system nodelocaldns

    Add a node affinity block:

      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: node-role.kubernetes.io/edge
                      operator: DoesNotExist
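    Equivalently (an untested sketch), the same affinity can be applied non-interactively with kubectl patch; the JSON below mirrors the YAML above and is the same NoShedulePatchJson used by the script later in this post:

    # Sketch: patch all three daemonsets in one go instead of editing them interactively
    PATCH='{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"node-role.kubernetes.io/edge","operator":"DoesNotExist"}]}]}}}}}}}'
    for ds in calico-node kube-proxy nodelocaldns; do
        kubectl -n kube-system patch daemonset "$ds" --type merge --patch "$PATCH"
    done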
  4. Stop the edgecore service

    root@edge01:/usr/local/edge# systemctl stop edgecore
  5. Remove the edge node from the k8s cluster

    root@master01:/home/ubuntu# kubectl drain edge01 --delete-local-data --force --ignore-daemonsets
    root@master01:/home/ubuntu# kubectl delete node edge01
  6. Restart docker on the edge node

    root@edge01:/usr/local/edge# systemctl restart docker
  7. Restart edgecore

    root@edge01:/usr/local/edge# systemctl start edgecore

    At this point the edge node re-registers successfully, and no pods are running on it:

    root@master01:/home/ubuntu# kubectl get nodes
    NAME       STATUS   ROLES        AGE    VERSION
    edge01     Ready    agent,edge   8m3s   v1.19.3-kubeedge-v1.6.1
    master01   Ready    master       53d    v1.19.8
    master02   Ready    master       53d    v1.19.8
    master03   Ready    master       53d    v1.19.8
    node01     Ready    worker       53d    v1.19.8
    node02     Ready    worker       53d    v1.19.8
    root@master01:/home/ubuntu# kubectl get pod -A -o wide | grep edge01
    root@master01:/home/ubuntu#
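    An alternative to grepping the full pod list is to filter server-side by node name (a sketch doing the same check):

    # Sketch: list only the pods bound to the edge node
    kubectl get pods -A -o wide --field-selector spec.nodeName=edge01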
    Note:
    Likewise, any resource that should not run on the edge node also needs this affinity configured. When creating new resources (especially daemonsets and cronjobs), pick the target nodes carefully; otherwise pods will error out, or pods with restartPolicy: Always will keep restarting.

    Instead of editing manually, the following script can be used (I have not verified it myself; personally I think writing the script around resource types is better, so you know exactly what has been changed):
    https://github.com/kubesphere...

    #!/bin/bash
    NodeSelectorPatchJson='{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/master": "","node-role.kubernetes.io/worker": ""}}}}}'
    NoShedulePatchJson='{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"node-role.kubernetes.io/edge","operator":"DoesNotExist"}]}]}}}}}}}'

    edgenode="edgenode"
    if [ $1 ]; then
        edgenode="$1"
    fi

    namespaces=($(kubectl get pods -A -o wide | egrep -i $edgenode | awk '{print $1}'))
    pods=($(kubectl get pods -A -o wide | egrep -i $edgenode | awk '{print $2}'))
    length=${#namespaces[@]}

    for ((i=0; i<$length; i++)); do
        ns=${namespaces[$i]}
        pod=${pods[$i]}
        resources=$(kubectl -n $ns describe pod $pod | grep "Controlled By" | awk '{print $3}')
        echo "Patching for ns: ${namespaces[$i]}, resources: $resources"
        kubectl -n $ns patch $resources --type merge --patch "$NoShedulePatchJson"
        sleep 1
    done
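    If used, it would be invoked with the edge node name as its only argument (the file name patch-edge.sh is hypothetical):

    # Sketch: patch the controllers of every pod currently scheduled on edge01
    bash patch-edge.sh edge01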

    Trying a deployment on the edge node

    1. Write a deployment to run nginx as a test
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: nginx-edge
      namespace: test-ns
      labels:
        app: nginx-edge
      annotations:
        deployment.kubernetes.io/revision: '1'
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-edge
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: nginx-edge
        spec:
          containers:
            - name: nginx-edge01
              image: 'nginx:latest'
              ports:
                - name: tcp-80
                  containerPort: 80
                  protocol: TCP
              resources:
                limits:
                  cpu: 300m
                  memory: 200Mi
                requests:
                  cpu: 100m
                  memory: 10Mi
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              imagePullPolicy: IfNotPresent
          restartPolicy: Always
          terminationGracePeriodSeconds: 30
          dnsPolicy: ClusterFirst
          nodeSelector:
            kubernetes.io/hostname: edge01
          serviceAccountName: default
          serviceAccount: default
          securityContext: {}
          affinity: {}
          schedulerName: default-scheduler
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 25%
          maxSurge: 25%
      revisionHistoryLimit: 10
      progressDeadlineSeconds: 600
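    The manifest is applied from the master in the usual way (a sketch; the file name nginx-edge.yaml and the namespace creation step are illustrative):

    # Sketch: create the target namespace if it does not exist yet, then apply the manifest
    kubectl create namespace test-ns
    kubectl apply -f nginx-edge.yaml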
    2. Check the deployed nginx
    root@master01:/home/ubuntu# kubectl get pod -A -o wide | grep edge01
    test-ns   nginx-edge-946d96f44-n2h8v   1/1   Running   0   40s   172.17.0.2   edge01   <none>   <none>

At this point, nginx has been deployed successfully on the edge side.
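Note that with no working cluster CNI on the edge node, the Pod got a local docker bridge address (172.17.0.2 above), so a quick functional check is to curl it from the edge host itself (an untested sketch):

root@edge01:/usr/local/edge# curl -sI http://172.17.0.2 | head -n 1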