1. Problem description

While deploying a Kubernetes cluster with kubeadm, some step went wrong, leaving kubelet's port 10250 listening on the wrong protocol and address, as shown below:

[root@k8s-slave2 ~]# netstat -ntpl | grep 10250
tcp        0      0 127.0.0.1:10250         0.0.0.0:*               LISTEN      52577/kubelet

Checking the kubelet service also shows the port bound to 127.0.0.1:

[root@k8s-slave2 ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf, 20-etcd-service-manager.conf
   Active: active (running) since 二 2022-10-04 15:04:47 CST; 4 days ago
     Docs: https://kubernetes.io/docs/
 Main PID: 52577 (kubelet)
    Tasks: 16
   Memory: 51.4M
   CGroup: /system.slice/kubelet.service
           └─52577 /usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --network-plugin=cni --pod-infra-cont...
10月 05 08:46:43 k8s-slave2 kubelet[52577]: I1005 08:46:43.352978   52577 topology_manager.go:200] "Topology Admit Handler"
10月 05 08:46:43 k8s-slave2 kubelet[52577]: I1005 08:46:43.514029   52577 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 05 08:46:44 k8s-slave2 kubelet[52577]: map[string]interface {}{"cniVersion":"0.3.1", "hairpinMode":true, "ipMasq":false, "ipam":map[string]interface {}{"rang...
10月 05 11:13:17 k8s-slave2 kubelet[52577]: {"cniVersion":"0.3.1","hairpinMode":true,"ipMasq":false,"ipam":{"ranges":[[{"subnet":"10.244.1.0/24"}]],"routes":[{"ds...
10月 05 11:13:17 k8s-slave2 kubelet[52577]: I1005 11:13:17.399294   52577 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 05 11:13:17 k8s-slave2 kubelet[52577]: I1005 11:13:17.979964   52577 pod_container_deletor.go:79] "Container not found in pod's containers" contain...d0cda5a59
10月 05 11:13:18 k8s-slave2 kubelet[52577]: map[string]interface {}{"cniVersion":"0.3.1", "hairpinMode":true, "ipMasq":false, "ipam":map[string]interface {}{"rang...
10月 07 09:51:53 k8s-slave2 kubelet[52577]: {"cniVersion":"0.3.1","hairpinMode":true,"ipMasq":false,"ipam":{"ranges":[[{"subnet":"10.244.1.0/24"}]],"rou...go:187] fa
10月 07 09:51:56 k8s-slave2 kubelet[52577]: E1007 09:51:56.391455   52577 kubelet_node_status.go:460] "Error updating node status, will retry" err="error getting ...
10月 07 09:51:57 k8s-slave2 kubelet[52577]: E1007 09:51:57.185843   52577 controller.go:187] failed to update lease, error: Operation cannot be fulfille... try again
Hint: Some lines were ellipsized, use -l to show in full.

On a correctly deployed cluster, kubelet's port 10250 should instead look like this:

  • it uses tcp6, not tcp
  • it is bound to :: rather than 127.0.0.1

As shown below:

tcp6       0      0 :::10250                :::*                    LISTEN      3272/kubelet
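The local-address column alone is enough to tell the healthy and broken states apart. The helper below is a hypothetical sketch of that check, run here against the two sample listing lines from this article rather than against a live node:

```shell
#!/bin/sh
# classify_10250: hypothetical helper that inspects one netstat/ss listing line
# and reports how the kubelet port is bound.
classify_10250() {
  case "$1" in
    *":::10250"*)        echo "ok: wildcard (tcp6)" ;;       # bound to :: - reachable from other hosts
    *"127.0.0.1:10250"*) echo "bad: loopback only" ;;        # only local clients can connect
    *)                   echo "no 10250 listener in line" ;;
  esac
}

classify_10250 "tcp6       0      0 :::10250                :::*                    LISTEN      3272/kubelet"
classify_10250 "tcp        0      0 127.0.0.1:10250         0.0.0.0:*               LISTEN      52577/kubelet"
```

On a real node you would feed it `netstat -ntpl | grep 10250` (or `ss -tnlp`) output instead of literal strings.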

2. What kubelet port 10250 is for

A quick note on what port 10250 does:
Port 10250 serves the kubelet's HTTPS API and is the channel through which the kubelet and the apiserver exchange the node's workloads, resources, and status. In particular, when you run kubectl logs or kubectl exec, the apiserver connects to the kubelet on port 10250 to fetch the logs or run the command.

If port 10250 is not serving correctly, fetching logs fails with errors like this:

[root@k8s-slave2 ~]# kubectl logs kube-flannel-ds-9tfc8 -n kube-system
Error from server: Get "https://192.168.100.22:10250/containerLogs/kube-system/kube-flannel-ds-9tfc8/kube-flannel": dial tcp 192.168.100.22:10250: connect: connection refused

The error above makes it clear why binding 10250 to 127.0.0.1 cannot work: the apiserver dials the node's real IP, 192.168.100.22, and the connection is refused.
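To verify reachability from another machine (say, the control-plane node) without installing any extra tools, bash's built-in /dev/tcp pseudo-device can act as a minimal port probe. This is only a sketch; the IP and port below are the ones from this article, so substitute your own node's address:

```shell
#!/bin/bash
# probe <host> <port>: attempt a TCP connect via bash's /dev/tcp redirection.
# A refused or timed-out connection means nothing is listening on that address.
probe() {
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 reachable"
  else
    echo "$1:$2 not reachable"
  fi
}

# On a broken node, this is what the apiserver effectively experiences:
probe 192.168.100.22 10250
```

A `not reachable` result from the control plane, combined with a 127.0.0.1 binding in netstat on the node itself, confirms the listen-address problem rather than a firewall issue.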

3. Fixing the port 10250 binding

So how do we get port 10250 bound correctly?
The idea: go through kubelet's various configuration files and find where the 127.0.0.1 address is being set.

kubelet's configuration may involve one or more of the following files and paths:

  • /etc/kubernetes/kubelet.conf
  • /var/lib/kubelet/
  • /usr/lib/systemd/system/kubelet.service
  • /usr/lib/systemd/system/kubelet.service.d/
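One way to hunt for the offending setting is to grep those locations for the loopback address. The sketch below demonstrates the technique on a scratch copy of a drop-in file (the directory and file contents are recreated here purely for illustration, so it is safe to run anywhere):

```shell
#!/bin/sh
# Demonstrate the search on a throwaway directory instead of the live system.
dir=$(mktemp -d)
cat > "$dir/20-etcd-service-manager.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests
EOF

# On a real node you would point grep at the actual locations, e.g.:
#   grep -Rn "127.0.0.1" /etc/kubernetes/kubelet.conf /var/lib/kubelet/ \
#        /usr/lib/systemd/system/kubelet.service /usr/lib/systemd/system/kubelet.service.d/
grep -Rln -- "--address=127.0.0.1" "$dir"   # prints the file that sets the loopback address

rm -rf "$dir"
```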

The simplest approach is to run systemctl status kubelet, which shows exactly which configuration files the unit pulls in. That narrowed it down to:

/usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf

Its contents are below; you can see this is where kubelet's listen address is set to 127.0.0.1:

[root@k8s-slave2 ~]# cat /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
# Replace "systemd" below with the cgroup driver used by your container runtime.
# kubelet's default is "cgroupfs".
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.2

A first attempt: change the address to the host's IP, 192.168.100.22, restart kubelet, and check port 10250 again:

[root@k8s-slave2 ~]# netstat -ntpl | grep kubelet
tcp        0      0 127.0.0.1:38362         0.0.0.0:*               LISTEN      16424/kubelet
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      16424/kubelet
tcp        0      0 192.168.100.22:10250    0.0.0.0:*               LISTEN      16424/kubelet

But that is still not right: the port is still bound over tcp rather than tcp6, and local clients on 127.0.0.1 also need to reach 10250.
The fix that finally worked was to disable the drop-in /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf altogether (here by renaming it out of the way). After restarting kubelet and checking again, the port is bound correctly:

[root@k8s-slave2 kubelet.service.d]# mv 20-etcd-service-manager.conf 20-etcd-service-manager.conf.bak
[root@k8s-slave2 kubelet.service.d]# systemctl daemon-reload
[root@k8s-slave2 kubelet.service.d]# systemctl restart kubelet
[root@k8s-slave2 kubelet.service.d]# netstat -ntpl | grep kubelet
tcp        0      0 127.0.0.1:39386         0.0.0.0:*               LISTEN      18890/kubelet
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      18890/kubelet
tcp6       0      0 :::10250                :::*                    LISTEN      18890/kubelet

Checking the kubelet service status again, the earlier --address=127.0.0.1 flag is gone as well:

[root@k8s-slave2 kubelet.service.d]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since 日 2022-10-09 13:59:45 CST; 35min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 18890 (kubelet)
    Tasks: 15
   Memory: 51.0M
   CGroup: /system.slice/kubelet.service
           └─18890 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubel...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143058   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143072   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143086   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143100   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143113   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143129   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143143   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143152   18890 reconciler.go:157] "Reconciler: start to sync state"
10月 09 13:59:48 k8s-slave2 kubelet[18890]: I1009 13:59:48.317492   18890 request.go:665] Waited for 1.071720304s due to client-side throttling, not pri...roxy/token
10月 09 14:27:07 k8s-slave2 kubelet[18890]: I1009 14:27:07.198435   18890 log.go:184] http: superfluous response.WriteHeader call from k8s.io/kubernetes...se.go:220)
Hint: Some lines were ellipsized, use -l to show in full.

Fetching pod logs from this node now works again:

[root@k8s-slave2 ~]# kubectl logs kube-flannel-ds-8mwsd -n kube-system
I1003 11:37:08.049753       1 main.go:207] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[ens33] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W1003 11:37:08.050009       1 client_config.go:614] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1003 11:37:08.451790       1 kube.go:121] Waiting 10m0s for node controller to sync
I1003 11:37:08.451939       1 kube.go:402] Starting kube subnet manager
I1003 11:37:09.452166       1 kube.go:128] Node controller sync successful
I1003 11:37:09.452199       1 main.go:227] Created subnet manager: Kubernetes Subnet Manager - k8s-slave2
I1003 11:37:09.452206       1 main.go:230] Installing signal handlers
I1003 11:37:09.452354       1 main.go:463] Found network config - Backend type: vxlan
I1003 11:37:09.452652       1 match.go:248] Using interface with name ens33 and address 192.168.100.22
I1003 11:37:09.452676       1 match.go:270] Defaulting external address to interface address (192.168.100.22)
I1003 11:37:09.452733       1 vxlan.go:138] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
I1003 11:37:09.472457       1 kube.go:351] Setting NodeNetworkUnavailable
I1003 11:37:09.481646       1 main.go:412] Current network or subnet (10.244.0.0/16, 10.244.2.0/24) is not equal to previous one (0.0.0.0/0, 0.0.0.0/0), trying to recycle old iptables rules
I1003 11:37:09.746603       1 iptables.go:255] Deleting iptables rule: -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.747961       1 iptables.go:255] Deleting iptables rule: -s 0.0.0.0/0 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.748691       1 iptables.go:255] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.749351       1 iptables.go:255] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.750317       1 main.go:341] Setting up masking rules
I1003 11:37:09.750945       1 main.go:362] Changing default FORWARD chain policy to ACCEPT
I1003 11:37:09.750995       1 main.go:375] Wrote subnet file to /run/flannel/subnet.env
I1003 11:37:09.751000       1 main.go:379] Running backend.
I1003 11:37:09.845326       1 vxlan_network.go:61] watching for new subnet leases
I1003 11:37:09.846711       1 main.go:400] Waiting for all goroutines to exit
I1003 11:37:09.847083       1 iptables.go:231] Some iptables rules are missing; deleting and recreating rules
I1003 11:37:09.847088       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.943307       1 iptables.go:231] Some iptables rules are missing; deleting and recreating rules
I1003 11:37:09.943324       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.943450       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.944459       1 iptables.go:255] Deleting iptables rule: -d 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.945214       1 iptables.go:255] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.945335       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.946028       1 iptables.go:255] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.947474       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.948330       1 iptables.go:243] Adding iptables rule: -d 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:10.044998       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:10.047373       1 iptables.go:243] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -m comment --comment flanneld masq -j RETURN
I1003 11:37:10.049161       1 iptables.go:243] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j MASQUERADE --random-fully