1. Problem description
While deploying a k8s cluster with kubeadm, some step went wrong along the way, leaving the kubelet's port 10250 listening on the wrong protocol and address, as shown below:
```
[root@k8s-slave2 ~]# netstat -ntpl | grep 10250
tcp        0      0 127.0.0.1:10250    0.0.0.0:*    LISTEN    52577/kubelet
```
The kubelet service status shows the same binding to 127.0.0.1:
```
[root@k8s-slave2 ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf, 20-etcd-service-manager.conf
   Active: active (running) since 二 2022-10-04 15:04:47 CST; 4 days ago
     Docs: https://kubernetes.io/docs/
 Main PID: 52577 (kubelet)
    Tasks: 16
   Memory: 51.4M
   CGroup: /system.slice/kubelet.service
           └─52577 /usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --network-plugin=cni --pod-infra-cont...

10月 05 08:46:43 k8s-slave2 kubelet[52577]: I1005 08:46:43.352978   52577 topology_manager.go:200] "Topology Admit Handler"
10月 05 08:46:43 k8s-slave2 kubelet[52577]: I1005 08:46:43.514029   52577 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 05 08:46:44 k8s-slave2 kubelet[52577]: map[string]interface {}{"cniVersion":"0.3.1", "hairpinMode":true, "ipMasq":false, "ipam":map[string]interface {}{"rang...
10月 05 11:13:17 k8s-slave2 kubelet[52577]: {"cniVersion":"0.3.1","hairpinMode":true,"ipMasq":false,"ipam":{"ranges":[[{"subnet":"10.244.1.0/24"}]],"routes":[{"ds...
10月 05 11:13:17 k8s-slave2 kubelet[52577]: I1005 11:13:17.399294   52577 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 05 11:13:17 k8s-slave2 kubelet[52577]: I1005 11:13:17.979964   52577 pod_container_deletor.go:79] "Container not found in pod's containers" contain...d0cda5a59"
10月 05 11:13:18 k8s-slave2 kubelet[52577]: map[string]interface {}{"cniVersion":"0.3.1", "hairpinMode":true, "ipMasq":false, "ipam":map[string]interface {}{"rang...
10月 07 09:51:53 k8s-slave2 kubelet[52577]: {"cniVersion":"0.3.1","hairpinMode":true,"ipMasq":false,"ipam":{"ranges":[[{"subnet":"10.244.1.0/24"}]],"rou...go:187] fa
10月 07 09:51:56 k8s-slave2 kubelet[52577]: E1007 09:51:56.391455   52577 kubelet_node_status.go:460] "Error updating node status, will retry" err="error getting ...
10月 07 09:51:57 k8s-slave2 kubelet[52577]: E1007 09:51:57.185843   52577 controller.go:187] failed to update lease, error: Operation cannot be fulfille... try again
Hint: Some lines were ellipsized, use -l to show in full.
```
On a correctly deployed cluster, the kubelet's port 10250 should instead be:

- listening over tcp6, not tcp
- bound to the wildcard address ::, not 127.0.0.1

As shown below:
```
tcp6       0      0 :::10250           :::*         LISTEN    3272/kubelet
```
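On distributions where netstat is no longer shipped, ss reports the same information; an equivalent check:

```
# Equivalent check with ss: -n numeric, -t TCP, -p process, -l listening sockets.
ss -ntpl | grep 10250
```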
2. About kubelet port 10250
A quick aside on what port 10250 is for: it is the kubelet's HTTPS API port and the channel through which the apiserver talks to the kubelet. The apiserver connects to this port to query node and pod resources and status, and kubectl subcommands such as logs and exec are likewise served by the kubelet through port 10250.
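To sanity-check whether the kubelet API is reachable at all, it can be probed directly from another node. A minimal sketch, using this article's node IP and assuming anonymous access is disabled, so an HTTP 401/403 is the healthy response while "connection refused" reproduces the symptom:

```
# Probe the kubelet API over HTTPS; -k skips certificate verification.
# A 401/403 status proves the listener is up; "connection refused" means it is not.
curl -ks -o /dev/null -w '%{http_code}\n' https://192.168.100.22:10250/healthz
```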
If port 10250 is not listening correctly, you run into failures like the following, where pod logs cannot be retrieved:
```
# kubectl logs kube-flannel-ds-9tfc8 -n kube-system
Error from server: Get "https://192.168.100.22:10250/containerLogs/kube-system/kube-flannel-ds-9tfc8/kube-flannel": dial tcp 192.168.100.22:10250: connect: connection refused
```
The error message makes the cause plain: the apiserver dials the node's own IP (192.168.100.22) on port 10250, so a kubelet listening only on 127.0.0.1 can never answer.
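The 192.168.100.22 in that error is the node's InternalIP as recorded by the apiserver, which is the address it dials for port 10250. A quick way to confirm it (node name taken from the logs above):

```
# Print the addresses the apiserver uses to reach this node's kubelet.
kubectl get node k8s-slave2 -o jsonpath='{.status.addresses}'
```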
3. Fixing the 10250 listen address
So how do we get port 10250 listening correctly again?
The approach: go through the kubelet's various configuration sources and find where this 127.0.0.1 is set.
The relevant files and directories may include one or more of the following (a grep sketch follows the list):
- /etc/kubernetes/kubelet.conf
- /var/lib/kubelet/
- /usr/lib/systemd/system/kubelet.service
- /usr/lib/systemd/system/kubelet.service.d/
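A brute-force way to find the setting is to grep those locations for the hard-coded loopback address; a minimal sketch:

```
# Search all likely kubelet configuration locations for a hard-coded 127.0.0.1.
grep -rn "127.0.0.1" \
    /etc/kubernetes/kubelet.conf \
    /var/lib/kubelet/ \
    /usr/lib/systemd/system/kubelet.service \
    /usr/lib/systemd/system/kubelet.service.d/
```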
The simplest method, though, is to run systemctl status kubelet and see exactly which configuration files the service loads. That narrows the culprit down to:
/usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
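systemd can also print the unit file together with the full contents of every drop-in it applies, which makes overrides like this easy to spot:

```
# Show kubelet.service plus all drop-ins, in the order systemd applies them.
systemctl cat kubelet
```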
Its contents are shown below; this is where the kubelet's listen address is pinned to 127.0.0.1:
```
[root@k8s-slave2 ~]# cat /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
# Replace "systemd" below with the cgroup driver of your container runtime.
# The kubelet default is "cgroupfs".
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.2
```
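A first, naive fix is to point --address at the host IP instead of the loopback. A sketch of that edit (the sed expression here is illustrative, not from the original session):

```
# Hypothetical in-place edit: swap the loopback address for the node IP,
# then reload systemd so the changed ExecStart takes effect.
sed -i 's/--address=127.0.0.1/--address=192.168.100.22/' \
    /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
systemctl daemon-reload
systemctl restart kubelet
```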
With the address changed to the host IP 192.168.100.22 and the kubelet restarted, port 10250 now looks like this:
```
[root@k8s-slave2 ~]# netstat -ntpl | grep kubelet
tcp        0      0 127.0.0.1:38362         0.0.0.0:*    LISTEN    16424/kubelet
tcp        0      0 127.0.0.1:10248         0.0.0.0:*    LISTEN    16424/kubelet
tcp        0      0 192.168.100.22:10250    0.0.0.0:*    LISTEN    16424/kubelet
```
But this is still not right: the port remains bound over plain tcp rather than tcp6, and local clients that reach the kubelet via 127.0.0.1 now lose access to port 10250 entirely.
The fix that finally worked was to disable the drop-in /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf entirely (here by renaming it out of the way). That drop-in comes from the kubeadm procedure for running a standalone etcd node: it overrides ExecStart and deliberately pins the kubelet to 127.0.0.1, so it does not belong on a regular cluster node, where it shadows the normal 10-kubeadm.conf settings. After removing it, reload systemd and restart the kubelet; the port now listens correctly:
```
[root@k8s-slave2 kubelet.service.d]# mv 20-etcd-service-manager.conf 20-etcd-service-manager.conf.bak
[root@k8s-slave2 kubelet.service.d]# systemctl daemon-reload
[root@k8s-slave2 kubelet.service.d]# systemctl restart kubelet
[root@k8s-slave2 kubelet.service.d]# netstat -ntpl | grep kubelet
tcp        0      0 127.0.0.1:39386    0.0.0.0:*    LISTEN    18890/kubelet
tcp        0      0 127.0.0.1:10248    0.0.0.0:*    LISTEN    18890/kubelet
tcp6       0      0 :::10250           :::*         LISTEN    18890/kubelet
```
Checking the kubelet service status again, the earlier --address=127.0.0.1 is gone, and only 10-kubeadm.conf is loaded as a drop-in:
```
[root@k8s-slave2 kubelet.service.d]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since 日 2022-10-09 13:59:45 CST; 35min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 18890 (kubelet)
    Tasks: 15
   Memory: 51.0M
   CGroup: /system.slice/kubelet.service
           └─18890 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubel...

10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143058   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143072   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143086   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143100   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143113   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143129   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143143   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
10月 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143152   18890 reconciler.go:157] "Reconciler: start to sync state"
10月 09 13:59:48 k8s-slave2 kubelet[18890]: I1009 13:59:48.317492   18890 request.go:665] Waited for 1.071720304s due to client-side throttling, not pri...roxy/token
10月 09 14:27:07 k8s-slave2 kubelet[18890]: I1009 14:27:07.198435   18890 log.go:184] http: superfluous response.WriteHeader call from k8s.io/kubernetes...se.go:220)
Hint: Some lines were ellipsized, use -l to show in full.
```
Fetching pod logs from this node now works again:
```
# kubectl logs kube-flannel-ds-8mwsd -n kube-system
I1003 11:37:08.049753       1 main.go:207] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[ens33] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W1003 11:37:08.050009       1 client_config.go:614] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I1003 11:37:08.451790       1 kube.go:121] Waiting 10m0s for node controller to sync
I1003 11:37:08.451939       1 kube.go:402] Starting kube subnet manager
I1003 11:37:09.452166       1 kube.go:128] Node controller sync successful
I1003 11:37:09.452199       1 main.go:227] Created subnet manager: Kubernetes Subnet Manager - k8s-slave2
I1003 11:37:09.452206       1 main.go:230] Installing signal handlers
I1003 11:37:09.452354       1 main.go:463] Found network config - Backend type: vxlan
I1003 11:37:09.452652       1 match.go:248] Using interface with name ens33 and address 192.168.100.22
I1003 11:37:09.452676       1 match.go:270] Defaulting external address to interface address (192.168.100.22)
I1003 11:37:09.452733       1 vxlan.go:138] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
I1003 11:37:09.472457       1 kube.go:351] Setting NodeNetworkUnavailable
I1003 11:37:09.481646       1 main.go:412] Current network or subnet (10.244.0.0/16, 10.244.2.0/24) is not equal to previous one (0.0.0.0/0, 0.0.0.0/0), trying to recycle old iptables rules
I1003 11:37:09.746603       1 iptables.go:255] Deleting iptables rule: -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.747961       1 iptables.go:255] Deleting iptables rule: -s 0.0.0.0/0 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.748691       1 iptables.go:255] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.749351       1 iptables.go:255] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.750317       1 main.go:341] Setting up masking rules
I1003 11:37:09.750945       1 main.go:362] Changing default FORWARD chain policy to ACCEPT
I1003 11:37:09.750995       1 main.go:375] Wrote subnet file to /run/flannel/subnet.env
I1003 11:37:09.751000       1 main.go:379] Running backend.
I1003 11:37:09.845326       1 vxlan_network.go:61] watching for new subnet leases
I1003 11:37:09.846711       1 main.go:400] Waiting for all goroutines to exit
I1003 11:37:09.847083       1 iptables.go:231] Some iptables rules are missing; deleting and recreating rules
I1003 11:37:09.847088       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.943307       1 iptables.go:231] Some iptables rules are missing; deleting and recreating rules
I1003 11:37:09.943324       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.943450       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.944459       1 iptables.go:255] Deleting iptables rule: -d 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.945214       1 iptables.go:255] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.945335       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.946028       1 iptables.go:255] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.947474       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.948330       1 iptables.go:243] Adding iptables rule: -d 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:10.044998       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:10.047373       1 iptables.go:243] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -m comment --comment flanneld masq -j RETURN
I1003 11:37:10.049161       1 iptables.go:243] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j MASQUERADE --random-fully
```