Kubernetes provides the top command for checking resource usage. It has two subcommands, node and pod, which display resource usage for node and Pod objects respectively.

kubectl top depends on the Metrics API. Kubernetes does not install it by default, so it has to be deployed separately:

[root@k8s-master k8s-install]# kubectl top pod
error: Metrics API not available
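Before installing anything, you can confirm that the Metrics API really is not registered with the apiserver. A quick check, assuming metrics-server's standard APIService name v1beta1.metrics.k8s.io:

# Both commands come back empty / NotFound on a cluster without metrics-server
kubectl api-versions | grep metrics.k8s.io
kubectl get apiservice v1beta1.metrics.k8s.io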

Installation procedure

1. Download the deployment manifest

Download the metrics-server deployment manifest and save it as metrics-server-components.yaml:

[root@k8s-master k8s-install]# wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metrics-server-components.yaml
--2022-10-11 00:13:01--  https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github.com/kubernetes-sigs/metrics-server/releases/download/metrics-server-helm-chart-3.8.2/components.yaml [following]
--2022-10-11 00:13:01--  https://github.com/kubernetes-sigs/metrics-server/releases/download/metrics-server-helm-chart-3.8.2/components.yaml
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/92132038/d85e100a-2404-4c5e-b6a9-f3814ad4e6e5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20221010%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221010T161303Z&X-Amz-Expires=300&X-Amz-Signature=efa1ff5dd16b6cd86b6186adb3b4c72afed8197bdf08e2bffcd71b9118137831&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=92132038&response-content-disposition=attachment%3B%20filename%3Dcomponents.yaml&response-content-type=application%2Foctet-stream [following]
--2022-10-11 00:13:02--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/92132038/d85e100a-2404-4c5e-b6a9-f3814ad4e6e5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20221010%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221010T161303Z&X-Amz-Expires=300&X-Amz-Signature=efa1ff5dd16b6cd86b6186adb3b4c72afed8197bdf08e2bffcd71b9118137831&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=92132038&response-content-disposition=attachment%3B%20filename%3Dcomponents.yaml&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.108.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4181 (4.1K) [application/octet-stream]
Saving to: 'metrics-server-components.yaml'

100%[==========================================================>] 4,181       --.-K/s   in 0.01s

2022-10-11 00:13:10 (385 KB/s) - 'metrics-server-components.yaml' saved [4181/4181]

2. Change the image address

In the manifest, change the image address to a registry mirror reachable from mainland China. The image line is at around line 140 of the file.

The original setting is:

image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1

After the change it is:

image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1

The change can be made with the following command (note that a delimiter other than / is needed, since the image paths themselves contain slashes):

sed -i 's#k8s.gcr.io/metrics-server#registry.cn-hangzhou.aliyuncs.com/google_containers#g' metrics-server-components.yaml
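After running sed, it is worth confirming that the substitution landed where expected:

# Should now print the aliyuncs.com mirror address, not k8s.gcr.io
grep -n 'image:' metrics-server-components.yaml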

3. Deploy the Metrics API

[root@k8s-master k8s-install]# kubectl create -f metrics-server-components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
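The rollout can be watched with kubectl's built-in status command; in this walkthrough it will just hang, because (as shown next) the pod never becomes ready:

# Blocks until the deployment is fully available (Ctrl-C to abort)
kubectl -n kube-system rollout status deployment/metrics-server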

Check whether the metrics-server pod is running:

[root@k8s-master k8s-install]# kubectl get pods --all-namespaces | grep metrics
kube-system   metrics-server-6ffc8966f5-84hbb      0/1     Running   0              2m23s

Describing the pod shows that a probe is failing: Readiness probe failed: HTTP probe failed with statuscode: 500

[root@k8s-master k8s-install]# kubectl describe pod metrics-server-6ffc8966f5-84hbb -n kube-system
Name:                 metrics-server-6ffc8966f5-84hbb
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s-slave2/192.168.100.22
Start Time:           Tue, 11 Oct 2022 00:27:33 +0800
Labels:               k8s-app=metrics-server
                      pod-template-hash=6ffc8966f5
Annotations:          <none>
Status:               Running
IP:                   10.244.2.9
IPs:
  IP:           10.244.2.9
Controlled By:  ReplicaSet/metrics-server-6ffc8966f5
Containers:
  metrics-server:
    Container ID:  docker://e913a075e0381b98eabfb6e298f308ef69dfbd7c672bdcfb75bb2ff3e4b5a0a4
    Image:         registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1
    Image ID:      docker-pullable://registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server@sha256:5ddc6458eb95f5c70bd13fdab90cbd7d6ad1066e5b528ad1dcb28b76c5fb2f00
    Port:          4443/TCP
    Host Port:     0/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --kubelet-use-node-status-port
      --metric-resolution=15s
    State:          Running
      Started:      Tue, 11 Oct 2022 00:27:45 +0800
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     200Mi
    Liveness:     http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get https://:https/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-x2spb (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-x2spb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                     From               Message
  ----     ------     ----                    ----               -------
  Normal   Scheduled  7m27s                   default-scheduler  Successfully assigned kube-system/metrics-server-6ffc8966f5-84hbb to k8s-slave2
  Normal   Pulling    7m26s                   kubelet            Pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1"
  Normal   Pulled     7m15s                   kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1" in 10.976606194s
  Normal   Created    7m15s                   kubelet            Created container metrics-server
  Normal   Started    7m15s                   kubelet            Started container metrics-server
  Warning  Unhealthy  2m17s (x31 over 6m47s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500

Next, look at the pod's logs:

[root@k8s-master k8s-install]# kubectl logs metrics-server-6ffc8966f5-84hbb -n kube-system
I1010 16:27:46.228594       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1010 16:27:46.633494       1 secure_serving.go:266] Serving securely on [::]:4443
I1010 16:27:46.633585       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1010 16:27:46.633616       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1010 16:27:46.633653       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I1010 16:27:46.634221       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W1010 16:27:46.634296       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I1010 16:27:46.634365       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1010 16:27:46.634370       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1010 16:27:46.634409       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1010 16:27:46.634415       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
E1010 16:27:46.641663       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.22:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.22 because it doesn't contain any IP SANs" node="k8s-slave2"
E1010 16:27:46.645389       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.20:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.20 because it doesn't contain any IP SANs" node="k8s-master"
E1010 16:27:46.652261       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.21:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.21 because it doesn't contain any IP SANs" node="k8s-slave1"
I1010 16:27:46.733747       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1010 16:27:46.735167       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1010 16:27:46.735194       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E1010 16:28:01.643646       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.22:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.22 because it doesn't contain any IP SANs" node="k8s-slave2"
E1010 16:28:01.643805       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.21:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.21 because it doesn't contain any IP SANs" node="k8s-slave1"
E1010 16:28:01.646721       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.20:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.20 because it doesn't contain any IP SANs" node="k8s-master"
I1010 16:28:13.397373       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"

This pins down why the pod is unhealthy: after the metrics-server container starts, the readiness probe's HTTP GET never succeeds because the server cannot scrape any kubelet. The underlying error is: cannot validate certificate for 192.168.100.22 because it doesn't contain any IP SANs (the same error appears for every node).
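To verify from the command line that the kubelet serving certificate really carries no IP SANs, the certificate can be inspected directly (a sketch, assuming openssl is available on the master; 192.168.100.22 is the k8s-slave2 address from the logs):

# Dump the kubelet's serving certificate and look for the SAN extension;
# if nothing is printed, the certificate has no Subject Alternative Names
echo | openssl s_client -connect 192.168.100.22:10250 2>/dev/null \
  | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'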

The metrics-server documentation (https://github.com/kubernetes-sigs/metrics-server) states the following requirement:

Kubelet certificate needs to be signed by cluster Certificate Authority (or disable certificate validation by passing --kubelet-insecure-tls to Metrics Server)

In other words, the kubelet certificate needs to be signed by the cluster Certificate Authority, or certificate validation must be disabled by passing --kubelet-insecure-tls to Metrics Server.
Since this is a test environment, we disable certificate validation with that flag. This is not recommended for production!
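For completeness, the production-grade route is the first option from the docs: have each kubelet request a serving certificate signed by the cluster CA. A minimal sketch for a kubeadm cluster (paths and steps follow the upstream Kubernetes kubelet-serving-certificate docs; verify against your version before relying on it):

# On every node: enable kubelet serving-certificate bootstrap, then restart
echo 'serverTLSBootstrap: true' >> /var/lib/kubelet/config.yaml
systemctl restart kubelet

# On the control plane: approve each pending kubelet serving CSR
kubectl get csr
kubectl certificate approve <csr-name>

Here we continue with the test-environment shortcut instead.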

Append the flag at roughly line 139 of the manifest:

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
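If you prefer not to edit the file by hand, the flag can be appended with sed as well (a sketch, assuming GNU sed and the eight-space list indentation used in the v0.6.1 manifest):

# Insert --kubelet-insecure-tls right after the --metric-resolution argument
sed -i 's/- --metric-resolution=15s/&\n        - --kubelet-insecure-tls/' metrics-server-components.yaml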

Apply the updated manifest:

[root@k8s-master k8s-install]# kubectl apply -f metrics-server-components.yaml
Warning: resource serviceaccounts/metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
serviceaccount/metrics-server configured
Warning: resource clusterroles/system:aggregated-metrics-reader is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader configured
Warning: resource clusterroles/system:metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrole.rbac.authorization.k8s.io/system:metrics-server configured
Warning: resource rolebindings/metrics-server-auth-reader is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader configured
Warning: resource clusterrolebindings/metrics-server:system:auth-delegator is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator configured
Warning: resource clusterrolebindings/system:metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server configured
Warning: resource services/metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
service/metrics-server configured
Warning: resource deployments/metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
deployment.apps/metrics-server configured
Warning: resource apiservices/v1beta1.metrics.k8s.io is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io configured

The warnings appear because the resources were originally created with kubectl create (without --save-config), so they lack the last-applied-configuration annotation; kubectl apply patches it in automatically, and they are harmless here. The metrics-server pod is now running normally:

[root@k8s-master k8s-install]# kubectl get pod -A | grep metrics
kube-system   metrics-server-fd9598766-8zphn       1/1     Running   0              89s

Running the kubectl top commands again now succeeds:

[root@k8s-master k8s-install]# kubectl top pod
NAME                            CPU(cores)   MEMORY(bytes)
front-end-59bc6df748-699vb      0m           3Mi
front-end-59bc6df748-r7pkr      0m           3Mi
kucc4                           1m           2Mi
legacy-app                      1m           1Mi
my-demo-nginx-998bbf8f5-9t9pw   0m           0Mi
my-demo-nginx-998bbf8f5-lfgvw   0m           0Mi
my-demo-nginx-998bbf8f5-nfn7r   1m           0Mi
nginx-kusc00401                 0m           3Mi
[root@k8s-master k8s-install]#
[root@k8s-master k8s-install]# kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   232m         5%     1708Mi          46%
k8s-slave1   29m          1%     594Mi           34%
k8s-slave2   25m          1%     556Mi           32%
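As a final sanity check, the Metrics API can also be queried directly through the apiserver; once metrics-server is healthy it answers with JSON:

# Raw node metrics straight from the aggregated API
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes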