Monitoring container resources in k8s
How to configure Prometheus to scrape container metrics
- Use the node role of Prometheus's kubernetes_sd_configs
    - job_name: kubernetes-nodes-cadvisor
      honor_timestamps: false
      scrape_interval: 30s
      scrape_timeout: 10s
      metrics_path: /metrics
      scheme: https
      kubernetes_sd_configs:
      - role: node
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      relabel_configs:
      - separator: ;
        regex: __meta_kubernetes_node_label_(.+)
        replacement: $1
        action: labelmap
      - separator: ;
        regex: (.*)
        target_label: __metrics_path__
        replacement: /metrics/cadvisor
        action: replace
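cAdvisor architecture diagram
The "POD" container in cAdvisor
- While reading the cAdvisor code I noticed containers with container_name=="POD"; these turn out to be the pause container that k8s starts for every pod
Next, let's trace how the labels get attached, using pod CPU usage as the example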
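Final labels from the kubelet
- The corresponding cAdvisor metric is container_cpu_usage_seconds_total, and the series returned by a query carry several labels
- That raises a question: how do the labels that identify an app or service (pod, pod_name, container, container_name) get attached?
Labels exposed by the cAdvisor integrated into the kubelet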
- curl localhost:4194/metrics
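- Apart from cpu, which is specific to the container_cpu_usage_seconds_total metric, the output also carries the labels id, name, namespace, pod_name, container_name and image
- These are cAdvisor's common labels and are attached to every metric
- Standalone cAdvisor, however, only attaches three labels: id, image and name
- Attributes such as namespace, pod_name and container_name exist only in k8s. cAdvisor on its own has no knowledge of pods, so these labels must be injected by k8s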
The cAdvisor built into the kubelet uses a custom PrometheusLabelsFunc
Taking k8s 1.15 as the example
The code is in: E:\go_path\src\github.com\kubernetes\kubernetes\pkg\kubelet\server\server.go
    func containerPrometheusLabelsFunc(s stats.Provider) metrics.ContainerLabelsFunc {
        // containerPrometheusLabels maps cAdvisor labels to prometheus labels.
        return func(c *cadvisorapi.ContainerInfo) map[string]string {
            // Prometheus requires that all metrics in the same family have the same labels,
            // so we arrange to supply blank strings for missing labels
            var name, image, podName, namespace, containerName string
            if len(c.Aliases) > 0 {
                name = c.Aliases[0]
            }
            image = c.Spec.Image
            if v, ok := c.Spec.Labels[kubelettypes.KubernetesPodNameLabel]; ok {
                podName = v
            }
            if v, ok := c.Spec.Labels[kubelettypes.KubernetesPodNamespaceLabel]; ok {
                namespace = v
            }
            if v, ok := c.Spec.Labels[kubelettypes.KubernetesContainerNameLabel]; ok {
                containerName = v
            }
            // Associate pod cgroup with pod so we have an accurate accounting of sandbox
            if podName == "" && namespace == "" {
                if pod, found := s.GetPodByCgroupfs(c.Name); found {
                    podName = pod.Name
                    namespace = pod.Namespace
                }
            }
            set := map[string]string{
                metrics.LabelID:    c.Name,
                metrics.LabelName:  name,
                metrics.LabelImage: image,
                "pod_name":         podName,
                "pod":              podName,
                "namespace":        namespace,
                "container_name":   containerName,
                "container":        containerName,
            }
            return set
        }
    }
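To try the mapping without a kubelet, here is a minimal standalone sketch. The `containerInfo` type and the `promLabels` function are hypothetical, trimmed-down stand-ins for `cadvisorapi.ContainerInfo` and the kubelet function above, the pod cgroup lookup is left out, and the sample cgroup path, alias and image are made up. The three label keys are the string values I believe the `kubelettypes` constants hold.

```go
package main

import "fmt"

// Label keys the kubelet puts on the containers it starts
// (assumed values of the kubelettypes constants used above).
const (
	podNameLabel       = "io.kubernetes.pod.name"
	podNamespaceLabel  = "io.kubernetes.pod.namespace"
	containerNameLabel = "io.kubernetes.container.name"
)

// containerInfo is a hypothetical, simplified stand-in for cadvisorapi.ContainerInfo.
type containerInfo struct {
	Name    string            // cgroup path, exported as the "id" label
	Aliases []string          // first alias is exported as the "name" label
	Image   string            // exported as the "image" label
	Labels  map[string]string // container labels set by the runtime
}

// promLabels mirrors the label mapping of the kubelet function,
// minus the GetPodByCgroupfs fallback. Missing labels become "".
func promLabels(c containerInfo) map[string]string {
	name := ""
	if len(c.Aliases) > 0 {
		name = c.Aliases[0]
	}
	pod := c.Labels[podNameLabel]
	container := c.Labels[containerNameLabel]
	return map[string]string{
		"id":             c.Name,
		"name":           name,
		"image":          c.Image,
		"namespace":      c.Labels[podNamespaceLabel],
		"pod_name":       pod,
		"pod":            pod,
		"container_name": container,
		"container":      container,
	}
}

func main() {
	// A made-up pause container belonging to pod default/web-0.
	pause := containerInfo{
		Name:    "/kubepods/pod-example/pause-example",
		Aliases: []string{"k8s_POD_web-0_default"},
		Image:   "pause-image-example",
		Labels: map[string]string{
			podNameLabel:       "web-0",
			podNamespaceLabel:  "default",
			containerNameLabel: "POD",
		},
	}
	labels := promLabels(pause)
	fmt.Println(labels["namespace"], labels["pod"], labels["container"])
}
```

The point of the sketch is that cAdvisor itself only supplies id, name and image; everything pod-related is read from container labels that the kubelet set when it created the container.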
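Changes to pod and pod_name between k8s 1.15 and 1.16
- Observation shows that 1.15 exposes both pod and pod_name, while 1.16 exposes only pod
- This is because k8s 1.16 changed the labels in order to unify them between the cAdvisor and kube-stats metrics, see [this PR] (https://github.com/kubernetes...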
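Queries written against 1.15 that still select on pod_name or container_name stop matching after the upgrade. As a rough aid, the sketch below rewrites the legacy label names in a query string. `migrateQuery` is a hypothetical helper, and it is a naive textual rewrite rather than a PromQL parser, so it would also touch a label value that happens to contain the word pod_name.

```go
package main

import (
	"fmt"
	"regexp"
)

// legacyLabel matches the standalone words pod_name and container_name.
// \b is an ASCII word boundary, and "_" counts as a word character, so
// identifiers such as kube_pod_name are left alone.
var legacyLabel = regexp.MustCompile(`\b(pod|container)_name\b`)

// migrateQuery replaces pod_name with pod and container_name with container.
func migrateQuery(q string) string {
	return legacyLabel.ReplaceAllString(q, "${1}")
}

func main() {
	old := `sum(rate(container_cpu_usage_seconds_total{container_name!="POD"}[5m])) by (pod_name)`
	fmt.Println(migrateQuery(old))
}
```

Because 1.15 already exposes both spellings, the rewritten queries should work on either version, which makes it possible to migrate dashboards before upgrading the cluster.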
At startup, the kubelet can inject node-level labels with --node-labels
--node-labels=os.name=xxxx,os.version=xxxx,os.architecture=amd64,
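- These labels are converted to Prometheus's naming style xxx_xxx: characters that are not allowed in a Prometheus label name, such as ".", are replaced with "_"
- The labelmap rule in the scrape config then appends them to every series as os_version=xxx,os_architecture=amd64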
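The two steps above can be sketched in a few lines of Go. This is an illustration of the behaviour, not Prometheus's actual code: `metaLabelName` and `labelmap` are hypothetical helpers, and the regex is the one from the scrape config, anchored the way I understand Prometheus anchors relabel regexes.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Any character outside [a-zA-Z0-9_] is not valid in a label name.
	invalidLabelChar = regexp.MustCompile(`[^a-zA-Z0-9_]`)
	// The labelmap regex from the scrape config, fully anchored.
	nodeLabelRegex = regexp.MustCompile(`^(?:__meta_kubernetes_node_label_(.+))$`)
)

// metaLabelName builds the discovery label name for a node label key,
// replacing invalid characters with "_".
func metaLabelName(nodeLabelKey string) string {
	return "__meta_kubernetes_node_label_" + invalidLabelChar.ReplaceAllString(nodeLabelKey, "_")
}

// labelmap copies every label whose name matches the regex to a new
// label named after the first capture group, keeping the value.
func labelmap(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
		if m := nodeLabelRegex.FindStringSubmatch(k); m != nil {
			out[m[1]] = v
		}
	}
	return out
}

func main() {
	// Node labels as passed to the kubelet via --node-labels.
	nodeLabels := map[string]string{"os.version": "xxx", "os.architecture": "amd64"}

	discovered := map[string]string{}
	for k, v := range nodeLabels {
		discovered[metaLabelName(k)] = v
	}

	target := labelmap(discovered)
	fmt.Println(target["os_version"], target["os_architecture"])
}
```

Labels starting with a double underscore are dropped after relabeling, so only the mapped copies such as os_version end up on the stored series.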