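k8s monitoring architecture
Metric overview
- System metrics
Split into node/container resource usage and the resources run by DaemonSets
- Service metrics
Split into metrics produced by Kubernetes infrastructure components and metrics produced by application pods
kube-state-metrics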
- job_name: kube-state-metrics
  honor_timestamps: false
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  static_configs:
  - targets:
    - kube-state-metrics.kube-admin:8080
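What is the k8s API server
The k8s API Server exposes HTTP REST interfaces for creating, deleting, updating, querying, and watching every kind of k8s resource object (Pod, RC, Service, and so on). It is the data bus and data hub of the whole system.
How collection works
kube-state-metrics uses client-go
to talk to the Kubernetes cluster, continuously listing and watching the api-server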
- Initialize the metric store family
// E:\go_path\src\k8s.io\kube-state-metrics\internal\store\builder.go
var availableStores = map[string]func(f *Builder) cache.Store{
	"certificatesigningrequests":      func(b *Builder) cache.Store { return b.buildCsrStore() },
	"configmaps":                      func(b *Builder) cache.Store { return b.buildConfigMapStore() },
	"cronjobs":                        func(b *Builder) cache.Store { return b.buildCronJobStore() },
	"daemonsets":                      func(b *Builder) cache.Store { return b.buildDaemonSetStore() },
	"deployments":                     func(b *Builder) cache.Store { return b.buildDeploymentStore() },
	"endpoints":                       func(b *Builder) cache.Store { return b.buildEndpointsStore() },
	"horizontalpodautoscalers":        func(b *Builder) cache.Store { return b.buildHPAStore() },
	"ingresses":                       func(b *Builder) cache.Store { return b.buildIngressStore() },
	"jobs":                            func(b *Builder) cache.Store { return b.buildJobStore() },
	"leases":                          func(b *Builder) cache.Store { return b.buildLeases() },
	"limitranges":                     func(b *Builder) cache.Store { return b.buildLimitRangeStore() },
	"mutatingwebhookconfigurations":   func(b *Builder) cache.Store { return b.buildMutatingWebhookConfigurationStore() },
	"namespaces":                      func(b *Builder) cache.Store { return b.buildNamespaceStore() },
	"networkpolicies":                 func(b *Builder) cache.Store { return b.buildNetworkPolicyStore() },
	"nodes":                           func(b *Builder) cache.Store { return b.buildNodeStore() },
	"persistentvolumeclaims":          func(b *Builder) cache.Store { return b.buildPersistentVolumeClaimStore() },
	"persistentvolumes":               func(b *Builder) cache.Store { return b.buildPersistentVolumeStore() },
	"poddisruptionbudgets":            func(b *Builder) cache.Store { return b.buildPodDisruptionBudgetStore() },
	"pods":                            func(b *Builder) cache.Store { return b.buildPodStore() },
	"replicasets":                     func(b *Builder) cache.Store { return b.buildReplicaSetStore() },
	"replicationcontrollers":          func(b *Builder) cache.Store { return b.buildReplicationControllerStore() },
	"resourcequotas":                  func(b *Builder) cache.Store { return b.buildResourceQuotaStore() },
	"secrets":                         func(b *Builder) cache.Store { return b.buildSecretStore() },
	"services":                        func(b *Builder) cache.Store { return b.buildServiceStore() },
	"statefulsets":                    func(b *Builder) cache.Store { return b.buildStatefulSetStore() },
	"storageclasses":                  func(b *Builder) cache.Store { return b.buildStorageClassStore() },
	"validatingwebhookconfigurations": func(b *Builder) cache.Store { return b.buildValidatingWebhookConfigurationStore() },
	"volumeattachments":               func(b *Builder) cache.Store { return b.buildVolumeAttachmentStore() },
	"verticalpodautoscalers":          func(b *Builder) cache.Store { return b.buildVPAStore() },
}
- Initialize the watch function that receives the results
// E:\go_path\src\k8s.io\kube-state-metrics\internal\store\builder.go
// reflectorPerNamespace creates a Kubernetes client-go reflector with the given
// listWatchFunc for each given namespace and registers it with the given store.
func (b *Builder) reflectorPerNamespace(
	expectedType interface{},
	store cache.Store,
	listWatchFunc func(kubeClient clientset.Interface, ns string) cache.ListerWatcher,
) {
	lwf := func(ns string) cache.ListerWatcher { return listWatchFunc(b.kubeClient, ns) }
	lw := listwatch.MultiNamespaceListerWatcher(b.namespaces, nil, lwf)
	instrumentedListWatch := watch.NewInstrumentedListerWatcher(lw, b.metrics, reflect.TypeOf(expectedType).String())
	reflector := cache.NewReflector(sharding.NewShardedListWatch(b.shard, b.totalShards, instrumentedListWatch), expectedType, store, 0)
	go reflector.Run(b.ctx.Done())
}
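The two steps above (build a store per resource type, then run a reflector that fills it) can be illustrated with a much simplified, standard-library-only sketch. It is not the kube-state-metrics or client-go implementation: `Event`, `Store`, and `runReflector` are hypothetical names invented here, and a buffered channel stands in for the watch stream from the api-server. The idea it shows is only "list once, then apply add/modify/delete events to an in-memory store".

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Event is a hypothetical stand-in for a watch event from the api-server.
type Event struct {
	Type string // "ADDED", "MODIFIED" or "DELETED"
	Key  string // namespace/name of the object
	Obj  string // simplified object payload
}

// Store is a hypothetical, simplified analogue of a cache store.
type Store struct {
	mu    sync.RWMutex
	items map[string]string
}

// NewStore returns an empty store.
func NewStore() *Store { return &Store{items: map[string]string{}} }

// Replace swaps in the result of the initial list call.
func (s *Store) Replace(list map[string]string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = make(map[string]string, len(list))
	for k, v := range list {
		s.items[k] = v
	}
}

// Apply updates the store from a single watch event.
func (s *Store) Apply(e Event) {
	s.mu.Lock()
	defer s.mu.Unlock()
	switch e.Type {
	case "ADDED", "MODIFIED":
		s.items[e.Key] = e.Obj
	case "DELETED":
		delete(s.items, e.Key)
	}
}

// Keys returns the stored keys in sorted order.
func (s *Store) Keys() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]string, 0, len(s.items))
	for k := range s.items {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// runReflector lists once, then applies events until the channel is closed.
func runReflector(s *Store, initial map[string]string, events <-chan Event) {
	s.Replace(initial)
	for e := range events {
		s.Apply(e)
	}
}

func main() {
	s := NewStore()
	events := make(chan Event, 2)
	events <- Event{Type: "ADDED", Key: "kube-system/coredns", Obj: "v1"}
	events <- Event{Type: "DELETED", Key: "default/nginx"}
	close(events)
	runReflector(s, map[string]string{"default/nginx": "v1"}, events)
	fmt.Println(s.Keys())
}
```

In the real code the store is what the /metrics handler reads from at scrape time, so a scrape does not need to call the api-server itself.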
Metric list
- ConfigMap metrics: what is a ConfigMap
eg: ConfigMap info
kube_configmap_info{configmap="xxx",instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="xxx"}
- CronJob metrics: what is a CronJob
eg: next scheduled time of a CronJob
kube_cronjob_next_schedule_time{cronjob="abc",instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="abc"} 1594306800
- DaemonSet metrics: what is a DaemonSet
eg: ready DaemonSet pods
kube_daemonset_status_number_ready{daemonset="npd",instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="kube-admin"} 6
- Deployment metrics: what is a Deployment
eg: unavailable (unhealthy) pods
kube_deployment_status_replicas_unavailable{deployment="coredns",instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="kube-system"}
- Endpoints metrics: the IP addresses of the objects a Service sends traffic to
eg: available endpoints for nginx
kube_endpoint_address_available{endpoint="nginx",instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="xxx"}
- Horizontal Pod Autoscaler (HPA) metrics: what is an HPA
eg: an HPA that scales on a third-party metric, selected by metric_name
kube_horizontalpodautoscaler_spec_target_metric{metric_name="xxxx"}
- Ingress metrics: what is an Ingress
eg: Ingress info
kube_ingress_info{ingress="xxxx",instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="xxx"}
- Lease metrics: what is a Lease
- Namespace metrics: what is a Namespace
eg: namespace phase
kube_namespace_status_phase{instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="kube-system",phase="Active"}
- Node metrics: node-problem-detector is used to detect node problems
The unhealthy node conditions are: MemoryPressure, DiskPressure, PIDPressure, KernelDeadlock, ReadonlyFilesystem
eg: unhealthy node
kube_node_status_condition{condition="Ready",instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",node="xxxx.xxx.xxx.xx",status="unknown"}
- PersistentVolume and PersistentVolumeClaim metrics: what are PV and PVC
- PodDisruptionBudget metrics: what is a PDB
- Pod metrics
eg: pod restarts
idelta(kube_pod_container_status_restarts_total{}[1m]) > 0
eg: pod container in the waiting state
kube_pod_container_status_waiting_reason==1
The waiting reasons are
ImagePullBackOff
CrashLoopBackOff
ErrImagePull
CreateContainerConfigError
CreateContainerError
InvalidImageName
eg: CPU requested by a pod
kube_pod_container_resource_requests_cpu_cores
eg: memory requested by a pod
kube_pod_container_resource_requests_memory_bytes
eg: pod pending
kube_pod_status_phase{phase=~"Pending|Unknown"}
The phases are
Pending
Succeeded
Failed
Running
Unknown
- ReplicaSet metrics
- ResourceQuota metrics: what is a ResourceQuota (RQ)
Resources are cpu and memory; objects are pod and namespace; types are used and hard
- Secret metrics: what is a Secret
- Service metrics: what is a Service
- StatefulSet metrics: what is a StatefulSet
eg: ready StatefulSet pods
kube_statefulset_status_replicas_ready{instance="kube-state-metrics.kube-admin:8080",job="kube-state-metrics",namespace="kube-admin",statefulset="prometheus"}
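The Pod phase metric listed above is, as far as I understand it, exposed as one series per phase, with value 1 for the pod's current phase and 0 for the others. That would explain why state-style queries in this list compare against 1, and it suggests adding `== 1` to the phase selector when only pods currently in that phase are wanted. The sketch below only illustrates that encoding under this assumption; `phaseSeries` is a hypothetical helper written for this article, not kube-state-metrics code, and the label set is reduced to namespace, pod, and phase.

```go
package main

import "fmt"

// podPhases lists the phases named in the text, in a fixed order.
var podPhases = []string{"Pending", "Succeeded", "Failed", "Running", "Unknown"}

// phaseSeries renders one sample line per phase: 1 for the current
// phase and 0 for every other phase (assumed one-series-per-phase encoding).
func phaseSeries(namespace, pod, current string) []string {
	lines := make([]string, 0, len(podPhases))
	for _, p := range podPhases {
		v := 0
		if p == current {
			v = 1
		}
		lines = append(lines, fmt.Sprintf(
			"kube_pod_status_phase{namespace=%q,pod=%q,phase=%q} %d",
			namespace, pod, p, v))
	}
	return lines
}

func main() {
	// Print the five series for a hypothetical pending pod.
	for _, l := range phaseSeries("default", "nginx", "Pending") {
		fmt.Println(l)
	}
}
```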