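Kubernetes: Custom Metrics Monitoring with Prometheus Operator in 4 Simple Steps

This article comes from Rancher Labs.

In past articles, we have spent a good deal of space on monitoring. That is because when you manage Kubernetes clusters, everything changes extremely quickly. A tool that monitors the cluster's health and resource metrics is therefore essential.

In Rancher 2.5, we introduced a new version of monitoring based on Prometheus Operator, which provides Kubernetes-native deployment and management of Prometheus and related monitoring components. Prometheus Operator lets you monitor the state and processes of cluster nodes, Kubernetes components, and application workloads. It also lets you define alerts on the metrics Prometheus collects and create custom dashboards, so the collected metrics are easy to visualize in Grafana. For more details on the new monitoring components, visit the following link: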

https://rancher.com/docs/ranc…

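The new monitoring also ships prometheus-adapter, which developers can use to scale their workloads based on custom metrics and HPA.

In this article, we explore how to use Prometheus Operator to scrape custom metrics and then use those metrics for advanced workload management.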

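Installing Prometheus

Installing Prometheus in Rancher 2.5 is straightforward. Go to Cluster Explorer -> Apps and install rancher-monitoring.

You should be aware of the following default settings:

prometheus-adapter is enabled as part of the chart installation
ServiceMonitorNamespaceSelector is left empty, which allows Prometheus to collect ServiceMonitors in all namespaces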
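
As a rough illustration, the two defaults above might look like the following in the chart's values. This is a sketch, not the chart's authoritative layout: the key paths (prometheus-adapter.enabled and prometheus.prometheusSpec.serviceMonitorNamespaceSelector) are assumptions based on upstream kube-prometheus-stack conventions, so check them against the values.yaml of the rancher-monitoring version you install.

```yaml
# Assumed values layout; verify against your chart version.
prometheus-adapter:
  enabled: true            # adapter is installed together with the chart
prometheus:
  prometheusSpec:
    # An empty selector matches every namespace, so ServiceMonitors
    # are picked up cluster-wide.
    serviceMonitorNamespaceSelector: {}
```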

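Once the installation completes, the monitoring components are accessible from Cluster Explorer.

Deploying the Workload

Now let's deploy a sample workload that exposes custom metrics from the application layer. The workload runs a simple application that has been instrumented with the Prometheus client_golang library and serves a few custom metrics on the /metrics endpoint.

It has two metrics: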

http_requests_total
http_request_duration_seconds

The following manifest deploys the workload, its associated service, and an ingress for accessing the workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: prometheus-example-app
  name: prometheus-example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: prometheus-example-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: prometheus-example-app
    spec:
      containers:
      - name: prometheus-example-app
        image: gmehta3/demo-app:metrics
        ports:
        - name: web
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-example-app
  labels:
    app.kubernetes.io/name: prometheus-example-app
spec:
  selector:
    app.kubernetes.io/name: prometheus-example-app
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      name: web
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
    name: prometheus-example-app
spec:
    rules:
    - host: hpa.demo
      http:
        paths:
        - path: /
          backend:
            serviceName: prometheus-example-app
            servicePort: 8080
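
The Ingress above uses the networking.k8s.io/v1beta1 API, which was removed in Kubernetes 1.22. On newer clusters, an equivalent resource under networking.k8s.io/v1 could look like the sketch below. It assumes the same host, service name, and port as the manifest above; pathType: Prefix is a choice made here, since the v1 API requires the field.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-example-app
spec:
  rules:
  - host: hpa.demo
    http:
      paths:
      - path: /
        pathType: Prefix        # required in v1; Prefix is an assumption
        backend:
          service:
            name: prometheus-example-app
            port:
              number: 8080
```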

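Deploying the ServiceMonitor

ServiceMonitor is a custom resource definition (CRD) that lets us declaratively define how a dynamic set of services should be monitored.

You can view the full ServiceMonitor specification at the following link: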

https://github.com/prometheus…

Now let's deploy the ServiceMonitor, which Prometheus uses to discover the pods that make up the prometheus-example-app Kubernetes service.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-app
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: prometheus-example-app
  endpoints:
  - port: web

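Users can now browse the ServiceMonitor in Rancher monitoring.

Shortly afterwards, the new service monitor and the pods associated with the service should show up in Prometheus service discovery.

We can also see the metrics in Prometheus.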
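
The ServiceMonitor shown earlier relies on defaults for everything except the port name. For reference, a more explicit variant might look like the sketch below. The path and interval values are illustrative assumptions, not settings taken from the original deployment; when path is omitted, Prometheus scrapes /metrics.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-app
spec:
  selector:
    matchLabels:
      # Matches the labels on the Service, not on the pods.
      app.kubernetes.io/name: prometheus-example-app
  endpoints:
  - port: web          # name of the Service port, not its number
    path: /metrics     # assumed; this is also the default
    interval: 30s      # assumed scrape interval
```

Spelling out the path makes it easier to spot a mismatch between what the application serves and what Prometheus scrapes.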

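Deploying the Grafana Dashboard

In Rancher 2.5, monitoring lets users store Grafana dashboards as ConfigMaps in the cattle-dashboards namespace.

Users or cluster admins can now add more dashboards in this namespace to extend Grafana's set of custom dashboards.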

Dashboard ConfigMap Example

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-example-app-dashboard
  namespace: cattle-dashboards
  labels:
    grafana_dashboard: "1"
data:
  prometheus-example-app.json: |
    {
    "annotations": {
        "list": [
        {
            "builtIn": 1,
            "datasource": "-- Grafana --",
            "enable": true,
            "hide": true,
            "iconColor": "rgba(0, 211, 255, 1)",
            "name": "Annotations & Alerts",
            "type": "dashboard"
        }
        ]
    },
    "editable": true,
    "gnetId": null,
    "graphTooltip": 0,
    "links": [],
    "panels": [
        {
        "aliasColors": {},
        "bars": false,
        "dashLength": 10,
        "dashes": false,
        "datasource": null,
        "fieldConfig": {
            "defaults": {
            "custom": {}
            },
            "overrides": []
        },
        "fill": 1,
        "fillGradient": 0,
        "gridPos": {
            "h": 9,
            "w": 12,
            "x": 0,
            "y": 0
        },
        "hiddenSeries": false,
        "id": 2,
        "legend": {
            "avg": false,
            "current": false,
            "max": false,
            "min": false,
            "show": true,
            "total": false,
            "values": false
        },
        "lines": true,
        "linewidth": 1,
        "nullPointMode": "null",
        "percentage": false,
        "pluginVersion": "7.1.5",
        "pointradius": 2,
        "points": false,
        "renderer": "flot",
        "seriesOverrides": [],
        "spaceLength": 10,
        "stack": false,
        "steppedLine": false,
        "targets": [
            {
            "expr": "rate(http_requests_total{code=\"200\",service=\"prometheus-example-app\"}[5m])",
            "instant": false,
            "interval": "",
            "legendFormat": "",
            "refId": "A"
            }
        ],
        "thresholds": [],
        "timeFrom": null,
        "timeRegions": [],
        "timeShift": null,
        "title": "http_requests_total_200",
        "tooltip": {
            "shared": true,
            "sort": 0,
            "value_type": "individual"
        },
        "type": "graph",
        "xaxis": {
            "buckets": null,
            "mode": "time",
            "name": null,
            "show": true,
            "values": []
        },
        "yaxes": [
            {
            "format": "short",
            "label": null,
            "logBase": 1,
            "max": null,
            "min": null,
            "show": true
            },
            {
            "format": "short",
            "label": null,
            "logBase": 1,
            "max": null,
            "min": null,
            "show": true
            }
        ],
        "yaxis": {
            "align": false,
            "alignLevel": null
        }
        },
        {
        "aliasColors": {},
        "bars": false,
        "dashLength": 10,
        "dashes": false,
        "datasource": null,
        "description": "",
        "fieldConfig": {
            "defaults": {
            "custom": {}
            },
            "overrides": []
        },
        "fill": 1,
        "fillGradient": 0,
        "gridPos": {
            "h": 8,
            "w": 12,
            "x": 0,
            "y": 9
        },
        "hiddenSeries": false,
        "id": 4,
        "legend": {
            "avg": false,
            "current": false,
            "max": false,
            "min": false,
            "show": true,
            "total": false,
            "values": false
        },
        "lines": true,
        "linewidth": 1,
        "nullPointMode": "null",
        "percentage": false,
        "pluginVersion": "7.1.5",
        "pointradius": 2,
        "points": false,
        "renderer": "flot",
        "seriesOverrides": [],
        "spaceLength": 10,
        "stack": false,
        "steppedLine": false,
        "targets": [
            {
            "expr": "rate(http_requests_total{code!=\"200\",service=\"prometheus-example-app\"}[5m])",
            "interval": "",
            "legendFormat": "",
            "refId": "A"
            }
        ],
        "thresholds": [],
        "timeFrom": null,
        "timeRegions": [],
        "timeShift": null,
        "title": "http_requests_total_not_200",
        "tooltip": {
            "shared": true,
            "sort": 0,
            "value_type": "individual"
        },
        "type": "graph",
        "xaxis": {
            "buckets": null,
            "mode": "time",
            "name": null,
            "show": true,
            "values": []
        },
        "yaxes": [
            {
            "format": "short",
            "label": null,
            "logBase": 1,
            "max": null,
            "min": null,
            "show": true
            },
            {
            "format": "short",
            "label": null,
            "logBase": 1,
            "max": null,
            "min": null,
            "show": true
            }
        ],
        "yaxis": {
            "align": false,
            "alignLevel": null
        }
        }
    ],
    "schemaVersion": 26,
    "style": "dark",
    "tags": [],
    "templating": {
        "list": []
    },
    "time": {
        "from": "now-15m",
        "to": "now"
    },
    "timepicker": {
        "refresh_intervals": [
        "5s",
        "10s",
        "30s",
        "1m",
        "5m",
        "15m",
        "30m",
        "1h",
        "2h",
        "1d"
        ]
    },
    "timezone": "",
    "title": "prometheus example app",
    "version": 1
    }

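Users should now be able to access the prometheus example app dashboard in Grafana.

HPA on Custom Metrics

This section assumes that prometheus-adapter is already installed as part of monitoring. In fact, the monitoring installer installs prometheus-adapter by default.

Users can now create an HPA spec like the following: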

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: prometheus-example-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: prometheus-example-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Object
    object:
        describedObject:
            kind: Service
            name: prometheus-example-app
        metric:
            name: http_requests
        target:
            averageValue: "5"
            type: AverageValue
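
The autoscaling/v2beta2 API used above was removed in Kubernetes 1.26, and autoscaling/v2 has been available since 1.23. The sketch below shows the same HPA under the newer API version, with all other fields carried over unchanged from the spec above.

```yaml
apiVersion: autoscaling/v2     # replaces autoscaling/v2beta2 on newer clusters
kind: HorizontalPodAutoscaler
metadata:
  name: prometheus-example-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: prometheus-example-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Object
    object:
      describedObject:
        kind: Service
        name: prometheus-example-app
      metric:
        name: http_requests
      target:
        type: AverageValue
        averageValue: "5"
```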

You can find more information on HPA at the following link:

https://kubernetes.io/docs/ta…

We will use the custom http_requests_total metric to autoscale the pods.
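
The HPA refers to http_requests while the application exports http_requests_total. prometheus-adapter rules can rename a counter and turn it into a per-second rate before exposing it through the custom metrics API. The rule below is a sketch of such a mapping in the adapter's rule syntax; the label names, the 2m window, and whether the rules shipped with rancher-monitoring match it are all assumptions.

```yaml
rules:
- seriesQuery: 'http_requests_total{namespace!="",service!=""}'
  resources:
    overrides:
      # Map Prometheus labels to the Kubernetes resources they describe.
      namespace: {resource: "namespace"}
      service: {resource: "service"}
  name:
    # Strip the _total suffix: http_requests_total -> http_requests
    matches: "^(.*)_total$"
    as: "${1}"
  # Expose the counter as a per-second rate over an assumed 2m window.
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

With a rule of this shape, the target averageValue of "5" in the HPA would be read as requests per second per pod rather than as a raw counter value.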

Now we can generate some sample load to see the HPA in action. I use hey for this:

hey -c 10 -n 5000 http://hpa.demo

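Summary

In this article, we explored the flexibility of the new monitoring in Rancher 2.5. Developers and cluster admins can use this stack to monitor their workloads, deploy visualizations, and take advantage of the advanced workload management capabilities available in Kubernetes.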
