关于ebpf:Grafana-开源了一款-eBPF-采集器-Beyla

68次阅读

共计 12836 个字符,预计需要花费 33 分钟才能阅读完成。

eBPF 的倒退热火朝天,在可观测性畛域大放异彩,Grafana 近期也公布了一款 eBPF 采集器,能够采集服务的 RED 指标,本文做一个尝鲜介绍,让读者有个大略理解。

eBPF 根底介绍能够参考我之前的文章《eBPF Hello world》。实践上,eBPF 能够拿到服务收到的申请信息,比方 QPS、提早、成功率等,这些数据对于利用级监控至关重要,Grafana Beyla 就是为此而生的。

要测试应用 Beyla 采集服务的 RED(Rate-Errors-Duration)指标,那首先得有个服务,这里我用的是 answer(https://answer.flashcat.cloud)论坛,你也能够本人搞一个简略的 http 服务,比方:

package main

import (
    "net/http"
    "strconv"
    "time"
)

func handleRequest(rw http.ResponseWriter, req *http.Request) {
    status := 200
    for k, v := range req.URL.Query() {if len(v) == 0 {continue}
        switch k {
        case "status":
            if s, err := strconv.Atoi(v[0]); err == nil {status = s}
        case "delay":
            if d, err := time.ParseDuration(v[0]); err == nil {time.Sleep(d)
            }
        }
    }
    rw.WriteHeader(status)
}

func main() {
    http.ListenAndServe(":8080",
                 http.HandlerFunc(handleRequest))
}

下面这个代码,保留成 server.go,而后用 go run server.go 即可运行,当然,前提是你机器上有 go 开发环境。这个小服务,能够接管两个参数,一个是 status,用来指定返回的 http 状态码,另一个是 delay,用来指定提早多久返回,比方:

curl -v "http://localhost:8080/foo?status=404"

下面的命令,会返回 404 状态码,如果想提早 1 秒返回,能够这样:

curl -v "http://localhost:8080/foo?delay=1s"

接下来,咱们就能够应用 Beyla 采集这个服务的 RED 指标了。

下载 Beyla

我的机器上有 go 开发环境,所以我间接应用 go install 装置了,你也能够去 Beyla 的 release 页面下载二进制包,而后解压缩应用。

go install github.com/grafana/beyla/cmd/beyla@latest

运行 Beyla

应用上面的命令运行 Beyla:

$ BEYLA_PROMETHEUS_PORT=8999 PRINT_TRACES=true OPEN_PORT=8080 sudo -E beyla

或者间接应用 root 账号运行,比方我是这么跑的:

$ BEYLA_PROMETHEUS_PORT=8999 PRINT_TRACES=true OPEN_PORT=8080 beyla

解释一下这几个参数:

  • BEYLA_PROMETHEUS_PORT: Beyla 要监听的端口,通过这个端口裸露 metrics 指标数据
  • PRINT_TRACES: 是否打印 trace 日志
  • OPEN_PORT: Beyla 采集的指标服务监听的端口,这里是 8080,下面给出的那段 go server 的代码就是监听在 8080,我的机器上 answer 论坛程序也是监听在 8080,你要监控的程序如果不是监听在 8080,能够在换成你本人的端口

查看指标

运行之后,能够通过 curl 查看指标:

curl http://localhost:8999/metrics

返回的内容如下:

# HELP http_client_duration_seconds duration of HTTP service calls from the client side, in seconds
# TYPE http_client_duration_seconds histogram
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.005"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.01"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.025"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.05"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.075"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.1"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.25"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.5"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.75"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2.5"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="5"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="7.5"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="10"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 1
http_client_duration_seconds_sum{http_method="GET",http_status_code="200",service_name="answer"} 1.668771575
http_client_duration_seconds_count{http_method="GET",http_status_code="200",service_name="answer"} 1
# HELP http_client_request_size_bytes size, in bytes, of the HTTP request body as sent from the client side
# TYPE http_client_request_size_bytes histogram
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="32"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="64"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="128"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="256"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="512"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1024"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2048"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="4096"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="8192"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 1
http_client_request_size_bytes_sum{http_method="GET",http_status_code="200",service_name="answer"} 0
http_client_request_size_bytes_count{http_method="GET",http_status_code="200",service_name="answer"} 1
# HELP http_server_duration_seconds duration of HTTP service calls from the server side, in seconds
# TYPE http_server_duration_seconds histogram
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 0
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.005"} 201
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.01"} 789
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.025"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.05"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.075"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.1"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.25"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.5"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.75"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2.5"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="5"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="7.5"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="10"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 800
http_server_duration_seconds_sum{http_method="GET",http_status_code="200",service_name="answer"} 5.752096697000003
http_server_duration_seconds_count{http_method="GET",http_status_code="200",service_name="answer"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0"} 0
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.005"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.01"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.025"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.05"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.075"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.1"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.25"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.75"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="1"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="2.5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="7.5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="10"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="+Inf"} 1
http_server_duration_seconds_sum{http_method="GET",http_status_code="302",service_name="answer"} 0.001523002
http_server_duration_seconds_count{http_method="GET",http_status_code="302",service_name="answer"} 1
# HELP http_server_request_size_bytes size, in bytes, of the HTTP request body as received at the server side
# TYPE http_server_request_size_bytes histogram
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="32"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="64"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="128"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="256"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="512"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1024"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2048"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="4096"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="8192"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 800
http_server_request_size_bytes_sum{http_method="GET",http_status_code="200",service_name="answer"} 0
http_server_request_size_bytes_count{http_method="GET",http_status_code="200",service_name="answer"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="32"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="64"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="128"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="256"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="512"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="1024"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="2048"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="4096"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="8192"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="+Inf"} 1
http_server_request_size_bytes_sum{http_method="GET",http_status_code="302",service_name="answer"} 0
http_server_request_size_bytes_count{http_method="GET",http_status_code="302",service_name="answer"} 1
# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
# TYPE promhttp_metric_handler_errors_total counter
promhttp_metric_handler_errors_total{cause="encoding"} 0
promhttp_metric_handler_errors_total{cause="gathering"} 0

这些指标就能够用采集器来抓了,比方 vmagent、categraf、prometheus 等,完事之后入库,应用 Grafana 展现剖析即可,常常关注本公众号的读者对于这些常识应该比拟相熟了,这里不再赘述。Beyla 默认提供了一个 Grafana Dashboard,能够导入测试:https://github.com/grafana/beyla/tree/main/grafana。

结语

Beyla 目前还不太稳固,还有很多性能没有实现。不过能够尝鲜钻研了。可观测性整套技术栈搞起来还挺吃力的,如果您想建设这套技术栈,欢送来和咱们聊聊,咱们提供这方面的征询和商业产品,详情理解:

https://flashcat.cloud/

正文完
 0