Spring Cloud Reference Documentation: Spring Cloud Sleuth Features

jiezi

6 years ago

Adds trace and span IDs to the Slf4J MDC, so you can extract all the logs from a given trace or span in a log aggregator, as shown in the following example logs:
```
2016-02-02 15:30:57.902  INFO [bar,6bfd228dc00d216b,6bfd228dc00d216b,false] 23030 --- [nio-8081-exec-3] ...
2016-02-02 15:30:58.372 ERROR [bar,6bfd228dc00d216b,6bfd228dc00d216b,false] 23030 --- [nio-8081-exec-3] ...
2016-02-02 15:31:01.936  INFO [bar,46ab0d418373cbc9,46ab0d418373cbc9,false] 23030 --- [nio-8081-exec-4] ...
```
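Notice the [appname,traceId,spanId,exportable] entries from the MDC:
- spanId: The ID of a specific operation that took place.
- appname: The name of the application that logged the span.
- traceId: The ID of the latency graph that contains the span.
- exportable: Whether the log should be exported to Zipkin. When would you want a span not to be exportable? When you want to wrap some operation in a span and have it written to the logs only.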
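As an illustration of how a log aggregator might pull these four fields out of a line, here is a minimal sketch. The class `SleuthLogEntry` and its regular expression are hypothetical helpers written for this article, not part of Sleuth; the sketch assumes the bracketed segment looks exactly like the sample log above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Sleuth) that reads the MDC segment of a log line. */
final class SleuthLogEntry {
    // Matches "[appname,traceId,spanId,exportable]" as printed in the sample log.
    private static final Pattern MDC_SEGMENT =
            Pattern.compile("\\[([^,\\]]*),([0-9a-f]*),([0-9a-f]*),(true|false)\\]");

    final String appName;
    final String traceId;
    final String spanId;
    final boolean exportable;

    private SleuthLogEntry(String appName, String traceId, String spanId, boolean exportable) {
        this.appName = appName;
        this.traceId = traceId;
        this.spanId = spanId;
        this.exportable = exportable;
    }

    /** Returns the parsed entry, or null when the line contains no MDC segment. */
    static SleuthLogEntry parse(String logLine) {
        Matcher m = MDC_SEGMENT.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new SleuthLogEntry(m.group(1), m.group(2), m.group(3),
                Boolean.parseBoolean(m.group(4)));
    }
}
```

Grouping parsed entries by `traceId` would then collect every log line that belongs to one request.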
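Provides an abstraction over common distributed tracing data models: traces, spans (forming a DAG), annotations, and key-value annotations. Spring Cloud Sleuth is loosely based on HTrace but is compatible with Zipkin (Dapper).
Sleuth records timing information to aid in latency analysis. By using Sleuth, you can pinpoint the causes of latency in your applications.
Sleuth is written to not log too much and to not cause your production application to crash. To that end, Sleuth:
- Propagates structural data about your call graph in-band and the rest out-of-band.
- Includes opinionated instrumentation of layers such as HTTP.
- Includes a sampling policy to manage volume.
- Can report to a Zipkin system for querying and visualization.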
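Instruments common ingress and egress points of Spring applications (servlet filters, async endpoints, rest templates, scheduled actions, message channels, Zuul filters, and Feign clients).
Sleuth includes default logic to join a trace across HTTP or messaging boundaries. For example, HTTP propagation works over Zipkin-compatible request headers.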
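To make "joining a trace across an HTTP boundary" concrete, here is a minimal sketch of the Zipkin-compatible (B3) request headers. The class `B3HeaderSketch` and its methods are hypothetical and only model the header layout with a plain map; Sleuth performs this step for you through its instrumentation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of continuing a trace across an HTTP hop with B3 headers. */
final class B3HeaderSketch {

    /** Writes the trace identifiers into a header map; parentSpanId may be null for a root span. */
    static Map<String, String> inject(String traceId, String spanId,
                                      String parentSpanId, boolean sampled) {
        Map<String, String> headers = new HashMap<>();
        headers.put("X-B3-TraceId", traceId);
        headers.put("X-B3-SpanId", spanId);
        if (parentSpanId != null) {
            headers.put("X-B3-ParentSpanId", parentSpanId);
        }
        headers.put("X-B3-Sampled", sampled ? "1" : "0");
        return headers;
    }

    /** Builds headers for the next outgoing call: same trace, and the incoming span becomes the parent. */
    static Map<String, String> nextHop(Map<String, String> incoming, String newSpanId) {
        return inject(incoming.get("X-B3-TraceId"),
                newSpanId,
                incoming.get("X-B3-SpanId"),
                "1".equals(incoming.get("X-B3-Sampled")));
    }
}
```

The trace ID stays constant across hops while each hop gets a fresh span ID, which is what lets Zipkin reassemble the call graph.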
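Sleuth can propagate context (also known as baggage) between processes. Consequently, if you set a baggage element on a span, it is sent downstream to other processes over either HTTP or messaging.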
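The sketch below illustrates the idea of baggage travelling with a request as extra headers. It is an assumption-laden model: the class `BaggageHeaderSketch` is hypothetical, and the `baggage-` header prefix is an assumption about the HTTP wire format that is not stated in this article.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of baggage travelling as prefixed headers; not Sleuth's own code. */
final class BaggageHeaderSketch {
    // Assumed prefix for HTTP baggage headers.
    static final String PREFIX = "baggage-";

    /** Sender side: copies every baggage element into a prefixed header. */
    static Map<String, String> inject(Map<String, String> baggage) {
        Map<String, String> headers = new HashMap<>();
        for (Map.Entry<String, String> e : baggage.entrySet()) {
            headers.put(PREFIX + e.getKey(), e.getValue());
        }
        return headers;
    }

    /** Receiver side: recovers baggage elements and ignores unrelated headers. */
    static Map<String, String> extract(Map<String, String> headers) {
        Map<String, String> baggage = new HashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                baggage.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return baggage;
    }
}
```

Because baggage rides on every downstream request, keeping the number and size of elements small avoids inflating each call.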
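Provides a way to create or continue spans and to add tags and logs through annotations.
If spring-cloud-sleuth-zipkin is on the classpath, the app generates and collects Zipkin-compatible traces. By default, it sends them over HTTP to a Zipkin server on localhost (port 9411). You can configure the location of the service by setting spring.zipkin.baseUrl.
- If you depend on spring-rabbit, your app sends traces to a RabbitMQ broker instead of HTTP.
- If you depend on spring-kafka and set spring.zipkin.sender.type: kafka, your app sends traces to a Kafka broker instead of HTTP.

  spring-cloud-sleuth-stream is deprecated and should no longer be used.
Spring Cloud Sleuth is OpenTracing compatible.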

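If you use Zipkin, configure the probability of spans being exported by setting spring.sleuth.sampler.probability (default: 0.1, which is 10 percent). Otherwise, you might think that Sleuth is not working, because it omits some spans.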
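To show what a sampling probability means in practice, here is a minimal sketch of a probability-based sampling decision. `ProbabilitySamplerSketch` is a hypothetical class, not the sampler Sleuth ships; it assumes trace IDs are random 64-bit numbers and keeps roughly the configured share of them.

```java
/** Hypothetical probability sampler; a simplified model, not Sleuth's or Brave's implementation. */
final class ProbabilitySamplerSketch {
    private static final long BUCKETS = 10_000L;
    private final long boundary; // number of buckets (out of 10,000) that are sampled

    ProbabilitySamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.boundary = Math.round(probability * BUCKETS);
    }

    /**
     * The decision depends only on the trace ID, so every span of the same trace
     * gets the same answer and traces are exported whole or not at all.
     */
    boolean isSampled(long traceId) {
        return Math.floorMod(traceId, BUCKETS) < boundary;
    }
}
```

With a probability of 0.1, about nine out of ten traces never reach Zipkin, which explains why a quick manual test can look as if nothing is being reported. Setting the probability to 1.0 while experimenting exports every span.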
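The SLF4J MDC is always set, and logback users immediately see the trace and span IDs in logs, as in the example shown earlier. Other logging systems have to configure their own formatter to get the same result. The default is as follows: logging.pattern.level is set to %5p [${spring.zipkin.service.name:${spring.application.name:-}},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}] (this is a Spring Boot feature for logback users). If you do not use SLF4J, this pattern is NOT automatically applied.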

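Starting with version 2.0.0, Spring Cloud Sleuth uses Brave as the tracing library. For your convenience, part of Brave's documentation is embedded here.

In the vast majority of cases, you only need to use the Tracer or SpanCustomizer beans from Brave that Sleuth provides. The documentation below gives a high-level overview of what Brave is and how it works.

Brave is a library used to capture and report latency information about distributed operations to Zipkin. Most users do not use Brave directly. Instead, they use libraries or frameworks that employ Brave on their behalf.

This module includes a tracer that creates and joins spans that model the latency of potentially distributed work. It also includes libraries to propagate the trace context over network boundaries (for example, with HTTP headers).

Most importantly, you need a brave.Tracer that is configured to report to Zipkin.

The following example injects the Tracer that Sleuth configures and starts a new trace. In the default setup, the resulting trace data (spans) is sent to Zipkin over HTTP (as opposed to Kafka):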

import brave.Span;
import brave.Tracer;

class MyClass {

    private final Tracer tracer;

    // Tracer will be autowired
    MyClass(Tracer tracer) {
        this.tracer = tracer;
    }

    void doSth() {
        // Start a new trace whose root span is named "encode"
        Span span = tracer.newTrace().name("encode").start();
        // ...
    }
}

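If your span has a name longer than 50 characters, the name is truncated to 50 characters. Your names have to be explicit and concrete. Big names lead to latency issues and sometimes even to exceptions.

The tracer creates and joins spans that model the latency of potentially distributed work. It can employ sampling to reduce the overhead in the process, to reduce the amount of data sent to Zipkin, or both.

Spans returned by a tracer report data to Zipkin when finished, or do nothing if unsampled. After starting a span, you can annotate events of interest or add tags that contain details or lookup keys.

Spans have a context that includes trace identifiers, which place the span at the correct spot in the tree representing the distributed operation.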
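The way identifiers place a span in the tree can be sketched with a small value class. `TraceContextSketch` is a hypothetical, simplified model (Brave's real type is brave.propagation.TraceContext and carries more state); it assumes that a root span reuses its own ID as the trace ID, which matches the sample log where both values are equal.

```java
/** Hypothetical, simplified model of a span context; not Brave's TraceContext. */
final class TraceContextSketch {
    final long traceId;  // shared by every span in the same trace
    final Long parentId; // null for the root span
    final long spanId;   // unique per operation

    private TraceContextSketch(long traceId, Long parentId, long spanId) {
        this.traceId = traceId;
        this.parentId = parentId;
        this.spanId = spanId;
    }

    /** Starts a new tree: the root has no parent and its span ID doubles as the trace ID. */
    static TraceContextSketch newRoot(long id) {
        return new TraceContextSketch(id, null, id);
    }

    /** Adds a node under this span: same trace, this span becomes the parent. */
    TraceContextSketch newChild(long childSpanId) {
        return new TraceContextSketch(traceId, spanId, childSpanId);
    }

    boolean isRoot() {
        return parentId == null;
    }
}
```

Following `parentId` links upward from any span eventually reaches the root, which is how the tree for one distributed operation is rebuilt from individually reported spans.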

When tracing code that never leaves your process, run it inside a scoped span.

@Autowired Tracer tracer;

// Start a new trace or a span within an existing trace representing an operation
ScopedSpan span = tracer.startScopedSpan("encode");
try {
  // The span is in "scope" meaning downstream code such as loggers can see trace IDs
  return encoder.encode();
} catch (RuntimeException | Error e) {
  span.error(e); // Unless you handle exceptions, you might not know the operation failed!
  throw e;
} finally {
  span.finish(); // always finish the span
}

When you need more features or finer control, use the Span type:

@Autowired Tracer tracer;

// Start a new trace or a span within an existing trace representing an operation
Span span = tracer.nextSpan().name("encode").start();
// Put the span in "scope" so that downstream code such as loggers can see trace IDs
try (SpanInScope ws = tracer.withSpanInScope(span)) {
  return encoder.encode();
} catch (RuntimeException | Error e) {
  span.error(e); // Unless you handle exceptions, you might not know the operation failed!
  throw e;
} finally {
  span.finish(); // note the scope is independent of the span. Always finish a span.
}

Both of the above examples report exactly the same span on finish!

In the above examples, the span is either a new root span or the next child in an existing trace.
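To see why the scope and the span have separate lifecycles, consider the following minimal sketch. It models the "current span" as a thread-local value that a scope sets on entry and restores on close. `CurrentSpanSketch` and its methods are hypothetical and represent spans as plain strings; this is not Brave's implementation.

```java
/** Hypothetical model of span scoping with a thread-local; not Brave's implementation. */
final class CurrentSpanSketch {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Handle returned when a span enters scope; closing it restores the previous span. */
    interface Scope extends AutoCloseable {
        @Override
        void close(); // narrowed so that try-with-resources needs no catch clause
    }

    /** Makes the given span visible to downstream code on this thread until the scope closes. */
    static Scope withSpanInScope(String spanName) {
        final String previous = CURRENT.get();
        CURRENT.set(spanName);
        return () -> CURRENT.set(previous);
    }

    /** Returns the span currently in scope on this thread, or null when there is none. */
    static String current() {
        return CURRENT.get();
    }
}
```

Closing the scope only changes what downstream code sees as current. It reports nothing, which is why the examples above still call finish() on the span itself.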

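Once you have a span, you can add tags to it. Tags can be used as lookup keys or details. For example, you might add a tag with your runtime version, as shown in the following example: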

span.tag("clnt/finagle.version", "6.36.0");

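When exposing the ability to customize spans to third parties, prefer brave.SpanCustomizer over brave.Span. The former is simpler to understand and test and does not tempt users with span lifecycle hooks.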

interface MyTraceCallback {
  void request(Request request, SpanCustomizer customizer);
}

Since brave.Span implements brave.SpanCustomizer, you can pass it to users, as shown in the following example:

for (MyTraceCallback callback : userCallbacks) {
  callback.request(request, span);
}

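Sometimes, you do not know whether a trace is in progress, and you do not want users to do null checks. brave.CurrentSpanCustomizer handles this problem by adding data to whatever span is in progress, or by dropping the data when there is none, as shown in the following example: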

// The user code can then inject this without a chance of it being null.
@Autowired SpanCustomizer span;

void userCode() {
  span.annotate("tx.started");
  // ...
}
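The "never null" behavior can be understood as a null-object pattern. The following sketch is a hypothetical model of that idea, not Brave's CurrentSpanCustomizer: `CurrentCustomizerSketch`, its nested `Customizer` interface, and `RecordingCustomizer` are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical null-object model of a "current span" customizer; not Brave's code. */
final class CurrentCustomizerSketch {

    interface Customizer {
        void annotate(String value);
    }

    /** Used when no span is in progress: accepts calls and silently drops the data. */
    static final Customizer NOOP = value -> { };

    /** Stand-in for a real span that remembers its annotations. */
    static final class RecordingCustomizer implements Customizer {
        final List<String> annotations = new ArrayList<>();

        @Override
        public void annotate(String value) {
            annotations.add(value);
        }
    }

    private static final ThreadLocal<Customizer> IN_PROGRESS = new ThreadLocal<>();

    /** Marks a span as in progress on this thread; pass null to clear it. */
    static void setInProgress(Customizer customizer) {
        if (customizer == null) {
            IN_PROGRESS.remove();
        } else {
            IN_PROGRESS.set(customizer);
        }
    }

    /** Never returns null, so callers can annotate without checking. */
    static Customizer current() {
        Customizer customizer = IN_PROGRESS.get();
        return customizer != null ? customizer : NOOP;
    }
}
```

User code calls `current().annotate(...)` in both situations; only the presence of a span in progress decides whether the data is kept.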

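Before rolling your own RPC instrumentation, check the instrumentation already written here and Zipkin's list.

RPC tracing is often done automatically by interceptors. Behind the scenes, they add tags and events that relate to their role in an RPC operation.

The following example shows how to add a client span: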

@Autowired Tracing tracing;
@Autowired Tracer tracer;

// before you send a request, add metadata that describes the operation
Span span = tracer.nextSpan().name(service + "/" + method).kind(CLIENT);
span.tag("myrpc.version", "1.0.0");
span.remoteServiceName("backend");
span.remoteIpAndPort("172.3.4.1", 8108);

// Add the trace context to the request, so it can be propagated in-band
tracing.propagation().injector(Request::addHeader)
                     .inject(span.context(), request);

// when the request is scheduled, start the span
span.start();

// if there is an error, tag the span
span.tag("error", error.getCode());
// or if there is an exception
span.error(exception);

// when the response is complete, finish the span
span.finish();

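Sometimes, you need to model an asynchronous operation where there is a request but no response. In normal RPC tracing, you use span.finish() to indicate that the response was received. In one-way tracing, you use span.flush() instead, because you do not expect a response.

The following example shows how a client might model a one-way operation: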

@Autowired Tracing tracing;
@Autowired Tracer tracer;

// start a new span representing a client request
Span oneWaySend = tracer.nextSpan().name(service + "/" + method).kind(CLIENT);

// Add the trace context to the request, so it can be propagated in-band
tracing.propagation().injector(Request::addHeader)
                     .inject(oneWaySend.context(), request);

// fire off the request asynchronously, totally dropping any response
request.execute();

// start the client side and flush instead of finish
oneWaySend.start().flush();

The following example shows how a server might handle a one-way operation:

@Autowired Tracing tracing;
@Autowired Tracer tracer;

// pull the context out of the incoming request
TraceContext.Extractor<Request> extractor = tracing.propagation().extractor(Request::getHeader);

// convert that context to a span which you can name and add tags to
Span oneWayReceive = tracer.nextSpan(extractor.extract(request))
    .name("process-request")
    .kind(SERVER);
// ... add tags etc.

// start the server side and flush instead of finish
oneWayReceive.start().flush();

// you should not modify this span anymore as it is complete. However,
// you can create children to represent follow-up work.
Span next = tracer.newChild(oneWayReceive.context()).name("step2").start();
