On spring-cloud: Spring Cloud Gateway Avalanched, and I Was Dumbfounded



This is the sixth installment of my "I Was Dumbfounded" series [facepalm]. Previous episodes:

  • After upgrading to Spring 5.3.x, GC counts spiked, and I was dumbfounded
  • How did this SQL querying an indexed column on a big table turn into a full scan? I was dumbfounded
  • An exception thrown while fetching exception info made the logs unfindable, and I was dumbfounded
  • spring-data-redis connection leak, and I was dumbfounded
  • Spring Cloud Gateway missing trace information, and I was dumbfounded

Hi everyone, I'm dumbfounded again. The lesson this time: every shortcut you take while writing code has to be paid back sooner or later.

Symptoms and Background

Last night our gateway avalanched for a while. The symptoms:

1. Various microservices kept reporting exceptions: the connection was already closed while writing the HTTP response

reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response 

2. There were also exceptions where the connection was closed before the request had been fully read

org.springframework.http.converter.HttpMessageNotReadableException: I/O error while reading input message; nested exception is java.io.IOException: UT000128: Remote peer closed connection before all data could be read 

3. The frontend kept alerting on request timeouts: 504 Gateway Time-out

4. The gateway processes kept failing health checks and getting restarted

5. Right after a restart, each gateway instance took a surge of traffic, peaking at 2000 qps per instance (about 500 qps per instance off-peak, and, thanks to scale-out, under 1000 qps per instance at peak). The health check endpoint then stayed unresponsive for a long time, so instances kept getting restarted

Problems 1 and 2 were most likely because the gateway kept restarting, and graceful shutdown failed for some reason, so processes were killed forcibly; the forced shutdown severed connections and produced those two kinds of exceptions.

Our gateway is built on Spring Cloud Gateway and auto-scales based on CPU load. Oddly, while requests were surging, CPU utilization did not rise much, hovering around 60%; since CPU load never crossed the scale-out threshold, auto-scaling never kicked in. To mitigate quickly, we manually added a few gateway instances to keep per-instance load under 1000 qps, which resolved the incident for the moment.

Analyzing the Problem

To fix this for good, we analyzed JFR recordings, starting from the known clues:

  1. Spring Cloud Gateway is an asynchronous, reactive gateway built on Spring WebFlux; its HTTP event-loop threads are limited (by default 2 × the available CPUs, which is 4 in our case).
  2. The gateway processes kept failing health checks; the health check calls the /actuator/health endpoint, which kept timing out.

A health check timeout generally has one of two causes:

  1. The health check blocked while checking some component. For example, if the database hangs, the database health indicator may never return.
  2. The HTTP thread pool never got around to the health check request before it timed out.
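Cause 2 is easy to reproduce in miniature. The sketch below is a self-contained toy with hypothetical numbers, not our gateway's code: a fixed pool of 4 threads stands in for the 4 HTTP event-loop threads, each fake "request" blocks for 100 ms on a synchronous call, and a health check queued behind them misses its 200 ms deadline without ever starting.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StarvationDemo {
    // Returns true if a health check queued behind blocked "http threads" misses its deadline.
    static boolean healthCheckTimedOut() throws InterruptedException {
        ExecutorService httpThreads = Executors.newFixedThreadPool(4); // stand-in for 4 event-loop threads
        try {
            for (int i = 0; i < 40; i++) {
                // each "request" blocks 100ms on a synchronous call (e.g. a sync Redis GET)
                httpThreads.submit(() -> {
                    try { Thread.sleep(100); } catch (InterruptedException ignored) { }
                });
            }
            // /actuator/health arrives last and waits in the queue behind ~1s of blocked work
            Future<String> health = httpThreads.submit(() -> "UP");
            try {
                health.get(200, TimeUnit.MILLISECONDS); // the orchestrator's health-check timeout
                return false;
            } catch (TimeoutException e) {
                return true; // never even started in time: exactly "cause 2"
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        } finally {
            httpThreads.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(healthCheckTimedOut() ? "health check timed out" : "health check ok");
    }
}
```

With 40 × 100 ms of blocked work spread over 4 threads, the health task cannot start for about a second, so the 200 ms wait always times out, even though the task itself would finish instantly.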

We first looked at the periodic thread dumps in the JFR recording to see whether any HTTP thread was stuck in the health check. Examining the stacks captured after the incident began, and focusing on those 4 HTTP threads, we found all 4 had essentially the same stack: executing a Redis command:

"reactor-http-nio-1" #68 daemon prio=5 os_prio=0 cpu=70832.99ms elapsed=199.98s tid=0x0000ffffb2f8a740 nid=0x69 waiting on condition  [0x0000fffe8adfc000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@11.0.8/Native Method)
    - parking to wait for  <0x00000007d50eddf8> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.8/LockSupport.java:234)
    at java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.8/CompletableFuture.java:1798)
    at java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.8/ForkJoinPool.java:3128)
    at java.util.concurrent.CompletableFuture.timedGet(java.base@11.0.8/CompletableFuture.java:1868)
    at java.util.concurrent.CompletableFuture.get(java.base@11.0.8/CompletableFuture.java:2021)
    at io.lettuce.core.protocol.AsyncCommand.await(AsyncCommand.java:83)
    at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:244)
    at io.lettuce.core.FutureSyncInvocationHandler.handleInvocation(FutureSyncInvocationHandler.java:75)
    at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:80)
    at com.sun.proxy.$Proxy245.get(Unknown Source)
    at org.springframework.data.redis.connection.lettuce.LettuceStringCommands.get(LettuceStringCommands.java:68)
    at org.springframework.data.redis.connection.DefaultedRedisConnection.get(DefaultedRedisConnection.java:267)
    at org.springframework.data.redis.connection.DefaultStringRedisConnection.get(DefaultStringRedisConnection.java:406)
    at org.springframework.data.redis.core.DefaultValueOperations$1.inRedis(DefaultValueOperations.java:57)
    at org.springframework.data.redis.core.AbstractOperations$ValueDeserializingRedisCallback.doInRedis(AbstractOperations.java:60)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:222)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:189)
    at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:96)
    at org.springframework.data.redis.core.DefaultValueOperations.get(DefaultValueOperations.java:53)
    at com.jojotech.apigateway.filter.AccessCheckFilter.traced(AccessCheckFilter.java:196)
    at com.jojotech.apigateway.filter.AbstractTracedFilter.filter(AbstractTracedFilter.java:39)
    at org.springframework.cloud.gateway.handler.FilteringWebHandler$GatewayFilterAdapter.filter(FilteringWebHandler.java:137)
    at org.springframework.cloud.gateway.filter.OrderedGatewayFilter.filter(OrderedGatewayFilter.java:44)
    at org.springframework.cloud.gateway.handler.FilteringWebHandler$DefaultGatewayFilterChain.lambda$filter$0(FilteringWebHandler.java:117)
    at org.springframework.cloud.gateway.handler.FilteringWebHandler$DefaultGatewayFilterChain$$Lambda$1478/0x0000000800b84c40.get(Unknown Source)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:44)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at com.jojotech.apigateway.common.TracedMono.subscribe(TracedMono.java:24)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at com.jojotech.apigateway.common.TracedMono.subscribe(TracedMono.java:24)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at com.jojotech.apigateway.common.TracedMono.subscribe(TracedMono.java:24)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at com.jojotech.apigateway.common.TracedMono.subscribe(TracedMono.java:24)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at com.jojotech.apigateway.common.TracedMono.subscribe(TracedMono.java:24)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at com.jojotech.apigateway.common.TracedMono.subscribe(TracedMono.java:24)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:255)
    at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:157)
    at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:73)
    at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:82)
    at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.innerNext(FluxConcatMap.java:281)
    at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onNext(FluxConcatMap.java:860)
    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:120)
    at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:73)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1815)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:151)
    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:120)
    at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:82)
    at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.innerNext(FluxConcatMap.java:281)
    at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onNext(FluxConcatMap.java:860)
    at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:180)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1815)
    at reactor.core.publisher.MonoFilterWhen$MonoFilterWhenMain.onNext(MonoFilterWhen.java:149)
    at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2397)
    at reactor.core.publisher.MonoFilterWhen$MonoFilterWhenMain.onSubscribe(MonoFilterWhen.java:112)
    at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:54)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:448)
    at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:250)
    at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.onNext(FluxDematerialize.java:98)
    at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.onNext(FluxDematerialize.java:44)
    at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:270)
    at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:228)
    at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.request(FluxDematerialize.java:127)
    at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onSubscribe(FluxConcatMap.java:235)
    at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.onSubscribe(FluxDematerialize.java:77)
    at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:164)
    at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:86)
    at reactor.core.publisher.InternalFluxOperator.subscribe(InternalFluxOperator.java:62)
    at reactor.core.publisher.FluxDefer.subscribe(FluxDefer.java:54)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:448)
    at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onSubscribe(FluxConcatMap.java:218)
    at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:164)
    at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:86)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at org.springframework.cloud.sleuth.instrument.web.TraceWebFilter$MonoWebFilterTrace.subscribe(TraceWebFilter.java:184)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4150)
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:255)
    at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.netty.http.server.HttpServer$HttpServerHandle.onStateChange(HttpServer.java:915)
    at reactor.netty.ReactorNetty$CompositeConnectionObserver.onStateChange(ReactorNetty.java:654)
    at reactor.netty.transport.ServerTransport$ChildObserver.onStateChange(ServerTransport.java:478)
    at reactor.netty.http.server.HttpServerOperations.onInboundNext(HttpServerOperations.java:526)
    at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:94)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at reactor.netty.http.server.HttpTrafficHandler.channelRead(HttpTrafficHandler.java:209)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at reactor.netty.http.server.logging.AccessLogHandlerH1.channelRead(AccessLogHandlerH1.java:59)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(java.base@11.0.8/Thread.java:834)

So the HTTP threads were not stuck in the health check, and no other thread showed any health-check-related frames either (in an async environment the health check is async too, and some of its steps may be handed off to other threads). The health check requests must therefore have been cancelled on timeout before they were ever executed.

Why would that happen? Meanwhile, I noticed the stack uses RedisTemplate, the synchronous Redis API of spring-data-redis. It suddenly hit me: when I wrote this code, since it only verifies whether a key exists and refreshes the key's TTL, I was lazy and skipped the async API. Could the synchronous API be blocking the HTTP threads and causing the avalanche?
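The CompletableFuture.timedGet and AsyncCommand.await frames in the dump above show what "sync over async" means here: the synchronous facade just parks the calling thread on the async command's future. The following is a simplified, self-contained sketch of that pattern (not the real Lettuce source; the 50 ms delay is invented):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class SyncOverAsyncDemo {
    // Simplified sketch of a sync facade: park the caller on the async reply's future,
    // which is what the CompletableFuture.timedGet frames in the thread dump correspond to.
    static String syncGet(CompletableFuture<String> asyncReply, long timeoutMs) throws Exception {
        return asyncReply.get(timeoutMs, TimeUnit.MILLISECONDS); // caller parks until the reply arrives
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> reply = new CompletableFuture<>();
        // pretend Redis answers after ~50ms on its own I/O thread
        CompletableFuture.delayedExecutor(50, TimeUnit.MILLISECONDS)
                .execute(() -> reply.complete("accessTokenValue"));
        long start = System.nanoTime();
        // if an event-loop thread does this, every connection it serves stalls for the duration
        String value = syncGet(reply, 1_000);
        long blockedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(value + " (caller blocked ~" + blockedMs + " ms)");
    }
}
```

On a dedicated worker thread this parking is harmless; on one of only 4 event-loop threads, every parked microsecond is taken away from all the other requests multiplexed onto that thread.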

Let's verify this guess. Our project performs all Redis operations through spring-data-redis with the Lettuce connection pool, and we had enabled JFR monitoring of Lettuce commands; see my article "This new way of monitoring the Redis connection pool is pretty sweet, let me add some seasoning". As of this writing my pull request has been merged, and the feature will ship in the 6.2.x release. Here are the Redis command samples from around the time of the incident, as shown in the figure below:

Let's roughly estimate the blocking time caused by executing Redis commands (sampling happens every 10 s; count is the number of commands; times are in microseconds). Multiplying each command count by its 50th-percentile (median) latency and dividing by 10 (for the 10 s window) gives the time per second spent blocked on Redis commands:

32 * 152 = 4864
1 * 860 = 860
5 * 163 = 815
32 * 176 = 5632
1 * 178 = 178
16959 * 168 = 2849112
774 * 176 = 136224
3144 * 166 = 521904
17343 * 179 = 3104397
702 * 166 = 116532
Total: 6740518
6740518 / 10 = 674051.8 us ≈ 0.67 s

That is the blocking time computed from the medians alone; the distribution in the chart suggests the true figure is larger. So it is quite possible that more than 1 s of blocking in the synchronous Redis calls was needed per second of wall time, while requests kept arriving at an undiminished rate; requests piled up and the gateway finally avalanched.
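The back-of-the-envelope calculation above can be scripted so the totals are easy to re-check (same sample counts and median latencies as the table; the data pairs are copied from it, nothing else is assumed):

```java
public class BlockTimeEstimate {
    // {commandCount, p50LatencyMicros} pairs from the 10-second JFR sampling window above
    static final long[][] SAMPLES = {
            {32, 152}, {1, 860}, {5, 163}, {32, 176}, {1, 178},
            {16959, 168}, {774, 176}, {3144, 166}, {17343, 179}, {702, 166}
    };

    static long totalBlockedMicros() {
        long total = 0;
        for (long[] s : SAMPLES) {
            total += s[0] * s[1]; // command count × median latency
        }
        return total;
    }

    public static void main(String[] args) {
        long total = totalBlockedMicros();
        // divide by the 10s sampling window to get blocked time per second of wall time
        System.out.printf("total = %d us, per second = %.2f s%n", total, total / 10_000_000.0);
    }
}
```

With only 4 event-loop threads, about 0.67 s of blocking per second (by the optimistic median estimate) already consumes a sixth of the total thread time before any actual proxying work is done.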

Also, because the API is blocking, the threads spend most of their time parked waiting on I/O, so CPU stays low and auto-scaling never triggers. During business peaks, scheduled pre-scaling kept each gateway instance below the load at which the problem appears, which is why peaks had been fine.

The Fix

Let's rewrite the code. The original version, using the synchronous spring-data-redis API (this is the body of public Mono<Void> traced(ServerWebExchange exchange, GatewayFilterChain chain), the core method of our spring-cloud-gateway Filter):

if (StringUtils.isBlank(token)) {
    // If there is no token, decide by path whether to continue or return a login-required status code
    return continueOrUnauthorized(path, exchange, chain, headers);
} else {
    try {
        String accessTokenValue = redisTemplate.opsForValue().get(token);
        if (StringUtils.isNotBlank(accessTokenValue)) {
            // If accessTokenValue is non-empty, extend the TTL to 4 hours,
            // so a logged-in user's token never expires as long as they stay active
            Long expire = redisTemplate.getExpire(token);
            log.info("accessTokenValue = {}, expire = {}", accessTokenValue, expire);
            if (expire != null && expire < 4 * 60 * 60) {
                redisTemplate.expire(token, 4, TimeUnit.HOURS);
            }

            // Parse it and extract userId
            JSONObject accessToken = JSON.parseObject(accessTokenValue);
            String userId = accessToken.getString("userId");
            // Valid only if userId is non-empty
            if (StringUtils.isNotBlank(userId)) {
                // Parse the token into headers
                HttpHeaders newHeaders = parse(accessToken);
                // Continue the request
                return FilterUtil.changeRequestHeader(exchange, chain, newHeaders);
            }
        }
    } catch (Exception e) {
        log.error("read accessToken error: {}", e.getMessage(), e);
    }
    // If the token is invalid, decide by path whether to continue or return a login-required status code
    return continueOrUnauthorized(path, exchange, chain, headers);
}

Rewritten with the async API:

if (StringUtils.isBlank(token)) {
    return continueOrUnauthorized(path, exchange, chain, headers);
} else {
    HttpHeaders finalHeaders = headers;
    // Must be wrapped with tracedPublisherFactory, or the trace information is lost; see my other
    // article: "Spring Cloud Gateway missing trace information, and I was dumbfounded"
    return tracedPublisherFactory.getTracedMono(redisTemplate.opsForValue().get(token)
                    // Must switch threads here; otherwise downstream work still runs on the Redis
                    // client's own threads, where long-running work slows down other Redis traffic
                    // and also counts toward the Redis command timeout
                    .publishOn(Schedulers.parallel()),
            exchange
    ).doOnSuccess(accessTokenValue -> {
        if (accessTokenValue != null) {
            // Extend the accessToken TTL to 4 hours
            tracedPublisherFactory.getTracedMono(
                    redisTemplate.getExpire(token).publishOn(Schedulers.parallel()), exchange
            ).doOnSuccess(expire -> {
                log.info("accessTokenValue = {}, expire = {}", accessTokenValue, expire);
                if (expire != null && expire.toHours() < 4) {
                    redisTemplate.expire(token, Duration.ofHours(4)).subscribe();
                }
            }).subscribe();
        }
    })
    // Must map the empty case to a non-null value, or the flatMap below never runs. We also cannot
    // put switchIfEmpty at the end: the whole chain returns Mono<Void>, which is empty by design,
    // so that would cause every request to be sent twice.
    .defaultIfEmpty("")
    .flatMap(accessTokenValue -> {
        try {
            if (StringUtils.isNotBlank(accessTokenValue)) {
                JSONObject accessToken = JSON.parseObject(accessTokenValue);
                String userId = accessToken.getString("userId");
                if (StringUtils.isNotBlank(userId)) {
                    // Parse the token into headers
                    HttpHeaders newHeaders = parse(accessToken);
                    // Continue the request
                    return FilterUtil.changeRequestHeader(exchange, chain, newHeaders);
                }
            }
            return continueOrUnauthorized(path, exchange, chain, finalHeaders);
        } catch (Exception e) {
            log.error("read accessToken error: {}", e.getMessage(), e);
            return continueOrUnauthorized(path, exchange, chain, finalHeaders);
        }
    });
}

A few things worth noting here:

  1. Because of the way Spring Cloud Sleuth instruments tracing in Spring WebFlux, any new Flux or Mono we create inside a Filter carries no trace context; we have to attach it manually. See my other article: Spring Cloud Gateway missing trace information, and I was dumbfounded.
  2. With spring-data-redis plus the Lettuce connection pool, for async calls it is best to switch to another scheduler as soon as the response arrives; otherwise downstream work keeps running on the Redis client's own threads, where long-running work slows down other Redis traffic and also counts toward the Redis command timeout.
  3. In Project Reactor, if an intermediate result is empty, downstream operators such as flatMap and map simply never run. If the chain terminates there, the client receives a broken response, so we must consider the empty case at every step.
  4. The core method of spring-cloud-gateway's GatewayFilter interface returns Mono<Void>, which is empty by design, so we cannot use a trailing switchIfEmpty to handle emptiness for the intermediate steps; doing so would cause every request to be sent twice.
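Notes 3 and 4 are the reason for the `.defaultIfEmpty("")` in the rewrite. The same "empty short-circuits map/flatMap" semantics can be shown without a Reactor dependency using java.util.Optional, which behaves analogously for this purpose (this is an analogy, not Reactor itself; the names are mine):

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;

public class EmptySkipsDemo {
    // Does the mapper run? With an empty source it is skipped, just like flatMap on an empty Mono;
    // mapping the empty case to a sentinel first (~ defaultIfEmpty("")) keeps the chain alive.
    static boolean mapperRan(Optional<String> source, boolean defaultIfEmpty) {
        AtomicBoolean ran = new AtomicBoolean(false);
        Optional<String> src = defaultIfEmpty
                ? source.or(() -> Optional.of(""))   // analogue of .defaultIfEmpty("")
                : source;
        src.map(v -> { ran.set(true); return "mapped:" + v; });
        return ran.get();
    }

    public static void main(String[] args) {
        System.out.println(mapperRan(Optional.empty(), false)); // mapper skipped on empty
        System.out.println(mapperRan(Optional.empty(), true));  // sentinel keeps the chain alive
    }
}
```

In the gateway filter the mapper is the flatMap that decides between continuing the request and returning 401; if it were silently skipped on a missing token, the request would complete without either branch running.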

After this change, we load-tested the gateway again; at 20,000 qps on a single instance the problem no longer appeared.

