关于服务器:排查记录负载过高系统变慢

9次阅读

共计 16385 个字符,预计需要花费 41 分钟才能阅读完成。

背景

零碎突然变慢,页面刷新加载工夫长,同一台机器上部署了 n 多个 java 服务,不要问我为什么,问就是历史起因。

一、top 查看零碎负载

Linux:top 命令 – 实时显示零碎运行状态

4 核 16GB 内存的服务器,一分钟负载飙到 24,有点吓人,页面加载时长 20 秒向上了

二、pidstat 命令查看各个过程的 CPU 应用状况

pidstat -u 5 显示各个过程的 cpu 应用统计,5 秒刷新一次

# 显示各个过程的 cpu 应用统计,5 秒刷新一次
[root@srv-100-50-88-6 logs]# pidstat -u 5
Linux 3.10.0-327.el7.x86_64 (srv-100-50-88-6)     08/14/2022     _x86_64_    (4 CPU)

12:29:06 AM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
12:29:12 AM     0         7    0.00    0.20    0.00    0.20     0  migration/0
12:29:12 AM     0       169    0.00    0.60    0.00    0.60     0  rcu_sched
12:29:12 AM     0       171    0.00    0.20    0.00    0.20     2  rcuos/1
12:29:12 AM     0       172    0.00    0.20    0.00    0.20     3  rcuos/2
12:29:12 AM     0       173    0.00    0.40    0.00    0.40     1  rcuos/3
12:29:12 AM     0       343    0.00    5.75    0.00    5.75     3  ksoftirqd/3
12:29:12 AM     0      2341    0.20    0.00    0.00    0.20     0  ds_agent
12:29:12 AM     0     14799    6.94    0.60    0.00    7.54     0  java
12:29:12 AM     0     59866    0.00    0.20    0.00    0.20     2  qemu-ga
12:29:12 AM     0     69419    0.00    2.38    0.00    2.38     1  java
12:29:12 AM     0     72129    0.00   10.12    0.00   10.12     0  java
12:29:12 AM     0     86118    4.96    5.36    0.00   10.32     2  java
12:29:12 AM     0     95136    0.00    7.34    0.00    7.34     1  java
12:29:12 AM     0     97574    0.20    0.00    0.00    0.20     0  dockerd
12:29:12 AM     0    101854    0.00    3.37    0.00    3.37     3  java
12:29:12 AM     0    147597    0.60    1.19    0.00    1.79     2  node_exporter
12:29:12 AM     0    148215   30.95   12.30    0.00   43.25     0  java
12:29:12 AM     0    150973    0.00    0.20    0.00    0.20     2  kworker/2:1
12:29:12 AM     0    152459    0.20    0.79    0.00    0.99     0  pidstat
12:29:12 AM     0    152902    0.00    5.16    0.00    5.16     2  java
12:29:12 AM     0    153166    0.00    6.35    0.00    6.35     3  java

显著察看到 148215 过程占用零碎 cpu 时长达到了 43.25%。

三、定位 java 线程

# 定位 java 线程
[root@srv-100-50-88-6 logs]# ps H -eo pid,tid,%cpu --sort=%cpu |grep 148215
148215 149006  0.0
148215 149062  0.0
148215 150694  0.0
148215 151011  0.0
148215 151615  0.0
148215 152570  0.0
148215 148217  0.1
148215 148218  0.1
148215 148219  0.1
148215 148220  0.1
148215 148313  0.1
148215 148331  0.1
148215 148531  0.1
148215 148893  0.1
148215 148895  0.1
148215 148243  0.2
148215 148310  0.2
148215 148343  0.2
148215 148539  0.2
148215 148540  0.2
148215 148541  0.2
148215 148890  0.2
148215 148891  0.2
148215 148892  0.2
148215 148894  0.2
148215 148896  0.2
148215 148734  0.4
148215 148221  0.8
148215 148227  0.8
148215 148377  0.9
148215 148378  0.9
148215 150403  0.9
148215 148329  1.2
148215 148328  1.3
148215 148376  1.3
148215 148385  1.3
148215 148567  1.6
148215 148327  1.7
148215 148333  1.7
148215 148216  2.1
148215 148225  3.1
148215 148226  3.1
148215 151191  3.6
# 将线程 TID 转换为 16 进制,为前面查找 jstack 日志做筹备; 这里转换了倒排的七个线程。[root@srv-100-50-88-6 logs]# printf "0x%x\n" 151191
0x24e97
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148226
0x24302
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148225
0x24301
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148216
0x242f8
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148333
0x2436d
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148327
0x24367
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148567
0x24457

四、定位代码块

# 应用 jstack 生成虚拟机以后时刻的线程快照
[root@srv-100-50-88-6 logs]# jstack -l 148215 >> jstacklog.out
#将以后堆栈信息保留为文件,通过 16 进制的 TID 查找问题所在的代码块;搜寻失去对应的线程信息,可直观的看到代码异样信息。[root@srv-100-50-88-6 logs]# sz jstacklog.out 

通过 0x24e97 这样的线程 16 过程,从 jstacklog.out 中找出具体的堆栈。


"Thread-166" #520 prio=10 os_prio=0 tid=0x00007f3290016800 nid=0x24e97 runnable [0x00007f32582db000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.read(InputRecord.java:503)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
    - locked <0x00000006d212c5a0> (a java.lang.Object)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    - locked <0x00000006d2136e38> (a sun.security.ssl.AppInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at com.mysql.cj.protocol.FullReadInputStream.readFully(FullReadInputStream.java:64)
    at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:63)
    at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:45)
    at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:52)
    at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:41)
    at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:54)
    at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:44)
    at com.mysql.cj.protocol.a.NativeProtocol.readMessage(NativeProtocol.java:533)
    at com.mysql.cj.protocol.a.NativeProtocol.checkErrorMessage(NativeProtocol.java:703)
    at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:642)
    at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:128)
    at com.mysql.cj.NativeSession.sendCommand(NativeSession.java:311)
    at com.mysql.cj.NativeSession.queryServerVariable(NativeSession.java:1008)
    at com.mysql.cj.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:1406)
    at com.mysql.cj.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:1399)
    at com.mysql.cj.jdbc.ClientPreparedStatement.checkReadOnlySafeStatement(ClientPreparedStatement.java:307)
    - locked <0x00000006d21934e0> (a com.mysql.cj.jdbc.ConnectionImpl)
    at com.mysql.cj.jdbc.ClientPreparedStatement.execute(ClientPreparedStatement.java:317)
    - locked <0x00000006d21934e0> (a com.mysql.cj.jdbc.ConnectionImpl)
    at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
    at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
    at sun.reflect.GeneratedMethodAccessor501.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.ibatis.logging.jdbc.PreparedStatementLogger.invoke(PreparedStatementLogger.java:59)
    at com.sun.proxy.$Proxy637.execute(Unknown Source)
    at org.apache.ibatis.executor.statement.PreparedStatementHandler.update(PreparedStatementHandler.java:47)
    at org.apache.ibatis.executor.statement.RoutingStatementHandler.update(RoutingStatementHandler.java:74)
    at sun.reflect.GeneratedMethodAccessor900.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:63)
    at com.sun.proxy.$Proxy635.update(Unknown Source)
    at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.doUpdate(MybatisSimpleExecutor.java:56)
    at org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:117)
    at sun.reflect.GeneratedMethodAccessor774.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.ibatis.plugin.Invocation.proceed(Invocation.java:49)
    at com.baomidou.mybatisplus.extension.plugins.MybatisPlusInterceptor.intercept(MybatisPlusInterceptor.java:83)
    at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:61)
    at com.sun.proxy.$Proxy634.update(Unknown Source)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:197)
    at sun.reflect.GeneratedMethodAccessor898.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426)
    at com.sun.proxy.$Proxy159.update(Unknown Source)
    at org.mybatis.spring.SqlSessionTemplate.update(SqlSessionTemplate.java:287)
    at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:65)
    at com.baomidou.mybatisplus.core.override.MybatisMapperProxy$PlainMethodInvoker.invoke(MybatisMapperProxy.java:148)
    at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:89)
    at com.sun.proxy.$Proxy418.updateById(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor905.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
    at com.sun.proxy.$Proxy419.updateById(Unknown Source)
    at com.baomidou.mybatisplus.extension.service.IService.updateById(IService.java:148)
    at com.baomidou.mybatisplus.extension.service.IService$$FastClassBySpringCGLIB$$f8525d18.invoke(<generated>)
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
    at com.segmentfault.business.service.impl.stat.StatQualityAnalysisServiceImpl$$EnhancerBySpringCGLIB$$eb9ef323.updateById(<generated>)
    at com.segmentfault.business.task.service.StatQualityAnalysisService.lambda$saveOrUpdateAnalysis$23(StatQualityAnalysisService.java:1260)
    at com.segmentfault.business.task.service.StatQualityAnalysisService$$Lambda$1908/872329305.accept(Unknown Source)
    at java.util.ArrayList.forEach(ArrayList.java:1257)
    at com.segmentfault.business.task.service.StatQualityAnalysisService.saveOrUpdateAnalysis(StatQualityAnalysisService.java:1246)
    at com.segmentfault.business.task.service.StatQualityAnalysisService.collectTodayQualityInfo(StatQualityAnalysisService.java:138)
    at com.segmentfault.business.task.StatQualityAnalysisJob.collectTodayQualityInfo(StatQualityAnalysisJob.java:33)
    at sun.reflect.GeneratedMethodAccessor2274.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.xxl.job.core.handler.impl.MethodJobHandler.execute(MethodJobHandler.java:29)
    at com.xxl.job.core.thread.JobThread.run(JobThread.java:152)

   Locked ownable synchronizers:
    - None


"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f3310182800 nid=0x24302 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f331017f800 nid=0x24301 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None


"DestroyJavaVM" #180 prio=5 os_prio=0 tid=0x00007f3310009800 nid=0x242f8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None


"com.alibaba.nacos.naming.beat.sender" #55 daemon prio=5 os_prio=0 tid=0x00007f3311ad9000 nid=0x2436d runnable [0x00007f326faf9000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006c3640658> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"com.alibaba.nacos.naming.beat.sender" #49 daemon prio=5 os_prio=0 tid=0x00007f3310f93000 nid=0x24367 waiting on condition [0x00007f32a81d8000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006c3640658> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None            


"Thread-74" #185 prio=10 os_prio=0 tid=0x00007f32dc00c800 nid=0x24457 waiting on condition [0x00007f325cd21000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007a4ac0b90> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.dubbo.common.threadpool.ThreadlessExecutor.waitAndDrain(ThreadlessExecutor.java:91)
    at org.apache.dubbo.rpc.AsyncRpcResult.get(AsyncRpcResult.java:179)
    at org.apache.dubbo.rpc.protocol.AsyncToSyncInvoker.invoke(AsyncToSyncInvoker.java:61)
    at org.apache.dubbo.rpc.listener.ListenerInvokerWrapper.invoke(ListenerInvokerWrapper.java:78)
    at org.apache.dubbo.monitor.support.MonitorFilter.invoke(MonitorFilter.java:91)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at org.apache.dubbo.rpc.protocol.dubbo.filter.FutureFilter.invoke(FutureFilter.java:52)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at com.segmentfault.common.user.interceptor.filter.ConsumerContextFilter.invoke(ConsumerContextFilter.java:56)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at org.apache.dubbo.rpc.filter.ConsumerContextFilter.invoke(ConsumerContextFilter.java:69)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at org.apache.dubbo.rpc.protocol.InvokerWrapper.invoke(InvokerWrapper.java:56)
    at org.apache.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:79)
    at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:265)
    at org.apache.dubbo.rpc.cluster.interceptor.ClusterInterceptor.intercept(ClusterInterceptor.java:47)
    at org.apache.dubbo.rpc.cluster.support.wrapper.AbstractCluster$InterceptorInvokerNode.invoke(AbstractCluster.java:92)
    at org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:93)
    at org.apache.dubbo.registry.client.migration.MigrationInvoker.invoke(MigrationInvoker.java:170)
    at org.apache.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:96)
    at org.apache.dubbo.common.bytecode.proxy12.getAreaNameByCode(proxy12.java)
    at com.segmentfault.business.system.api.utils.AreaInfoUtil.getAreaNameByCode(AreaInfoUtil.java:39)
    at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl.remind(WhiteItemRemindConfigServiceImpl.java:164)
    at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl$$FastClassBySpringCGLIB$$bcecf521.invoke(<generated>)
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
    at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl$$EnhancerBySpringCGLIB$$2071afb8.remind(<generated>)
    at com.segmentfault.business.task.WhiteItemStateJob.lambda$whiteStateHandle$0(WhiteItemStateJob.java:63)
    at com.segmentfault.business.task.WhiteItemStateJob$$Lambda$1783/2031958241.accept(Unknown Source)
    at java.util.ArrayList.forEach(ArrayList.java:1257)
    at com.segmentfault.business.task.WhiteItemStateJob.whiteStateHandle(WhiteItemStateJob.java:51)
    at sun.reflect.GeneratedMethodAccessor1505.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.xxl.job.core.handler.impl.MethodJobHandler.execute(MethodJobHandler.java:29)
    at com.xxl.job.core.thread.JobThread.run(JobThread.java:152)

   Locked ownable synchronizers:
    - None    

两个 xxljob 工作、两个 C2 CompilerThread、两个 nacos 过程、一个 DestroyJavaVM 守护过程。

五、援用

Java 利用 CPU 占用过高问题排查

pidstat 命令详解

正文完
 0