Author: Feng Xiaotao, JD Retail

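Background

Huixiao, a system on JD's Life & Travel platform (京东生旅平台慧销系统), is a platform-level system connected to multiple business lines. It mainly manages the content and capabilities behind each business line's campaigns, such as advertising and user recall.

Recently, alerts showed memory climbing steadily: every 2-3 days we received an alert that memory had exceeded the threshold. We suspected a memory leak and started investigating. The 24-hour memory monitoring shows the container's memory rising continuously: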

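Troubleshooting

Suspecting a memory leak, we first looked at the 24-hour JVM memory monitoring to see how JVM memory was being reclaimed:

Young GC and Full GC activity:

From the JVM memory analysis and the Young GC and Full GC activity, there are two possible causes:

1. Young GC runs but Full GC never occurs. Objects may have been promoted to the old generation without reaching the Full GC threshold, so Full GC is never triggered and those objects stay in the old generation unreclaimed.

2. There is a memory leak: Young GC runs, but this memory cannot be reclaimed.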

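The thread count monitoring showed 7,427 live threads, and the number was still rising. This largely confirmed a memory leak, one related to improper use of a thread pool:

We captured a thread dump with jstack and analyzed it to find out why there were so many threads:

The dump showed more than 7,000 threads created through thread pools.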
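
The same grouping can be approximated programmatically. The sketch below is illustrative only: the class name ThreadNameSummary and the method countByPool are hypothetical, and it assumes the pools use the JDK default thread factory, whose threads are named "pool-N-thread-M" (the naming seen in the output later in this article). It counts threads per pool prefix so that an unexpectedly large number of distinct pools stands out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadNameSummary {

    // Threads created by Executors.defaultThreadFactory() are named "pool-N-thread-M".
    private static final Pattern DEFAULT_POOL_THREAD =
            Pattern.compile("^(pool-\\d+)-thread-\\d+$");

    /** Groups thread names by pool prefix; names that do not match go under "other". */
    public static Map<String, Integer> countByPool(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            Matcher m = DEFAULT_POOL_THREAD.matcher(name);
            String key = m.matches() ? m.group(1) : "other";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Collect the names of all live threads in this JVM.
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        Map<String, Integer> counts = countByPool(names);
        System.out.println("live threads: " + names.size());
        counts.forEach((pool, n) -> System.out.println(pool + " -> " + n + " thread(s)"));
    }
}
```

Thousands of distinct "pool-N" groups, each with only a few threads, suggest that pools are being created repeatedly rather than one pool growing too large.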

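Code Analysis

Looking at where ThreadPoolExecutor is used in the code, we found a thread pool defined in a shared worker class. Workers use this pool to run their tasks asynchronously.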

    import java.util.concurrent.LinkedBlockingDeque;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class BackgroundWorker {

        private static ThreadPoolExecutor threadPoolExecutor;

        static {
            init(15);
        }

        public static void init() {
            init(15);
        }

        // Every call assigns a brand-new pool to the static field.
        public static void init(int poolSize) {
            threadPoolExecutor =
                    new ThreadPoolExecutor(3, poolSize, 1000, TimeUnit.MINUTES,
                            new LinkedBlockingDeque<>(1000), new ThreadPoolExecutor.CallerRunsPolicy());
        }

        public static void shutdown() {
            if (threadPoolExecutor != null && !threadPoolExecutor.isShutdown()) {
                threadPoolExecutor.shutdownNow();
            }
        }

        public static void submit(final Runnable task) {
            if (task == null) {
                return;
            }
            threadPoolExecutor.execute(() -> {
                try {
                    task.run();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
    }

The ad cache refresh worker uses the thread pool as follows:

    public class AdActivitySyncJob {

        @Scheduled(cron = "0 0/5 * * * ?")
        public void execute() {
            log.info("AdActivitySyncJob start");
            List<DicDTO> locationList = locationService.selectLocation();
            if (CollectionUtils.isEmpty(locationList)) {
                return;
            }
            // Some unrelated code is omitted here
            BackgroundWorker.init(40);
            locationCodes.forEach(locationCode -> {
                showChannelMap.forEach((key, value) -> {
                    BackgroundWorker.submit(new Runnable() {
                        @Override
                        public void run() {
                            log.info("AdActivitySyncJob,locationCode:{},showChannel:{}", locationCode, value);
                            Result<AdActivityDTO> result = notLoginAdActivityOuterService.getAdActivityByLocationInner(locationCode, ImmutableMap.of("showChannel", value));
                            LocalCache.AD_ACTIVITY_CACHE.put(locationCode.concat("_").concat(value), result);
                        }
                    });
                });
            });
            log.info("AdActivitySyncJob end");
        }

        @PostConstruct
        public void init() {
            execute();
        }
    }

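Root cause analysis: our hypothesis was that every run of the worker calls init, which creates a new thread pool, while the previously created pool is never shut down, so thread pools pile up in memory. A ThreadPoolExecutor that is not shut down manually after use cannot be reclaimed by GC as long as its worker threads are still alive.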

Verification

To verify whether a locally created ThreadPoolExecutor that is never shut down manually gets reclaimed by GC:

    import java.util.concurrent.LinkedBlockingDeque;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class Test {

        private static ThreadPoolExecutor threadPoolExecutor;

        public static void main(String[] args) {
            for (int i = 1; i < 100; i++) {
                // Create a new thread pool on every iteration
                threadPoolExecutor =
                        new ThreadPoolExecutor(3, 15, 1000, TimeUnit.MINUTES,
                                new LinkedBlockingDeque<>(1000), new ThreadPoolExecutor.CallerRunsPolicy());
                // Run tasks on the thread pool
                for (int j = 0; j < 10; j++) {
                    submit(new Runnable() {
                        @Override
                        public void run() {
                        }
                    });
                }
            }
            // Get all current threads
            ThreadGroup group = Thread.currentThread().getThreadGroup();
            ThreadGroup topGroup = group;
            // Walk up the thread group tree to find the root thread group
            while (group != null) {
                topGroup = group;
                group = group.getParent();
            }
            int slackSize = topGroup.activeCount() * 2;
            Thread[] slackThreads = new Thread[slackSize];
            // Enumerate all threads under the root group; actualSize is the final thread count
            int actualSize = topGroup.enumerate(slackThreads);
            Thread[] actualThreads = new Thread[actualSize];
            System.arraycopy(slackThreads, 0, actualThreads, 0, actualSize);
            System.out.println("Threads size is " + actualThreads.length);
            for (Thread thread : actualThreads) {
                System.out.println("Thread name : " + thread.getName());
            }
        }

        public static void submit(final Runnable task) {
            if (task == null) {
                return;
            }
            threadPoolExecutor.execute(() -> {
                try {
                    task.run();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
    }

Output:

Threads size is 302

Thread name : Reference Handler

Thread name : Finalizer

Thread name : Signal Dispatcher

Thread name : main

Thread name : Monitor Ctrl-Break

Thread name : pool-1-thread-1

Thread name : pool-1-thread-2

Thread name : pool-1-thread-3

Thread name : pool-2-thread-1

Thread name : pool-2-thread-2

Thread name : pool-2-thread-3

Thread name : pool-3-thread-1

Thread name : pool-3-thread-2

Thread name : pool-3-thread-3

Thread name : pool-4-thread-1

Thread name : pool-4-thread-2

Thread name : pool-4-thread-3

Thread name : pool-5-thread-1

Thread name : pool-5-thread-2

Thread name : pool-5-thread-3

Thread name : pool-6-thread-1

Thread name : pool-6-thread-2

Thread name : pool-6-thread-3

…………

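Analysis of the result: there are 302 threads. The loop creates 99 pools, each of which starts 3 core threads (297 in total), and the remaining 5 are the JVM and IDE threads listed first. The core threads of the locally created pools were not reclaimed.

Change the code so that the thread pool is initialized only once: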

    // Initialize the thread pool once
    threadPoolExecutor =
            new ThreadPoolExecutor(3, 15, 1000, TimeUnit.MINUTES,
                    new LinkedBlockingDeque<>(1000), new ThreadPoolExecutor.CallerRunsPolicy());
    for (int i = 1; i < 100; i++) {
        // Run tasks on the thread pool
        for (int j = 0; j < 10; j++) {
            submit(new Runnable() {
                @Override
                public void run() {
                }
            });
        }
    }

Output:

Threads size is 8

Thread name : Reference Handler

Thread name : Finalizer

Thread name : Signal Dispatcher

Thread name : main

Thread name : Monitor Ctrl-Break

Thread name : pool-1-thread-1

Thread name : pool-1-thread-2

Thread name : pool-1-thread-3

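Solution

1. Initialize the pool only once and reuse it on every worker run.

2. Shut down the pool after each run completes.

BackgroundWorker is meant to let all background workers share one thread pool, so we chose option 1: the pool is initialized once in the static initializer block, and callers no longer re-initialize it before use.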
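
The article does not show the fixed class, so the following is only a sketch of what option 1 could look like. It assumes the public init methods are removed so that no caller can replace the pool; the thread name prefix "background-worker-" and the poolSize() accessor are additions made here for illustration and are not part of the original code.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public final class BackgroundWorker {

    // Assumption: named threads, so the pool is easy to identify in a jstack dump.
    private static final ThreadFactory THREAD_FACTORY = new ThreadFactory() {
        private final AtomicInteger seq = new AtomicInteger(1);

        @Override
        public Thread newThread(Runnable r) {
            return new Thread(r, "background-worker-" + seq.getAndIncrement());
        }
    };

    // Created exactly once when the class is loaded; there is no public init() any more.
    private static final ThreadPoolExecutor EXECUTOR = new ThreadPoolExecutor(
            3, 15, 1000, TimeUnit.MINUTES,
            new LinkedBlockingDeque<>(1000),
            THREAD_FACTORY,
            new ThreadPoolExecutor.CallerRunsPolicy());

    private BackgroundWorker() {
    }

    public static void submit(final Runnable task) {
        if (task == null) {
            return;
        }
        EXECUTOR.execute(() -> {
            try {
                task.run();
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }

    /** Hypothetical helper for monitoring: current number of worker threads. */
    public static int poolSize() {
        return EXECUTOR.getPoolSize();
    }

    /** Intended to be called once, when the application stops. */
    public static void shutdown() {
        EXECUTOR.shutdown();
    }

    public static void main(String[] args) {
        // Simulate five scheduled runs that all share the same pool.
        for (int run = 0; run < 5; run++) {
            for (int i = 0; i < 20; i++) {
                submit(() -> { });
            }
        }
        System.out.println("pool size after 5 runs: " + poolSize());
        shutdown();
    }
}
```

Because the queue never fills up in this example, the pool does not grow beyond its 3 core threads however many runs submit work to it. With this shape, the BackgroundWorker.init(40) call in AdActivitySyncJob would simply be deleted.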

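Monitoring after the fix:

JVM memory monitoring: memory no longer rises continuously:

The thread count is back to normal and stable:

jstack dump: the number of thread pools is back to normal:

Heap dump analysis of the number of thread pool objects: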

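Further Notes

1. How to shut down a thread pool

A thread pool provides two shutdown methods: shutdownNow and shutdown.

shutdownNow: the pool stops accepting new tasks and shuts down immediately. It tries to stop the tasks that are running by interrupting their threads, does not start the tasks still waiting in the queue, and returns those waiting tasks to the caller.

shutdown: the pool stops accepting new tasks, and shuts down once the tasks already submitted have finished. The call itself does not block until they finish; awaitTermination does that.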
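
The two methods are often combined, following the usage pattern described in the ExecutorService documentation: try an orderly shutdown first and fall back to shutdownNow if the pool does not terminate in time. The sketch below is one way to write it; the class name PoolShutdown, the helper shutdownAndAwait, and the timeout values are illustrative choices, not code from the system discussed here. It is also the kind of cleanup that option 2 above (a pool per run) would need at the end of every run.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolShutdown {

    /** Returns true if the pool terminated within the given timeouts. */
    public static boolean shutdownAndAwait(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // reject new tasks; already submitted tasks still run
        try {
            if (!pool.awaitTermination(timeout, unit)) {
                pool.shutdownNow(); // interrupt running tasks and discard queued ones
                return pool.awaitTermination(timeout, unit);
            }
            return true;
        } catch (InterruptedException e) {
            // The waiting thread itself was interrupted: force shutdown and keep the flag set.
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < 10; i++) {
            final int id = i;
            pool.execute(() -> System.out.println("task " + id));
        }
        System.out.println("terminated: " + shutdownAndAwait(pool, 5, TimeUnit.SECONDS));
    }
}
```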

2. Why the threadPoolExecutor is not reclaimed by GC

    threadPoolExecutor =
            new ThreadPoolExecutor(3, 15, 1000, TimeUnit.MINUTES,
                    new LinkedBlockingDeque<>(1000), new ThreadPoolExecutor.CallerRunsPolicy());

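Will a locally used thread pool object that is never shut down manually be reclaimed by GC? We took a heap dump from production and analyzed it:

The thread pool objects had not been reclaimed. Why not? Look at the ThreadPoolExecutor.execute() method:

If the current number of threads is below the core pool size, execute() calls addWorker to create a thread:

In runWorker, the worker runs its first task if it has one; otherwise it calls getTask() to fetch a task: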

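workQueue.take() blocks indefinitely while it waits for a task to arrive in the queue, so the worker thread never terminates. The reference relationship is ThreadPoolExecutor->Worker->Thread, and it can be followed from the thread side as well: a live thread is a GC root, the thread references the Worker it is running, and Worker, being an inner class of ThreadPoolExecutor, references its enclosing pool. Because the pool is reachable from a GC root, it cannot be reclaimed.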
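
This reachability can be observed with a weak reference. The following is a minimal sketch; the class and method names are hypothetical. It creates a pool with the same parameters as above, drops every strong reference to it held by the program, requests a GC, and checks whether the weak reference was cleared. System.gc() is only a hint, so the check is meaningful in one direction only: a cleared reference would prove the pool had been collected, and that is not expected while an idle core thread is alive.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class IdlePoolReachability {

    /** Creates a pool, runs one task on it, and returns only a weak reference to it. */
    private static WeakReference<ThreadPoolExecutor> createAndAbandon() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                3, 15, 1000, TimeUnit.MINUTES,
                new LinkedBlockingDeque<>(1000),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch done = new CountDownLatch(1);
        pool.execute(done::countDown); // starts one core worker thread
        done.await();
        return new WeakReference<>(pool); // the local strong reference ends with this method
    }

    /**
     * Returns true if the abandoned pool is still reachable after a GC request.
     * The pool is shut down before returning so that the JVM can exit.
     */
    public static boolean poolSurvivesGcWithoutShutdown() {
        try {
            WeakReference<ThreadPoolExecutor> ref = createAndAbandon();
            System.gc(); // a hint only; see the note above
            Thread.sleep(200);
            ThreadPoolExecutor pool = ref.get();
            if (pool == null) {
                return false; // would mean the idle pool had been collected
            }
            pool.shutdown(); // idle workers are interrupted and exit
            pool.awaitTermination(5, TimeUnit.SECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("still reachable without shutdown: " + poolSurvivesGcWithoutShutdown());
    }
}
```

Once shutdown() has been called and the worker threads have exited, nothing but the program's own references keeps the pool alive, so it becomes eligible for collection.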