Preface
This post takes a first look at ShenandoahGC, newly introduced in JDK 12.
ShenandoahGC
Shenandoah is a concurrent and parallel garbage collector.
Like ZGC, it targets low pause times; ZGC is built on colored pointers, whereas Shenandoah GC is built on Brooks pointers.
Compared with G1 GC: G1's evacuation is parallel but not concurrent, whereas Shenandoah's evacuation is concurrent, which lets it reduce pause times further.
Like G1 GC, ShenandoahGC is region-based; unlike G1, it has no logical generations, so there is no young/old split.
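The Brooks-pointer indirection that makes concurrent evacuation possible can be pictured with a minimal sketch. The `Obj` class, its `forwardee` field and the `resolve`/`evacuate` helpers are hypothetical names chosen for illustration; in the real collector the forwarding pointer is a word stored alongside each heap object and is maintained by the JVM's barriers, not by Java code.

```java
import java.util.concurrent.atomic.AtomicReference;

class BrooksPointerSketch {
    static final class Obj {
        final int payload;
        // Forwarding pointer: refers to the object itself until a copy wins the CAS.
        final AtomicReference<Obj> forwardee = new AtomicReference<>();

        Obj(int payload) {
            this.payload = payload;
            forwardee.set(this);
        }
    }

    // Every access goes through the indirection to reach the current copy.
    static Obj resolve(Obj ref) {
        return ref.forwardee.get();
    }

    // Copy the object and try to install the copy atomically.
    // A thread that loses the race discards its copy and uses the winner's.
    static Obj evacuate(Obj from) {
        Obj current = from.forwardee.get();
        if (current != from) {
            return current;                  // already evacuated
        }
        Obj copy = new Obj(from.payload);
        if (from.forwardee.compareAndSet(from, copy)) {
            return copy;
        }
        return from.forwardee.get();         // another thread installed its copy first
    }

    public static void main(String[] args) {
        Obj original = new Obj(42);
        Obj moved = evacuate(original);
        System.out.println(resolve(original) == moved);
    }
}
```

Because the copy is installed with a single compare-and-set, application threads and GC threads can race on the same object and still agree on one current copy, which is what lets evacuation run without stopping the application.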
GC cycle
A ShenandoahGC cycle consists mainly of the following phases:
Snapshot-at-the-beginning concurrent mark
This phase comprises Init Mark (pause), Concurrent Mark, and Final Mark (pause). Marking uses the tri-color scheme: White (not yet visited), Gray (visited, but references are not scanned yet), and Black (visited, and fully scanned).
Concurrent evacuation
This is the evacuation phase that sets Shenandoah apart from G1: it runs concurrently. Objects are copied using Brooks pointers (object version change with additional atomically changed indirection).
Concurrent update references (optional)
This phase comprises Init Update Refs (pause), Concurrent Update Refs, and Final Update Refs (pause).
Either Final Mark or Final Update Refs may be followed by a Concurrent cleanup, which reclaims garbage regions.
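The White/Gray/Black scheme used by the marking phase can be sketched as a simple worklist traversal. This is a single-threaded illustration with hypothetical `Node` and `mark` names; it leaves out the snapshot-at-the-beginning barrier that keeps marking correct while the application keeps changing references.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TriColorMark {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Color color = Color.WHITE;           // not yet visited

        Node(String name) { this.name = name; }
    }

    // Marks everything reachable from the roots.
    static void mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node root : roots) {
            if (root.color == Color.WHITE) {
                root.color = Color.GRAY;     // visited, references not scanned yet
                gray.push(root);
            }
        }
        while (!gray.isEmpty()) {
            Node node = gray.pop();
            for (Node ref : node.refs) {
                if (ref.color == Color.WHITE) {
                    ref.color = Color.GRAY;
                    gray.push(ref);
                }
            }
            node.color = Color.BLACK;        // visited and fully scanned
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.refs.add(b);
        b.refs.add(a);                       // a cycle is handled by the color check
        mark(List.of(a));
        // c is unreachable and stays WHITE, so it is garbage
        System.out.println(a.color + " " + b.color + " " + c.color);
    }
}
```

Objects still White when the worklist is empty are unreachable; regions made up mostly of such objects are the ones worth evacuating and reclaiming.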
Related flags
Flags starting with Shenandoah
bool ShenandoahAcmpBarrier = true {diagnostic} {default}
bool ShenandoahAllocFailureALot = false {diagnostic} {default}
uintx ShenandoahAllocSpikeFactor = 5 {experimental} {default}
intx ShenandoahAllocationStallThreshold = 10000 {diagnostic} {default}
uintx ShenandoahAllocationThreshold = 0 {experimental} {default}
bool ShenandoahAllocationTrace = false {diagnostic} {default}
bool ShenandoahAllowMixedAllocs = true {diagnostic} {default}
bool ShenandoahAlwaysClearSoftRefs = false {experimental} {default}
bool ShenandoahAlwaysPreTouch = false {diagnostic} {default}
bool ShenandoahCASBarrier = true {diagnostic} {default}
bool ShenandoahCloneBarrier = true {diagnostic} {default}
uintx ShenandoahCodeRootsStyle = 2 {experimental} {default}
bool ShenandoahCommonGCStateLoads = false {experimental} {default}
bool ShenandoahConcurrentScanCodeRoots = true {experimental} {default}
uintx ShenandoahControlIntervalAdjustPeriod = 1000 {experimental} {default}
uintx ShenandoahControlIntervalMax = 10 {experimental} {default}
uintx ShenandoahControlIntervalMin = 1 {experimental} {default}
uintx ShenandoahCriticalFreeThreshold = 1 {experimental} {default}
bool ShenandoahDecreaseRegisterPressure = false {diagnostic} {default}
bool ShenandoahDegeneratedGC = true {diagnostic} {default}
bool ShenandoahDontIncreaseWBFreq = true {experimental} {default}
bool ShenandoahElasticTLAB = true {diagnostic} {default}
uintx ShenandoahEvacAssist = 10 {experimental} {default}
uintx ShenandoahEvacReserve = 5 {experimental} {default}
bool ShenandoahEvacReserveOverflow = true {experimental} {default}
double ShenandoahEvacWaste = 1.200000 {experimental} {default}
uintx ShenandoahFreeThreshold = 10 {experimental} {default}
uintx ShenandoahFullGCThreshold = 3 {experimental} {default}
ccstr ShenandoahGCHeuristics = adaptive {experimental} {default}
uintx ShenandoahGarbageThreshold = 60 {experimental} {default}
uintx ShenandoahGuaranteedGCInterval = 300000 {experimental} {default}
size_t ShenandoahHeapRegionSize = 0 {experimental} {default}
bool ShenandoahHumongousMoves = true {experimental} {default}
intx ShenandoahHumongousThreshold = 100 {experimental} {default}
uintx ShenandoahImmediateThreshold = 90 {experimental} {default}
bool ShenandoahImplicitGCInvokesConcurrent = true {experimental} {default}
uintx ShenandoahInitFreeThreshold = 70 {experimental} {default}
bool ShenandoahKeepAliveBarrier = true {diagnostic} {default}
uintx ShenandoahLearningSteps = 5 {experimental} {default}
bool ShenandoahLoopOptsAfterExpansion = true {experimental} {default}
uintx ShenandoahMarkLoopStride = 1000 {experimental} {default}
intx ShenandoahMarkScanPrefetch = 32 {experimental} {default}
size_t ShenandoahMaxRegionSize = 33554432 {experimental} {default}
uintx ShenandoahMergeUpdateRefsMaxGap = 200 {experimental} {default}
uintx ShenandoahMergeUpdateRefsMinGap = 100 {experimental} {default}
uintx ShenandoahMinFreeThreshold = 10 {experimental} {default}
size_t ShenandoahMinRegionSize = 262144 {experimental} {default}
bool ShenandoahOOMDuringEvacALot = false {diagnostic} {default}
bool ShenandoahOptimizeInstanceFinals = false {experimental} {default}
bool ShenandoahOptimizeStableFinals = false {experimental} {default}
bool ShenandoahOptimizeStaticFinals = true {experimental} {default}
bool ShenandoahPacing = true {experimental} {default}
uintx ShenandoahPacingCycleSlack = 10 {experimental} {default}
uintx ShenandoahPacingIdleSlack = 2 {experimental} {default}
uintx ShenandoahPacingMaxDelay = 10 {experimental} {default}
double ShenandoahPacingSurcharge = 1.100000 {experimental} {default}
uintx ShenandoahParallelRegionStride = 1024 {experimental} {default}
uint ShenandoahParallelSafepointThreads = 4 {experimental} {default}
bool ShenandoahPreclean = true {experimental} {default}
bool ShenandoahReadBarrier = true {diagnostic} {default}
uintx ShenandoahRefProcFrequency = 5 {experimental} {default}
bool ShenandoahRegionSampling = true {experimental} {command line}
int ShenandoahRegionSamplingRate = 40 {experimental} {default}
bool ShenandoahSATBBarrier = true {diagnostic} {default}
uintx ShenandoahSATBBufferFlushInterval = 100 {experimental} {default}
size_t ShenandoahSATBBufferSize = 1024 {experimental} {default}
bool ShenandoahStoreCheck = false {diagnostic} {default}
bool ShenandoahStoreValEnqueueBarrier = false {diagnostic} {default}
bool ShenandoahStoreValReadBarrier = true {diagnostic} {default}
bool ShenandoahSuspendibleWorkers = false {experimental} {default}
size_t ShenandoahTargetNumRegions = 2048 {experimental} {default}
bool ShenandoahTerminationTrace = false {diagnostic} {default}
bool ShenandoahUncommit = true {experimental} {default}
uintx ShenandoahUncommitDelay = 300000 {experimental} {default}
uintx ShenandoahUnloadClassesFrequency = 0 {experimental} {default}
ccstr ShenandoahUpdateRefsEarly = adaptive {experimental} {default}
bool ShenandoahVerify = false {diagnostic} {default}
intx ShenandoahVerifyLevel = 4 {diagnostic} {default}
bool ShenandoahWriteBarrier = true {diagnostic} {default}
Some of these are diagnostic flags, for example ShenandoahAcmpBarrier, ShenandoahAllocFailureALot, and ShenandoahAllocationStallThreshold.
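To separate the diagnostic flags from the experimental ones mechanically, the listing can be parsed line by line. The sketch below assumes each line has the shape `type name = value {kind} {origin}`, as in the listing above, which looks like `-XX:+PrintFlagsFinal` output (an assumption; the post does not say how it was produced). `FlagLines`, `parse` and `namesOfKind` are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagLines {
    // Groups: type, name, value, {kind}, {origin}
    private static final Pattern LINE = Pattern.compile(
            "^\\s*(\\S+)\\s+(\\S+)\\s+=\\s+(\\S+)\\s+\\{([^}]+)\\}\\s+\\{([^}]+)\\}\\s*$");

    // Returns {type, name, value, kind, origin}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4), m.group(5) };
    }

    // Names of all flags whose kind is e.g. "diagnostic" or "experimental".
    static List<String> namesOfKind(List<String> lines, String kind) {
        List<String> names = new ArrayList<>();
        for (String line : lines) {
            String[] parts = parse(line);
            if (parts != null && parts[3].equals(kind)) {
                names.add(parts[1]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "bool ShenandoahAcmpBarrier = true {diagnostic} {default}",
                "double ShenandoahEvacWaste = 1.200000 {experimental} {default}",
                "bool ShenandoahRegionSampling = true {experimental} {command line}");
        System.out.println(namesOfKind(sample, "diagnostic"));
    }
}
```

The origin column is also useful: `{command line}` on ShenandoahRegionSampling shows that this flag was set explicitly, while `{default}` marks untouched values.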
Heuristics-related flags
ccstr ShenandoahGCHeuristics = adaptive {experimental} {default}
uintx ShenandoahInitFreeThreshold = 70 {experimental} {default}
uintx ShenandoahMinFreeThreshold = 10 {experimental} {default}
uintx ShenandoahAllocSpikeFactor = 5 {experimental} {default}
uintx ShenandoahGarbageThreshold = 60 {experimental} {default}
uintx ShenandoahFreeThreshold = 10 {experimental} {default}
uintx ShenandoahAllocationThreshold = 0 {experimental} {default}
ccstr ShenandoahUpdateRefsEarly = adaptive {experimental} {default}
Heuristics tell Shenandoah when to start a GC cycle. ShenandoahGCHeuristics selects the policy; its values are adaptive (default), static, compact, passive (diagnostic), and aggressive (diagnostic).
adaptive decides when to start a GC cycle mainly from ShenandoahInitFreeThreshold (Initial remaining free heap threshold for learning steps), ShenandoahMinFreeThreshold (free space threshold at which heuristics triggers the GC unconditionally), ShenandoahAllocSpikeFactor (How much heap to reserve for absorbing allocation spikes), and ShenandoahGarbageThreshold (Sets the percentage of garbage a region needs to contain before it can be marked for collection).
static decides whether to start a GC cycle from heap occupancy and allocation pressure. Related flags: ShenandoahFreeThreshold (Set the percentage of free heap at which a GC cycle is started), ShenandoahAllocationThreshold (Set percentage of memory allocated since last GC cycle before a new GC cycle is started), and ShenandoahGarbageThreshold.
compact runs continuously: as long as allocation is happening, a new GC cycle starts as soon as the previous one finishes. Related flags: ConcGCThreads (Trim down the number of concurrent GC threads to make more room for application to run) and ShenandoahAllocationThreshold.
passive is entirely passive: it triggers a stop-the-world collection only when memory is exhausted. It is normally used for diagnostics.
aggressive is entirely active: a new GC cycle starts as soon as the previous one finishes (somewhat like compact), but it evacuates all live objects. It is normally used for diagnostics.
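As a rough illustration of the static policy described above, the trigger can be written as two percentage checks against the heap capacity. This is a simplified sketch, not the HotSpot implementation: the method and parameter names are hypothetical, and treating an allocation threshold of 0 as "disabled" is an assumption made here for the example.

```java
class StaticTriggerSketch {
    // Returns true when a new cycle should start under the simplified rule:
    // free heap has dropped below freeThresholdPercent of capacity, or the bytes
    // allocated since the last cycle exceed allocationThresholdPercent of capacity.
    static boolean shouldStartCycle(long capacity, long free, long allocatedSinceLastCycle,
                                    int freeThresholdPercent, int allocationThresholdPercent) {
        boolean lowFree = free * 100 < capacity * freeThresholdPercent;
        // Assumption for this sketch: 0 means the allocation check is switched off.
        boolean heavyAllocation = allocationThresholdPercent > 0
                && allocatedSinceLastCycle * 100 > capacity * allocationThresholdPercent;
        return lowFree || heavyAllocation;
    }

    public static void main(String[] args) {
        // 2048M heap with 1813M free and a 10% free threshold: no cycle needed yet.
        System.out.println(shouldStartCycle(2048, 1813, 0, 10, 0));
        // Only 150M free is below 10% of 2048M, so a cycle should start.
        System.out.println(shouldStartCycle(2048, 150, 0, 10, 0));
    }
}
```

The adaptive policy differs in that it adjusts its trigger from observed behavior instead of using fixed percentages, which is why it needs the learning-step and spike-factor flags.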
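Failure Modes
When an allocation failure occurs, Shenandoah handles it through a graceful degradation ladder:
Pacing (<10 ms)
ShenandoahPacing is enabled by default. When the GC is not keeping up, the pacer stalls the allocating threads and releases them once the GC catches up. A stall is bounded by ShenandoahPacingMaxDelay (in milliseconds); once that delay is exceeded, the allocation goes ahead anyway. Under heavy allocation pressure the pacer cannot cope, and the next step takes over.
Degenerated GC (<100 ms)
ShenandoahDegeneratedGC is enabled by default. In a Degenerated cycle, the number of threads Shenandoah uses is taken from ParallelGCThreads rather than ConcGCThreads.
Full GC (>100 ms)
If there is still not enough memory after a Degenerated GC, Shenandoah falls into a Full GC cycle, which compacts as much as possible and frees memory to avoid an OOM.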
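The first rung of the ladder, pacing with a bounded stall, can be sketched as a shared allocation budget that the collector tops up and application threads draw from. This is an illustrative model only: `PacerSketch`, `replenish` and `paceForAlloc` are hypothetical names, and the real pacer works with per-phase tax rates (the "Alloc Tax Rate" values in the GC log) that are not modeled here.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

class PacerSketch {
    private final AtomicLong budget;         // bytes that may be allocated without waiting
    private final long maxDelayNanos;        // plays the role of ShenandoahPacingMaxDelay

    PacerSketch(long initialBudgetBytes, long maxDelayMillis) {
        this.budget = new AtomicLong(initialBudgetBytes);
        this.maxDelayNanos = maxDelayMillis * 1_000_000L;
    }

    // Called by the collector as it makes progress.
    void replenish(long bytes) {
        budget.addAndGet(bytes);
    }

    // Called before an allocation. Returns true if the budget covered it, and
    // false if the maximum delay expired, in which case the allocation proceeds unpaced.
    boolean paceForAlloc(long bytes) {
        long deadline = System.nanoTime() + maxDelayNanos;
        while (true) {
            long current = budget.get();
            if (current >= bytes) {
                if (budget.compareAndSet(current, current - bytes)) {
                    return true;
                }
                continue;                    // lost a race with another thread, retry
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                // stall is bounded: give up waiting
            }
            LockSupport.parkNanos(100_000L); // short stall, then re-check the budget
        }
    }

    public static void main(String[] args) {
        PacerSketch pacer = new PacerSketch(1024, 10);
        System.out.println(pacer.paceForAlloc(512));   // budget available, no stall
        System.out.println(pacer.paceForAlloc(1024));  // budget too small, stalls then gives up
    }
}
```

When allocations keep returning without budget, pacing is no longer absorbing the pressure, which corresponds to the point where the ladder moves on to a Degenerated GC and, failing that, a Full GC.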
Example
Startup flags
-server -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:+UsePerfData -XX:+ShenandoahRegionSampling -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4 -XX:+UnlockDiagnosticVMOptions -Xlog:age*,ergo*,gc*=info
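To confirm from inside the application that the flags took effect, the standard `java.lang.management` beans can be queried. A minimal sketch follows; `GcRuntimeCheck` is a hypothetical class name, and the collector bean names it prints depend on the JDK build, so no particular name is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

class GcRuntimeCheck {
    // Names of the garbage collector beans registered in the running JVM.
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    // True if the exact flag string was passed to the JVM at startup.
    static boolean startedWith(String flag) {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().contains(flag);
    }

    public static void main(String[] args) {
        System.out.println("collectors: " + collectorNames());
        System.out.println("UseShenandoahGC passed: " + startedWith("-XX:+UseShenandoahGC"));
    }
}
```

Running this class with the startup flags above should report that `-XX:+UseShenandoahGC` was passed; the first lines of the GC log give the same confirmation through the `Using Shenandoah` message.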
GC log
[2019-03-21T15:12:53.771-0800][8707][gc] Consider -XX:+ClassUnloadingWithConcurrentMark if large pause times are observed on class-unloading sensitive workloads
[2019-03-21T15:12:53.862-0800][8707][gc,init] Regions: 2048 x 1024K
[2019-03-21T15:12:53.862-0800][8707][gc,init] Humongous object threshold: 1024K
[2019-03-21T15:12:53.863-0800][8707][gc,init] Max TLAB size: 1024K
[2019-03-21T15:12:53.863-0800][8707][gc,init] GC threads: 4 parallel, 4 concurrent
[2019-03-21T15:12:53.863-0800][8707][gc,init] Reference processing: parallel
[2019-03-21T15:12:53.864-0800][8707][gc] Heuristics ergonomically sets -XX:+ExplicitGCInvokesConcurrent
[2019-03-21T15:12:53.864-0800][8707][gc] Heuristics ergonomically sets -XX:+ShenandoahImplicitGCInvokesConcurrent
[2019-03-21T15:12:53.864-0800][8707][gc,init] Shenandoah heuristics: adaptive
[2019-03-21T15:12:53.864-0800][8707][gc,heap] Initialize Shenandoah heap with initial size 128M
[2019-03-21T15:12:53.865-0800][8707][gc,ergo] Pacer for Idle. Initial: 40M, Alloc Tax Rate: 1.0x
[2019-03-21T15:12:53.883-0800][8707][gc,init] Safepointing mechanism: global-page poll
[2019-03-21T15:12:53.883-0800][8707][gc] Using Shenandoah
[2019-03-21T15:12:53.884-0800][8707][gc,heap,coops] Heap address: 0x0000000780000000, size: 2048 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
[2019-03-21T15:12:59.530-0800][14083][gc] Trigger: Metadata GC Threshold
[2019-03-21T15:12:59.532-0800][14083][gc,ergo] Free: 1813M (1813 regions), Max regular: 1024K, Max humongous: 1855488K, External frag: 1%, Internal frag: 0%
[2019-03-21T15:12:59.532-0800][14083][gc,ergo] Evacuation Reserve: 103M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.532-0800][14083][gc,start] GC(0) Concurrent reset
[2019-03-21T15:12:59.533-0800][14083][gc,task] GC(0) Using 4 of 4 workers for concurrent reset
[2019-03-21T15:12:59.533-0800][14083][gc] GC(0) Concurrent reset 132M->132M(2048M) 0.441ms
[2019-03-21T15:12:59.533-0800][15619][gc,start] GC(0) Pause Init Mark (process weakrefs) (unload classes)
[2019-03-21T15:12:59.533-0800][15619][gc,task] GC(0) Using 4 of 4 workers for init marking
[2019-03-21T15:12:59.541-0800][15619][gc,ergo] GC(0) Pacer for Mark. Expected Live: 204M, Free: 1813M, Non-Taxable: 181M, Alloc Tax Rate: 0.4x
[2019-03-21T15:12:59.541-0800][15619][gc] GC(0) Pause Init Mark (process weakrefs) (unload classes) 7.568ms
[2019-03-21T15:12:59.541-0800][14083][gc,start] GC(0) Concurrent marking (process weakrefs) (unload classes)
[2019-03-21T15:12:59.541-0800][14083][gc,task] GC(0) Using 4 of 4 workers for concurrent marking
[2019-03-21T15:12:59.619-0800][14083][gc] GC(0) Concurrent marking (process weakrefs) (unload classes) 132M->134M(2048M) 78.373ms
[2019-03-21T15:12:59.619-0800][14083][gc,start] GC(0) Concurrent precleaning
[2019-03-21T15:12:59.619-0800][14083][gc,task] GC(0) Using 1 of 4 workers for concurrent preclean
[2019-03-21T15:12:59.622-0800][14083][gc] GC(0) Concurrent precleaning 134M->134M(2048M) 2.397ms
[2019-03-21T15:12:59.622-0800][15619][gc,start] GC(0) Pause Final Mark (process weakrefs) (unload classes)
[2019-03-21T15:12:59.622-0800][15619][gc,task] GC(0) Using 4 of 4 workers for final marking
[2019-03-21T15:12:59.625-0800][15619][gc,stringtable] GC(0) Cleaned string table, strings: 13692 processed, 50 removed
[2019-03-21T15:12:59.626-0800][15619][gc,ergo] GC(0) Adaptive CSet Selection. Target Free: 204M, Actual Free: 1914M, Max CSet: 85M, Min Garbage: 0M
[2019-03-21T15:12:59.626-0800][15619][gc,ergo] GC(0) Collectable Garbage: 117M (97% of total), 8M CSet, 126 CSet regions
[2019-03-21T15:12:59.626-0800][15619][gc,ergo] GC(0) Immediate Garbage: 0M (0% of total), 0 regions
[2019-03-21T15:12:59.626-0800][15619][gc,ergo] GC(0) Pacer for Evacuation. Used CSet: 126M, Free: 1811M, Non-Taxable: 181M, Alloc Tax Rate: 1.1x
[2019-03-21T15:12:59.626-0800][15619][gc] GC(0) Pause Final Mark (process weakrefs) (unload classes) 4.712ms
[2019-03-21T15:12:59.626-0800][14083][gc,start] GC(0) Concurrent cleanup
[2019-03-21T15:12:59.627-0800][14083][gc] GC(0) Concurrent cleanup 134M->135M(2048M) 0.132ms
[2019-03-21T15:12:59.627-0800][14083][gc,ergo] GC(0) Free: 1810M (1810 regions), Max regular: 1024K, Max humongous: 1852416K, External frag: 1%, Internal frag: 0%
[2019-03-21T15:12:59.627-0800][14083][gc,ergo] GC(0) Evacuation Reserve: 102M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.627-0800][14083][gc,start] GC(0) Concurrent evacuation
[2019-03-21T15:12:59.627-0800][14083][gc,task] GC(0) Using 4 of 4 workers for concurrent evacuation
[2019-03-21T15:12:59.643-0800][14083][gc] GC(0) Concurrent evacuation 135M->145M(2048M) 15.912ms
[2019-03-21T15:12:59.643-0800][15619][gc,start] GC(0) Pause Init Update Refs
[2019-03-21T15:12:59.643-0800][15619][gc,ergo] GC(0) Pacer for Update Refs. Used: 145M, Free: 1810M, Non-Taxable: 181M, Alloc Tax Rate: 1.1x
[2019-03-21T15:12:59.643-0800][15619][gc] GC(0) Pause Init Update Refs 0.090ms
[2019-03-21T15:12:59.643-0800][14083][gc,start] GC(0) Concurrent update references
[2019-03-21T15:12:59.643-0800][14083][gc,task] GC(0) Using 4 of 4 workers for concurrent reference update
[2019-03-21T15:12:59.652-0800][14083][gc] GC(0) Concurrent update references 145M->147M(2048M) 9.028ms
[2019-03-21T15:12:59.652-0800][15619][gc,start] GC(0) Pause Final Update Refs
[2019-03-21T15:12:59.652-0800][15619][gc,task] GC(0) Using 4 of 4 workers for final reference update
[2019-03-21T15:12:59.653-0800][15619][gc] GC(0) Pause Final Update Refs 0.489ms
[2019-03-21T15:12:59.653-0800][14083][gc,start] GC(0) Concurrent cleanup
[2019-03-21T15:12:59.653-0800][14083][gc] GC(0) Concurrent cleanup 147M->21M(2048M) 0.088ms
[2019-03-21T15:12:59.653-0800][14083][gc,ergo] Free: 1924M (1924 regions), Max regular: 1024K, Max humongous: 1840128K, External frag: 7%, Internal frag: 0%
[2019-03-21T15:12:59.653-0800][14083][gc,ergo] Evacuation Reserve: 103M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.653-0800][14083][gc,ergo] Pacer for Idle. Initial: 40M, Alloc Tax Rate: 1.0x
[2019-03-21T15:17:59.666-0800][14083][gc] Trigger: Time since last GC (300009 ms) is larger than guaranteed interval (300000 ms)
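The pause lines in this log can be pulled out and summed to see how much of the cycle actually stopped the application. The sketch below assumes the line layout shown above, where a completed pause ends in `<millis>ms` and the matching `gc,start` line has no duration; `PauseLogSummary` is a hypothetical helper.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PauseLogSummary {
    // Matches completed pause lines such as "... GC(0) Pause Init Mark (...) 7.568ms".
    private static final Pattern PAUSE =
            Pattern.compile("GC\\(\\d+\\) (Pause .+?) (\\d+\\.\\d+)ms$");

    // Sums the durations of all completed pause lines, in milliseconds.
    static double totalPauseMillis(List<String> lines) {
        double total = 0.0;
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line.trim());
            if (m.find()) {
                total += Double.parseDouble(m.group(2));
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "[2019-03-21T15:12:59.533-0800][15619][gc,start] GC(0) Pause Init Mark (process weakrefs) (unload classes)",
                "[2019-03-21T15:12:59.541-0800][15619][gc] GC(0) Pause Init Mark (process weakrefs) (unload classes) 7.568ms",
                "[2019-03-21T15:12:59.626-0800][15619][gc] GC(0) Pause Final Mark (process weakrefs) (unload classes) 4.712ms",
                "[2019-03-21T15:12:59.643-0800][14083][gc] GC(0) Concurrent evacuation 135M->145M(2048M) 15.912ms",
                "[2019-03-21T15:12:59.643-0800][15619][gc] GC(0) Pause Init Update Refs 0.090ms",
                "[2019-03-21T15:12:59.653-0800][15619][gc] GC(0) Pause Final Update Refs 0.489ms");
        System.out.println("total pause ms: " + totalPauseMillis(sample));
    }
}
```

For GC(0) the four pauses add up to 7.568 + 4.712 + 0.090 + 0.489 = 12.859 ms, while marking, evacuation and reference updating (78.373 ms, 15.912 ms and 9.028 ms) ran concurrently with the application.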
gc visualizer
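The shenandoah-visualizer tool can be used to visualize ShenandoahGC; it relies on the region data exposed by the -XX:+UsePerfData and -XX:+ShenandoahRegionSampling flags used in the startup flags above.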
Summary
Shenandoah is a concurrent and parallel garbage collector. Like ZGC, it targets low pause times; ZGC is built on colored pointers, whereas Shenandoah GC is built on Brooks pointers. G1's evacuation is parallel but not concurrent, whereas Shenandoah's evacuation is concurrent, which lets it reduce pause times further. Like G1 GC, ShenandoahGC is region-based, but it has no logical generations and therefore no young/old split.
A Shenandoah GC cycle consists mainly of Snapshot-at-the-beginning concurrent mark (Init Mark (pause), Concurrent Mark, Final Mark (pause)), Concurrent evacuation, and the optional Concurrent update references (Init Update Refs (pause), Concurrent Update Refs, Final Update Refs (pause)). Either Final Mark or Final Update Refs may be followed by a Concurrent cleanup, which reclaims garbage regions.
Heuristics tell Shenandoah when to start a GC. ShenandoahGCHeuristics selects the policy; its values are adaptive (default), static, compact, passive (diagnostic), and aggressive (diagnostic). When an allocation failure occurs, Shenandoah handles it through a graceful degradation ladder: Pacing (<10 ms), Degenerated GC (<100 ms), and Full GC (>100 ms).
doc
Shenandoah GC
JEP 189: Shenandoah: A Low-Pause-Time Garbage Collector (Experimental)
Changes to Garbage Collection in Java 12
9 Garbage-First Garbage Collector
G1GC – Java 9 Garbage Collector explained in 5 minutes
devoxx-Nov2017-shenandoah (some of the images come from this PDF)