About Java: How to Benchmark with JMH in Java

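0. Introduction

Development work often calls for tuning how fast code runs. How can the effect of that tuning be evaluated scientifically? You need to measure precisely how fast a method runs, and the scientific way to do that is a microbenchmark that produces quantified results. JMH is a tool provided by OpenJDK for benchmarking programs written in Java. This article introduces the basic usage of JMH and walks through a practical example.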

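Related concepts

Benchmark: a test used mainly to measure the performance of methods. Results can be reported in different units depending on the configuration (for example as throughput or as average time, selected through BenchmarkMode).
Micro Benchmark: put simply, a benchmark at the method level, with precision down to the microsecond level.
OPS, Operations Per Second: the number of operations per second, an important performance metric; the higher the value, the better the performance. TPS and QPS are similar metrics.
Throughput: the number of operations completed per unit of time.
Warmup: why is warmup needed? Because of the JVM's JIT mechanism: once a function has been called many times, the JVM tries to compile it to machine code to speed it up. A program in production is subject to these automatic JVM optimizations, so a benchmark has to warm up first for its results to be close to real-world behavior.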
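The warmup effect can often be observed even without JMH. The following is a minimal sketch, assuming a hypothetical class `WarmupDemo` and a hand-rolled `System.nanoTime` timer; it is not how JMH measures, and the timings depend entirely on the machine and JVM. On a typical HotSpot JVM the first rounds tend to be slower than later ones.

```java
// Hypothetical demo class; it does not use JMH.
class WarmupDemo {

    // The workload being timed: sum the integers 0..n-1.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs the workload `calls` times and returns the elapsed nanoseconds.
    static long timeRound(int calls, int n) {
        long sink = 0; // accumulate results so the work is not trivially discarded
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            sink += sumTo(n);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) { // not true for the inputs used here; keeps `sink` observable
            System.out.println(sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Print the time of several identical rounds; compare early and late rounds.
        for (int round = 1; round <= 5; round++) {
            long nanos = timeRound(2_000, 10_000);
            System.out.printf("round %d: %d ns%n", round, nanos);
        }
    }
}
```

A hand-rolled timer like this is only useful for seeing the effect; JMH's warmup iterations exist so that such unstable early rounds are kept out of the reported score.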

1. What is JMH?

JMH stands for Java Microbenchmark Harness. It is a Java tool for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other JVM-based languages.

The official JMH description: “JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targetting the JVM.”

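2. What can JMH do?

Features

In one sentence: JMH measures how long a method takes to execute, and running a JMH test yields a quantified result for that execution time.

Use cases

Examples of where JMH applies:

a. Measuring how long a method takes to execute
b. Measuring how a method's execution time depends on the input size n
c. Comparing the performance of several implementations of the same method
d. Evaluating the performance of third-party library APIs called by an application
e. A combination of b and c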

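Practical JMH examples:

Evaluating how ArrayList traversal performance depends on the input size n
Comparing how ArrayList and LinkedList traversal performance depend on the input size n, and comparing the difference
Evaluating the performance of the put method of a redis-client Java library
Comparing two summation methods across N input sizes: methodSumA uses the Stream API and methodSumB uses a traditional accumulating loop, and the goal is to see how the performance of each changes as the amount of input data grows. The JMH results can then guide the choice of summation implementation for each order of magnitude of input.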

3. How to use it

As an example, we compare a for-loop summation with a Stream API summation for input sizes of 10000, 100000, 1000000, and 10000000, and show how to run this test with JMH.

Pseudocode describing the test procedure:

forLoopMethod() {
    loop (size in (10000, 100000, 1000000, 10000000)) { // iterate over the input sizes
        doSumFromZeroTo(size); // sum 0~size with a for loop
    }
}
streamMethod() {
    loop (size in (10000, 100000, 1000000, 10000000)) { // iterate over the input sizes
        doSumFromZeroTo(size); // sum 0~size with the stream sum API
    }
}
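Before benchmarking two implementations it helps to confirm that they return the same answer. The following is a minimal plain-Java sketch of the two summation strategies from the pseudocode above; the class name `SumCheck` and its method names are hypothetical and are not the benchmark class used later in this article.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical helper class for a correctness check; it does not use JMH.
class SumCheck {

    // Traditional accumulating loop.
    static int forLoopSum(int[] items) {
        int result = 0;
        for (int item : items) {
            result += item;
        }
        return result;
    }

    // Stream API summation.
    static int streamSum(int[] items) {
        return Arrays.stream(items).sum();
    }

    // Builds the input array 0, 1, ..., size - 1.
    static int[] zeroTo(int size) {
        return IntStream.range(0, size).toArray();
    }

    public static void main(String[] args) {
        // Note: for the larger sizes the int sum overflows, but both
        // implementations wrap around in the same way, so they still agree.
        for (int size : new int[] {10_000, 100_000, 1_000_000, 10_000_000}) {
            int[] items = zeroTo(size);
            boolean same = forLoopSum(items) == streamSum(items);
            System.out.println("size=" + size + " results agree: " + same);
        }
    }
}
```

If exact sums matter for the larger inputs, a `long` accumulator would be the safer choice; the sketch keeps `int` only to mirror the benchmark code in this article.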

3.1. Create the project

Create a project

This example uses a project built with Maven.
The JDK version used is 1.8.
Add the following properties and dependency nodes to bring the dependencies into the project:

<properties>
    <!-- Prefer the latest version where possible -->
    <jmh.version>1.28</jmh.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
</properties>
<dependencies>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-core</artifactId>
        <version>${jmh.version}</version>
    </dependency>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>${jmh.version}</version>
        <scope>provided</scope>
    </dependency>
</dependencies>

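3.2. Parameters

Some settings are required before running JMH. JMH options can be supplied programmatically by passing an org.openjdk.jmh.runner.options.Options object to new Runner(...).
JMH also relies on JSR 269 annotation processing: when Java source is compiled, the annotation processor recognizes the annotations it needs to handle, such as @Benchmark, and generates a set of helper classes based on the @Benchmark configuration. Options can therefore also be supplied through annotations.
The following uses annotations to explain what each option does.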

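@BenchmarkMode
Mode is the mode JMH uses when running a benchmark. Modes differ in what is measured or in how it is measured. JMH currently has four measurement modes:
1. Throughput: overall throughput, for example "how many calls can be executed in 1 second"; the unit is operations/time.
2. AverageTime: average time per call, for example "each call takes xxx milliseconds on average"; the unit is time/operation.
3. SampleTime: random sampling, with the distribution of the samples reported at the end, for example "99% of calls finish within xxx milliseconds, 99.99% of calls finish within xxx milliseconds".
4. SingleShotTime: the modes above run each iteration for a fixed duration (10 s each in the JMH 1.28 run shown later in this article; the default differs between JMH versions), whereas SingleShotTime runs the method only once. It is often combined with a warmup count of 0 to measure cold-start performance.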
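The first three modes report different views of the same raw timings. The following is a minimal sketch of that arithmetic in plain Java; the class `ModeMath` is hypothetical, and the nearest-rank percentile used here is an assumption chosen for simplicity, not necessarily the method JMH uses for SampleTime.

```java
import java.util.Arrays;

// Hypothetical helper class; it illustrates the units only and does not use JMH.
class ModeMath {

    // Throughput view: operations per second.
    static double throughputOpsPerSec(long operations, long elapsedNanos) {
        return operations / (elapsedNanos / 1_000_000_000.0);
    }

    // AverageTime view: nanoseconds per operation.
    static double averageNanosPerOp(long operations, long elapsedNanos) {
        return (double) elapsedNanos / operations;
    }

    // SampleTime view: nearest-rank percentile of individual call times.
    static long percentile(long[] samplesNanos, double p) {
        long[] sorted = samplesNanos.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long operations = 1_000;
        long elapsedNanos = 1_000_000_000L; // one second
        System.out.println(throughputOpsPerSec(operations, elapsedNanos) + " ops/s");
        System.out.println(averageNanosPerOp(operations, elapsedNanos) + " ns/op");

        long[] samples = {5, 1, 3, 2, 4}; // made-up call times in nanoseconds
        System.out.println("p50 = " + percentile(samples, 50) + " ns");
        System.out.println("p99 = " + percentile(samples, 99) + " ns");
    }
}
```

Throughput and average time are reciprocals of each other up to the time unit, which is why the same benchmark can be reported either way.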
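@OutputTimeUnit
The time unit used in the output.
Iteration
An iteration is the smallest unit of measurement in JMH (it is a concept configured through @Warmup and @Measurement, not an annotation of its own). In most modes one iteration is a fixed time slice (1 s in older JMH releases; the 1.28 run shown later in this article uses 10 s). During that time JMH keeps calling the benchmarked method, then samples it according to the mode and computes throughput, average execution time, and so on.
@Warmup
Warmup means running the code before the actual benchmark measurement begins.
Why is warmup needed? Because of the JVM's JIT mechanism: once a function has been called many times, the JVM tries to compile it to machine code to speed it up. Warmup is needed for the benchmark results to be close to real-world behavior.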
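@State
Class annotation. A class that holds benchmark state must be annotated with @State, which defines the life cycle of an instance of that class, comparable to the scope of a Spring bean. Because JMH allows several threads to run a test at the same time, the options have the following meanings:
1. Scope.Thread: each test thread is given its own instance;
2. Scope.Benchmark: all test threads share one instance, which is useful for testing the performance of a stateful instance shared across threads;
3. Scope.Group: each thread group shares one instance;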
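The difference between per-thread and shared state can be mimicked in plain Java. The following is only an analogy under stated assumptions: the class `StateScopeAnalogy` is hypothetical, a `ThreadLocal` stands in for Scope.Thread, and a single shared object stands in for Scope.Benchmark. It does not show how JMH itself creates state objects.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; it mimics the scopes with plain Java and does not use JMH.
class StateScopeAnalogy {

    // Stand-in for a class that would carry @State.
    static class Counter {
        int value;
    }

    // Starts `threads` workers and returns how many distinct Counter instances they saw.
    static int distinctInstances(boolean perThread, int threads) {
        Counter shared = new Counter();                                      // shared-instance analogue
        ThreadLocal<Counter> local = ThreadLocal.withInitial(Counter::new);  // per-thread analogue
        Set<Counter> seen = ConcurrentHashMap.newKeySet();                   // Counter uses identity equality

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> seen.add(perThread ? local.get() : shared));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("per-thread instances: " + distinctInstances(true, 4));
        System.out.println("shared instances:     " + distinctInstances(false, 4));
    }
}
```

With a shared instance, any mutable field becomes a point of contention between threads, which is exactly what a Scope.Benchmark test is meant to expose.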
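@Fork
The number of forks. With a fork count of 2, JMH forks two processes to run the test.
@Measurement
Supplies the parameters for the actual measurement phase: the number of iterations, the run time of each iteration, and the number of benchmark calls per iteration (the last is typically combined with @BenchmarkMode(Mode.SingleShotTime) to measure the cost of a group of operations without writing a loop).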
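@Setup
Method annotation. The method runs before the benchmark executes and, as the name suggests, is mainly used for initialization.

@TearDown

Method annotation, the counterpart of @Setup. The method runs after all benchmark executions have finished and is mainly used to release resources.

The @Setup and @TearDown annotations take a Level argument that specifies when the fixture is called:

Name	Description
Level.Trial	Default level. Before/after the whole benchmark run (a group of iterations)
Level.Iteration	Before/after one iteration (a group of invocations)
Level.Invocation	Before/after every method invocation (not recommended unless you know exactly why you need it)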
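The nesting of the three levels can be pictured as three nested loops. The following plain-Java simulation is a sketch of that ordering only; `FixtureLevelsSketch` and its event names are hypothetical, and the real JMH runner involves forks, warmup, and threads that are left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of fixture ordering; it does not use JMH.
class FixtureLevelsSketch {

    // Records the order in which fixtures and benchmark calls would fire.
    static List<String> simulate(int iterations, int invocationsPerIteration) {
        List<String> events = new ArrayList<>();
        events.add("setup(Trial)");                  // once per benchmark run
        for (int i = 0; i < iterations; i++) {
            events.add("setup(Iteration)");          // once per iteration
            for (int j = 0; j < invocationsPerIteration; j++) {
                events.add("setup(Invocation)");     // around every single call
                events.add("benchmark()");
                events.add("teardown(Invocation)");
            }
            events.add("teardown(Iteration)");
        }
        events.add("teardown(Trial)");
        return events;
    }

    public static void main(String[] args) {
        for (String event : simulate(2, 2)) {
            System.out.println(event);
        }
    }
}
```

The simulation also hints at why Level.Invocation is discouraged: its fixtures run as often as the benchmark method itself, so their cost can easily distort very short measurements.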

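@Benchmark
Method annotation. Marks the method as a target to be benchmarked.
@Param
Field annotation. Specifies several values for a parameter and is especially suited to testing how a function performs with different inputs. @Param takes a String array, whose values are converted to the field's data type before the @Setup method runs. Multiple @Param fields combine as a product: with two @Param fields, the first with 5 values and the second with 2, each benchmark method runs 5*2=10 times.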
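The product rule for several @Param fields can be sketched with two nested loops. The class `ParamProductSketch` and its parameter names are hypothetical; the sketch only enumerates combinations and converts the string values to int, and it does not use JMH.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of how parameter combinations multiply; it does not use JMH.
class ParamProductSketch {

    // Builds every (first, second) pair; each pair corresponds to one parameterised run.
    static List<int[]> combinations(String[] firstValues, String[] secondValues) {
        List<int[]> runs = new ArrayList<>();
        for (String first : firstValues) {
            for (String second : secondValues) {
                // Values are declared as strings and converted to the field type (int here).
                runs.add(new int[] {Integer.parseInt(first), Integer.parseInt(second)});
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        String[] sizes = {"1", "2", "3", "4", "5"}; // 5 values
        String[] factors = {"10", "20"};            // 2 values
        List<int[]> runs = combinations(sizes, factors);
        System.out.println("number of runs: " + runs.size());
        for (int[] run : runs) {
            System.out.println("size=" + run[0] + ", factor=" + run[1]);
        }
    }
}
```

Because the run count grows multiplicatively, a few extra values per field can lengthen a benchmark considerably.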

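3.3. Write the test classes

A class containing the main method, plus classes that wrap the code under test

The class containing the main method, TestsMain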

package org.example.jmh;

import org.example.jmh.tests.IntegerSumTests;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.ChainedOptionsBuilder;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;

/**
 * TestsMain
 * created at 2021/4/25
 *
 * @author weny
 * @since 1.0.0
 */
public class TestsMain {
    // Path of the generated file: {project root}/{reportFileDir}/{XXX.class.getSimpleName()}.json
    // e.g. jmh-reports/EmptyMethod.json
    private static final String reportFileDir = "jmh-reports/";
//    private static final String reportPath = "sample-options-result.json"; // file generated in the project root

    /*
     * ============================== HOW TO RUN THIS TEST: ====================================
     * 1. Change Class<IntegerSumTests> targetClazz = IntegerSumTests.class; // the class to run JMH against
     * 2. Run the main method from the IDE
     */
    public static void main(String[] args) throws RunnerException {
        Class<IntegerSumTests> targetClazz = IntegerSumTests.class; // the class to run JMH against
        String reportFilePath = setupStandardOptions(targetClazz);
        // Plain check instead of JUnit's assertTrue, so no JUnit dependency is needed
        if (!Files.exists(Paths.get(reportFilePath))) {
            throw new IllegalStateException("JMH report was not written: " + reportFilePath);
        }
    }

    /**
     * The most basic configuration; its purpose is to check in the shortest possible time
     * that JMH runs correctly.
     *
     * @param targetClazz the class to run JMH against
     * @throws RunnerException See:{@link RunnerException}
     */
    @SuppressWarnings({"unused"})
    private static String setupBasicOptions(Class<?> targetClazz) throws RunnerException {
        // number of iterations is kept to a minimum just to verify that the benchmarks work
        // without spending extra time during builds.
        String reportFilePath = resolvePath(targetClazz);
        ChainedOptionsBuilder optionsBuilder =
                new OptionsBuilder()
                        .include(targetClazz.getSimpleName())
                        .forks(1)
                        .warmupIterations(0)
                        .measurementBatchSize(1)
                        .measurementIterations(1)
                        .shouldFailOnError(true)
                        .result(reportFilePath)
                        .timeUnit(TimeUnit.MICROSECONDS)
                        .resultFormat(ResultFormatType.JSON);
        new Runner(optionsBuilder.build()).run();
        return reportFilePath;
    }

    /**
     * A standard configuration; set warmup, iterations and other parameters as needed.
     *
     * @param targetClazz the class to run JMH against
     * @throws RunnerException See:{@link RunnerException}
     */
    private static String setupStandardOptions(Class<?> targetClazz) throws RunnerException {
        String reportFilePath = resolvePath(targetClazz);
        ChainedOptionsBuilder optionsBuilder =
                new OptionsBuilder()
                        .include(targetClazz.getSimpleName())
                        .mode(Mode.Throughput) // mode: throughput | annotation form @BenchmarkMode(Mode.Throughput)
                        .forks(1) // number of forks | annotation form @Fork(1)
                        .warmupIterations(1) // warmup iterations | annotation form @Warmup(iterations = 1)
                        .measurementIterations(3) // measurement iterations | annotation form @Measurement(iterations = 3)
                        .timeUnit(TimeUnit.MICROSECONDS) // time unit of the results | annotation form @OutputTimeUnit(TimeUnit.MICROSECONDS)
                        .shouldFailOnError(true)
                        .result(reportFilePath) // output path of the result report file
                        .resultFormat(ResultFormatType.JSON); // output format of the result report file: JSON
        new Runner(optionsBuilder.build()).run();
        return reportFilePath;
    }

    private static String resolvePath(Class<?> targetClazz) {
        return reportFileDir + targetClazz.getSimpleName() + ".json";
    }
}

The wrapper class EmptyMethod contains an empty method and is used to check that the configuration runs correctly

package org.example.jmh.tests;

import org.openjdk.jmh.annotations.Benchmark;

/**
 * EmptyMethod
 * created at 2021/4/25
 *
 * @author weny
 * @since 1.0.0
 */
public class EmptyMethod {
    @Benchmark
    public void hello() {
        // this method was intentionally left blank.
    }
}

The wrapper class IntegerSumTests contains the methods to be measured, streamSummingInt and forEachPlus

package org.example.jmh.tests;

import org.openjdk.jmh.annotations.*;

import java.util.Arrays;
import java.util.stream.IntStream;

/**
 * IntegerSumTests
 * created at 2021/4/25
 *
 * @author weny
 * @since 1.0.0
 */
public class IntegerSumTests {

    // Implementation using the stream sum API
    @Benchmark
    public int streamSummingInt(Params params) {
        return Arrays.stream(params.items).sum();
    }

    // Implementation using a for-each loop
    @Benchmark
    public int forEachPlus(Params params) {
        int res = 0;
        for (int item : params.items) {
            res += item;
        }
        return res;
    }

    // Define benchmark parameters with @State
    @State(Scope.Benchmark)
    public static class Params {
        // Run with the given size parameters
//        @Param({"1000", "10000", "100000", "1000000"})
        @Param({"10000", "100000", "1000000", "10000000"})
        public int size;

        // Items to run benchmark on
        public int[] items;

        // Setup test data, will be run once and will not affect our results
        @Setup
        public void setUp() {
            items = IntStream.range(0, size).toArray();
        }
    }
}

3.4. Run the tests

Run the tests to obtain the result data

Run the main method org.example.jmh.TestsMain#main

3.5. Result analysis

Analyze the items in the test result data

Start of the run: parameters printed

# JMH version: 1.28
# VM version: JDK 1.8.0_162, Java HotSpot(TM) 64-Bit Server VM, 25.162-b12
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=58962:/Applications/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint
# Warmup: 1 iterations, 10 s each   ------------------------------ 1 warmup iteration, 10 s each
# Measurement: 3 iterations, 10 s each --------------------------- 3 measurement iterations, 10 s each
# Timeout: 10 min per iteration ---------------------------------- timeout of 10 min per iteration
# Threads: 1 thread, will synchronize iterations ----------------- 1 thread is used for the test
# Benchmark mode: Throughput, ops/time --------------------------- throughput is the measured metric
# Benchmark: org.example.jmh.tests.IntegerSumTests.forEachPlus --- name of the method under test in this run
# Parameters: (size = 10000) ------------------------------------- parameter value injected for this run

During the run: progress information printed

# Run progress: 12.50% complete, ETA 00:04:47 ---------------------- run progress 12.50%
# Fork: 1 of 1
# Warmup Iteration   1: 32206.476 ops/s
Iteration   1: 32631.226 ops/s
Iteration   2: 32725.618 ops/s
Iteration   3: 32681.244 ops/s

Result "org.example.jmh.tests.IntegerSumTests.forEachPlus": -------- result statistics for this stage
  32679.362 ±(99.9%) 861.539 ops/s [Average]
  (min, avg, max) = (32631.226, 32679.362, 32725.618), stdev = 47.224
  CI (99.9%): [31817.824, 33540.901] (assumes normal distribution)

# The statistics give the minimum, maximum, and mean over the measurement iterations, together with the standard deviation (stdev) and the confidence interval (CI).
# The standard deviation (stdev) reflects how widely the values are spread around the mean. A confidence interval is an interval estimate of a population parameter constructed from sample statistics.
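The statistics in this block can be reproduced by hand from the three iteration scores. The following is a sketch under stated assumptions: the class `IterationStats` is hypothetical, and the critical value 31.599 is the two-sided 99.9% Student t value for 2 degrees of freedom taken from a t-table. The logged figures are consistent with that calculation, but this is not JMH's own code.

```java
// Hypothetical helper class; it recomputes the logged statistics and does not use JMH.
class IterationStats {

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Sample standard deviation (divides by n - 1).
    static double sampleStdev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    // Half-width of the confidence interval: t * stdev / sqrt(n).
    static double errorMargin(double[] xs, double tCritical) {
        return tCritical * sampleStdev(xs) / Math.sqrt(xs.length);
    }

    public static void main(String[] args) {
        double[] scores = {32631.226, 32725.618, 32681.244}; // iteration scores from the log, ops/s
        double tCritical = 31.599; // assumption: two-sided 99.9% t value for 2 degrees of freedom
        double m = mean(scores);
        double error = errorMargin(scores, tCritical);
        System.out.printf("mean  = %.3f ops/s%n", m);
        System.out.printf("stdev = %.3f%n", sampleStdev(scores));
        System.out.printf("error = %.3f%n", error);
        System.out.printf("CI    = [%.3f, %.3f]%n", m - error, m + error);
    }
}
```

With only three samples the critical value is very large, which is why a standard deviation of about 47 turns into an error margin of about 860.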

End of the run: results printed

# Run complete. Total time: 00:05:26

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                           (size)   Mode  Cnt       Score       Error  Units
IntegerSumTests.forEachPlus          10000  thrpt    3  328440.729 ± 92008.121  ops/s
IntegerSumTests.forEachPlus         100000  thrpt    3   32679.362 ±   861.539  ops/s
IntegerSumTests.forEachPlus        1000000  thrpt    3    2796.351 ±  3273.468  ops/s
IntegerSumTests.forEachPlus       10000000  thrpt    3     218.404 ±    36.500  ops/s
IntegerSumTests.streamSummingInt     10000  thrpt    3   50423.644 ± 47232.940  ops/s
IntegerSumTests.streamSummingInt    100000  thrpt    3    7521.114 ± 47042.316  ops/s
IntegerSumTests.streamSummingInt   1000000  thrpt    3     480.979 ±   112.349  ops/s
IntegerSumTests.streamSummingInt  10000000  thrpt    3     132.339 ±  1514.091  ops/s

Benchmark result is saved to jmh-reports/IntegerSumTests.json

Process finished with exit code 0

# The Benchmark column lists the methods compared in this test.
# The Mode column shows the dimension in which the result is measured (thrpt is throughput).
# The Cnt column is the number of samples, Cnt=Fork*Iteration.
# Score is the measured result. For an input size of 10000, forEachPlus reaches 328440.729 ops/s and streamSummingInt reaches 50423.644 ops/s, which means that for size=10000 the for-loop summation outperforms the stream.sum() summation.
# Error is the statistical margin of error of the score. The comparison here focuses mainly on Score, but with only 3 samples several error margins in this run are as large as, or larger than, the score itself (for example 7521.114 ± 47042.316), so those rows should be read with caution.
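The table can be summarized with a little arithmetic: the ratio of the two scores per size, and the error relative to the score. The following is a sketch in plain Java; the class `ScoreComparison` is hypothetical, the numbers are copied from the result table above, and the 0.5 threshold for flagging a noisy row is an arbitrary assumption.

```java
// Hypothetical helper class; it post-processes the printed scores and does not use JMH.
class ScoreComparison {

    // How many times faster A is than B, for throughput scores.
    static double speedup(double scoreA, double scoreB) {
        return scoreA / scoreB;
    }

    // Error margin as a fraction of the score.
    static double relativeError(double score, double error) {
        return error / score;
    }

    // Cnt column: forks multiplied by measurement iterations.
    static int sampleCount(int forks, int measurementIterations) {
        return forks * measurementIterations;
    }

    public static void main(String[] args) {
        int[] sizes = {10_000, 100_000, 1_000_000, 10_000_000};
        double[] loopScore = {328440.729, 32679.362, 2796.351, 218.404};
        double[] streamScore = {50423.644, 7521.114, 480.979, 132.339};
        double[] streamError = {47232.940, 47042.316, 112.349, 1514.091};

        System.out.println("Cnt = " + sampleCount(1, 3));
        for (int i = 0; i < sizes.length; i++) {
            double ratio = speedup(loopScore[i], streamScore[i]);
            boolean noisy = relativeError(streamScore[i], streamError[i]) > 0.5; // arbitrary threshold
            System.out.printf("size=%d loop/stream=%.2f stream row noisy=%b%n", sizes[i], ratio, noisy);
        }
    }
}
```

Rows flagged as noisy are candidates for a rerun with more forks or measurement iterations before drawing conclusions from the ratio.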

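3.6. Visualizing the result report

Turn the results into visual reports

Options:

1. Upload the result JSON file to JMH Visualizer, which generates reports automatically. See: JMH Visualizer

2. Render the results with a third-party reporting component of your choice; this is not covered here.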

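4. Notes

4.1. Take JVM optimizations into account

When writing JMH code you need to take JVM optimizations into account, because they can distort the test. The measureWrong method below is a case of so-called dead code:

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class JMHSample_08_DeadCode {

  private double x = Math.PI;

  @Benchmark
  public void baseline() {
    // baseline
  }

  @Benchmark
  public void measureWrong() {
    // the JVM optimizes this away, so it performs like baseline
    Math.log(x);
  }

  @Benchmark
  public double measureRight() {
    // the real performance test
    return Math.log(x);
  }
}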

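The test results are as follows

Benchmark                                               Mode     Score    Units
c.i.c.c.c.i.c.c.j.JMHSample_08_DeadCode.baseline        avgt     0.358    ns/op
c.i.c.c.c.i.c.c.j.JMHSample_08_DeadCode.measureRight    avgt    24.605    ns/op
c.i.c.c.c.i.c.c.j.JMHSample_08_DeadCode.measureWrong    avgt     0.366    ns/op

In measureWrong the JIT can infer that the method body can be optimized away without affecting the program, because its result is never used. measureRight declares a return value, which JMH consumes, so the JIT does not eliminate the computation.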

4.2. Constant folding

With constant folding, the JIT concludes that the result of a computation is a constant and optimizes the method to return that constant directly to the caller:

private double x = Math.PI;
private final double wrongX = Math.PI;

@Benchmark
public double baseline() {
  // baseline
  return Math.PI;
}

@Benchmark
public double measureWrong_1() {
  // the JIT treats this as a constant
  return Math.log(Math.PI);
}

@Benchmark
public double measureWrong_2() {
  // the JIT treats the result of the call as a constant
  return Math.log(wrongX);
}

@Benchmark
public double measureRight() {
  // the correct test
  return Math.log(x);
}

The test results are as follows

Benchmark                                                     Mode    Score   Units
c.i.c.c.c.i.c.c.j.JMHSample_10_ConstantFold.baseline          avgt    1.175   ns/op
c.i.c.c.c.i.c.c.j.JMHSample_10_ConstantFold.measureRight      avgt   25.805   ns/op
c.i.c.c.c.i.c.c.j.JMHSample_10_ConstantFold.measureWrong_1    avgt    1.116   ns/op
c.i.c.c.c.i.c.c.j.JMHSample_10_ConstantFold.measureWrong_2    avgt    1.031   ns/op

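Because inlining has a large effect on performance, JMH provides @CompilerControl to control whether inlining is allowed

import org.openjdk.jmh.annotations.*;

// @State is required because the class has instance fields and a @Setup method
@State(Scope.Thread)
public class Inline {

  int x = 0, y = 0;

  @Benchmark
  @CompilerControl(CompilerControl.Mode.DONT_INLINE)
  public int add() {
    return dataAdd(x, y);
  }

  @Benchmark
  public int addInline() {
    return dataAdd(x, y);
  }

  private int dataAdd(int x, int y) {
    return x + y;
  }

  @Setup
  public void init() {
    x = 1;
    y = 2;
  }
}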

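Both add and addInline call dataAdd. The former uses the CompilerControl annotation, which can be placed on a method or a class to supply compilation hints:

DONT_INLINE: the method is not inlined
INLINE: the method is inlined
BREAK: inserts a debugging breakpoint
PRINT: prints the machine code produced when the JIT compiles the method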

Developers may feel that the add method in the test above is too simple and, out of habit, put a loop inside it to reduce the cost of JMH calling the method. JMH advises against this, because the JIT does in fact optimize such loops and removes the per-iteration cost. The following example shows that the results of loop-based tests are inaccurate:

int x = 1;
int y = 2;

/** The correct test */
@Benchmark
public int measureRight() {
  return (x + y);
}

private int reps(int reps) {
  int s = 0;
  for (int i = 0; i < reps; i++) {
    s += (x + y);
  }
  return s;
}

@Benchmark
@OperationsPerInvocation(1)
public int measureWrong_1() {
  return reps(1);
}

@Benchmark
@OperationsPerInvocation(10)
public int measureWrong_10() {
  return reps(10);
}

@Benchmark
@OperationsPerInvocation(100)
public int measureWrong_100() {
  return reps(100);
}

@Benchmark
@OperationsPerInvocation(1000)
public int measureWrong_1000() {
  return reps(1000);
}

The @OperationsPerInvocation annotation tells JMH to adjust its statistics: for example, @OperationsPerInvocation(10) declares that one invocation performs 10 operations.
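The adjustment amounts to dividing the time of one invocation by the declared number of operations. The following is a sketch of that arithmetic only; the class `PerOperationScore` and the 1.39 ns invocation time are hypothetical illustrations, not measured values.

```java
// Hypothetical helper class; it illustrates the normalization and does not use JMH.
class PerOperationScore {

    // Time per operation = time per invocation / operations declared per invocation.
    static double nanosPerOperation(double nanosPerInvocation, int operationsPerInvocation) {
        if (operationsPerInvocation <= 0) {
            throw new IllegalArgumentException("operationsPerInvocation must be positive");
        }
        return nanosPerInvocation / operationsPerInvocation;
    }

    public static void main(String[] args) {
        // Hypothetical: one invocation that loops 10 times takes 1.39 ns in total.
        double perOp = nanosPerOperation(1.39, 10);
        System.out.printf("%.3f ns/op%n", perOp);
    }
}
```

If the JIT collapses the loop, the invocation time stops growing with the declared operation count, so the per-operation figure shrinks toward values that no single addition could really achieve.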

The performance test results are as follows

Benchmark                                                   Mode   Score   Units
c.i.c.c.c.i.c.c.j.JMHSample_11_Loops.measureRight           avgt   1.114   ns/op
c.i.c.c.c.i.c.c.j.JMHSample_11_Loops.measureWrong_1         avgt   1.057   ns/op
c.i.c.c.c.i.c.c.j.JMHSample_11_Loops.measureWrong_10        avgt   0.139   ns/op
c.i.c.c.c.i.c.c.j.JMHSample_11_Loops.measureWrong_100       avgt   0.018   ns/op
c.i.c.c.c.i.c.c.j.JMHSample_11_Loops.measureWrong_1000      avgt   0.035   ns/op

A good habit when writing performance tests is to write a unit test case first, to make sure the benchmark code is correct.

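5. Conclusion

JMH tests make it possible to evaluate the execution time of methods scientifically. The results can guide performance tuning, prediction of algorithm performance, capacity planning for server infrastructure, and similar work. Quantified results are more convincing, and visualizing them makes them easier to grasp.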

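6. Further reading

The official code samples are clear and easy to follow; reading through them is recommended when you need to understand JMH usage in detail.
If you use IDEA, IntelliJ IDEA has a JMH plugin that offers conveniences such as automatic generation of benchmark methods.
An RPC benchmark implementation: java rpc benchmark https://github.com/hank-whu/r...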

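7. References

Reference List:

JMH Github Repo
OpenJDK JMH Samples
JMH Samples (Chinese edition)
Benchmarking with JMH
Java Concurrent Programming Notes: The JMH Performance Testing Framework
Java Performance Optimization: Mastering JMH
JMH - Java Microbenchmark Harness
JMH Visualizer
jmh-visual-chart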
