Spark: Using the Uber JVM Profiler

## Background

Uber JVM Profiler collects JVM metrics, such as CPU, memory, I/O, and GC information, from processes running in a distributed environment.

## Installation

With Maven and JDK 8 or later installed, build the project directly with `mvn clean package`.

## Java application

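- Description

  Deploying the profiler as a Java agent is all that is needed.

- Usage

  `java -javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.KafkaOutputReporter,brokerList='kafka1:9092',topicPrefix=demo_,tag=tag-demo,metricInterval=5000,sampleInterval=0 -cp target/jvm-profiler-1.0.0.jar`

  The command as shown stops after the classpath; append the main class of your application when you run it.

- Options

|Parameter|Description|
|------|-----|
|reporter|Reporter class. `com.uber.profiling.reporters.KafkaOutputReporter` is simply used here.|
|brokerList|When the reporter is `com.uber.profiling.reporters.KafkaOutputReporter`, the list of Kafka brokers, separated by commas.|
|topicPrefix|When the reporter is `com.uber.profiling.reporters.KafkaOutputReporter`, the prefix of the Kafka topic names.|
|tag|A metric whose key is `tag`; it is output to the reporter.|
|metricInterval|How often metrics are reported, in ms. Set it according to your situation.|
|sampleInterval|How often JVM stack trace metrics are reported, in ms. Set it according to your situation.|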
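
To make the shape of the option string concrete, here is a minimal sketch of how a `key=value,key=value` agent argument string could be split into a map. `AgentArgsSketch` and `parse` are hypothetical names used only for illustration; this is not the profiler's own argument parser, and the unquoted `brokerList` value is an assumption made to keep the sample simple.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of jvm-profiler: splits the text that follows
// "-javaagent:<jar>=" into option names and values.
class AgentArgsSketch {

    static Map<String, String> parse(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.isEmpty()) {
            return options;
        }
        for (String pair : agentArgs.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            // Everything before the first '=' is the key, the rest is the value.
            options.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return options;
    }

    public static void main(String[] args) {
        String agentArgs = "reporter=com.uber.profiling.reporters.KafkaOutputReporter,"
                + "brokerList=kafka1:9092,topicPrefix=demo_,tag=tag-demo,"
                + "metricInterval=5000,sampleInterval=0";
        parse(agentArgs).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

A naive comma split like this one cannot carry a comma inside a single value, which is worth keeping in mind when a value is itself a list.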

- Result

      {
        "nonHeapMemoryTotalUsed": 11890584.0,
        "bufferPools": [
          { "totalCapacity": 0, "name": "direct", "count": 0, "memoryUsed": 0 },
          { "totalCapacity": 0, "name": "mapped", "count": 0, "memoryUsed": 0 }
        ],
        "heapMemoryTotalUsed": 24330736.0,
        "epochMillis": 1515627003374,
        "nonHeapMemoryCommitted": 13565952.0,
        "heapMemoryCommitted": 257425408.0,
        "memoryPools": [
          { "peakUsageMax": 251658240, "usageMax": 251658240, "peakUsageUsed": 1194496, "name": "Code Cache", "peakUsageCommitted": 2555904, "usageUsed": 1173504, "type": "Non-heap memory", "usageCommitted": 2555904 },
          { "peakUsageMax": -1, "usageMax": -1, "peakUsageUsed": 9622920, "name": "Metaspace", "peakUsageCommitted": 9830400, "usageUsed": 9622920, "type": "Non-heap memory", "usageCommitted": 9830400 },
          { "peakUsageMax": 1073741824, "usageMax": 1073741824, "peakUsageUsed": 1094160, "name": "Compressed Class Space", "peakUsageCommitted": 1179648, "usageUsed": 1094160, "type": "Non-heap memory", "usageCommitted": 1179648 },
          { "peakUsageMax": 1409286144, "usageMax": 1409286144, "peakUsageUsed": 24330736, "name": "PS Eden Space", "peakUsageCommitted": 67108864, "usageUsed": 24330736, "type": "Heap memory", "usageCommitted": 67108864 },
          { "peakUsageMax": 11010048, "usageMax": 11010048, "peakUsageUsed": 0, "name": "PS Survivor Space", "peakUsageCommitted": 11010048, "usageUsed": 0, "type": "Heap memory", "usageCommitted": 11010048 },
          { "peakUsageMax": 2863661056, "usageMax": 2863661056, "peakUsageUsed": 0, "name": "PS Old Gen", "peakUsageCommitted": 179306496, "usageUsed": 0, "type": "Heap memory", "usageCommitted": 179306496 }
        ],
        "processCpuLoad": 0.0008024004394748531,
        "systemCpuLoad": 0.23138430784607697,
        "processCpuTime": 496918000,
        "appId": null,
        "name": "24103@machine01",
        "host": "machine01",
        "processUuid": "3c2ec835-749d-45ea-a7ec-e4b9fe17c23a",
        "tag": "mytag",
        "gc": [
          { "collectionTime": 0, "name": "PS Scavenge", "collectionCount": 0 },
          { "collectionTime": 0, "name": "PS MarkSweep", "collectionCount": 0 }
        ]
      }
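
As a rough illustration of consuming such a record, the sketch below pulls two numeric fields out of the JSON text and derives the ratio of used to committed heap. `MetricFieldSketch` and its methods are hypothetical names, and the regular expression is a shortcut chosen to keep the example dependency-free; a real consumer would more likely use a JSON library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; prefer a JSON library for real parsing.
class MetricFieldSketch {

    // Returns the first number that follows the quoted field name anywhere in the text.
    static double numberField(String json, String field) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(field) + "\"\\s*:\\s*(-?[0-9.]+(?:[eE][-+]?[0-9]+)?)");
        Matcher matcher = pattern.matcher(json);
        if (!matcher.find()) {
            throw new IllegalArgumentException("Field not found: " + field);
        }
        return Double.parseDouble(matcher.group(1));
    }

    // Ratio of used heap to committed heap for one reported record.
    static double heapUsageRatio(String json) {
        return numberField(json, "heapMemoryTotalUsed") / numberField(json, "heapMemoryCommitted");
    }

    public static void main(String[] args) {
        String sample = "{\"heapMemoryTotalUsed\": 24330736.0, \"heapMemoryCommitted\": 257425408.0}";
        System.out.println("heap used / committed = " + heapUsageRatio(sample));
    }
}
```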

## Spark application

- Description

  Unlike a plain Java application, a Spark application needs jvm-profiler.jar to be distributed to every node.

- Usage

  Pass the following options to `spark-submit`:

      --jars hdfs:///public/libs/jvm-profiler-1.0.0.jar \
      --conf spark.driver.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.KafkaOutputReporter,brokerList='kafka1:9092',topicPrefix=demo_,tag=tag-demo,metricInterval=5000,sampleInterval=0 \
      --conf spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.KafkaOutputReporter,brokerList='kafka1:9092',topicPrefix=demo_,tag=tag-demo,metricInterval=5000,sampleInterval=0

- Options

|Parameter|Description|
|---|---|
|reporter|Reporter class. `com.uber.profiling.reporters.KafkaOutputReporter` is simply used here.|
|brokerList|When the reporter is `com.uber.profiling.reporters.KafkaOutputReporter`, the list of Kafka brokers, separated by commas.|
|topicPrefix|When the reporter is `com.uber.profiling.reporters.KafkaOutputReporter`, the prefix of the Kafka topic names.|
|tag|A metric whose key is `tag`; it is output to the reporter.|
|metricInterval|How often metrics are reported, in ms. Set it according to your situation.|
|sampleInterval|How often JVM stack trace metrics are reported, in ms. Set it according to your situation.|

- Result

      {
        "nonHeapMemoryTotalUsed": 11890584.0,
        "bufferPools": [
          { "totalCapacity": 0, "name": "direct", "count": 0, "memoryUsed": 0 },
          { "totalCapacity": 0, "name": "mapped", "count": 0, "memoryUsed": 0 }
        ],
        "heapMemoryTotalUsed": 24330736.0,
        "epochMillis": 1515627003374,
        "nonHeapMemoryCommitted": 13565952.0,
        "heapMemoryCommitted": 257425408.0,
        "memoryPools": [
          { "peakUsageMax": 251658240, "usageMax": 251658240, "peakUsageUsed": 1194496, "name": "Code Cache", "peakUsageCommitted": 2555904, "usageUsed": 1173504, "type": "Non-heap memory", "usageCommitted": 2555904 },
          { "peakUsageMax": -1, "usageMax": -1, "peakUsageUsed": 9622920, "name": "Metaspace", "peakUsageCommitted": 9830400, "usageUsed": 9622920, "type": "Non-heap memory", "usageCommitted": 9830400 },
          { "peakUsageMax": 1073741824, "usageMax": 1073741824, "peakUsageUsed": 1094160, "name": "Compressed Class Space", "peakUsageCommitted": 1179648, "usageUsed": 1094160, "type": "Non-heap memory", "usageCommitted": 1179648 },
          { "peakUsageMax": 1409286144, "usageMax": 1409286144, "peakUsageUsed": 24330736, "name": "PS Eden Space", "peakUsageCommitted": 67108864, "usageUsed": 24330736, "type": "Heap memory", "usageCommitted": 67108864 },
          { "peakUsageMax": 11010048, "usageMax": 11010048, "peakUsageUsed": 0, "name": "PS Survivor Space", "peakUsageCommitted": 11010048, "usageUsed": 0, "type": "Heap memory", "usageCommitted": 11010048 },
          { "peakUsageMax": 2863661056, "usageMax": 2863661056, "peakUsageUsed": 0, "name": "PS Old Gen", "peakUsageCommitted": 179306496, "usageUsed": 0, "type": "Heap memory", "usageCommitted": 179306496 }
        ],
        "processCpuLoad": 0.0008024004394748531,
        "systemCpuLoad": 0.23138430784607697,
        "processCpuTime": 496918000,
        "appId": null,
        "name": "24103@machine01",
        "host": "machine01",
        "processUuid": "3c2ec835-749d-45ea-a7ec-e4b9fe17c23a",
        "tag": "mytag",
        "gc": [
          { "collectionTime": 0, "name": "PS Scavenge", "collectionCount": 0 },
          { "collectionTime": 0, "name": "PS MarkSweep", "collectionCount": 0 }
        ]
      }

## Analysis

- Available reporters

|Reporter|Description|
|---|---|
|[ConsoleOutputReporter](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/reporters/ConsoleOutputReporter.java#L25)|The default reporter, generally used for debugging.|
|[FileOutputReporter](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/reporters/FileOutputReporter.java#L34)|File-based reporter. Not practical in a distributed environment. Requires outputDir.|
|[KafkaOutputReporter](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/reporters/KafkaOutputReporter.java#L36)|Kafka-based reporter, the one most used in production. Requires brokerList and topicPrefix.|
|[GraphiteOutputReporter](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/reporters/GraphiteOutputReporter.java#L34)|Graphite-based reporter. Requires graphite.host and related settings.|
|[RedisOutputReporter](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java_redis/com/uber/profiling/RedisOutputReporter.java#L16)|Redis-based reporter. Build with `mvn -P redis clean package`.|
|[InfluxDBOutputReporter](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java_influxdb/com/uber/profiling/reporters/InfluxDBOutputReporter.java#L43)|InfluxDB-based reporter. Build with `mvn -P influxdb clean package`. Requires influxdb.host and related settings.|

KafkaOutputReporter is the recommended choice in production: it is flexible to operate, and it can be combined with ClickHouse and Grafana to display the metrics.

- Source code analysis

  jvm-profiler as a whole is built on the [java agent](https://www.developer.com/java/data/what-is-java-agent.html) mechanism. The project's [pom file](https://github.com/uber-common/jvm-profiler/blob/master/pom.xml#L105) sets both the Premain-Class and Agent-Class entries of MANIFEST.MF to [com.uber.profiling.Agent](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/Agent.java#L32), and the actual implementation class is [AgentImpl](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/AgentImpl.java#L47). The analysis here follows the `run` method of `AgentImpl`:

      public void run(Arguments arguments, Instrumentation instrumentation, Collection<AutoCloseable> objectsToCloseOnShutdown) {
          if (arguments.isNoop()) {
              logger.info("Agent noop is true, do not run anything");
              return;
          }

          Reporter reporter = arguments.getReporter();

          String processUuid = UUID.randomUUID().toString();

          String appId = null;

          String appIdVariable = arguments.getAppIdVariable();
          if (appIdVariable != null && !appIdVariable.isEmpty()) {
              appId = System.getenv(appIdVariable);
          }

          if (appId == null || appId.isEmpty()) {
              appId = SparkUtils.probeAppId(arguments.getAppIdRegex());
          }

          if (!arguments.getDurationProfiling().isEmpty()
                  || !arguments.getArgumentProfiling().isEmpty()) {
              instrumentation.addTransformer(new JavaAgentFileTransformer(arguments.getDurationProfiling(), arguments.getArgumentProfiling()));
          }

          List<Profiler> profilers = createProfilers(reporter, arguments, processUuid, appId);

          ProfilerGroup profilerGroup = startProfilers(profilers);

          Thread shutdownHook = new Thread(new ShutdownHookRunner(profilerGroup.getPeriodicProfilers(), Arrays.asList(reporter), objectsToCloseOnShutdown));
          Runtime.getRuntime().addShutdownHook(shutdownHook);
      }

- `arguments.getReporter()` gets the reporter. If none is configured, the default ConsoleOutputReporter is used; otherwise the configured reporter is created through [reporterConstructor](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/Arguments.java#L264).
- [String appId](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/AgentImpl.java#L66) sets the appId. If appIdVariable is configured, the value is read from the environment variable of that name; if the appId is still empty, a Spark application takes the value of [spark.app.id](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/util/SparkUtils.java#L26).
- [`List<Profiler> profilers = createProfilers(reporter, arguments, processUuid, appId)`](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/AgentImpl.java#L133) creates the profilers. CpuAndMemoryProfiler, ThreadInfoProfiler, and ProcessInfoProfiler are created by default.
  1. [CpuAndMemoryProfiler](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/CpuAndMemoryProfiler.java#L39), [ThreadInfoProfiler](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/ThreadInfoProfiler.java#L15), and [ProcessInfoProfiler](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/ProcessInfoProfiler.java#L33) read their data from JMX. ProcessInfoProfiler also reads data from [/pro](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/IOProfiler.java#L55).
  2. If durationProfiling, argumentProfiling, sampleInterval, or ioProfiling is set, the corresponding profiler is added: MethodDurationProfiler (outputs the time spent in method calls), MethodArgumentProfiler (outputs the values of method arguments), StacktraceReporterProfiler, or IOProfiler.
  3. [MethodArgumentProfiler](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/MethodArgumentProfiler.java#L29) and [MethodDurationProfiler](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/MethodDurationProfiler.java#L29) rely on [javassist](https://github.com/jboss-javassist/javassist), a third-party bytecode manipulation library, to rewrite the target classes. See [JavaAgentFileTransformer](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/transformers/JavaAgentFileTransformer.java#L35) for the implementation.
  4. [StacktraceReporterProfiler](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/StacktraceReporterProfiler.java#L35) reads its data from JMX.
  5. [IOProfiler](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/IOProfiler.java#L27) reads data from the directory under the [/pro](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/profilers/IOProfiler.java#L55) file on the local machine.
- [ProfilerGroup profilerGroup = startProfilers(profilers)](https://github.com/uber-common/jvm-profiler/blob/master/src/main/java/com/uber/profiling/AgentImpl.java#L90) starts the scheduled reporting of the profilers. It distinguishes oneTimeProfilers from periodicProfilers. ProcessInfoProfiler is a one-time profiler: process information does not change while the process runs, so it does not need to be reported periodically.

This completes the whole flow.
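
The flow described here (agent entry point, JMX reads, one-time versus periodic reporting) can be sketched with nothing but the JDK. This is a simplified illustration under stated assumptions, not the profiler's implementation: `ProfilerFlowSketch` and its methods are hypothetical names, output goes to the console instead of a reporter, and the 5000 ms interval simply mirrors the `metricInterval` value used earlier.

```java
import java.lang.instrument.Instrumentation;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; these names are hypothetical, not jvm-profiler classes.
// For use as a real agent, declare the class public in its own file.
class ProfilerFlowSketch {

    // Periodic data: heap and GC figures read from the platform MXBeans (JMX).
    static Map<String, Object> collectMemoryAndGc() {
        Map<String, Object> metrics = new LinkedHashMap<>();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        metrics.put("heapMemoryTotalUsed", heap.getUsed());
        metrics.put("heapMemoryCommitted", heap.getCommitted());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            metrics.put("gc." + gc.getName() + ".collectionCount", gc.getCollectionCount());
            metrics.put("gc." + gc.getName() + ".collectionTime", gc.getCollectionTime());
        }
        return metrics;
    }

    // One-time data: process information does not change while the JVM runs.
    static Map<String, Object> collectProcessInfo() {
        Map<String, Object> info = new LinkedHashMap<>();
        info.put("name", ManagementFactory.getRuntimeMXBean().getName());
        info.put("jvmInputArguments", ManagementFactory.getRuntimeMXBean().getInputArguments());
        return info;
    }

    // Starts a daemon scheduler that prints the periodic metrics at a fixed rate.
    static ScheduledExecutorService startPeriodic(long intervalMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "profiler-sketch");
            thread.setDaemon(true); // do not keep the profiled application alive
            return thread;
        });
        scheduler.scheduleAtFixedRate(
                () -> System.out.println("metrics: " + collectMemoryAndGc()),
                0, intervalMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    // Called by the JVM before main() when this class is named in Premain-Class.
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        System.out.println("process: " + collectProcessInfo()); // reported once
        ScheduledExecutorService scheduler = startPeriodic(5000); // reported repeatedly
        Runtime.getRuntime().addShutdownHook(new Thread(scheduler::shutdownNow));
    }

    // Lets the sketch run without packaging an agent jar.
    public static void main(String[] args) throws InterruptedException {
        System.out.println("process: " + collectProcessInfo());
        ScheduledExecutorService scheduler = startPeriodic(1000);
        Thread.sleep(3000);
        scheduler.shutdownNow();
    }
}
```

Running `main` prints the process information once and the memory and GC metrics a few times, which mirrors the split between one-time and periodic profilers.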