Extending Metric Monitoring Information

Official documentation

Source and Scope extension for new metrics

Case study: adding metrics for JVM threads

Modify the Thread definition

In the protocol file apm-protocol/apm-network/src/main/proto/language-agent/JVMMetric.proto, overwrite the definition of message Thread:

message Thread {
  int64 liveCount = 1;
  int64 daemonCount = 2;
  int64 peakCount = 3;
  int64 deadlocked = 4;
  int64 monitorDeadlocked = 5;
  int64 newThreadCount = 7;
  int64 runnableThreadCount = 8;
  int64 blockedThreadCount = 9;
  int64 waitThreadCount = 10;
  int64 timeWaitThreadCount = 11;
  int64 terminatedThreadCount = 12;
}

Rebuild the apm-network project

cd apm-protocol/apm-network
mvn clean package -DskipTests=true

PS: You can install the Protocol Buffer Editor plugin to get Protocol Buffer syntax support.

Modify the Thread metrics provider class in agent core

Overwrite the class org.apache.skywalking.apm.agent.core.jvm.thread.ThreadProvider directly with the following code:

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */

package org.apache.skywalking.apm.agent.core.jvm.thread;

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Optional;
import org.apache.skywalking.apm.network.language.agent.v3.Thread;

public enum ThreadProvider {
    INSTANCE;

    private final ThreadMXBean threadMXBean;
    private static final long[] EMPTY_DEADLOCKED_THREADS = new long[0];

    ThreadProvider() {
        this.threadMXBean = ManagementFactory.getThreadMXBean();
    }

    public Thread getThreadMetrics() {
        int newThreadCount = 0;
        int runnableThreadCount = 0;
        int blockedThreadCount = 0;
        int waitThreadCount = 0;
        int timeWaitThreadCount = 0;
        int terminatedThreadCount = 0;

        // Increment the counter that matches each thread's current state
        ThreadInfo[] threadInfos = threadMXBean.getThreadInfo(threadMXBean.getAllThreadIds());
        if (threadInfos != null) {
            for (ThreadInfo threadInfo : threadInfos) {
                if (threadInfo != null) {
                    switch (threadInfo.getThreadState()) {
                        case NEW:
                            newThreadCount++;
                            break;
                        case RUNNABLE:
                            runnableThreadCount++;
                            break;
                        case BLOCKED:
                            blockedThreadCount++;
                            break;
                        case WAITING:
                            waitThreadCount++;
                            break;
                        case TIMED_WAITING:
                            timeWaitThreadCount++;
                            break;
                        case TERMINATED:
                            terminatedThreadCount++;
                            break;
                        default:
                            break;
                    }
                } else {
                    /*
                     * If a thread of a given ID is not alive or does not exist,
                     * the corresponding element in the returned array will
                     * contain null. Because the ID was just returned by
                     * getAllThreadIds(), the thread must have existed,
                     * so it is counted as terminated.
                     */
                    terminatedThreadCount++;
                }
            }
        }

        // Current number of live threads
        int threadCount = threadMXBean.getThreadCount();
        // Number of daemon threads
        int daemonThreadCount = threadMXBean.getDaemonThreadCount();
        // Peak number of threads
        int peakThreadCount = threadMXBean.getPeakThreadCount();
        // Both find* methods return null when no thread is deadlocked
        int deadlocked = Optional.ofNullable(threadMXBean.findDeadlockedThreads())
                .orElse(EMPTY_DEADLOCKED_THREADS).length;
        int monitorDeadlocked = Optional.ofNullable(threadMXBean.findMonitorDeadlockedThreads())
                .orElse(EMPTY_DEADLOCKED_THREADS).length;

        // Build the Thread object used to send the thread metrics to the OAP
        return Thread.newBuilder().setLiveCount(threadCount)
                .setDaemonCount(daemonThreadCount)
                .setPeakCount(peakThreadCount)
                .setDeadlocked(deadlocked)
                .setMonitorDeadlocked(monitorDeadlocked)
                .setNewThreadCount(newThreadCount)
                .setRunnableThreadCount(runnableThreadCount)
                .setBlockedThreadCount(blockedThreadCount)
                .setWaitThreadCount(waitThreadCount)
                .setTimeWaitThreadCount(timeWaitThreadCount)
                .setTerminatedThreadCount(terminatedThreadCount)
                .build();
    }
}
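The JMX calls used by the provider can be tried in a plain JVM, without building the agent or the protobuf classes. The following is a minimal sketch under that assumption: the class ThreadStateSketch and its method names are made up for illustration and are not part of SkyWalking; only the java.lang.management calls are the same ones used above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper for illustration only; not a SkyWalking class.
final class ThreadStateSketch {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    // Counts threads per Thread.State, mirroring the switch in ThreadProvider.
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread.State state : Thread.State.values()) {
            counts.put(state, 0);
        }
        ThreadInfo[] infos = THREADS.getThreadInfo(THREADS.getAllThreadIds());
        for (ThreadInfo info : infos) {
            if (info == null) {
                // The thread exited between the two JMX calls.
                counts.merge(Thread.State.TERMINATED, 1, Integer::sum);
            } else {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    // findDeadlockedThreads() returns null when nothing is deadlocked.
    static int deadlockedCount() {
        long[] ids = THREADS.findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("threads by state: " + countByState());
        System.out.println("deadlocked: " + deadlockedCount());
    }
}
```

The thread that performs the query is itself running, so the RUNNABLE count should be at least 1. The other counts depend on the JVM and on what the application is doing at that moment.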

Modify ServiceInstanceJVMThread

Overwrite the class org.apache.skywalking.oap.server.core.source.ServiceInstanceJVMThread directly with the following code.
ServiceInstanceJVMThread extends the abstract class Source. In SkyWalking's OAL system, a Source class defines a resource and its scope.

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */

package org.apache.skywalking.oap.server.core.source;

import lombok.Getter;
import lombok.Setter;

import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_CATALOG_NAME;
import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_JVM_THREAD;

@ScopeDeclaration(id = SERVICE_INSTANCE_JVM_THREAD, name = "ServiceInstanceJVMThread", catalog = SERVICE_INSTANCE_CATALOG_NAME)
@ScopeDefaultColumn.VirtualColumnDefinition(fieldName = "entityId", columnName = "entity_id", isID = true, type = String.class)
public class ServiceInstanceJVMThread extends Source {
    @Override
    public int scope() {
        return SERVICE_INSTANCE_JVM_THREAD;
    }

    @Override
    public String getEntityId() {
        return String.valueOf(id);
    }

    @Getter
    @Setter
    private String id;
    @Getter
    @Setter
    @ScopeDefaultColumn.DefinedByField(columnName = "name", requireDynamicActive = true)
    private String name;
    @Getter
    @Setter
    @ScopeDefaultColumn.DefinedByField(columnName = "service_name", requireDynamicActive = true)
    private String serviceName;
    @Getter
    @Setter
    @ScopeDefaultColumn.DefinedByField(columnName = "service_id")
    private String serviceId;
    @Getter
    @Setter
    private long liveCount;
    @Getter
    @Setter
    private long daemonCount;
    @Getter
    @Setter
    private long peakCount;
    @Getter
    @Setter
    private long deadlocked;
    @Getter
    @Setter
    private long monitorDeadlocked;
    @Getter
    @Setter
    private long newThreadCount;
    @Getter
    @Setter
    private long runnableThreadCount;
    @Getter
    @Setter
    private long blockedThreadCount;
    @Getter
    @Setter
    private long waitThreadCount;
    @Getter
    @Setter
    private long timeWaitThreadCount;
    @Getter
    @Setter
    private long terminatedThreadCount;
}

Modify JVMSourceDispatcher

org.apache.skywalking.oap.server.analyzer.provider.jvm.JVMSourceDispatcher is a Source dispatcher class. It splits the JVM-related metrics received from the agent into the corresponding Sources, for example ServiceInstanceJVMMemory and ServiceInstanceJVMThread.
Change the method org.apache.skywalking.oap.server.analyzer.provider.jvm.JVMSourceDispatcher#sendToThreadMetricProcess to the following code:

    private void sendToThreadMetricProcess(String service,
            String serviceId,
            String serviceInstance,
            String serviceInstanceId,
            long timeBucket,
            Thread thread) {
        ServiceInstanceJVMThread serviceInstanceJVMThread = new ServiceInstanceJVMThread();
        serviceInstanceJVMThread.setId(serviceInstanceId);
        serviceInstanceJVMThread.setName(serviceInstance);
        serviceInstanceJVMThread.setServiceId(serviceId);
        serviceInstanceJVMThread.setServiceName(service);
        serviceInstanceJVMThread.setLiveCount(thread.getLiveCount());
        serviceInstanceJVMThread.setDaemonCount(thread.getDaemonCount());
        serviceInstanceJVMThread.setPeakCount(thread.getPeakCount());
        serviceInstanceJVMThread.setTimeBucket(timeBucket);
        serviceInstanceJVMThread.setDeadlocked(thread.getDeadlocked());
        serviceInstanceJVMThread.setMonitorDeadlocked(thread.getMonitorDeadlocked());
        serviceInstanceJVMThread.setNewThreadCount(thread.getNewThreadCount());
        serviceInstanceJVMThread.setRunnableThreadCount(thread.getRunnableThreadCount());
        serviceInstanceJVMThread.setBlockedThreadCount(thread.getBlockedThreadCount());
        serviceInstanceJVMThread.setWaitThreadCount(thread.getWaitThreadCount());
        serviceInstanceJVMThread.setTimeWaitThreadCount(thread.getTimeWaitThreadCount());
        serviceInstanceJVMThread.setTerminatedThreadCount(thread.getTerminatedThreadCount());
        sourceReceiver.receive(serviceInstanceJVMThread);
    }
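The dispatch step boils down to copying values from the received message into a Source object, stamping it with identity and time bucket, and handing it to a receiver. The sketch below shows only that shape with plain Java stand-ins. Every type in it (DispatcherSketch, ThreadSample, ThreadSource) is hypothetical and only two of the metric fields are carried over; the real classes are the SkyWalking ones shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// All types here are hypothetical stand-ins, not SkyWalking classes.
final class DispatcherSketch {

    // Stand-in for the protobuf Thread message received from the agent.
    static final class ThreadSample {
        final long liveCount;
        final long blockedThreadCount;

        ThreadSample(long liveCount, long blockedThreadCount) {
            this.liveCount = liveCount;
            this.blockedThreadCount = blockedThreadCount;
        }
    }

    // Stand-in for ServiceInstanceJVMThread: identity fields plus metric values.
    static final class ThreadSource {
        String id;
        String name;
        long timeBucket;
        long liveCount;
        long blockedThreadCount;
    }

    // Stand-in for the receiver that the dispatcher forwards Sources to.
    private final Consumer<ThreadSource> receiver;

    DispatcherSketch(Consumer<ThreadSource> receiver) {
        this.receiver = receiver;
    }

    // Copies one sample into a source object and forwards it.
    void sendThreadMetrics(String instanceId, String instanceName,
                           long timeBucket, ThreadSample sample) {
        ThreadSource source = new ThreadSource();
        source.id = instanceId;
        source.name = instanceName;
        source.timeBucket = timeBucket;
        source.liveCount = sample.liveCount;
        source.blockedThreadCount = sample.blockedThreadCount;
        receiver.accept(source);
    }

    public static void main(String[] args) {
        List<ThreadSource> received = new ArrayList<>();
        DispatcherSketch dispatcher = new DispatcherSketch(received::add);
        dispatcher.sendThreadMetrics("instance-1-id", "instance-1", 202107011200L,
                new ThreadSample(42, 3));
        System.out.println(received.get(0).name + " live=" + received.get(0).liveCount);
    }
}
```

A field that is added to the message but never copied in this step would not reach the receiver, which is why the dispatcher has to be updated together with the proto file and the Source class.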

Add the related metrics to java-agent.oal

Add the following statements to oap-server/server-bootstrap/src/main/resources/oal/java-agent.oal:

// See the OAL syntax reference
instance_jvm_thread_deadlocked = from(ServiceInstanceJVMThread.deadlocked).longAvg();
instance_jvm_thread_monitor_deadlocked = from(ServiceInstanceJVMThread.monitorDeadlocked).longAvg();
instance_jvm_thread_new_thread_count = from(ServiceInstanceJVMThread.newThreadCount).longAvg();
instance_jvm_thread_runnable_thread_count = from(ServiceInstanceJVMThread.runnableThreadCount).longAvg();
instance_jvm_thread_blocked_thread_count = from(ServiceInstanceJVMThread.blockedThreadCount).longAvg();
instance_jvm_thread_wait_thread_count = from(ServiceInstanceJVMThread.waitThreadCount).longAvg();
instance_jvm_thread_time_wait_thread_count = from(ServiceInstanceJVMThread.timeWaitThreadCount).longAvg();
instance_jvm_thread_terminated_thread_count = from(ServiceInstanceJVMThread.terminatedThreadCount).longAvg();
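Each statement above aggregates the values reported for one instance within a time bucket into an average. A rough sketch of that arithmetic follows, assuming the average is a sum divided by a count using long (truncating) division. LongAvgSketch is a made-up class for illustration and is not the OAP implementation.

```java
// Hypothetical illustration of a long average; not the OAP implementation.
final class LongAvgSketch {
    private long summation;
    private long count;

    // Adds one reported value to the current time bucket.
    void combine(long value) {
        summation += value;
        count++;
    }

    // Average of the values seen so far, using truncating long division.
    long value() {
        return count == 0 ? 0 : summation / count;
    }

    // Convenience wrapper: average of a fixed set of samples.
    static long average(long... samples) {
        LongAvgSketch avg = new LongAvgSketch();
        for (long sample : samples) {
            avg.combine(sample);
        }
        return avg.value();
    }

    public static void main(String[] args) {
        // Three reports of the runnable thread count within one bucket.
        System.out.println(average(10, 11, 12)); // prints 11
        // One report with a deadlocked thread, one without.
        System.out.println(average(0, 1));       // prints 0
    }
}
```

If the real function truncates in the same way, a deadlock that shows up in only some of the reports within a bucket could average down to 0, which is worth keeping in mind when reading the instance_jvm_thread_deadlocked chart.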

Modify apm.yml

In the file oap-server/server-bootstrap/src/main/resources/ui-initialized-templates/apm.yml, add the following configuration to the Instance item under the APM dashboard:

{
  "width": 3,
  "title": "JVM Thread Count (Java Service)",
  "height": "250",
  "entityType": "ServiceInstance",
  "independentSelector": false,
  "metricType": "REGULAR_VALUE",
  "queryMetricType": "readMetricsValues",
  "chartType": "ChartLine",
  "metricName": "instance_jvm_thread_live_count, instance_jvm_thread_daemon_count, instance_jvm_thread_peak_count,instance_jvm_thread_deadlocked,instance_jvm_thread_monitor_deadlocked"
},
{
  "width": 3,
  "title": "JVM Thread State Count (Java Service)",
  "height": "250",
  "entityType": "ServiceInstance",
  "independentSelector": false,
  "metricType": "REGULAR_VALUE",
  "metricName": "instance_jvm_thread_new_thread_count,instance_jvm_thread_runnable_thread_count,instance_jvm_thread_blocked_thread_count,instance_jvm_thread_wait_thread_count,instance_jvm_thread_time_wait_thread_count,instance_jvm_thread_terminated_thread_count",
  "queryMetricType": "readMetricsValues",
  "chartType": "ChartBar"
}

If you are not sure where to add it, you can overwrite oap-server/server-bootstrap/src/main/resources/ui-initialized-templates/apm.yml directly with the following configuration:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# UI templates initialized file includes the default template when the SkyWalking OAP starts up at the first time.
#
# Also, SkyWalking would detect the existing templates in the database, once they are missing, all templates in this file
# could be added automatically.

templates:
  - name: "APM"
    # The type includes DASHBOARD, TOPOLOGY_INSTANCE, TOPOLOGY_ENDPOINT.
    # DASHBOARD type templates could have multiple definitions, by using different names.
    # TOPOLOGY_INSTANCE, TOPOLOGY_ENDPOINT type templates should be defined once, as they are used in the topology page only.
    type: "DASHBOARD"
    # Configuration could be defined through UI, and use `export` to format in the standard JSON.
    configuration: |-
      [
        {
          "name": "APM",
          "type": "service",
          "children": [
            {
              "name": "Global",
              "children": [
                {
                  "width": 3,
                  "title": "Services Load",
                  "height": "300",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_cpm",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": false,
                  "unit": "CPM - calls per minute"
                },
                {
                  "width": 3,
                  "title": "Slow Services",
                  "height": "300",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_resp_time",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": false,
                  "unit": "ms"
                },
                {
                  "width": 3,
                  "title": "Un-Health Services (Apdex)",
                  "height": "300",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_apdex",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": false,
                  "aggregation": "/",
                  "aggregationNum": "10000",
                  "sortOrder": "ASC"
                },
                {
                  "width": 3,
                  "title": "Slow Endpoints",
                  "height": "300",
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "endpoint_avg",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": false,
                  "unit": "ms"
                },
                {
                  "width": "6",
                  "title": "Global Response Latency",
                  "height": "280",
                  "entityType": "All",
                  "independentSelector": false,
                  "metricType": "LABELED_VALUE",
                  "metricName": "all_percentile",
                  "queryMetricType": "readLabeledMetricsValues",
                  "chartType": "ChartLine",
                  "metricLabels": "P50, P75, P90, P95, P99",
                  "labelsIndex": "0, 1, 2, 3, 4",
                  "unit": "percentile in ms"
                },
                {
                  "width": "6",
                  "title": "Global Heatmap",
                  "height": "280",
                  "entityType": "All",
                  "independentSelector": false,
                  "metricType": "HEATMAP",
                  "unit": "ms",
                  "queryMetricType": "readHeatMap",
                  "chartType": "ChartHeatmap",
                  "metricName": "all_heatmap"
                }
              ]
            },
            {
              "name": "Service",
              "children": [
                {
                  "width": 3,
                  "title": "Service Apdex",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_apdex",
                  "queryMetricType": "readMetricsValue",
                  "chartType": "ChartNum",
                  "aggregation": "/",
                  "aggregationNum": "10000"
                },
                {
                  "width": 3,
                  "title": "Service Avg Response Time",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_resp_time",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "ms"
                },
                {
                  "width": 3,
                  "title": "Successful Rate",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_sla",
                  "queryMetricType": "readMetricsValue",
                  "chartType": "ChartNum",
                  "unit": "%",
                  "aggregation": "/",
                  "aggregationNum": "100"
                },
                {
                  "width": 3,
                  "title": "Service Load",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_cpm",
                  "queryMetricType": "readMetricsValue",
                  "chartType": "ChartNum",
                  "unit": "CPM - calls per minute"
                },
                {
                  "width": 3,
                  "title": "Service Apdex",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_apdex",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "aggregation": "/",
                  "aggregationNum": "10000"
                },
                {
                  "width": 3,
                  "title": "Service Response Time Percentile",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "LABELED_VALUE",
                  "metricName": "service_percentile",
                  "queryMetricType": "readLabeledMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "ms",
                  "metricLabels": "P50, P75, P90, P95, P99",
                  "labelsIndex": "0, 1, 2, 3, 4"
                },
                {
                  "width": 3,
                  "title": "Successful Rate",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_sla",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "%",
                  "aggregation": "/",
                  "aggregationNum": "100"
                },
                {
                  "width": 3,
                  "title": "Service Load",
                  "height": "200",
                  "entityType": "Service",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_cpm",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "CPM - calls per minute"
                },
                {
                  "width": "4",
                  "title": "Service Instances Load",
                  "height": "280",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_instance_cpm",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": true,
                  "unit": "CPM - calls per minute"
                },
                {
                  "width": "4",
                  "title": "Slow Service Instance",
                  "height": "280",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_instance_resp_time",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": true,
                  "unit": "ms"
                },
                {
                  "width": "4",
                  "title": "Service Instance Successful Rate",
                  "height": "280",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_instance_sla",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": true,
                  "unit": "%",
                  "aggregation": "/",
                  "aggregationNum": "100",
                  "sortOrder": "ASC"
                }
              ]
            },
            {
              "name": "Instance",
              "children": [
                {
                  "width": "3",
                  "title": "Service Instance Load",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_instance_cpm",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "CPM - calls per minute"
                },
                {
                  "width": 3,
                  "title": "Service Instance Throughput",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_instance_throughput_received,service_instance_throughput_sent",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "Bytes"
                },
                {
                  "width": "3",
                  "title": "Service Instance Successful Rate",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_instance_sla",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "%",
                  "aggregation": "/",
                  "aggregationNum": "100"
                },
                {
                  "width": "3",
                  "title": "Service Instance Latency",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "service_instance_resp_time",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "ms"
                },
                {
                  "width": 3,
                  "title": "JVM CPU (Java Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "instance_jvm_cpu",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "%",
                  "aggregation": "+",
                  "aggregationNum": ""
                },
                {
                  "width": 3,
                  "title": "JVM Memory (Java Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "instance_jvm_memory_heap, instance_jvm_memory_heap_max,instance_jvm_memory_noheap, instance_jvm_memory_noheap_max",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "MB",
                  "aggregation": "/",
                  "aggregationNum": "1048576"
                },
                {
                  "width": 3,
                  "title": "JVM GC Time",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "instance_jvm_young_gc_time, instance_jvm_old_gc_time",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "ms"
                },
                {
                  "width": 3,
                  "title": "JVM GC Count",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartBar",
                  "metricName": "instance_jvm_young_gc_count, instance_jvm_old_gc_count"
                },
                {
                  "width": 3,
                  "title": "JVM Thread Count (Java Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "metricName": "instance_jvm_thread_live_count, instance_jvm_thread_daemon_count, instance_jvm_thread_peak_count,instance_jvm_thread_deadlocked,instance_jvm_thread_monitor_deadlocked"
                },
                {
                  "width": 3,
                  "title": "JVM Thread State Count (Java Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "instance_jvm_thread_new_thread_count,instance_jvm_thread_runnable_thread_count,instance_jvm_thread_blocked_thread_count,instance_jvm_thread_wait_thread_count,instance_jvm_thread_time_wait_thread_count,instance_jvm_thread_terminated_thread_count",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartBar"
                },
                {
                  "width": 3,
                  "title": "CLR CPU  (.NET Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "instance_clr_cpu",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "%"
                },
                {
                  "width": 3,
                  "title": "CLR GC (.NET Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "instance_clr_gen0_collect_count, instance_clr_gen1_collect_count, instance_clr_gen2_collect_count",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartBar"
                },
                {
                  "width": 3,
                  "title": "CLR Heap Memory (.NET Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "instance_clr_heap_memory",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "MB",
                  "aggregation": "/",
                  "aggregationNum": "1048576"
                },
                {
                  "width": 3,
                  "title": "CLR Thread (.NET Service)",
                  "height": "250",
                  "entityType": "ServiceInstance",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "metricName": "instance_clr_available_completion_port_threads,instance_clr_available_worker_threads,instance_clr_max_completion_port_threads,instance_clr_max_worker_threads"
                }
              ]
            },
            {
              "name": "Endpoint",
              "children": [
                {
                  "width": "4",
                  "title": "Endpoint Load in Current Service",
                  "height": "280",
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "endpoint_cpm",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "parentService": true,
                  "unit": "CPM - calls per minute"
                },
                {
                  "width": "4",
                  "title": "Slow Endpoints in Current Service",
                  "height": "280",
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "metricName": "endpoint_avg",
                  "unit": "ms",
                  "parentService": true
                },
                {
                  "width": "4",
                  "title": "Successful Rate in Current Service",
                  "height": "280",
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "endpoint_sla",
                  "queryMetricType": "sortMetrics",
                  "chartType": "ChartSlow",
                  "aggregation": "/",
                  "aggregationNum": "100",
                  "parentService": true,
                  "unit": "%",
                  "sortOrder": "ASC"
                },
                {
                  "width": 3,
                  "title": "Endpoint Load",
                  "height": 350,
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "endpoint_cpm",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine"
                },
                {
                  "width": 3,
                  "title": "Endpoint Avg Response Time",
                  "height": 350,
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "endpoint_avg",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "ms"
                },
                {
                  "width": 3,
                  "title": "Endpoint Response Time Percentile",
                  "height": 350,
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "LABELED_VALUE",
                  "metricName": "endpoint_percentile",
                  "queryMetricType": "readLabeledMetricsValues",
                  "chartType": "ChartLine",
                  "metricLabels": "P50, P75, P90, P95, P99",
                  "labelsIndex": "0, 1, 2, 3, 4",
                  "unit": "ms"
                },
                {
                  "width": 3,
                  "title": "Endpoint Successful Rate",
                  "height": 350,
                  "entityType": "Endpoint",
                  "independentSelector": false,
                  "metricType": "REGULAR_VALUE",
                  "metricName": "endpoint_sla",
                  "queryMetricType": "readMetricsValues",
                  "chartType": "ChartLine",
                  "unit": "%",
                  "aggregation": "/",
                  "aggregationNum": "100"
                }
              ]
            }
          ]
        }
      ]
    # Activated as the DASHBOARD type, makes this templates added into the UI page automatically.
    # False means providing a basic template, user needs to add it manually.
    activated: true
    # True means wouldn't show up on the dashboard. Only keeps the definition in the storage.
    disabled: false

Results

Code contributions

  • Add some new thread metric and class metric to JVMMetric #7230
  • add some new thread metric and class metric to JVMMetric #52
  • Remove Terminated State and New State in JVMMetric (#7230) #53
  • Add some new thread metric and class metric to JVMMetric (#7230) #7243

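Summary

There are almost no examples online of how to extend metrics, so this was worked out from the official documentation and the source code. For an open source project as popular as this one, reading the official documentation and the source code is still the more reliable approach.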

References

  • An analysis of Java ManagementFactory
  • Using the ThreadMXBean class to detect deadlocks in code
  • Source and Scope extension for new metrics
  • Observability Analysis Language