前言

作为一名gopher,不晓得你是否遇到过以下问题？

CPU忽然飙高(甚至死循环,CPU跑满)？
某个性能接口QPS在压测中始终压不下来？
利用呈现goroutine泄露？
内存居高不下？
在某处加锁的逻辑中，迟迟得不到锁？

你是否能得心应手的找到问题的症结所在，你又是应用什么办法或工具解决的呢？明天我将介绍下我常常应用的工具pprof.它是golang自带的性能剖析大杀器，基本上能疾速解决上述问题。

如何应用pprof

应用pprof有三种姿态，一种是应用runtime/pprof/pprof.go另一种是应用net/http/pprof/pprof.go（底层也是应用runtime/pprof）,还有一种是在单元测试中生成profile 数据。具体的办法在对应的pprof.go文件结尾有阐明。

1. runtime/pprof 形式pprof

package main

import (
    "flag"
    "log"
    "os"
    "runtime"
    "runtime/pprof"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to `file`")
var memprofile = flag.String("memprofile", "", "write memory profile to `file`")

func main() {
    flag.Parse()
    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal("could not create CPU profile: ", err)
        }
        defer f.Close() // error handling omitted for example
        if err := pprof.StartCPUProfile(f); err != nil {
            log.Fatal("could not start CPU profile: ", err)
        }
        defer pprof.StopCPUProfile()
    }

    // ... rest of the program ...

    if *memprofile != "" {
        f, err := os.Create(*memprofile)
        if err != nil {
            log.Fatal("could not create memory profile: ", err)
        }
        defer f.Close() // error handling omitted for example
        runtime.GC()    // get up-to-date statistics
        if err := pprof.WriteHeapProfile(f); err != nil {
            log.Fatal("could not write memory profile: ", err)
        }
    }
}

这种形式须要你手动启动须要pprof的类型，比方，开启CPU profile 还是heap profile 等等，pprof 会在利用启动到完结的整个生命周期生成profile 文件。其实生成profile 数据是会损耗性能的，生产环境不倡议始终开启，能够在须要剖析的时候长期采集那个时刻的数据，如通过监听系统信号的形式开启/敞开pprof,示例代码如下：

package main

import (
    "flag"
    "log"
    "os"
    "runtime"
    "runtime/pprof"
)
var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to `file`")
var memprofile = flag.String("memprofile", "", "write memory profile to `file`")

func signalStopPprof(f *os.File) {
    c := make(chan os.Signal, 1)
    signal.Notify(c, syscall.SIGUSR2)
    for _ = range c {
        pprof.StopCPUProfile()
        log.Println("stop  profile")
        _ = f.Close()
    }
}
func signalStartPprof(f *os.File) {
    c := make(chan os.Signal, 1)
    signal.Notify(c, syscall.SIGUSR1)
    for _ = range c {
        _ = f.Truncate(0)
        if err := pprof.StartCPUProfile(f); err != nil {
            log.Println("could not start CPU profile: ", err)
        }
    }
}
func main() {
    flag.Parse()
    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal("could not create CPU profile: ", err)
        }
        go func() {
            signalStartPprof(f)
        }()
        go func() {
            signalStopPprof(f)
        }()
    }
    // ... rest of the program ...
}

不过自己还是更加喜爱第二种形式net/http/pprof

2. net/http/pprof 形式pprof

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof"
)

func main() {
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()
  // ... rest of the program ...
}

是不是很简略，这种形式只须要开启http服务，并且import _ "net/http/pprof",他会主动把相应的http的handleFunc 注册下来，若你应用的不是DefaultServeMux,须要本人手动注册下。

func init() {
    http.HandleFunc("/debug/pprof/", Index)
    http.HandleFunc("/debug/pprof/cmdline", Cmdline)
    http.HandleFunc("/debug/pprof/profile", Profile)
    http.HandleFunc("/debug/pprof/symbol", Symbol)
    http.HandleFunc("/debug/pprof/trace", Trace)
}

3. 单元测试中进行profile

go test -cpuprofile cpu.prof -memprofile mem.prof -bench .

剖析pprof

在上文如何应用pprof中介绍的三种开启pprof的形式，他们都会生成profile二进制文件，有三种形式能够剖析这个二进制文件

姿态一：通过交互命令行,办法：go tool pprof {profile文件}

若是通过http形式开启pprof，能够

go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30

若是通过形式1和3生成文件,能够

go tool pprof cpu.prof

命令交互行如下
姿态二：通过web形式查看，办法 go tool pprof -http=:6060 {profile文件}

而后就能够在浏览器中拜访 http://localhost:6060/

当然也有一种形式在姿态一中介绍的命令行交互中输出web会生成 .svg文件，不过须要装置graphviz来关上这个文件，不举荐，还是倡议应用命令行或本地启动web服务来查看profile 数据.

实战演练

1.定位CPU问题

本人写了一个模拟程序，模仿CPU问题，

依照上述姿态一关上命令行交互，执行top10,输入采样期间CPU应用最高的10个办法

这里简略介绍下flat flat% sum% cum cum%这五个参数的含意,具体能够查看Pprof and golang – how to interpret a results?

flat：指的是该办法所占用的CPU工夫（不蕴含这个办法中调用其余办法所占用的工夫）

flat%: 指的是该办法flat工夫占全副采样工夫的比例

cum：指的是该办法以及办法中调用其余办法所占用的CPU工夫总和，这里留神区别于flat

cum%:指的是该办法cum工夫占全副采样工夫的比例

sum%: 指的是执行到以后办法累积占用的CPU工夫总和，也即是后面flat%总和

上图能够看出worker()占用CPU工夫较久，咱们能够list main.worker查看具体代码

当然也能够通过上述姿态二，启动web服务查看火焰图go tool pprof -http=:6061 cpu.profile

关上http://localhost:6061/ui/ 火焰图如下，其中色彩越红代表占用的CPU工夫越多

找到了耗时比拟久的中央，就能看看失常的业务代码还是能够优化的逻辑，就能够优化后在pprof

2. 定位内存问题

排查内存再用过高问题

其中inuse_space示意查看常驻内存的应用状况

list 查看相应的函数

原来是统一在append内存，并持有到1GB不开释

当然还能够应用alloc_objects：剖析应用程序的内存长期分配情况

能够看到利用稳固后，除了下面初始调配1GB外，利用长期内存调配次要在worker中

3.定位goroutine问题

例如goroutine泄露等问题，办法如下

能够打出各个goroutine的调用栈

能够看到有900+ 的goroutine 阻塞在runtime/gopark并且是由main.goroutine.fun1办法引起的 list 查看办法内容list main.goroutine.fun1

能够看出次要是阻塞在channel c 上.当然也能够通过traces runtime.gopark 查看那些办法最终阻塞在gopark上，另外也能够在web页面http://localhost:6060/debug/p…间接查看。

4.定位锁问题

锁的问题可能导致程序运行迟缓，pprof mutex 相干的须要设置采样率

runtime.SetMutexProfileFraction(1),若采样率<0 将不进行采样

能够看出，次要在main.mutex.func1上，能够查看调用链

list main.mutex.func1

能够看出次要阻塞在锁m上

5. 定位goroutine阻塞期待同步问题

区别于4 ，这里次要是记录goroutine阻塞期待同步的地位,而4中次要是互斥锁剖析，报告互斥锁的竞争状况。同样须要设置采样率runtime.SetBlockProfileRate(1)

次要主色在chanrecv1上，查看源码

能够发现次要阻塞在channel c 上

写在最初

下面曾经一步一步演示了一遍常见利用性能问题，以及如何剖析定位，前面将写下这些剖析数据背地是如何采集，原理是什么。

最初附上本次测试源码https://github.com/John520/pp…

参考文献

pprof 官网README
Profiling Go Programs
Golang 大杀器之性能分析 PProf
golang pprof 实战
https://golang2.eddycjy.com/p…

关于golang:golang性能分析利器pprof

前言

如何应用pprof

1. runtime/pprof 形式pprof

2. net/http/pprof 形式pprof

3. 单元测试中进行profile

剖析pprof

实战演练

1.定位CPU问题

2. 定位内存问题

3.定位goroutine问题

4.定位锁问题

5. 定位goroutine阻塞期待同步问题

写在最初

参考文献

评论

发表回复取消回复

更多文章

DDN HPC 存储硬件架构设计深度分析

探秘IO500：从Lustre并行文件系统出发，开启HPC存储性能新征程

苹果iOS打包的ipa应用无法安装？一篇文章带你了解可能的原因及排查方法

图解Golang：从零开始实现简易版过期LRU缓存

关于golang:golang性能分析利器pprof

前言

如何应用pprof

1. runtime/pprof 形式pprof

2. net/http/pprof 形式pprof

3. 单元测试中进行profile

剖析pprof

实战演练

1.定位CPU问题

2. 定位内存问题

3.定位goroutine问题

4.定位锁问题

5. 定位goroutine阻塞期待同步问题

写在最初

参考文献

评论

发表回复 取消回复

更多文章

DDN HPC 存储硬件架构设计深度分析

探秘IO500：从Lustre并行文件系统出发，开启HPC存储性能新征程

苹果iOS打包的ipa应用无法安装？一篇文章带你了解可能的原因及排查方法

图解Golang：从零开始实现简易版过期LRU缓存

发表回复取消回复