pprof 的使用

Robin92 · · 1434 次点击 · · 开始浏览

这是一个创建于的文章，其中的信息可能已经有所发展或是发生改变。

第一次，站长亲自招 Gopher 了>>>

基本介绍

pprof 是在做性能优化前的性能分析工具。

安装： go get -u github.com/google/pprof

两个工具

服务型应用 _ "net/http/pprof" 包，专用于采集 web 服务 运行数据的分析
工具型应用 "runtime/pprof" 包，专用于采集 应用程序 运行数据的分析

另外 go tool pprof 是命令行指令，用于获取和分析 Profiling 数据，源数据可以是 http 地址，也可以是本地 Profile 文件；查看模式可以命令行交互模式，也可以是 http 模式（-http 参数）。

服务型应用 (net/http/pprof)

特点是服务要一直运行。比如 web 应用，一边提供 HTTP 服务一边进行分析。

如果使用了默认的 http.DefaultServeMux (http.ListenAndServe("0.0.0.0:8000", nil))，则只需要导入包则可以使用。

如果自定义的 mux，需要手动注册路由规则，如 r.HandleFunc("/debug/pprof/", pprof.Index)

工具性应用 (runtime/pprof)

特点：应用运行一段时间就结束。

应用退出时，把 profile 文件保存到报告文件中。

封装了接口可以调用，如要进行 CPU Profiling，则调用 pprof.StartCPUProfile(w io.Writer) 写入到 w 中，停止时调用 StopCPUProfile()；要获取内存数据，直接使用 pprof.WriteHeapProfile(w io.Writer) 函数则可。

可以做？

可以做（ $host/debug/pprof/$type, $type 即下方列表的英文单词）：

CPU 分析（profile）: 你可以在 url 上用 seconds 参数指定抽样持续时间（默认 30s），你获取到概览文件后可以用 go tool pprof 命令调查这个概览
内存分配（allocs）: 所有内存分配的抽样
阻塞（block）: 堆栈跟踪导致阻塞的同步原语
命令行调用（cmdline）: 命令行调用的程序
goroutine: 当前 goroutine 的堆栈信息
堆（heap）: 当前活动对象内存分配的抽样，完全也可以指定 gc 参数在对堆取样前执行 GC
互斥锁（mutex）: 堆栈跟踪竞争状态互斥锁的持有者
系统线程的创建（threadcreate）: 堆栈跟踪系统新线程的创建
trace: 追踪当前程序的执行状况. 你可以用 seconds 参数指定抽样持续时间. 你获取到 trace 概览后可以用 go tool pprof 命令调查这个 trace

基本使用

（由于初学资料的限制，以下暂时均以 net/http/pprof 使用为例）

编写 Demo 文件 demo.go，运行这个文件

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof"
)

func main() {
    go func() {
        for {
            log.Println(Add("Hello world"))
        }
    }()

    http.ListenAndServe("0.0.0.0:6060", nil)
}

var datas []string

func Add(str string) string {
    data := []byte(str)
    sData := string(data)
    datas = append(datas, sData)

    return sData
}

通过 Web 页面分析

访问 http://localhost:6060/debug/pprof/ 页面，可以看到总览
点击其中的链接，可以看到各部分的概览

注：网页点击一些链接会下载文件，这些文件应该是需要专用工具才能打开查看分析，如 profile；而且默认采样时长 30s

通过命令行分析

执行命令 go tool pprof http://localhost:6060/debug/pprof/profile?seconds=60 进行 60s 内进行采样（时间自定，默认 30s）。

此时输出一些信息并已经进入到 交互模式：

$ go tool pprof http://localhost:6060/debug/pprof/profile
Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile
Saved profile in C:\Users\c30004139\pprof\pprof.samples.cpu.003.pb.gz
Type: cpu
Time: Jun 17, 2020 at 3:25pm (CST)
Duration: 30.14s, Total samples = 31.64s (104.96%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)

由输出信息可知：

将 profile 文件保存的位置
分析的 type 是 CPU
采样持续时间（Duration）
提示可以输入 help 查看可用命令，输入 o 查看可用选项

输入 help 显示了所有可用命令和选项

(pprof) help
Commands:
    callgrind        Outputs a graph in callgrind format
    comments         Output all profile comments
    disasm           Output assembly listings annotated with samples
    dot              Outputs a graph in DOT format
    eog              Visualize graph through eog
    evince           Visualize graph through evince
    gif              Outputs a graph image in GIF format
    gv               Visualize graph through gv
    kcachegrind      Visualize report in KCachegrind
    list             Output annotated source for functions matching regexp
    pdf              Outputs a graph in PDF format
    peek             Output callers/callees of functions matching regexp
    png              Outputs a graph image in PNG format
    proto            Outputs the profile in compressed protobuf format
    ps               Outputs a graph in PS format
    raw              Outputs a text representation of the raw profile
    svg              Outputs a graph in SVG format
    tags             Outputs all tags in the profile
    text             Outputs top entries in text form
    top              Outputs top entries in text form
    topproto         Outputs top entries in compressed protobuf format
    traces           Outputs all profile samples in text form
    tree             Outputs a text rendering of call graph
    web              Visualize graph through web browser
    weblist          Display annotated source in a web browser
    o/options        List options and their current values
    quit/exit/^D     Exit pprof

Options:
    call_tree        Create a context-sensitive call tree
    compact_labels   Show minimal headers
    divide_by        Ratio to divide all samples before visualization
    drop_negative    Ignore negative differences
    edgefraction     Hide edges below <f>*total
    focus            Restricts to samples going through a node matching regexp
    hide             Skips nodes matching regexp
    ignore           Skips paths going through any nodes matching regexp
    mean             Average sample value over first value (count)
    nodecount        Max number of nodes to show
    nodefraction     Hide nodes below <f>*total
    noinlines        Ignore inlines.
    normalize        Scales profile based on the base profile.
    output           Output filename for file-based outputs
    prune_from       Drops any functions below the matched frame.
    relative_percentages Show percentages relative to focused subgraph
    sample_index     Sample value to report (0-based index or name)
    show             Only show nodes matching regexp
    show_from        Drops functions above the highest matched frame.
    source_path      Search path for source files
    tagfocus         Restricts to samples with tags in range or matched by regexp
    taghide          Skip tags matching this regexp
    tagignore        Discard samples with tags in range or matched by regexp
    tagshow          Only consider tags matching this regexp
    trim             Honor nodefraction/edgefraction/nodecount defaults
    trim_path        Path to trim from source paths before search
    unit             Measurement units to display

Option groups (only set one per group):
    cumulative
    cum              Sort entries based on cumulative weight
    flat             Sort entries based on own weight
    granularity
    addresses        Aggregate at the address level.
    filefunctions    Aggregate at the function level.
    files            Aggregate at the file level.
    functions        Aggregate at the function level.
    lines            Aggregate at the source code line level.
:   Clear focus/ignore/hide/tagfocus/tagignore

type "help <cmd|option>" for more information
(pprof)

几个命令介绍：

list,
dis

CPU profile 的 top 命令

在 CPU profile 交互页面输入 topN 可以查看 top N 的概况（N 可省略，默认为 10）

(pprof) top
Showing nodes accounting for 29.92s, 94.56% of 31.64s total
Dropped 117 nodes (cum <= 0.16s)
Showing top 10 nodes out of 33
      flat  flat%   sum%        cum   cum%
    28.52s 90.14% 90.14%     28.58s 90.33%  runtime.cgocall
     0.81s  2.56% 92.70%      0.82s  2.59%  runtime.stdcall1
     0.24s  0.76% 93.46%      0.25s  0.79%  runtime.stdcall3
     0.16s  0.51% 93.96%     29.10s 91.97%  internal/poll.(*FD).writeConsole
     0.05s  0.16% 94.12%     29.28s 92.54%  internal/poll.(*FD).Write
     0.04s  0.13% 94.25%      0.18s  0.57%  runtime.findrunnable
     0.03s 0.095% 94.34%      0.18s  0.57%  runtime.mallocgc
     0.03s 0.095% 94.44%      0.25s  0.79%  runtime.mcall
     0.02s 0.063% 94.50%     29.49s 93.20%  log.(*Logger).Output
     0.02s 0.063% 94.56%     29.71s 93.90%  log.Println
(pprof)

信息：

显示的节点在总共 31.64s 的抽样中，占 29.92s，比例 94.56%
在 33 个样本中显示了 top 10
列表解释（cum: cumulative 堆积的）
- flat: 在给定函数上运行耗时
- flat%: 在给定函数上运行耗时总比例
- sum%: 给定函数累计使用 CPU 总比例
- cum: 当前函数 以及包含子函数 的调用运行总耗时
- cum%: 同上的 CPU 运行耗时总比例
- 最后一列为函数名称

作者：EDDYCJY
链接：https://www.jianshu.com/p/4e4ff6be6af9
来源：简书
著作权归作者所有。商业转载请联系作者获得授权，非商业转载请注明出处。

heap profile 的 top 命令

go tool pprof http://localhost:6060/debug/pprof/heap

$ go tool pprof http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in C:\Users\c30004139\pprof\pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
Type: inuse_space
Time: Jun 17, 2020 at 3:52pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
(pprof) top
Showing nodes accounting for 10.04GB, 100% of 10.04GB total
      flat  flat%   sum%        cum   cum%
   10.04GB   100%   100%    10.04GB   100%  main.Add
         0     0%   100%    10.04GB   100%  main.main.func1
(pprof)

inuse_space: 常驻内存占用情况
alloc_objects：内存临时分配情况

可视化界面

编写 test 用例，demo_test.go:

package main

import "testing"

const url = "hello world"

func TestAdd(t *testing.T) {
    s := Add(url)
    if s == "" {
        t.Errorf("Test.Add error")
    }
}

func BenchmarkAdd(b *testing.B) {
    for i := 0; i < b.N; i++ {
        Add(url)
    }
}

执行测试 go test -bench=. -cpuprofile=cpu.prof 及输出：

$ go test -bench=. -cpuprofile=cpu.prof
goos: windows
goarch: amd64
pkg: awesomeProject/pprof
BenchmarkAdd-12         13364784                88.3 ns/op
PASS
ok      awesomeProject/pprof    1.828s

参数解释：

-bench: 通过 go help testflag 查看，按通配运行 benchmarks
-cpuprofile: 指定 CPU profiling 的保存文件
-memprofile: 指定 Memory profiling 的保存文件

结果在本目录生成了 cpu.prof 文件

启用可视化界面

需要安装 graphviz，参考：Graphviz安装及简单使用。

方式一： go tool pprof -http=:8080 cpu.prof
方式二： go tool pprof cpu.prof，然后在交互页面输入 web [funcName] 命令（函数调用图生成 svg 文件并用浏览器查看）

graphviz-pprof

Top, 与前面介绍的几个列一样
Graph, 函数调用图，线框越大越粗表明占用时间越多
Flame Graph, 火焰图
Peek, 在 Top 的基础上显示了调用的上下文文件
Source, 看到了源码，及源码处执行时间
Disassemble

火焰图????

火焰图，调用顺序是从上到下（有些是从下到上，如火焰一般），每一块代表一个函数，越大代表占用 CPU 时间越长。

点击块深入分析。

刨析火焰图，参考如何读懂火焰图？

火焰图是对系统调用栈进行抽样调查
一般采样记录信息有三个参数：采样的进程号、采样的频率（每秒采样多少次，通常 99 次）、采样持续时间
如，对某 一个进程 的 调用栈 进行采样，每秒 99 次（Hz）、持续 30s。如果这个服务器有 16 个 CPU，就能采集到 99 * 30 * 16 = 47520 个调用栈信息
将调用栈信息组装成火焰图
火焰图 Y 轴表示调用栈，每一个块都是一个函数。调用栈越深，火焰越高。火焰顶端是正在执行的函数，底端是父函数（有些火焰是正的，有些是倒的）
火焰图 X 轴表示抽样次数。如果一个函数越宽，就表示被抽到的次数越多，执行时间越长。
注意：X 轴不表示时间先后顺序，而是所有调用栈合并后，按字母顺序排列的。
火焰图的颜色表示 CPU 的繁忙程度。
火焰图就是看顶层那个函数最宽。若有“平顶”（plateaus），就很有可能存在性能问题。
火焰图是 SVG 图（Scalable Vector Graphics 可伸缩的矢量图形参考），可以很方便得进行互动。

火焰图局限：

调用栈过深时，某些系统只返回前面一部分（如前 10 层）
函数名缺失时，只能用内存地址来表示（如匿名函数）

参考

有疑问加站长微信联系（非本文作者）

本文来自：简书

感谢作者：Robin92

查看原文：pprof 的使用

入群交流（和以上内容无关）：加入Go大咖交流群，或添加微信：liuxiaoyan-s 备注：入群；或加QQ群：692541889

1434 次点击

加入收藏微博

收入我的专栏

上一篇：golang oss上传文件

下一篇：protobuf文件编写在goland gomod下正确姿势探究,由浅入深版(版本proto3)

pprof

函数

runtime

web

0 回复

添加一条新回复（您需要登录后才能回复没有账号？）

请尽量让自己的回复能够对别人有帮助
支持 Markdown 格式, **粗体**、~~删除线~~、`单行代码`
支持 @ 本站用户；支持表情（输入 : 提示），见 Emoji cheat sheet
图片支持拖拽、截图粘贴等方式上传

关注我

扫码关注领全套学习资料
加入 QQ 群：
- 192706294（已满）
- 731990104（已满）
- 798786647（已满）
- 729884609（已满）
- 977810755（已满）
- 815126783（已满）
- 812540095（已满）
- 1006366459（已满）
- 692541889
加入微信群：liuxiaoyan-s，备注入群
也欢迎加入知识星球 Go粉丝们（免费）

pprof 的使用

基本介绍

两个工具

服务型应用 (net/http/pprof)

工具性应用 (runtime/pprof)

可以做？

基本使用

通过 Web 页面分析

通过命令行分析

CPU profile 的 top 命令

heap profile 的 top 命令

可视化界面

启用可视化界面

火焰图????

参考

用户登录

今日阅读排行

一周阅读排行

关注我

基本介绍

两个工具

服务型应用 (net/http/pprof)

工具性应用 (runtime/pprof)

可以做？

基本使用

通过 Web 页面分析

通过命令行分析

CPU profile 的 top 命令

heap profile 的 top 命令

可视化界面

启用可视化界面

火焰图????

参考

pprof 的使用

基本介绍

两个工具

服务型应用 (net/http/pprof)

工具性应用 (runtime/pprof)

可以做？

基本使用

通过 Web 页面分析

通过命令行分析

CPU profile 的 top 命令

heap profile 的 top 命令

可视化界面

启用可视化界面

火焰图????

参考

用户登录

今日阅读排行

一周阅读排行

关注我

给该专栏投稿 写篇新文章

收入到我管理的专栏 新建专栏

基本介绍

两个工具

服务型应用 (net/http/pprof)

工具性应用 (runtime/pprof)

可以做？

基本使用

通过 Web 页面分析

通过命令行分析

CPU profile 的 top 命令

heap profile 的 top 命令

可视化界面

启用可视化界面

火焰图????

参考

给该专栏投稿写篇新文章

收入到我管理的专栏新建专栏