简介
profiling工具是语言生态的重要组成部分。这篇blog对erlang的vm层profiling做一个总结。
profiling tools overview
可参考
https://www.erlang.org/doc/ef...
名字 | 性能 | 残余性能) |
---|---|---|
cprof | 统计模块的调用次数,可找出调用次数热点 | 0.67 |
eprof | 统计模块的调用次数和工夫,可找出调用工夫、次数热点 | 0.17 |
fprof | 统计模块调用次数和关系,可生成call-graph | 0.008 |
eflame | 通过erlang trace采集调用栈,过程状态 | 0.003 |
looking_glass | 也是基于erlang trace | 0.004 |
具体数据
压测代码如下:
defp do_sth(:cpu_intensive) do start_time = :erlang.system_time(:microsecond) end_time = start_time + 10_000_000 do_until_time(0, start_time, end_time) end def do_until_time(count, start_time, end_time) do now = :erlang.system_time(:microsecond) if now >= end_time do Logger.warn( "\ntotal_time: #{div(end_time - start_time, 1000_000)} second\ncount: #{count}, avg_s: #{div(end_time - start_time, count)}" ) else for v <- 1..100 do a = %{v => Base.encode64(:crypto.strong_rand_bytes(20))} b = Jason.encode!(a) Jason.decode!(b) end do_until_time(count + 1, start_time, end_time) end end
无压测
性能基准
total_time: 10 secondcount: 24287, avg_s: 411cpu: %100
cprof
示例代码
:cprof.start() do_sth(:cpu_intensive) :cprof.pause() data = :cprof.analyse() Logger.warn("cprof_result:\n#{inspect(data, pretty: true)}")
输入后果
仅能取得模块的调用次数。
{243741614, [ {Jason.Encode, 75685344, [ {{Jason.Encode, :escape_json_chunk, 5}, 49314144}, {{Jason.Encode, :value, 3}, 3296400}, {{Jason.Encode, :escape_json, 4}, 3296400}, {{Jason.Encode, :escape_json, 3}, 3296400}, {{Jason.Encode, :"-fun.escape_json/3-", 3}, 3296400}, {{Jason.Encode, :map_naive_loop, 3}, 1648200}, {{Jason.Encode, :map_naive, 3}, 1648200}, {{Jason.Encode, :key, 2}, 1648200}, {{Jason.Encode, :escape_function, 1}, 1648200}, {{Jason.Encode, :encode_string, 2}, 1648200}, {{Jason.Encode, :encode_map_function, 1}, 1648200}, {{Jason.Encode, :encode, 2}, 1648200}, {{Jason.Encode, :"-fun.map_naive/3-", 3}, 1648200} ]}, {Jason.Decoder, 75685344, [ {{Jason.Decoder, :string, 6}, 52610544}, {{Jason.Decoder, :value, 5}, 3296400}, {{Jason.Decoder, :"-string_decode_function/1-fun-0-", 1}, 3296400}, {{Jason.Decoder, :terminate, 6}, 1648200}, {{Jason.Decoder, :string_decode_function, 1}, 1648200}, {{Jason.Decoder, :parse, 2}, 1648200}, {{Jason.Decoder, :object_decode_function, 1}, 1648200}, {{Jason.Decoder, :object, 6}, 1648200}, {{Jason.Decoder, :key_decode_function, 1}, 1648200}, {{Jason.Decoder, :key, 6}, 1648200}, {{Jason.Decoder, :key, 5}, 1648200}, {{Jason.Decoder, :float_decode_function, 1}, 1648200}, {{Jason.Decoder, :"-key_decode_function/1-fun-0-", 1}, 1648200} ]}
性能
total_time: 10 secondcount: 16482, avg_s: 606cpu: %100
eprof
示例代码
:eprof.start() :eprof.start_profiling(:erlang.processes()) do_sth(:cpu_intensive) :eprof.stop_profiling() data = :eprof.analyze(:total, sort: :time) Logger.warn("eprof_result:\n#{inspect(data, pretty: true)}")
输入后果
统计了调用工夫。按工夫占比排序是个较实用的选项。
FUNCTION CALLS % TIME [uS / CALLS]-------- ----- ------- ---- [----------]gen:call/4 1 0.00 0 [ 0.00]gen:do_for_proc/2 1 0.00 0 [ 0.00]...maps:from_list/1 1249500 2.04 203667 [ 0.16]'Elixir.Enum':into_map/2 833000 2.38 237244 [ 0.28]'Elixir.ProfilingTest':'-do_until_time/3-fun-0-'/2 416500 2.39 238687 [ 0.57]'Elixir.Jason.Decoder':parse/2 416500 2.41 240746 [ 0.58]'Elixir.Base':do_encode64/2 416500 2.65 264394 [ 0.63]'Elixir.Base':'-do_encode64/2-lbc$^0/2-0-'/2 1666000 7.30 728723 [ 0.44]'Elixir.Base':enc64_pair/1 5831000 7.39 737817 [ 0.13]crypto:strong_rand_bytes_nif/1 416500 7.56 754244 [ 1.81]'Elixir.Jason.Encode':escape_json_chunk/5 12461680 12.19 1216586 [ 0.10]'Elixir.Jason.Decoder':string/6 13294680 13.93 1390338 [ 0.10]------------------------------------------------------------ -------- ------- ------- [----------]Total: 61597835 100.00% 9981650 [ 0.16]
性能
total_time: 10 secondcount: 4165, avg_s: 2400cpu: %100
fprof
示例代码
:fprof.trace([:start, {:procs, :erlang.processes()}]) do_sth(:cpu_intensive) # 默认写入fprof.trace文件 :fprof.trace(:stop) # 读取文件并剖析 :fprof.profile() # 输入后果,按函数本身调用工夫排序,不蕴含调用其余函数的工夫 :fprof.analyse(sort: :own) # 按累计工夫排序,包含调用其余函数的工夫 :fprof.analyse(sort: :acc)
输入后果
kcallgrind
call graph生成见:https://segmentfault.com/a/11...
sort own
Processing data...Creating output...%% Analysis results:{ analysis_options, [{callers, true}, {sort, own}, {totals, false}, {details, true}]}.% CNT ACC OWN [{ totals, 5040222,10053.041, 9982.294}]. %%%% CNT ACC OWN [{ "<0.212.0>", 5034485,undefined, 9948.017}]. %%{[{{'Elixir.Jason.Decoder',string,6}, 1011296, 0.000, 1539.664}, {{'Elixir.Jason.Decoder',key,5}, 33800, 2135.166, 47.418}, {{'Elixir.Jason.Decoder',value,5}, 33800, 0.000, 46.437}], { {'Elixir.Jason.Decoder',string,6}, 1078896, 2135.166, 1633.519}, % [{{'Elixir.Jason.Decoder',string,6}, 1011296, 0.000, 1539.664}, {{'Elixir.Jason.Decoder',object,6}, 33800, 303.359, 140.051}, {{'Elixir.Jason.Decoder','-string_decode_function/1-fun-0-',1},67600, 85.381, 85.381}, {{'Elixir.Jason.Decoder',key,6}, 33800, 1899.845, 46.730}, {garbage_collect, 4858, 18.554, 18.554}]}. {[{{'Elixir.Jason.Encode',escape_json_chunk,5},943696, 0.000, 1351.146}, {{'Elixir.Jason.Encode',escape_json,4}, 67600, 1487.645, 94.579}], { {'Elixir.Jason.Encode',escape_json_chunk,5},1011296, 1487.645, 1445.725}, % [{{'Elixir.Jason.Encode',escape_json_chunk,5},943696, 0.000, 1351.146}, {garbage_collect, 8224, 41.920, 41.920}]}.
sort acc
Processing data...Creating output...%% Analysis results:{ analysis_options, [{callers, true}, {sort, acc}, {totals, false}, {details, true}]}.% CNT ACC OWN [{ totals, 5040222,10053.041, 9982.294}]. %%%% CNT ACC OWN [{ "<0.212.0>", 5034485,undefined, 9948.017}]. %%{[{{'Elixir.ProfilingTest','test fprof',1}, 0,10051.624, 0.002}, {{fprof,call,1}, 1, 1.377, 0.004}, {undefined, 0, 0.033, 0.028}], { {fprof,just_call,2}, 1,10053.034, 0.034}, % [{{'Elixir.ProfilingTest',do_sth,1}, 1,10051.622, 0.003}, {suspend, 1, 1.368, 0.000}, {{erlang,monitor,2}, 1, 0.005, 0.005}, {{erlang,demonitor,1}, 1, 0.005, 0.005}]}. {[{undefined, 0,10053.008, 0.001}], { {'Elixir.ProfilingTest','test fprof',1}, 0,10053.008, 0.001}, % [{{fprof,just_call,2}, 0,10051.624, 0.002}, {{fprof,trace,1}, 1, 1.383, 0.002}]}. {[{{fprof,just_call,2}, 1,10051.622, 0.003}], { {'Elixir.ProfilingTest',do_sth,1}, 1,10051.622, 0.003}, % [{{'Elixir.ProfilingTest',do_until_time,3}, 1,10051.617, 0.028}, {{erlang,system_time,1}, 1, 0.002, 0.002}]}.
性能
11:47:51.914 [warn] total_time: 10 secondcount: 341, avg_s: 29325cpu: %160
eflame
示例代码
spawn(fn -> do_sth(:cpu_intensive) end) :eflame2.write_trace(:global_and_local_calls, './eflame.test', :all, 15 * 1000) # 上面的语句须要期待一段时间 :eflame2.format_trace('./eflame.test', './eflame.test.out')
实现输入:
Hello, world, I'm <0.821.0> and I'm running....<0.230.0> <0.228.0> iex(2)> nil<0.50.0> <0.10.0> <0.194.0> <0.64.0> Writing to ./eflame.test.out for 8 processes... finished!
输入
生成火焰图:
~/git/eflame/flamegraph.riak-color.pl eflame.test.out > eflame.svg
性能
total_time: 10 secondcount: 91, avg_s: 109890cpu: %120
looking_grass
代码
# 不限度采集范畴会间接爆内存 :lg.trace( [{:scope, [self()]}], :lg_file_tracer, 'traces.lz4', %{process_dump: true} ) do_sth(:cpu_intensive) :lg.stop() :lg_flame.profile_many('traces.lz4.*', 'output')
输入
显然失落了一些调用栈。暂未查明起因。
~/git/FlameGraph/flamegraph.pl output > looking_grass.svg
性能
total_time: 10 secondcount: 127, avg_s: 78740
总结
- erlang 产品中可用的profiling工具非常少,仅 cprof 耗费较少,eprof也须要审慎开启。通过 trace pattern 缩小对性能的影响.
- erlang 多过程,火焰图不适宜作为剖析 cpu 耗费的工具,可用于剖析单个过程场景。
- 没有堪用的火焰图工具,实现太毛糙(没有采样,没有对两头后果做解决),内存,CPU耗费都过大。
参考
https://www.erlang.org/doc/ef...
https://interscity.org/assets...
https://github.com/slfritchie...
https://github.com/rabbitmq/l...