<p>In the "binary trees benchmark" here <a href="http://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=go&lang2=java" rel="nofollow">http://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=go&lang2=java</a></p>
<p>Go seems to underperform Java by quite a bit. Now I understand Go is much newer, but is there something in code (<a href="http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=go&id=4" rel="nofollow">http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=go&id=4</a>) which slows it down or is it that the GC is still less mature than Java's?</p>
<p>I know it's just a game, but still...</p>
<hr/>**评论:**<br/><br/>kjk: <pre><p>Yes, it is memory allocator and probably the fact that bottomUpTree() is called a lot, recursively.</p>
<p>That being said, making sweeping statements (like "Go's GC is immature") based on a single benchmark is not exactly right. Both Go and Java have state-of-the art garbage collectors but they are tuned for different scenarios and this benchmark has a workload that heavily favors Java's GC.</p>
<p>Why? Because Java's collector is generational. That means that allocation is extremely cheap (bump of the pointer). The downside is that generational GCs use more memory and when objects are moved from nursery to tenured heap, there's additional work of copying the memory.</p>
<p>Go's GC is not generational, so allocation requires (comparatively speaking) much more work. It's also tuned for low latency (smallest pause when GC has to stop the program) at the expense of throughput (i.e. total speed). This is the right trade-off for most programs but doesn't perform optimally on micro-benchmarks that measure throughtput.</p>
<p>I've made a Go version that allocates nodes in bulk <a href="https://gist.github.com/kjk/4620bf60315d3fdd3f210e4590bdf1cb" rel="nofollow">https://gist.github.com/kjk/4620bf60315d3fdd3f210e4590bdf1cb</a> using a pool allocator of nodes.</p>
<p>On my machine it goes from 17s to 9s (if I allocate 1024 nodes at a time) or 7s (if I allocate 32*1024 nodes at a time).</p>
<p>In this particular program the number of nodes could be calculated up-front so it could be sped up even more.</p>
<p>Another thing where Java probably has an advantage is the fact that almost all the time is spent calling bottomUpTree() a lot, recursively.</p>
<p>Why is it worse than in Java?</p>
<p>To make goroutines cheap, Go starts each goroutine with very small stack (~4kb). As a result, Go has to expand the stack when needed. Go does that by checking stack size at the beginning of every function call. When it detects it needs more stack, it creates a new, bigger stack and copies the old stack to new one.</p>
<p>The cost is usually very small (few instructions for each function call) but if there is a function like bottomUpTree() that does little work and is called very often, even small cost adds up.</p>
<p>Additionally, this is deeply recursive call, so stack is being copied several times when it needs to be expanded.</p>
<p>I'm pretty sure that if you rewrote bottomUpTree() to be non-recursive (non-trivial but possible) and used pool allocator for nodes, Go version would be as good (if not better) than Java.</p></pre>kjk: <pre><p>Even faster version: <a href="https://gist.github.com/kjk/1392e38805c1d58bf6cd17c07f027f27" rel="nofollow">https://gist.github.com/kjk/1392e38805c1d58bf6cd17c07f027f27</a></p>
<p>It produces correct results in only 2 secs (originally: 17).</p>
<p>It does that by pre-allocating precise number of nodes and re-using pool allocator within each iteration.</p>
<p>That's not in the spirit of the benchmark but it's a trivial modification that you would do in real life coding if GC were dominating your execution time.</p>
<p>In Go, you would first use pool allocator (a custom allocator or using sync.Pool) if program allocates lots of tiny objects.</p>
<p>In any GC language you would then see if you can reduce the number of allocation. This benchmark happens to over-allocate and leaves the cleanup job to GC. With a small change of re-using pool allocator we can eliminate most of the GC work, reduce peak memory usage and improve cache utilization.</p></pre>comrade-jim: <pre><p>Looks like this one beats the C++ benchmark too. </p>
<p><a href="https://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=go&lang2=gpp" rel="nofollow">https://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=go&lang2=gpp</a></p>
<p>edit: and C</p>
<p><a href="https://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=go&lang2=gcc" rel="nofollow">https://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=go&lang2=gcc</a></p></pre>kl0nos: <pre><p>Why not submitting it? Most of the other languages implementations use some kind of arena (memory pool) anyway.</p></pre>neoasterisk: <pre><p>Yeah pretty please submit it.</p></pre>matttproud: <pre><p>The Alioth shootout is Lies, Damn Lies, and Statistics material. The results are — generally speaking — non-deterministic (e.g., suboptimal runtime settings on JVM that lead to non-repeatability) or specious (e.g., measuring the performance of Go, Ruby, or JVM callouts to native C libraries). I wish folks would quit citing them and formally label them harmful; they lack methodological rigor.</p></pre>igouy: <pre><p>I wish folks would quit lazily trash-talking and instead do some work to provide some alternative they believe better.</p>
<p><a href="http://benchmarksgame.alioth.debian.org/for-programming-language-researchers.html" rel="nofollow">http://benchmarksgame.alioth.debian.org/for-programming-language-researchers.html</a></p>
<p><em>We may quote to one another with a chuckle the words of the Wise Statesman, lies, damned lies and statistics, still there are some easy figures which the simplest must understand but the astutest cannot wriggle out of.</em> 1895 Leonard Henry Courtney</p></pre>robertmeta: <pre><p>Interesting question, over half the time on my box is spend in GC related activities (goofing around with GOGC=off more than halves the time). </p>
<p>But, it does seem like a solid naive implementation. I wonder what could be changed to lessen the GC pressure a bit. </p></pre>gohacker: <pre><p><a href="/r/golang/search" rel="nofollow">/r/golang/search</a></p>
<p><a href="https://www.reddit.com/r/golang/comments/4iiksx/why_golang_binary_tree_benchmark_score_is_very_low/?ref=search_posts" rel="nofollow">https://www.reddit.com/r/golang/comments/4iiksx/why_golang_binary_tree_benchmark_score_is_very_low/?ref=search_posts</a></p>
<p><a href="https://www.reddit.com/r/golang/comments/51mhzv/on_the_binarytrees_benchmark/?ref=search_posts" rel="nofollow">https://www.reddit.com/r/golang/comments/51mhzv/on_the_binarytrees_benchmark/?ref=search_posts</a></p></pre>GoTheFuckToBed: <pre><p>profile it and find out</p></pre>
这是一个分享于 的资源,其中的信息可能已经有所发展或是发生改变。
入群交流(和以上内容无关):加入Go大咖交流群,或添加微信:liuxiaoyan-s 备注:入群;或加QQ群:692541889
- 请尽量让自己的回复能够对别人有帮助
- 支持 Markdown 格式, **粗体**、~~删除线~~、
`单行代码`
- 支持 @ 本站用户;支持表情(输入 : 提示),见 Emoji cheat sheet
- 图片支持拖拽、截图粘贴等方式上传