Most efficient way to repeatedly concatenate strings and write them to an io.Writer

Shared by xuanbao
<p>I have the following code that I'd like to optimize:</p> <pre><code>package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

var strings []string

func main() {
	strings = []string{"a", "b", "c", "d"}
	concat1(os.Stdout)
	fmt.Println()
	concat2(os.Stdout)
}

// concat1 builds the whole output in a bytes.Buffer, then writes it once.
func concat1(w io.Writer) {
	var buf bytes.Buffer
	buf.WriteString("START")
	for _, v := range strings {
		buf.WriteString(v)
		buf.WriteString("-")
	}
	buf.WriteString("END")
	// "Flush"
	w.Write(buf.Bytes())
}

// concat2 writes each piece directly, converting every string to []byte.
func concat2(w io.Writer) {
	w.Write([]byte("START"))
	for _, v := range strings {
		w.Write([]byte(v))
		w.Write([]byte("-"))
	}
	w.Write([]byte("END"))
}
</code></pre> <p>What is the most efficient way to concatenate <code>strings</code> in terms of CPU and memory usage: <code>concat1</code>, <code>concat2</code>, or something else?</p> <p>My gut tells me it's probably <code>concat1</code>, because the strings don't have to be "cast" to <code>[]byte</code>. On the other hand, a <code>bytes.Buffer</code> needs to be created.</p> <p><a href="https://play.golang.org/p/BkZTGyDZ2X">Link to playground</a></p> <p>Is there a more idiomatic/efficient way of doing this?</p> <p>Thank you!</p> <p>EDIT: For anyone interested, see <a href="https://www.reddit.com/r/golang/comments/45jhs0/most_efficient_way_to_repeatedly_concatenate/czye11e">this comment</a> for benchmark results.</p> <hr/>**Comments:**<br/><br/>
jeffrallen: <pre><p>The first rule of optimization is: don't do it. At least not until you have evidence (usually from a CPU or memory profiler) that proves you need to.</p> <p>The simplest way to do this in the standard library is strings.Join. </p> <p>However, using bytes.Buffer and WriteString is also idiomatic. 
The final write into the io.Writer is fine.</p> <p>It is not idiomatic to pass the strings slice in via a global, but I'm sure you know that...</p> <p>Keep in mind that for very high performance, you need to be careful to avoid making garbage. Every []byte(string) conversion makes a copy, thus more garbage...</p> <p>You might be interested in the special behavior of the append() builtin when appending a string to a []byte (the language spec covers it under "Appending to and copying slices"). You might also like to experiment with append and watch how the resulting buffer's capacity goes up. Go is careful to allocate extra space so that a series of append() calls does less copying than you might worry about. So your concat loop could be as simple as <code>for _, str := range strings { buf = append(buf, str...); buf = append(buf, '-') }</code>.</p> <p>-jeff</p></pre>
sroeuouay: <pre><p>First, thank you for your thorough answer.</p> <p>I'm slowly getting back into Go. This code is from a toy project of mine, and I really like trying to get the best performance in terms of CPU or RAM usage, something I find particularly hard and not enjoyable in languages like Ruby, for example.</p> <p>So optimizing this <code>concat</code> is really just for fun.</p> <blockquote> <p>Keep in mind that for very high performance, you need to be careful to avoid making garbage. Every []byte(string) conversion makes a copy, thus more garbage...</p> </blockquote> <p>Are there any good/recommended articles on this subject, i.e. on finding the right balance between readability, idiomatic-ness(?) and speed?</p></pre>
TheMerovius: <pre><p>You have 90% of a benchmark right there. Now, just execute it.</p> <p>Benchmarking Go code is <em>really</em> easy. Read the Benchmarks section of the <a href="http://godoc.org/testing">testing package</a> documentation. Benchmarking both will give you a definitive answer for your specific use case and data. 
Instead of just guessing :)</p></pre>
TheMerovius: <pre><p>That being said, my bet is on <a href="https://play.golang.org/p/rKyNNqP3jh" rel="nofollow">something like this</a>, because it's the minimum number of allocations and copies possible, and ultimately those will dominate.</p></pre>
sroeuouay: <pre><p>Your solution is officially the most efficient for this specific example:</p> <pre><code>BenchmarkConcatBytesBuffer-4     3000000     552 ns/op     224 B/op     2 allocs/op
BenchmarkConcatBufIO-4           1000000    1362 ns/op    4208 B/op     2 allocs/op
BenchmarkConcatStringsJoin-4     3000000     519 ns/op     320 B/op     4 allocs/op
BenchmarkConcatWriter-4          1000000    1655 ns/op     544 B/op    28 allocs/op
BenchmarkConcatCopy-4            5000000     294 ns/op     176 B/op     2 allocs/op
</code></pre> <p><a href="https://play.golang.org/p/Y2tFczM5Nj" rel="nofollow">Link to playground</a></p></pre>
TheMerovius: <pre><p>To reduce noise, I would suggest benchmarking against <a href="http://godoc.org/io/ioutil#Discard" rel="nofollow">ioutil.Discard</a>. It doesn't change a lot, though. Before:</p> <pre><code>BenchmarkConcatBytesBuffer-4     3000000     557 ns/op     224 B/op     2 allocs/op
BenchmarkConcatBufIO-4           1000000    1495 ns/op    4208 B/op     2 allocs/op
BenchmarkConcatStringsJoin-4     3000000     511 ns/op     320 B/op     4 allocs/op
BenchmarkConcatWriter-4          1000000    1659 ns/op     544 B/op    28 allocs/op
BenchmarkConcatCopy-4            5000000     291 ns/op     176 B/op     2 allocs/op
</code></pre> <p>After:</p> <pre><code>BenchmarkConcatBytesBuffer-4     3000000     476 ns/op     112 B/op     1 allocs/op
BenchmarkConcatBufIO-4           1000000    1366 ns/op    4096 B/op     1 allocs/op
BenchmarkConcatStringsJoin-4     3000000     425 ns/op     208 B/op     3 allocs/op
BenchmarkConcatWriter-4          1000000    1342 ns/op     432 B/op    27 allocs/op
BenchmarkConcatCopy-4           10000000     195 ns/op      64 B/op     1 allocs/op
</code></pre></pre>
sroeuouay: <pre><p>Good point, thank you.</p> <p>I will update this post in a few minutes with the proper benchmarks.</p> <p>While I'm at it, is there an easy way to "benchmark" memory usage/allocations/GC? 
AFAIK the testing package only gives the <code>ns/op</code>.</p></pre>
TheMerovius: <pre><p><code>go test -bench=. -benchmem</code></p></pre>
sroeuouay: <pre><p>This is absolutely amazing, thank you!</p></pre>
earthboundkid: <pre><p>Just to add to your indecision: you could also wrap Stdout in a <a href="https://godoc.org/bufio#Writer" rel="nofollow">bufio.Writer</a> and flush that. Ah, the joys of perfectionism! </p></pre>
sroeuouay: <pre><p>Oh pretty cool!</p> <p>I'm currently wrapping my head around everything that has been said in this thread, and then I'll post some benchmarks.</p></pre>
: <pre><p>[deleted]</p></pre>
earthboundkid: <pre><p>This isn't preferred because strings are immutable, so you end up allocating a lot of space for the different substring combinations. Just use strings.Join. </p></pre>
mc_hammerd: <pre><p>thanks</p></pre>
earthboundkid: <pre><p>Always benchmark if it really matters, but <code>strings.Join</code> is fairly efficient because it allocates exactly as much space as needed for the concatenation. Buffers are good if you don't know how much space you'll need, but you may end up creating a lot of garbage that way. </p></pre>
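
For readers who cannot open the playground links, here is a minimal sketch of two lower-allocation ideas raised in the thread. It is not the code behind those links, which is not reproduced here. The first function assumes that "the minimum number of allocations and copies" means computing the exact output size, allocating one buffer, and copying each piece into it. The second uses the append special case for strings that jeffrallen mentions. The names `buildPresized`, `buildAppend`, and `concat` are hypothetical and chosen only for this illustration.

```go
package main

import (
	"io"
	"os"
)

// buildPresized (hypothetical name) computes the exact output length first,
// allocates a single buffer of that size, and copies every piece into it.
func buildPresized(strs []string) []byte {
	const prefix, sep, suffix = "START", "-", "END"

	n := len(prefix) + len(suffix)
	for _, s := range strs {
		n += len(s) + len(sep)
	}

	buf := make([]byte, n)
	i := copy(buf, prefix) // copy accepts a string source for a []byte destination
	for _, s := range strs {
		i += copy(buf[i:], s)
		i += copy(buf[i:], sep)
	}
	copy(buf[i:], suffix)
	return buf
}

// buildAppend (hypothetical name) grows a []byte with append instead.
// The `s...` form appends the bytes of a string without an explicit
// []byte(s) conversion; the slice may be reallocated as it grows.
func buildAppend(strs []string) []byte {
	var buf []byte
	buf = append(buf, "START"...)
	for _, s := range strs {
		buf = append(buf, s...)
		buf = append(buf, '-')
	}
	buf = append(buf, "END"...)
	return buf
}

// concat takes the strings as a parameter rather than reading a global,
// and hands the finished buffer to w in a single Write call.
func concat(w io.Writer, strs []string) error {
	_, err := w.Write(buildPresized(strs))
	return err
}

func main() {
	if err := concat(os.Stdout, []string{"a", "b", "c", "d"}); err != nil {
		panic(err)
	}
}
```

Whether either variant actually beats `bytes.Buffer` or `strings.Join` for a given input is something to measure with `go test -bench=. -benchmem`, as suggested above, rather than assume.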
