A better subslicing algorithm?

agolangf · · 607 次点击    
这是一个分享于 的资源,其中的信息可能已经有所发展或是发生改变。
<p>I&#39;m working on a subslicing function; here&#39;s the best I came up with, but I was wondering if there&#39;s a better solution?:</p> <p><a href="http://play.golang.org/p/2VuoWQKEn4">http://play.golang.org/p/2VuoWQKEn4</a></p> <pre><code>// Divides `slice` into `n` subslices such that the elements are distributed // as evenly as possible. In other words, if there are 10 elements in `slice`, // and `n` is 3, there will be one subslice with 4 elements and the others will // have only 3. func subslice(slice []int, n int) [][]int { blockSize := len(slice) / n remainder := len(slice) % n subslices := make([]int, n) // It&#39;s not enough just to make n subslices of len(slice)/n elements, // because integer division means that n*len(slice)/n != len(slice). For // example, if slice contains 10 elements and n is 4, then we&#39;d be creating // 4 subslices with 10/4=2 elements each, and 2*4 != 10. To work around // this, we&#39;ll initially compute the remainder (10%4 = 2) which tells us // how many subslices will need to have len(slice)/n + 1 elements // (10/4+1=3) elements, while the rest will have only len(slice)/n // elements. // // The first for loop creates those slightly-larger subslices, while the // second creates the slightly-smaller subslices. In this way, the // difference between the largest and smallest subslices will be at most 1 // element. for i := 0; i &lt; remainder; i++ { start := i * (blockSize + 1) end := (i + 1) * (blockSize + 1) subslices[i] = slice[start:end] } for i := remainder; i &lt; n; i++ { start := i*blockSize + remainder end := (i+1)*blockSize + remainder subslices[i] = slice[start:end] } return subslices } </code></pre> <hr/>**评论:**<br/><br/>TheMerovius: <pre><p>This is what I could come up off the top of my head :) <a href="http://play.golang.org/p/Hk33xxaFLP">http://play.golang.org/p/Hk33xxaFLP</a></p> <p>[edit] or, if you want it <em>really</em> concise: <a href="http://play.golang.org/p/MjAFRBH-Jj">http://play.golang.org/p/MjAFRBH-Jj</a></p> <p>[edit2] As <a href="/u/RalphCorderoy">/u/RalphCorderoy</a> pointed out, the concise version takes a hit on readability and speed. I spent a couple of minutes making it better and I think this is as good as it gets for this algorithm: <a href="http://play.golang.org/p/JbiJLB4_YH">http://play.golang.org/p/JbiJLB4_YH</a>. In particular, preallocating the ret slice gives a very significant speedup.</p></pre>RalphCorderoy: <pre><p>Nice, but the concise one has the reader checking the <code>len(s)/n</code> is the same both times, and the compiler ideally needs to spot that too.</p></pre>TheMerovius: <pre><p>oh well.</p> <p>Interestingly pulling that out <em>does</em> actually make the code noticably faster (With the range of values I tested I got values between 5% up to an astonishingly 20% of speedup). It&#39;s still a micro-optimization, though. In practice it probably won&#39;t matter.</p> <p>But yeah, if you want to optimize readability you definitely want to make that function a bit more verbose and add some comments to explain how it works. Though I believe spotting the <code>len(s)/n</code> part is negligible compared to understanding how that code actually works and why (I mean… I don&#39;t even <em>really</em> know why it works. It&#39;s mainly based on intuition that just happened to create a correct solution ^^ ).</p></pre>RalphCorderoy: <pre><p>Why it works is understood by working backwards, as if doing a proof by induction.</p> <p>Splitting n into 1 is simple.</p> <p>Splitting n into 2: If n is even then it&#39;s simple. If n is odd then n/2 gives the smaller &#34;half&#34; with the remainder of 1 being in the second &#34;half&#34;, and splitting that second half into 1 has already been shown to be simple.</p> <p>n into 3: p=n/3 gives a remainder of 0, 1, or 2, just as n/2 above gave 0, or 1. If 0, it&#39;s simple. If 1 then that will be the remainder when the last two &#34;thirds&#34; are tackled when splitting into 2 as 2p is even. If 2 then that will be split and 1 will be added to each of the last two &#34;thirds&#34;, thus no remainder when they&#39;re divided by 2 next time as 2p+2 divides exactly by 2.</p></pre>weberc2: <pre><p>Great work! Thanks!</p></pre>dgryski: <pre><p>Nicer than my solution! I might have to my ddmin implementation..</p></pre>RalphCorderoy: <pre><p>A variation, showing the pattern of the spread. <a href="https://play.golang.org/p/sBoYW47d1D" rel="nofollow">https://play.golang.org/p/sBoYW47d1D</a></p></pre>dgryski: <pre><p>In the past I&#39;ve used a modification of <a href="https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm" rel="nofollow">https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm</a> to create evenly split subslices. A Go implementation is in <a href="https://github.com/dgryski/go-ddmin/blob/master/ddmin.go#L86" rel="nofollow">https://github.com/dgryski/go-ddmin/blob/master/ddmin.go#L86</a></p></pre>dchapes: <pre><p>One thing I&#39;ll note about this is that I&#39;d strongly recommend setting the capacity of the returned slices (see <a href="https://golang.org/ref/spec#Slice_expressions" rel="nofollow">the Full slice expressions section of the specification</a>) so that if anything should ever choose to append to them the &#34;correct&#34; thing will happen. I don&#39;t think there is any downside to doing this as it adds no measurable expense/effort and I can&#39;t imagine any circumstance where someone appending to the non-last returned slice should overwrite data in the other returned slices.</p> <pre><code>// e.g. everywhere with either of these: x := s[:i] y := s[i:j] // should instead be: x := s[:i:i] y := [i:j:j] </code></pre> <p>Example: <a href="https://play.golang.org/p/P2vmdUcNMX" rel="nofollow">https://play.golang.org/p/P2vmdUcNMX</a><br/> Example with appends: <a href="https://play.golang.org/p/oRbtTs1974" rel="nofollow">https://play.golang.org/p/oRbtTs1974</a></p> <p>You could of course leave any extra capacity of the input slice in the last of the returned slices.</p></pre>weberc2: <pre><p>These slices are read only. Just breaking up the slice so it can be processed in parallel.</p></pre>dchapes: <pre><blockquote> <p>These slices are read only</p> </blockquote> <p>As you know, there is no such thing in Go as a read only slice.</p> <p>Setting the capacity changes it from &#34;unlike every other slice in Go, if you append to one of these it&#39;s a bug inducing mistake&#34; to just &#34;if you append to one of these slices it will require reallocation&#34;. The first is massively error prone and since the solution is trivial I&#39;d call it a bug to not do so.</p></pre>weberc2: <pre><blockquote> <p>As you know, there is no such thing in Go as a read only slice.</p> </blockquote> <p>Sure there is. If you only read from it, it&#39;s read only. ;) In my last comment, I meant that I was the only one using this method, and I was using it for throw-away benchmarking code. But I don&#39;t doubt that everything you said is correct, and thanks again for tipping me off. :)</p></pre>

入群交流(和以上内容无关):加入Go大咖交流群,或添加微信:liuxiaoyan-s 备注:入群;或加QQ群:692541889

607 次点击  
加入收藏 微博
暂无回复
添加一条新回复 (您需要 登录 后才能回复 没有账号 ?)
  • 请尽量让自己的回复能够对别人有帮助
  • 支持 Markdown 格式, **粗体**、~~删除线~~、`单行代码`
  • 支持 @ 本站用户;支持表情(输入 : 提示),见 Emoji cheat sheet
  • 图片支持拖拽、截图粘贴等方式上传