A better subslicing algorithm?

agolangf · · 617 次点击

这是一个分享于的资源，其中的信息可能已经有所发展或是发生改变。

I'm working on a subslicing function; here's the best I came up with, but I was wondering if there's a better solution?: <a href="http://play.golang.org/p/2VuoWQKEn4">http://play.golang.org/p/2VuoWQKEn4</a> <pre><code>// Divides `slice` into `n` subslices such that the elements are distributed // as evenly as possible. In other words, if there are 10 elements in `slice`, // and `n` is 3, there will be one subslice with 4 elements and the others will // have only 3. func subslice(slice []int, n int) [][]int { blockSize := len(slice) / n remainder := len(slice) % n subslices := make([]int, n) // It's not enough just to make n subslices of len(slice)/n elements, // because integer division means that n*len(slice)/n != len(slice). For // example, if slice contains 10 elements and n is 4, then we'd be creating // 4 subslices with 10/4=2 elements each, and 2*4 != 10. To work around // this, we'll initially compute the remainder (10%4 = 2) which tells us // how many subslices will need to have len(slice)/n + 1 elements // (10/4+1=3) elements, while the rest will have only len(slice)/n // elements. // // The first for loop creates those slightly-larger subslices, while the // second creates the slightly-smaller subslices. In this way, the // difference between the largest and smallest subslices will be at most 1 // element. for i := 0; i < remainder; i++ { start := i * (blockSize + 1) end := (i + 1) * (blockSize + 1) subslices[i] = slice[start:end] } for i := remainder; i < n; i++ { start := i*blockSize + remainder end := (i+1)*blockSize + remainder subslices[i] = slice[start:end] } return subslices } </code></pre> <hr/>**评论：** TheMerovius: <pre>This is what I could come up off the top of my head :) <a href="http://play.golang.org/p/Hk33xxaFLP">http://play.golang.org/p/Hk33xxaFLP</a> [edit] or, if you want it really concise: <a href="http://play.golang.org/p/MjAFRBH-Jj">http://play.golang.org/p/MjAFRBH-Jj</a> [edit2] As <a href="/u/RalphCorderoy">/u/RalphCorderoy</a> pointed out, the concise version takes a hit on readability and speed. I spent a couple of minutes making it better and I think this is as good as it gets for this algorithm: <a href="http://play.golang.org/p/JbiJLB4_YH">http://play.golang.org/p/JbiJLB4_YH</a>. In particular, preallocating the ret slice gives a very significant speedup.</pre>RalphCorderoy: <pre>Nice, but the concise one has the reader checking the <code>len(s)/n</code> is the same both times, and the compiler ideally needs to spot that too.</pre>TheMerovius: <pre>oh well. Interestingly pulling that out does actually make the code noticably faster (With the range of values I tested I got values between 5% up to an astonishingly 20% of speedup). It's still a micro-optimization, though. In practice it probably won't matter. But yeah, if you want to optimize readability you definitely want to make that function a bit more verbose and add some comments to explain how it works. Though I believe spotting the <code>len(s)/n</code> part is negligible compared to understanding how that code actually works and why (I mean… I don't even really know why it works. It's mainly based on intuition that just happened to create a correct solution ^^ ).</pre>RalphCorderoy: <pre>Why it works is understood by working backwards, as if doing a proof by induction. Splitting n into 1 is simple. Splitting n into 2: If n is even then it's simple. If n is odd then n/2 gives the smaller "half" with the remainder of 1 being in the second "half", and splitting that second half into 1 has already been shown to be simple. n into 3: p=n/3 gives a remainder of 0, 1, or 2, just as n/2 above gave 0, or 1. If 0, it's simple. If 1 then that will be the remainder when the last two "thirds" are tackled when splitting into 2 as 2p is even. If 2 then that will be split and 1 will be added to each of the last two "thirds", thus no remainder when they're divided by 2 next time as 2p+2 divides exactly by 2.</pre>weberc2: <pre>Great work! Thanks!</pre>dgryski: <pre>Nicer than my solution! I might have to my ddmin implementation..</pre>RalphCorderoy: <pre>A variation, showing the pattern of the spread. <a href="https://play.golang.org/p/sBoYW47d1D" rel="nofollow">https://play.golang.org/p/sBoYW47d1D</a></pre>dgryski: <pre>In the past I've used a modification of <a href="https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm" rel="nofollow">https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm</a> to create evenly split subslices. A Go implementation is in <a href="https://github.com/dgryski/go-ddmin/blob/master/ddmin.go#L86" rel="nofollow">https://github.com/dgryski/go-ddmin/blob/master/ddmin.go#L86</a></pre>dchapes: <pre>One thing I'll note about this is that I'd strongly recommend setting the capacity of the returned slices (see <a href="https://golang.org/ref/spec#Slice_expressions" rel="nofollow">the Full slice expressions section of the specification</a>) so that if anything should ever choose to append to them the "correct" thing will happen. I don't think there is any downside to doing this as it adds no measurable expense/effort and I can't imagine any circumstance where someone appending to the non-last returned slice should overwrite data in the other returned slices. <pre><code>// e.g. everywhere with either of these: x := s[:i] y := s[i:j] // should instead be: x := s[:i:i] y := [i:j:j] </code></pre> Example: <a href="https://play.golang.org/p/P2vmdUcNMX" rel="nofollow">https://play.golang.org/p/P2vmdUcNMX</a> Example with appends: <a href="https://play.golang.org/p/oRbtTs1974" rel="nofollow">https://play.golang.org/p/oRbtTs1974</a> You could of course leave any extra capacity of the input slice in the last of the returned slices.</pre>weberc2: <pre>These slices are read only. Just breaking up the slice so it can be processed in parallel.</pre>dchapes: <pre><blockquote> These slices are read only </blockquote> As you know, there is no such thing in Go as a read only slice. Setting the capacity changes it from "unlike every other slice in Go, if you append to one of these it's a bug inducing mistake" to just "if you append to one of these slices it will require reallocation". The first is massively error prone and since the solution is trivial I'd call it a bug to not do so.</pre>weberc2: <pre><blockquote> As you know, there is no such thing in Go as a read only slice. </blockquote> Sure there is. If you only read from it, it's read only. ;) In my last comment, I meant that I was the only one using this method, and I was using it for throw-away benchmarking code. But I don't doubt that everything you said is correct, and thanks again for tipping me off. :)</pre>

入群交流（和以上内容无关）：加入Go大咖交流群，或添加微信：liuxiaoyan-s 备注：入群；或加QQ群：692541889

617 次点击

加入收藏微博

slice

github

0 回复

添加一条新回复（您需要登录后才能回复没有账号？）

请尽量让自己的回复能够对别人有帮助
支持 Markdown 格式, **粗体**、~~删除线~~、`单行代码`
支持 @ 本站用户；支持表情（输入 : 提示），见 Emoji cheat sheet
图片支持拖拽、截图粘贴等方式上传

A better subslicing algorithm?

用户登录

今日阅读排行

一周阅读排行

最新主题