<p>When should we return []*T and when []T?</p>
<p>Edit: Removed the article for less confusion.</p>
<hr/>**Comments:**<br/><br/>lobster_johnson: <pre><p>There are different use cases based on the pros and cons, and they're nearly the same as for the non-slice versions (i.e. <code>T</code> vs. <code>*T</code>).</p>
<p><code>[]T</code>:</p>
<ul>
<li>Pro: Contiguous in memory with respect to T (the runtime will allocate <code>sizeof(T) * cap</code>), which increases cache locality. For example, a loop over <code>[]int</code> can be very efficient and potentially even be vectorized.</li>
<li>Con: To access a slice element, it has to be copied, which is more expensive than passing a pointer around. Similarly, modifying an element requires copying the element, modifying it and then copying it back.</li>
</ul>
<p><code>[]*T</code>:</p>
<ul>
<li>Pro: No copying needed in order to read/write elements.</li>
<li>Con: Requires an indirection to dereference the stored pointer, which can point anywhere in RAM and will be unlikely to take advantage of cache locality.</li>
</ul>
<p>Cache locality also includes <a href="http://www.futurechips.org/chip-design-for-all/prefetching.html">RAM prefetching</a>; modern CPU architectures are complicated, but sequential access is generally faster than random access.</p>
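<p>To make the copying trade-off above concrete, here is a minimal sketch. The <code>Point</code> type and the three helper functions are hypothetical names made up for illustration. It shows that ranging over a <code>[]T</code> by value works on copies, while indexing into the slice or ranging over a <code>[]*T</code> reaches the original values.</p>

```go
package main

import "fmt"

// Point is a hypothetical element type used only for illustration.
type Point struct{ X, Y int }

// bumpByValue ranges over copies of the elements, so the slice is unchanged.
func bumpByValue(ps []Point) {
	for _, p := range ps {
		p.X++ // modifies the loop variable, which is a copy of the element
	}
}

// bumpInPlace indexes into the slice, so each element is modified in place.
func bumpInPlace(ps []Point) {
	for i := range ps {
		ps[i].X++
	}
}

// bumpPointers ranges over copies of the pointers, which still refer
// to the original Point values.
func bumpPointers(ps []*Point) {
	for _, p := range ps {
		p.X++
	}
}

func main() {
	vals := []Point{{X: 1, Y: 1}, {X: 2, Y: 2}}
	bumpByValue(vals)
	fmt.Println(vals[0].X, vals[1].X) // still 1 2
	bumpInPlace(vals)
	fmt.Println(vals[0].X, vals[1].X) // now 2 3

	ptrs := []*Point{{X: 1, Y: 1}, {X: 2, Y: 2}}
	bumpPointers(ptrs)
	fmt.Println(ptrs[0].X, ptrs[1].X) // now 2 3
}
```

<p>Whether the copy in the by-value loop matters in practice depends on the size of the element type, so it is worth measuring before switching representations.</p>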
<hr/>
<p>I would recommend using <code>[]T</code> unless you have a specific reason to want to minimize copying. For example, let's say we have this:</p>
<pre><code>type Document struct {
	// Lots of fields here, making Document large
}

func ClassifyDocuments(docs []Document) map[string][]Document
</code></pre>
<p>Imagine we want to "classify" the documents based on some heuristic, such as dividing them into topics like "business", "sports", and so on. There's one input slice, and a map of topics to output slices. Some documents may belong to multiple topics, though, and by using <code>[]Document</code> we're potentially duplicating each document multiple times, which is wasteful. So we should probably do this instead:</p>
<pre><code>func ClassifyDocuments(docs []*Document) map[string][]*Document
</code></pre>
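<p>A minimal sketch of what such a function could look like. The <code>Title</code> and <code>Body</code> fields and the <code>topicKeywords</code> heuristic are assumptions made up for illustration; the post does not say how classification actually works.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// Document stands in for the large struct from the example;
// the Title and Body fields are assumptions made for this sketch.
type Document struct {
	Title string
	Body  string
}

// topicKeywords is a hypothetical keyword heuristic, not part of the original post.
var topicKeywords = map[string]string{
	"business": "market",
	"sports":   "match",
}

// ClassifyDocuments groups documents by topic. Each result slice holds
// pointers to the input documents, so no Document value is copied.
func ClassifyDocuments(docs []*Document) map[string][]*Document {
	out := make(map[string][]*Document)
	for _, d := range docs {
		for topic, kw := range topicKeywords {
			if strings.Contains(d.Body, kw) {
				out[topic] = append(out[topic], d)
			}
		}
	}
	return out
}

func main() {
	docs := []*Document{
		{Title: "A", Body: "the market reacted to the match"},
		{Title: "B", Body: "a quiet market"},
	}
	byTopic := ClassifyDocuments(docs)
	fmt.Println(len(byTopic["business"]), len(byTopic["sports"])) // 2 1
	// The same *Document appears under both topics.
	fmt.Println(byTopic["business"][0] == byTopic["sports"][0]) // true
}
```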
<p>This allows the result to simply point to the same documents as the input. Except for allocating the <code>map</code> and slices in the result, it's possible that this function doesn't need to allocate anything at all on the heap.</p></pre>connor4312: <pre><p>Also note that you can take a pointer to a slice element, which avoids the copying and lets you call pointer methods on the type, while still maintaining that lovely chunk of contiguous memory. Example: <a href="https://play.golang.org/p/pz0JVHj2dQ">https://play.golang.org/p/pz0JVHj2dQ</a></p>
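<p>In case the playground link is unavailable, here is a small sketch of the same idea. It is not the linked code; <code>Counter</code>, <code>Inc</code>, and <code>incAll</code> are hypothetical names chosen for illustration.</p>

```go
package main

import "fmt"

// Counter is a hypothetical type with a pointer-receiver method.
type Counter struct{ N int }

// Inc mutates the receiver, so it needs a pointer receiver.
func (c *Counter) Inc() { c.N++ }

// incAll calls the pointer method on every element of a []Counter
// by taking the address of each element instead of copying it.
func incAll(cs []Counter) {
	for i := range cs {
		c := &cs[i] // pointer into the slice's backing array
		c.Inc()
	}
}

func main() {
	counters := make([]Counter, 3)
	incAll(counters)
	counters[0].Inc() // slice elements are addressable, so this also works
	fmt.Println(counters) // [{2} {1} {1}]
}
```

<p>One caveat: a pointer obtained this way refers to the current backing array. If a later <code>append</code> reallocates the slice, the old pointer keeps pointing at the old array and no longer tracks the slice's elements.</p>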
<p>The small downside is that (at least the last time I checked) in cases where the slice could otherwise be allocated on the stack, taking pointers to elements will cause Go to allocate it on the heap.</p></pre>lobster_johnson: <pre><p>Good point, and also in a loop to avoid copying via a <code>range</code>:</p>
<pre><code>for i := range things {
	thing := &things[i] // points at the element in place; no copy is made
	// ... use thing ...
}
</code></pre></pre>zemo: <pre><blockquote>
<p>modifying an element requires copying the element, modifying it and then copying it back.</p>
</blockquote>
<p>would this perform a copy?</p>
<p><code>items[8].x = 10</code></p></pre>xiegeo: <pre><p>Use []*T when you need to use []*T. Otherwise just use []T. </p>
<p>The rules are the same as using *T vs T.</p></pre>nhooyr: <pre><p>I've always used <code>*T</code> as the default for my methods; I thought it was the opposite. Use <code>*T</code> unless you need to use <code>T</code>.</p></pre>xiegeo: <pre><p>*T is a must for methods that modify T, or if you want *T to implement an interface, so methods tend to get called on pointers.</p></pre>nhooyr: <pre><p>But if I'm not modifying it, I should use <code>T</code> by default unless I profile and find the receiver is large enough that it is causing issues?</p></pre>xiegeo: <pre><p>For method receivers, this sums it up nicely:
<a href="https://golang.org/doc/faq#methods_on_values_or_pointers" rel="nofollow">https://golang.org/doc/faq#methods_on_values_or_pointers</a></p>
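<p>As a rough illustration of the receiver choice, here is a minimal sketch. <code>Resetter</code>, <code>Temp</code>, and <code>Buffer</code> are hypothetical names, not taken from the FAQ. A small named type uses value receivers, while a struct with a modifying method uses pointer receivers throughout.</p>

```go
package main

import "fmt"

// Resetter is a hypothetical interface used only for this sketch.
type Resetter interface{ Reset() }

// Temp is a named simple type; a value receiver is enough for read-only methods.
type Temp int

// Fahrenheit treats the value as degrees Celsius and converts it.
func (t Temp) Fahrenheit() float64 { return float64(t)*9/5 + 32 }

// Buffer is a struct whose Reset method modifies it, so all of its
// methods use a pointer receiver for consistency.
type Buffer struct{ data []byte }

// Reset empties the buffer but keeps the allocated storage.
func (b *Buffer) Reset() { b.data = b.data[:0] }

// Len reports the number of bytes currently held.
func (b *Buffer) Len() int { return len(b.data) }

func main() {
	buf := &Buffer{data: []byte("abc")}
	var r Resetter = buf // *Buffer implements Resetter; a plain Buffer value does not
	r.Reset()
	fmt.Println(buf.Len())              // 0
	fmt.Println(Temp(100).Fahrenheit()) // 212
}
```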
<p>For me, if T is a rename of a simple type such as int, then I use T. But if T is a struct, then I use *T in case I want to add a modifying method, and the remaining non-modifying methods should be consistent with it.</p></pre>sh41: <pre><p>Relevant discussion in the <code>go-github</code> library, started by Russ Cox:</p>
<p><a href="https://github.com/google/go-github/issues/180">https://github.com/google/go-github/issues/180</a></p></pre>nesigma: <pre><p>Nice find! That settles it.</p>
<p>So apparently the correct answer is that it depends on the size of T. When dealing with large structs it is better to use []*T like Russ Cox recommends especially because of the code that will iterate on that slice.</p>
<p>It's finally crystal clear in my head. Thank you.</p></pre>uncle_bad_touches: <pre><p>Less memory fragmentation and fewer pointers to GC?</p></pre>kl0nos: <pre><p>When you copy a slice out of a function you are not copying any data from the slice; you are copying a pointer that points to that data. Just use []T for slices.</p></pre>nesigma: <pre><blockquote>
<p>Just use []T for slices.</p>
</blockquote>
<p>When is it more appropriate to return []*T?</p></pre>Deltigre: <pre><p>I used it temporarily for an object pool (to avoid excessive GC) but quickly changed to a linked list + free object stack implementation.</p>
<p>Edit: another related use I can think of is maintaining small struct size with an array of pointers. Or you might want a list of objects you wish to mutate. Obviously these are all specialized use cases.</p></pre>materialdesigner: <pre><p>Isn't sync.Pool for this purpose?</p></pre>Deltigre: <pre><p>From the docs: </p>
<blockquote>
<p>On the other hand, a free list maintained as part of a short-lived object is not a suitable use for a Pool, since the overhead does not amortize well in that scenario. It is more efficient to have such objects implement their own free list.</p>
</blockquote>
<p>I use it to manage the per-routine pools because it includes synchronization, but the in-routine pools are singly-linked structs, placed in a stack when free.</p></pre>kl0nos: <pre><p>Don't use a slice of pointers unless you really have to. It's another level of indirection, and it will hurt your cache and prefetcher. Whenever you can, just use []T instead of []*T.</p></pre>: <pre><p>[deleted]</p></pre>DocMerlin: <pre><p>No, an array is NOT a collection of pointers. It is a bunch of objects in memory.
A slice is a pointer to an array, a length, and a capacity.</p></pre>Remi1115: <pre><p>I can't find any source that supports your statement.</p>
<p>Could it be that you're confusing slices, which contain a pointer to the underlying array?</p></pre>