What is considered a "large" struct?

xuanbao · · 9 views
<p>I was reading a post on pointer vs. non-pointer method receivers, and one of the reasons given for passing by reference as opposed to by value is:</p> <p>&#34;The struct is very large and a deep copy is expensive&#34;</p> <p>Now this may be more of a general programming question, but at what point would a data structure be considered &#34;large&#34;? Would it be in the hundreds or thousands of objects, depending on type? At what point would the performance be noticeably affected? Or how could you check the performance?</p> <p>I suppose the same would apply to passing by value or by reference to functions.</p> <hr/>**Comments:**<br/><br/>dgryski: <pre><p>I wouldn&#39;t worry about it until you see memcpy turning up in your profiles.</p></pre>nsd433: <pre><p>When you profile both and see that the copy caused by pass-by-value is more expensive than the alloc and GC that pass-by-reference sometimes causes.</p> <p>I&#39;d guess it depends on the CPU and ABI too, since that determines how large a struct may be passed in registers.</p></pre>dgryski: <pre><p>The Go ABI requires all parameters to be pushed on the stack. A register-based calling convention is an open bug: <a href="https://github.com/golang/go/issues/18597" rel="nofollow">https://github.com/golang/go/issues/18597</a></p></pre>nsd433: <pre><p>Thanks, I didn&#39;t know the compiler didn&#39;t use registers to pass arguments.</p></pre>jackmott2: <pre><p>The cutoff for passing by reference vs. value, when performance is the only concern, is roughly 16-24 bytes in a 32-bit context, likely more in a 64-bit context. 
Bigger than that is when it tends to be better to pass by reference.</p> <p>But that is just a rule of thumb and may depend greatly on how you are using them, what the rest of your code is doing, the hardware, and so on.</p> <p>EDIT: I updated some things after considering 32-bit vs. 64-bit and realizing the rule of thumb I was quoting was from quite a while ago, when 32-bit was the norm.</p></pre>drvd: <pre><p>That for sure is wrong in most cases.</p></pre>jackmott2: <pre><p>Can you expound on how you know/why you think that?</p></pre>drvd: <pre><p>Well, formally your statement might be okay: the less you pass, the faster the code will be. But in almost all cases you do not just pass arguments, you also <em>use</em> them. Try benchmarking real code where you actually <em>use</em> the values passed and you&#39;ll be surprised. (Modern architectures offer a wide range of unintuitive performance characteristics; they are pretty fast at handling contiguous memory and lose much more to cache misses than they gain by not copying 20 bytes.) </p></pre>jackmott2: <pre><p>I&#39;m very familiar with the importance of memory locality. We should do some benchmarks!</p></pre>jackmott2: <pre><p>Actually you are right, the 16-byte rule of thumb is an old one. I&#39;ve updated my reply.</p></pre>Obsessionman: <pre><p>That is a lot smaller than I would&#39;ve expected. </p> <pre><code>type Coord3d struct { X, Y, Z int64 } </code></pre> <p>So something like this at 24 bytes would generally perform better when passed by reference? In practice, would it matter?</p> <p>Also, what about a struct that contains a slice or map with several elements? I know that slices and maps pass a header rather than their elements, so should they be considered?</p></pre>robe_and_wizard_hat: <pre><p>Keep in mind that passing by value does a copy on the stack, which is incredibly cheap. 
If you get to the point where your bottleneck has to do with copying values versus copying pointers, you&#39;re in a very good place. </p></pre>JackOhBlades: <pre><p>You&#39;d be copying a pointer value, which means indirection is required to get at the values.</p> <p>The copy itself may be faster using a pointer value, but you&#39;re making access to the values &#39;slower&#39; and less cache friendly because of the indirection. </p> <p>I think. </p></pre>metamatic: <pre><p>Though depending on when the values were written to the struct, they may still be in L1 cache when you reach the destination code, since a recent CPU will likely have 32KiB of L1 cache with 4-cycle latency. So it&#39;s complicated...</p></pre>jackmott2: <pre><p>Right, when you pass the slice you are passing just the slice info, not the contents. For something like a Coord3d it may be best to use values, because they will be contiguous in memory, rather than iterating over pointers to random locations in the heap.</p> <p>Though if you really want top performance you may want separate arrays of Xs, Ys, and Zs, depending on how you access them.</p></pre>1Gijs: <pre><p>Please beware: </p> <ul> <li>premature optimization is the root of all evil (or at least most of it) in programming</li> </ul> <p>Donald Knuth, Computer Programming as an Art, 1974</p></pre>drvd: <pre><p>7.</p> <p>No really. Measure. &#34;Large&#34; is context dependent. </p></pre>go-li: <pre><p>Of course it&#39;s context dependent. Why wouldn&#39;t it be?</p></pre>mdaffin: <pre><p>For anything performance related, <strong>benchmark</strong> both solutions and see which is faster. There is no hard cutoff for when one is better than the other, and the only real way to tell is to actually test the performance of your application.</p></pre>alasijia: <pre><p>Just try to use non-pointer method receivers.</p> <p>Direct answer to your questions: it depends. &#34;&gt;= five native words&#34; is my standard.</p></pre>