[help] Tagged unions with unsafe memory chunks and GC

polaris · 697 views
<p>Hi,</p> <p>I&#39;m considering Go as the target language for a compiler. It provides memory layout control, a GC, and a scheduler.</p> <p>In some cases I have to store things in untyped blobs of memory. For example, my language provides sum types, implemented as tagged unions; the variant data is stored in an array of bytes and then, depending on the tag, hard-cast to the corresponding type via unsafe.Pointer. See:</p> <p><a href="http://play.golang.org/p/r6TNAan58M" rel="nofollow">http://play.golang.org/p/r6TNAan58M</a></p> <p>As you can check, that example is actually broken. The area printed in line 88 isn&#39;t right. That&#39;s because the garbage collection triggered in line 86 frees the memory behind the two pointers that are stored in S as plain bytes; to be precise, the GC stops tracking the pointers once they go into the array of bytes. Here&#39;s a more concrete example of the issue:</p> <p><a href="http://play.golang.org/p/PMhfQpRHz2" rel="nofollow">http://play.golang.org/p/PMhfQpRHz2</a></p> <p>So, the only workaround I can think of is holding a reference to the &#34;payload&#34; in an empty interface. ( <a href="http://play.golang.org/p/aHMfxqipNS" rel="nofollow">http://play.golang.org/p/aHMfxqipNS</a> ) This defeats the whole purpose of implementing sum types as tagged unions, since every payload will go into the heap anyway, so you might as well use the &#34;boxed&#34; interface{} version to discriminate with a type switch.</p> <p>Do you have any better idea for telling the GC that there is a live reference in there?</p> <p>Thanks in advance!</p> <p>PS. I was trying to post this on golang-nuts. I&#39;m not allowed to do that for some reason. I&#39;m posting this here while they figure it out.</p> <hr/>**Comments:**<br/><br/>jerf: <pre><blockquote> <p>since every payload will go into the heap anyway, so you might as well use the &#34;boxed&#34; interface{} version to discriminate with a type switch.</p> </blockquote> <p>Sometimes I like to call Go a modern scripting language. 
For a scripting language, Go gives you a significant amount of control over whether something ends up on the stack or heap, given that up until pretty recently all scripting languages put everything on the heap, unconditionally.</p> <p>However, for a systems language, Go gives you essentially no control over where something goes, or at least not with any reasonable degree of effort. And you&#39;ll always be at the mercy of any changes in future releases. Go isn&#39;t really suitable for tasks where you care about that very deeply.</p> <p>The way I see it, you&#39;ve got two choices: Care less (no sarcasm!), in which case, yes, the <code>interface{}</code> solution is basically as good as you&#39;re going to get. Since you&#39;ve budgeted a full <code>int</code> for the tag you might as well let that be a runtime-managed type pointer anyhow.</p> <p>Or, alternatively, don&#39;t use Go as your backend.</p> <p>Honestly, I lean toward the second one. I question the long-term viability of any project built against Go in its current state, precisely because you care about a lot of details that are in <em>massive</em> flux right now. If things stabilized in another 3 years you might be OK, but you&#39;re in for a world of pain trying to keep up with Go as a moving target right now. This is only the first of many places where Go is going to fight you. Go&#39;s runtime is pretty deeply Go, and lacks the abstraction that something like the JVM has, where there&#39;s a bytecode layer between the language and the runtime. There&#39;s a reason people like to target those.</p> <p>The crazy wild-card alternative is: Fork Go. Be unafraid to reach into the internals and <em>make</em> it do what you want. If you&#39;re interested in implementing your own language, you should have the requisite skills to do that. Go&#39;s internals are as clean as any language implementation I&#39;ve ever seen. 
Basically, the idea is that if you are going to find yourself unable to track Go&#39;s evolution going forward, then you should at least <em>reap the benefits</em> of that as well and make changes that suit you. Trying to stick to &#34;pure official Go&#34; kinda ends up being the worst of all worlds.</p></pre>xlab_is: <pre><p>It&#39;s easier to fork and change gccgo in this case.</p></pre>tcardv: <pre><p>Thanks for the response.</p> <p>I should probably have mentioned that using Go as a target is only meant for prototyping. It&#39;s not a big deal if I have to <em>care less</em> at this point. I&#39;d like it to resemble an actual &#34;C&#34; tagged union, so that I have a more meaningful prototype, but this would do anyway.</p> <p>Later on, if the language were to get more serious, I could consider forking and tweaking Go&#39;s runtime, but I think it would make more sense to write a runtime from scratch for this language&#39;s idiosyncrasies.</p></pre>xlab_is: <pre><blockquote> <p>Do you have any better idea to tell the GC that there is a live reference in there?</p> </blockquote> <p>That would be handy indeed, but currently there&#39;s no way other than storing the pointer in a global map somewhere, AFAIK. The authors of the latest Java bindings use this technique; see the «Reference tracking between Garbage Collectors» section in <a href="https://golang.org/s/gobind" rel="nofollow">https://golang.org/s/gobind</a></p> <blockquote> <p>To manage this, when the generated stub code attempts to pass a pointer from one language to another, it registers it with a global map maintained in the object’s host language. This map provides a unique reference number which is passed to the target language.</p> </blockquote> <p>I hope this issue will be addressed once things with the GC calm down over the next few releases of Go. By the way, what&#39;s the performance penalty of using the map approach? 
Without any measurements, it&#39;s hard to say that telling the GC about a &#34;live reference&#34; would perform better than plain maps.</p></pre>tcardv: <pre><p>I did think about storing a reference in a map, but didn&#39;t know that <code>runtime.SetFinalizer</code> existed to remove the reference afterwards. Thanks!</p></pre>funny_falcon: <pre><p>The GC also traces unsafe.Pointer values.</p></pre>tcardv: <pre><p>The problem remains that the payload would always have to be allocated on the heap, in distinct objects. It wouldn&#39;t be a union in the C sense.</p></pre>funny_falcon: <pre><p>Use an interface: <a href="http://play.golang.org/p/bdjNjGSKLr" rel="nofollow">http://play.golang.org/p/bdjNjGSKLr</a></p> <p>All ML languages (that I know of) allocate their algebraic types on the heap, so Go would be on par with them.</p> <p>Edit: fixed url Edit: fixed url again</p></pre>bradfitz: <pre><blockquote> <p>PS. I was trying to post this on golang-nuts. I&#39;m not allowed to do that for some reason. I&#39;m posting this here while they figure it out.</p> </blockquote> <p>For spam control reasons, the mailing list is moderated for all first-time posters. After your first post is approved, subsequent posts go through automatically.</p></pre>tcardv: <pre><p>I&#39;ve posted before, that&#39;s why I&#39;m confused about this...</p></pre>bradfitz: <pre><p>Email me (this username) @golang.org and I&#39;ll look into it.</p></pre>
