<p>Hi,</p>
<p>I'm considering Go as the target language for a compiler. It provides memory layout control, a GC and a scheduler.</p>
<p>In some cases I have to store things in untyped blobs of memory. For example, my language provides sum types, implemented as tagged unions; the variant data is stored in an array of bytes and then, depending on the tag, cast to the corresponding type via unsafe.Pointer. See:</p>
<p><a href="http://play.golang.org/p/r6TNAan58M" rel="nofollow">http://play.golang.org/p/r6TNAan58M</a></p>
<p>As you can check, that example is actually broken. The area printed in line 88 isn't right. That's because the garbage collection triggered in line 86 frees the two objects whose pointers are stored in S as plain bytes; to be precise, the GC stops tracking the pointers once they go into the array of bytes. Here's a more concrete example of the issue:</p>
<p><a href="http://play.golang.org/p/PMhfQpRHz2" rel="nofollow">http://play.golang.org/p/PMhfQpRHz2</a></p>
<p>So, the only workaround I can think of is holding a reference to the "payload" in an empty interface. ( <a href="http://play.golang.org/p/aHMfxqipNS" rel="nofollow">http://play.golang.org/p/aHMfxqipNS</a> ) This defeats the whole purpose of implementing sum types as tagged unions, since every payload will go into the heap anyway, so you might as well use the "boxed" interface{} version to discriminate with a type switch.</p>
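<p>For reference, a minimal sketch of that "boxed" version, assuming two made-up variants (<code>Rect</code> and <code>Circle</code> are hypothetical names, not taken from the playground links). The dynamic type plays the role of the tag, and the pointers inside <code>Rect</code> are ordinary typed fields, so the GC keeps tracking them:</p>

```go
package main

import (
	"fmt"
	"math"
	"runtime"
)

// Point, Rect, and Circle are hypothetical variants chosen for illustration.
type Point struct{ X, Y float64 }

// Shape is the sum type; the unexported method closes the set of variants
// to this package.
type Shape interface{ isShape() }

// Rect holds two pointers, which the GC tracks because they are typed fields.
type Rect struct{ Min, Max *Point }

// Circle has no pointers in its payload.
type Circle struct{ R float64 }

func (Rect) isShape()   {}
func (Circle) isShape() {}

// Area uses the dynamic type of s as the tag.
func Area(s Shape) float64 {
	switch v := s.(type) {
	case Rect:
		return (v.Max.X - v.Min.X) * (v.Max.Y - v.Min.Y)
	case Circle:
		return math.Pi * v.R * v.R
	default:
		panic(fmt.Sprintf("unknown variant %T", s))
	}
}

func main() {
	var s Shape = Rect{Min: &Point{0, 0}, Max: &Point{2, 3}}
	runtime.GC() // the Points stay alive: they are reachable through s
	fmt.Println(Area(s))
	fmt.Println(Area(Circle{R: 1}))
}
```

<p>The cost is the one described above: the payload generally lives in its own heap object rather than inline in the union.</p>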
<p>Do you have any better idea to tell the GC that there is a live reference in there?</p>
<p>Thanks in advance!</p>
<p>PD. I was trying to post this on golang-nuts. I'm not allowed to do that for some reason. I'm posting this here while they figure it out.</p>
<hr/>**Comments:**<br/><br/>jerf: <pre><blockquote>
<p>since every payload will go into the heap anyway, so you might as well use the "boxed" interface{} version to discriminate with a type switch.</p>
</blockquote>
<p>Sometimes I like to call Go a modern scripting language. For a scripting language, Go gives you a significant amount of control over whether something ends up on the stack or heap, given that up until pretty recently all scripting languages put everything on the heap, unconditionally.</p>
<p>However, for a systems language, Go gives you essentially no control where something goes, or at least not with any reasonable degree of effort. And you'll always be at the mercy of any changes in future releases. Go isn't really suitable for tasks where you care about that very deeply.</p>
<p>The way I see it, you've got two choices: Care less (no sarcasm!), in which case, yes, the <code>interface{}</code> solution is basically as good as you're going to get. Since you've budgeted a full <code>int</code> for the tag you might as well let that be a runtime-managed type pointer anyhow.</p>
<p>Or, alternatively, don't use Go as your back end.</p>
<p>Honestly, I lean toward the second one. I question the long-term viability of any project built against Go in its current state, precisely because you care about a lot of details that are in <em>massive</em> flux right now. If things stabilized in another 3 years you might be OK, but you're in for a world of pain trying to keep up with Go as a moving target right now. This is only the first of many places where Go is going to fight you. Go's runtime is pretty deeply Go, and lacks the abstraction that something like the JVM has where there's this bytecode layer between the language and the runtime. There's a reason people like to target those.</p>
<p>The crazy wild-card alternative is: Fork Go. Be unafraid to reach into the internals and <em>make</em> it do what you want. If you're interested in implementing your own language, you should have the requisite skills to do that. Go's internals are as clean as any language implementation I've ever seen. Basically, the idea is that if you are going to find yourself unable to track Go's evolution going forward, then you should at least <em>reap the benefits</em> of that as well and make changes that suit you. Trying to stick to "pure official Go" kinda ends up being the worst of all worlds.</p></pre>xlab_is: <pre><p>It's easier to fork and change gccgo in this case.</p></pre>tcardv: <pre><p>Thanks for the response.</p>
<p>I should probably have mentioned that using Go as a target is only meant for prototyping. It's not a big deal if I have to <em>care less</em> at this point. I'd like it to resemble an actual "C" tagged union, so that I have a more meaningful prototype, but it would do anyway.</p>
<p>Later on, if the language were to get more serious, I could consider forking and tweaking Go's runtime, but I think it would make more sense to write a runtime from scratch for this language's idiosyncrasies.</p></pre>xlab_is: <pre><blockquote>
<p>Do you have any better idea to tell the GC that there is a live reference in there?</p>
</blockquote>
<p>Could be handy indeed, but currently there's no way except storing the pointer in a global map somewhere, AFAIK. The authors of the latest Java bindings use this technique; see the «Reference tracking between Garbage Collectors» section in <a href="https://golang.org/s/gobind" rel="nofollow">https://golang.org/s/gobind</a></p>
<blockquote>
<p>To manage this, when the generated stub code attempts to pass a pointer from one language to another, it registers it with a global map maintained in the object’s host language. This map provides a unique reference number which is passed to the target language.</p>
</blockquote>
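<p>A rough sketch of what such a map could look like on the Go side. This is not the gobind implementation; <code>Registry</code>, <code>Register</code>, <code>Lookup</code> and <code>Release</code> are hypothetical names. The idea is that only the integer handle goes into the untyped bytes, while the map holds the real reference and keeps the object alive:</p>

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical table that keeps objects reachable while only
// an integer handle is stored in untyped memory.
type Registry struct {
	mu   sync.Mutex
	next uint64
	objs map[uint64]interface{}
}

func NewRegistry() *Registry {
	return &Registry{objs: make(map[uint64]interface{})}
}

// Register stores v and returns a handle that is safe to keep in a byte blob.
func (r *Registry) Register(v interface{}) uint64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.next++
	r.objs[r.next] = v
	return r.next
}

// Lookup returns the object registered under h, if it is still registered.
func (r *Registry) Lookup(h uint64) (interface{}, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	v, ok := r.objs[h]
	return v, ok
}

// Release drops the reference so the GC may reclaim the object.
func (r *Registry) Release(h uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.objs, h)
}

func main() {
	reg := NewRegistry()
	n := 42
	h := reg.Register(&n) // h is a plain integer, not a pointer

	if v, ok := reg.Lookup(h); ok {
		fmt.Println(*v.(*int))
	}
	reg.Release(h)
}
```

<p>Every access pays for a lock and a map lookup, and something has to call <code>Release</code> at the right time, so it is a trade-off rather than a free fix.</p>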
<p>I hope this issue will be addressed once things with the GC calm down over a few more releases of Go. By the way, what's the performance penalty of using the map approach? Without any measurements it's hard to say that telling the GC about a "live reference" would perform better than plain maps.</p></pre>tcardv: <pre><p>I did think about storing a reference in a map, but didn't know that <code>runtime.SetFinalizer</code> existed to remove the reference afterwards. Thanks!</p></pre>funny_falcon: <pre><p>unsafe.Pointer is also considered by GC.</p></pre>tcardv: <pre><p>The problem remains that the payload would always have to be allocated in the heap, in distinct objects. It wouldn't be a union in the C sense.</p></pre>funny_falcon: <pre><p>Use Interface: <a href="http://play.golang.org/p/bdjNjGSKLr" rel="nofollow">http://play.golang.org/p/bdjNjGSKLr</a></p>
<p>All the ML-family languages I know of allocate their algebraic data types on the heap, so Go would be on par with them.</p>
<p>Edit: fixed url
Edit: fixed url again</p></pre>bradfitz: <pre><blockquote>
<p>PD. I was trying to post this on golang-nuts. I'm not allowed to do that for some reason. I'm posting this here while they figure it out.</p>
</blockquote>
<p>For spam control reasons, the mailing list is moderated for all first-time posters. After your first post is approved, subsequent posts go through automatically.</p></pre>tcardv: <pre><p>I've posted before, that's why I'm confused about this...</p></pre>bradfitz: <pre><p>Email me (this username) @golang.org and I'll look into it.</p></pre>