Simplistic question re GC: if running on a machine with many cores, why can't the GC run in its own thread, continuously cleaning up?

blov · 746 views

Pardon my ignorance, and perhaps it already does this, and when we have to wait for the GC it is simply a case of it being too slow to collect all the accumulated garbage. I would expect there to be minimal blocking coming from the GC. I guess this is a generic question, but I would have expected Go to implement this coroutine better than any other language.

**Comments:**

calebdoxsey: Memory is on a shared heap that all goroutines can access. How can the GC safely collect memory while it's possibly being used by other threads? To answer your question: in 1.5 it does. There is a small window where everything is stopped, but the bulk of the work is done in parallel with other goroutines. See: https://talks.golang.org/2015/go-gc.pdf.

matttproud: Outside of the question of the need for locking shared memory for runtime state and bookkeeping[0], … even with incremental, continuously-running garbage collection, you could still face issues where the collector just can't keep up with allocations: dC/dT < dA/dT (C: bytes collected; A: bytes allocated). Process resident set size (RSS) continues to grow, because the heap is growing at dA/dT - dC/dT. You could say that this is the result of degenerate server design, but this often occurs with improperly scaled systems.

You'd see this manifest itself in different ways with different runtime releases. In pre-1.5, you'd see long pause times; in 1.5 and 1.6, you'd hopefully see the kernel out-of-memory (OOM) killer taking action. Neither of these is desirable.

No matter which runtime you are using (including 1.5 and the planned 1.6), you still have problems with heap fragmentation in spite of the allocator using caches for objects of a given size class. Heap fragmentation results in unnecessarily large heaps, which, in turn, grow process RSS. Deploy your servers in memory-constrained environments (e.g., containers), and the OOM killer will be constantly cycling your jobs; or you'll swap if that hasn't been disabled. This is why heap compaction is important, something Go does not perform, nor is it planned on the runtime's roadmap. Unfortunately, heap compaction is difficult to achieve without some measure of runtime locking. In specific terms of the allocator (it is modeled on TCMalloc, which has known issues with fragmentation), sparsely occupied spans of a given size class are never consolidated into fewer ones. This is especially problematic for long-lived objects.

In spite of how much folks in the Go community have historically derided Java, the HotSpot JVM has solved most of these problems for years now: compaction and low collection time. The dirty secret is this: you need a highly parallel collector (-XX:+UseParallelGC -XX:+UseParallelOldGC), NOT a concurrent one, and you need to pin the memory pools to fixed boundaries reflecting the containerization environment: -Xms == -Xmx and -XX:MaxNewSize == -XX:NewSize. That is all! If you keep the new generation reasonably sized (e.g., < 800MiB) and deploy instances sized according to your load-test regime (you should do this for anything you care about), you'll easily have end-user pause times < 10ms. The trick that is Java-specific is to not violate the [weak generational hypothesis](http://www.cubrid.org/blog/dev-platform/understanding-java-garbage-collection/) in your server's design. You get excellent performance, because you only have minor garbage collection events (from not violating the weak generational hypothesis), and a small memory footprint thanks to heap compaction. Go does not use generational memory pools.

I think the work on Go's 1.5 collector is admirable and awesome, especially with getting the pause time to be low, but the point of why I am mentioning this is that similar performance can be had elsewhere (and I love Go, mind you). AFAIR, too, the continuous or bounded-duration pause-time mechanism that is being used in 1.5 was also used in Java's concurrent mark-sweep collector (CMS) in its so-called incremental mode, which is, what, four-some years old now? The problem with HotSpot CMS is that it is non-compacting. It has a compacting failure mode, but it is SLOW (ca. order of seconds).

[0] There are probably strategies that could be used to shard the bookkeeping by mutator thread (a thread that the serving application is running in, not the runtime), but that increases complexity drastically. TCMalloc, which I mention in the writeup, already does something like this with thread-local allocator pool caches.

jerf: You and others who read this may find this interesting: [How Go differs from Java from a GC perspective (video)](https://youtu.be/aiv1JOfMjm0?t=8m44s). (This isn't "disagreement", just elaboration.)

bradfitz: Java's GC is ahead of Go's in many ways, but I think you might be giving Java's GC a little too much credit. Watching engineers and SREs try to tune JVM GC settings is often comical. There are so many knobs, it's almost a full-time job just trying to make services behave. It's far from just magically working.

XANi_: To be fair, a lot of the problems with tuning are due to programmers not understanding how their language works. The thing with programming is that, no matter what language you use, if you write apps complex enough, eventually you will have to get up close and personal with its internals.

Which sometimes causes PTSD in a developer, who then proceeds to cry in the corner, yelling "ADMINS DID IT, IT WORKED ON MY MACHINE FINE" from time to time.

gngl: It is my understanding that the "C4" GC for Azul's very advanced Java systems works like this, to the extent that it should be an almost-zero-pause system. But Java does have some benefits in terms of a more restricted data model, so everything is a sea of tiny objects, except for arrays (and furthermore, they do need a very, very good GC for precisely the same reason: because everything is a sea of tiny objects). I'm not sure how C4 would be applicable to Go, though. Besides C4 being very complex (and actually requiring a custom OS kernel), Go's internal representations might pose some problems for the elaborate C4 that I'm currently not aware of.
