Go GC: Go 1.5 Will Solve the Latency Problem


Richard L. Hudson (Rick) is best known for his work in memory management including the invention of the Train, Sapphire, and Mississippi Delta algorithms as well as GC stack maps which enabled garbage collection in statically typed languages like Java, C#, and Go. He has published papers on language runtimes, memory management, concurrency, synchronization, memory models and transactional memory. Rick is a member of Google’s Go team where he is working on Go’s GC and runtime issues.


In economics, there is this concept of a virtuous cycle – a positive feedback loop between different processes that feed into one another. Traditionally in tech, there has been a virtuous cycle between software and hardware development. CPU hardware improves, which enables faster software to be written, which in turn drives further improvements in CPU speed and compute power. This cycle was healthy until about 2004, which is about when Moore’s Law started to end.




These days, 2X transistors != 2x faster programs. More transistors == more cores, but software has not evolved to be able to fully utilize more cores. Because software today is not able to adequately put multiple cores to work, the hardware guys are not going to keep putting more cores in. The cycle is sputtering.

A long term goal of Go is to reboot this virtuous cycle by enabling more concurrent, parallel programs. In the shorter term, we need to increase Go adoption. One of the biggest complaints with the Go runtime right now is that GC pauses are too long.

When the team initially took on this problem, Rick jokes, their first reaction as engineers was not to solve it but to look for workarounds like:

  • Adding an eye tracker to the computer and running the GC when no one’s looking

  • Popping up a network wait icon during GC and blaming the pause on network latency or something else

But Russ Cox shot these ideas down for some reason, so they decided to roll up their sleeves and actually try to improve the Go GC. The algorithm they developed trades program execution throughput for reduced GC latency. Go programs will get a little bit slower in exchange for ensuring lower GC latencies.



How can we make latency tangible?

  • Nanosecond: Grace Hopper analogized time to distance. In one nanosecond, light travels about 11.8 inches.

  • Microsecond: 5.4 microseconds is the time it takes light to travel 1 mile in vacuum.

  • Milliseconds:

      • 1 ms: read 1 MB sequentially from SSD

      • 20 ms: read 1 MB from a spinning disk

      • 50 ms: perceptual causality (eye/cursor response threshold)

      • 50+ ms: various network delays

      • 300 ms: eye blink
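The two light-travel figures above can be checked with a little arithmetic. The sketch below assumes only the defined speed of light in vacuum and the standard inch and mile conversions; the function names are made up for illustration.

```go
package main

import "fmt"

const (
	speedOfLight  = 299792458.0 // metres per second in vacuum (exact by definition)
	metresPerInch = 0.0254
	metresPerMile = 1609.344
)

// lightInches returns how far light travels in vacuum, in inches,
// during the given number of nanoseconds.
func lightInches(ns float64) float64 {
	return speedOfLight * ns * 1e-9 / metresPerInch
}

// lightMicrosPerMile returns how many microseconds light needs
// to cover one mile in vacuum.
func lightMicrosPerMile() float64 {
	return metresPerMile / speedOfLight * 1e6
}

func main() {
	fmt.Printf("light in 1 ns: %.1f inches\n", lightInches(1))
	fmt.Printf("light over 1 mile: %.1f microseconds\n", lightMicrosPerMile())
}
```

On this scale a millisecond is a million nanoseconds, which is why it makes a useful unit for a GC pause budget.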

So how much GC can we do in a millisecond?

Java GC vs. Go GC



Go:

  • thousands of goroutines

  • synchronization via channels

  • runtime written in Go, using the language the same way users do

  • control of spatial locality: structs can be embedded, and interior pointers (&foo.field) are allowed

Java:

  • tens of Java threads

  • synchronization via objects/locks

  • runtime written in C

  • objects linked with pointers

The biggest difference is spatial locality. In Java, objects refer to one another through pointers, whereas Go lets you embed structs within one another. Following pointers many layers deep creates a lot of work for a garbage collector.
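To make the locality point concrete, here is a minimal sketch contrasting an embedded layout with a pointer-linked one. The type names (Point, Embedded, Linked) are hypothetical and not taken from the talk.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Point is a small value type with no pointers in it.
type Point struct{ X, Y int64 }

// Embedded stores both points inline: one contiguous object,
// with no pointers for the collector to follow.
type Embedded struct {
	Min, Max Point
}

// Linked stores pointers to separately allocated points,
// closer to how a Java object graph is laid out.
type Linked struct {
	Min, Max *Point
}

// embeddedSize reports the size in bytes of the inline layout.
func embeddedSize() uintptr {
	return unsafe.Sizeof(Embedded{})
}

// bumpMaxX writes through an interior pointer into the middle of e.
func bumpMaxX(e *Embedded, v int64) {
	px := &e.Max.X // interior pointer: points inside the Embedded object
	*px = v
}

func main() {
	e := Embedded{Min: Point{0, 0}, Max: Point{3, 4}}
	l := Linked{Min: &Point{0, 0}, Max: &Point{3, 4}}

	bumpMaxX(&e, 10)

	fmt.Println("embedded size in bytes:", embeddedSize())
	fmt.Println("e.Max.X after interior write:", e.Max.X)
	fmt.Println("l.Max.Y reached through a pointer:", l.Max.Y)
}
```

The Embedded value is a single allocation, while Linked needs up to three, each of which the collector has to reach by following a pointer.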



GC basics

Here’s a quick primer on garbage collectors. They typically involve 2 phases:


  1. Scan phase: Determine which things in the heap are reachable. This involves starting from the pointers in stacks, registers, and global variables, and following pointers into the heap.

  2. Mark phase: Walk the pointer graph, marking objects as reachable from the program as you go. From the GC’s point of view, it is simplest to stop the world so that pointers are not changing while the mark phase is happening. Truly concurrent GC is difficult, because pointers are continually changing. The program uses something called a write barrier, a small piece of code run when a pointer is written, to tell the GC about the change so that it does not collect an object that is still reachable. In practice, write barriers can be more expensive than stop-the-world pauses.
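The marking and write-barrier ideas above can be sketched with a toy, single-threaded collector. This is an illustration only, not the Go runtime's implementation: the Object and Collector types are hypothetical, and the barrier shown is a simple "shade the new target" variant chosen for brevity.

```go
package main

import "fmt"

// Object is a toy heap object: a name, outgoing pointers, and a mark bit.
type Object struct {
	Name   string
	Refs   []*Object
	Marked bool
}

// Collector is a toy, single-threaded marker with an explicit worklist.
type Collector struct {
	marking  bool
	worklist []*Object
}

// shade marks obj as reachable and queues it so its pointers are visited later.
func (c *Collector) shade(obj *Object) {
	if obj != nil && !obj.Marked {
		obj.Marked = true
		c.worklist = append(c.worklist, obj)
	}
}

// Start begins a mark phase from the given roots.
func (c *Collector) Start(roots ...*Object) {
	c.marking = true
	for _, r := range roots {
		c.shade(r)
	}
}

// Step scans one queued object and reports whether any work was done.
func (c *Collector) Step() bool {
	if len(c.worklist) == 0 {
		c.marking = false
		return false
	}
	obj := c.worklist[0]
	c.worklist = c.worklist[1:]
	for _, ref := range obj.Refs {
		c.shade(ref)
	}
	return true
}

// WritePointer is the write barrier: while marking is in progress it shades
// the new target before storing it, so the collector cannot miss it.
func (c *Collector) WritePointer(slot **Object, target *Object) {
	if c.marking {
		c.shade(target)
	}
	*slot = target
}

// demo interleaves marking with a pointer move and reports the mark bits.
func demo() (movedMarked, garbageMarked bool) {
	moved := &Object{Name: "moved"}
	a := &Object{Name: "a", Refs: []*Object{nil}}
	b := &Object{Name: "b", Refs: []*Object{moved}}
	garbage := &Object{Name: "garbage"}

	gc := &Collector{}
	gc.Start(a, b)
	gc.Step() // a is scanned while it holds no pointer to moved

	// The program moves the only pointer to "moved" from b (not yet scanned)
	// into a (already scanned).
	gc.WritePointer(&a.Refs[0], moved)
	gc.WritePointer(&b.Refs[0], nil)

	for gc.Step() {
	}
	return moved.Marked, garbage.Marked
}

func main() {
	moved, garbage := demo()
	fmt.Println("moved object marked:", moved)
	fmt.Println("garbage marked:", garbage)
}
```

If WritePointer stored the pointer without shading, the moved object would stay unmarked even though it is still reachable through a. That is the failure a write barrier exists to prevent when marking runs alongside the program.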




The Go GC algorithm uses a combination of write barriers and short stop-the-world pauses. The talk illustrated this with three diagrams:

  • [Figure: phases of the Go GC algorithm]

  • [Figure: the GC algorithm in Go 1.4]

  • [Figure: the GC algorithm in Go 1.5]


Note the shorter stop-the-world pauses in Go 1.5. While the concurrent GC is running, it uses about 25% of the CPU.
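To see stop-the-world pauses in your own program, the standard runtime package exposes pause statistics. The sketch below is one way to read them; the helper names (churn, lastPause, measure) and the allocation sizes are arbitrary choices for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sink keeps a window of allocations live so the work is not optimised away.
var sink [][]byte

// churn allocates n blocks of size bytes, keeping only the last keep blocks live.
func churn(n, size, keep int) {
	sink = make([][]byte, keep)
	for i := 0; i < n; i++ {
		sink[i%keep] = make([]byte, size)
	}
}

// lastPause returns the number of completed GC cycles, the most recent
// stop-the-world pause, and the cumulative pause time.
func lastPause() (cycles uint32, last, total time.Duration) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	if m.NumGC == 0 {
		return 0, 0, 0
	}
	// PauseNs is a circular buffer; the most recent entry is at (NumGC+255)%256.
	last = time.Duration(m.PauseNs[(m.NumGC+255)%256])
	return m.NumGC, last, time.Duration(m.PauseTotalNs)
}

// measure generates garbage, forces a collection, and reads the statistics.
func measure() (uint32, time.Duration, time.Duration) {
	churn(100000, 1024, 1000)
	runtime.GC()
	return lastPause()
}

func main() {
	cycles, last, total := measure()
	fmt.Println("GC cycles:", cycles)
	fmt.Println("last pause:", last)
	fmt.Println("total pause:", total)
}
```

Running any Go program with the environment variable GODEBUG=gctrace=1 prints a line per collection, which is a lighter-weight way to watch the same numbers.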

The talk then presented benchmark charts of GC pause times (charts not preserved here).


In previous versions of Go, GC pauses were in general much longer, and they grew as the heap size grew. In Go 1.5, GC pauses are more than an order of magnitude shorter.

Zooming in, there is still a slight positive correlation between heap size and GC pauses. The team knows what the issue is and expects to fix it in Go 1.6.


There is a slight throughput penalty with the new GC algorithm, and that penalty shrinks as the heap size grows.




Moving forward

The message to spread: with Go’s low-latency GC, garbage collection pauses are no longer an issue. Moving forward, the team plans to tune for even lower latency, higher throughput, and more predictability, and to find the sweet spot among these tradeoffs. Development work for Go 1.6 will be driven by use cases and feedback, so let them know.


The new low latency GC will make Go an even more viable replacement for manual-memory-management languages like C.

Q & A

Q: Any plans for heap compaction?

A: Our approach has been to adopt the techniques that have served the C language community well, which is to avoid fragmentation to begin with by storing objects of the same size in the same memory span.
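The idea in that answer, keeping same-sized objects together in one span, can be sketched with a toy allocator. Everything here is hypothetical: the size classes, the Span and Heap types, and the slot counts are invented for illustration and are not the Go runtime's actual tables or data structures.

```go
package main

import "fmt"

// classes are hypothetical size classes in bytes (not the Go runtime's table).
var classes = []int{8, 16, 32, 64, 128, 256}

// sizeClass returns the index of the smallest class that fits n bytes,
// or -1 if n is larger than every class.
func sizeClass(n int) int {
	for i, c := range classes {
		if n <= c {
			return i
		}
	}
	return -1
}

// Span is a run of memory carved into equal-sized slots.
type Span struct {
	slotSize int
	used     []bool
}

// alloc returns the index of the first free slot, or -1 if the span is full.
func (s *Span) alloc() int {
	for i, u := range s.used {
		if !u {
			s.used[i] = true
			return i
		}
	}
	return -1
}

// Heap keeps one span per size class.
type Heap struct{ spans []*Span }

// NewHeap builds a heap with slotsPerSpan slots in each size class.
func NewHeap(slotsPerSpan int) *Heap {
	h := &Heap{}
	for _, c := range classes {
		h.spans = append(h.spans, &Span{slotSize: c, used: make([]bool, slotsPerSpan)})
	}
	return h
}

// Alloc rounds n up to a size class and takes a slot from that class's span.
func (h *Heap) Alloc(n int) (class, slot int) {
	class = sizeClass(n)
	if class < 0 {
		return -1, -1
	}
	return class, h.spans[class].alloc()
}

// Free releases a slot; any later object of the same class can reuse it exactly.
func (h *Heap) Free(class, slot int) {
	h.spans[class].used[slot] = false
}

func main() {
	h := NewHeap(4)
	class, first := h.Alloc(24)
	_, second := h.Alloc(24)
	_, third := h.Alloc(24)
	h.Free(class, second)
	_, reused := h.Alloc(30) // a different request size, but the same class
	fmt.Println(class, first, second, third, reused)
}
```

Because every slot in a span has the same size, a freed slot always fits the next object of that class, so holes of awkward sizes never form inside a span. The cost is some wasted space from rounding each request up to its class.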





