Notes on the Go development process

叶秀兰

Go in Go

As of Go 1.5, the entire system is now written in Go (plus a little assembler).

C is gone.

Side note: gccgo is still going strong.
This article is about the original compiler, gc.

Why was it in C?

Bootstrapping.

(Also, Go was not intended primarily as a compiler implementation language.)

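Why move the compiler to Go?

Not merely for validation; we have more pragmatic motives:

  • Go is easier to write (correctly) than C

  • Go is easier to debug than C (even absent a debugger)

  • Go is the only language you'd need to know; encourages contributions

  • Go has better modularity, tooling, testing, profiling, and so on

  • Go makes parallel execution trivial

Already seeing benefits, and it's early yet. :)

Design document: golang.org/s/go13compiler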

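Why move the runtime to Go?

We had our own C compiler just to compile the runtime.
   We needed a compiler with the same ABI as Go, such as segmented stacks.

Switching the runtime to Go means we can get rid of the C compiler.
   That matters even more than converting the compiler to Go.

(All the reasons for moving the compiler to Go apply to the runtime as well.)

Now there is only one language in the runtime; easier integration, stack management, etc.

As always, simplicity is the overriding consideration.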

History

Why do we have our own tool chain at all?
   Our own ABI?
   Our own file formats?

History, familiarity, and ease of moving forward. And speed.

Many of Go's big changes would have been much harder with GCC or LLVM.

news.ycombinator.com/item?id=8817990

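Big changes

Changes made simpler by owning the tools and/or moving to Go:

  • linker rearchitecture

  • new garbage collector

  • stack maps

  • contiguous stacks

  • write barriers

The last three are all but impossible in C:

  • C is not type safe

  • stack slots are aliased as a result of optimization

(Gccgo will have segmented stacks and imprecise (stack) collection for a while yet.)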

Goroutine stacks

  • Until 1.2: Stacks were segmented.

  • 1.3: Stacks were contiguous unless executing C code (runtime).

  • 1.4: Stacks made contiguous by restricting C to system stack.

  • 1.5: Stacks made contiguous by eliminating C.

These were each huge steps, made quickly (led by khr@).
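A minimal sketch of what growable goroutine stacks allow in practice: a goroutine starts with a small stack, and the runtime enlarges it as the call depth increases. The function names `depth` and `deepOnGoroutine` are made up for this illustration, and the recursion depth is an arbitrary choice.

```go
package main

import "fmt"

// depth recurses n times; every call adds one frame to the goroutine's stack.
func depth(n int) int {
	if n == 0 {
		return 0
	}
	return 1 + depth(n-1)
}

// deepOnGoroutine runs the recursion on a fresh goroutine, which begins
// with a small stack that the runtime grows on demand.
func deepOnGoroutine(n int) int {
	done := make(chan int)
	go func() { done <- depth(n) }()
	return <-done
}

func main() {
	fmt.Println(deepOnGoroutine(100000))
}
```

No stack size is declared anywhere; the program relies on the runtime to grow the stack as needed.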

Converting the runtime

Mostly done by hand with machine assistance.

Challenge to implement the runtime in a safe language.
   Some use of unsafe to deal with pointers as raw bits in the GC, for instance.
   But less than you might think.
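As a rough illustration of "pointers as raw bits", the sketch below converts a pointer to an integer with the `unsafe` package and inspects it. This is not code from the runtime; `rawBits` and `isAligned` are hypothetical helpers for this example only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// rawBits returns the address held in p as a plain integer.
func rawBits(p *int) uintptr {
	return uintptr(unsafe.Pointer(p))
}

// isAligned reports whether the address is a multiple of int's alignment.
func isAligned(p *int) bool {
	return rawBits(p)%unsafe.Alignof(*p) == 0
}

func main() {
	x := 42
	fmt.Printf("%#x aligned=%v\n", rawBits(&x), isAligned(&x))
}
```

Once a pointer has become a `uintptr` the garbage collector no longer treats it as a reference, which is one reason such conversions are kept to a few places.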

The translator (next sections) helped for some of the translation.

Converting the compiler

Why translate it, not write it from scratch? Correctness, testing.

Steps:

  • Write a custom translator from C to Go.

  • Run the translator, iterate until success.

  • Measure success by bit-identical output.

  • Clean up the code by hand and by machine.

  • Turn it from C-in-Go to idiomatic Go (still happening).
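The "bit-identical output" check can be pictured as a byte-for-byte comparison of what the old and new compilers emit. The sketch below assumes both outputs are already in memory; `firstDiff` is a hypothetical helper, not part of the actual tooling.

```go
package main

import (
	"bytes"
	"fmt"
)

// firstDiff returns the offset of the first differing byte,
// or -1 if a and b are bit-identical.
func firstDiff(a, b []byte) int {
	if bytes.Equal(a, b) {
		return -1
	}
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	return n // one output is a strict prefix of the other
}

func main() {
	oldOut := []byte("MOVQ AX, BX\n")
	newOut := []byte("MOVQ AX, CX\n")
	fmt.Println(firstDiff(oldOut, oldOut), firstDiff(oldOut, newOut))
}
```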

Translator

First output was C line-by-line translated to (bad!) Go.
   Tool to do this written by rsc@ (talked about at GopherCon 2014).
   Custom written for this job, not a general C-to-Go translator.

Steps:

  • Parse C code using new simple C parser (yacc)

  • Remove or rewrite C-isms such as *p++ as an expression

  • Walk the C parse tree, print the C code in Go syntax

  • Compile the output

  • Run, compare generated code

  • Repeat

The Yacc grammar was translated by sam-powered hands.
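To see why `*p++` needs rewriting, note that Go has no pointer arithmetic and `p++` is a statement, not an expression. One possible Go shape for the idiom is sketched below; it is only an illustration, not the rewrite the translator actually produced, and `next` and `countSpaces` are made-up names.

```go
package main

import "fmt"

// next mimics the C expression `*p++` on a byte slice: it returns the
// current element together with the slice advanced by one.
func next(p []byte) (byte, []byte) {
	return p[0], p[1:]
}

// countSpaces walks its input the way a C loop over `*p++` would.
func countSpaces(s string) int {
	p := []byte(s)
	n := 0
	for len(p) > 0 {
		var c byte
		c, p = next(p)
		if c == ' ' {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countSpaces("a b c"))
}
```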

Translator configuration

Aided by hand-written rewrite rules, such as:

  • this field is a bool

  • this function returns a bool

Also diff-like rewrites for things such as using the standard library:

diff {
-    g.Rpo = obj.Calloc(g.Num*sizeof(g.Rpo[0]), 1).([]*Flow)
-    idom = obj.Calloc(g.Num*sizeof(idom[0]), 1).([]int32)
-    if g.Rpo == nil || idom == nil {
-        Fatal("out of memory")
-    }
+    g.Rpo = make([]*Flow, g.Num)
+    idom = make([]int32, g.Num)
}

Another example

This one due to semantic difference between the languages.

diff {
-    if nreg == 64 {
-        mask = ^0 // can't rely on C to shift by 64
-    } else {
-        mask = (1 << uint(nreg)) - 1
-    }
+    mask = (1 << uint(nreg)) - 1
}
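The special case can go because Go defines shifts by counts greater than or equal to the operand width: for an unsigned value the result is 0, and subtracting 1 then wraps around to all ones. The sketch below assumes the mask is a `uint64`, which the excerpt does not state; `lowMask` is a name chosen for this example.

```go
package main

import "fmt"

// lowMask returns a mask with the low nreg bits set.
// For nreg == 64 the shift yields 0 and the subtraction wraps to all ones.
func lowMask(nreg int) uint64 {
	return (uint64(1) << uint(nreg)) - 1
}

func main() {
	fmt.Printf("%#x %#x\n", lowMask(3), lowMask(64))
}
```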

Grind

Once in Go, new tool grind deployed (by rsc@):

  • parses Go, type checks

  • records a list of edits to perform: "insert this text at this position"

  • at end, applies edits to source (hard to edit AST).
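The "record edits, apply at the end" approach can be sketched as follows. This is not grind's actual code: the `edit` type and `apply` function are hypothetical, and the sketch assumes the edits do not overlap.

```go
package main

import (
	"fmt"
	"sort"
)

// edit is a hypothetical record: replace src[start:end] with text.
type edit struct {
	start, end int
	text       string
}

// apply performs non-overlapping edits on src, working from the end of the
// text backwards so that earlier offsets remain valid.
func apply(src string, edits []edit) string {
	sorted := append([]edit(nil), edits...) // leave the caller's slice untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].start > sorted[j].start })
	out := src
	for _, e := range sorted {
		out = out[:e.start] + e.text + out[e.end:]
	}
	return out
}

func main() {
	src := "var x int\nx = 1\n"
	edits := []edit{
		{0, 10, ""},        // delete the declaration line
		{10, 15, "x := 1"}, // declare x at its first use instead
	}
	fmt.Print(apply(src, edits))
}
```

Offsets always refer to the original text, so analysis passes can record edits in any order without tracking how earlier edits shifted the source.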

Changes guided by profiling and other analysis:

  • removes dead code

  • removes gotos

  • removes unused labels, needless indirections, etc.

  • moves var declarations nearer to first use

rsc.io/grind

Performance problems

Output from translator was poor Go, and ran about 10X slower.
   Most of that slowdown has been recovered.

Problems with C to Go:

  • C patterns can be poor Go; e.g.: complex for loops

  • C stack variables never escape; Go compiler isn't as sure

  • interfaces such as fmt.Stringer vs. C's varargs

  • no unions in Go, so use structs instead: bloat

  • variable declarations in wrong place

C compiler didn't free much memory, but Go has a GC.
   Adds CPU and memory overhead.

Performance fixes

Profile! (Never done before!)

  • move vars closer to first use

  • split vars into multiple

  • replace code in the compiler with code in the library: e.g. math/big

  • use interface or other tricks to combine struct fields

  • better escape analysis (drchase@).

  • hand tuning code and data layout
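To give a feel for the union problem and the "combine struct fields" fix, the sketch below compares a struct that gives every former union member its own slot with one that stores the active variant in a single interface field. Both types are invented for this illustration and are not taken from the compiler.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wide mimics a C union translated field by field: every variant gets
// its own slot even though only one is in use at a time.
type wide struct {
	i int64
	f float64
	s string
}

// narrow keeps whichever variant is in use in one interface field.
type narrow struct {
	v interface{}
}

// sizes returns the in-memory size of each representation.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(wide{}), unsafe.Sizeof(narrow{})
}

func main() {
	w, n := sizes()
	fmt.Println("wide:", w, "narrow:", n)
}
```

The interface version is smaller per node, but storing a value in an interface can allocate, so this is a trade-off to measure with a profile rather than a guaranteed win.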

Use tools like grind, gofmt -r and eg for much of this.

Removing interface argument from a debugging print library got 15% overall!

More remains to be done.

Technical benefits

Other benefits of the conversion:

Garbage collection means no more worry about introducing a dangling pointer.

Chance to clean up the back ends.

Unified 386 and amd64 architectures throughout the tool chain.

New architectures are easier to add.

Unified the tools: now one compiler, one assembler, one linker.

Compiler

GOOS=YYY GOARCH=XXX go tool compile  

One compiler; no more 6g, 8g etc.

About 50K lines of portable code.
   Even the registerizer is portable now; architectures well characterized.
   Non-portable: Peepholing, details like registers bound to instructions.
   Typically around 10% of the portable LOC.

Assembler

GOOS=YYY GOARCH=XXX go tool asm  

New assembler, all in Go, written from scratch by r@.
   Clean, idiomatic Go code.

Less than 4000 lines, <10% machine-dependent.

Almost completely compatible with previous yacc and C assemblers.

How is this possible?

  • shared syntax originating in the Plan 9 assemblers

  • unified back-end logic (old liblink, now internal/obj)

Linker

GOOS=YYY GOARCH=XXX go tool link  

Mostly hand- and machine- translated from C code.

New library, internal/obj, part of original linker, captures details about machines, writes object files.

27000 lines summed across 4 architectures, mostly tables (plus some ugliness).

  • arm: 4000

  • arm64: 6000

  • ppc64: 5000

  • x86: 7500 (386 and amd64)

Example benefit: one print routine to print any instruction for any architecture.
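One way to picture "one print routine for any architecture" is a table-driven design: each architecture supplies its opcode and register names, and a single formatter handles all of them. The types below are hypothetical and much simpler than the real internal/obj package.

```go
package main

import "fmt"

// arch is a hypothetical per-architecture table of names.
type arch struct {
	name string
	ops  map[int]string // opcode number to mnemonic
	regs []string       // register number to name
}

// inst is a simplified two-operand instruction.
type inst struct {
	op       int
	from, to int
}

// format prints any instruction for any architecture using only the tables.
func format(a *arch, in inst) string {
	return fmt.Sprintf("%s %s, %s", a.ops[in.op], a.regs[in.from], a.regs[in.to])
}

func main() {
	amd64 := &arch{name: "amd64", ops: map[int]string{1: "MOVQ"}, regs: []string{"AX", "BX"}}
	arm := &arch{name: "arm", ops: map[int]string{1: "MOVW"}, regs: []string{"R0", "R1"}}
	fmt.Println(format(amd64, inst{1, 0, 1}))
	fmt.Println(format(arm, inst{1, 0, 1}))
}
```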

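Bootstrap

No C compiler is needed any more, only a Go compiler.

So to install 1.5 from source you first need to download or build an existing Go.

We use Go 1.4+ as the base to build the 1.5+ tool chain.

Details: golang.org/s/go15bootstrap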

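Future

Much work is still to be done, but 1.5 is mostly set.

Future work:

Better escape analysis.
   New compiler back end using SSA (much easier in Go than in C).
   More optimization.

Generate machine descriptions from PDFs (or maybe XML).
   Will have a purely machine-generated instruction definition:
   "Read in a PDF, write out an assembler configuration."
   Already deployed for the disassemblers.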

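Conclusion

Getting rid of C is a huge improvement for the Go project: the code is cleaner, more testable, more profilable, and easier to work on.

The new unified tool chain reduces code size and increases maintainability.

A flexible tool chain is also important for portability.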

Thank you

Rob Pike

Google

r@golang.org

http://golang.org/

via talks.golang.org


Source: 开源中国博客 (OSChina blog)

Thanks to the author: 叶秀兰
