An Analysis of Golang's Reflection Implementation: reflect.Type Type Names

breaksoftware · 2019-01-09 19:11:31

This article was created on 2019-01-09 19:11:31; the information in it may have evolved or changed since then.

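More and more Java, PHP, and Python programmers are moving to Golang. One important reason is that, like C/C++, it compiles to machine code, which ensures execution efficiency. Those languages, which run on an interpreter or a virtual machine, all support "reflection", which makes it easy for programmers to build dynamic logic. This is a relatively weak area for C/C++, but Golang supports it well. In this series we disassemble the output of the Golang compiler to explore how its reflection is implemented. (When reposting, please credit breaksoftware's CSDN blog.)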

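To keep the compiler from optimizing the code, every example in this article is compiled with the following command, where -N disables optimizations and -l disables inlining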

go build -gcflags "-N -l" [xxxxxx].go

Type names

Basic types

package main

import (
	"fmt"
	"reflect"
)

func main() {
	t := reflect.TypeOf(1)
	s := t.Name()
	fmt.Println(s)
}

This code prints the type of 1, which is int.

The entry point of the main function is main.main. We set a breakpoint there with gdb and disassemble. Skipping part of the function prologue, we see

   0x0000000000487c6f <+31>:    mov    %rbp,0xa0(%rsp)
   0x0000000000487c77 <+39>:    lea    0xa0(%rsp),%rbp
   0x0000000000487c7f <+47>:    lea    0xfb5a(%rip),%rax        # 0x4977e0
   0x0000000000487c86 <+54>:    mov    %rax,(%rsp)
   0x0000000000487c8a <+58>:    lea    0x40097(%rip),%rax        # 0x4c7d28 <main.statictmp_0>
   0x0000000000487c91 <+65>:    mov    %rax,0x8(%rsp)
   0x0000000000487c96 <+70>:    callq  0x46f210 <reflect.TypeOf>

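Lines 3 and 4 store the address 0x4977e0 on the stack, and lines 5 and 6 store 0x4c7d28 next to it. The generated code does not use push instructions to pass arguments; it uses mov to write them into the stack slots addressed through the rsp register.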

Line 7 calls the reflect.TypeOf function. In the Golang source tree, its definition is in \src\reflect\type.go

// TypeOf returns the reflection Type that represents the dynamic type of i.
// If i is a nil interface value, TypeOf returns nil.
func TypeOf(i interface{}) Type {
	eface := *(*emptyInterface)(unsafe.Pointer(&i))
	return toType(eface.typ)
}

// toType converts from a *rtype to a Type that can be returned
// to the client of package reflect. In gc, the only concern is that
// a nil *rtype must be replaced by a nil Type, but in gccgo this
// function takes care of ensuring that multiple *rtype for the same
// type are coalesced into a single Type.
func toType(t *rtype) Type {
	if t == nil {
		return nil
	}
	return t
}
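
The nil check in toType matters because a nil pointer stored in an interface produces an interface value that is itself not nil. Here is a minimal sketch of the same pattern, using the hypothetical types namer and desc as stand-ins for reflect.Type and reflect.rtype (neither is part of the reflect package):

```go
package main

import "fmt"

// namer and desc are hypothetical stand-ins for reflect.Type and reflect.rtype.
type namer interface {
	Name() string
}

type desc struct{ name string }

func (d *desc) Name() string { return d.name }

// unchecked converts without a nil test: a nil *desc becomes a non-nil namer.
func unchecked(d *desc) namer { return d }

// checked mirrors toType: a nil pointer is replaced by a nil interface.
func checked(d *desc) namer {
	if d == nil {
		return nil
	}
	return d
}

func main() {
	var d *desc
	fmt.Println(unchecked(d) == nil) // false: the interface still carries the *desc type
	fmt.Println(checked(d) == nil)   // true
	fmt.Println(checked(&desc{"int"}).Name())
}
```

Without the check, a caller comparing the result against nil would get a misleading answer, which is why TypeOf can promise to return nil for a nil interface value.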

reflect.emptyInterface is a struct that holds a pointer to the type information and a raw pointer to the data. It is defined in \src\reflect\value.go

// emptyInterface is the header for an interface{} value.
type emptyInterface struct {
	typ  *rtype
	word unsafe.Pointer
}

The two addresses stored on the stack earlier, 0x4977e0 and 0x4c7d28, correspond to the typ and word fields respectively.

(gdb) x/16xb $rsp
0xc42003fed0:   0xe0    0x77    0x49    0x00    0x00    0x00    0x00    0x00
0xc42003fed8:   0x28    0x7d    0x4c    0x00    0x00    0x00    0x00    0x00

Together they form an emptyInterface struct in memory. Looking at what the two addresses point to, the value 0x01 stored at 0x4c7d28 is the value we passed to reflect.TypeOf.

0x4977e0:       0x08    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x4c7d28 <main.statictmp_0>:    0x01    0x00    0x00    0x00    0x00    0x00    0x00    0x00
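
The two-word layout can also be observed from Go code. The sketch below declares its own eface struct as a local mirror of that layout; this is an assumption about the gc toolchain, not a public API. It then compares the first word for several values.

```go
package main

import (
	"fmt"
	"unsafe"
)

// eface is a local mirror of the two-word layout described above.
// It is an assumption about the gc toolchain, not a public API.
type eface struct {
	typ  unsafe.Pointer
	word unsafe.Pointer
}

// typeWord returns the first word (the type descriptor pointer) of i.
func typeWord(i interface{}) unsafe.Pointer {
	return (*eface)(unsafe.Pointer(&i)).typ
}

func main() {
	var i interface{} = 1
	// An interface value occupies exactly two machine words.
	fmt.Println(unsafe.Sizeof(i) == 2*unsafe.Sizeof(uintptr(0)))
	// Values of the same type share one type descriptor.
	fmt.Println(typeWord(1) == typeWord(2))
	fmt.Println(typeWord(1) == typeWord("x"))
	// A nil interface has no type descriptor at all.
	fmt.Println(typeWord(nil) == nil)
}
```

Every int placed in an interface{} carries the same descriptor address, which is why reflect.TypeOf can ignore the second word entirely.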

reflect.rtype is defined in src\reflect\type.go

// rtype is the common implementation of most values.
// It is embedded in other struct types.
//
// rtype must be kept in sync with ../runtime/type.go:/^type._type.
type rtype struct {
	size       uintptr
	……
	str        nameOff  // string form
	ptrToThis  typeOff  // type for pointer to this type, may be zero
}

In reflect.TypeOf we see that reflect.toType implicitly converts a *reflect.rtype to reflect.Type, yet reflect.Type looks nothing like it: it is an interface

type Type interface {
	Align() int
	FieldAlign() int
	Method(int) Method
	……
}

Reading the Golang source seems to lead to a dead end here, so we go back to the assembly and look at how reflect.TypeOf is implemented

   0x000000000046f210 <+0>:     mov    0x8(%rsp),%rax
   0x000000000046f215 <+5>:     test   %rax,%rax
   0x000000000046f218 <+8>:     je     0x46f22c <reflect.TypeOf+28>
   0x000000000046f21a <+10>:    lea    0xaddbf(%rip),%rcx        # 0x51cfe0 <go.itab.*reflect.rtype,reflect.Type>
   0x000000000046f221 <+17>:    mov    %rcx,0x18(%rsp)
   0x000000000046f226 <+22>:    mov    %rax,0x20(%rsp)
   0x000000000046f22b <+27>:    retq   
   0x000000000046f22c <+28>:    xor    %eax,%eax
   0x000000000046f22e <+30>:    mov    %rax,%rcx
   0x000000000046f231 <+33>:    jmp    0x46f221 <reflect.TypeOf+17>

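As noted above, an emptyInterface struct had already been built on the stack before reflect.TypeOf was called. Because this function cares only about the type and not the value, it reads only the typ field, which inside the callee sits at rsp+0x08 (the call instruction pushed the return address at rsp).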

Interestingly, this code loads a memory address, 0x51cfe0, whose purpose we do not know yet. We will come back to it.

   0x0000000000487c9b <+75>:    mov    0x10(%rsp),%rax
   0x0000000000487ca0 <+80>:    mov    0x18(%rsp),%rcx
   0x0000000000487ca5 <+85>:    mov    %rax,0x38(%rsp)
   0x0000000000487caa <+90>:    mov    %rcx,0x40(%rsp)
   0x0000000000487caf <+95>:    mov    0xc0(%rax),%rax
   0x0000000000487cb6 <+102>:   mov    %rcx,(%rsp)
   0x0000000000487cba <+106>:   callq  *%rax

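After reflect.TypeOf returns, line 1 loads 0x51cfe0 into the rax register. Line 5 then reads the value stored at offset 0xC0 from that address, and line 7 calls the function that this value points to.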

(gdb) x/64bx 0x51cfe0+0xc0 
0x51d0a0 <go.itab.*reflect.rtype,reflect.Type+192>:     0x80    0xcc    0x46    0x00    0x00    0x00    0x00    0x00
0x51d0a8 <go.itab.*reflect.rtype,reflect.Type+200>:     0xf0    0xd6    0x46    0x00    0x00    0x00    0x00    0x00
0x51d0b0 <go.itab.*reflect.rtype,reflect.Type+208>:     0x60    0xd7    0x46    0x00    0x00    0x00    0x00    0x00
0x51d0b8 <go.itab.*reflect.rtype,reflect.Type+216>:     0xe0    0xbe    0x46    0x00    0x00    0x00    0x00    0x00
0x51d0c0 <go.itab.*reflect.rtype,reflect.Type+224>:     0xd0    0xd7    0x46    0x00    0x00    0x00    0x00    0x00
0x51d0c8 <go.itab.*reflect.rtype,reflect.Type+232>:     0x80    0xd8    0x46    0x00    0x00    0x00    0x00    0x00
0x51d0d0 <go.itab.*reflect.rtype,reflect.Type+240>:     0x90    0xcb    0x46    0x00    0x00    0x00    0x00    0x00
0x51d0d8 <go.itab.*reflect.rtype,reflect.Type+248>:     0x60    0xb9    0x46    0x00    0x00    0x00    0x00    0x00

Disassembling the function at 0x46cc80 shows that it is reflect.(*rtype).Name()

(gdb) disassemble 0x46cc80
Dump of assembler code for function reflect.(*rtype).Name:

Looking again at the memory around 0x51d0a0, the values follow a clear pattern: they are all addresses of functions defined on reflect.(*rtype).

(gdb) disassemble 0x46b960
Dump of assembler code for function reflect.(*rtype).Size:

(gdb) disassemble 0x46cb90
Dump of assembler code for function reflect.(*rtype).PkgPath:
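
The order of the slots in the dump (Name at +192, PkgPath at +240, Size at +248) suggests that the table is sorted by method name. As a rough check, the sketch below lists the method set of reflect.Type in the order the reflect package reports it. The helper names methodNames and indexOf are made up for this example. Newer Go releases have added methods to reflect.Type, so the exact list and indices will differ from the binary analyzed here.

```go
package main

import (
	"fmt"
	"reflect"
)

// methodNames lists the method set of the reflect.Type interface in the
// order the reflect package reports it.
func methodNames() []string {
	t := reflect.TypeOf((*reflect.Type)(nil)).Elem()
	names := make([]string, 0, t.NumMethod())
	for i := 0; i < t.NumMethod(); i++ {
		names = append(names, t.Method(i).Name)
	}
	return names
}

// indexOf returns the position of name in names, or -1 if it is absent.
func indexOf(names []string, name string) int {
	for i, n := range names {
		if n == name {
			return i
		}
	}
	return -1
}

func main() {
	names := methodNames()
	i := indexOf(names, "Name")
	fmt.Println("Name is method", i, "of", len(names))
	// The methods from Name onward, in table order.
	fmt.Println(names[i:])
}
```

The sketch deliberately does not turn the index into a byte offset, because that depends on the size of the table header in the Go release that built the binary.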

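These are also the methods exposed by the reflect.Type interface. When we call a method exposed by Type, the method of the same name on rtype is what actually runs underneath.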

type Type interface {
	Align() int
	FieldAlign() int
	……
	Name() string
	PkgPath() string
	Size() uintptr
	……
}
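
That a reflect.Type value is backed by a *rtype can also be checked without a debugger. A minimal sketch, where dynamicType is a helper name invented for this example; it prints the concrete type stored inside the interface value returned by reflect.TypeOf. With the gc toolchain this has been *reflect.rtype, though that is an implementation detail and not a documented guarantee.

```go
package main

import (
	"fmt"
	"reflect"
)

// dynamicType reports the concrete type stored inside the reflect.Type
// value that reflect.TypeOf returns for v.
func dynamicType(v interface{}) string {
	return fmt.Sprintf("%T", reflect.TypeOf(v))
}

func main() {
	fmt.Println(dynamicType(1))
	fmt.Println(dynamicType("s"))
	fmt.Println(dynamicType(struct{}{}))
}
```

All three calls report the same concrete pointer type, whatever kind of value is passed in.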

After reflect.TypeOf returns, the program calls reflect.(*rtype).Name(). The relevant implementation is

func (t *rtype) Name() string {
	if t.tflag&tflagNamed == 0 {
		return ""
	}
	s := t.String()
	……
	return s[i+1:]
}

func (t *rtype) String() string {
	s := t.nameOff(t.str).name()
	if t.tflag&tflagExtraStar != 0 {
		return s[1:]
	}
	return s
}

type name struct {
	bytes *byte
}

func (t *rtype) nameOff(off nameOff) name {
	return name{(*byte)(resolveNameOff(unsafe.Pointer(t), int32(off)))}
}

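This code shows that the type name depends on both the address of the rtype and its str field. The rtype in question is the one referenced by the typ field of the emptyInterface built before the call to reflect.TypeOf. We inspect that struct with gdb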

$4 = {
  size = 0x8, 
  ptrdata = 0x0, 
  hash = 0xf75371fa, 
  tflag = 0x7, 
  align = 0x8, 
  fieldAlign = 0x8, 
  kind = 0x82, 
  alg = 0x529a70, 
  gcdata = 0x4c6cd8, 
  str = 0x3a3, 
  ptrToThis = 0xac60
}
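
The value tflag = 0x7 in this dump has all three low flag bits set. The sketch below models how String and Name use those bits. It operates on plain strings, not on real type descriptors. The flag values follow reflect/type.go in Go releases of that period. Two things are assumptions: the stored string "*int", which is what tflagExtraStar implies, and the rule of taking the text after the last '.', which stands in for the part elided in the Name listing.

```go
package main

import (
	"fmt"
	"strings"
)

// Flag values as declared in reflect/type.go of Go releases from that period.
const (
	tflagUncommon  = 1 << 0
	tflagExtraStar = 1 << 1
	tflagNamed     = 1 << 2
)

// typeString models (*rtype).String: drop the leading '*' when
// tflagExtraStar is set.
func typeString(stored string, tflag uint8) string {
	if tflag&tflagExtraStar != 0 {
		return stored[1:]
	}
	return stored
}

// typeName models (*rtype).Name. Taking the text after the last '.' is an
// assumption about the elided part of the listing.
func typeName(stored string, tflag uint8) string {
	if tflag&tflagNamed == 0 {
		return ""
	}
	s := typeString(stored, tflag)
	return s[strings.LastIndexByte(s, '.')+1:]
}

func main() {
	fmt.Println(typeName("*int", 0x7))
	fmt.Println(typeName("*main.t20190107", 0x7))
	fmt.Println(typeName("[]int", 0) == "") // unnamed types have an empty name
}
```

Storing "*int" once lets the same bytes serve as the string form of both int and *int, which is the purpose of the extra-star flag.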

Finally we need to look at the more involved implementation of resolveNameOff.

func resolveNameOff(ptrInModule unsafe.Pointer, off nameOff) name {
	if off == 0 {
		return name{}
	}
	base := uintptr(ptrInModule)
	for md := &firstmoduledata; md != nil; md = md.next {
		if base >= md.types && base < md.etypes {
			res := md.types + uintptr(off)
			if res > md.etypes {
				println("runtime: nameOff", hex(off), "out of range", hex(md.types), "-", hex(md.etypes))
				throw("runtime: name offset out of range")
			}
			return name{(*byte)(unsafe.Pointer(res))}
		}
	}

	// No module found. see if it is a run time name.
	reflectOffsLock()
	res, found := reflectOffs.m[int32(off)]
	reflectOffsUnlock()
	if !found {
		println("runtime: nameOff", hex(off), "base", hex(base), "not in ranges:")
		for next := &firstmoduledata; next != nil; next = next.next {
			println("\ttypes", hex(next.types), "etypes", hex(next.etypes))
		}
		throw("runtime: name offset base pointer out of range")
	}
	return name{(*byte)(res)}
}

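Ignore the code from line 17 onward for now. Lines 6 to 15 walk the list of modules and test whether the rtype address falls inside a module's type region (base >= md.types && base < md.etypes). If it does, the function returns the address at offset off from the start of that region.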

So the offset in rtype.str is not relative to the start of the rtype itself. It is relative to the start of the type-information region ([md.types, md.etypes)) that contains the rtype.
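
The lookup can be modeled with plain integers. In the sketch below, module is a simplified stand-in for the runtime's moduledata and resolve models only the first loop of resolveNameOff. The region start 0x488000 is the value this article reports for the binary; the end address 0x4b0000 is made up for illustration.

```go
package main

import "fmt"

// module is a simplified stand-in for the runtime's moduledata:
// only the bounds of the type-information region are modeled.
type module struct {
	types, etypes uintptr
}

// resolve models the first loop of resolveNameOff: find the module whose
// [types, etypes) range contains base, then add off to that range's start.
func resolve(mods []module, base uintptr, off int32) (uintptr, bool) {
	for _, md := range mods {
		if base >= md.types && base < md.etypes {
			res := md.types + uintptr(off)
			if res > md.etypes {
				return 0, false // the runtime throws at this point
			}
			return res, true
		}
	}
	return 0, false
}

func main() {
	// 0x488000 is the region start reported in this article;
	// the end address 0x4b0000 is a made-up value for illustration.
	mods := []module{{types: 0x488000, etypes: 0x4b0000}}

	// rtype of int at 0x4977e0, with str = 0x3a3.
	addr, ok := resolve(mods, 0x4977e0, 0x3a3)
	fmt.Printf("%#x %v\n", addr, ok) // 0x4883a3 true

	// A base address outside every module is not resolved by this loop.
	_, ok = resolve(mods, 0x100, 0x3a3)
	fmt.Println(ok) // false
}
```

The rtype address is used only to select the module; the returned address is computed from the region start and the offset alone.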

Like the rtype data, firstmoduledata is initialized as global data. We use IDA to help locate it.

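These data are stored in the .noptrdata section of the ELF file, a section the Golang toolchain uses for global data when it builds a program. So this kind of "reflection" relies on information about variables and their types that the compiler quietly builds for us at compile time.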

Now look at the memory at the module's start address 0x488000 plus the offset rtype.str = 0x3a3.

This is where the string int comes from.
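
To make the last step concrete, here is a sketch of decoding a name record once its address is known. It assumes the record layout used by Go releases of that period: one flag byte, a two-byte big-endian length, then the name bytes. Later releases changed this encoding. The byte slice is a hypothetical stand-in for the memory at 0x488000+0x3a3, not a dump from the binary.

```go
package main

import "fmt"

// decodeName reads a name record laid out as: one flag byte, a two-byte
// big-endian length, then the name bytes. This layout is an assumption
// based on the Go releases of that period; later releases changed it.
func decodeName(b []byte) string {
	if len(b) < 3 {
		return ""
	}
	n := int(b[1])<<8 | int(b[2])
	if len(b) < 3+n {
		return ""
	}
	return string(b[3 : 3+n])
}

func main() {
	// Hypothetical bytes standing in for the record of the int type.
	record := []byte{0x00, 0x00, 0x04, '*', 'i', 'n', 't'}
	fmt.Println(decodeName(record)) // *int
}
```

The leading '*' is then removed by String when the extra-star flag is set, which leaves int.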

Custom struct types

package main

import (
	"fmt"
	"reflect"
)

type t20190107 struct {
	v string
}

func main() {
	i2 := t20190107{"s20190107"}
	t2 := reflect.TypeOf(i2)
	s2 := t2.Name()

	fmt.Println(s2)
}

This code deliberately defines a struct with a very distinctive name. Here is the disassembly.

   0x0000000000487c6f <+31>:    mov    %rbp,0xc0(%rsp)
   0x0000000000487c77 <+39>:    lea    0xc0(%rsp),%rbp
   0x0000000000487c7f <+47>:    movq   $0x0,0x58(%rsp)
   0x0000000000487c88 <+56>:    movq   $0x0,0x60(%rsp)
   0x0000000000487c91 <+65>:    lea    0x2f868(%rip),%rax        # 0x4b7500
   0x0000000000487c98 <+72>:    mov    %rax,0x58(%rsp)
   0x0000000000487c9d <+77>:    movq   $0x9,0x60(%rsp)
   0x0000000000487ca6 <+86>:    mov    %rax,0x98(%rsp)
   0x0000000000487cae <+94>:    movq   $0x9,0xa0(%rsp)
   0x0000000000487cba <+106>:   lea    0x196ff(%rip),%rax        # 0x4a13c0
   0x0000000000487cc1 <+113>:   mov    %rax,(%rsp)
   0x0000000000487cc5 <+117>:   lea    0x98(%rsp),%rax
   0x0000000000487ccd <+125>:   mov    %rax,0x8(%rsp)
   0x0000000000487cd2 <+130>:   callq  0x40c7e0 <runtime.convT2E>

Line 5 loads the address 0x4b7500. Its contents are the literal "s20190107" that we used to initialize the struct.

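Line 10 loads another address, 0x4a13c0. Based on the earlier analysis, this address holds data of type reflect.rtype. Because it is passed to runtime.convT2E, however, its type here is runtime._type.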

func convT2E(t *_type, elem unsafe.Pointer) (e eface) {
	if raceenabled {
		raceReadObjectPC(t, elem, getcallerpc(unsafe.Pointer(&t)), funcPC(convT2E))
	}
	if msanenabled {
		msanread(elem, t.size)
	}
	x := mallocgc(t.size, t, true)
	// TODO: We allocate a zeroed object only to overwrite it with actual data.
	// Figure out how to avoid zeroing. Also below in convT2Eslice, convT2I, convT2Islice.
	typedmemmove(t, x, elem)
	e._type = t
	e.data = x
	return
}

In fact runtime._type and reflect.rtype have the same definition

type _type struct {
	size       uintptr
	……
	str       nameOff
	ptrToThis typeOff
}

type rtype struct {
	size       uintptr
	……	
	str        nameOff  // string form
	ptrToThis  typeOff  // type for pointer to this type, may be zero
}

and so do reflect.emptyInterface and runtime.eface

type eface struct {
	_type *_type
	data  unsafe.Pointer
}

type emptyInterface struct {
	typ  *rtype
	word unsafe.Pointer
}

This means the results and experience from the basic-type analysis still apply here.

Printing the _type data with gdb shows that the type name offset is larger this time: 0x6184.

$3 = {
  size = 0x10, 
  ptrdata = 0x8, 
  hash = 0xe1c71878, 
  tflag = 0x7, 
  align = 0x8, 
  fieldalign = 0x8, 
  kind = 0x19, 
  alg = 0x529a90, 
  gcdata = 0x4c6dc4, 
  str = 0x6184, 
  ptrToThis = 0xae80
}

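Line 8 of runtime.convT2E allocates memory on the garbage-collected heap, and line 11 copies the data behind the raw pointer into it. Lines 12 and 13 then fill in a new eface struct.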
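
The copy made during the conversion is visible from ordinary Go code. A minimal sketch, where point and boxThenMutate are names invented for this example: the struct is stored in an interface, the original is then modified, and the value held by the interface is unaffected.

```go
package main

import "fmt"

type point struct{ x int }

// boxThenMutate stores p in an interface, then changes the original.
// It returns the original's field and the boxed copy's field.
func boxThenMutate() (orig, boxed int) {
	p := point{x: 1}
	var i interface{} = p // the value is copied when it is converted
	p.x = 2
	return p.x, i.(point).x
}

func main() {
	fmt.Println(boxThenMutate()) // 2 1
}
```

Whether the compiler emits a call to a runtime conversion function for this code depends on the Go release and optimization settings; the copy semantics are part of the language either way.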

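Execution then enters reflect.TypeOf, and the flow matches what we analyzed before. Finally, we look at the global region where the type data is stored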

Summary

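During compilation, the compiler stores the type information for each variable (runtime._type, equivalently reflect.rtype) in the .rodata section.
For a literal, reflect.TypeOf is called directly, and the result carries the table of function addresses for rtype.
For a variable, a runtime.convT2* conversion function first stores the value in space allocated on the garbage-collected heap, and reflect.TypeOf is called afterwards.
The runtime walks the module information stored in the .noptrdata section to find the module whose region contains the type information. It then takes the start of that module's type-information region as the base and adds the offset in rtype.str to locate the name in memory.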
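
The visible result of this machinery can be checked with the public reflect API alone. In the sketch below, describe is a helper name invented for this example. It returns the short name, the package-qualified string, and the package path for a basic type, for the struct used in this article, and for an unnamed slice type.

```go
package main

import (
	"fmt"
	"reflect"
)

type t20190107 struct {
	v string
}

// describe returns the short name, the package-qualified string, and the
// package path that reflection reports for v's type.
func describe(v interface{}) (name, full, pkg string) {
	t := reflect.TypeOf(v)
	return t.Name(), t.String(), t.PkgPath()
}

func main() {
	fmt.Println(describe(1))
	fmt.Println(describe(t20190107{"s20190107"}))
	fmt.Println(describe([]int{}))
}
```

Name is derived from the same stored string as String by dropping everything up to the last '.', and an unnamed type such as []int has an empty name.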


Source: CSDN blog

Author: breaksoftware
