Ethereum Source Code Analysis: Virtual Machine & Smart Contracts

justheone

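   This article walks through the code to analyse the design and runtime behaviour of the Ethereum virtual machine (EVM), along with the mechanisms involved in running smart contracts.
   1. Virtual machine stack and memory data structures
   The EVM is a stack machine; its underlying data structures are a stack and a memory.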

1) Let's first look at the stack data structure:
// Stack is an object for basic stack operations. Items popped to the stack are
// expected to be changed and modified. stack does not take care of adding newly
// initialised objects.
type Stack struct {
	data []*big.Int // each item is a *big.Int; the EVM treats it as a 256-bit (32-byte) word
}
func newstack() *Stack {
	return &Stack{data: make([]*big.Int, 0, 1024)} // pre-allocates capacity for a depth of 1024
}
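It also provides push, pop, dup (copy an item onto the top of the stack), peek (look at the top item), Back, swap (exchange the top item with a given item), and require (ensure the stack holds at least n items).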
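The operations listed above can be sketched in a few lines. This is a minimal illustration rather than the go-ethereum implementation: the type name wordStack and the helper stackDemo are hypothetical, and error handling is reduced to require.

```go
package main

import (
	"fmt"
	"math/big"
)

// wordStack is a hypothetical, simplified stand-in for the EVM Stack.
type wordStack struct {
	data []*big.Int
}

func newWordStack() *wordStack {
	return &wordStack{data: make([]*big.Int, 0, 1024)}
}

func (st *wordStack) push(v *big.Int) { st.data = append(st.data, v) }

func (st *wordStack) pop() *big.Int {
	v := st.data[len(st.data)-1]
	st.data = st.data[:len(st.data)-1]
	return v
}

// peek returns the top item without removing it.
func (st *wordStack) peek() *big.Int { return st.data[len(st.data)-1] }

// back returns the n-th item counted from the top (0 is the top).
func (st *wordStack) back(n int) *big.Int { return st.data[len(st.data)-n-1] }

// dup pushes a copy of the n-th item from the top (1 is the top).
func (st *wordStack) dup(n int) {
	st.push(new(big.Int).Set(st.data[len(st.data)-n]))
}

// swap exchanges the top item with the n-th item from the top (n >= 2).
func (st *wordStack) swap(n int) {
	top, other := len(st.data)-1, len(st.data)-n
	st.data[top], st.data[other] = st.data[other], st.data[top]
}

// require reports an error when fewer than n items are on the stack.
func (st *wordStack) require(n int) error {
	if len(st.data) < n {
		return fmt.Errorf("stack underflow: have %d, need %d", len(st.data), n)
	}
	return nil
}

// stackDemo pushes 1, 2, 3, then swaps and duplicates.
func stackDemo() (top, bottom int64, underflow bool) {
	st := newWordStack()
	for _, v := range []int64{1, 2, 3} {
		st.push(big.NewInt(v))
	}
	st.swap(3) // bottom-to-top: 3 2 1
	st.dup(2)  // bottom-to-top: 3 2 1 2
	return st.peek().Int64(), st.back(3).Int64(), st.require(5) != nil
}

func main() {
	fmt.Println(stackDemo()) // 2 3 true
}
```

Note that dup copies the value into a new big.Int, so later in-place changes to the copy do not affect the original item.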

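2) intPool
A pool of reusable big integers, limited to 256 entries.
type intPool struct {
	pool *Stack
}
It also has get and put functions: get takes an integer out of the pool, and put returns integers to the pool for reuse (resetting them to a default value when pool verification is enabled).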
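To make the reuse idea concrete, here is a small sketch of a bounded integer pool. The names bigPool and reuseDemo are hypothetical, and the overflow policy in put (drop once the limit is reached) is an assumption of this sketch, not a copy of the real code.

```go
package main

import (
	"fmt"
	"math/big"
)

const poolLimit = 256 // maximum number of pooled integers, as stated above

// bigPool is a hypothetical, simplified stand-in for intPool.
type bigPool struct {
	free []*big.Int
}

// get reuses a pooled integer when one is available, else allocates a new one.
func (p *bigPool) get() *big.Int {
	if n := len(p.free); n > 0 {
		v := p.free[n-1]
		p.free = p.free[:n-1]
		return v
	}
	return new(big.Int)
}

// put hands integers back for reuse; once the pool is full they are dropped
// and left to the garbage collector (an assumption of this sketch).
func (p *bigPool) put(vals ...*big.Int) {
	for _, v := range vals {
		if len(p.free) >= poolLimit {
			return
		}
		p.free = append(p.free, v)
	}
}

// reuseDemo reports whether an integer returned to the pool is handed out again.
func reuseDemo() bool {
	p := &bigPool{}
	a := p.get().SetUint64(42)
	p.put(a)
	b := p.get() // same object as a; it still holds 42 until overwritten
	return a == b
}

func main() {
	fmt.Println(reuseDemo()) // true
}
```

Because a pooled integer keeps its old value, callers are expected to overwrite it (for example with SetUint64) before use.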

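3) intPoolPool
A pool that manages intPool instances; its default capacity is 25.
type intPoolPool struct {
	pools []*intPool
	lock sync.Mutex
}
Its get and put functions take an intPool out of the pool or add one back, guarded by the mutex.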
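The locking pattern can be sketched as follows. The types scratch and scratchPools and the function concurrentDemo are hypothetical names used only for illustration; scratch stands in for an intPool.

```go
package main

import (
	"fmt"
	"sync"
)

const poolDefaultCap = 25 // default capacity mentioned above

// scratch stands in for an intPool; its contents do not matter here.
type scratch struct{ ints []int }

// scratchPools is a hypothetical, simplified stand-in for intPoolPool.
type scratchPools struct {
	mu    sync.Mutex
	pools []*scratch
}

// get hands out a retained pool if there is one, else creates a new one.
func (pp *scratchPools) get() *scratch {
	pp.mu.Lock()
	defer pp.mu.Unlock()
	if n := len(pp.pools); n > 0 {
		p := pp.pools[n-1]
		pp.pools = pp.pools[:n-1]
		return p
	}
	return &scratch{}
}

// put keeps a pool for later reuse; pools beyond the capacity are dropped.
func (pp *scratchPools) put(p *scratch) {
	pp.mu.Lock()
	defer pp.mu.Unlock()
	if len(pp.pools) < cap(pp.pools) {
		pp.pools = append(pp.pools, p)
	}
}

// concurrentDemo borrows and returns pools from many goroutines and
// reports how many pools are retained afterwards.
func concurrentDemo(workers int) int {
	pp := &scratchPools{pools: make([]*scratch, 0, poolDefaultCap)}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p := pp.get()
			pp.put(p)
		}()
	}
	wg.Wait()
	return len(pp.pools)
}

func main() {
	n := concurrentDemo(100)
	fmt.Println(n >= 1 && n <= poolDefaultCap) // true
}
```

The exact number of retained pools depends on goroutine scheduling, which is why the example only checks that it stays within the capacity.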

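4) memory
A simple memory model. It also records the most recent gas cost (lastGasCost): memoryGasCost subtracts this value from the new total, so each expansion is charged only for the increase.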
type Memory struct {
store []byte
lastGasCost uint64
}

func NewMemory() *Memory {
return &Memory{}
}
Space is first allocated with Resize:
// Resize resizes the memory to size
func (m *Memory) Resize(size uint64) {
if uint64(m.Len()) < size {
m.store = append(m.store, make([]byte, size-uint64(m.Len()))...)
}
}
Values are then written with Set:
// Set sets offset + size to value
func (m *Memory) Set(offset, size uint64, value []byte) {
// length of store may never be less than offset + size.
// The store should be resized PRIOR to setting the memory
if size > uint64(len(m.store)) {
panic("INVALID memory: store empty")
}
// It's possible the offset is greater than 0 and size equals 0. This is because
// the calcMemSize (common.go) could potentially return 0 when size is zero (NO-OP)
if size > 0 {
copy(m.store[offset:offset+size], value)
}
}
It also provides Get, GetPtr, Len, Data, and Print. Note that GetPtr may index the underlying slice out of range.
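The resize-before-write contract, and the difference between a copying read and a pointer-style read, can be sketched as below. The type byteMemory and its methods are hypothetical; unlike the Set shown above, this sketch returns an error instead of panicking and checks offset+size against the store length.

```go
package main

import "fmt"

// byteMemory is a hypothetical, simplified stand-in for Memory.
type byteMemory struct{ store []byte }

// resize grows the store to at least size bytes; it never shrinks.
func (m *byteMemory) resize(size uint64) {
	if uint64(len(m.store)) < size {
		m.store = append(m.store, make([]byte, size-uint64(len(m.store)))...)
	}
}

// set copies value into [offset, offset+size); the caller must resize first.
func (m *byteMemory) set(offset, size uint64, value []byte) error {
	if offset+size > uint64(len(m.store)) {
		return fmt.Errorf("write [%d,%d) exceeds memory of %d bytes",
			offset, offset+size, len(m.store))
	}
	copy(m.store[offset:offset+size], value)
	return nil
}

// getCopy returns a fresh copy of the requested range.
func (m *byteMemory) getCopy(offset, size uint64) []byte {
	out := make([]byte, size)
	copy(out, m.store[offset:offset+size])
	return out
}

// getPtr returns a view that aliases the store; it does no bounds check of
// its own, so an unchecked range panics at the slice expression.
func (m *byteMemory) getPtr(offset, size uint64) []byte {
	return m.store[offset : offset+size]
}

func memoryDemo() (copied, aliased byte, setBeforeResizeFails bool) {
	m := &byteMemory{}
	setBeforeResizeFails = m.set(0, 4, []byte{1, 2, 3, 4}) != nil
	m.resize(32)
	_ = m.set(0, 4, []byte{1, 2, 3, 4})
	c, p := m.getCopy(0, 4), m.getPtr(0, 4)
	m.store[0] = 9 // a later write to memory
	return c[0], p[0], setBeforeResizeFails
}

func main() {
	fmt.Println(memoryDemo()) // 1 9 true
}
```

The copy keeps the old byte while the aliasing view sees the later write, which is why a pointer-style read has to be used with care.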

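5) Some utility functions, for example ones that check whether a stack can perform a dup or swap operation:
func makeDupStackFunc(n int) stackValidationFunc
func makeSwapStackFunc(n int) stackValidationFunc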
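A rough sketch of how such validation functions could be built from one generic helper is shown below. It assumes a dup called with n needs n items and leaves n+1, and a swap called with n needs n items and leaves n. The names validator, makeStackCheck, and checkOK are hypothetical, and the sketch works on the stack depth only rather than on a real Stack.

```go
package main

import "fmt"

const stackLimit = 1024 // maximum stack depth

// validator checks a stack depth before an opcode runs (hypothetical type).
type validator func(depth int) error

// makeStackCheck builds a validator for an operation that pops `pop` items
// and pushes `push` items.
func makeStackCheck(pop, push int) validator {
	return func(depth int) error {
		if depth < pop {
			return fmt.Errorf("stack underflow: have %d, need %d", depth, pop)
		}
		if depth+push-pop > stackLimit {
			return fmt.Errorf("stack limit reached: %d", stackLimit)
		}
		return nil
	}
}

// Assumption: dup with n needs n items and leaves n+1; swap leaves n.
func makeDupCheck(n int) validator  { return makeStackCheck(n, n+1) }
func makeSwapCheck(n int) validator { return makeStackCheck(n, n) }

// checkOK adapts a validator result to a bool for easy printing.
func checkOK(v validator, depth int) bool { return v(depth) == nil }

func main() {
	dup2 := makeDupCheck(2)
	// Too few items, enough items, and a full stack.
	fmt.Println(checkOK(dup2, 1), checkOK(dup2, 2), checkOK(dup2, 1024)) // false true false
}
```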

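2. Virtual machine instructions, jump table, and interpreter
An operation holds the functions and flags needed to execute one instruction; the jump table is a [256]operation array.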
type operation struct {
	// execute is the operation function
	execute executionFunc
	// gasCost is the gas function and returns the gas required for execution
	gasCost gasFunc
	// validateStack validates the stack (size) for the operation
	validateStack stackValidationFunc
	// memorySize returns the memory size required for the operation
	memorySize memorySizeFunc

	halts   bool // indicates whether the operation should halt further execution
	jumps   bool // indicates whether the program counter should not increment
	writes  bool // determines whether this is a state-modifying operation
	valid   bool // indicates whether the retrieved operation is valid and known
	reverts bool // determines whether the operation reverts state (implicitly halts)
	returns bool // determines whether the operation sets the return data content
}
Three instruction sets are then defined:
newHomesteadInstructionSet
newByzantiumInstructionSet
newConstantinopleInstructionSet
Each later set is built on top of the previous one.
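The layering can be sketched like this: a later table starts as a copy of the earlier one and adds entries. The types op and table and the functions baseSet, laterSet, and supports are hypothetical; only the opcode values (STOP 0x00, ADD 0x01, REVERT 0xfd) are the real EVM assignments.

```go
package main

import "fmt"

// op is a hypothetical, stripped-down operation entry.
type op struct {
	name  string
	valid bool
}

type table [256]op

// baseSet is an illustrative earlier instruction set.
func baseSet() table {
	var t table
	t[0x00] = op{name: "STOP", valid: true}
	t[0x01] = op{name: "ADD", valid: true}
	return t
}

// laterSet starts from the earlier table and adds opcodes on top,
// mirroring how each fork's set is derived from the previous one.
func laterSet() table {
	t := baseSet() // arrays are values in Go, so this is an independent copy
	t[0xfd] = op{name: "REVERT", valid: true}
	return t
}

// supports reports whether the table knows the given opcode.
func supports(t table, code byte) bool { return t[code].valid }

func main() {
	fmt.Println(supports(baseSet(), 0xfd), supports(laterSet(), 0xfd)) // false true
}
```

Entries that are never assigned keep valid == false, which is what lets an interpreter reject unknown opcodes with a single lookup.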

instructions.go implements the concrete instructions, for example:
func opPc(pc *uint64, interpreter *EVMInterpreter, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
	stack.push(interpreter.intPool.get().SetUint64(*pc))
	return nil, nil
}
func opMsize(pc *uint64, interpreter *EVMInterpreter, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
stack.push(interpreter.intPool.get().SetInt64(int64(memory.Len())))
return nil, nil
}

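gas_table.go contains the functions that return the gas consumed by each instruction. Essentially the only error they return is the integer overflow error errGasUintOverflow.
For example: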
func memoryGasCost(mem *Memory, newMemSize uint64) (uint64, error) {
if newMemSize == 0 {
return 0, nil
}
// The maximum that will fit in a uint64 is max_word_count - 1
// anything above that will result in an overflow.
// Additionally, a newMemSize which results in a
// newMemSizeWords larger than 0x7ffffffff will cause the square operation
// to overflow.
// The constant 0xffffffffe0 is the highest number that can be used without
// overflowing the gas calculation
if newMemSize > 0xffffffffe0 {
return 0, errGasUintOverflow
}
newMemSizeWords := toWordSize(newMemSize)
newMemSize = newMemSizeWords * 32
if newMemSize > uint64(mem.Len()) {
square := newMemSizeWords * newMemSizeWords
linCoef := newMemSizeWords * params.MemoryGas
quadCoef := square / params.QuadCoeffDiv
newTotalFee := linCoef + quadCoef

    fee := newTotalFee - mem.lastGasCost
    mem.lastGasCost = newTotalFee

    return fee, nil
}
return 0, nil

}
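This function computes the fee for memory expansion and charges only when memory actually grows. With w the new size in 32-byte words, the fee is w*w/params.QuadCoeffDiv + w*params.MemoryGas (MemoryGas is 3), minus the memory's most recent total cost (lastGasCost).
The file contains many more gas functions like this one, one for each kind of instruction.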
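A short worked example of the formula, assuming the go-ethereum values params.MemoryGas = 3 and params.QuadCoeffDiv = 512. The function names totalCost and expansionFee are hypothetical, and the overflow guard from the real function is left out.

```go
package main

import "fmt"

const (
	memoryGas    = 3   // assumed value of params.MemoryGas
	quadCoeffDiv = 512 // assumed value of params.QuadCoeffDiv
)

// totalCost returns the cumulative gas for a memory of the given byte size.
func totalCost(sizeBytes uint64) uint64 {
	words := (sizeBytes + 31) / 32 // round up to whole 32-byte words
	return words*memoryGas + words*words/quadCoeffDiv
}

// expansionFee is what one expansion step costs: new total minus old total.
func expansionFee(oldBytes, newBytes uint64) uint64 {
	if newBytes <= oldBytes {
		return 0 // memory never shrinks and is never charged twice
	}
	return totalCost(newBytes) - totalCost(oldBytes)
}

func main() {
	// 1 word: 3 + 0; 32 words: 96 + 2; growing from 32 to 64 words: 200 - 98.
	fmt.Println(totalCost(32), totalCost(1024), expansionFee(1024, 2048)) // 3 98 102
}
```

The quadratic term is what makes very large memories expensive, while small contracts pay an almost linear cost.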

interpreter.go: the interpreter
// Config are the configuration options for the Interpreter
type Config struct {
// Debug enabled debugging Interpreter options
Debug bool
// Tracer is the op code logger
Tracer Tracer
// NoRecursion disabled Interpreter call, callcode,
// delegate call and create.
NoRecursion bool
// Enable recording of SHA3/keccak preimages
EnablePreimageRecording bool
// JumpTable contains the EVM instruction table. This
// may be left uninitialised and will be set to the default
// table.
JumpTable [256]operation
}

// Interpreter is used to run Ethereum based contracts and will utilise the
// passed environment to query external sources for state information.
// The Interpreter will run the byte code VM based on the passed
// configuration.
type Interpreter interface {
// Run loops and evaluates the contract's code with the given input data and returns
// the return byte-slice and an error if one occurred.
Run(contract *Contract, input []byte) ([]byte, error)
// CanRun tells if the contract, passed as an argument, can be
// run by the current interpreter. This is meant so that the
// caller can do something like:
//
// for _, interpreter := range interpreters {
//   if interpreter.CanRun(contract.code) {
//     interpreter.Run(contract.code, input)
//   }
// }
CanRun([]byte) bool
// IsReadOnly reports if the interpreter is in read only mode.
IsReadOnly() bool
// SetReadOnly sets (or unsets) read only mode in the interpreter.
SetReadOnly(bool)
}

//EVMInterpreter represents an EVM interpreter
type EVMInterpreter struct {
evm *EVM
cfg Config
gasTable params.GasTable // gas prices for many operations
intPool *intPool
readOnly bool // Whether to throw on stateful modifications
returnData []byte // Last CALL's return data for subsequent reuse
}

// NewInterpreter returns a new instance of the Interpreter.
func NewEVMInterpreter(evm *EVM, cfg Config) *EVMInterpreter {
	// We use the STOP instruction to check whether
	// the jump table was initialised. If it was not,
	// we set the default jump table.
	if !cfg.JumpTable[STOP].valid {
		switch {
		case evm.ChainConfig().IsConstantinople(evm.BlockNumber):
			cfg.JumpTable = constantinopleInstructionSet
		case evm.ChainConfig().IsByzantium(evm.BlockNumber):
			cfg.JumpTable = byzantiumInstructionSet
		case evm.ChainConfig().IsHomestead(evm.BlockNumber):
			cfg.JumpTable = homesteadInstructionSet
		default:
			cfg.JumpTable = frontierInstructionSet
		}
	}
	return &EVMInterpreter{
		evm:      evm,
		cfg:      cfg,
		gasTable: evm.ChainConfig().GasTable(evm.BlockNumber),
		intPool:  newIntPool(),
	}
}

func (in *EVMInterpreter) enforceRestrictions(op OpCode, operation operation, stack *Stack) error {
if in.evm.chainRules.IsByzantium {
if in.readOnly {
// If the interpreter is operating in readonly mode, make sure no
// state-modifying operation is performed. The 3rd stack item
// for a call operation is the value. Transferring value from one
// account to the others means the state is modified and should also
// return with an error.
if operation.writes || (op == CALL && stack.Back(2).BitLen() > 0) {
return errWriteProtection
}
}
}
return nil
}

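The other important function is Run. It loops over the contract's code with the given input, evaluating it instruction by instruction, and returns the returned byte slice, plus an error if one occurred. Any error returned by the interpreter other than errExecutionReverted is treated as having consumed all gas.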
func (in *EVMInterpreter) Run(contract *Contract, input []byte) (ret []byte, err error) {
if in.intPool == nil {
in.intPool = poolOfIntPools.get()
defer func() {
poolOfIntPools.put(in.intPool)
in.intPool = nil
}()
}

// Increment the call depth which is restricted to 1024
in.evm.depth++
defer func() { in.evm.depth-- }()

// Reset the previous call's return data. It's unimportant to preserve the old buffer
// as every returning call will return new data anyway.
in.returnData = nil

// Don't bother with the execution if there's no code.
if len(contract.Code) == 0 {
    return nil, nil
}

var (
    op    OpCode        // current opcode
    mem   = NewMemory() // bound memory
    stack = newstack()  // local stack
    // For optimisation reason we're using uint64 as the program counter.
    // It's theoretically possible to go above 2^64. The YP defines the PC
    // to be uint256. Practically much less so feasible.
    pc   = uint64(0) // program counter
    cost uint64
    // copies used by tracer
    pcCopy  uint64 // needed for the deferred Tracer
    gasCopy uint64 // for Tracer to log gas remaining before execution
    logged  bool   // deferred Tracer should ignore already logged steps
)
contract.Input = input

// Reclaim the stack as an int pool when the execution stops
defer func() { in.intPool.put(stack.data...) }()
    // In debug mode, report a failed step to the tracer when Run returns.
if in.cfg.Debug {
    defer func() {
        if err != nil {
            if !logged {
                in.cfg.Tracer.CaptureState(in.evm, pcCopy, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
            } else {
                in.cfg.Tracer.CaptureFault(in.evm, pcCopy, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
            }
        }
    }()
}
    for atomic.LoadInt32(&in.evm.abort) == 0 {
    if in.cfg.Debug {
        // Capture pre-execution values for tracing.
        logged, pcCopy, gasCopy = false, pc, contract.Gas
    }
    // Fetch the next instruction to execute.
    op = contract.GetOp(pc)
    operation := in.cfg.JumpTable[op]
    if !operation.valid {
        return nil, fmt.Errorf("invalid opcode 0x%x", int(op))
    }
    // Check that the stack has enough items and enough room for the operation.
    if err := operation.validateStack(stack); err != nil {
        return nil, err
    }
    // If the operation is valid, enforce and write restrictions
    if err := in.enforceRestrictions(op, operation, stack); err != nil {
        return nil, err
    }

    var memorySize uint64
    // calculate the new memory size and expand the memory to fit
    // the operation
    if operation.memorySize != nil {
        memSize, overflow := bigUint64(operation.memorySize(stack))
        if overflow {
            return nil, errGasUintOverflow
        }
        // memory is expanded in words of 32 bytes. Gas
        // is also calculated in words.
        if memorySize, overflow = math.SafeMul(toWordSize(memSize), 32); overflow {
            return nil, errGasUintOverflow
        }
    }
      // Compute the gas cost and deduct it; fail with out of gas if there is not enough.
      cost, err = operation.gasCost(in.gasTable, in.evm, contract, stack, mem, memorySize)
    if err != nil || !contract.UseGas(cost) {
        return nil, ErrOutOfGas
    }
    if memorySize > 0 {
        mem.Resize(memorySize)
    }

    if in.cfg.Debug {
        in.cfg.Tracer.CaptureState(in.evm, pc, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
        logged = true
    }

    // execute the operation
    res, err := operation.execute(&pc, in, contract, mem, stack)
    // verifyPool is a build flag. Pool verification makes sure the integrity
    // of the integer pool by comparing values to a default value.
    if verifyPool {
        verifyIntegerPool(in.intPool)
    }
    // if the operation clears the return data (e.g. it has returning data)
    // set the last return to the result of the operation.
    if operation.returns { // record the return data; only the most recent one is kept
        in.returnData = res
    }

    switch {
    case err != nil:
        return nil, err
    case operation.reverts:
        return res, errExecutionReverted
    case operation.halts:
        return res, nil
    case !operation.jumps:
        pc++
    }
}
return nil, nil

}
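The order of steps in Run (fetch, validate, charge gas, execute, advance the program counter) can be reproduced in a toy interpreter. Everything in this sketch is hypothetical and greatly simplified: it supports only STOP, ADD, and PUSH1, uses uint64 stack items, has no memory or tracer, and returns the top of the stack on STOP purely so the result is visible.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	opStop  = 0x00
	opAdd   = 0x01
	opPush1 = 0x60
)

type vm struct {
	code  []byte
	pc    uint64
	gas   uint64
	stack []uint64
}

// instr is a hypothetical, stripped-down counterpart of operation.
type instr struct {
	exec     func(m *vm)
	gas      uint64
	minStack int
	halts    bool
	valid    bool
}

func newTable() [256]instr {
	var t [256]instr
	t[opStop] = instr{exec: func(m *vm) {}, halts: true, valid: true}
	t[opAdd] = instr{exec: func(m *vm) {
		n := len(m.stack)
		m.stack = append(m.stack[:n-2], m.stack[n-2]+m.stack[n-1])
	}, gas: 3, minStack: 2, valid: true}
	t[opPush1] = instr{exec: func(m *vm) {
		m.pc++ // move to the immediate byte that follows the opcode
		var v byte
		if m.pc < uint64(len(m.code)) {
			v = m.code[m.pc]
		}
		m.stack = append(m.stack, uint64(v))
	}, gas: 3, valid: true}
	return t
}

var errOutOfGas = errors.New("out of gas")

// run mirrors the order of checks in Run: decode, validate, charge, execute.
func run(code []byte, gas uint64) (top uint64, gasLeft uint64, err error) {
	m := &vm{code: code, gas: gas}
	table := newTable()
	for {
		var op byte // reading past the end of the code yields STOP
		if m.pc < uint64(len(code)) {
			op = code[m.pc]
		}
		in := table[op]
		switch {
		case !in.valid:
			return 0, m.gas, fmt.Errorf("invalid opcode 0x%x", op)
		case len(m.stack) < in.minStack:
			return 0, m.gas, errors.New("stack underflow")
		case m.gas < in.gas:
			return 0, m.gas, errOutOfGas
		}
		m.gas -= in.gas
		in.exec(m)
		if in.halts {
			if n := len(m.stack); n > 0 {
				top = m.stack[n-1]
			}
			return top, m.gas, nil
		}
		m.pc++
	}
}

func main() {
	code := []byte{opPush1, 2, opPush1, 3, opAdd, opStop}
	fmt.Println(run(code, 100)) // 5 91 <nil>
}
```

As in the real loop, a failed check stops execution before the instruction runs, so an instruction never executes without having been paid for.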

Virtual machine
contract.go
type ContractRef interface{ Address() common.Address } is a reference to the object backing a contract.
AccountRef implements this interface.
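type Contract struct {
	// CallerAddress is the user account that initialised this contract. If the
	// contract was initialised by a contract call, it is set to that caller.
	CallerAddress common.Address
	caller ContractRef
	self ContractRef
	jumpdests destinations // result of JUMPDEST analysis
	Code []byte // contract code
	CodeHash common.Hash // hash of the code
	CodeAddr *common.Address // address of the code
	Input []byte // input data
	Gas uint64 // gas remaining for this contract
	value *big.Int // ether value carried with this call
	Args []byte // arguments
	DelegateCall bool
}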

Constructor
func NewContract(caller ContractRef, object ContractRef, value *big.Int, gas uint64) *Contract {
	c := &Contract{CallerAddress: caller.Address(), caller: caller, self: object, Args: nil}
	// If the caller is itself a contract, reuse the caller's jumpdests.
	if parent, ok := caller.(*Contract); ok {
		// Reuse JUMPDEST analysis from parent context if available.
		c.jumpdests = parent.jumpdests
	} else {
		c.jumpdests = make(destinations)
	}
	// Gas should be a pointer so it can safely be reduced through the run
	// This pointer will be off the state transition
	c.Gas = gas
	// ensures a value is set
	c.value = value
	return c
}
// AsDelegate marks the contract as a delegate call and returns it so that calls
// can be chained. The caller is a contract, and this contract's CallerAddress
// and value are set to the caller's corresponding values.
func (c *Contract) AsDelegate() *Contract {
	c.DelegateCall = true
	// NOTE: caller must, at all times be a contract. It should never happen
	// that caller is something other than a Contract.
	parent := c.caller.(*Contract)
	c.CallerAddress = parent.CallerAddress
	c.value = parent.value
	return c
}
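The effect of AsDelegate can be illustrated with a stripped-down call frame. The type frame, its fields, and delegateDemo are hypothetical; addresses are plain strings and values plain integers to keep the sketch short.

```go
package main

import "fmt"

// frame is a hypothetical, simplified stand-in for Contract.
type frame struct {
	callerAddr string
	value      uint64
	parent     *frame // the calling frame; must be set before asDelegate
	delegate   bool
}

// asDelegate makes the frame inherit the caller address and value of the
// frame that called it, and returns the frame so calls can be chained.
func (f *frame) asDelegate() *frame {
	f.delegate = true
	f.callerAddr = f.parent.callerAddr
	f.value = f.parent.value
	return f
}

// delegateDemo models user -> proxy -> library, where the last hop is a
// delegate call.
func delegateDemo() (string, uint64) {
	proxy := &frame{callerAddr: "user", value: 5}
	lib := (&frame{callerAddr: "proxy", parent: proxy}).asDelegate()
	return lib.callerAddr, lib.value
}

func main() {
	fmt.Println(delegateDemo()) // user 5
}
```

After the delegate call, code in the library frame sees the original user as its caller and the original value, not the proxy's.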
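// GetOp returns the next instruction to execute.
It reads the n-th byte of the contract's Code and returns it as an OpCode; the code is not split into a separate instruction list beforehand, and the instruction is not taken from Input or Args.
func (c *Contract) GetOp(n uint64) OpCode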
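A minimal sketch of an opcode lookup by byte offset, assuming that reading past the end of the code yields 0, which is the STOP opcode. The function name opAt is hypothetical.

```go
package main

import "fmt"

// opAt returns the opcode byte at offset n, or 0 (STOP) past the end of code.
func opAt(code []byte, n uint64) byte {
	if n < uint64(len(code)) {
		return code[n]
	}
	return 0
}

func main() {
	code := []byte{0x60, 0x2a, 0x00} // PUSH1 0x2a, STOP
	fmt.Printf("0x%02x 0x%02x 0x%02x\n", opAt(code, 0), opAt(code, 2), opAt(code, 99))
	// 0x60 0x00 0x00
}
```

Treating out-of-range reads as STOP means a program that runs off the end of its code halts cleanly instead of faulting.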

// The next two



Source: Jianshu

Author: justheone
