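This article analyzes, at the source-code level, the design and execution mechanics of the Ethereum Virtual Machine (EVM) and how smart contracts are run on it.
1. Virtual machine stack and memory data structures
The EVM is a stack machine. Its core runtime data structures are a stack and a memory.
1) Let's first look at the Stack data structure: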
// Stack is an object for basic stack operations. Items popped to the stack are
// expected to be changed and modified. stack does not take care of adding newly
// initialised objects.
type Stack struct {
	data []*big.Int // each entry is a *big.Int holding one 256-bit (32-byte) EVM word
}
func newstack() *Stack {
	return &Stack{data: make([]*big.Int, 0, 1024)} // preallocate capacity for a depth of 1024
}
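The stack also provides push, pop, dup (duplicate a stack item, for example the top one), peek (look at the top item), Back, swap (exchange the top item with a given item) and require (ensure the stack holds at least n items).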
2) intPool
A pool of reusable big.Int values, limited to 256 entries.
type intPool struct {
pool *Stack
}
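It provides get and put: get takes an integer out of the pool (or hands out a new one when the pool is empty), and put returns integers to the pool for reuse.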
3) intPoolPool
A pool that manages intPools, with a default capacity of 25.
type intPoolPool struct {
pools []*intPool
lock sync.Mutex
}
Its get and put functions take an intPool out of the pool or put one back, guarded by the mutex.
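The mutex-guarded get and put can be sketched as below. The constant name poolDefaultCap, the simplified intPool and the decision to drop pools when the backing slice is full are assumptions for illustration.

```go
package main

import (
	"fmt"
	"math/big"
	"sync"
)

const poolDefaultCap = 25 // assumed name for the default capacity quoted above

// intPool is reduced to a plain slice for this sketch.
type intPool struct{ pool []*big.Int }

type intPoolPool struct {
	pools []*intPool
	lock  sync.Mutex
}

// get pops a pooled intPool, or creates a fresh one if none is available.
func (ipp *intPoolPool) get() *intPool {
	ipp.lock.Lock()
	defer ipp.lock.Unlock()
	if n := len(ipp.pools); n > 0 {
		ip := ipp.pools[n-1]
		ipp.pools = ipp.pools[:n-1]
		return ip
	}
	return &intPool{}
}

// put stores an intPool for reuse unless the backing slice is already full.
func (ipp *intPoolPool) put(ip *intPool) {
	ipp.lock.Lock()
	defer ipp.lock.Unlock()
	if len(ipp.pools) < cap(ipp.pools) {
		ipp.pools = append(ipp.pools, ip)
	}
}

func main() {
	pools := &intPoolPool{pools: make([]*intPool, 0, poolDefaultCap)}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ip := pools.get() // each goroutine owns its intPool exclusively
			ip.pool = append(ip.pool, new(big.Int))
			pools.put(ip)
		}()
	}
	wg.Wait()
	fmt.Println(len(pools.pools) <= poolDefaultCap)
}
```

Only the pool of pools needs the lock: an intPool taken with get is used by a single interpreter run until it is handed back with put.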
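4) Memory
A simple memory model that also records the last gas cost charged for memory. The reason is that memory expansion is charged incrementally: memoryGasCost computes the total fee for the new size and charges only the difference from lastGasCost.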
type Memory struct {
store []byte
lastGasCost uint64
}
func NewMemory() *Memory {
return &Memory{}
}
Space is first allocated with Resize:
// Resize resizes the memory to size
func (m *Memory) Resize(size uint64) {
if uint64(m.Len()) < size {
m.store = append(m.store, make([]byte, size-uint64(m.Len()))...)
}
}
Then Set writes the value:
// Set sets offset + size to value
func (m *Memory) Set(offset, size uint64, value []byte) {
// length of store may never be less than offset + size.
// The store should be resized PRIOR to setting the memory
if size > uint64(len(m.store)) {
panic("INVALID memory: store empty")
}
// It's possible the offset is greater than 0 and size equals 0. This is because
// the calcMemSize (common.go) could potentially return 0 when size is zero (NO-OP)
if size > 0 {
copy(m.store[offset:offset+size], value)
}
}
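The Resize-then-Set order can be seen in a small standalone sketch. Note one deliberate difference, which is an assumption of this example rather than the quoted code: Set here checks offset+size against the store length, whereas the excerpt above compares size only.

```go
package main

import "fmt"

// Memory is a cut-down version of the model above, without gas bookkeeping.
type Memory struct {
	store []byte
}

func (m *Memory) Len() int { return len(m.store) }

// Resize grows the store to size bytes; it never shrinks it.
func (m *Memory) Resize(size uint64) {
	if uint64(m.Len()) < size {
		m.store = append(m.store, make([]byte, size-uint64(m.Len()))...)
	}
}

// Set copies value into [offset, offset+size). The store must already be
// large enough, so callers resize first.
func (m *Memory) Set(offset, size uint64, value []byte) {
	if size > 0 {
		if offset+size > uint64(len(m.store)) {
			panic("invalid memory: store too small")
		}
		copy(m.store[offset:offset+size], value)
	}
}

func main() {
	m := &Memory{}
	m.Resize(64) // two 32-byte words, zero-filled
	m.Set(32, 4, []byte{0xde, 0xad, 0xbe, 0xef})
	m.Resize(32) // no-op: the store is already larger
	fmt.Println(m.Len(), m.store[32:36])
}
```

Separating the two steps lets the interpreter charge gas for the expansion before any bytes are written.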
Memory also provides Get, GetPtr, Len, Data and Print. Note that GetPtr may be exposed to an out-of-range slice access.
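5) Some utility functions, for example to check whether a given stack can perform a dup or swap operation:
func makeDupStackFunc(n int) stackValidationFunc
func makeSwapStackFunc(n int) stackValidationFunc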
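One way such validators can be built is from a shared helper that takes the number of items an operation needs and the number it leaves behind. The helper name makeStackFunc, the stackLimit constant and the uint64-based Stack are assumptions for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

const stackLimit = 1024 // assumed maximum stack depth

// Stack holds uint64 values here to keep the sketch short.
type Stack struct{ data []uint64 }

type stackValidationFunc func(*Stack) error

// makeStackFunc checks that pop items are present and that the stack stays
// within the limit once those items are replaced by push items.
func makeStackFunc(pop, push int) stackValidationFunc {
	return func(st *Stack) error {
		if len(st.data) < pop {
			return errors.New("stack underflow")
		}
		if len(st.data)+push-pop > stackLimit {
			return errors.New("stack limit reached")
		}
		return nil
	}
}

// A dup needs n items and leaves n+1 behind.
func makeDupStackFunc(n int) stackValidationFunc { return makeStackFunc(n, n+1) }

// A swap needs n items and leaves the depth unchanged.
func makeSwapStackFunc(n int) stackValidationFunc { return makeStackFunc(n, n) }

func main() {
	one := &Stack{data: []uint64{7}}
	fmt.Println(makeDupStackFunc(1)(one))  // one item is enough to duplicate the top
	fmt.Println(makeSwapStackFunc(2)(one)) // swapping the top two items needs two
}
```

In this formulation SWAP1 would be validated with makeSwapStackFunc(2), because it touches the top two items.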
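2. Virtual machine instructions, jump table and interpreter
An operation bundles the functions and flags needed to execute one instruction. The jump table is an array of type [256]operation, indexed by opcode.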
type operation struct {
	// execute is the operation function
	execute executionFunc
	// gasCost is the gas function and returns the gas required for execution
	gasCost gasFunc
	// validateStack validates the stack (size) for the operation
	validateStack stackValidationFunc
	// memorySize returns the memory size required for the operation
	memorySize memorySizeFunc
	halts   bool // indicates whether the operation should halt further execution
	jumps   bool // indicates whether the program counter should not increment
	writes  bool // determines whether this is a state modifying operation
	valid   bool // indication whether the retrieved operation is valid and known
	reverts bool // determines whether the operation reverts state (implicitly halts)
	returns bool // determines whether the operation sets the return data content
}
Three instruction sets are then defined:
newHomesteadInstructionSet
newByzantiumInstructionSet
newConstantinopleInstructionSet
Each one is built on top of the previous one (and the Homestead set in turn extends the Frontier set).
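The layering can be sketched as constructors that copy the previous table and add the opcodes introduced by the newer fork. The reduced operation struct and the constructor names below are hypothetical. The opcode values are the standard ones: DELEGATECALL (0xf4) arrived with Homestead and REVERT (0xfd) with Byzantium.

```go
package main

import "fmt"

// operation is reduced to the fields needed for this illustration.
type operation struct {
	name  string
	valid bool
}

const (
	STOP         = 0x00
	DELEGATECALL = 0xf4
	REVERT       = 0xfd
)

func newFrontierSet() [256]operation {
	var t [256]operation
	t[STOP] = operation{name: "STOP", valid: true}
	return t
}

// newHomesteadSet starts from the Frontier table and adds DELEGATECALL.
func newHomesteadSet() [256]operation {
	t := newFrontierSet()
	t[DELEGATECALL] = operation{name: "DELEGATECALL", valid: true}
	return t
}

// newByzantiumSet starts from the Homestead table and adds REVERT.
func newByzantiumSet() [256]operation {
	t := newHomesteadSet()
	t[REVERT] = operation{name: "REVERT", valid: true}
	return t
}

func main() {
	h, b := newHomesteadSet(), newByzantiumSet()
	fmt.Println(h[REVERT].valid, b[REVERT].valid, b[DELEGATECALL].name)
}
```

Since arrays are values in Go, each constructor works on its own copy and older tables are never mutated.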
instructions.go implements the individual instructions, for example:
func opPc(pc *uint64, interpreter *EVMInterpreter, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
	// pc is a pointer, so it is dereferenced before being pushed
	stack.push(interpreter.intPool.get().SetUint64(*pc))
	return nil, nil
}
func opMsize(pc *uint64, interpreter *EVMInterpreter, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
stack.push(interpreter.intPool.get().SetInt64(int64(memory.Len())))
return nil, nil
}
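gas_table.go contains the functions that compute the gas consumed by the various instructions. Essentially the only error they return is the integer-overflow error errGasUintOverflow.
For example: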
func memoryGasCost(mem *Memory, newMemSize uint64) (uint64, error) {
if newMemSize == 0 {
return 0, nil
}
// The maximum that will fit in a uint64 is max_word_count - 1
// anything above that will result in an overflow.
// Additionally, a newMemSize which results in a
// newMemSizeWords larger than 0x7ffffffff will cause the square operation
// to overflow.
// The constant 0xffffffffe0 is the highest number that can be used without
// overflowing the gas calculation
if newMemSize > 0xffffffffe0 {
return 0, errGasUintOverflow
}
newMemSizeWords := toWordSize(newMemSize)
newMemSize = newMemSizeWords * 32
if newMemSize > uint64(mem.Len()) {
square := newMemSizeWords * newMemSizeWords
linCoef := newMemSizeWords * params.MemoryGas
quadCoef := square / params.QuadCoeffDiv
newTotalFee := linCoef + quadCoef
fee := newTotalFee - mem.lastGasCost
mem.lastGasCost = newTotalFee
return fee, nil
}
return 0, nil
}
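This function computes the fee for memory expansion and charges only when memory actually grows. With w = newMemSizeWords, the total fee is w * params.MemoryGas + w*w / params.QuadCoeffDiv (that is, 3*w + w*w/512 with MemoryGas = 3 and QuadCoeffDiv = 512), and the amount charged is that total minus the memory's last recorded cost (lastGasCost).
The file defines many more gas functions for the individual instructions.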
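As a worked example of the incremental charge, the sketch below evaluates the total fee for a few sizes and prints the difference from the previous total. The constants stand in for params.MemoryGas and params.QuadCoeffDiv, and the helper name totalMemoryFee is hypothetical.

```go
package main

import "fmt"

const (
	memoryGas    = 3   // stands in for params.MemoryGas
	quadCoeffDiv = 512 // stands in for params.QuadCoeffDiv
)

// toWordSize rounds a byte size up to a number of 32-byte words.
func toWordSize(size uint64) uint64 { return (size + 31) / 32 }

// totalMemoryFee returns the cumulative fee for a memory of size bytes.
func totalMemoryFee(size uint64) uint64 {
	w := toWordSize(size)
	return w*memoryGas + w*w/quadCoeffDiv
}

func main() {
	var last uint64 // plays the role of lastGasCost
	for _, size := range []uint64{32, 64, 1024, 32768} {
		total := totalMemoryFee(size)
		fmt.Printf("size=%d total=%d charged=%d\n", size, total, total-last)
		last = total
	}
}
```

For small memories the linear term dominates (3 gas per word), while at 1024 words the quadratic term already contributes 2048 of the 5120 total.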
interpreter.go: the interpreter
// Config are the configuration options for the Interpreter
type Config struct {
// Debug enabled debugging Interpreter options
Debug bool
// Tracer is the op code logger
Tracer Tracer
// NoRecursion disabled Interpreter call, callcode,
// delegate call and create.
NoRecursion bool
// Enable recording of SHA3/keccak preimages
EnablePreimageRecording bool
// JumpTable contains the EVM instruction table. This
// may be left uninitialised and will be set to the default
// table.
JumpTable [256]operation
}
// Interpreter is used to run Ethereum based contracts and will utilise the
// passed environment to query external sources for state information.
// The Interpreter will run the byte code VM based on the passed
// configuration.
type Interpreter interface {
// Run loops and evaluates the contract's code with the given input data and returns
// the return byte-slice and an error if one occurred.
Run(contract *Contract, input []byte) ([]byte, error)
// CanRun tells if the contract, passed as an argument, can be
// run by the current interpreter. This is meant so that the
// caller can do something like:
//
// ```golang
// for _, interpreter := range interpreters {
//   if interpreter.CanRun(contract.code) {
//     interpreter.Run(contract.code, input)
//   }
// }
// ```
CanRun([]byte) bool
// IsReadOnly reports if the interpreter is in read only mode.
IsReadOnly() bool
// SetReadOnly sets (or unsets) read only mode in the interpreter.
SetReadOnly(bool)
}
// EVMInterpreter represents an EVM interpreter
type EVMInterpreter struct {
	evm      *EVM
	cfg      Config
	gasTable params.GasTable // gas prices for a number of operations
	intPool  *intPool
	readOnly   bool   // Whether to throw on stateful modifications
	returnData []byte // Last CALL's return data for subsequent reuse
}
// NewEVMInterpreter returns a new instance of the Interpreter.
func NewEVMInterpreter(evm *EVM, cfg Config) *EVMInterpreter {
	// We use the STOP instruction to see whether
	// the jump table was initialised. If it was not
	// we'll set the default jump table.
	if !cfg.JumpTable[STOP].valid {
		switch {
		case evm.ChainConfig().IsConstantinople(evm.BlockNumber):
			cfg.JumpTable = constantinopleInstructionSet
		case evm.ChainConfig().IsByzantium(evm.BlockNumber):
			cfg.JumpTable = byzantiumInstructionSet
		case evm.ChainConfig().IsHomestead(evm.BlockNumber):
			cfg.JumpTable = homesteadInstructionSet
		default:
			cfg.JumpTable = frontierInstructionSet
		}
	}
	return &EVMInterpreter{
		evm:      evm,
		cfg:      cfg,
		gasTable: evm.ChainConfig().GasTable(evm.BlockNumber),
		intPool:  newIntPool(),
	}
}
func (in *EVMInterpreter) enforceRestrictions(op OpCode, operation operation, stack *Stack) error {
if in.evm.chainRules.IsByzantium {
if in.readOnly {
// If the interpreter is operating in readonly mode, make sure no
// state-modifying operation is performed. The 3rd stack item
// for a call operation is the value. Transferring value from one
// account to the others means the state is modified and should also
// return with an error.
if operation.writes || (op == CALL && stack.Back(2).BitLen() > 0) {
return errWriteProtection
}
}
}
return nil
}
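Another important function is Run. It loops over and evaluates the contract's code with the given input, and returns the returned byte slice together with an error if one occurred. Any error returned by the interpreter other than errExecutionReverted is treated as having consumed all gas.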
func (in *EVMInterpreter) Run(contract *Contract, input []byte) (ret []byte, err error) {
if in.intPool == nil {
in.intPool = poolOfIntPools.get()
defer func() {
poolOfIntPools.put(in.intPool)
in.intPool = nil
}()
}
// Increment the call depth which is restricted to 1024
in.evm.depth++
defer func() { in.evm.depth-- }()
// Reset the previous call's return data. It's unimportant to preserve the old buffer
// as every returning call will return new data anyway.
in.returnData = nil
// Don't bother with the execution if there's no code.
if len(contract.Code) == 0 {
return nil, nil
}
var (
op OpCode // current opcode
mem = NewMemory() // bound memory
stack = newstack() // local stack
// For optimisation reason we're using uint64 as the program counter.
// It's theoretically possible to go above 2^64. The YP defines the PC
// to be uint256. Practically much less so feasible.
pc = uint64(0) // program counter
cost uint64
// copies used by tracer
pcCopy uint64 // needed for the deferred Tracer
gasCopy uint64 // for Tracer to log gas remaining before execution
logged bool // deferred Tracer should ignore already logged steps
)
contract.Input = input
// Reclaim the stack as an int pool when the execution stops
defer func() { in.intPool.put(stack.data...) }()
// In debug mode, make sure the last step is reported to the tracer on error
if in.cfg.Debug {
defer func() {
if err != nil {
if !logged {
in.cfg.Tracer.CaptureState(in.evm, pcCopy, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
} else {
in.cfg.Tracer.CaptureFault(in.evm, pcCopy, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
}
}
}()
}
for atomic.LoadInt32(&in.evm.abort) == 0 {
if in.cfg.Debug {
// Capture pre-execution values for tracing.
logged, pcCopy, gasCopy = false, pc, contract.Gas
}
// Get the next operation to execute from the jump table
op = contract.GetOp(pc)
operation := in.cfg.JumpTable[op]
if !operation.valid {
return nil, fmt.Errorf("invalid opcode 0x%x", int(op))
}
// Validate that the stack has enough items (and enough room) for the operation
if err := operation.validateStack(stack); err != nil {
return nil, err
}
// If the operation is valid, enforce and write restrictions
if err := in.enforceRestrictions(op, operation, stack); err != nil {
return nil, err
}
var memorySize uint64
// calculate the new memory size and expand the memory to fit
// the operation
if operation.memorySize != nil {
memSize, overflow := bigUint64(operation.memorySize(stack))
if overflow {
return nil, errGasUintOverflow
}
// memory is expanded in words of 32 bytes. Gas
// is also calculated in words.
if memorySize, overflow = math.SafeMul(toWordSize(memSize), 32); overflow {
return nil, errGasUintOverflow
}
}
// Compute the gas cost and deduct it; fail with out of gas if not enough is left
cost, err = operation.gasCost(in.gasTable, in.evm, contract, stack, mem, memorySize)
if err != nil || !contract.UseGas(cost) {
return nil, ErrOutOfGas
}
if memorySize > 0 {
mem.Resize(memorySize)
}
if in.cfg.Debug {
in.cfg.Tracer.CaptureState(in.evm, pc, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
logged = true
}
// execute the operation
res, err := operation.execute(&pc, in, contract, mem, stack)
// verifyPool is a build flag. Pool verification makes sure the integrity
// of the integer pool by comparing values to a default value.
if verifyPool {
verifyIntegerPool(in.intPool)
}
// if the operation clears the return data (e.g. it has returning data)
// set the last return to the result of the operation.
if operation.returns { // store the return data; only the most recent value is kept
in.returnData = res
}
switch {
case err != nil:
return nil, err
case operation.reverts:
return res, errExecutionReverted
case operation.halts:
return res, nil
case !operation.jumps:
pc++
}
}
return nil, nil
}
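The core of Run (fetch the opcode, look it up in the jump table, validate the stack, charge gas, execute, then halt or advance the program counter) can be condensed into a toy interpreter. This is a sketch under strong simplifications, not the go-ethereum code: it supports only STOP, ADD and PUSH1, uses uint64 words instead of 256-bit integers, has no memory, tracing or revert handling, and the vm type and minStack field are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type OpCode byte

const (
	STOP  OpCode = 0x00
	ADD   OpCode = 0x01
	PUSH1 OpCode = 0x60
)

// vm bundles the code, stack and remaining gas of one execution.
type vm struct {
	code  []byte
	stack []uint64
	gas   uint64
}

type operation struct {
	execute  func(pc *uint64, m *vm) error
	gasCost  uint64
	minStack int // items the operation needs on the stack
	halts    bool
	valid    bool
}

var jumpTable [256]operation

func init() {
	jumpTable[STOP] = operation{halts: true, valid: true,
		execute: func(*uint64, *vm) error { return nil }}
	jumpTable[ADD] = operation{gasCost: 3, minStack: 2, valid: true,
		execute: func(_ *uint64, m *vm) error {
			n := len(m.stack)
			m.stack = append(m.stack[:n-2], m.stack[n-2]+m.stack[n-1])
			return nil
		}}
	jumpTable[PUSH1] = operation{gasCost: 3, valid: true,
		execute: func(pc *uint64, m *vm) error {
			*pc++ // move onto the one-byte immediate
			var b byte
			if *pc < uint64(len(m.code)) {
				b = m.code[*pc]
			}
			m.stack = append(m.stack, uint64(b))
			return nil
		}}
}

// run executes m.code until an operation halts or an error occurs.
func run(m *vm) error {
	for pc := uint64(0); ; pc++ {
		var op OpCode // reading past the end of the code yields STOP
		if pc < uint64(len(m.code)) {
			op = OpCode(m.code[pc])
		}
		entry := jumpTable[op]
		switch {
		case !entry.valid:
			return fmt.Errorf("invalid opcode 0x%x", int(op))
		case len(m.stack) < entry.minStack:
			return errors.New("stack underflow")
		case m.gas < entry.gasCost:
			return errors.New("out of gas")
		}
		m.gas -= entry.gasCost
		if err := entry.execute(&pc, m); err != nil {
			return err
		}
		if entry.halts {
			return nil
		}
	}
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, STOP
	m := &vm{code: []byte{byte(PUSH1), 2, byte(PUSH1), 3, byte(ADD), byte(STOP)}, gas: 100}
	err := run(m)
	fmt.Println(m.stack, m.gas, err)
}
```

As in Run, gas is deducted before the operation executes, so an instruction that cannot be paid for never touches the stack.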
Virtual machine
contract.go
type ContractRef interface{ Address() common.Address } is a reference to the object backing a contract.
AccountRef implements this interface.
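type Contract struct {
	// CallerAddress is the account that initialised this contract. When the
	// call is delegated, it is set to the caller's caller instead.
	CallerAddress common.Address
	caller        ContractRef
	self          ContractRef
	jumpdests destinations // result of JUMPDEST analysis
	Code     []byte          // contract byte code
	CodeHash common.Hash     // hash of the code
	CodeAddr *common.Address // address the code was loaded from
	Input    []byte          // input data
	Gas   uint64   // gas remaining for this contract execution
	value *big.Int // value transferred with the call
	Args []byte // arguments
	DelegateCall bool
}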
Constructor:
func NewContract(caller ContractRef, object ContractRef, value *big.Int, gas uint64) *Contract {
	c := &Contract{CallerAddress: caller.Address(), caller: caller, self: object, Args: nil}
	// If the caller is itself a contract, reuse its jumpdests.
	if parent, ok := caller.(*Contract); ok {
		// Reuse JUMPDEST analysis from parent context if available.
		c.jumpdests = parent.jumpdests
	} else {
		c.jumpdests = make(destinations)
	}
	// Gas should be a pointer so it can safely be reduced through the run
	// This pointer will be off the state transition
	c.Gas = gas
	// ensures a value is set
	c.value = value
	return c
}
// AsDelegate marks the contract as a delegate call and returns it so that calls
// can be chained. CallerAddress and value are copied from the calling contract.
func (c *Contract) AsDelegate() *Contract {
	c.DelegateCall = true
	// NOTE: caller must, at all times be a contract. It should never happen
	// that caller is something other than a Contract.
	parent := c.caller.(*Contract)
	c.CallerAddress = parent.CallerAddress
	c.value = parent.value
	return c
}
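// GetOp returns the opcode at position n. It reads the byte at index n of the
// contract's Code and converts it to an OpCode; it does not read Input or Args.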
func (c *Contract) GetOp(n uint64) OpCode
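A minimal sketch of such a lookup follows, assuming the opcode is read directly from Code and that positions past the end of the code yield 0, which is the STOP opcode. The Contract type is reduced to the one field needed and GetByte is included as a helper for the example.

```go
package main

import "fmt"

type OpCode byte

// Contract is reduced to its byte code for this illustration.
type Contract struct{ Code []byte }

// GetByte returns the n'th byte of the code, or 0 when n is out of range.
func (c *Contract) GetByte(n uint64) byte {
	if n < uint64(len(c.Code)) {
		return c.Code[n]
	}
	return 0
}

// GetOp returns the n'th byte of the code as an opcode.
func (c *Contract) GetOp(n uint64) OpCode { return OpCode(c.GetByte(n)) }

func main() {
	c := &Contract{Code: []byte{0x60, 0x01, 0x00}} // PUSH1 0x01, STOP
	for pc := uint64(0); pc < 5; pc++ {
		fmt.Printf("pc=%d op=0x%02x\n", pc, byte(c.GetOp(pc)))
	}
}
```

Returning 0 for out-of-range positions means that execution running off the end of the code simply stops.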
// The next two