Ethereum Yellow Paper Deciphered | Part 7: Execution Model

April 1, 2023

Note that this post is updated till the Shanghai version of the ethereum yellow paper.

Execution Model

The execution model specified how to executed a bytecode in given message call or contract creation transaction. The EVM (Ethereum Virtual Machine) is designed with this model. You can read more about EVM here.

As we have seen in previous parts, this execution model is defined by the function. computes the resultant state after code execution, along with other information.

Execution Overview

Now we will formally defined the function finally. Roughly, executes instructions denoted by 1-byte values called opcodes. It does so iteratively, progressing a pair of states - first the world state that we have already talked a lot about. And second is the machine state .

Machine State

The machine state, is a tuple comprising of 6 elements:

  1. Gas available,
  2. Program counter, . A pointer to position in bytecode pointing the opcode to execute next.
  3. Memory contents, . By default it is assumed to be a series of zero bytes of size bits.
  4. Active number of words in memory,
  5. Stack contents,
  6. The returndata buffer

We denote as current operation (opcode) to be executed, then it is equal to .

Note that instruction mnemonic written above as represents actual opcode number and should be interpreted as such. So, should be interpreted as hex value since it represents operation in the bytecode. Similarly opcodes like , , , etc. are mnemonic for values - , , , respectively.

Also we use to denote the number inputs taken by an opcode. This is also stack items removed by the opcode from the current stack. Similarly, represents the number output items by an opcode or the number of items added to the stack. We subscript and to show which opcode we are talking about.

So, for example, for the opcode, = 2. Since takes two stack items to do addition on them. And = 1, the number of output items which only one - the sum.

Execution

The progression of the states and is defined in terms of the function . The function recursively executes the opcodes/instructions until the machine halts either by an exception or normally. A single iteration of the recursion is defined by the function that outputs result of a single cycle of the machine.

We go through some important functions one by one:

Exceptional Halting function

is used to denote a function which outputs a boolean representing if the present state is an exceptional halting state of machine. The output of is true i.e. it is exceptional halt if any of the following conditions is true:

  • If there is not enough gas left to execute next opcode:

  • If the opcode at current pointer is invalid:

  • If there is insufficient number of input stack items than required by the opcode:

  • If the instruction is JUMP but input (topmost element in stack) does not represent a valid destination to jump to i.e. destination is not JUMPDEST opcode:

The function determines validity of the jumped destination. We'll expand on it next.

  • If the instruction is JUMPI (conditional jump), and condition input in stack is true (not 0) but destination input does not a valid destination to jump to:

  • If the instruction is RETURNDATACOPY but

  • If executing the current instruction/opcode would result in stack size exceeding 1024 (stack overflow). Note that executing an opcode pops items from stack and pushes items into it:

  • State modification is attempted during a STATICCALL which does not have permission to make state modification:

The function above defines condition for all state modification scenarios that includes: - is any of , , or opcodes:

  • is any of the event logging opcodes:
  • is opcode but the input at 2nd index is non-zero:

Combining all above:

  • Gas left before executing opcode is less than (see EIP-2200):

Combining all of the conditions describes above we can now mathematically defined as:

Normal Halting Function

Jump Destination Validity

We previously use a function to define a set of all valid jump destinations where the execution can be jumped to via the JUMP or JUMPI opcodes. The set of valid jump destination simply is all of the positions in the bytecode of the contract where there is JUMPDEST () opcode. To write roughly in set notation it would be:

The paper defines it a bit differently by using another function . Given bytecode and index for opcode in that code, outputs the set of all valid jump destinations in the code starting from index . So, by definition we can say:

since is set of all valid jump destinations in the code.

The function is defined recursively covering three cases as:

  • If the index of opcodes in code becomes greater than length of code, it is empty set:

  • If the opcode at is itself, the set includes and all other valid jump destination positions after :

    where function defines the next valid opcode position (it is not always simply , see below).

  • Otherwise it is only set of all other valid jump destination positions after :

Combining all above we get:

We used function above for the next valid opcode index in the code.

Execution Cycle Function

The execution cycle is defined by function . executes the current opcode, resulting in modification of machine state, like addition or removal of items in stack.

Function takes current state and outputs the next state as:

After this execution cycle (of operating on the opcode), the stack size is changed by which is number of items added subtracted with number of items added to it:

The new stack size, is then:

The stack items with their new indices is updated:

The available gas is reduced by whatever instruction's gas cost was:

And the program counter is incremented accordingly:

Recursive Execution Function

The execution function is defined recursively, using function . uses the function as subroutine for opcode/instruction execution.

We have already seen before:

outputs similarly with some additional elements as well, which we'll see later:

The default initial values of the state are set as:

  • Gas in machine state is same as available gas:

  • The program counter is set to 0:

  • The memory is all zero bytes:

  • The active number of words in memory is zero:

  • The stack is empty:

  • Any output data is empty:

Now we formally define which executes recursively:

  • If it is an exceptional halting according to the function, the output state and returndata is empty - all else remaining unchanged:

  • If a opcode is encountered, the output state is empty, only machine state and returndata remains updated till revert:

  • If there is no revert or exception and returndata is not empty, result of appended with the returndata:

  • Otherwise, is executed again with output of function as the the input for next recursion cycle:

Combining all cases above we get:

As mentioned before, the returndata output is determined by normal halting function :

The operator concatenates the item to tuple.

goes through a recursive cycle until either is true representing an exceptional halt or the output of becomes a series of bytes rather than empty set () indicating a controlled halt.

And this is it!