Ethereum Yellow Paper Mathematics Deciphered | Part 1: World State

July 26, 2022

This part of the series will cover section 2 of the Yellow Paper, which explains world state.

I assume you've already gone through conventions defined in Part 0. If you haven't yet, I highly recommend you to go check it out first.

World State

Ethereum can be considered as a transaction based state machine. It begins with a genesis state and by executing transactions, it goes to a new state - a valid one enforced by a set of rules. The state of this state machine is termed - "world state" and denoted by . The is, simply put, a mapping between account addresses and corresponding account states as you'll see more later.

    | WORLD STATE |

Address     AccountState
----------------------------------------------------------------
address1 -> accountState1(nonce, balance, storageRoot, codeHash)
address2 -> accountState2(nonce, balance, storageRoot, codeHash)
address3 -> accountState3(nonce, balance, storageRoot, codeHash)
        .
        .

That brings us to the first equivalence from paper -

where,

  • is the world state at some time
  • is a single transaction tuple with multiple constituents
  • is the state transition function

When given the current state and a transaction , the state transition function is defined to be a function that will generate a valid next state .

Keep in mind that provides a new state given a transaction i.e. next state that'll result from executing a single transaction .

Now, comes the block. Isn't it that Ethereum jumps to a new state, when a block is mined? Where does the block fit in the equation?

Well, yes a new state is indeed generated when a block in finalized by mining. In fact, the individual state transitions corresponding to each transaction in the block () happen through function.

and so on...

Above three can be combined to be written as:

Now, we introduce another state-transition function from paper,

where,

  • is the world state at time
  • is a single block, represented as a tuple
  • is the state transition function at block level

may look similar to , but notice that transitions state at transaction level. While transitions state at block level. It transitions the world state, , to a new state by including and finalizing all the transactions in the block .

The block is a tuple with multiple components, one of which is a list of transactions (precisely, a list of transaction tuples),

which is same as-

(Note: ellipsis before and after the list of transactions, abbreviate other components of the block tuple. Other components of will be introduced later in series.)

In other words is same as applying multiple times to subsequent transactions in block :

This might seem a bit confusing, but remember that all of the above are equivalences () not equalities ().

Account State

As said before, the world state, , is a mapping between addresses and their account states. The account state is a RLP-serialized data-structure. For an address , the account state can be denoted by - similar to accessing a value in a map by a key (here, ) in programming.

An account state comprises of the following components -

  • (): Nonce is number of transactions sent by this address, if it is an EOA, or it is number of contract-creations made by this address if it is a contract.

  • (): Number of Wei owned by this address.

  • (): A 256-bit hash of root node of a Merkle Patricia Trie data-structure. This MPT structure holds contents of this account. The contents in the MPT are encoded mapping of 256-bit hash of 256-bit keys (aka storage slot) to RLP-encoded 256-bit integers as values (representing the account content). In simple terms, you might know this "content" as contract storage. As expected, only contract accounts can have non-empty storage. EOAs always have empty storage.

  • (): The hash of the EVM code (aka bytecode) of this account. Remember that only contracts have bytecode, EOAs have empty bytecode. Thus, if is the bytecode then, . And , always for an EOA account - meaning for an EOA is always hash of an empty string.

Paper defines a serialization function for a key and a value pair that outputs suitable item for insertion into the account storage Merkle Patricia Trie. In this regard, must be keccak-256 hashed and must be RLP-encoded, as discussed earlier above in component. Hence,

where must be 32-byte (256-bit) long and must be a natural number. Formally this is same as,

𝟛𝟚

According to convention then, is a similar function to except takes as input a series of key-value pairs and performs same operation but element-wise. That is,

Finally, using all the above the paper defines a convenient equivalence:

Note that in the equivalence above, the subscript (bolder, on LHS) is different from subscript (on RHS)

We already know that on RHS is the defined earlier. If you look carefully, is actually defines the same Merkle Patricia Trie whose root's hash is . How?

(bolder ) represents a list of contract's storage key-value pairs (raw integer key and raw storage content) corresponding to an account, . The then hashes each of the keys and RLP-encodes each of the values. Just as defined at . And with these encoded inputs the Merkle Patricia Trie is constructed with function.

The equivalence is made just for convenience since we typically refer to the trie's underlying set of key-value pairs and NOT just the hash of storage trie root . It is assumed from now on that the latter is equivalent to former i.e. maybe used to refer to the MPT and not just the simple hash value.

A non-existent account is defined as:

Whereas an empty account i.e. it has no code (bytecode), zero nonce and zero balance, given as:

An account is dead if it is either non-existent or empty. Mathematically it is same as,

Now, paper defines the world-state collapse function for existent accounts, as,

where,

Remember, that the world-state is a mapping from account address to account states. function is basically squishing everything belonging to an account state and yields a set of these values for all accounts (i.e. ) except non-existent accounts (i.e. ).

This set of items yielded by function when put into a Merkle Patricia Trie (using function), provides a short identity for the world-state. This identity is nothing but the hash of root of this trie. Keep in mind that this MPT is not the same one as defined for for a single account. This MPT encompasses all the accounts with it's contents. You might know this as "State Trie".

It is assumed that account state can either be non-existent or valid if existent. Mathematically,

where, is a validity function which just sets some restrictions on what values an account's components can take (e.g. nonce and balance must be natural number less than ):

And that concludes this part of the series! Next part will be all about transactions. Stay tuned!

Resources