Once you have written your first few Solidity contracts, you may wonder how a contract is actually executed. Compiling Solidity code may look like a traditional compilation, similar to building a C program, but understanding how your smart contract is compiled and executed will make you a better developer and help you debug your code faster. In this post we’ll try to answer the following questions:
- How is a Solidity program compiled?
- What is the Ethereum Virtual Machine (EVM)?
- How does the EVM interpret and execute your code?
Prerequisites
- This is an advanced discussion. You need some experience with blockchain development and a basic understanding of what a smart contract is. If you do not have this knowledge yet, feel free to check out courses like 6-Figure Blockchain Developer or Web Development For Blockchain.
- Please install the Solidity compiler (solc). Refer to this link for details.
A simple smart contract
//hello.sol
pragma solidity >=0.7.0 <0.9.0;
contract HelloWolrd{
    uint256 a;
    constructor() {
        a = 1;
    }
}
Let’s compile the program using the following command:
$ solc --bin --asm hello.sol
You should see output similar to this in your terminal:
======= hello.sol:HelloWolrd =======
EVM assembly:
/* "hello.sol":33:109 contract HelloWolrd{ ... */
mstore(0x40, 0x80)
/* "hello.sol":73:107 constructor() {... */
callvalue
dup1
iszero
tag_1
jumpi
0x00
dup1
revert
tag_1:
pop
/* "hello.sol":99:100 1 */
0x01
/* "hello.sol":95:96 a */
0x00
/* "hello.sol":95:100 a = 1 */
dup2
swap1
sstore
pop
/* "hello.sol":33:109 contract HelloWolrd{ ... */
dataSize(sub_0)
dup1
dataOffset(sub_0)
0x00
codecopy
0x00
return
stop
sub_0: assembly {
/* "hello.sol":33:109 contract HelloWolrd{ ... */
mstore(0x40, 0x80)
0x00
dup1
revert
auxdata: 0xa26469706673582212205fe60a80f73e86d9f3ad4b2fce33b99a5903282fbd1e3b5c7df084ce790bd8d964736f6c634300080a0033
}
Binary:
6080604052348015600f57600080fd5b506001600081905550603f8060256000396000f3fe6080604052600080fdfea26469706673582212205fe60a80f73e86d9f3ad4b2fce33b99a5903282fbd1e3b5c7df084ce790bd8d964736f6c634300080a0033
I admit that the output is very difficult to understand. Let’s first focus on the output under the heading
Binary:
6080604052348015600f57600080fd5b506001600081905550603f8060256000396000f3fe6080604052600080fdfea26469706673582212205fe60a80f73e86d9f3ad4b2fce33b99a5903282fbd1e3b5c7df084ce790bd8d964736f6c634300080a0033
This is the compiled bytecode of our contract. The bytecode is a sequence of instructions that describes what we have written in Solidity. It is deployed to the Ethereum network, where the EVM executes it.
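The Binary string is plain hexadecimal, so it can be inspected with a few lines of Python. The sketch below is only an illustration for this particular output: it assumes that the two values pushed near the end of the constructor, 603f and 6025, are the dataSize(sub_0) and dataOffset(sub_0) shown in the assembly listing, and uses them to separate the code that runs once at deployment from the code that is copied out and returned. The names `BINARY`, `creation_code` and `runtime_code` are made up for this example.

```python
# The Binary output from above, split over several lines for readability.
BINARY = (
    "6080604052348015600f57600080fd5b506001600081905550603f8060256000396000f3fe"
    "6080604052600080fdfe"
    "a26469706673582212205fe60a80f73e86d9f3ad4b2fce33b99a5903282fbd1e3b5c7df084ce790bd8d9"
    "64736f6c634300080a0033"
)

code = bytes.fromhex(BINARY)

DATA_OFFSET = 0x25  # assumed to be dataOffset(sub_0), pushed by "6025"
DATA_SIZE = 0x3F    # assumed to be dataSize(sub_0), pushed by "603f"

# Everything before the offset runs once, when the contract is deployed.
creation_code = code[:DATA_OFFSET]
# The slice that codecopy/return hand back: sub_0 plus its auxdata.
runtime_code = code[DATA_OFFSET:DATA_OFFSET + DATA_SIZE]

print(len(code), len(creation_code), len(runtime_code))  # 100 37 63
print(runtime_code[:10].hex())                           # 6080604052600080fdfe
```

The last ten bytes printed match the body of sub_0 in the assembly listing (mstore(0x40, 0x80), 0x00, dup1, revert), followed by the byte fe.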
The Ethereum Virtual Machine
The part of the protocol that actually handles processing the transactions is Ethereum’s own virtual machine, known as the Ethereum Virtual Machine (EVM).
The EVM is a Turing-complete virtual machine, with one limitation that a typical Turing-complete machine does not have: the EVM is intrinsically bound by gas. Thus, the total amount of computation that can be done is limited by the amount of gas provided.
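To make the gas bound concrete, here is a minimal Python sketch of an interpreter loop that charges a fee for every step and stops once the budget is used up. It is not how any real client is implemented: `run_with_gas`, `OutOfGas` and the flat list of per-step costs are assumptions made for illustration.

```python
import itertools


class OutOfGas(Exception):
    """Raised when the remaining gas cannot cover the next step."""


def run_with_gas(step_costs, gas_limit):
    """Charge each step's cost against gas_limit and return the gas left over."""
    gas = gas_limit
    for cost in step_costs:
        if cost > gas:
            raise OutOfGas(f"needed {cost} gas, only {gas} left")
        gas -= cost
    return gas


# Three cheap steps of 3 gas each fit into a budget of 10.
print(run_with_gas([3, 3, 3], gas_limit=10))  # 1

# An endless stream of steps cannot run forever: the budget runs out.
try:
    run_with_gas(itertools.repeat(8), gas_limit=100)
except OutOfGas as err:
    print("halted:", err)  # halted: needed 8 gas, only 4 left
```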

Moreover, the EVM has a stack-based architecture. A stack machine is a computer that uses a last-in, first-out stack to hold temporary values.
Each stack item in the EVM is 256 bits wide, and the stack has a maximum depth of 1024 items.
The EVM has memory, a byte array addressed by byte offset and usually read and written in 32-byte words. Memory is volatile: it does not persist once execution ends.
The EVM also has storage. Unlike memory, storage is non-volatile and is maintained as part of the system state. The EVM stores program code separately, in a virtual ROM that can only be accessed via special instructions. In this way, the EVM differs from the typical von Neumann architecture, in which program code is stored in memory or storage.
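The four data areas described above can be modelled in a few lines of Python. This is a rough sketch under simplifying assumptions (no gas, no memory expansion rules), and the class and attribute names are invented for this example rather than taken from any client.

```python
from typing import Dict, List, Optional

WORD_MASK = (1 << 256) - 1   # stack items are 256-bit words
MAX_STACK_DEPTH = 1024       # maximum number of items on the stack


class MachineState:
    """Toy model of the EVM's code, stack, memory and storage."""

    def __init__(self, code: bytes, storage: Optional[Dict[int, int]] = None):
        self.code = bytes(code)            # immutable, like the virtual ROM
        self.stack: List[int] = []         # LIFO; the end of the list is the top
        self.memory = bytearray()          # volatile: starts empty for every execution
        self.storage = {} if storage is None else storage  # outlives the execution

    def push(self, value: int) -> None:
        if len(self.stack) >= MAX_STACK_DEPTH:
            raise OverflowError("stack limit of 1024 items exceeded")
        self.stack.append(value & WORD_MASK)  # truncate to 256 bits

    def pop(self) -> int:
        return self.stack.pop()


# Storage survives from one execution to the next, memory does not.
persistent: Dict[int, int] = {}
first = MachineState(b"\x60\x80", persistent)
first.storage[0] = 1
first.memory.extend(b"\x01")

second = MachineState(b"\x60\x80", persistent)
print(second.storage)  # {0: 1}
print(second.memory)   # bytearray(b'')
```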
If you scan the raw bytecode byte by byte (two hexadecimal characters at a time), you can identify the opcodes that the EVM associates with particular actions. For example, the first 4 characters of the bytecode, 6080,
translate to (in hexadecimal):
0x60 0x80
The disassembled code is still very low-level and difficult to read, but as you will see, we can start making sense of it. Each instruction is made up of an opcode and optional arguments.
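The byte-by-byte scan can be sketched as a tiny disassembler. The version below is deliberately incomplete: its `OPCODES` table only lists the opcodes that occur in the constructor part of our Binary output, and the function name `disassemble` is an assumption for this example. The one subtle point it shows is that PUSH instructions are followed by their argument bytes, which must be skipped rather than decoded as opcodes.

```python
# Partial opcode table: just enough for the bytes decoded below.
OPCODES = {
    0x15: "ISZERO",
    0x34: "CALLVALUE",
    0x50: "POP",
    0x52: "MSTORE",
    0x55: "SSTORE",
    0x57: "JUMPI",
    0x5B: "JUMPDEST",
    0xFD: "REVERT",
}


def disassemble(bytecode_hex):
    """Turn a hex string into a list of human-readable instructions."""
    code = bytes.fromhex(bytecode_hex)
    instructions = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 are followed by 1..32 argument bytes.
            width = op - 0x5F
            arg = code[pc + 1 : pc + 1 + width]
            instructions.append(f"PUSH{width} 0x{arg.hex()}")
            pc += 1 + width
            continue
        if 0x80 <= op <= 0x8F:
            name = f"DUP{op - 0x7F}"
        elif 0x90 <= op <= 0x9F:
            name = f"SWAP{op - 0x8F}"
        else:
            name = OPCODES.get(op, f"UNKNOWN_0x{op:02x}")
        instructions.append(name)
        pc += 1
    return instructions


# The start of the Binary output, up to the POP that follows SSTORE.
for line in disassemble("6080604052348015600f57600080fd5b506001600081905550"):
    print(line)
```

The first three lines printed are PUSH1 0x80, PUSH1 0x40 and MSTORE, which correspond to mstore(0x40, 0x80) in the assembly listing.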
Opcodes
Opcodes are low-level stack operators that tell the EVM what to do in an instruction. You can think of opcodes as the basic arithmetic operations of algebra. Before we get started on our ambitious endeavour of completely deconstructing the bytecode, you’re going to need a basic tool set for understanding individual opcodes such as PUSH, ADD, SWAP, DUP, etc. An opcode, in the end, can only push or consume items from the EVM’s stack, memory, or the storage belonging to the contract. That’s it.
The following table contains the EVM’s opcode instruction set:
Opcode | Name | Description | Extra Info | Gas |
---|---|---|---|---|
0x00 | STOP | Halts execution | – | 0 |
0x01 | ADD | Addition operation | – | 3 |
0x02 | MUL | Multiplication operation | – | 5 |
0x03 | SUB | Subtraction operation | – | 3 |
0x04 | DIV | Integer division operation | – | 5 |
0x05 | SDIV | Signed integer division operation (truncated) | – | 5 |
0x06 | MOD | Modulo remainder operation | – | 5 |
0x07 | SMOD | Signed modulo remainder operation | – | 5 |
0x08 | ADDMOD | Modulo addition operation | – | 8 |
0x09 | MULMOD | Modulo multiplication operation | – | 8 |
0x0a | EXP | Exponential operation | – | 10* |
0x0b | SIGNEXTEND | Extend length of two’s complement signed integer | – | 5 |
0x0c – 0x0f | Unused | Unused | – | |
0x10 | LT | Less-than comparison | – | 3 |
0x11 | GT | Greater-than comparison | – | 3 |
0x12 | SLT | Signed less-than comparison | – | 3 |
0x13 | SGT | Signed greater-than comparison | – | 3 |
0x14 | EQ | Equality comparison | – | 3 |
0x15 | ISZERO | Simple not operator | – | 3 |
0x16 | AND | Bitwise AND operation | – | 3 |
0x17 | OR | Bitwise OR operation | – | 3 |
0x18 | XOR | Bitwise XOR operation | – | 3 |
0x19 | NOT | Bitwise NOT operation | – | 3 |
0x1a | BYTE | Retrieve single byte from word | – | 3 |
0x1b | SHL | Shift Left | EIP 145 | 3 |
0x1c | SHR | Logical Shift Right | EIP 145 | 3 |
0x1d | SAR | Arithmetic Shift Right | EIP 145 | 3 |
0x20 | KECCAK256 | Compute Keccak-256 hash | – | 30* |
0x21 – 0x2f | Unused | Unused | – | |
0x30 | ADDRESS | Get address of currently executing account | – | 2 |
0x31 | BALANCE | Get balance of the given account | – | 700 |
0x32 | ORIGIN | Get execution origination address | – | 2 |
0x33 | CALLER | Get caller address | – | 2 |
0x34 | CALLVALUE | Get deposited value by the instruction/transaction responsible for this execution | – | 2 |
0x35 | CALLDATALOAD | Get input data of current environment | – | 3 |
0x36 | CALLDATASIZE | Get size of input data in current environment | – | 2* |
0x37 | CALLDATACOPY | Copy input data in current environment to memory | – | 3 |
0x38 | CODESIZE | Get size of code running in current environment | – | 2 |
0x39 | CODECOPY | Copy code running in current environment to memory | – | 3* |
0x3a | GASPRICE | Get price of gas in current environment | – | 2 |
0x3b | EXTCODESIZE | Get size of an account’s code | – | 700 |
0x3c | EXTCODECOPY | Copy an account’s code to memory | – | 700* |
0x3d | RETURNDATASIZE | Pushes the size of the return data buffer onto the stack | EIP 211 | 2 |
0x3e | RETURNDATACOPY | Copies data from the return data buffer to memory | EIP 211 | 3 |
0x3f | EXTCODEHASH | Returns the keccak256 hash of a contract’s code | EIP 1052 | 700 |
0x40 | BLOCKHASH | Get the hash of one of the 256 most recent complete blocks | – | 20 |
0x41 | COINBASE | Get the block’s beneficiary address | – | 2 |
0x42 | TIMESTAMP | Get the block’s timestamp | – | 2 |
0x43 | NUMBER | Get the block’s number | – | 2 |
0x44 | DIFFICULTY | Get the block’s difficulty | – | 2 |
0x45 | GASLIMIT | Get the block’s gas limit | – | 2 |
0x46 | CHAINID | Returns the current chain’s EIP-155 unique identifier | EIP 1344 | 2 |
0x47 | SELFBALANCE | Get balance of currently executing account | EIP 1884 | 5 |
0x48 | BASEFEE | Returns the value of the base fee of the current block it is executing in | EIP 3198 | 2 |
0x49 – 0x4f | Unused | – | | |
0x50 | POP | Remove word from stack | – | 2 |
0x51 | MLOAD | Load word from memory | – | 3* |
0x52 | MSTORE | Save word to memory | – | 3* |
0x53 | MSTORE8 | Save byte to memory | – | 3 |
0x54 | SLOAD | Load word from storage | – | 800 |
0x55 | SSTORE | Save word to storage | – | 20000** |
0x56 | JUMP | Alter the program counter | – | 8 |
0x57 | JUMPI | Conditionally alter the program counter | – | 10 |
0x58 | GETPC | Get the value of the program counter prior to the increment | – | 2 |
0x59 | MSIZE | Get the size of active memory in bytes | – | 2 |
0x5a | GAS | Get the amount of available gas, including the corresponding reduction for the cost of this instruction | – | 2 |
0x5b | JUMPDEST | Mark a valid destination for jumps | – | 1 |
0x5c – 0x5f | Unused | – | | |
0x60 | PUSH1 | Place 1-byte item on stack | – | 3 |
0x61 | PUSH2 | Place 2-byte item on stack | – | 3 |
0x62 | PUSH3 | Place 3-byte item on stack | – | 3 |
0x63 | PUSH4 | Place 4-byte item on stack | – | 3 |
0x64 | PUSH5 | Place 5-byte item on stack | – | 3 |
0x65 | PUSH6 | Place 6-byte item on stack | – | 3 |
0x66 | PUSH7 | Place 7-byte item on stack | – | 3 |
0x67 | PUSH8 | Place 8-byte item on stack | – | 3 |
0x68 | PUSH9 | Place 9-byte item on stack | – | 3 |
0x69 | PUSH10 | Place 10-byte item on stack | – | 3 |
0x6a | PUSH11 | Place 11-byte item on stack | – | 3 |
0x6b | PUSH12 | Place 12-byte item on stack | – | 3 |
0x6c | PUSH13 | Place 13-byte item on stack | – | 3 |
0x6d | PUSH14 | Place 14-byte item on stack | – | 3 |
0x6e | PUSH15 | Place 15-byte item on stack | – | 3 |
0x6f | PUSH16 | Place 16-byte item on stack | – | 3 |
0x70 | PUSH17 | Place 17-byte item on stack | – | 3 |
0x71 | PUSH18 | Place 18-byte item on stack | – | 3 |
0x72 | PUSH19 | Place 19-byte item on stack | – | 3 |
0x73 | PUSH20 | Place 20-byte item on stack | – | 3 |
0x74 | PUSH21 | Place 21-byte item on stack | – | 3 |
0x75 | PUSH22 | Place 22-byte item on stack | – | 3 |
0x76 | PUSH23 | Place 23-byte item on stack | – | 3 |
0x77 | PUSH24 | Place 24-byte item on stack | – | 3 |
0x78 | PUSH25 | Place 25-byte item on stack | – | 3 |
0x79 | PUSH26 | Place 26-byte item on stack | – | 3 |
0x7a | PUSH27 | Place 27-byte item on stack | – | 3 |
0x7b | PUSH28 | Place 28-byte item on stack | – | 3 |
0x7c | PUSH29 | Place 29-byte item on stack | – | 3 |
0x7d | PUSH30 | Place 30-byte item on stack | – | 3 |
0x7e | PUSH31 | Place 31-byte item on stack | – | 3 |
0x7f | PUSH32 | Place 32-byte (full word) item on stack | – | 3 |
0x80 | DUP1 | Duplicate 1st stack item | – | 3 |
0x81 | DUP2 | Duplicate 2nd stack item | – | 3 |
0x82 | DUP3 | Duplicate 3rd stack item | – | 3 |
0x83 | DUP4 | Duplicate 4th stack item | – | 3 |
0x84 | DUP5 | Duplicate 5th stack item | – | 3 |
0x85 | DUP6 | Duplicate 6th stack item | – | 3 |
0x86 | DUP7 | Duplicate 7th stack item | – | 3 |
0x87 | DUP8 | Duplicate 8th stack item | – | 3 |
0x88 | DUP9 | Duplicate 9th stack item | – | 3 |
0x89 | DUP10 | Duplicate 10th stack item | – | 3 |
0x8a | DUP11 | Duplicate 11th stack item | – | 3 |
0x8b | DUP12 | Duplicate 12th stack item | – | 3 |
0x8c | DUP13 | Duplicate 13th stack item | – | 3 |
0x8d | DUP14 | Duplicate 14th stack item | – | 3 |
0x8e | DUP15 | Duplicate 15th stack item | – | 3 |
0x8f | DUP16 | Duplicate 16th stack item | – | 3 |
0x90 | SWAP1 | Exchange 1st and 2nd stack items | – | 3 |
0x91 | SWAP2 | Exchange 1st and 3rd stack items | – | 3 |
0x92 | SWAP3 | Exchange 1st and 4th stack items | – | 3 |
0x93 | SWAP4 | Exchange 1st and 5th stack items | – | 3 |
0x94 | SWAP5 | Exchange 1st and 6th stack items | – | 3 |
0x95 | SWAP6 | Exchange 1st and 7th stack items | – | 3 |
0x96 | SWAP7 | Exchange 1st and 8th stack items | – | 3 |
0x97 | SWAP8 | Exchange 1st and 9th stack items | – | 3 |
0x98 | SWAP9 | Exchange 1st and 10th stack items | – | 3 |
0x99 | SWAP10 | Exchange 1st and 11th stack items | – | 3 |
0x9a | SWAP11 | Exchange 1st and 12th stack items | – | 3 |
0x9b | SWAP12 | Exchange 1st and 13th stack items | – | 3 |
0x9c | SWAP13 | Exchange 1st and 14th stack items | – | 3 |
0x9d | SWAP14 | Exchange 1st and 15th stack items | – | 3 |
0x9e | SWAP15 | Exchange 1st and 16th stack items | – | 3 |
0x9f | SWAP16 | Exchange 1st and 17th stack items | – | 3 |
0xa0 | LOG0 | Append log record with no topics | – | 375 |
0xa1 | LOG1 | Append log record with one topic | – | 750 |
0xa2 | LOG2 | Append log record with two topics | – | 1125 |
0xa3 | LOG3 | Append log record with three topics | – | 1500 |
0xa4 | LOG4 | Append log record with four topics | – | 1875 |
0xa5 – 0xaf | Unused | – | | |
0xb0 | JUMPTO | Tentative; libevmasm has different numbers | EIP 615 | |
0xb1 | JUMPIF | Tentative | EIP 615 | |
0xb2 | JUMPSUB | Tentative | EIP 615 | |
0xb4 | JUMPSUBV | Tentative | EIP 615 | |
0xb5 | BEGINSUB | Tentative | EIP 615 | |
0xb6 | BEGINDATA | Tentative | EIP 615 | |
0xb8 | RETURNSUB | Tentative | EIP 615 | |
0xb9 | PUTLOCAL | Tentative | EIP 615 | |
0xba | GETLOCAL | Tentative | EIP 615 | |
0xbb – 0xe0 | Unused | – | | |
0xe1 | SLOADBYTES | Only referenced in pyethereum | – | – |
0xe2 | SSTOREBYTES | Only referenced in pyethereum | – | – |
0xe3 | SSIZE | Only referenced in pyethereum | – | – |
0xe4 – 0xef | Unused | – | | |
0xf0 | CREATE | Create a new account with associated code | – | 32000 |
0xf1 | CALL | Message-call into an account | – | Complicated |
0xf2 | CALLCODE | Message-call into this account with alternative account’s code | – | Complicated |
0xf3 | RETURN | Halt execution returning output data | – | 0 |
0xf4 | DELEGATECALL | Message-call into this account with an alternative account’s code, but persisting the current values for sender and value | – | Complicated |
0xf5 | CREATE2 | Create a new account at an address derived from the sender, a salt, and the hash of the init code | EIP 1014 | |
0xf6 – 0xf9 | Unused | – | – | |
0xfa | STATICCALL | Similar to CALL, but does not modify state | – | 40 |
0xfb | Unused | – | – | |
0xfc | TXEXECGAS | Not in yellow paper FIXME | – | – |
0xfd | REVERT | Stop execution and revert state changes, without consuming all provided gas and providing a reason | – | 0 |
0xfe | INVALID | Designated invalid instruction | – | 0 |
0xff | SELFDESTRUCT | Halt execution and register account for later deletion | – | 5000* |
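As a rough illustration of how the Gas column can be read, the Python sketch below adds up the listed base cost of a short instruction sequence. It takes the figures from the table at face value and ignores the asterisks, so treat the result as a ballpark for this table only, not as what a live network would charge. `GAS` and `static_gas` are names invented for this example.

```python
# Base costs copied from the Gas column of the table (asterisks dropped).
GAS = {"PUSH1": 3, "DUP2": 3, "SWAP1": 3, "SSTORE": 20000, "POP": 2}


def static_gas(instructions):
    """Sum the listed base cost of each instruction name."""
    return sum(GAS[name] for name in instructions)


sequence = ["PUSH1", "PUSH1", "DUP2", "SWAP1", "SSTORE", "POP"]
print(static_gas(sequence))  # 20014
```

Almost the entire total comes from the single SSTORE, which shows how much more expensive writing to storage is than shuffling values on the stack.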
Instructions
Each line in the disassembled code above is an instruction for the EVM to execute. Each instruction contains an opcode. For example 0x60 0x80 translates to:
PUSH1 0x80
  |    |
  |    Hex value for push.
  Opcode.
Deconstructing our contract
The compiled code has a lot of boilerplate. The code we have written is essentially compiled under “tag_1”:
tag_1:
pop
/* "hello.sol":99:100 1 */
0x01
/* "hello.sol":95:96 a */
0x00
/* "hello.sol":95:100 a = 1 */
dup2
swap1
sstore
pop
/* "hello.sol":33:109 contract HelloWolrd{ ... */
dataSize(sub_0)
dup1
dataOffset(sub_0)
0x00
codecopy
0x00
return
stop
The assignment a = 1 is represented by the bytecode “6001600081905550”. Let’s break it up into one instruction per line:
60 01
60 00
81
90
55
50
The EVM is basically a loop that executes each instruction from top to bottom. Let’s annotate the assembly code (indented under the label tag_1) with the corresponding bytecode to better see how they are associated:
tag_1:
// 60 01
0x1
// 60 00
0x0
// 81
dup2
// 90
swap1
// 55
sstore
// 50
pop
Note that 0x1 in the assembly code is actually a shorthand for push(0x1). This instruction pushes the number 1 onto the stack.
EVM: A Stack Machine
The EVM is a stack machine. Instructions might use values on the stack as arguments, and push values onto the stack as results. Let’s consider the operation add. Assume that there are two values on the stack:
[1 2]
When the EVM sees add, it adds the top 2 items together, and pushes the answer back onto the stack, resulting in:
[3]
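The same step can be written out in Python as a small sketch. The function name `add` and the use of a plain list are choices made for this example, and the list keeps the top of the stack at its end. The modulo reflects the fact that stack items are 256-bit words, so sums wrap around.

```python
WORD = 1 << 256  # stack items are 256-bit words, so arithmetic wraps at 2**256


def add(stack):
    """Pop the top two items and push their sum (modulo 2**256)."""
    a = stack.pop()
    b = stack.pop()
    stack.append((a + b) % WORD)


stack = [2, 1]   # the top of the stack is the last element of the list
add(stack)
print(stack)     # [3]
```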
We’ll write the stack with [], with the top of the stack on the left, and notate the contract storage with {}:
// Nothing in storage.
store: {}
// The value 0x1 is stored at the position 0x0.
store: { 0x0 => 0x1 }
Let’s now look at some real bytecode. We’ll simulate the bytecode sequence “6001600081905550” as the EVM would, and print out the machine state after each instruction:
// 60 01: pushes 1 onto stack
0x1
stack: [0x1]
// 60 00: pushes 0 onto stack
0x0
stack: [0x0 0x1]
// 81: duplicate the second item on the stack
dup2
stack: [0x1 0x0 0x1]
// 90: swap the top two items
swap1
stack: [0x0 0x1 0x1]
// 55: store the value 0x1 at position 0x0
// This instruction consumes the top 2 items
sstore
stack: [0x1]
store: { 0x0 => 0x1 }
// 50: pop (throw away the top item)
pop
stack: []
store: { 0x0 => 0x1 }
The end. The stack is empty, and there’s one item in storage. The EVM has now executed our contract successfully.
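The walkthrough can be reproduced with a very small interpreter. This Python sketch supports only the five opcodes that occur in “6001600081905550” and ignores gas, memory and everything else a real EVM does. The function name `run` is an assumption for this example. Internally the top of the stack is the end of the list, and the list is reversed when printing so that the top appears on the left.

```python
def run(bytecode_hex, storage=None):
    """Execute a tiny subset of EVM opcodes and return (stack, storage)."""
    code = bytes.fromhex(bytecode_hex)
    stack = []                                   # top of the stack = end of the list
    storage = {} if storage is None else storage
    pc = 0
    while pc < len(code):
        op = code[pc]
        size = 1                                 # bytes consumed by this instruction
        if op == 0x60:                           # PUSH1: the next byte is the value
            stack.append(code[pc + 1])
            size = 2
        elif op == 0x81:                         # DUP2: copy the second item to the top
            stack.append(stack[-2])
        elif op == 0x90:                         # SWAP1: exchange the top two items
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == 0x55:                         # SSTORE: pop the position, then the value
            position = stack.pop()
            value = stack.pop()
            storage[position] = value
        elif op == 0x50:                         # POP: throw away the top item
            stack.pop()
        else:
            raise ValueError(f"opcode 0x{op:02x} is not supported by this sketch")
        # Print the stack top-first to match the notation used in this post.
        print(f"0x{op:02x}  stack: {stack[::-1]}  store: {storage}")
        pc += size
    return stack, storage


final_stack, final_storage = run("6001600081905550")
print(final_stack, final_storage)  # [] {0: 1}
```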
Conclusion
It’s definitely a good investment to learn how a high-level language like Solidity runs on the Ethereum Virtual Machine (EVM). Knowing the EVM well will help you build great tools for yourself and others, and debug your code better. For further reading, check out the following articles: