Ethereum EVM Source Code Analysis: Inside the Virtual Machine

The Ethereum Virtual Machine (EVM) is the runtime environment at the heart of Ethereum’s smart contract functionality. It executes code in a secure, sandboxed environment, ensuring deterministic behavior across all nodes in the network. This article dives into the inner workings of the EVM by analyzing its implementation in the official Go Ethereum client (geth). We'll explore how contracts are created and executed, how instructions are interpreted, and how resources like gas and memory are managed.

Whether you're a blockchain developer, a researcher, or just curious about how smart contracts run under the hood, this deep dive into EVM internals will provide valuable insights.

EVM Module Architecture Overview

The evm module in Go Ethereum is designed around a few core components that work together to execute smart contracts. Understanding this architecture is key to grasping how Ethereum processes transactions.

At the center is the EVM struct — it represents a single instance of the virtual machine, instantiated for each transaction. The EVM relies on three primary collaborators:

Interpreter (EVMInterpreter): Executes bytecode instructions one by one.
Configuration (vm.Config): Holds settings like the active instruction set (JumpTable) and gas pricing rules.
State Database (StateDB): Persists account states, including balances, contract code, and storage.

When a transaction arrives, ApplyTransaction creates a new EVM instance. This function converts the transaction into a message format used within the EVM context and triggers either contract creation or execution via StateTransition.TransitionDb.

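func ApplyTransaction(...) {
    // msg is the transaction converted to the EVM message format (conversion elided)
    // Build the block and transaction context the EVM runs in
    context := NewEVMContext(msg, header, bc, author)
    // Create a fresh EVM instance for this transaction
    vmenv := vm.NewEVM(context, statedb, config, cfg)
    // Execute the message; this ends up in StateTransition.TransitionDb
    _, gas, failed, err := ApplyMessage(vmenv, msg, gp)
}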

Inside TransitionDb, the system checks if the transaction is creating a new contract (determined by whether the "to" address is empty). If so, it calls EVM.Create; otherwise, it invokes EVM.Call. After execution, unused gas is refunded to the sender, and consumed gas is credited to the miner.
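The branch described above can be modeled in a few lines. This is a minimal sketch, not geth's code: the message type and the dispatch and settle helpers are hypothetical names introduced for illustration, and settle ignores the storage-refund counter that the real state transition also takes into account.

```go
package main

import "fmt"

// message is a hypothetical stand-in for the message built from a transaction.
type message struct {
	to *string // nil models an empty "to" address
}

// dispatch mirrors the branch described for TransitionDb.
func dispatch(msg message) string {
	if msg.to == nil {
		return "create" // would invoke EVM.Create
	}
	return "call" // would invoke EVM.Call
}

// settle splits the gas limit into the unused part returned to the sender
// and the consumed part credited to the miner (both in gas units).
func settle(gasLimit, gasLeft uint64) (toSender, toMiner uint64) {
	return gasLeft, gasLimit - gasLeft
}

func main() {
	to := "0xabc" // placeholder recipient
	fmt.Println(dispatch(message{}))        // create
	fmt.Println(dispatch(message{to: &to})) // call
	fmt.Println(settle(100000, 30000))      // 30000 70000
}
```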

This modular design keeps execution logic isolated while allowing flexibility across different Ethereum upgrades.

Contract Lifecycle: Creation and Invocation

Smart contracts on Ethereum go through two main phases: creation and calling. Both are handled by the EVM but follow distinct paths.

Creating a Smart Contract

Contract creation begins when a transaction has no recipient address. The EVM.Create method computes a unique contract address using the sender's address and nonce:

contractAddr = crypto.CreateAddress(caller.Address(), evm.StateDB.GetNonce(caller.Address()))

This ensures that even deploying identical code multiple times results in different addresses due to increasing nonces.

The actual creation happens in EVM.create, which performs several critical steps:

Depth Check: Aborts if the call depth exceeds 1024, which bounds nested calls and contract creations.
Balance Verification: Ensures the creator has enough funds to cover the value transfer.
Nonce Increment: Increases the creator’s nonce before deployment.
Account Initialization: Creates a new account in StateDB and, once EIP-158 is active, sets its nonce to 1.
Value Transfer: Sends the ether specified in the transaction to the new contract.
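The five steps above can be sketched against a toy in-memory state. Everything here is an assumption made for illustration: the state map stands in for StateDB, addresses are plain strings, the new address is passed in rather than derived, and checks that geth performs but the list does not mention (such as address collisions) are left out.

```go
package main

import (
	"errors"
	"fmt"
)

const maxCallDepth = 1024 // depth cap named in the text

type account struct {
	balance uint64
	nonce   uint64
}

// state is a hypothetical in-memory stand-in for StateDB.
type state map[string]*account

var (
	errDepth               = errors.New("max call depth exceeded")
	errInsufficientBalance = errors.New("insufficient balance for transfer")
)

// create walks the five steps in order.
func create(s state, creator, newAddr string, value uint64, depth int) error {
	if depth > maxCallDepth { // 1. depth check
		return errDepth
	}
	c, ok := s[creator]
	if !ok || c.balance < value { // 2. balance verification
		return errInsufficientBalance
	}
	c.nonce++                       // 3. nonce increment
	s[newAddr] = &account{nonce: 1} // 4. account initialization (nonce 1, EIP-158 style)
	c.balance -= value              // 5. value transfer
	s[newAddr].balance += value
	return nil
}

func main() {
	s := state{"alice": &account{balance: 10}}
	fmt.Println(create(s, "alice", "contract", 3, 0))        // <nil>
	fmt.Println(s["alice"].balance, s["contract"].balance)   // 7 3
	fmt.Println(create(s, "alice", "other", 100, 0) != nil)  // true
}
```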

After setup, a Contract object is initialized with caller info, gas limit, and deployment code. The real magic happens when run(evm, contract, nil, false) executes the initialization bytecode — typically compiler-injected setup logic such as constructor functions.

If execution succeeds, its return data becomes the contract’s runtime bytecode, which is saved with StateDB.SetCode(address, ret).

Invoking Existing Contracts

Once deployed, contracts can be called using four methods:

Call
CallCode
DelegateCall
StaticCall

Understanding Call Variants

Method	Purpose
Call	Runs the callee’s code in the callee’s own context (its storage and address)
CallCode	Runs the callee’s code against the caller’s storage (superseded by DelegateCall)
DelegateCall	Like CallCode, but also keeps the original sender and value
StaticCall	Read-only call that rejects any state modification

DelegateCall is especially important for library reuse. For example, if contract A uses DelegateCall to invoke library B, B runs using A’s storage and address — making it behave like part of A itself.

This pattern enables upgradable contracts and shared logic without duplication.
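The difference in execution context can be illustrated with a toy model. This is only a sketch under simplifying assumptions: contracts are Go structs, "code" is a Go function, and the names contract, call, delegateCall, and increment are hypothetical.

```go
package main

import "fmt"

// contract is a hypothetical model: an address, its own storage, and code
// expressed as a function that writes to whichever storage it is handed.
type contract struct {
	addr    string
	storage map[string]int
	code    func(self string, storage map[string]int)
}

// increment plays the role of library logic.
func increment(self string, st map[string]int) {
	st["counter"]++
	st["ranAs:"+self] = 1 // records which address the code ran as
}

// call runs the target's code with the target's own address and storage.
func call(target *contract) {
	target.code(target.addr, target.storage)
}

// delegateCall runs the target's code with the caller's address and storage.
func delegateCall(caller, target *contract) {
	target.code(caller.addr, caller.storage)
}

func main() {
	b := &contract{addr: "B", storage: map[string]int{}, code: increment}
	a := &contract{addr: "A", storage: map[string]int{}}

	call(b)            // mutates B's storage
	delegateCall(a, b) // mutates A's storage using B's code
	delegateCall(a, b)

	fmt.Println(a.storage["counter"], b.storage["counter"]) // 2 1
}
```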

Static Calls and Read-Only Enforcement

StaticCall enforces immutability by rejecting any operation that modifies state. During interpretation, if a write instruction (like SSTORE) is encountered while readOnly=true, the interpreter returns errWriteProtection.

This check occurs in enforceRestrictions:

if in.readOnly {
    if operation.writes || (op == CALL && stack.Back(2).BitLen() > 0) {
        return errWriteProtection
    }
}

Thus, a function declared view that is invoked through StaticCall fails at runtime if it attempts to modify state.
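A runnable analogue of this check, reduced to its essentials, might look as follows. It is a sketch and not the geth function: checkReadOnly and the writes map are hypothetical names, only SSTORE is marked as a writing opcode, and the third stack item of CALL is passed in directly as callValue.

```go
package main

import (
	"errors"
	"fmt"
)

type opCode byte

// Opcode values as assigned by the EVM.
const (
	SLOAD  opCode = 0x54
	SSTORE opCode = 0x55
	CALL   opCode = 0xf1
)

var errWriteProtection = errors.New("write protection")

// writes marks state-modifying opcodes; only SSTORE is listed in this sketch.
var writes = map[opCode]bool{SSTORE: true}

// checkReadOnly rejects writing opcodes and value-bearing CALLs in
// read-only mode. callValue stands in for the third stack item of CALL.
func checkReadOnly(readOnly bool, op opCode, callValue uint64) error {
	if readOnly && (writes[op] || (op == CALL && callValue > 0)) {
		return errWriteProtection
	}
	return nil
}

func main() {
	fmt.Println(checkReadOnly(true, SLOAD, 0))   // <nil>
	fmt.Println(checkReadOnly(true, SSTORE, 0))  // write protection
	fmt.Println(checkReadOnly(true, CALL, 1))    // write protection
	fmt.Println(checkReadOnly(false, SSTORE, 0)) // <nil>
}
```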

Interpreting Bytecode: The Role of EVMInterpreter

The EVMInterpreter is responsible for stepping through bytecode instructions. However, it doesn't directly implement opcodes — instead, it delegates execution to functions defined in the JumpTable.

Execution Flow

The main loop inside Run follows these steps:

Fetch next opcode via contract.GetOp(pc)
Retrieve corresponding operation from JumpTable[op]
Validate stack requirements
Enforce read-only restrictions
Calculate and deduct gas cost
Resize memory if needed
Execute the operation via operation.execute(...)

Each step ensures safety and consistency before proceeding.
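The loop can be made concrete with a tiny interpreter that supports three opcodes. This is a sketch under loud assumptions: the vm and operation types are hypothetical, stack words are uint64 rather than 256-bit values, and the read-only and memory-resize steps are omitted because none of the three opcodes needs them.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	STOP  = 0x00
	ADD   = 0x01
	PUSH1 = 0x60
)

var (
	errInvalidOpcode  = errors.New("invalid opcode")
	errStackUnderflow = errors.New("stack underflow")
	errOutOfGas       = errors.New("out of gas")
)

type vm struct {
	code  []byte
	pc    int
	stack []uint64
	gas   uint64
	done  bool
}

type operation struct {
	execute  func(m *vm)
	gasCost  uint64
	minStack int // operands that must already be on the stack
}

// jumpTable maps each opcode byte to its operation; nil means undefined.
var jumpTable = [256]*operation{
	STOP:  {execute: func(m *vm) { m.done = true }},
	ADD:   {execute: opAdd, gasCost: 3, minStack: 2},
	PUSH1: {execute: opPush1, gasCost: 3},
}

func opAdd(m *vm) {
	n := len(m.stack)
	m.stack = append(m.stack[:n-2], m.stack[n-2]+m.stack[n-1])
}

func opPush1(m *vm) {
	m.pc++ // the byte after PUSH1 is its immediate operand
	var v uint64
	if m.pc < len(m.code) {
		v = uint64(m.code[m.pc])
	}
	m.stack = append(m.stack, v)
}

// run returns the final stack and the gas left over.
func run(code []byte, gas uint64) ([]uint64, uint64, error) {
	m := &vm{code: code, gas: gas}
	for !m.done && m.pc < len(m.code) {
		op := m.code[m.pc]    // fetch
		entry := jumpTable[op] // look up
		if entry == nil {
			return nil, 0, errInvalidOpcode
		}
		if len(m.stack) < entry.minStack { // validate stack
			return nil, 0, errStackUnderflow
		}
		if m.gas < entry.gasCost { // charge gas before executing
			return nil, 0, errOutOfGas
		}
		m.gas -= entry.gasCost
		entry.execute(m) // execute
		m.pc++
	}
	return m.stack, m.gas, nil
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, STOP
	stack, gasLeft, err := run([]byte{PUSH1, 2, PUSH1, 3, ADD, STOP}, 100)
	fmt.Println(stack, gasLeft, err) // [5] 91 <nil>
}
```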

Memory and Stack Management

Two temporary data areas are used during execution:

Stack: A LIFO structure holding up to 1024 *big.Int values. Used for arithmetic and control flow.
Memory: A byte array simulating RAM. Grows dynamically and incurs gas costs when expanded.

These are recreated for every call frame, ensuring isolation between executions.
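The cost of growing memory can be worked through with the formula from the Ethereum protocol specification: a linear term of 3 gas per 32-byte word plus a quadratic term of words squared divided by 512. The sketch below is an illustration with hypothetical function names, and it ignores the overflow checks a production implementation would need.

```go
package main

import "fmt"

const (
	memoryGas    = 3   // linear cost per 32-byte word
	quadCoeffDiv = 512 // divisor of the quadratic term
)

// toWords rounds a byte size up to whole 32-byte words.
func toWords(size uint64) uint64 { return (size + 31) / 32 }

// memoryCost is the total gas attributable to a memory of the given size.
func memoryCost(size uint64) uint64 {
	w := toWords(size)
	return w*memoryGas + w*w/quadCoeffDiv
}

// expansionCost is the gas charged for growing from oldSize to newSize bytes.
func expansionCost(oldSize, newSize uint64) uint64 {
	if newSize <= oldSize {
		return 0 // memory never shrinks and re-use is free
	}
	return memoryCost(newSize) - memoryCost(oldSize)
}

func main() {
	fmt.Println(memoryCost(32))          // 3
	fmt.Println(memoryCost(1024))        // 98
	fmt.Println(expansionCost(32, 1024)) // 95
}
```

Because of the quadratic term, each additional word costs more than the one before it, which discourages very large allocations.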

Permanent data resides in StateDB, which maps addresses to accounts containing balance, nonce, code hash, and storage trie root.

Jump Tables and Opcode Handling

The vm.Config.JumpTable is an array of 256 operation structs — one per possible opcode. Each entry defines:

execute: Function pointer for actual logic
gasCost: Dynamic or static gas calculation
validateStack: Ensures correct operand count
writes: Marks state-modifying operations

Different protocol versions (Homestead, Byzantium, Constantinople) use different jump tables to support evolving features.

For example:

ADD: {
    execute: opAdd,
    gasCost: constGasFunc(GasFastestStep), // Always 3 gas
}

Instructions such as SHL, SHR, and CREATE2 were added in the Constantinople fork.
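One way to organize fork-specific tables is to build each one from its predecessor and add the new entries. The sketch below illustrates that idea with hypothetical type and function names; the opcode bytes for SHL (0x1b) and SHR (0x1c) and their cost of 3 gas follow EIP-145, but the structure should be read as an assumption rather than as geth's exact layout.

```go
package main

import "fmt"

type operation struct {
	name    string
	gasCost uint64
}

// jumpTable has one slot per possible opcode byte; nil means undefined.
type jumpTable [256]*operation

// newBaseInstructionSet stands in for an older fork's table.
func newBaseInstructionSet() jumpTable {
	var t jumpTable
	t[0x01] = &operation{name: "ADD", gasCost: 3}
	return t
}

// newExtendedInstructionSet starts from the base table and adds the
// shift opcodes that a later fork introduced.
func newExtendedInstructionSet() jumpTable {
	t := newBaseInstructionSet() // arrays are values in Go, so this is a copy
	t[0x1b] = &operation{name: "SHL", gasCost: 3}
	t[0x1c] = &operation{name: "SHR", gasCost: 3}
	return t
}

func main() {
	base := newBaseInstructionSet()
	ext := newExtendedInstructionSet()
	fmt.Println(base[0x1b] == nil) // true: SHL is undefined in the older table
	fmt.Println(ext[0x1b].name)    // SHL
	fmt.Println(ext[0x01].name)    // ADD: inherited from the base table
}
```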

Secure Jump Destinations

Jump instructions (JUMP, JUMPI) require targets to be valid jump destinations (JUMPDEST). But simply checking opcode value isn’t enough — what if that byte was data?

To solve this, the EVM uses a bit vector (bitvec) generated during contract analysis. As the analysis scans the bytecode:

Instructions mark their position as executable (0)
PUSH data regions mark bytes as non-code (1)

Then validJumpdest(pos) confirms both:

Opcode at target is JUMPDEST
Bit vector indicates it's not embedded data

This prevents malicious jumps into the middle of PUSH arguments.
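The analysis can be sketched in a few dozen lines. This is a simplified illustration of the technique rather than a copy of geth's code: the bitvec layout, the bounds handling, and the helper names set and codeSegment are assumptions made for the example.

```go
package main

import "fmt"

const (
	JUMPDEST = 0x5b
	PUSH1    = 0x60
	PUSH32   = 0x7f
)

// bitvec holds one bit per code byte: 0 = instruction, 1 = PUSH data.
type bitvec []byte

func (b bitvec) set(pos int) { b[pos/8] |= byte(0x80) >> uint(pos%8) }

// codeSegment reports whether pos is an instruction rather than PUSH data.
func (b bitvec) codeSegment(pos int) bool {
	return b[pos/8]&(byte(0x80)>>uint(pos%8)) == 0
}

// codeBitmap scans the code once and marks every PUSH immediate as data.
func codeBitmap(code []byte) bitvec {
	bits := make(bitvec, len(code)/8+1)
	for pc := 0; pc < len(code); {
		op := code[pc]
		pc++
		if op >= PUSH1 && op <= PUSH32 {
			n := int(op-PUSH1) + 1 // number of immediate data bytes
			for i := 0; i < n && pc < len(code); i++ {
				bits.set(pc)
				pc++
			}
		}
	}
	return bits
}

// validJumpdest requires both a JUMPDEST byte and an instruction position.
func validJumpdest(code []byte, bits bitvec, dest int) bool {
	if dest < 0 || dest >= len(code) {
		return false
	}
	return code[dest] == JUMPDEST && bits.codeSegment(dest)
}

func main() {
	// Byte 1 is 0x5b but is the operand of PUSH1; byte 2 is a real JUMPDEST.
	code := []byte{PUSH1, JUMPDEST, JUMPDEST}
	bits := codeBitmap(code)
	fmt.Println(validJumpdest(code, bits, 1)) // false
	fmt.Println(validJumpdest(code, bits, 2)) // true
}
```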

Gas Mechanics and Resource Accounting

Gas is Ethereum’s metering mechanism — every operation consumes some amount based on computational weight.

Gas Lifecycle

Intrinsic Gas: Deducted before execution based on transaction type and data size.
Per-Instruction Cost: Paid before each opcode runs.
Memory Expansion: Additional cost when memory grows.
Refund Mechanism: Partial refunds for clearing storage (SSTORE[x] = 0).

Unused gas is returned after execution. A transaction that runs out of gas forfeits all the gas it supplied, and its state changes are reverted.
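As a worked example of the first item, intrinsic gas can be computed from the transaction type and its data bytes. The sketch assumes the pre-Istanbul schedule (21,000 base, 53,000 for contract creation, 4 gas per zero byte, 68 per non-zero byte); later forks changed the non-zero byte cost, and the function and constant names are chosen for illustration.

```go
package main

import "fmt"

// Pre-Istanbul gas schedule (an assumption of this sketch).
const (
	txGas                 = 21000 // base cost of a plain transaction
	txGasContractCreation = 53000 // base cost when the transaction creates a contract
	txDataZeroGas         = 4     // per zero byte of data
	txDataNonZeroGas      = 68    // per non-zero byte of data
)

// intrinsicGas is the amount deducted before any bytecode runs.
func intrinsicGas(data []byte, contractCreation bool) uint64 {
	var gas uint64 = txGas
	if contractCreation {
		gas = txGasContractCreation
	}
	for _, b := range data {
		if b == 0 {
			gas += txDataZeroGas
		} else {
			gas += txDataNonZeroGas
		}
	}
	return gas
}

func main() {
	fmt.Println(intrinsicGas(nil, false))             // 21000
	fmt.Println(intrinsicGas([]byte{0x00, 0x01}, false)) // 21072
	fmt.Println(intrinsicGas([]byte{0x00, 0x01}, true))  // 53072
}
```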

Gas Calculation Example

For simple ops like ADD, gas is constant:

constGasFunc(GasFastestStep) // Returns 3

For complex ops like SSTORE, cost varies:

First time setting key: ~20,000 gas
Resetting to zero: ~5,000 gas + refund later
Reusing existing key: ~5,000 gas

These values have changed over time through EIPs such as EIP-1283 and EIP-2200 (which also incorporates EIP-1706), adjusting the economic incentives around storage.
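The three cases listed above correspond to the original SSTORE rule, which can be written as a small function. This sketch assumes that legacy schedule, including a refund of 15,000 gas for clearing a slot (a figure not stated in the list above); sstoreGas and the constant names are illustrative, and later net-metering rules are more involved.

```go
package main

import "fmt"

// Legacy SSTORE schedule (an assumption of this sketch).
const (
	sstoreSetGas    = 20000 // zero to non-zero
	sstoreResetGas  = 5000  // any other write
	sstoreRefundGas = 15000 // refund credited when a slot is cleared
)

// sstoreGas returns the gas charged and the refund credited for writing
// newVal into a slot that currently holds current.
func sstoreGas(current, newVal uint64) (cost, refund uint64) {
	switch {
	case current == 0 && newVal != 0:
		return sstoreSetGas, 0 // first time setting the key
	case current != 0 && newVal == 0:
		return sstoreResetGas, sstoreRefundGas // resetting to zero
	default:
		return sstoreResetGas, 0 // reusing an existing key
	}
}

func main() {
	fmt.Println(sstoreGas(0, 7)) // 20000 0
	fmt.Println(sstoreGas(7, 0)) // 5000 15000
	fmt.Println(sstoreGas(7, 8)) // 5000 0
}
```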

Frequently Asked Questions

Q: What is the purpose of the EVM?
A: The EVM executes smart contracts in a deterministic, isolated environment across all Ethereum nodes, ensuring consensus on state changes.

Q: How does DelegateCall differ from Call?
A: DelegateCall runs code from another contract but uses the caller’s storage and address — enabling reusable libraries and proxy patterns.

Q: Why does jump destination validation matter?
A: Without bit vector analysis, attackers could trick the EVM into jumping into PUSH data sections — potentially executing unintended logic.

Q: Is memory persistent across calls?
A: No. Each call frame gets its own fresh memory, which is discarded when that call returns.

Q: How is gas priced during execution?
A: Gas price is set by the transaction sender. The total fee equals gas used × gas price, paid in ETH.

Q: Can I run my own EVM locally?
A: Yes — tools like Hardhat or Foundry include local EVMs for testing smart contracts off-chain.
