What Is an Ethereum Archive Node? A Beginner's Guide

An Ethereum archive node is a specialized type of full node that stores the complete historical state of the Ethereum blockchain, including every account balance, contract code, and storage value at every block since genesis. Unlike a pruned full node, an archive node never deletes old state, so anyone can query the exact state of the network at any point in the past. This makes it the most data-intensive node type, but also the most powerful one for developers, researchers, and blockchain explorers.

Ethereum Archive Node vs Full Node vs Light Node

To understand an Ethereum archive node, it helps to compare it with the other two main node types: full nodes and light nodes. Each trades off between storage, bandwidth, and functionality.

Node Type	Data Stored	Use Case	Storage Requirement
Light Node	Block headers only, no state	Mobile wallets, quick verification	~100 MB
Full Node (Pruned)	All blocks, only recent state (last 128 blocks)	Transaction sending, consensus participation	~600 GB – 1 TB
Ethereum Archive Node	All blocks + full state history for every block	Historical queries, analytics, research	12+ TB (and growing)

A pruned full node keeps all blocks but discards older account states. If you ask that node for a balance from a year ago, the state it needs is gone. The only way to recover it is to re-execute transactions from genesis (or from an older state the node still holds) up to the block in question, and in practice most clients simply return an error instead. An Ethereum archive node can answer the same query directly because it stored that exact state when the block was processed.
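As a rough illustration of what such a historical query looks like, the sketch below builds a standard `eth_getBalance` JSON-RPC request pinned to a specific block. The helper names (`balance_request`, `send`, `wei_hex_to_ether`) and the placeholder address are assumptions made for this example, and `rpc_url` would be whatever endpoint you have access to.

```python
import json
import urllib.request

# Placeholder address used purely for illustration.
EXAMPLE_ADDRESS = "0x" + "00" * 20


def balance_request(address: str, block_number: int, request_id: int = 1) -> dict:
    """Build an eth_getBalance request for the state at one specific block."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getBalance",
        # The second parameter selects the block and is hex-encoded.
        "params": [address, hex(block_number)],
    }


def send(rpc_url: str, payload: dict) -> dict:
    """POST a JSON-RPC payload to a node and return the decoded response."""
    req = urllib.request.Request(
        rpc_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wei_hex_to_ether(result_hex: str) -> float:
    """Convert the hex-encoded wei value in a response to ether."""
    return int(result_hex, 16) / 10**18
```

Sent to an archive node, a request like `balance_request(EXAMPLE_ADDRESS, 16_000_000)` should return a hex balance in its `result` field. Sent to a pruned node, the same request for an old block will usually come back with an error instead.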

Why the Difference Matters

For everyday use like sending transactions or checking your current balance, a pruned full node works perfectly. However, if you are building a block explorer that needs to display the balance of an address as it was 10,000 blocks ago, only an archive node can provide that answer without an expensive replay.

How an Ethereum Archive Node Stores Historical Data

Ethereum’s state trie is a data structure that maps accounts to their balances, nonces, storage roots, and code hashes. Each block produces a new version of this trie. A normal full node keeps only the most recent 128 versions (by default). An Ethereum archive node preserves every single version of the trie, from block 0 onward.
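The difference between keeping 128 versions and keeping all of them can be sketched with a toy model. This is not how any real client stores data: real tries share unchanged nodes between versions rather than copying whole snapshots, and `ToyStateStore` and `replay` are names invented for this example.

```python
from collections import OrderedDict


class ToyStateStore:
    """Keeps one state snapshot per block; drops old ones unless archiving."""

    def __init__(self, archive: bool, keep_recent: int = 128):
        self.archive = archive
        self.keep_recent = keep_recent
        self.snapshots = OrderedDict()  # block number -> {account: balance}

    def commit(self, block_number: int, balances: dict) -> None:
        """Record the state after a block, pruning the oldest if needed."""
        self.snapshots[block_number] = dict(balances)
        if not self.archive:
            while len(self.snapshots) > self.keep_recent:
                self.snapshots.popitem(last=False)  # discard the oldest version

    def balance_at(self, block_number: int, account: str) -> int:
        """Look up a balance as of a given block, if that state still exists."""
        if block_number not in self.snapshots:
            raise LookupError(f"state for block {block_number} is not available")
        return self.snapshots[block_number].get(account, 0)


def replay(store: ToyStateStore, blocks: int = 1000) -> ToyStateStore:
    """Feed the store a fake chain where alice's balance equals the block number."""
    balances = {}
    for n in range(blocks):
        balances["alice"] = n
        store.commit(n, balances)
    return store
```

After `replay`, a store created with `archive=True` can answer `balance_at(10, "alice")`, while one created with `archive=False` raises `LookupError` for that block and only serves the most recent 128.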

Pruning vs. Archive

Pruning: The node deletes older trie nodes that are no longer reachable from the latest state. This saves about 90% of storage compared to an archive node.
Archive mode: The node retains all trie nodes forever. The Ethereum Foundation’s Go implementation (Geth) uses the flag --gcmode=archive to enable this.

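The result is that an archive node’s database grows much faster than the chain of blocks alone, because every block adds a new layer of state on top of the block data itself. At a growth rate of 3–5 TB per year, that works out to roughly 60–100 GB per week.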

Why You Might Need an Ethereum Archive Node

Most users do not need an archive node, but several specific use cases make them indispensable:

Block explorers (e.g., Etherscan) must display account balances at any historical block. They rely on archive nodes to serve thousands of queries per second.
Smart contract debugging – To understand why a contract behaved unexpectedly months ago, you need to reproduce the exact state at that time. An archive node lets you replay transactions in that historical context.
Data analytics & research – Analysts studying DeFi adoption, NFT trading patterns, or whale movements often query balances across many blocks. An archive node makes these queries fast.
dApp development – Developers testing new features may need to simulate transactions against historical market conditions. Archive nodes provide the raw data for such forked testing environments (like using Hardhat or Ganache against an archive endpoint).

A Practical Example

Imagine you are building a tool that shows the total value locked (TVL) in a lending protocol on a specific date, say January 1, 2023. With a pruned node, the state for that date is gone, so you would have to re-sync and replay every transaction from genesis up to that date. With an Ethereum archive node, you issue read calls pinned to the block closest to that date (around block 16.3 million), and each call is answered directly from stored state with no replay.
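A read call pinned to a past block might be sketched as follows. It uses the standard `eth_call` JSON-RPC method with a block parameter. The contract address is a placeholder, the block number is illustrative, and using `totalSupply()` as a stand-in for a TVL figure is an assumption for this example, since a real protocol would expose its own read functions.

```python
# First 4 bytes of keccak256("totalSupply()"), the standard ERC-20 selector.
TOTAL_SUPPLY_SELECTOR = "0x18160ddd"

# Hypothetical contract address; substitute the protocol's real contract.
EXAMPLE_CONTRACT = "0x" + "11" * 20


def historical_call_request(contract: str, calldata: str, block_number: int) -> dict:
    """Build an eth_call request evaluated against the state at block_number."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_call",
        "params": [
            {"to": contract, "data": calldata},
            hex(block_number),  # pins the call to a historical block
        ],
    }


def decode_uint256(result_hex: str) -> int:
    """Decode a single uint256 return value from an eth_call result."""
    return int(result_hex, 16)


# Illustrative block number only.
request = historical_call_request(EXAMPLE_CONTRACT, TOTAL_SUPPLY_SELECTOR, 16_000_000)
```

The payload is posted to an archive endpoint like any other JSON-RPC request, and the `result` field of the response can be passed to `decode_uint256`.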

Running an Ethereum Archive Node: Practical Considerations

Setting up your own Ethereum archive node is not for the faint of heart. The hardware requirements are steep, and the syncing process can take weeks.

Hardware Requirements

Storage: As of early 2025, an archive node requires over 12 TB of solid-state drive (SSD) space. This grows at roughly 3–5 TB per year.
RAM: 16–32 GB minimum; 64 GB recommended for smooth operation.
CPU: A modern multi-core processor (e.g., Intel i7 or AMD Ryzen 7) with high single-thread performance.
Network: Fast, stable internet with unlimited bandwidth. Archive nodes download several terabytes during initial sync and continue streaming new data at every block.
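To get a feel for how the storage figure compounds, here is a small back-of-the-envelope projection. It assumes the 12 TB baseline and the 3–5 TB per year growth range quoted above and treats growth as linear, which is a simplification. The function names are made up for this sketch.

```python
def projected_storage_tb(
    years: float,
    baseline_tb: float = 12.0,
    growth_low: float = 3.0,
    growth_high: float = 5.0,
) -> tuple[float, float]:
    """Return (low, high) storage estimates in TB after the given number of years."""
    return (baseline_tb + growth_low * years, baseline_tb + growth_high * years)


def weekly_growth_gb(tb_per_year: float) -> float:
    """Convert a yearly growth rate in TB to an approximate weekly rate in GB."""
    return tb_per_year * 1000 / 52


low, high = projected_storage_tb(3)
print(f"After 3 years: {low:.0f}-{high:.0f} TB")
print(f"Weekly growth: {weekly_growth_gb(3.0):.0f}-{weekly_growth_gb(5.0):.0f} GB")
```

Under these assumptions, a disk sized for today's requirement fills up within a year or two, so it is worth planning for headroom up front.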

Syncing Time

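The initial sync of an archive node from genesis can take 3–8 weeks depending on hardware and network speed. Snap sync does not shorten this: it downloads only a recent copy of the state, so it cannot produce the historical states an archive node needs. An archive node has to perform a full sync, executing every block from genesis and storing the state after each one (in Geth, --syncmode=full together with --gcmode=archive).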

Alternatives to Self-Hosting

Given the cost and complexity, most users access archive data through third-party providers. Services like Infura, Alchemy, and QuickNode offer archive node endpoints for a monthly fee. This is far cheaper than buying 12+ TB of SSD and running a dedicated machine 24/7.
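Before relying on a provider endpoint, it can help to check whether it actually serves historical state. One possible approach, sketched below, is to request a balance at a very early block and see whether the response carries a result or an error. The function names are hypothetical, and the error text in the comment is only an example, since the exact message varies by client and provider.

```python
def probe_request(address: str = "0x" + "00" * 20) -> dict:
    """Build an eth_getBalance request against block 1, far outside any pruning window."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBalance",
        "params": [address, "0x1"],
    }


def serves_historical_state(response: dict) -> bool:
    """Interpret the node's JSON-RPC response to the probe.

    A response with a "result" suggests the old state is available.
    A response with an "error" suggests it has been pruned.
    """
    return "result" in response and "error" not in response


# Example responses, written by hand for illustration:
archive_like = {"jsonrpc": "2.0", "id": 1, "result": "0x0"}
pruned_like = {
    "jsonrpc": "2.0",
    "id": 1,
    "error": {"code": -32000, "message": "missing trie node"},  # wording varies
}
```

A passing probe is a hint rather than a guarantee, because some providers serve archive data only on specific plans or for specific methods.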

Common Misconceptions About Ethereum Archive Nodes

Several myths surround archive nodes. Let’s clear them up.

Misconception: Archive nodes are needed to validate transactions.
Truth: A pruned full node can validate every transaction and block just as securely. Archive mode is about querying history, not security.
Misconception: Running an archive node creates more decentralization.
Truth: An archive node validates and relays new blocks exactly as a pruned full node does, so it contributes no more to decentralization than a far cheaper pruned node. The extra historical state it keeps only serves queries; it plays no role in consensus.
Misconception: You need an archive node to look up old transactions.
Truth: Raw transaction data (e.g., the exact calldata of a swap) lives in the block database, which full nodes retain as well. What an archive node adds is the resulting state after every block, so only state queries at old blocks require one.

Conclusion

An Ethereum archive node is the most complete replica of the Ethereum blockchain, preserving every historical account state and storage slot since the network’s inception. While unnecessary for ordinary users, it is a critical tool for block explorers, analytics platforms, and developers who need to query past states quickly. The trade-off is enormous storage and syncing costs, making third-party archive endpoints the practical choice for most. Understanding the difference between archive, full, and light nodes helps you choose the right tool for your project without wasting resources.
