What Is an Ethereum Archive Node? A Beginner's Guide
An Ethereum archive node stores the complete state history of the blockchain. Learn how it differs from full and light nodes, when you need one, and practical tips for running or using one.

An Ethereum archive node is a specialized type of full node that stores the complete historical state of the Ethereum blockchain, including every account balance, contract code, and storage value at every block since genesis. Unlike a pruned full node, an archive node never deletes old data, allowing anyone to query the exact state of the network at any point in the past. This makes it the most data-intensive but also the most powerful tool for developers, researchers, and blockchain explorers.
Ethereum Archive Node vs Full Node vs Light Node
To understand an Ethereum archive node, it helps to compare it with the other two main node types: full nodes and light nodes. Each trades off between storage, bandwidth, and functionality.
| Node Type | Data Stored | Use Case | Storage Requirement |
|---|---|---|---|
| Light Node | Block headers only, no state | Mobile wallets, quick verification | ~100 MB |
| Full Node (Pruned) | All blocks, only recent state (last 128 blocks) | Transaction sending, consensus participation | ~600 GB – 1 TB |
| Ethereum Archive Node | All blocks + full state history for every block | Historical queries, analytics, research | 12+ TB (and growing) |
A pruned full node keeps all blocks but discards older account states. If you ask that node for a balance from a year ago, it typically returns an error because the state for that block is gone; rebuilding it would mean re-executing transactions from genesis (or from the last state the node still holds) up to that block. An Ethereum archive node can answer the same query directly because it kept the exact state as of that block.
Why the Difference Matters
For everyday use like sending transactions or checking your current balance, a pruned full node works perfectly. However, if you are building a block explorer that needs to display the balance of an address as it was 10,000 blocks ago, only an archive node can provide that answer without an expensive replay.
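A historical balance lookup of this kind is a single JSON-RPC call, `eth_getBalance`, with a block number as the second parameter. The sketch below only builds the request and decodes a reply; the helper names and the zero address are placeholders chosen for illustration, and sending the request to a node is left out.

```python
import json

def balance_request(address: str, block_number: int, request_id: int = 1) -> dict:
    """Build an eth_getBalance JSON-RPC request pinned to one block."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        # Block numbers are sent as hex quantities, e.g. 16000000 -> "0xf42400".
        "params": [address, hex(block_number)],
        "id": request_id,
    }

def parse_balance(response: dict) -> int:
    """Return the balance in wei from a JSON-RPC response, or raise on error."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown RPC error"))
    return int(response["result"], 16)

# Placeholder address; substitute the account you are interested in.
payload = balance_request("0x0000000000000000000000000000000000000000", 16_000_000)
print(json.dumps(payload))
```

A pruned node would normally answer this request with an error for an old block, which `parse_balance` surfaces as an exception, while an archive node returns a hex-encoded wei amount.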
How an Ethereum Archive Node Stores Historical Data
Ethereum’s state trie is a data structure that maps accounts to their balances, nonces, storage roots, and code hashes. Each block produces a new version of this trie. A normal full node keeps only the most recent 128 versions (by default). An Ethereum archive node preserves every single version of the trie, from block 0 onward.
Pruning vs. Archive
- Pruning: The node deletes older trie nodes that are no longer reachable from the latest state. This saves about 90% of storage compared to an archive node.
- Archive mode: The node retains all trie nodes forever. The Ethereum Foundation’s Go implementation (Geth) uses the flag `--gcmode=archive` to enable this.
The result is that an archive node’s database grows with every block and never shrinks, currently adding tens of gigabytes per week.
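The difference between the two modes can be illustrated with a toy model. This is a deliberately simplified sketch, not how any client stores data: real clients share unchanged trie nodes between versions, whereas the hypothetical `StateHistory` class below copies a whole dictionary per block. The 128-version window is the default mentioned above.

```python
from collections import OrderedDict
from typing import Optional

class StateHistory:
    """Toy per-block state store. keep_last=None behaves like archive mode."""

    def __init__(self, keep_last: Optional[int] = None):
        self.keep_last = keep_last
        self.versions = OrderedDict()  # block number -> snapshot of balances

    def commit(self, block_number: int, balances: dict) -> None:
        """Record the state after a block, pruning old versions if configured."""
        self.versions[block_number] = dict(balances)
        if self.keep_last is not None:
            while len(self.versions) > self.keep_last:
                self.versions.popitem(last=False)  # drop the oldest version

    def balance_at(self, block_number: int, account: str) -> int:
        """Look up a balance at a block, failing if that state was pruned."""
        if block_number not in self.versions:
            raise LookupError(f"state for block {block_number} is not available")
        return self.versions[block_number].get(account, 0)

pruned = StateHistory(keep_last=128)
archive = StateHistory()
balances = {}
for block in range(1000):
    balances["alice"] = block  # pretend the balance changes every block
    pruned.commit(block, balances)
    archive.commit(block, balances)
```

After the loop, `archive.balance_at(10, "alice")` still works, while the same call on `pruned` raises `LookupError` because only blocks 872 through 999 remain.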
Why You Might Need an Ethereum Archive Node
Most users do not need an archive node, but several specific use cases make them indispensable:
- Block explorers (e.g., Etherscan) must display account balances at any historical block. They rely on archive nodes to serve thousands of queries per second.
- Smart contract debugging – To understand why a contract behaved unexpectedly months ago, you need to reproduce the exact state at that time. An archive node lets you replay transactions in that historical context.
- Data analytics & research – Analysts studying DeFi adoption, NFT trading patterns, or whale movements often query balances across many blocks. An archive node makes these queries fast.
- dApp development – Developers testing new features may need to simulate transactions against historical market conditions. Archive nodes provide the raw data for such forked testing environments (like using Hardhat or Ganache against an archive endpoint).
A Practical Example
Imagine you are building a tool that shows the total value locked (TVL) in a lending protocol on a specific date, say January 1, 2023. With a pruned node, the state for that date is gone, so you would have to re-sync and replay millions of transactions up to that point. With an Ethereum archive node, you issue a single read call against the block mined closest to that date (around block 16.3 million), and the answer comes back in milliseconds.
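A date-based query like this first needs the date converted into a block number. One common approach is a binary search over block timestamps, sketched below. The `get_timestamp` callable is an assumption: in practice it would wrap a call such as `eth_getBlockByNumber` and return the block's timestamp, but here a synthetic chain with 12-second blocks stands in for it.

```python
from datetime import datetime, timezone
from typing import Callable

def first_block_at_or_after(
    target_ts: int, latest_block: int, get_timestamp: Callable[[int], int]
) -> int:
    """Binary-search for the first block whose timestamp is >= target_ts."""
    if get_timestamp(latest_block) < target_ts:
        raise ValueError("target time is later than the latest block")
    lo, hi = 0, latest_block
    while lo < hi:
        mid = (lo + hi) // 2
        if get_timestamp(mid) < target_ts:
            lo = mid + 1  # mid is too early, search the upper half
        else:
            hi = mid      # mid qualifies, but an earlier block might too
    return lo

# Unix timestamp for midnight UTC on January 1, 2023.
jan_1_2023 = int(datetime(2023, 1, 1, tzinfo=timezone.utc).timestamp())

# Synthetic chain for demonstration only: block n is mined 12 s after block n-1.
GENESIS_TS = 1_000_000
def fake_timestamp(block_number: int) -> int:
    return GENESIS_TS + 12 * block_number

block = first_block_at_or_after(GENESIS_TS + 12 * 500 - 5, 10_000, fake_timestamp)
```

The search needs only about log2(latest block) timestamp lookups, so even against a live chain with millions of blocks it takes a few dozen requests. The resulting block number then goes into the historical read call.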
Running an Ethereum Archive Node: Practical Considerations
Setting up your own Ethereum archive node is not for the faint of heart. The hardware requirements are steep, and the syncing process can take weeks.
Hardware Requirements
- Storage: As of early 2025, an archive node requires over 12 TB of solid-state drive (SSD) space. This grows at roughly 3–5 TB per year.
- RAM: 16–32 GB minimum; 64 GB recommended for smooth operation.
- CPU: A modern multi-core processor (e.g., Intel i7 or AMD Ryzen 7) with high single-thread performance.
- Network: Fast, stable internet with unlimited bandwidth. Archive nodes download several terabytes during initial sync and continue streaming new data at every block.
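For capacity planning, the storage figures above can be turned into a rough projection. This is back-of-the-envelope arithmetic using only the numbers quoted in this section (12 TB today, 3 to 5 TB of growth per year) and assumes linear growth, which may not hold; the function name is illustrative.

```python
def projected_storage_tb(current_tb: float, growth_tb_per_year: float, years: float) -> float:
    """Linear projection of archive storage needs, in terabytes."""
    return current_tb + growth_tb_per_year * years

# Low and high estimates from the growth range quoted above.
for years in (1, 2, 3):
    low = projected_storage_tb(12, 3, years)
    high = projected_storage_tb(12, 5, years)
    print(f"after {years} year(s): {low}-{high} TB")
```

Under these assumptions, a disk array sized for three years of operation would need somewhere between 21 and 27 TB before any safety margin.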
Syncing Time
The initial sync of an archive node from genesis can take 3–8 weeks depending on hardware and network speed. Shortcuts such as snap sync, which download a recent state instead of rebuilding it, do not help here: to record the state at every block, the node has to execute every block from genesis. In Geth, this means running archive mode together with full sync.
Alternatives to Self-Hosting
Given the cost and complexity, most users access archive data through third-party providers. Services like Infura, Alchemy, and QuickNode offer archive node endpoints for a monthly fee. This is far cheaper than buying 12+ TB of SSD and running a dedicated machine 24/7.
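Before relying on a hosted endpoint, it can be worth checking that it actually serves historical state. The sketch below probes the balance of a placeholder address at block 1, which a pruned node will normally have discarded. The helper names are hypothetical, the endpoint URL is whatever your provider gives you, and the check assumes that a missing state comes back as a JSON-RPC `error` object; some providers may signal it differently, for example with an HTTP error status.

```python
import json
import urllib.request
from typing import Callable

def http_sender(url: str) -> Callable[[dict], dict]:
    """Return a function that POSTs a JSON-RPC payload to url and decodes the reply."""
    def send(payload: dict) -> dict:
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=10) as reply:
            return json.load(reply)
    return send

def serves_historical_state(send: Callable[[dict], dict], probe_block: int = 1) -> bool:
    """Ask for a balance at a very old block; success suggests archive access."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": ["0x0000000000000000000000000000000000000000", hex(probe_block)],
        "id": 1,
    }
    response = send(payload)
    return "result" in response and "error" not in response
```

With a real endpoint, the call would look like `serves_historical_state(http_sender(url))`, where `url` is your provider's HTTPS address. Passing the transport in as a function also makes the check easy to exercise with a fake sender in tests.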
Common Misconceptions About Ethereum Archive Nodes
Several myths surround archive nodes. Let’s clear them up.
- Misconception: Archive nodes are needed to validate transactions.
  Truth: A pruned full node can validate every transaction and block just as securely. Archive mode is about querying history, not security.
- Misconception: Running an archive node creates more decentralization.
  Truth: An archive node validates and relays new blocks the same way a pruned full node does, so it contributes no more to decentralization than any other full node. What archive mode adds is a queryable historical database, not extra consensus weight.
- Misconception: Archive nodes store every transaction’s raw data.
  Truth: They store the resulting state after each block, not necessarily the transaction inputs. To retrieve raw transaction data (e.g., the exact calldata of a swap), you still need access to the blockchain’s transaction database, which all full nodes already retain.
Conclusion
An Ethereum archive node is the most complete replica of the Ethereum blockchain, preserving every historical account state and storage slot since the network’s inception. While unnecessary for ordinary users, it is a critical tool for block explorers, analytics platforms, and developers who need to query past states quickly. The trade-off is enormous storage and syncing costs, making third-party archive endpoints the practical choice for most. Understanding the difference between archive, full, and light nodes helps you choose the right tool for your project without wasting resources.