[Ethereum] Do we have a full archive node after a geth fast sync?

fast-sync go-ethereum

I have read the comments in the PR for fast sync, and I am a little confused: Péter Szilágyi states that after a fast sync the user will have a full archive node with all historical states.

This allows a fast synced node to act as a full archive node from all intents and purposes.

My understanding of fast sync is that geth downloads some historical states on the way to the pivot block (the state download runs in parallel with the header/receipt download), but the complete state is only downloaded at the pivot point. From there on it continues with a full sync.

Hence, in the best case, we have some historical states before the pivot point and a full history after that, right? Can someone please confirm/correct my mental model?

Best Answer

I have the same understanding as you: a fast-synced node will have all blocks, but state only from the pivot block onwards (as far as I know, the pivot sits about 64 blocks behind the chain head at the time of the sync). Maybe what the author meant is "This allows a fast synced node to act as a full archive node from all intents and purposes, except there will be no state data for blocks earlier than the pivot block" :)

In geth 1.8, pruning is enabled by default (see https://github.com/ethereum/go-ethereum/releases/tag/v1.8.0). This means that even after the fast sync is complete, the node will not retain historical state. To disable pruning, use --gcmode=archive. As I understand it, pruning can also become a problem during the sync itself: if you start a fast sync and the network moves ahead in the meantime, most nodes with pruning enabled will prune the state data of your pivot block. You can then continue syncing only by pulling data from nodes that have pruning disabled.
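To make the mental model concrete, here is a minimal Python sketch of which block states you can expect to query on a fast-synced node. It is a simplification, not geth's actual logic: the function name `has_state`, the constants, and the block numbers in the usage note are all hypothetical, the 64-block pivot distance is my assumption, and real nodes may also keep some periodically flushed older states on disk.

```python
# Simplified model of state availability on a fast-synced geth 1.8 node.
# All names here are illustrative; none of them come from geth itself.

PIVOT_DISTANCE = 64  # assumption: pivot is about 64 blocks behind head at sync time
PRUNE_WINDOW = 128   # from the v1.8.0 release notes: recent states kept by default


def has_state(block: int, head: int, pivot: int, archive: bool) -> bool:
    """Return True if this model expects state for `block` to be available."""
    if block < 0 or block > head:
        # Not a block on the chain as seen by this node.
        return False
    if block < pivot:
        # Fast sync never assembled a complete state below the pivot.
        return False
    if archive:
        # With --gcmode=archive nothing is pruned from the pivot onwards.
        return True
    # Default pruning: only the most recent PRUNE_WINDOW states are guaranteed.
    return block > head - PRUNE_WINDOW
```

For example, with a sync started at head 5,000,000 the pivot would be 4,999,936. Once the head has moved to 5,100,000, block 4,999,000 has no state in either mode, block 5,000,500 has state only with --gcmode=archive, and block 5,099,990 has state in both modes.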

Tracing and pruning: By default, state for the last 128 blocks is kept in memory. Most states are garbage collected. If you are running a block explorer or other service relying on transaction tracing without an archive node (--gcmode=archive), you need to trace within this window! Alternatively, specify the "reexec" tracer option to allow regenerating historical state, and ideally switch to chain tracing, which amortizes overhead across all traced blocks.
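As a rough illustration of the "reexec" option mentioned above, the following Python sketch builds a JSON-RPC request body for debug_traceTransaction. I am assuming that "reexec" is passed as a field of the tracer options object in the second parameter, which matches my understanding of geth's debug API but should be checked against the documentation for your geth version. The helper name `build_trace_request` and the values in the usage note are hypothetical.

```python
# Sketch: build a JSON-RPC 2.0 body for tracing a transaction outside the
# default 128-block window. Placement of "reexec" is an assumption (see text).


def build_trace_request(tx_hash: str, reexec: int, request_id: int = 1) -> dict:
    """Return a request dict for debug_traceTransaction with a reexec budget."""
    if not (tx_hash.startswith("0x") and len(tx_hash) == 66):
        raise ValueError("expected a 32-byte transaction hash as 0x-prefixed hex")
    if reexec < 0:
        raise ValueError("reexec must be a non-negative number of blocks")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "debug_traceTransaction",
        # reexec: how many blocks the node may re-execute to rebuild pruned state
        "params": [tx_hash, {"reexec": reexec}],
    }
```

Serializing the result with json.dumps and posting it to the node's HTTP RPC endpoint (with the debug API enabled) would then ask the node to regenerate the missing state by re-executing up to the given number of blocks. Expect this to be slow for large values.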
