[Ethereum] Syncing keeps behind the blockchain

blockchain synchronization

I am trying to run a geth node just to learn what I can do with it. I started geth with fast syncing. The problem is that it never finishes syncing and always lags about 100 blocks behind. My datadir is on my HDD, and I know that is the issue, but I don't have space on my SSD. Is there any way I can continue with an HDD?

Windows 10 64bit

32GB RAM

geth --fast --rpc --cache=8192 --datadir ./Ethereum

I also forwarded UDP port 30303, following advice from another forum.

Best Answer

Syncing the Ethereum blockchain with Geth in --fast mode has two phases that run in parallel: block sync and state trie download. Both phases must complete before the node is fully synced and switches to full mode, where every transaction is executed and verified.

The block sync downloads all the block information (headers, transactions). This phase uses a lot of CPU and disk space to store all the data. You can observe this process in the logs through the "Imported new block headers" and "Imported new block receipts" messages.

INFO [09-26|09:25:19.045] Imported new block headers               count=1    elapsed=80.177ms     number=8623429 hash=c064e8…4daa8b age=1m1s
INFO [09-26|09:19:52.655] Imported new block receipts              count=65   elapsed=396.964ms    number=8623342 hash=2ef982…20344e age=17m32s    size=2.35MiB

However, in fast mode no transactions are executed, so no account state is available (i.e. balances, nonces, smart contract code and data). Geth needs to download the state trie and cross-check it against the latest block. This phase is called the state trie download and usually takes longer than the block sync. It is reported in the logs by the following statements:

INFO [09-26|09:29:27.542] Imported new state entries               count=1152 elapsed=16.372ms     processed=338933905 pending=2630   retry=0   duplicate=16797 unexpected=352359
INFO [09-26|09:29:30.307] Imported new state entries               count=768  elapsed=10.657ms     processed=338934673 pending=3075   retry=0   duplicate=16797 unexpected=352359

The charts below show some metrics during the syncing process. We can observe that once the block sync has finished, the node stores less data and consumes less CPU and memory. However, Geth is still downloading and writing state entries at a high rate.

When you are between 64 and 128 blocks behind, it usually means you have finished the block sync phase and are still in the state trie download phase. During that phase, the block number will keep oscillating between 64 and 128 blocks behind the latest block mined on Ethereum. This is normal until the state trie download ends and your node is fully synced.
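As a rough illustration of the rule of thumb above, here is a minimal Python sketch that classifies how far behind a node is. The function name and the returned labels are made up for this example, and the 64 and 128 thresholds are simply the ones quoted in this answer. The sample numbers are the header and receipt block numbers from the log lines shown earlier, used here only as stand-ins for "latest known block" and "local block".

```python
def describe_lag(local_block, network_head):
    """Classify the block lag using the 64-128 rule of thumb (illustrative only)."""
    lag = network_head - local_block
    if lag < 0:
        raise ValueError("local block is ahead of the reported head")
    if lag < 64:
        return "close to head"
    if lag <= 128:
        # Typical while the state trie download is still running.
        return "probably still downloading state"
    return "probably still in block sync"


# Block numbers taken from the sample log lines (headers vs. receipts).
print(describe_lag(8623342, 8623429))
```

A lag inside that window does not prove anything on its own; it is only a hint that the remaining work is the state download rather than the blocks.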

To know how close you are to the end of the state trie download, compare the value of processed=x (latest state entry downloaded) with the size of the trie. It's hard to get the exact size, as it grows all the time. In this recent comment, it was mentioned that the trie has around 475,000,000 state entries.
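The comparison can be sketched in a few lines of Python. This is only an approximation: the helper name is hypothetical, the 475,000,000 total is the estimate quoted above (the real trie keeps growing), and the regular expression assumes the log format shown in the sample lines of this answer.

```python
import re

# Rough trie size quoted in the answer; the real figure keeps growing.
ESTIMATED_TRIE_ENTRIES = 475_000_000

# Matches the processed=N field of an "Imported new state entries" log line.
PROCESSED_RE = re.compile(r"Imported new state entries.*?\bprocessed=(\d+)")


def state_sync_progress(log_line, estimated_total=ESTIMATED_TRIE_ENTRIES):
    """Return the estimated fraction of state entries processed.

    Returns None when the line is not a state-entries log line.
    """
    match = PROCESSED_RE.search(log_line)
    if match is None:
        return None
    processed = int(match.group(1))
    # Cap at 1.0 because the estimate may be lower than the real trie size.
    return min(processed / estimated_total, 1.0)


line = (
    "INFO [09-26|09:29:30.307] Imported new state entries               "
    "count=768  elapsed=10.657ms     processed=338934673 pending=3075   "
    "retry=0   duplicate=16797 unexpected=352359"
)
progress = state_sync_progress(line)
print(f"Estimated state download progress: {progress:.1%}")
```

Because the total is only an estimate, treat the percentage as a ballpark figure rather than a countdown.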

However, with an HDD, your disk write rate might not be high enough to keep up and catch the head (latest state entry).


This answer is inspired from my article Running an Ethereum Full Node on a RaspberryPi 4 (model B)
