Once I've downloaded the blockchain with geth --fast, is there dynamic pruning on the upcoming blocks, or will I archive all blocks from where I stand?
[Ethereum] dynamic pruning in geth --fast
fast-sync, go-ethereum, state-trie-pruning, synchronization
Related Solutions
geth and parity use different methods to save the Ethereum blockchain in their internal formats. I ran many benchmarks because I found syncing took too long just to use a wallet.
The pruning mode determines how the block data is saved. In archive mode, all states are saved, so you know the state at any moment without replaying the whole blockchain. With fast and light modes, we assume we don't need all that information, only the current state and a few states before it, so many intermediate states are removed.
With geth, the --fast method saves the state of the blockchain at block B[-1500] and all states after that block (B[-1] is the last block, B[-2] the one before it, and so on). So it is possible to rewind to the state of any of the last 1500 blocks. With a full archive blockchain, you can do this for all blocks.
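As a rough illustration of that retention window, here is a sketch of which block states remain rewindable under such pruning. The 1500-block constant mirrors the window described above, and the function name is mine, not geth's:

```go
package main

import "fmt"

// retainedWindow mirrors the ~1500-block window described above
// (an assumption for illustration, not a geth internal constant).
const retainedWindow = 1500

// oldestRewindable returns the oldest block whose state is still
// retained under a fast-sync-style pruning window.
func oldestRewindable(head uint64) uint64 {
	if head < retainedWindow {
		return 0 // the whole chain is still within the window
	}
	return head - retainedWindow // states before B[-1500] are pruned
}

func main() {
	fmt.Println(oldestRewindable(2000000)) // 1998500
	fmt.Println(oldestRewindable(1000))    // 0
}
```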
In parity, there are 4 pruning modes, or journalDB algorithms:
- Archive (Archive): as with geth's archive mode, all states are kept
- Fast (OverlayRecent): as with geth's fast mode, the full states of the last B[-i] blocks are kept
- Light (EarlyMerge): the states of the last B[-i] blocks are kept, but as diffs (so it is smaller than fast, but access is slower)
- Basic (RefCounted): the states of the last B[-i] blocks are kept as with OverlayRecent, but states are removed after x changes, so only the last x changes are available
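To make the RefCounted ("Basic") idea concrete, here is a toy sketch that keeps only the last x versions of each state entry. This is my own illustration of the concept, not Parity's journalDB code; all names are hypothetical:

```go
package main

import "fmt"

// journal keeps at most maxVersions recent states per key, discarding
// older ones -- loosely mimicking the RefCounted mode described above.
type journal struct {
	maxVersions int
	states      map[string][]string
}

func newJournal(max int) *journal {
	return &journal{maxVersions: max, states: make(map[string][]string)}
}

// record appends a new state for key and prunes versions beyond the cap.
func (j *journal) record(key, state string) {
	v := append(j.states[key], state)
	if len(v) > j.maxVersions {
		v = v[len(v)-j.maxVersions:] // drop the oldest versions
	}
	j.states[key] = v
}

func main() {
	j := newJournal(2)
	j.record("acct", "s1")
	j.record("acct", "s2")
	j.record("acct", "s3")
	fmt.Println(j.states["acct"]) // only the last 2 versions survive: [s2 s3]
}
```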
Benchmarks done on an i7 3720QM with 16 GB RAM, Geth 1.4.4 (Homestead, 1.6M blocks):

| Option | Disk Used | Time | Disk Written |
|--------|-----------|------|--------------|
| none   | 19 GB     | 5h00 | 1 TB         |
| fast   | 3.7 GB    | 1h00 | 100 GB       |
Benchmarks done on an i7 3720QM with 16 GB RAM, Geth 1.5.0 unstable (Homestead, 1.6M blocks; found at https://gitter.im/ethereum/go-ethereum?at=574d26c010f0fed86f49b32f):

| Command     | Disk Used | Time | Disk Written |
|-------------|-----------|------|--------------|
| geth        | 21 GB     | 5h00 | 150 GB       |
| geth --fast | 4.2 GB    | 21m  | 35 GB        |
| geth export | 1.5 GB    | 10m  |              |
| geth import | 21 GB     | 3h30 |              |
Benchmarks done on an i7 3720QM with 16 GB RAM, Parity 1.2 (Homestead, 1.6M blocks):

| Option  | Disk Used | Time | Disk Written |
|---------|-----------|------|--------------|
| archive | 19 GB     | 2h00 | 300 GB       |
| fast    | 3.7 GB    | 1h30 | 20 GB        |
| light   | 2.5 GB    | 2h00 | 130 GB       |
Note: when you have a node with a blockchain, you can copy the chaindata directory of geth to use it on your other computers. I checked this with Linux, Windows and OS X.
Note: using --cache with 1024 could make it faster, but the difference was not significant on my system. The same goes for --jitvm.
Note: the Ethereum blockchain saves the final state after transactions, but it is safer to replay the transactions to verify them.
Summary
I downloaded the geth source, modified the code to specify the fast sync pivot block, compiled it, removed the old chaindata, and started fast syncing. Once this is complete, I'll go back to running the regular geth binaries.
UPDATE: This experiment failed. A few errors in my hack prevented the blockchain from fast syncing to the specified block and then syncing normally after that block. Back to full archive node sync.
Does anyone have any suggestions?
Details
I downloaded the source code for geth and modified the section that calculates the fast sync pivot point, eth/downloader/downloader.go, lines 419-441:
case FastSync:
    // Calculate the new fast/slow sync pivot point
    if d.fsPivotLock == nil {
        pivotOffset, err := rand.Int(rand.Reader, big.NewInt(int64(fsPivotInterval)))
        if err != nil {
            panic(fmt.Sprintf("Failed to access crypto random source: %v", err))
        }
        if height > uint64(fsMinFullBlocks)+pivotOffset.Uint64() {
            pivot = height - uint64(fsMinFullBlocks) - pivotOffset.Uint64()
        }
    } else {
        // Pivot point locked in, use this and do not pick a new one!
        pivot = d.fsPivotLock.Number.Uint64()
    }
    // If the point is below the origin, move origin back to ensure state download
    if pivot < origin {
        if pivot > 0 {
            origin = pivot - 1
        } else {
            origin = 0
        }
    }
    glog.V(logger.Debug).Infof("Fast syncing until pivot block #%d", pivot)
I modified the last line above to change the Debug into Info, and added the following lines below the code above:
glog.V(logger.Info).Infof("Fast syncing until pivot block #%d", pivot)
if pivot >= 2394190 {
    pivot = 2394190
}
glog.V(logger.Info).Infof("Fast syncing until modified pivot block #%d", pivot)
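The clamp I added boils down to a simple cap on the computed pivot. Written out as a standalone helper for clarity (the function name is mine, not geth's):

```go
package main

import "fmt"

// clampPivot caps the computed fast-sync pivot at a fixed maximum block,
// matching the hack above that forces the pivot to be at most 2394190.
func clampPivot(pivot, max uint64) uint64 {
	if pivot >= max {
		return max
	}
	return pivot
}

func main() {
	fmt.Println(clampPivot(2664150, 2394190)) // a pivot beyond the cap is clamped
	fmt.Println(clampPivot(2000000, 2394190)) // an earlier pivot is left alone
}
```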
I recompiled and started the fast sync process using the modified binaries:
Iota:go-ethereum user$ make geth
...
Done building.
Run "build/bin/geth" to launch geth.
I checked the version of the modified geth:
Iota:go-ethereum user$ build/bin/geth version
Geth
Version: 1.5.3-unstable
I removed the old damaged chaindata:
Iota:go-ethereum user$ build/bin/geth removedb
/Users/bok/Library/Ethereum/chaindata
Remove this database? [y/N] y
Removing...
Removed in 35.242291ms
I started the fast sync:
Iota:go-ethereum user$ build/bin/geth --fast --cache=1024 console
I1120 23:44:44.870142 ethdb/database.go:83] Allotted 1024MB cache and 1024 file handles to /Users/user/Library/Ethereum/geth/chaindata
I1120 23:44:44.878926 ethdb/database.go:176] closed db:/Users/user/Library/Ethereum/geth/chaindata
...
I1121 08:33:51.340811 eth/downloader/downloader.go:441] Fast syncing until pivot block #2664150
I1121 08:33:51.340847 eth/downloader/downloader.go:445] Fast syncing until modified pivot block #2394190
After the fast syncing is complete, I'll go back to using the regular geth binaries.
Best Answer
The simple answer is no. --fast downloads a pruned version of the state trie, but geth behaves like an archive node afterwards. So if you want to save disk space, remove your blockchain with geth removedb and do a --fast sync again.