[Ethereum] Parity’s “warp” sync, and why is it faster than Geth “fast”

fast-sync parity synchronization warp-sync

A follow-up of one of the classic questions on this site:

One of the answers to this question suggested using Geth's --fast flag to help quickly synchronise the block data.

Now, Parity comes with a --warp flag to enable synchronization in 10 minutes.

--warp        Enable syncing from the snapshot over the network. (default: false)

How does the --warp flag work, and how does using it speed up the synchronisation? Are we syncing less data, or are we in some way performing fewer checks on its integrity or source?

Best Answer

It's difficult to give an answer without just re-hashing the explanation on the Parity wiki...

The pertinent part is as follows:

These snapshots can be used to quickly get a full copy of the state at a given block. Every 30,000 blocks, nodes will take a consensus-critical snapshot of that block's state. Any node can fetch these snapshots over the network, enabling a fast sync.

The snapshot itself is comprised of 3 parts:

A manifest, which is basically metadata about the snapshot;
Block chunks, which contain raw block data about blocks and their transaction receipts;
State chunks, which contain data about the state at a given block.

Chunks are currently set to 4MB in size.

So how does this actually speed up the sync? What that wiki page doesn't say is that we only sync the snapshots initially. So for each block at intervals of 30,000, we obtain a set of 4MB chunks. Then in the background we continue to sync the remaining block data.

This is equivalent to Geth's --fast sync, which first syncs the block headers, and then in the background syncs the rest of the data. It's just that --warp is syncing even less data on the first pass, and filling in the bigger gaps later on.
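To make the 30,000-block interval concrete, here is a minimal sketch of the arithmetic. It assumes a snapshot exists at every exact multiple of 30,000 and that a node would pick the most recent one; the real client may well choose an older snapshot, and the function names are just illustrative.

```python
SNAPSHOT_INTERVAL = 30_000  # blocks between snapshots, per the wiki excerpt


def latest_snapshot_block(head: int, interval: int = SNAPSHOT_INTERVAL) -> int:
    """Return the highest multiple of `interval` that is not above `head`."""
    if head < 0:
        raise ValueError("block number must be non-negative")
    return head - head % interval


def blocks_after_snapshot(head: int, interval: int = SNAPSHOT_INTERVAL) -> int:
    """Blocks between the chosen snapshot and the chain head.

    These are the blocks a node would still have to fetch and process
    normally once the snapshot state has been restored.
    """
    return head - latest_snapshot_block(head, interval)


head = 2_345_678  # hypothetical chain head, for illustration only
print(latest_snapshot_block(head), blocks_after_snapshot(head))
```

So under these assumptions a warp-synced node is never more than 29,999 blocks behind the head once the snapshot is restored, regardless of how long the chain is.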

Edit:

See also the relevant official Ethcore blog post, specifically the section entitled Core Strength.

Why Does Fast Sync Restart

As stated in Péter Szilágyi's comment above, you will have to wait for --fast syncing to complete, otherwise you will have to restart the process again. The message you should see on your console when --fast syncing has completed is "fast sync complete, auto disabling" as shown below:

I0416 17:16:12.631667   30629 blockchain.go:1251] imported 195 block(s) 
  (0 queued 0 ignored) including 0 txs in 1.235990428s. #384 [d707e667 / d3d5d5c1]
I0416 17:16:12.631825   30629 sync.go:180] fast sync complete, auto disabling
I0416 17:16:48.831757   30629 blockchain.go:1251] imported 4 block(s) (0 
  queued 0 ignored) including 0 txs in 12.933585ms. #388 [bbb506ab / 0ace7268]

And to restart, you will have to clear your chaindata folder - see "How do I reset my blockchain and run geth --fast" below.

Ethereum Mining after Fast Sync

Sync the blockchain, and once the latest blocks are being synced, you can type the following command in your console:

miner.start(n)

where n is the number of threads you want your CPU to mine with.

I'm assuming here that you want to mine with a regular CPU, and not a graphics processing unit (GPU). You may want to refer to "Is CPU mining even worth the Ether?".

If you do have a GPU on your computer, you may first want to search this site for "mining" Q&As, or ask a separate question if you cannot find an answer. Here is one Q&A: "How to mine Ether on GNU + Linux?".

Note that you will first have to create an account into which your mining rewards get paid. See "But I do have a GPU and want to mine with it" below.

How should I proceed?

See details below.

What should I do to stay synced?

You should only need to run geth --fast console the first time. The --fast option will not sync the blocks any faster after the first time. You can omit the --fast parameter in subsequent runs of geth.

When you run geth for the first time without the --fast parameter, geth may take a few days to download the blockchain from other computers over the Internet - this time depends on the speed of your network connection and your computer's CPU.

If you do use geth --fast for the first time in your fresh installation, geth --fast will take several hours to download the blockchain - again this would depend on your network connection and your CPU.

After your initial download of the blockchain using geth --fast, you only need to run geth without the --fast parameter. Syncing will now be fast, as only the new blocks need to be downloaded from other Ethereum nodes over the Internet, and they are being produced at an average rate of about one block every 14 to 15 seconds.
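As a rough illustration of why catching up is quick, the sketch below estimates how many blocks a node has to fetch after being offline, using the 14 to 15 second block time quoted above. The function name and the eight-hour example are my own, not anything from geth.

```python
def blocks_behind(seconds_offline: float, block_time: float) -> int:
    """Estimate how many blocks were produced while the node was offline."""
    if seconds_offline < 0 or block_time <= 0:
        raise ValueError("need a non-negative duration and a positive block time")
    return int(seconds_offline // block_time)


eight_hours = 8 * 60 * 60  # example downtime in seconds

# A 15 s block time gives the lower estimate, 14 s the upper one.
low = blocks_behind(eight_hours, 15)
high = blocks_behind(eight_hours, 14)
print(f"roughly {low} to {high} blocks to catch up")
```

That is a couple of thousand blocks for an overnight gap, compared with the whole chain on a first sync.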

If you want to reset your blockchain and re-download the blockchain using --fast, see the section below "How do I reset my blockchain and run geth --fast".

Did I do something wrong, or is `geth --fast` not for mining?

geth --fast is used to INITIALLY download a copy of the current blockchain from other Ethereum nodes over the Internet. As answered in What is Geth's "fast" sync, and why is it faster?:

Instead of processing the entire block-chain one link at a time, and replay all transactions that ever happened in history, fast syncing downloads the transaction receipts along the blocks, and pulls an entire recent state database.

geth --fast is NOT for mining. It is just the first step of downloading a copy of the blockchain. You will subsequently need a continuously syncing copy of the blockchain if you want to mine.

Was there any other step that I missed?

Not that I can tell. It is unusual for your chain to start syncing from the beginning, unless the fast sync did not complete correctly or there are some configuration problems. Try clearing your chaindata directory and re-syncing your blockchain. You should not need to re-sync from scratch after this.

The Details

I'm assuming that you want to run the syncing command in one window (#1) and attach another geth console in another window (#2). When you want to exit from your console, use the Control-D (^D) keystroke. If you press Control-C multiple times, or kill the process in other ways (kill on Linux or Mac, or Task Manager on Windows), your blockchain data can get corrupted (only very rarely - it happened to me once).

Syncing for the first time

In window #1, run the command:

geth --fast console

In window #2, run the following command to attach to the geth --fast console instance above:

geth attach

You don't need the --rpc flags for this as communication between these two geth instances will be done over the IPC protocol. The IPC protocol only runs within the local computer through a file descriptor. The RPC protocol can be used for communication across different computers.
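For comparison, here is a minimal sketch of what talking to geth over RPC from a script could look like. It assumes geth was started with the --rpc flag and is listening on http://127.0.0.1:8545; the helper names build_payload and call_rpc are mine. eth_syncing and eth_blockNumber are standard Ethereum JSON-RPC methods.

```python
import json
import urllib.request

# Assumption: geth was started with --rpc on its default HTTP endpoint.
DEFAULT_URL = "http://127.0.0.1:8545"


def build_payload(method, params=None, request_id=1):
    """Encode a JSON-RPC 2.0 request body as bytes."""
    body = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }
    return json.dumps(body).encode("utf-8")


def call_rpc(method, params=None, url=DEFAULT_URL, timeout=5.0):
    """POST one JSON-RPC request and return the 'result' field of the reply."""
    request = urllib.request.Request(
        url,
        data=build_payload(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

With a node running, call_rpc("eth_syncing") should return False once the node is caught up (or an object with the current and highest block while it is still syncing), and int(call_rpc("eth_blockNumber"), 16) gives the latest block number. Pointing url at another machine's address is what the RPC route allows and IPC does not.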

Syncing after the first time

In window #1, run the command:

geth console

In window #2, run the following command to attach to the geth console instance above:

geth attach

Console message difference between `--fast` and normal syncing

The following messages are displayed on the geth --fast console screen - note the header(s) and receipt(s):

I0416 13:35:53.497422   30629 blockchain.go:889] imported 192 header(s) 
  (0 ignored) in 71.941018ms. #6336 [2edbbc3f… / b80c9ac3…]
I0416 13:35:54.263134   30629 blockchain.go:1044] imported 192 receipt(s) 
  (0 ignored) in 55.447062ms. #6336 [2edbbc3f… / b80c9ac3…]
I0416 13:35:54.683682   30629 blockchain.go:889] imported 192 header(s) 
  (0 ignored) in 73.050377ms. #6528 [8ab9a7af… / f2ffecac…]

And the following messages are displayed on the geth console screen - note the block(s):

I0416 13:32:23.331906   30581 blockchain.go:1251] imported 256 block(s) (0 
  queued 0 ignored) including 0 txs in 979.938402ms. #6366 [66dcf4c1 / c5d009a1]
I0416 13:32:24.169955   30581 blockchain.go:1251] imported 256 block(s) (0 
  queued 0 ignored) including 0 txs in 836.388044ms. #6622 [e11a3fa9 / d211c2e1]
I0416 13:32:24.974790   30581 blockchain.go:1251] imported 256 block(s) (0 
  queued 0 ignored) including 0 txs in 803.457715ms. #6878 [c9f9ae12 / 238493b8]

Here is the transition when the --fast syncing has completed and normal syncing starts:

I0416 17:16:12.631667   30629 blockchain.go:1251] imported 195 block(s) 
  (0 queued 0 ignored) including 0 txs in 1.235990428s. #384 [d707e667 / d3d5d5c1]
I0416 17:16:12.631825   30629 sync.go:180] fast sync complete, auto disabling
I0416 17:16:48.831757   30629 blockchain.go:1251] imported 4 block(s) (0 
  queued 0 ignored) including 0 txs in 12.933585ms. #388 [bbb506ab / 0ace7268]
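If you would rather not watch the console, the messages above can be checked by a script. This is a minimal sketch that assumes the log format shown in this answer (other geth versions may word their messages differently); the names summarize, IMPORT_RE and DONE_MARKER are mine.

```python
import re

# Matches the "imported N header(s)/receipt(s)/block(s)" part of the log lines.
IMPORT_RE = re.compile(r"imported (\d+) (header|receipt|block)\(s\)")
# Marker printed once when fast sync hands over to normal syncing.
DONE_MARKER = "fast sync complete"


def summarize(log_text: str) -> dict:
    """Total the imported items per kind and report whether fast sync finished."""
    totals = {"header": 0, "receipt": 0, "block": 0}
    for count, kind in IMPORT_RE.findall(log_text):
        totals[kind] += int(count)
    return {"imported": totals, "fast_sync_done": DONE_MARKER in log_text}


# The transition excerpt from this answer, wrapped lines included.
transition = """\
I0416 17:16:12.631667   30629 blockchain.go:1251] imported 195 block(s)
  (0 queued 0 ignored) including 0 txs in 1.235990428s. #384 [d707e667 / d3d5d5c1]
I0416 17:16:12.631825   30629 sync.go:180] fast sync complete, auto disabling
I0416 17:16:48.831757   30629 blockchain.go:1251] imported 4 block(s) (0
  queued 0 ignored) including 0 txs in 12.933585ms. #388 [bbb506ab / 0ace7268]
"""
print(summarize(transition))
```

Header and receipt counts with no completion marker suggest the node is still fast syncing. Block imports after the marker mean normal syncing has taken over and it is safe to stop and restart geth.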

How do I reset my blockchain and run `geth --fast`

The blockchain data, by default, is stored in the following locations (reference Backup & Restore):

Mac: ~/Library/Ethereum/chaindata
Linux: ~/.ethereum/chaindata
Windows: %APPDATA%\Ethereum\chaindata

Delete the contents of the directory above, or move it to another location and, once you have successfully synced your data, delete the old copy.

Once the data in the chaindata directory is removed, you should be able to --fast sync again.
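The "move it to another location" option can be scripted. The sketch below uses the default locations listed above and renames the directory instead of deleting it; the function names and the ".old" suffix are my own choices. Stop geth first, and check that the path matches your installation, since other versions may keep the data elsewhere.

```python
import os
import sys
from pathlib import Path


def default_chaindata() -> Path:
    """Default chaindata location for the current platform, as listed above."""
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Ethereum" / "chaindata"
    if sys.platform == "win32":
        return Path(os.environ["APPDATA"]) / "Ethereum" / "chaindata"
    return Path.home() / ".ethereum" / "chaindata"


def set_aside(chaindata: Path, suffix: str = ".old") -> Path:
    """Rename the chaindata directory so geth starts from an empty one.

    Nothing is deleted; the old data stays next to the original location
    until you remove it yourself. Returns the backup path.
    """
    if not chaindata.is_dir():
        raise FileNotFoundError(f"no chaindata directory at {chaindata}")
    backup = chaindata.with_name(chaindata.name + suffix)
    if backup.exists():
        raise FileExistsError(f"backup already exists at {backup}")
    chaindata.rename(backup)
    return backup
```

Calling set_aside(default_chaindata()) with geth stopped leaves a chaindata.old directory behind, which you can delete once the new --fast sync has completed.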

But I do have a GPU and want to mine with it

In this case you will need another application like ethminer that will perform the mining operations on the GPU. Communication between geth and ethminer does not work via IPC. You will need to enable RPC communication using:

geth --rpc console

geth uses the default --rpcaddr 127.0.0.1 and --rpcport 8545. Start ethminer with the following command:

ethminer -F http://127.0.0.1:8545 -G

The -G parameter is the instruction for ethminer to perform the mining computations using your GPU.

Before you can run the commands above to mine, you will have to create an account into which any mining rewards will be paid. Run the following command:

geth account new

You will be prompted for a password twice, then you are good to run the commands above.

