[Ethereum] Private PoA Network doesn’t mine

cliquedigitaloceanpoaprivate-blockchainproof-of-authority

I have 5 (8GB/4vCPU) droplets (private networking enabled; IP's below are of this private network) running on DigitalOcean. I am startled because the workflow I outline below worked like charm locally.

  • OS: 4.4.0-116-generic #140-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux
  • geth: 1.7.3-stable-4bb3c89d

The droplets have ufw enabled, however I opened all tcp/udp ports that are relevant for geth/ssh.

  1. First, I dynamically generate a genesis block authorizing the 5 nodes as sealers and pre-allocating them with Ether (Also adding 100 other accounts for later use). (I have extradata and other attributes from puppeth)
  2. After that, I make sure that geth isn't running and kill all bootnode or geth processes via SSH.
  3. Then, I SCP the genesis.json and identities.js to all nodes. identities.js contains the 5 coinbase accounts.
  4. I sucessfully init the genesis block on all nodes and start the bootnode on the first droplet.
  5. Next up, it's time to start gethwith the following command on each node:
    • $1 is passed in using bash (1-5) from identities.js
    • $2 is passed in using bash and is the internal IP of en1 on the respective droplet
geth --datadir=/home/node/ --identity=$1 --port=30300 --rpc --rpccorsdomain=* --rpcapi admin,net,clique,eth,web3,miner,personal --rpcport=8500 --networkid=187 --preload=/home/identities.js --bootnodes enode://a349b003327c3075b779715a733d5076b986aae0225c48a760cd2b0e768b9654c0179014353ebb34eb0b9e7654caf69ec56d80b1b32705277bad953fb7777585@10.135.29.232:30301 --nodiscover --verbosity=6 --unlock 0 --password /home/password.sec --nat "extip:$2">> /home/node/console.log 2>&1 &
  1. Because we have the --nodiscover flag, I add all nodes using a bash script to first collect admin.nodeInfo from every node, and then recursively use admin.addPeer via ssh and geth --exec which suceeeds returning true for every entry.

  2. After that, we start mining on 3 out of 5 sealer nodes to reach consensus.

However, they never mine any blocks and just stay stuck in the following endless loop of Recalculated downloader QoS values even though they still have each other as peers when I check admin.peers and net.peerCount.
Also, checking clique.getSigners() sucessfully returns an array containing the pre-allocated and authorized nodes that are running and unlocked on the droplets.

The log-level output from geth with on all nodes looks like this:

TRACE[03-15|13:02:37] Registering sync peer                    peer=6c8d5ad9f798ede5
DEBUG[03-15|13:02:37] Relaxed downloader QoS values            rtt=20s confidence=0.333 ttl=1m0s
DEBUG[03-15|13:02:38] Adding static node                       node="enode://f26f9b43bae5ff8018a723f714f045f55d74a438abf7f043a71c7bf758ebd7fbfa2d31f9bd8f74e06e1d6b0744476013f0a7bf324e06235881a3c1f29fbdad91@10.135.75.26:30300?discport=0"
TRACE[03-15|13:02:38] New dial task                            task="staticdial f26f9b43bae5ff80 10.135.75.26:30300"
DEBUG[03-15|13:02:38] Adding p2p peer                          id=f26f9b43bae5ff80 name=Geth/sealer05/v1.7.3...                                  addr=10.135.75.26:30300  peers=4
TRACE[03-15|13:02:38] Dial task done                           task="staticdial f26f9b43bae5ff80 10.135.75.26:30300"
TRACE[03-15|13:02:38] Starting protocol eth/63                 id=f26f9b43bae5ff80 conn=staticdial
DEBUG[03-15|13:02:38] Ethereum peer connected                  id=f26f9b43bae5ff80 conn=staticdial name=Geth/sealer05/v1.7.3-stable-4bb3c89d/linux-amd64/go1.9.2
TRACE[03-15|13:02:38] Registering sync peer                    peer=f26f9b43bae5ff80
DEBUG[03-15|13:02:38] Relaxed downloader QoS values            rtt=20s confidence=0.250 ttl=1m0s
DEBUG[03-15|13:02:48] Recalculated downloader QoS values       rtt=20s confidence=0.625 ttl=1m0s
INFO [03-15|13:02:52] Transaction pool price threshold updated price=18000000000
INFO [03-15|13:02:52] Starting mining operation
INFO [03-15|13:02:52] Commit new mining work                   number=1 txs=0 uncles=0 elapsed=205.437µs
TRACE[03-15|13:02:52] Waiting for slot to sign and propagate   delay=-99.476ms
INFO [03-15|13:02:52] Successfully sealed new block            number=1 hash=4c706b…3edb3c
DEBUG[03-15|13:02:52] Trie cache stats after commit            misses=14 unloads=0
INFO [03-15|13:02:52] 🔨 mined potential block                  number=1 hash=4c706b…3edb3c
DEBUG[03-15|13:02:52] Reinjecting stale transactions           count=0
TRACE[03-15|13:02:52] Propagated block                         hash=4c706b…3edb3c recipients=2 duration=2562047h47m16.854s
INFO [03-15|13:02:52] Commit new mining work                   number=2 txs=0 uncles=0 elapsed=601.597µs
INFO [03-15|13:02:52] Signed recently, must wait for others
TRACE[03-15|13:02:52] Announced block                          hash=4c706b…3edb3c recipients=2 duration=2562047h47m16.854s
DEBUG[03-15|13:02:52] Queued propagated block                  peer=6c8d5ad9f798ede5 number=1 hash=b88988…21011c queued=1
DEBUG[03-15|13:02:52] Importing propagated block               peer=6c8d5ad9f798ede5 number=1 hash=b88988…21011c
WARN [03-15|13:02:52] Discarded bad propagated block           number=1 hash=b88988…21011c
TRACE[03-15|13:02:52] Propagated block                         hash=b88988…21011c recipients=1 duration=848.472µs
DEBUG[03-15|13:02:52] Queued propagated block                  peer=3869118cfc562f9e number=1 hash=b88988…21011c queued=1
DEBUG[03-15|13:02:52] Importing propagated block               peer=3869118cfc562f9e number=1 hash=b88988…21011c
WARN [03-15|13:02:52] Discarded bad propagated block           number=1 hash=b88988…21011c
TRACE[03-15|13:02:52] Propagated block                         hash=b88988…21011c recipients=0 duration=370.18µs
DEBUG[03-15|13:02:53] Queued propagated block                  peer=3869118cfc562f9e number=1 hash=dffae0…4fe644 queued=1
DEBUG[03-15|13:02:53] Importing propagated block               peer=3869118cfc562f9e number=1 hash=dffae0…4fe644
WARN [03-15|13:02:54] Discarded bad propagated block           number=1 hash=dffae0…4fe644
TRACE[03-15|13:02:54] Propagated block                         hash=dffae0…4fe644 recipients=1 duration=1.629ms
DEBUG[03-15|13:02:54] Queued propagated block                  peer=05eb0ebc7dec7aa9 number=1 hash=dffae0…4fe644 queued=1
DEBUG[03-15|13:02:54] Importing propagated block               peer=05eb0ebc7dec7aa9 number=1 hash=dffae0…4fe644
WARN [03-15|13:02:54] Discarded bad propagated block           number=1 hash=dffae0…4fe644
TRACE[03-15|13:02:54] Propagated block                         hash=dffae0…4fe644 recipients=1 duration=739.317µs
TRACE[03-15|13:03:05] Dial task done                           task="wait for dial hist expire (29.999994993s)"
TRACE[03-15|13:03:05] New dial task                            task="wait for dial hist expire (660.45357ms)"
TRACE[03-15|13:03:06] Dial task done                           task="wait for dial hist expire (660.45357ms)"
TRACE[03-15|13:03:06] New dial task                            task="wait for dial hist expire (702.908968ms)"
TRACE[03-15|13:03:07] Dial task done                           task="wait for dial hist expire (702.908968ms)"
TRACE[03-15|13:03:07] New dial task                            task="wait for dial hist expire (700.379857ms)"
TRACE[03-15|13:03:08] Dial task done                           task="wait for dial hist expire (700.379857ms)"
DEBUG[03-15|13:03:08] Recalculated downloader QoS values       rtt=20s confidence=0.812 ttl=1m0s
DEBUG[03-15|13:03:28] Recalculated downloader QoS values       rtt=20s confidence=0.906 ttl=1m0s
DEBUG[03-15|13:03:48] Recalculated downloader QoS values       rtt=20s confidence=0.953 ttl=1m0s
DEBUG[03-15|13:04:08] Recalculated downloader QoS values       rtt=20s confidence=0.977 ttl=1m0s
DEBUG[03-15|13:04:28] Recalculated downloader QoS values       rtt=20s confidence=0.988 ttl=1m0s
DEBUG[03-15|13:04:48] Recalculated downloader QoS values       rtt=20s confidence=0.994 ttl=1m0s
DEBUG[03-15|13:05:08] Recalculated downloader QoS values       rtt=20s confidence=0.997 ttl=1m0s
DEBUG[03-15|13:05:28] Recalculated downloader QoS values       rtt=20s confidence=0.999 ttl=1m0s
DEBUG[03-15|13:05:48] Recalculated downloader QoS values       rtt=20s confidence=0.999 ttl=1m0s
DEBUG[03-15|13:06:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:06:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:06:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:07:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:07:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:07:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:08:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:08:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:08:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:09:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:09:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:09:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:10:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:10:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:10:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:11:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:11:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:11:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:12:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:12:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:12:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:13:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:13:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:13:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:14:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:14:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:14:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:15:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:15:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:15:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:16:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:16:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:16:48] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
DEBUG[03-15|13:17:08] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s
TRACE[03-15|13:17:28] Refreshing port mapping                  proto=tcp extport=30300 intport=30300 interface=ExtIP(10.135.29.232)
DEBUG[03-15|13:17:28] Recalculated downloader QoS values       rtt=20s confidence=1.000 ttl=1m0s

Is something inherently wrong with this approach?
I really can't get any further. Any hint is highly appreciated since I really spent days trying to figure out a way to get this up and running.

Best Answer

The issue could be resolved by only defining two out of 5 authorized sealers in the genesis.json (using puppeth) and only starting them with --mine and the other 3 nodes without it. We also added some sleep timers between the starting of the different geth instances, maybe that also helped.

If you ever experience a similar issue that doesn't conclude any networking issues you best refer to EIP 225 on github.

Related Topic