[Ethereum] For pool mining, what exactly is a share

ethash mining mining-pools

Can someone please explain exactly what a share is in the context of pool mining? I have a superficial understanding of testing random nonces to find a hash under the current difficulty. I also understand that mining pools set a custom easier difficulty to target a relatively easy to attain share difficulty (~10 minutes).

What I don't understand is how those shares translate to finding real blocks. Say one out of a thousand shares is a valid real new block. Why wouldn't miners just submit that hash themselves, and send the rest of the easier shares to the pool?

I'm sure I have a fundamental misunderstanding, but can't figure out what it is. I guess the pool somehow has some secret that it can combine with the submitted shares to see if a share matches a block, but that is only a guess.

Any help would be greatly appreciated.

Best Answer

Lots of answers here, but none of them has actually answered the question "what is a share?"

In almost all mining pools, a share is a block "solution" not quite good enough to be published as an actual block, but still good enough that it's really hard to find them. This means that shares can be used to measure how much work you're doing, but just with much finer and more consistent granularity than actual block solutions, which are far too rare for small miners.

So, just as an example, suppose that the current network difficulty is 10,000. To become a valid block, an attempted block with a specific nonce has to be "better" than 10,000. In this situation, the pool might set its "share difficulty" at 100. With each nonce you try, your software checks how "good" the resulting block is. Most of these attempts will be below 100 in "difficulty level", but a small number of them will be over 100 (and still less than 10,000). These "better than 100 but still less than 10,000" blocks are the ones we're calling "shares". They can be sent to the mining pool, even though they aren't good enough to be published on the open network as actual blocks. Inside each share, the mining pool is clearly marked as the recipient of any potential block reward, which means the pool can use the number of shares you submit as unfakeable evidence of how much work your machine is doing to try to find blocks for the pool, even if you've never found one yet. Which is good, because it takes forever to find actual blocks.

As you mine along, happily submitting shares, every once in a blue moon you will come across a solution that is not only good enough to be a share, but actually good enough to be a real block! That is, it has difficulty "over 10,000", so it meets not just the share criteria but the full network standard of difficulty, which is much harder. This one you would still submit to the pool, but when the pool gets it, it will go ahead and publish it on the actual network, receiving a nice fat reward that gets distributed amongst everyone according to the shares they've been submitting. Of course, these aren't real numbers, and most software works by just telling your machine to look for blocks over 100 and not worry about what the network difficulty is. But still, we can see how someone who isn't doing actual mining would never be able to find any shares in the first place, which means this is actually reasonably secure from the pool's perspective as a way of measuring how much work everyone is doing.
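To make the 100 versus 10,000 example concrete, here is a minimal sketch of how mining software might sort each nonce into "nothing", "share" or "block". It assumes the usual convention that a hash counts as "better than difficulty d" when it is at most 2^256 / d, and it uses SHA-256 purely as a stand-in for Ethash; the function names are made up for illustration.

```python
import hashlib

NETWORK_DIFFICULTY = 10_000  # example figure from the answer, not a real value
SHARE_DIFFICULTY = 100       # example figure from the answer, not a real value


def target(difficulty: int) -> int:
    """Largest hash value that still counts as meeting `difficulty`."""
    return 2**256 // difficulty


def toy_hash(header: bytes, nonce: int) -> int:
    """Stand-in for Ethash (assumption): SHA-256 of header + nonce as an integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def classify(header: bytes, nonce: int) -> str:
    """Return 'block', 'share' or 'nothing' for a single attempt."""
    value = toy_hash(header, nonce)
    if value <= target(NETWORK_DIFFICULTY):
        return "block"   # also good enough to be a share; the pool publishes it
    if value <= target(SHARE_DIFFICULTY):
        return "share"   # proof of work for the pool, not publishable
    return "nothing"


def mine(header: bytes, attempts: int) -> dict:
    """Try nonces 0..attempts-1 and count how each one is classified."""
    counts = {"block": 0, "share": 0, "nothing": 0}
    for nonce in range(attempts):
        counts[classify(header, nonce)] += 1
    return counts
```

With these example figures, roughly one nonce in 100 gives a share and roughly one in 10,000 gives a block, so on average about one submitted share in 100 also turns out to be a real block.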

I've glossed over a lot of details here, because there are some subtle tricks the pool needs to be careful of (block withholding attacks anyone?), but that in a nutshell is what shares actually are: totally normal blocks that don't meet the full requirements to be published on the network, but still meet some smaller requirement set by the pool to count as proof you're mining with the pool set as the recipient.

Now, why can't the miner just submit any actual blocks themselves to take the whole reward? Two reasons: first, in order for their shares to be valid they have to have the pool set as the recipient, so the mined block already gives the reward to the pool no matter who broadcasts it, and second (as indicated by zanzu), the pool doesn't actually bother giving the whole block out to miners, just a template for the header that contains the hashes of the actual block contents.
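The first reason can be illustrated with a toy model. In the sketch below the reward recipient is part of the data being hashed, so a nonce found for the pool's address is tied to that address. SHA-256 again stands in for the real proof of work, and the header layout and addresses are invented for the example.

```python
import hashlib

SHARE_TARGET = 2**256 // 100  # toy share difficulty of 100 (assumption)


def header_hash(recipient: str, parent: str, nonce: int) -> int:
    """Toy header hash; the reward recipient is part of the hashed data."""
    data = f"{parent}|{recipient}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def find_share(recipient: str, parent: str) -> int:
    """Brute-force the first nonce whose hash meets the share target."""
    nonce = 0
    while header_hash(recipient, parent, nonce) > SHARE_TARGET:
        nonce += 1
    return nonce


def is_valid_share(recipient: str, parent: str, nonce: int) -> bool:
    """What a pool would check, using its own address as the recipient."""
    return header_hash(recipient, parent, nonce) <= SHARE_TARGET


nonce = find_share("pool-address", "parent-hash")
pool_view = header_hash("pool-address", "parent-hash", nonce)
miner_view = header_hash("miner-address", "parent-hash", nonce)
```

Swapping in `"miner-address"` gives a completely different hash for the same nonce, so the work would have to be redone from scratch. Conversely, work done with the miner's own address would not pass the pool's `is_valid_share` check, so it would earn no shares.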

What the miner could do is secretly throw away the valid block instead of sending it back to the pool. That would hurt the rest of the pool more than the miner because only a small portion of the reward from it would have actually come back to them, and for a bunch of complicated game theory reasons this could maybe result in an advantage if the same miner also had a lot of other mining power not on the pool at all. (This is the "block withholding attack" I mentioned above). But it starts to be noticeable in the statistics if you do it a lot, and also it doesn't provide any benefit to the typical small-time miner. So these attacks are presumed to be fairly rare. There are certain types of reward schemes that are more or less resistant to the strategy, but most people don't seem to be very concerned about these attacks in general. So all in all the basic "share" strategy is pretty much good enough.

TL;DR: shares are "failed blocks" that a pool uses as evidence of a small miner's participation.

Related Solutions

[Ethereum] Solo Vs Pool Mining With A GPU

short answer too, that will take the exact opposite stance as @nicolas-massart ;)

in the long run you'll always be better off mining solo, because you get uncle rewards and pay no fees

pool mining reduces your variance, period.

this reddit post is quite interesting, it's basically @vitalik-buterin asking as to why people mine in pools.

It's not true of all pools, but most of them don't pay you for uncles, which subtracts from your gains. Almost all pools also charge a fee, which subtracts from your gains too.
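The trade-off between fees and variance can be put into numbers with a rough model. The sketch below assumes blocks arrive as a Poisson process, the pool pays out proportionally to hashrate, and uncles are ignored entirely; all parameter names and figures are hypothetical.

```python
import math


def income_stats(my_share: float, pool_share: float, blocks: int,
                 reward: float, fee: float) -> dict:
    """Mean and standard deviation of income over `blocks` network blocks.

    my_share   -- miner's fraction of total network hashrate
    pool_share -- pool's fraction of network hashrate (miner included)
    fee        -- pool fee as a fraction, e.g. 0.01 for 1%
    """
    # Solo: the miner finds Poisson(my_share * blocks) blocks, full reward each.
    solo_mean = my_share * blocks * reward
    solo_std = math.sqrt(my_share * blocks) * reward

    # Pool: the pool finds Poisson(pool_share * blocks) blocks and the miner
    # gets a proportional slice of each one, minus the fee.
    per_block = reward * (my_share / pool_share) * (1 - fee)
    pool_mean = pool_share * blocks * per_block
    pool_std = math.sqrt(pool_share * blocks) * per_block

    return {"solo_mean": solo_mean, "solo_std": solo_std,
            "pool_mean": pool_mean, "pool_std": pool_std}


stats = income_stats(my_share=0.0001, pool_share=0.1,
                     blocks=100_000, reward=5.0, fee=0.01)
```

Under these assumptions the pooled miner's expected income is lower by exactly the fee, while the spread around that expectation is far smaller than for the solo miner.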

[Ethereum] Pool mining: How is DAG generation possible without knowing the block number

To recap, your mining pool will respond to the eth_getWork call with the following information:

DATA, 32 Bytes - current block header pow-hash
DATA, 32 Bytes - the seed hash used for the DAG.
DATA, 32 Bytes - the boundary condition ("target"), 2^256 / difficulty.

Of these three values, the seed hash is the one used to generate the DAG.
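As a small illustration of the third field, the sketch below converts between difficulty and the boundary and checks a candidate result against it. The hex values are made up for the example and the helper names are hypothetical; only the relation boundary = 2^256 / difficulty comes from the description above.

```python
def boundary_from_difficulty(difficulty: int) -> int:
    """The "target" a pool sends out: 2^256 / difficulty."""
    return 2**256 // difficulty


def difficulty_from_boundary(boundary_hex: str) -> int:
    """Recover the (share) difficulty implied by a boundary hex string."""
    return 2**256 // int(boundary_hex, 16)


def meets_boundary(result_hex: str, boundary_hex: str) -> bool:
    """A result is acceptable when it does not exceed the boundary."""
    return int(result_hex, 16) <= int(boundary_hex, 16)


# Hypothetical boundary equal to 2^224, i.e. a difficulty of 2^32.
example_boundary = "0x" + "0" * 7 + "1" + "0" * 56

good_result = "0x" + "0" * 8 + "f" * 56        # just below the boundary
bad_result = "0x" + "0" * 7 + "2" + "0" * 56   # twice the boundary
```

A pool can hand out a boundary that is much easier than the network's own, which is how the share difficulty reaches the miner.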

From here on I've referenced the Yellow Paper, rather than the Ethash wiki page. From the Yellow Paper, we can confirm the basis for your query:

J.3. Dataset generation. In order the generate the dataset we need the cache c, which is an array of bytes. It depends on the cache size csize and the seed hash s ∈ B32.

And:

J.2. Size of dataset and cache. The size for Ethash’s cache c ∈ B and dataset d ∈ B depend on the epoch, which in turn depends on the block number.

From section J.2. of the Yellow Paper you'll see that the size of the cache is constant for all mining performed during a given epoch, where an epoch is defined as 30,000 blocks (~100 hours). You therefore don't need to know the block number to calculate the cache size, only the current epoch number. The same holds true for the size of the DAG itself, which is also dependent on the current epoch number.
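To show that the epoch number alone is enough, here is a sketch of the size calculation. The constants and the "largest prime multiple below a linearly growing bound" rule are taken from the Ethash specification rather than from this post, so treat them as assumptions to check against the Yellow Paper; the function names are my own.

```python
EPOCH_LENGTH = 30_000          # blocks per epoch
HASH_BYTES = 64                # hash length in bytes
MIX_BYTES = 128                # width of the mix
CACHE_BYTES_INIT = 2**24       # cache size bound at epoch 0
CACHE_BYTES_GROWTH = 2**17     # cache growth per epoch
DATASET_BYTES_INIT = 2**30     # dataset size bound at epoch 0
DATASET_BYTES_GROWTH = 2**23   # dataset growth per epoch


def is_prime(n: int) -> bool:
    """Trial division; fast enough for the sizes involved here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def epoch_of_block(block_number: int) -> int:
    return block_number // EPOCH_LENGTH


def cache_size(epoch: int) -> int:
    """Cache size in bytes: largest prime multiple of HASH_BYTES below the bound."""
    size = CACHE_BYTES_INIT + CACHE_BYTES_GROWTH * epoch - HASH_BYTES
    while not is_prime(size // HASH_BYTES):
        size -= 2 * HASH_BYTES
    return size


def dataset_size(epoch: int) -> int:
    """DAG size in bytes: largest prime multiple of MIX_BYTES below the bound."""
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size
```

Every block number inside the same epoch maps to the same pair of sizes, which is why an approximate block number is good enough.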

So... How do we locally find the epoch number?

We're told the seed hash by eth_getWork, but we can't reverse the hash to get the block number from which it was created. So we use trial and error. Starting with a block we know to be in epoch 0 (because we know epoch 0 runs from block 0 to block 29,999), we locally create a seed hash and check whether it matches the seed hash we've been sent. If it doesn't, we move on to the next epoch and try again.

An example in the code is as follows.

A call to getWork is made in MinerAux.h, and the 3 variables, including the seed hash, are returned
EthashAux::full() is called, and the seed hash from getWork is passed in
The action jumps to EthashAux.cpp, where the main functions are defined
In EthashAux::full() we call EthashAux::computeFull(), again passing through the seed hash
Here we calculate the (approximate) block number using blockNumber = EthashAux::number(_seedHash)

The pertinent part of EthashAux::number() is the following line:

for (h256 h; h != _seedHash && epoch < 2048; ++epoch, h = sha3(h), get()->m_epochs[h] = epoch) {}

... which loops through epoch numbers, hashes to create a seed hash, and checks it against the hash sent by getWork.

The function then returns an approximate block number: return epoch * ETHASH_EPOCH_LENGTH;.

We then later use this value to calculate the cache size, which in turn can be used in generating the cache.
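The same trial-and-error search can be sketched in Python. One loud assumption: real Ethash chains the original Keccak-256, which is not the same function as `hashlib.sha3_256` (the padding differs), so the stand-in below will not reproduce real seed hashes. The hash function is a parameter so that a true Keccak-256 implementation can be swapped in. The function names are hypothetical; the zero starting value and the 2048 bound mirror the C++ loop.

```python
import hashlib
from typing import Callable, Optional

EPOCH_LENGTH = 30_000
MAX_EPOCHS = 2048  # same search bound as the C++ loop

HashFn = Callable[[bytes], bytes]


def stand_in_hash(data: bytes) -> bytes:
    """Placeholder (assumption): NOT the Keccak-256 that Ethash really uses."""
    return hashlib.sha3_256(data).digest()


def seed_for_epoch(epoch: int, hash_fn: HashFn = stand_in_hash) -> bytes:
    """Seed for epoch 0 is 32 zero bytes; each later seed hashes the previous one."""
    seed = bytes(32)
    for _ in range(epoch):
        seed = hash_fn(seed)
    return seed


def epoch_for_seed(seed_hash: bytes, hash_fn: HashFn = stand_in_hash) -> Optional[int]:
    """Walk the seed chain until it matches; None if not found within the bound."""
    candidate = bytes(32)
    for epoch in range(MAX_EPOCHS):
        if candidate == seed_hash:
            return epoch
        candidate = hash_fn(candidate)
    return None


def approx_block_number(seed_hash: bytes, hash_fn: HashFn = stand_in_hash) -> Optional[int]:
    """First block of the matching epoch, like epoch * ETHASH_EPOCH_LENGTH."""
    epoch = epoch_for_seed(seed_hash, hash_fn)
    return None if epoch is None else epoch * EPOCH_LENGTH
```

The C++ version also caches each seed it computes (the `m_epochs[h] = epoch` part), so repeated lookups are cheap; that memoisation is left out here to keep the sketch short.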

Edit - Addendum:

Parity - the client written by Ethcore - does include the block number in its implementation of eth_getWork.
