[Ethereum] What aspects affect a GPU's mining speed?

gpumining

I recently got my Asus Radeon RX 580 8GB and decided to test it out on mining. Its default core clock is 1380 MHz and its memory clock is 2000 MHz.

My first test with stock settings (no OC, no modded BIOS) gave me around 23 MH/s.
A second try with a modded BIOS, still without OC, gave me merely 26 MH/s, mostly hovering around 25.9 MH/s.

After some research, I found other people's Afterburner settings that supposedly produce hash rates of around 30 MH/s. I tried them, and they only lowered my hash rate a bit rather than increasing it. For example, one user claimed 29-30 MH/s, but with his settings applied I get nowhere near 30. Another user's settings didn't get me his hash rate either.

So I decided to tweak Afterburner myself, and it seems that a core clock of 1250 MHz and a memory clock of 2100 MHz get me up to just 27 MH/s. Going higher with those two settings doesn't change much, and sometimes the hash rate even drops.

So here I am, wondering which settings affect mining performance. Core clock and memory clock? And how do they affect it? Because it's clearly not a case of "the higher, the better"!

Another thing I don't understand: with the same Afterburner settings applied, some users claim they get up to 30 MH/s, while I only reach around 25 MH/s. What's the catch?

Best Answer

Not sure if this question is off-topic, but broadly speaking, there are a few things that affect a GPU's performance on a given algorithm (for mining Ethereum or otherwise), assuming adequate power is available:

  • compute performance
  • instruction pipeline performance
  • VRAM performance
  • chance
  • temperature
  • voltage

The faster the GPU can perform computations, the faster you get an answer to a calculation. For example, a pocket calculator is much faster than you are at arithmetic: its compute performance exceeds your brain's. The limiting factor of a pocket calculator is your ability to enter instructions, i.e., how quickly you can give it new work. That's instruction pipeline performance. Fetching instructions from the CPU/RAM is slow, so a GPU has its own cache to store instructions, but only so much fits in there, so how much time the GPU spends waiting for new instructions depends on the algorithm.

Also, while performing computations, the GPU will likely need to store and access data. The speed at which the video card's RAM (VRAM) can supply that data affects computations, especially those that require frequent memory access. Changing the VRAM's clock may also change its latency in preprogrammed steps (the memory timing "straps"), so speeding up the throughput of your RAM can actually decrease performance because latency increases. Different GPUs have different memory speeds, latencies, and bus widths.
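This matters a lot for Ethereum specifically, because Ethash is deliberately memory-hard: each hash performs roughly 64 random reads of 128 bytes from the DAG, so VRAM bandwidth puts a hard ceiling on the hash rate. Here's a rough back-of-envelope sketch of that ceiling in Python; the 256-bit bus and quad-pumped GDDR5 data rate match the RX 580's published specs, but treat the whole thing as an estimate, not a measurement:

```python
# Rough back-of-envelope: Ethash is memory-bound, so VRAM bandwidth
# caps the attainable hash rate. The constants are assumptions about
# an RX 580 8GB, not measurements.

BYTES_PER_HASH = 64 * 128        # Ethash: ~64 DAG accesses of 128 bytes each
BUS_WIDTH_BITS = 256             # RX 580 memory bus width

def max_hashrate_mhs(mem_clock_mhz):
    """Theoretical ceiling implied by memory bandwidth alone."""
    # GDDR5 transfers 4 bits per pin per clock (quad-pumped effective rate).
    bandwidth_bytes_per_s = mem_clock_mhz * 1e6 * 4 * (BUS_WIDTH_BITS / 8)
    return bandwidth_bytes_per_s / BYTES_PER_HASH / 1e6

for clock in (2000, 2100):
    print(f"{clock} MHz -> {max_hashrate_mhs(clock):.1f} MH/s ceiling")
# 2000 MHz -> 31.2 MH/s ceiling
# 2100 MHz -> 32.8 MH/s ceiling
```

Note how close the 2000 MHz ceiling (~31 MH/s) sits to the 29-30 MH/s that well-tuned RX 580s report: the card is almost purely bandwidth-bound, which is why memory clock and memory timings move the needle far more than core clock.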

Depending on the algorithm and the hardware architecture, the instruction pipeline may have to be flushed frequently, leaving the GPU briefly idle on the main computation. How much this costs depends on the data being operated on (all else being equal). Sometimes, that's just pure chance.

If the GPU gets too hot, it will slow itself down to prevent overheating. Intermittent spikes in power use during heavy computation may also cause the involuntary clocking down of one or more components, and it takes a non-zero amount of time to return to the default speed.
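Here's a minimal sketch of that throttling behavior (the thresholds and step sizes are invented for illustration, not actual RX 580 firmware values): the card sheds clock speed immediately when it crosses a temperature limit but ramps back up gradually, so a brief heat spike costs more hashing time than its own duration:

```python
# Toy throttle controller: drop the clock hard when hot, recover slowly.
# All numbers are assumptions for illustration.
def next_clock(clock_mhz, temp_c, target=1380, temp_limit=84,
               step_down=50, step_up=10):
    if temp_c >= temp_limit:
        return clock_mhz - step_down           # throttle immediately
    return min(target, clock_mhz + step_up)    # recover gradually

clock = 1380
for temp in (80, 86, 86, 82, 80, 78):
    clock = next_clock(clock, temp)
    print(temp, "C ->", clock, "MHz")
# Two hot samples cost 100 MHz; four cool samples later the card is
# still 70 MHz below target.
```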

If the voltage to the GPU's components is too low, you may end up with memory corruption and/or incorrectly executed instructions. This can show up as seemingly random errors. In the context of mining, a small number of errors manifests as wasted (rejected) shares, or worse.
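The arithmetic of rejected shares is worth spelling out, because an aggressive overclock can report a higher hash rate while actually paying less (the numbers below are made up for illustration):

```python
# Hypothetical comparison: a "faster" but unstable overclock can pay
# less than a slower, stable one once the pool rejects bad shares.
stable = 27.0 * (1 - 0.00)   # 27.0 MH/s effectively credited
pushed = 30.0 * (1 - 0.12)   # 26.4 MH/s after an assumed 12% reject rate
print(stable, pushed)
```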

There are interaction effects between several of these, and it's difficult to say which particular bottleneck you're hitting. E.g., changing the ratio of GPU core speed to VRAM speed (even excluding the latency changes from the memory straps) may leave the GPU and VRAM operating "out of sync", so to speak, causing inefficient use of resources, and the computations being executed affect how badly. This is why, for example, a program tuned for the FPU of one CPU model might behave relatively poorly on a machine with a faster, but different, CPU model.
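To make your "it's not the higher the better" observation concrete, here's a toy min-of-bottlenecks model. The per-MHz scaling constants are made-up fitting values chosen to roughly resemble your numbers, not measurements of an RX 580:

```python
# Toy model: the hash rate is capped by whichever is slower, the core
# or the memory. Both constants below are assumptions for illustration.
def modeled_mhs(core_mhz, mem_mhz,
                mhs_per_core_mhz=0.023,   # assumed core-side scaling
                mhs_per_mem_mhz=0.013):   # assumed memory-side scaling
    return min(core_mhz * mhs_per_core_mhz, mem_mhz * mhs_per_mem_mhz)

# Past the crossover point, extra core clock buys nothing:
for core in (1150, 1250, 1380):
    print(core, "MHz core ->", round(modeled_mhs(core, 2100), 1), "MH/s")
# 1150 MHz core -> 26.4 MH/s
# 1250 MHz core -> 27.3 MH/s  (memory-limited from here on)
# 1380 MHz core -> 27.3 MH/s
```

Once you hit the memory-limited ceiling, extra core clock only adds heat and power draw, which can actually lower the hash rate via throttling; that's consistent with the plateau you saw around 1250 MHz.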

Finally, the values attainable for each of these hardware aspects vary from unit to unit (the so-called silicon lottery), which is a big part of why copying someone else's Afterburner settings doesn't reproduce their hash rate on your card.