If you want to do analytics, the trick is scraping the whole blockchain into an indexed database. My advice is to get a SQL database and write a program that queries a node for its blocks, one by one; from each block you get the transactions and the transaction receipts, which you can query again for more data. It is up to you which fields interest you the most and which of them you want to index.
Here is the structure of a block
From https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_getblockbyhash
// Request
curl -X POST --data '{"jsonrpc":"2.0","method":"eth_getBlockByHash","params":["0xe670ec64341771606e55d6b4ca35a1a6b75ee3d5145a99d05921026d1527331", true],"id":1}'
// Result
{
"id":1,
"jsonrpc":"2.0",
"result": {
"number": "0x1b4", // 436
"hash": "0xe670ec64341771606e55d6b4ca35a1a6b75ee3d5145a99d05921026d1527331",
"parentHash": "0x9646252be9520f6e71339a8df9c55e4d7619deeb018d2a3f2d21fc165dde5eb5",
"nonce": "0xe04d296d2460cfb8472af2c5fd05b5a214109c25688d3704aed5484f9a7792f2",
"sha3Uncles": "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
"logsBloom": "0xe670ec64341771606e55d6b4ca35a1a6b75ee3d5145a99d05921026d1527331",
"transactionsRoot": "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
"stateRoot": "0xd5855eb08b3387c0af375e9cdb6acfc05eb8f519e419b874b6ff2ffda7ed1dff",
"miner": "0x4e65fda2159562a496f9f3522f89122a3088497a",
"difficulty": "0x027f07", // 163591
"totalDifficulty": "0x027f07", // 163591
"extraData": "0x0000000000000000000000000000000000000000000000000000000000000000",
"size": "0x027f07", // 163591
"gasLimit": "0x9f759", // 653145
"gasUsed": "0x9f759", // 653145
"timestamp": "0x54e34e8e", // 1424182926
"transactions": [{...},{ ... }],
"uncles": ["0x1606e5...", "0xd5145a9..."]
}
}
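The numeric fields in the result are hex-encoded quantities, and the decimal values shown in the comments can be reproduced with a small helper. This is a minimal Python sketch: hex_to_int is a hypothetical helper name, and treating a bare "0x" as zero is an assumption made for illustration.

```python
def hex_to_int(quantity: str) -> int:
    """Decode a JSON-RPC hex quantity such as "0x1b4" into an int.

    Assumption: an empty quantity ("0x") is treated as 0.
    """
    digits = quantity[2:] if quantity.startswith("0x") else quantity
    return int(digits, 16) if digits else 0


# Numeric fields copied from the block result above.
block_fields = {
    "number": "0x1b4",
    "difficulty": "0x027f07",
    "gasLimit": "0x9f759",
    "timestamp": "0x54e34e8e",
}

decoded = {name: hex_to_int(value) for name, value in block_fields.items()}
print(decoded)
```

Decoding before inserting into the database lets you index and range-query these columns as integers rather than strings.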
The block lists its transactions (full objects when the second parameter is true, hashes only when it is false). For each transaction hash you can then run two queries:
https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_gettransactionbyhash
// Request
curl -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionByHash","params":["0xb903239f8543d04b5dc1ba6579132b143087c68db1b2168786408fcbce568238"],"id":1}'
// Result
{
"id":1,
"jsonrpc":"2.0",
"result": {
"hash":"0xc6ef2fc5426d6ad6fd9e2a26abeab0aa2411b7ab17f30a99d3cb96aed1d1055b",
"nonce":"0x",
"blockHash": "0xbeab0aa2411b7ab17f30a99d3cb9c6ef2fc5426d6ad6fd9e2a26a6aed1d1055b",
"blockNumber": "0x15df", // 5599
"transactionIndex": "0x1", // 1
"from":"0x407d73d8a49eeb85d32cf465507dd71d507100c1",
"to":"0x85h43d8a49eeb85d32cf465507dd71d507100c1",
"value":"0x7f110", // 520464
"gas": "0x7f110", // 520464
"gasPrice":"0x09184e72a000",
"input":"0x603880600c6000396000f300603880600c6000396000f3603880600c6000396000f360"
}
}
https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_gettransactionreceipt
// Request
curl -X POST --data '{"jsonrpc":"2.0","method":"eth_getTransactionReceipt","params":["0xb903239f8543d04b5dc1ba6579132b143087c68db1b2168786408fcbce568238"],"id":1}'
// Result
{
"id":1,
"jsonrpc":"2.0",
"result": {
transactionHash: '0xb903239f8543d04b5dc1ba6579132b143087c68db1b2168786408fcbce568238',
transactionIndex: '0x1', // 1
blockNumber: '0xb', // 11
blockHash: '0xc6ef2fc5426d6ad6fd9e2a26abeab0aa2411b7ab17f30a99d3cb96aed1d1055b',
cumulativeGasUsed: '0x33bc', // 13244
gasUsed: '0x4dc', // 1244
contractAddress: '0xb60e8dd61c5d32be8058bb8eb970870f07233155', // or null, if none was created
logs: [{
// logs as returned by getFilterLogs, etc.
}, ...]
}
}
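Putting the three queries together, the crawl could be sketched roughly as follows in Python. Everything here is illustrative rather than a reference implementation: the table layout, the function names (http_rpc, index_blocks, to_int) and the node URL are assumptions, SQLite stands in for whichever SQL database you pick, and the loop walks blocks with eth_getBlockByNumber (a sibling of the eth_getBlockByHash call shown above) so that it can go through block numbers in order.

```python
import json
import sqlite3
import urllib.request


def http_rpc(url):
    """Return a function that sends one JSON-RPC call to the node at `url`."""
    def call(method, params):
        body = json.dumps(
            {"jsonrpc": "2.0", "method": method, "params": params, "id": 1}
        ).encode()
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)["result"]
    return call


def to_int(quantity):
    """Decode a hex quantity; a bare "0x" is treated as 0 (assumption)."""
    digits = quantity[2:]
    return int(digits, 16) if digits else 0


# Hypothetical schema: keep only the fields you care about, and index
# the columns you expect to filter on.
SCHEMA = """
CREATE TABLE IF NOT EXISTS blocks (
    number INTEGER PRIMARY KEY, hash TEXT, miner TEXT, timestamp INTEGER);
CREATE TABLE IF NOT EXISTS transactions (
    hash TEXT PRIMARY KEY, block_number INTEGER, sender TEXT,
    recipient TEXT, value TEXT, gas_used INTEGER);
CREATE INDEX IF NOT EXISTS idx_tx_sender ON transactions (sender);
CREATE INDEX IF NOT EXISTS idx_tx_recipient ON transactions (recipient);
"""


def index_blocks(rpc, db, start, end):
    """Copy blocks start..end (inclusive) and their transactions into db."""
    db.executescript(SCHEMA)
    for number in range(start, end + 1):
        # True asks the node for full transaction objects, not just hashes.
        block = rpc("eth_getBlockByNumber", [hex(number), True])
        if block is None:  # the node does not have this block yet
            break
        db.execute(
            "INSERT OR REPLACE INTO blocks VALUES (?, ?, ?, ?)",
            (to_int(block["number"]), block["hash"], block["miner"],
             to_int(block["timestamp"])),
        )
        for tx in block["transactions"]:
            receipt = rpc("eth_getTransactionReceipt", [tx["hash"]])
            db.execute(
                "INSERT OR REPLACE INTO transactions VALUES (?, ?, ?, ?, ?, ?)",
                (tx["hash"], to_int(tx["blockNumber"]), tx["from"],
                 tx.get("to"),  # None for contract creation
                 str(to_int(tx["value"])),  # wei can exceed 64 bits, keep as text
                 to_int(receipt["gasUsed"])),
            )
    db.commit()
```

Against a local node this might be called as index_blocks(http_rpc("http://localhost:8545"), sqlite3.connect("chain.db"), 0, 1000), where both the URL and the file name are placeholders. Passing the RPC function in as a parameter also makes it easy to test the indexer against canned responses.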
Possibly someone familiar with it or able to study it more will chime in with a more detailed explanation.
They are mathematically adjusting account balances on the fly, as needed. For example, line 56 kicks off a computation instead of just returning a number lookup:
return tokenFromReflection(_rOwned[account]);
As one follows the chain of internal function calls, supply is also adjusted. This is following a style Nick Johnson called "Amortizing Work". Given the rules of the system, it's not important to do batch-like work if the result of such work can be computed when it's needed.
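To make the idea concrete, here is a toy Python model of the pattern. It is a simplification under stated assumptions, not the contract's code: the class and method names are hypothetical, the scale factor of 1000 is arbitrary, and exact fractions replace the contract's integer arithmetic. Each holder stores a share of a "reflected" supply; a distribution only shrinks that supply, and balances are recomputed when they are read.

```python
from fractions import Fraction


class ReflectiveLedger:
    """Toy ledger where balances are derived on read instead of stored."""

    def __init__(self, holders):
        self.t_total = sum(holders.values())            # visible token supply
        self.r_total = Fraction(self.t_total * 1000)    # reflected supply (arbitrary scale)
        rate = self.rate()
        self.r_owned = {acct: bal * rate for acct, bal in holders.items()}

    def rate(self):
        """How many reflected units one visible token is currently worth."""
        return self.r_total / self.t_total

    def balance_of(self, account):
        """Computed on demand, like tokenFromReflection(_rOwned[account])."""
        return self.r_owned[account] / self.rate()

    def distribute(self, payer, amount):
        """Payer gives up `amount` tokens; every holder gains pro rata.

        No loop over holders: only the payer's entry and the reflected
        supply change, which shifts the rate for everyone at once.
        """
        r_amount = amount * self.rate()
        self.r_owned[payer] -= r_amount
        self.r_total -= r_amount


ledger = ReflectiveLedger({"alice": 100, "bob": 300})
ledger.distribute("alice", 40)
print(ledger.balance_of("alice"), ledger.balance_of("bob"))
```

Bob never sent or received a transaction, yet his balance grows because the rate changed; the total across holders stays at 400.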
This is from 2017, so the syntax is outdated but the description of the pattern will help you understand this contract: https://weka.medium.com/dividend-bearing-tokens-on-ethereum-42d01c710657
As a PSA, it would be interesting to investigate the for loop near line 239. "Unbounded" for loops are an anti-pattern, so I'm curious about the maximum length of the array, hopefully short.
Hope it helps.
Frontrunning and the related miner extractable value (MEV) problems are well established and well known at this point. The difference between the two modes is that in MEV it is the miners who frontrun you, whereas ordinary frontrunning bots watch the same mempool of pending transactions as any other Ethereum client.
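As a rough illustration of the ordinary (non-miner) case, here is a toy Python model. It assumes a miner that orders pending transactions purely by gas price, which is a simplification; the names PendingTx and order_for_block are hypothetical, and a miner extracting MEV could order transactions however it likes.

```python
from dataclasses import dataclass


@dataclass
class PendingTx:
    sender: str
    action: str
    gas_price: int  # arbitrary units, for illustration only


def order_for_block(mempool):
    """Assumed miner policy: highest gas price first, ties keep arrival order."""
    return sorted(mempool, key=lambda tx: -tx.gas_price)


# Everyone, including the bot, can see the same pending transactions.
mempool = [
    PendingTx("victim", "buy TOKEN", 50),
    PendingTx("other", "transfer", 40),
]

# The bot copies the victim's trade and outbids it by the smallest margin.
victim = mempool[0]
mempool.append(PendingTx("bot", victim.action, victim.gas_price + 1))

ordered = order_for_block(mempool)
print([tx.sender for tx in ordered])
```

Under this ordering the bot's copy of the trade lands ahead of the victim's, even though it was submitted later.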
Here are some articles:
Coindesk on frontrunning
An actual tutorial implementation of a frontrunning bot
TheBlock research on miner extractable value
Escaping the Dark Forest as kindly pointed out by Richard Horrocks
Flash Boys 2.0 as kindly pointed out by Nulik