I will try to answer your question. I worked on bloom filters in cpp-ethereum and Parity.
Will retrieving event logs become prohibitively slow as the blockchain becomes larger?
Not necessarily. It depends on the implementation, the log density (average number of logs per block), and the number of cache levels.
More specifically, what is the time complexity of eth_getLogs?
In the worst case, where every block contains a log matching your query, it is O(n). But that is rarely the case. Bloom filters let a node skip blocks that cannot match, at the cost of occasional false positives, so the more specific your filter is (the more topics it has), the faster you will get your results.
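To make the skipping concrete, here is a minimal sketch of a per-block bloom check. It is not the code used by geth or Parity. Ethereum's block bloom is 2048 bits wide with three bits set per address or topic, derived from Keccak-256; since the Python standard library has no Keccak-256, this sketch substitutes SHA-256, so the bit positions will not match a real chain. The function names are hypothetical.

```python
import hashlib

BLOOM_BITS = 2048  # width of Ethereum's per-block log bloom


def bit_positions(item: bytes):
    # Assumption: SHA-256 stands in for Keccak-256 (not in the stdlib).
    digest = hashlib.sha256(item).digest()
    # Three 16-bit slices of the digest, each reduced to a bit index 0..2047.
    return [int.from_bytes(digest[i:i + 2], "big") % BLOOM_BITS for i in (0, 2, 4)]


def add(bloom: int, item: bytes) -> int:
    # Set the three bits belonging to an address or topic.
    for pos in bit_positions(item):
        bloom |= 1 << pos
    return bloom


def might_contain(bloom: int, items) -> bool:
    # False: the block definitely has no matching log and can be skipped.
    # True: the block may match (possibly a false positive) and must be read.
    return all((bloom >> pos) & 1 for item in items for pos in bit_positions(item))


def candidate_blocks(block_blooms, items):
    # Only these block numbers need their receipts loaded and checked.
    return [n for n, bloom in enumerate(block_blooms) if might_contain(bloom, items)]
```

Every extra address or topic in the query adds three more bits that all have to be set, so fewer blocks survive the check and fewer receipts have to be read from disk.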
Is eth_getLogs even the correct RPC request for what I'm trying to do?
Yes
To summarise, I believe that the 10s response time is caused by a sub-optimal implementation of bloom filters in go-ethereum. Here are the results of benchmarks with Parity:
Find all logs from block 0 to 986082 with address: 0x33990122638b9132ca29c723bdf037f1a891a70c (should return 1602 logs).
time curl -X POST --data '{"id":8,"jsonrpc":"2.0","method":"eth_getLogs","params":[{"fromBlock":"0x0","toBlock":"0xf0be2", "address": "0x33990122638b9132ca29c723bdf037f1a891a70c"}]}' -H "Content-Type: application/json" http://127.0.0.1:3030 >> /dev/null
geth first request:
real 0m17.003s
geth second request (I assumed that results would be cached after the first one):
real 0m18.023s
parity first request (~22x faster than geth)
real 0m0.770s
parity second request (~27x faster than geth)
real 0m0.668s
The gap between Parity and geth closes, and in fact reverses, when there are no logs to be found:
Find all logs from block 0 to 986082 with address: 0x33990122638b9132ca29c723bdf037f1a891a70d (address does not exist, 0 logs returned).
time curl -X POST --data '{"id":8,"jsonrpc":"2.0","method":"eth_getLogs","params":[{"fromBlock":"0x0","toBlock":"0xf0be2", "address": "0x33990122638b9132ca29c723bdf037f1a891a70d"}]}' -H "Content-Type: application/json" http://127.0.0.1:3030 >> /dev/null
geth first request:
real 0m0.022s
geth second request
real 0m0.021s
parity first request (4x slower than geth)
real 0m0.080s
parity second request (1.5x slower than geth)
real 0m0.030s
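If you want to repeat these measurements from a script rather than with curl, something along these lines should work. It is a sketch using only the Python standard library; the helper names are made up for illustration, and the URL is the local endpoint used in the curl commands above.

```python
import json
import time
import urllib.request


def build_get_logs_request(address, from_block, to_block, request_id=8):
    # Same JSON-RPC body as the curl commands; block numbers are hex-encoded.
    return {
        "id": request_id,
        "jsonrpc": "2.0",
        "method": "eth_getLogs",
        "params": [{
            "fromBlock": hex(from_block),
            "toBlock": hex(to_block),
            "address": address,
        }],
    }


def timed_get_logs(url, payload):
    # Sends the request and returns (logs, elapsed seconds).
    body = json.dumps(payload).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply.get("result", []), time.perf_counter() - start


if __name__ == "__main__":
    payload = build_get_logs_request(
        "0x33990122638b9132ca29c723bdf037f1a891a70c", 0, 986082
    )
    logs, seconds = timed_get_logs("http://127.0.0.1:3030", payload)
    print(len(logs), "logs in", round(seconds, 3), "s")
```

Running it twice in a row against the same node gives the "first request" and "second request" timings.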
Best Answer
Emitting events makes use of log storage, which, as you've noted, is a fourth form of contract information that is much cheaper than the other three kinds accessible from Solidity (memory, storage, stack). EVM nodes are not required to keep logs forever and can garbage collect old logs to save space. Dapps listening for these logs cannot rely on them being persisted forever (e.g. really old events), but can probably listen to new events as a means of updating on changes. My favorite article about EVM events / log storage is here: https://blog.qtum.org/how-solidity-events-are-implemented-diving-into-the-ethereum-vm-part-6-30e07b3037b9
Although I don't know the exact rationale of the EVM designers, I would guess they wanted to provide a cheap, but not free, way of storing information from contracts and publishing notification information for outside listeners (not on the blockchain). Free storage would be open to abuse / denial-of-service attacks.
In particular, from the article we can compare the cost of storing in logs versus storing in storage (the choice of names is mildly confusing and regrettable, but what can you do). That would address your question of LOG versus SSTORE opcodes.
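As a rough back-of-the-envelope comparison, the sketch below uses the gas constants from the Yellow Paper fee schedule: 375 gas per LOG operation, 375 per topic, 8 per byte of log data, and 20000 for an SSTORE that sets a zero slot to a non-zero value. It ignores memory expansion and the base transaction cost, and later hard forks adjusted storage access pricing, so treat the numbers as an illustration rather than an exact quote for any given network. The function names are hypothetical.

```python
# Gas constants from the Yellow Paper fee schedule.
G_LOG = 375        # base cost of a LOG operation
G_LOG_TOPIC = 375  # cost per topic
G_LOG_DATA = 8     # cost per byte of log data
G_SSET = 20000     # SSTORE changing a slot from zero to non-zero


def log_gas(num_topics: int, data_bytes: int) -> int:
    # Cost of emitting one log entry (memory expansion not included).
    return G_LOG + G_LOG_TOPIC * num_topics + G_LOG_DATA * data_bytes


def sstore_gas(num_words: int) -> int:
    # Cost of writing fresh 32-byte words into contract storage.
    return G_SSET * num_words


if __name__ == "__main__":
    # One 32-byte value: a log with a single topic versus one storage slot.
    print("LOG1  :", log_gas(num_topics=1, data_bytes=32))
    print("SSTORE:", sstore_gas(num_words=1))
```

Under these assumptions, logging a 32-byte value with one topic costs roughly a twentieth of writing it to a fresh storage slot, which is the "cheap, but not free" trade-off described above.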