The best way is simply not to clear them. In many (most?) situations, the array varies in size over time, and emptied slots will eventually be filled again. Instead of shortening the array, keep a separate count of live elements:
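    uint numElements = 0;
    uint[] array;

    function insert(uint value) {
        // Grow storage only when every allocated slot is already in use.
        if (numElements == array.length) {
            array.length += 1;
        }
        array[numElements++] = value;
    }

    function clear() {
        // Logical clear: old values stay in storage and get overwritten later.
        numElements = 0;
    }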
This means that deleted elements remain in the array, but are ignored, since you can use numElements in place of array.length everywhere.
This also conserves gas on the whole: deleting and reinserting an item into an array costs 5k (deletion) - 15k (refund) + 20k (reinsertion) = 10k net gas, plus another 10k to update the array length twice. Overwriting, in contrast, costs only 5k gas, plus the same 10k to update the numElements variable twice.
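As a rough sketch of what "use numElements in place of array.length everywhere" means for readers of the array, the snippet above can be wrapped in a contract like the one below. The contract name ReusableArray and the functions count, get and sum are hypothetical additions for illustration, and the code assumes a 0.4.x compiler (array.length is no longer assignable from 0.6.0 on).

```solidity
pragma solidity ^0.4.12;

// Hypothetical wrapper around the snippet above; names are illustrative only.
contract ReusableArray {
    uint numElements = 0;
    uint[] array;

    function insert(uint value) public {
        // Allocate a new slot only when all existing slots are live.
        if (numElements == array.length) {
            array.length += 1;
        }
        array[numElements++] = value;
    }

    function clear() public {
        // Stale values stay in storage but become unreachable through get/sum.
        numElements = 0;
    }

    // Replaces array.length for callers.
    function count() public returns (uint) {
        return numElements;
    }

    // Bounds-check against the live count, not the allocated length.
    function get(uint index) public returns (uint) {
        require(index < numElements);
        return array[index];
    }

    // Iterate over live elements only.
    function sum() public returns (uint total) {
        for (uint i = 0; i < numElements; i++) {
            total += array[i];
        }
    }
}
```

The important detail is that every read path checks numElements, so stale values left behind by clear() can never be observed.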
In cases where the array is erased only once and won't grow again, a more sensible alternative may be to spawn a new contract for the lifetime of the event and have it kill itself (selfdestruct) at the end, which discards all of its live storage at once and refunds gas to the caller.
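The one-shot contract idea might look roughly like this. The contract name EventData, the owner field and the close function are hypothetical, and the sketch assumes a 0.4.x compiler.

```solidity
pragma solidity ^0.4.12;

// Hypothetical per-event storage contract; all names are illustrative.
contract EventData {
    address owner;
    uint[] entries;

    // 0.4.x-style constructor: the deployer becomes the owner.
    function EventData() public {
        owner = msg.sender;
    }

    function insert(uint value) public {
        require(msg.sender == owner);
        entries.push(value);
    }

    // Called once when the event is over: removes the contract and its
    // storage in one operation instead of clearing entries slot by slot.
    function close() public {
        require(msg.sender == owner);
        selfdestruct(owner);
    }
}
```

The parent contract (or an external account) would deploy one EventData per event and call close() when the event ends.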
A simple example demonstrating the gas difference between public and external functions looks like this:
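    pragma solidity ^0.4.12;

    contract Test {
        // public: the array argument is copied from calldata into memory
        function test(uint[20] a) public returns (uint) {
            return a[10] * 2;
        }

        // external: the array argument is read straight from calldata
        function test2(uint[20] a) external returns (uint) {
            return a[10] * 2;
        }
    }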
Calling each function, we can see that the public function uses 496 gas, while the external function uses only 261.
The difference is because in public functions, Solidity immediately copies array arguments to memory, while external functions can read directly from calldata. Memory allocation is expensive, whereas reading from calldata is cheap.
The reason that public functions need to write all of the arguments to memory is that public functions may be called internally, which is actually an entirely different process than external calls. Internal calls are executed via jumps in the code, and array arguments are passed internally by pointers to memory. Thus, when the compiler generates the code for an internal function, that function expects its arguments to be located in memory.
For external functions, the compiler doesn't need to allow internal calls, and so it allows arguments to be read directly from calldata, saving the copying step.
As for best practices, you should use external if you expect that the function will only ever be called externally, and use public if you need to call the function internally. It almost never makes sense to use the this.f() pattern, as this requires a real CALL to be executed, which is expensive. Also, passing arrays via this method would be far more expensive than passing them internally.
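To make the two call styles concrete, here is a small sketch assuming a 0.4.x compiler. The contract name CallStyles and the function names twice, viaJump and viaCall are hypothetical.

```solidity
pragma solidity ^0.4.12;

// Hypothetical contract contrasting an internal call with the this.f() pattern.
contract CallStyles {
    function twice(uint[20] a) public returns (uint) {
        return a[10] * 2;
    }

    // Internal call: compiled to a jump, `a` is passed as a pointer to memory.
    function viaJump(uint[20] a) public returns (uint) {
        return twice(a);
    }

    // External call to itself: `a` is re-encoded as calldata and a real CALL
    // is executed, which costs noticeably more gas.
    function viaCall(uint[20] a) public returns (uint) {
        return this.twice(a);
    }
}
```

Both wrappers return the same value; only the call mechanism, and therefore the gas cost, differs.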
You will essentially see performance benefits with external any time you are only calling a function externally and passing in a lot of calldata (e.g., large arrays).
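To differentiate the four visibility levels:
public - can be called by anyone, both externally and internally
external - cannot be called internally, only externally (from inside the contract, only via this.f())
internal - can be accessed only from this contract and contracts deriving from it
private - can be accessed only from this contract, not from derived contracts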
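The list above can be illustrated with the sketch below, assuming a 0.4.x compiler. The contract and function names (Base, Derived, pub, ext, inte, priv) are hypothetical; calls that would not compile are mentioned only in comments.

```solidity
pragma solidity ^0.4.12;

// Hypothetical contracts showing which calls each visibility level allows.
contract Base {
    function pub() public returns (uint) { return 1; }
    function ext() external returns (uint) { return 2; }
    function inte() internal returns (uint) { return 3; }
    function priv() private returns (uint) { return 4; }

    function fromInside() public returns (uint) {
        // A bare ext() would not compile here; this.ext() would, at the
        // cost of a real CALL.
        return pub() + inte() + priv();
    }
}

contract Derived is Base {
    function fromDerived() public returns (uint) {
        // priv() is not visible in a derived contract; inte() is.
        return pub() + inte();
    }
}
```

From outside, only pub, ext, fromInside and fromDerived are part of the contract's callable interface.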
For Q1: Assuming your contract is too large because it holds complex data structures, and therefore complex business logic to manipulate them, my approach has been to review the data model and consider turning some of these data structures into contracts themselves. For instance, suppose your main contract holds a mapping of structs that each contain an array. If it makes sense functionally, consider converting the struct into a contract and moving some of the business logic into it. In other words, try to define your object model by creating separate contracts for elements that have independent or different life cycles.
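A minimal sketch of that refactoring, assuming a 0.4.x compiler, could look like the following. The names Order and Registry and their functions are hypothetical and stand in for whatever struct and main contract you actually have.

```solidity
pragma solidity ^0.4.12;

// Hypothetical: previously a struct { uint[] items; } stored in a mapping
// inside the main contract; now a contract with its own logic.
contract Order {
    address registry;
    uint[] items;

    // The deploying contract becomes the only account allowed to modify it.
    function Order() public {
        registry = msg.sender;
    }

    function addItem(uint item) public {
        require(msg.sender == registry);
        items.push(item);
    }

    function itemCount() public returns (uint) {
        return items.length;
    }
}

// Hypothetical main contract: keeps only references to the child contracts.
contract Registry {
    mapping(uint => Order) orders;

    function createOrder(uint id) public {
        orders[id] = new Order();
    }

    function addItem(uint id, uint item) public {
        orders[id].addItem(item);
    }
}
```

The trade-off is that every access to an Order now goes through an external call, so this pays off mainly when the child objects really do have independent life cycles.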
Q2: Rob already answered this. I will just add that the automatic getters for mappings and arrays do not provide a length function, and that a mapping cannot be iterated over unless you already have the list of keys.
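One common way to work around both limitations is to keep the keys in an array and expose its length explicitly. The sketch below assumes a 0.4.x compiler; the contract name Balances and the functions set and holderCount are hypothetical.

```solidity
pragma solidity ^0.4.12;

// Hypothetical contract: a mapping plus a key list that makes it iterable.
contract Balances {
    // Automatic getter: balanceOf(address). No way to enumerate keys.
    mapping(address => uint) public balanceOf;

    // Automatic getter: holders(uint index). No length is exposed.
    address[] public holders;

    // Tracks which keys are already recorded in holders.
    mapping(address => bool) known;

    function set(address who, uint amount) public {
        if (!known[who]) {
            known[who] = true;
            holders.push(who);
        }
        balanceOf[who] = amount;
    }

    // Explicit length function, since the array getter does not provide one.
    function holderCount() public returns (uint) {
        return holders.length;
    }
}
```

A client can then read holderCount() and walk holders(0) through holders(count - 1), looking up balanceOf for each key.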
Q3: Here it depends on whether you can break up your business logic so that each data structure is manipulated independently. From a gas perspective, the more straightforward the VM's processing is, the lower the gas cost will be. Also, with complex data structures one can expect searching in arrays and mappings, sorting, and remapping, all of which tend to be loop-intensive. So structure your internal data in a way that speeds up the algorithms. For instance, I needed a sorted list of dates, so I built it in such a way that the insert function keeps it sorted. Another example is to delegate more work to the external caller, i.e. use more separate transactions when possible: update the attributes of a contract in different calls rather than in one single call with all values.
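The answer does not show how the sorted list of dates was implemented, so the following is only one possible sketch of an insert function that keeps an array sorted, assuming a 0.4.x compiler. The contract name SortedDates and its functions are hypothetical.

```solidity
pragma solidity ^0.4.12;

// Hypothetical contract: dates are kept in ascending order at all times,
// so readers never need to sort.
contract SortedDates {
    uint[] dates;

    function insert(uint date) public {
        // Grow the array by one; the new value sits at the end provisionally.
        dates.push(date);
        uint i = dates.length - 1;

        // Shift larger entries one slot to the right until the correct
        // position for `date` is found (one insertion-sort step).
        while (i > 0 && dates[i - 1] > date) {
            dates[i] = dates[i - 1];
            i--;
        }
        dates[i] = date;
    }

    function count() public returns (uint) {
        return dates.length;
    }

    function at(uint index) public returns (uint) {
        require(index < dates.length);
        return dates[index];
    }
}
```

Each shift is a storage write, so this works best when new dates usually land near the end of the list; the benefit is that no call ever has to sort the whole array.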
Good luck and share your findings