Solidity – Identifying the Most Gas-Efficient Way to Compose Byte Arrays

gassolidity

I'm trying to test the gas costs for storing and retrieving strings on-chain, and I've found the most gas efficient way to put them there is with the following function (Its gas cost is a linear function of the value being stored)

mapping(string => bytes[]) private text_parts;

// ...

function append_to_text_part(string memory key, string memory value) public {
    text_parts[key].push(bytes(value));
}

To retrieve one of these stored strings I use

function retrieve_stored_byte_arrays_as_string(string memory text_part_label) private view returns(string memory) {
    string memory joined_string = "";
    bytes[] memory selected_text_parts = text_parts[text_part_label];
    uint selected_text_part_length = text_parts[text_part_label].length;
    for (uint i = 0; i < selected_text_part_length; i++) {
        joined_string = string(abi.encodePacked(joined_string, selected_text_parts[i]));
    }
    return joined_string;
}

My function seems to have linear memory cost relative to the total size of stored string. However, it's continually creating new strings and holding two in memory at once while using abi.encodePacked, which seems wasteful.

An efficient way to do this in another language would be the check the size of the stored bytes, create a byte[] with the total length, and memcpy the stored bytes into the appropriate positions in the returned array.

Is this possible to do in Solidity?

Best Answer

EVM design patterns take some time to get used to owing to the limitations of the platform. In other settings, time-complexity is important to the extent that efficiency is desired but the EVM imposes hard limits on the viable time-complexity of any transaction and costs to contract users based on OPCODES that are similar to CPU operations on a metal computer.

Generally, O(1) should be the goal. Where that isn't feasible, then carefully considered upper bounds, e.g. O(n) where n < limit holds in all cases ... otherwise you might get something that is either too expensive for the users, or completely unusable due to the block gasLimit.

Your first reaction to a concat "requirement" should usually be to find a design alternative to doing it at all. A clever layout of data at rest might be able to accomplish a similar effect. Also, since client processing power is plentiful and free, you can look at externalizing the concern, e.g. have clients prepare data in a way that's acceptable to the contract and not the other way around.

If you don't find an alternative and you're sure you can reasonably limit gas cost, you can consider trying for the most efficient process that gets it done. Something like this assembly code might help.

https://github.com/Arachnid/solidity-stringutils/blob/master/src/strings.sol

A caveat would be that assembly has a certain code smell that affects the time/cost of QA and audit, not mention debugging. That example is old and probably isn't best of breed in 2022. Avoid string manipulation, if possible, or keep looking for a modern implementation.

Hope it helps.

Related Topic