EVM – Why Does mload(0x40) Result in 0x20?

I ran the code below and had a question.

function mloadTest1() public pure returns(bytes memory) {
    bytes memory a;

    assembly {
        a := mload(0x40)
    }

    return a;
}

mload(x) is a Yul built-in that reads 32 bytes from memory at offset x. Solidity reserves the first four 32-byte words of memory (0x00 through 0x7f) for its own use, with the free memory pointer stored at 0x40 and initially set to 0x80.

So I expected the returned a in the code above to be 0x80, but the result was 0x20, which I don't understand.

Can anyone explain these results?

Best Answer

Unlike value types such as bytesN or uintN, reference types (denoted by their memory, calldata or storage annotation) are merely pointers to their underlying data.

When you use inline assembly to set a, what you are actually setting is the pointer held by that variable, not the value of the byte string itself. As you've correctly stated, the free memory pointer at 0x40 initially holds 0x80, so when you do a := mload(0x40) you're essentially saying "the bytes variable a now points to memory offset 0x80".

Considering that the memory is empty and the offset that a bytes object points to holds its length, Solidity interprets this as a bytes of length 0 because the word (32-byte segment) at 0x80 is 0.
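
The pointer reasoning above can be traced with a small Python model of EVM memory. This is only a sketch, not Solidity: the bytearray `memory` and the helper `mload` are hypothetical stand-ins for the EVM's memory and the MLOAD opcode, and the initial 0x80 is written by hand to mirror what Solidity sets up at the start of a call.

```python
# Toy model of EVM memory at the start of a Solidity function.
# Assumption for illustration: a fixed-size, zero-filled buffer.
memory = bytearray(0x100)

# Solidity initializes the free memory pointer slot at 0x40 to 0x80.
memory[0x40:0x60] = (0x80).to_bytes(32, "big")


def mload(offset: int) -> int:
    """Read one 32-byte big-endian word, like the MLOAD opcode."""
    return int.from_bytes(memory[offset:offset + 32], "big")


# `a := mload(0x40)` sets the pointer held by `a`, not its contents.
a = mload(0x40)

# A `bytes memory` value stores its length in the word it points to.
length_of_a = mload(a)

print(hex(a))       # 0x80
print(length_of_a)  # 0
```

In this model a ends up as the address 0x80, and the length read through that address is 0, which is why the function returns an empty bytes value rather than the number 0x80.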

When you then return the bytes, Solidity ABI-encodes it. Return values are encoded as a tuple of values, in this case a tuple with a single bytes: (bytes,). For variable-length values such as bytes, ABI encoding first writes the offset of the value within the return data, and then writes the length and the actual data at that offset.

So the data you're getting should be interpreted as follows:

0x

offset of the first `bytes` value (32 bytes in the return data):
0000000000000000000000000000000000000000000000000000000000000020
at the encoded offset (32) the actual length of the `bytes` (0):
0000000000000000000000000000000000000000000000000000000000000000
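
To check this reading of the return data, here is a minimal Python sketch that decodes the two words by hand. The function name `decode_single_bytes` is hypothetical, and it only handles the (bytes,) shape discussed here; it is not a general ABI decoder.

```python
# The 64 bytes of return data: an offset word (0x20) and a length word (0).
return_data = (0x20).to_bytes(32, "big") + (0).to_bytes(32, "big")


def decode_single_bytes(data: bytes) -> bytes:
    """Decode ABI return data shaped as the tuple (bytes,)."""
    # Head: offset of the dynamic value, relative to the start of the data.
    offset = int.from_bytes(data[0:32], "big")
    # Tail: a 32-byte length word, followed by the payload itself.
    length = int.from_bytes(data[offset:offset + 32], "big")
    start = offset + 32
    return data[start:start + length]


decoded = decode_single_bytes(return_data)
print(decoded)  # b''
```

The 0x20 in the first word is therefore the position of the length word inside the return data, not the content of a.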

Related Solutions

EVM – Understanding Confusion Around mload Opcode in EVM and Yul

There are two points that you should clearly understand in the assembly context:

1. All variables are value types in assembly

There is no such thing as a "reference type" in assembly. For instance, _input in the assembly context is the address of the byte array, not the byte array itself as in pure Solidity. Treating that value as an address (i.e., a pointer) or as anything else is purely a matter of interpretation; it is by no means enforced by the language.

2. Memory arrays have the following layout

The length of the array is stored at its address (in this case, the value of _input is simply the address at which the array's length is stored, encoded in 32 bytes). The memory addresses that follow contain the data, in as many 32-byte words as necessary.

You can read more about it in the documentation.

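For a concrete picture, consider an _input array composed of 33 bytes, each with value 0x01. The word at _input holds the length 0x21 (33), the next word holds the first 32 bytes of 0x01, and the third word holds the remaining 0x01 byte in its highest-order position.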

So:

mload(_input) loads the 32-byte value stored at the address _input (the value of _input is an address, and the value of memory[_input] is the length of the _input Solidity array).

add(_input, 0x20) takes the address _input and adds 0x20 (32) to skip the 32-byte length field. The result is the address at which the data actually starts in memory. Think of it as the address of _input[0] if you like.

keccak256 (SHA3) requires both the offset of the data to hash and its length. The offset is simply the memory address where the data starts (add(_input, 0x20)), and its length is mload(_input), as we have just seen.

I hope that answers your question. Don't hesitate to ask for clarification if anything is unclear.

Solidity – What Happens If Free Memory Pointer Is Not Updated in Assembly?

What will happen? Will it overwrite my data?

Yes. If the code that follows adheres to Solidity's memory model and uses memory, it will start writing at the address returned by mload(0x40) and overwrite your data.

    function example() public pure returns (bytes32, string memory) {
        bytes32 myValue;

        assembly {
            // mload(0x40) is 0x80 initially
            mstore(mload(0x40), 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF)
        }

        string memory myString = "hello";

        assembly {
            myValue := mload(0x80)
        }

        // myValue : 0x0000000000000000000000000000000000000000000000000000000000000005
        // myString : 'hello'
        return (myValue, myString);
    }

Here, myValue becomes 0x05 because the 32 bytes at 0x80 are now used by myString to store the length field. hello is 5 ASCII characters long, so its length is 5 bytes.

If you want to ensure that the memory space is secure (i.e., no side effects) then you must allocate the memory by incrementing the free memory pointer. Using memory above the free memory pointer but not allocating it is perfectly fine as long as you are treating that memory space as a scratch space. You can read more about it in the documentation.
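
The difference between scratch use and a real allocation can be sketched with a simplified Python model. Everything here is an assumption for illustration: `allocate` and `new_string` are hypothetical helpers that imitate a bump allocator rounding up to whole words, and they are not how the compiler is actually implemented.

```python
memory = bytearray(0x200)  # toy EVM memory, zero-initialized
FREE_PTR_SLOT = 0x40
MAX_WORD = 2**256 - 1


def mstore(offset: int, value: int) -> None:
    memory[offset:offset + 32] = value.to_bytes(32, "big")


def mload(offset: int) -> int:
    return int.from_bytes(memory[offset:offset + 32], "big")


def allocate(size: int) -> int:
    """Reserve `size` bytes (rounded up to whole words) and bump the pointer."""
    ptr = mload(FREE_PTR_SLOT)
    mstore(FREE_PTR_SLOT, ptr + (size + 31) // 32 * 32)
    return ptr


def new_string(text: bytes) -> int:
    """Place a string the way the model assumes: length word, then data."""
    ptr = allocate(32 + len(text))
    mstore(ptr, len(text))
    memory[ptr + 32:ptr + 32 + len(text)] = text
    return ptr


mstore(FREE_PTR_SLOT, 0x80)  # initial free memory pointer

# Scratch use: write at the free pointer without allocating.
scratch = mload(FREE_PTR_SLOT)
mstore(scratch, MAX_WORD)
s1 = new_string(b"hello")     # lands on the same address as `scratch`
overwritten = mload(scratch)  # the string's length replaced our word

# Allocated use: bump the pointer first, then write.
mine = allocate(32)
mstore(mine, MAX_WORD)
s2 = new_string(b"hello")     # lands after our reserved word
preserved = mload(mine)

print(overwritten)            # 5
print(preserved == MAX_WORD)  # True
```

In the first half the later allocation reuses the unreserved address and the stored word is replaced by the string length; in the second half the pointer was advanced first, so the word survives.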

I hope this answers your question.
