[Ethereum] How does solidity “tightly packed arguments” work in sha256()

hash, solidity

I'm trying to recreate the following Solidity code in python: sha256(x1, y1, x2, y2) where the x's are addresses and the y's are uint256's.

I've installed the correct sha256 library and I can successfully recreate a hash of one argument. But it doesn't work when I try it with multiple arguments, so I think I'm not "tightly packing" the arguments in the same way that Solidity is in sha256().

Here's what I'm trying right now in Python:

x1 = 0x0123456789012345678901234567890123456789
x2 = 0x0123456789012345678901234567890123456789
y1 = 0x0000000000000000000000000000000000000000000000000000000000000001
y2 = 0x0000000000000000000000000000000000000000000000000000000000000001

data = x1[2:] + x2[2:] + y1[2:] + y2[2:]
answer = sha256(data.decode("hex"))

The result is: 0x4e16b1812b95ddb26d9449dff36d48c85c2d0969f6725b117afc83ab2a354968

But an identical function in Solidity gives me: 0x0549a83cb0b851f2025d3600172681973b9944ab1b157e9fae00984061bce198

Any idea what I'm doing wrong?

Best Answer

First, a clarification: you stated that the hash is the result of sha256(x1, y1, x2, y2), but your Python code computes sha256(x1, x2, y1, y2). I don't know which ordering produced your result, so I am going to show the solution for both.

Preliminaries

  1. In Solidity, sha3(...) is an alias for keccak256(...). Either way, a Keccak-256 hash of the data is calculated.
  2. Passing multiple arguments to keccak256(...) is discouraged and will be deprecated in the future.

Quoting the documentation (https://solidity.readthedocs.io/en/v0.4.24/units-and-global-variables.html?highlight=abi.encodePacked#abi-encoding-functions):

Furthermore, keccak256(abi.encodePacked(a, b)) is a more explicit way to compute keccak256(a, b), which will be deprecated in future versions.

This note tells us something else as well. Since keccak256(a, b) is equivalent to keccak256(abi.encodePacked(a, b)), we need to understand how abi.encodePacked(...) operates to figure out which bytes were actually hashed in your sha256(x1,y1,x2,y2) call.

  3. abi.encodePacked(...) concatenates the arguments into a single byte sequence, without padding between them.

Quoting the documentation (https://solidity.readthedocs.io/en/v0.4.24/abi-spec.html#non-standard-packed-mode):

Solidity supports a non-standard packed mode where:

no function selector is encoded, types shorter than 32 bytes are neither zero padded nor sign extended and dynamic types are encoded in-place and without the length.

As an example encoding int1, bytes1, uint16, string with values -1, 0x42, 0x2424, "Hello, world!" results in

0xff42242448656c6c6f2c20776f726c6421
  ^^                                 int1(-1)
    ^^                               bytes1(0x42)
      ^^^^                           uint16(0x2424)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^ string("Hello, world!") without a length field

More specifically, each statically-sized type takes as many bytes as its range has and dynamically-sized types like string, bytes or uint[] are encoded without their length field. This means that the encoding is ambiguous as soon as there are two dynamically-sized elements.
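
The documentation example can be reproduced by hand with nothing but the Python standard library. The sketch below is only an illustration of the packing rule: the variable names are made up, and the one-byte signed integer stands in for the first argument of the quoted example.

```python
# Packed ("tightly packed") encoding of the documentation example.
# Variable names are illustrative; each width follows the declared Solidity type.
signed_byte = (-1).to_bytes(1, "big", signed=True)  # one-byte signed int, two's complement
single_byte = bytes([0x42])                         # bytes1
short_uint = (0x2424).to_bytes(2, "big")            # uint16, big-endian
text = "Hello, world!".encode("utf-8")              # string: raw bytes, no length prefix

packed = signed_byte + single_byte + short_uint + text
print("0x" + packed.hex())  # 0xff42242448656c6c6f2c20776f726c6421

# Ambiguity with two dynamic values: different inputs, identical packed bytes.
assert "ab".encode() + "c".encode() == "a".encode() + "bc".encode()
```

The final assertion shows why the encoding becomes ambiguous with two dynamically-sized elements: without length fields, ("ab", "c") and ("a", "bc") pack to the same bytes.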

Solution using Solidity

After implementing the solution in Python, my results did not match the hash you gave. I spent quite a while looking for an error on my side but could not find one, so I had to assume that the hash you provided is not actually the result of sha256(x1,x2,y1,y2) or sha256(x1,y1,x2,y2). That turned out to be the case.

You can execute the following code at https://remix.ethereum.org

pragma solidity ^0.4.22; // abi.encodePacked requires compiler version 0.4.22 or later
import "remix_tests.sol"; // this import is automatically injected by Remix.

contract Test {

    function testPackingAndHashing() public pure returns (bytes, bytes, bytes32, bytes32) {
        address x1 = 0x0123456789012345678901234567890123456789;
        address x2 = 0x0123456789012345678901234567890123456789;
        uint256 y1 = 0x0000000000000000000000000000000000000000000000000000000000000001;
        uint256 y2 = 0x0000000000000000000000000000000000000000000000000000000000000001;
        return (abi.encodePacked(x1, x2, y1, y2), abi.encodePacked(x1, y1, x2, y2), 
            keccak256(abi.encodePacked(x1, x2, y1, y2)), keccak256(abi.encodePacked(x1, y1, x2, y2)));
    }
}

Executing the code gives me the following results:

0: bytes: 0x0123456789012345678901234567890123456789012345678901234567890123456789012345678900000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000001

1: bytes: 0x0123456789012345678901234567890123456789000000000000000000000000000000000000000000000000000000000000000101234567890123456789012345678901234567890000000000000000000000000000000000000000000000000000000000000001

2: bytes32: 0x8507cc9ba3e98ea475462d270a7c3d1f516850eef73042db8942d7d171164e79

3: bytes32: 0xb11d6ebcb58892c42346a73b14a0d1422310158ce075b733ad1e23b86309cc8c
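
The two bytes values above are simply fixed-width, big-endian fields laid end to end: 20 bytes per address and 32 bytes per uint256. A minimal Python sketch of that layout, assuming the inputs are held as integers (pack_address and pack_uint256 are hypothetical helper names, not part of any library):

```python
def pack_address(value: int) -> bytes:
    """An address occupies exactly 20 bytes in packed mode."""
    return value.to_bytes(20, "big")


def pack_uint256(value: int) -> bytes:
    """A uint256 occupies exactly 32 bytes, left-padded with zeros."""
    return value.to_bytes(32, "big")


x1 = x2 = 0x0123456789012345678901234567890123456789
y1 = y2 = 1

packed_x1_x2_y1_y2 = (
    pack_address(x1) + pack_address(x2) + pack_uint256(y1) + pack_uint256(y2)
)
packed_x1_y1_x2_y2 = (
    pack_address(x1) + pack_uint256(y1) + pack_address(x2) + pack_uint256(y2)
)

print("0x" + packed_x1_x2_y1_y2.hex())
print("0x" + packed_x1_y1_x2_y2.hex())
```

Both results are 104 bytes long (20 + 20 + 32 + 32). Note that to_bytes keeps the leading zero byte of the address, which is lost if the value is formatted with hex() and sliced.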

Solution using Python

The existing Python libraries for Ethereum can be found in the package pyethereum (https://github.com/ethereum/pyethereum). They use the package "pycryptodome" (https://pycryptodome.readthedocs.io/en/latest/src/introduction.html) for their keccak256 hash calculations, which I will use in the following example as well.

Python code showing how to get the correct hash for Solidity's sha256(x1,y1,x2,y2) and sha256(x1,x2,y1,y2) calls:

from Crypto.Hash import keccak
from math import ceil

kec1 = keccak.new(digest_bits=256)
kec2 = keccak.new(digest_bits=256)

x1 = "0123456789012345678901234567890123456789"
x2 = "0123456789012345678901234567890123456789"
y1 = "0000000000000000000000000000000000000000000000000000000000000001"
y2 = "0000000000000000000000000000000000000000000000000000000000000001"

c1 = x1 + x2 + y1 + y2
c2 = x1 + y1 + x2 + y2

c1_byte_count = ceil(len(c1)/2)
c2_byte_count = ceil(len(c2)/2) # same as c1_byte_count of course

# note that Ethereum uses big endian representation for lists of bytes
c1_bytes = int(c1, 16).to_bytes(c1_byte_count, "big")
c2_bytes = int(c2, 16).to_bytes(c2_byte_count, "big")

c1_result = kec1.update(c1_bytes).hexdigest()
c2_result = kec2.update(c2_bytes).hexdigest()

print("Result of old solidity sha3(x1,x2,y1,y2): ", c1_result)
print("Result of old solidity sha3(x1,y1,x2,y2): ", c2_result)

Output:

Result of old solidity sha3(x1,x2,y1,y2): 8507cc9ba3e98ea475462d270a7c3d1f516850eef73042db8942d7d171164e79

Result of old solidity sha3(x1,y1,x2,y2): b11d6ebcb58892c42346a73b14a0d1422310158ce075b733ad1e23b86309cc8c
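
One caveat: the code above computes Keccak-256, which is what sha3(...) and keccak256(...) return. Solidity's sha256(...) is a separate built-in that computes SHA-256. If the contract really calls sha256, the same packed bytes can be hashed with hashlib from the standard library instead. This is only a sketch under that assumption, and I make no claim about which of these digests matches the hash in the question.

```python
import hashlib

# Same inputs as above, as fixed-width hex strings (40 hex digits = 20 bytes,
# 64 hex digits = 32 bytes).
x1 = x2 = "0123456789012345678901234567890123456789"
y1 = y2 = f"{1:064x}"

# bytes.fromhex keeps leading zero bytes, so the tight packing is preserved.
packed_x1_x2_y1_y2 = bytes.fromhex(x1 + x2 + y1 + y2)
packed_x1_y1_x2_y2 = bytes.fromhex(x1 + y1 + x2 + y2)

digest_x1_x2_y1_y2 = hashlib.sha256(packed_x1_x2_y1_y2).hexdigest()
digest_x1_y1_x2_y2 = hashlib.sha256(packed_x1_y1_x2_y2).hexdigest()

print("SHA-256 of packed (x1,x2,y1,y2): 0x" + digest_x1_x2_y1_y2)
print("SHA-256 of packed (x1,y1,x2,y2): 0x" + digest_x1_y1_x2_y2)
```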
