[Ethereum] How does the Keccak256 hash function work

hash-algorithm keccak

As the Ethereum platform relies on the Keccak256 hash algorithm, I'd like to get a better understanding of it.

My rough understanding is something like this:

A function that takes a finite string of bits and feeds it into a giant imaginary Rubik's cube, which is then shunted about in a specific way. A subset of 256 bits is then returned. The function has the property that a change to a single input bit causes the output to change in an unpredictable way.

Is the above approximately true? You might see where I got the rubik's cube idea from if you look at Figure 1 here (I think this is the right spec).

There's also this, which I've read through, but it has not really soaked in.

How does the Keccak256 hash function work?

Best Answer

A nice property of Keccak is that it accepts inputs of arbitrary length, so the input space is effectively unbounded. This lets you hash a very large file: each block of input scrambles the internal state some more. The hash should change completely and unpredictably if a single bit of the source data differs, unlike, say, a CRC32 or a simple checksum, where the change is predictable. It also means a password could be a million characters long and still be stored on disk as a fixed-size hash, which is much smaller.
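To make the "single bit changes everything" point concrete, here is a small Node.js sketch. It is only an illustration under a few assumptions: it uses Node's built-in `sha3-256` (the NIST variant, not Ethereum's Keccak-256, which Node's `crypto` module does not expose), and `additiveChecksum` is a made-up toy checksum used for contrast.

```javascript
const { createHash } = require('crypto');

// Count the bits that differ between two equal-length byte buffers.
function bitDifference(a, b) {
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      diff += x & 1;
      x >>= 1;
    }
  }
  return diff;
}

// Toy checksum for contrast (hypothetical helper): sum of all bytes modulo 2^32.
function additiveChecksum(bytes) {
  let sum = 0;
  for (const byte of bytes) sum = (sum + byte) >>> 0;
  return sum;
}

const original = Buffer.from('The quick brown fox jumps over the lazy dog');
const flipped = Buffer.from(original);
flipped[0] ^= 0x01; // flip a single input bit

// Assumption: 'sha3-256' stands in for Keccak-256 here; both are sponge-based.
const hashA = createHash('sha3-256').update(original).digest();
const hashB = createHash('sha3-256').update(flipped).digest();

const hashBitsChanged = bitDifference(hashA, hashB);
const checksumDelta = Math.abs(additiveChecksum(original) - additiveChecksum(flipped));

console.log(`hash bits changed: ${hashBitsChanged} of 256`);
console.log(`checksum moved by: ${checksumDelta}`);
```

For a well-behaved hash, roughly half of the 256 output bits should flip, while the toy checksum moves by exactly the size of the change to the input byte.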

Keccak uses a "sponge construction"; you can read up on it here: https://keccak.team/keccak_specs_summary.html. As I understand it, the sponge is built around a permutation chosen from a set of seven Keccak permutations, Keccak-f[b], where b∈{25,50,100,200,400,800,1600} is the width of the permutation in bits, i.e. the size of the internal state.

The state is organized as an array of 5×5 lanes, each of length w∈{1,2,4,8,16,32,64} bits, so b = 25w. When implemented on a 64-bit processor, a lane of Keccak-f[1600] can be represented as a single 64-bit CPU word.

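Finally, on collisions: Keccak-256 does not step through the different widths. It uses only the 1600-bit permutation, applied to the whole state after each block of input is mixed in. Of those 1600 bits, 1088 (the rate) take in input and 512 (the capacity) are never directly exposed, and the output is the first 256 bits of the final state. Collision resistance is an explicit design goal, and no practical collision attack on the full function is publicly known.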
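For anyone who wants to see the moving parts, here is a compact, unoptimized sketch of Keccak-256 in plain JavaScript. It follows the structure of the Keccak team's public "readable and compact" reference code (25 lanes of 64 bits, 24 rounds of θ, ρ, π, χ, ι), with the usual Keccak-256 parameters assumed: a 1088-bit rate, a 512-bit capacity, and the original Keccak padding byte 0x01. The names `keccakF1600` and `keccak256` are my own, and the code favours clarity over speed, so treat it as a learning aid rather than something to use in production.

```javascript
const MASK64 = (1n << 64n) - 1n;
// Rotate a 64-bit lane left by n bits.
const rol64 = (v, n) => ((v << BigInt(n)) | (v >> BigInt(64 - n))) & MASK64;

// The Keccak-f[1600] permutation on 25 lanes, stored as A[x + 5 * y].
function keccakF1600(A) {
  let lfsr = 0x01; // generates the round constants for the iota step
  for (let round = 0; round < 24; round++) {
    // theta: XOR each lane with the parities of two neighbouring columns
    const C = [0, 1, 2, 3, 4].map((x) => A[x] ^ A[x + 5] ^ A[x + 10] ^ A[x + 15] ^ A[x + 20]);
    for (let x = 0; x < 5; x++) {
      const D = C[(x + 4) % 5] ^ rol64(C[(x + 1) % 5], 1);
      for (let y = 0; y < 5; y++) A[x + 5 * y] ^= D;
    }
    // rho and pi: rotate each lane and move it to a new position
    let px = 1;
    let py = 0;
    let current = A[1];
    for (let t = 0; t < 24; t++) {
      const r = (((t + 1) * (t + 2)) / 2) % 64;
      [px, py] = [py, (2 * px + 3 * py) % 5];
      const next = A[px + 5 * py];
      A[px + 5 * py] = rol64(current, r);
      current = next;
    }
    // chi: the only non-linear step, applied row by row
    for (let y = 0; y < 25; y += 5) {
      const row = A.slice(y, y + 5);
      for (let x = 0; x < 5; x++) {
        A[y + x] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5]);
      }
    }
    // iota: XOR a round constant into lane (0, 0)
    for (let j = 0; j < 7; j++) {
      if (lfsr & 1) A[0] ^= 1n << BigInt((1 << j) - 1);
      lfsr = ((lfsr << 1) ^ ((lfsr & 0x80) ? 0x71 : 0)) & 0xff;
    }
  }
}

// Sponge: pad, absorb 136-byte blocks, then squeeze 32 bytes as lowercase hex.
function keccak256(bytes) {
  const RATE = 136; // 1088 bits; the remaining 512 bits are the capacity
  const padLen = RATE - (bytes.length % RATE);
  const padded = new Uint8Array(bytes.length + padLen);
  padded.set(bytes);
  padded[bytes.length] ^= 0x01; // original Keccak padding (NIST SHA-3 uses 0x06)
  padded[padded.length - 1] ^= 0x80;

  const state = new Array(25).fill(0n);
  for (let off = 0; off < padded.length; off += RATE) {
    // absorb: XOR the block into the first 17 lanes (little-endian), then permute
    for (let i = 0; i < RATE; i++) {
      state[i >> 3] ^= BigInt(padded[off + i]) << BigInt(8 * (i & 7));
    }
    keccakF1600(state);
  }
  // squeeze: the first 256 bits of the state are the digest
  let hex = '';
  for (let i = 0; i < 32; i++) {
    const byte = Number((state[i >> 3] >> BigInt(8 * (i & 7))) & 0xffn);
    hex += byte.toString(16).padStart(2, '0');
  }
  return hex;
}

console.log(keccak256(new TextEncoder().encode('testing')));
```

The Rubik's cube picture from the question maps onto `keccakF1600`: the input is XORed into part of the state, the whole state is scrambled, and only a slice of it is read back out.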

Related Solutions

Cryptography – Which Cryptographic Hash Function Does Ethereum Use?

Ethereum uses KECCAK-256. It should be noted that it does not follow the FIPS-202 based standard (a.k.a SHA-3), which was finalized in August 2015.

According to this, NIST changed the padding to SHA3-256(M) = KECCAK [512] (M || 01, 256). This was different from the padding proposed by the Keccak team in The Keccak SHA-3 submission version 3 (final, winning version). The difference is the additional '01' bits appended to the message. People are now calling the "submitted version 3" SHA-3 Keccak hashing "Keccak" and the finalized NIST SHA-3 standard "SHA-3".
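At the byte level, the padding difference is a single byte. The sketch below builds the padded final block for both variants, assuming the 136-byte (1088-bit) rate that goes with a 512-bit capacity; `padMessage` is a hypothetical helper name, not part of any library.

```javascript
// Pad a message to a multiple of the rate. The domain byte carries the suffix
// bits plus the first '1' of pad10*1: 0x01 for original Keccak, 0x06 for NIST
// SHA-3 (the extra '01' bits shift the first padding bit up by two places).
function padMessage(bytes, domainByte, rateBytes = 136) {
  const padLen = rateBytes - (bytes.length % rateBytes);
  const padded = new Uint8Array(bytes.length + padLen);
  padded.set(bytes);
  padded[bytes.length] ^= domainByte; // first byte after the message
  padded[padded.length - 1] ^= 0x80;  // final '1' bit of pad10*1
  return padded;
}

const msg = new TextEncoder().encode('testing');
const keccakBlock = padMessage(msg, 0x01);
const sha3Block = padMessage(msg, 0x06);

// Indexes at which the two padded blocks differ.
const differingIndexes = [];
for (let i = 0; i < keccakBlock.length; i++) {
  if (keccakBlock[i] !== sha3Block[i]) differingIndexes.push(i);
}
console.log(differingIndexes); // only the byte right after the message differs
```

Because that one byte goes through the same permutation as everything else, the two digests end up completely unrelated.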

Using this online generator and the Solidity Online Compiler, I tested the difference between Keccak-256 and SHA3-256. I hashed the word "testing" using Ethereum and the two SHA3 hashing algorithms:

Ethereum SHA3 function in Solidity = 5f16f4c7f149ac4f9510d9cf8cf384038ad348b3bcdc01915f95de12df9d1b02

Keccak-256 = 5f16f4c7f149ac4f9510d9cf8cf384038ad348b3bcdc01915f95de12df9d1b02

SHA3-256 (NIST Standard) = 7f5979fb78f082e8b1c676635db8795c4ac6faba03525fb708cb5fd68fd40c5e

Solidity SHA3 – How Does keccak256 Hash Uints?

Jehan's answer is great, but we need to explain one more thing: Why does sha3(1) in solidity produce b10e2d...fa0cf6?

This is because Solidity's sha3 function hashes its inputs based on the argument types. Thus the value 1 will generate a different hash if it is stored as bytes8, bytes16, bytes32, etc. Since sha3(1) is being passed 1 as a number literal, it is converted into the smallest necessary type, uint8.

8 bits fit into 2 hex characters, so if you pad your input to 2 characters you will get the same result in web3:

Javascript:

// leftPad is assumed to be the left-pad npm package
const leftPad = require('left-pad')
web3.sha3(leftPad((1).toString(16), 2, 0), { encoding: 'hex' })
// 5fe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2

Likewise, you can cast the number on the solidity side:

Solidity:

// uint is equivalent to uint256
sha3(uint(1))
// b10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6

Javascript:

// note that the value is padded by 64 characters to fit 256 bits
web3.sha3(leftPad((1).toString(16), 64, 0), { encoding: 'hex' })
// b10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6

A note about `BigNumber` types:

They don't work automatically with web3.sha3. You have to convert them to hex first.

Solidity:

sha3(uint(100 ether))
// c7cc234d21c9cfbd4632749fd77669e7ae72f5241ce5895e410c45185a469273

Javascript:

// the .slice is to remove the leading '0x'
web3.sha3(leftPad(web3.toHex(web3.toWei(100)).slice(2), 64, 0), { encoding: 'hex' })
// c7cc234d21c9cfbd4632749fd77669e7ae72f5241ce5895e410c45185a469273
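The padding in the snippets above can be folded into one helper. This is just a sketch using native BigInt instead of leftPad and BigNumber; `encodeUint` is a hypothetical name, and the result is the hex string you would pass to a hashing call with hex encoding.

```javascript
// Hex-encode an unsigned integer at a fixed Solidity-style bit width
// (8 for uint8, 256 for uint256), without the leading '0x'.
function encodeUint(value, bits) {
  const v = BigInt(value);
  if (v < 0n || v >= (1n << BigInt(bits))) {
    throw new RangeError(`${value} does not fit in uint${bits}`);
  }
  // 4 bits per hex character, so uint8 -> 2 chars and uint256 -> 64 chars
  return v.toString(16).padStart(bits / 4, '0');
}

const oneAsUint8 = encodeUint(1, 8);
const oneAsUint256 = encodeUint(1, 256);
const hundredEtherInWei = encodeUint(100n * 10n ** 18n, 256);

console.log(oneAsUint8);
console.log(oneAsUint256);
console.log(hundredEtherInWei);
```

The same number produces different byte strings at different widths, which is why the hashes differ between the uint8 and uint256 cases.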

EDIT:

I wrote a small lib that provides a version of web3.sha3 that exactly matches the behavior of sha3 in Solidity. Hopefully this clears up all your hashing woes :). https://github.com/raineorshine/solidity-sha3
