[Ethereum] IPFS hash algorithm

blockchain go-ethereum ipfs

How is an IPFS hash (the output of ipfs add) actually generated?

I'm looking for the step-by-step operations performed on the input (bytes/string) to produce the output (in this case, the IPFS hash).

Best Answer

IPFS uses multihash, which has the following format:

base58(<varint hash function code><varint digest size in bytes><hash function output>)

The list of hash function codes can be found in the multiformats multicodec table.

Here's a breakdown of the process using SHA2-256 as the hash function:

sha2-256   size  sha2-256("hello world")
0x12       0x20  0xb94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

Concatenating those three items will produce

1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
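To make the layout concrete, here is a minimal sketch that splits the hex string above back into its three fields. The helpers readVarint and parseMultihash are illustrative names, not part of any library, and the sketch assumes the input is a well-formed hex-encoded multihash.

```javascript
// Sketch: split a hex-encoded multihash into <code><size><digest>.
// readVarint and parseMultihash are hypothetical helpers for illustration.

// Read an unsigned varint (7 bits per byte, least significant group first).
function readVarint (buf, offset) {
  let value = 0
  let shift = 0
  let pos = offset
  while (true) {
    if (pos >= buf.length) throw new Error('truncated varint')
    const byte = buf[pos++]
    value += (byte & 0x7f) * 2 ** shift // multiply instead of << to avoid 32-bit overflow
    shift += 7
    if ((byte & 0x80) === 0) break // high bit clear marks the last byte
  }
  return { value, next: pos }
}

function parseMultihash (hex) {
  const buf = Buffer.from(hex, 'hex')
  const code = readVarint(buf, 0)
  const size = readVarint(buf, code.next)
  const digest = buf.subarray(size.next)
  if (digest.length !== size.value) throw new Error('digest length mismatch')
  return { code: code.value, size: size.value, digest: digest.toString('hex') }
}

const parsed = parseMultihash('1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9')
console.log(parsed.code.toString(16)) // 12 (sha2-256)
console.log(parsed.size) // 32
console.log(parsed.digest)
```

For SHA2-256 both the code (0x12) and the size (0x20) are below 128, so each varint happens to be a single byte.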

You then encode the concatenated bytes to base58:

QmaozNR7DZHQK1ZcU9p7QdrshMvXqWK6gpu5rmrkPdT3L4
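If you're curious what the base58 step does, here is a rough sketch using BigInt and the Bitcoin alphabet. The function name base58Encode is made up for illustration; in practice you would use a library such as bs58.

```javascript
// Sketch of base58 encoding (Bitcoin alphabet); illustrative, not a library API.
const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

function base58Encode (bytes) {
  // Interpret the bytes as one big-endian integer.
  let n = bytes.length ? BigInt('0x' + Buffer.from(bytes).toString('hex')) : 0n
  let out = ''
  // Repeatedly divide by 58; each remainder selects one character.
  while (n > 0n) {
    out = ALPHABET[Number(n % 58n)] + out
    n /= 58n
  }
  // Each leading zero byte is represented by a leading '1'.
  for (let i = 0; i < bytes.length && bytes[i] === 0; i++) out = '1' + out
  return out
}

const combinedHex = '1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9'
const encoded = base58Encode(Buffer.from(combinedHex, 'hex'))
console.log(encoded)
```

This should print the same string as shown above. Because every SHA2-256 multihash starts with the bytes 0x12 0x20 and is 34 bytes long, the base58 form always starts with "Qm" and is 46 characters long.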

Here's an example of how to implement multihash in JavaScript:

const crypto = require('crypto')
const bs58 = require('bs58')

const data = 'hello world'
const hashFunction = Buffer.from('12', 'hex') // 0x12 = sha2-256

const digest = crypto.createHash('sha256').update(data).digest()
console.log(digest.toString('hex')) // b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

const digestSize = Buffer.from([digest.byteLength]) // single byte; a valid varint for sizes below 128
console.log(digestSize.toString('hex')) // 20

const combined = Buffer.concat([hashFunction, digestSize, digest])
console.log(combined.toString('hex')) // 1220b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

const multihash = bs58.encode(combined)
console.log(multihash.toString()) // QmaozNR7DZHQK1ZcU9p7QdrshMvXqWK6gpu5rmrkPdT3L4

There's a CLI you can use to generate multihashes:

$ go get github.com/multiformats/go-multihash/multihash
$ echo -n "hello world" | multihash -a sha2-256
QmaozNR7DZHQK1ZcU9p7QdrshMvXqWK6gpu5rmrkPdT3L4

A file in IPFS is "transformed" into a UnixFS "file", which is a representation of files in a DAG. So when you use ipfs add to upload a file to IPFS, the data is wrapped with metadata, which gives a different result than multihashing the raw data directly.

For example:

$ echo -n "hello world" | ipfs add -Q
Qmf412jQZiuVUtdgnB36FXFX7xg5V6KEbSJ4dpQuhkLyfD

Here's an example in Node.js of how to generate the exact same multihash as ipfs add:

const Unixfs = require('ipfs-unixfs')
const {DAGNode} = require('ipld-dag-pb')

const data = Buffer.from('hello world', 'ascii')
const unixFs = new Unixfs('file', data)

DAGNode.create(unixFs.marshal(), (err, dagNode) => {
  if (err) return console.error(err)

  console.log(dagNode.toJSON().multihash) // Qmf412jQZiuVUtdgnB36FXFX7xg5V6KEbSJ4dpQuhkLyfD
})
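To see what the wrapper looks like at the byte level, here is a rough sketch that builds the protobuf bytes by hand using only Node's crypto module. It assumes the settings ipfs add used for the example above (sha2-256, a small file that fits in a single chunk, no raw leaves); wrapAsUnixfsFile and varint are hypothetical helpers, and the field layout is my reading of the UnixFS and dag-pb schemas rather than anything taken from the libraries above.

```javascript
const crypto = require('crypto')

// Encode a non-negative integer as a protobuf varint.
function varint (n) {
  const out = []
  while (n >= 0x80) {
    out.push((n % 128) | 0x80) // low 7 bits, with the continuation bit set
    n = Math.floor(n / 128)
  }
  out.push(n)
  return Buffer.from(out)
}

// Hypothetical helper: wrap raw bytes as a single-chunk UnixFS file
// inside a dag-pb node with no links (assumed layout, see lead-in).
function wrapAsUnixfsFile (data) {
  const unixfs = Buffer.concat([
    Buffer.from([0x08, 0x02]), // UnixFS field 1 (Type): 2 = File
    Buffer.from([0x12]), varint(data.length), data, // field 2 (Data): the file bytes
    Buffer.from([0x18]), varint(data.length) // field 3 (filesize)
  ])
  // dag-pb node: field 1 (Data) holds the serialized UnixFS message
  return Buffer.concat([Buffer.from([0x0a]), varint(unixfs.length), unixfs])
}

const node = wrapAsUnixfsFile(Buffer.from('hello world'))
console.log(node.toString('hex')) // 0a110802120b68656c6c6f20776f726c64180b

// Hash the wrapped bytes, not the raw input, then prefix code and size.
const nodeDigest = crypto.createHash('sha256').update(node).digest()
const nodeMultihash = Buffer.concat([Buffer.from([0x12, 0x20]), nodeDigest])
console.log(nodeMultihash.toString('hex'))
```

Under these assumptions, passing nodeMultihash to bs58.encode as in the earlier multihash example should give the same hash that ipfs add printed. Larger files are split into chunks and linked together, so this sketch does not cover them.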

Hope this helps
