go-ethereum – Understanding Ethereum Wallet v3 Format

encryption, go-ethereum, wallets

Can someone explain what exactly an Ethereum wallet file (v3) contains and how it works? How exactly is the public-private key pair encrypted? Why are two algorithms mentioned in the file (aes-128-ctr and the scrypt KDF)? How is each of them used? What is the "mac" value on the last line? Consider the following file:

{  
   "address":"2600a448db443dc49f3c0b6bf46e6f9110914568",
   "id":"712b2934-7ccd-4ef7-87f3-6384627d5b7d",
   "version":3,
   "crypto":{  
      "cipher":"aes-128-ctr",
      "ciphertext":"225c3c42c2d7834c844a26070b13da6d5ac0e812022e4a4be434833aef430ae6",
      "cipherparams":{  
         "iv":"75304b13fcf01c67536eb985f88dfc43"
      },
      "kdf":"scrypt",
      "kdfparams":{  
         "dklen":32,
         "n":262144,
         "p":1,
         "r":8,
         "salt":"cd623230c41b3c8a8a88547e150da6ca1653bff04951cedd13791243d910cb21"
      },
      "mac":"cbe1a233297b97518efdbebe4a250bf5b29461537384f0d83d9f016c747eff5f"
   }
}

Best Answer

The file stores the private key of your Ethereum account, encrypted with a symmetric cipher (aes-128-ctr) whose key is derived from your password by a key derivation function (scrypt). The encryption happens after the private key is generated, so that you must provide the password AND this file to get at it. The public key is not stored at all: it (and the address) can always be re-derived from the private key.

Generating a Private Key

I am going to skip over private key / public keys as much as I can, but we should at least cover it a bit.

  1. Create a random private key (64 (hex) characters / 256 bits / 32 bytes)

  2. Derive the public key from this private key (128 (hex) characters / 512 bits / 64 bytes). Math. Elliptic curves (secp256k1, the curve used for ECDSA). Stuff.

  3. Derive the address from this public key (40 (hex) characters / 160 bits / 20 bytes). Take the last 40 characters of the Keccak-256 hash of the public key. Prefix with 0x.

  4. We are here: Encrypt the private key with a password.
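
Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not wallet-grade code: the arithmetic is not constant-time, and the helper names (`new_private_key`, `public_key`, `point_add`) are made up for this example. Step 3 is left out because it needs Keccak-256, which is not in Python's standard library.

```python
import secrets

# secp256k1 domain parameters (the curve Ethereum uses).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P   # tangent (doubling)
    else:
        slope = (y2 - y1) * pow(x2 - x1, -1, P) % P    # chord (addition)
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def new_private_key() -> bytes:
    """Step 1: 32 random bytes that form a valid secp256k1 scalar."""
    while True:
        key = secrets.token_bytes(32)
        if 0 < int.from_bytes(key, "big") < N:
            return key


def public_key(private_key: bytes) -> bytes:
    """Step 2: multiply G by the private key (double-and-add), return X || Y."""
    k = int.from_bytes(private_key, "big")
    if not 0 < k < N:
        raise ValueError("private key out of range")
    result, addend = None, G
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    x, y = result
    return x.to_bytes(32, "big") + y.to_bytes(32, "big")


priv = new_private_key()
pub = public_key(priv)
print(len(priv.hex()), len(pub.hex()))  # hex lengths of private and public key
```

Python 3.8 or newer is assumed, because `pow(x, -1, P)` is used for the modular inverse.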

Cryptographic Hash Functions & Block Ciphers

A block cipher is a deterministic algorithm operating on fixed-length groups of bits, called a block, with an unvarying transformation that is specified by a symmetric key. Block ciphers operate as important elementary components in the design of many cryptographic protocols, and are widely used to implement encryption of bulk data.

A cryptographic hash function is a special class of hash function that has certain properties which make it suitable for use in cryptography. It is a mathematical algorithm that maps data of arbitrary size to a bit string of a fixed size (a hash) and is designed to be a one-way function, that is, a function which is infeasible to invert. The only way to recreate the input data from an ideal cryptographic hash function's output is to attempt a brute-force search of possible inputs to see if they produce a match, or use a rainbow table of matched hashes. Bruce Schneier has called one-way hash functions "the workhorses of modern cryptography".1 The input data is often called the message, and the output (the hash value or hash) is often called the message digest or simply the digest.

Why?

Because you want 2 pieces of information (the keystore file and the password) instead of 1 in order to access your funds.

As you make your way through the below, you will see the n value. This value is the work factor of the key derivation function: it controls how much computation (and memory) every single password check costs. The higher it is, the harder the file is to brute-force.

This is the reason that if you try to unlock the above (hereon referred to as the keystore file) on MyEtherWallet, it will take 20-30 seconds, depending on your computer and browser. Firefox will typically time out during that process (you can press continue to have it keep going through the rounds).

Why is this more secure? Because given the above without a password, it will take 20 seconds for each password you guess. This makes it less efficient for a computer to brute force large numbers of these files. This is why MyEtherWallet are less concerned with the n value being a bit lower, as we don't store these on servers. If we did, enforcing complex passwords (possibly those not chosen by the user, as Blockchain wallet does) as well as a high n value would be more important. If our server was compromised, preventing bulk brute-forcing of these keystore files would be of utmost importance.

However, this does not increase the security if the attacker has both the keystore file AND the password. It may take 20 seconds, but it will only take 20 seconds. Basically, it only secures against brute-force style attacks.
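
To put rough numbers on the brute-force cost described above, here is a back-of-the-envelope sketch. The 20 seconds per guess is the browser figure quoted in this answer; a dedicated cracking rig would be faster, so treat the result as an illustration of the scaling and not as a measurement. The function name `days_to_try` is made up for this example.

```python
SECONDS_PER_GUESS = 20      # browser unlock time quoted above (assumption for the estimate)
SECONDS_PER_DAY = 86_400


def days_to_try(num_passwords: int, seconds_per_guess: float = SECONDS_PER_GUESS) -> float:
    """Time to test every password in a list against ONE keystore file."""
    return num_passwords * seconds_per_guess / SECONDS_PER_DAY


# A one-million-entry password list against a single keystore file:
print(round(days_to_try(1_000_000), 1), "days")
```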

The Stuff

"address":"2600a448db443dc49f3c0b6bf46e6f9110914568"

Your address, sans leading 0x.


"id":"712b2934-7ccd-4ef7-87f3-6384627d5b7d",

a random ID


"version":3,

So you can tell the difference between this and version 2 or 4. This should act as a "hard" identifier of the version; implementations may also use a minor version to track smaller, non-breaking changes to the format.


"crypto"

"cipher":"aes-128-ctr"

The cipher to use. Names must match those supported by OpenSSL, e.g. aes-128-ctr or aes-128-cbc.

AES is a standard. This ensures that a program can encrypt or decrypt it using standard libraries. This is important as it ensures that (1) a program knows which algorithm or library to use and (2) because you shouldn't roll your own crypto. 😉

The 128 represents the key size in bits. Aka it's a "128-bit key" or a "256-bit key". The higher the number, the more secure. More specifically...

With a key of length n bits, there are 2^n possible keys. This number grows very rapidly as n increases. The large number of operations (2^128) required to try all possible 128-bit keys is widely considered out of reach for conventional digital computing techniques for the foreseeable future.
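
A quick sanity check of "out of reach", assuming a purely hypothetical attacker who can test one trillion keys per second:

```python
KEYS = 2 ** 128                     # number of possible 128-bit keys
RATE = 10 ** 12                     # hypothetical: keys tested per second
SECONDS_PER_YEAR = 365 * 24 * 3600

years = KEYS // (RATE * SECONDS_PER_YEAR)
print(f"about {years:.2e} years to try every key")
```

Even at that made-up rate the answer is on the order of 10^19 years.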

Keep in mind that a raw private key is 256 bits. The AES key here is 128 bits. And remember above how we took the hash of the public key to derive the address? That hash function is Keccak-256. See. 128. 256. You get it now, right? No?

Okay, so each character of a private key can be A or B or CDEF1234567890. Put more simply, each time you choose a character it is 1 of 16 possible values. There are 64 characters. Therefore, there are 16^64 possibilities. 16^64, when reduced, is 2^256. Which makes it a 256-bit key. If you don't believe me, use https://www.wolframalpha.com/input/?i=16%5E64 vs https://www.wolframalpha.com/input/?i=2%5E256.
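
If you would rather not open WolframAlpha, the same check is a couple of lines of Python (arbitrary-precision integers make this exact):

```python
HEX_CHARS = 64                              # length of a private key in hex characters

possibilities = 16 ** HEX_CHARS             # 16 choices per character
bits = (possibilities - 1).bit_length()     # bits needed for the largest possible key

print(possibilities == 2 ** 256, bits)
```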



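"ciphertext":"225c3c42c2d7834c844a26070b13da6d5ac0e812022e4a4be434833aef430ae6",

Your private key, encrypted with the cipher named above using a key derived from your password. It is 64 hex characters (32 bytes), the same length as the raw private key.
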

"cipherparams"

"iv":"75304b13fcf01c67536eb985f88dfc43"

Initialization vector for the cipher. Size must match the requirements of the cipher. Random number generated via crypto.getRandomBytes if nothing is supplied.
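
A small sketch of what "size must match" means for the file in the question. AES always works on 16-byte blocks regardless of key size, so the iv is 16 bytes, and because CTR is a stream mode that does no padding, the ciphertext is exactly as long as the 32-byte private key it hides. The `crypto` dict below just copies two fields from the file above.

```python
AES_BLOCK_BYTES = 16   # AES block size is 128 bits for every key size

crypto = {
    "ciphertext": "225c3c42c2d7834c844a26070b13da6d5ac0e812022e4a4be434833aef430ae6",
    "iv": "75304b13fcf01c67536eb985f88dfc43",
}

iv = bytes.fromhex(crypto["iv"])
ciphertext = bytes.fromhex(crypto["ciphertext"])

print(len(iv), len(ciphertext))   # byte lengths of iv and ciphertext
```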


"kdf":"scrypt"

The key derivation function. Could be scrypt or pbkdf2. It turns your password into the key used by the cipher, and it deliberately does a lot of work along the way to make the password harder to brute-force. This added computational work makes password cracking much more difficult, and is known as key stretching.

Instead of just going

password -> hashed password

It goes

password -> (MATH * 262144) -> hashed password.

Thus, when going backwards, you must include the (MATH * 262144), which adds valuable time to each test you run. Instead of a single test taking .00000001 second, it may take .0001 second or 1 second.
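
Here is a minimal sketch of that stretching step using Python's built-in `hashlib.scrypt` (it needs a Python build linked against OpenSSL 1.1 or newer). The password string, its UTF-8 encoding, and the helper name `derive_key` are assumptions for illustration. The demo uses a small n so it runs instantly; the file above uses n=262144, which needs roughly 256 MiB and noticeably more time per call.

```python
import hashlib


def derive_key(password: str, salt: bytes, n: int, r: int, p: int, dklen: int) -> bytes:
    """Stretch a password into a dklen-byte key with scrypt."""
    return hashlib.scrypt(
        password.encode("utf-8"),           # encoding is an assumption of this sketch
        salt=salt,
        n=n, r=r, p=p,
        dklen=dklen,
        maxmem=2 * 128 * r * (n + p + 2),   # raise OpenSSL's default memory cap as needed
    )


salt = bytes.fromhex("cd623230c41b3c8a8a88547e150da6ca1653bff04951cedd13791243d910cb21")

# Small n for the demo only; swap in n=262144 to match the keystore file.
key = derive_key("hypothetical password", salt, n=2**10, r=8, p=1, dklen=32)
print(len(key), "byte derived key")
```

The same password, salt and parameters always give the same key, which is why the salt and parameters are stored in the file in the clear.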

This actually works too. People are complaining that their GPUs are painfully slow when playing with hashcat's new Ethereum wallet decrypter.

Since writing about cracking various Ethereum wallets using the JSON file, a few people have mentioned that their systems hang/blue screen when they start the crack, so I thought I’d talk about why this is.

tl;dr – If hashcat crashes/hangs your system, your wallet scrypt settings more than likely want more RAM than your GPU has. You’ll only be able to crack with a CPU ~~(adding -D 1 # where # is the number hashcat assigns your CPU will select all available CPU devices, or -D 1 -d for an individual CPU)~~ and the hash rate will still be slow 😦

So the reason some of your systems hang when starting hashcat is because the N results in hashcat trying to use more RAM than your GPUs have.

https://stealthsploit.com/2018/01/04/ethereum-wallet-cracking-pt-2-gpu-vs-cpu/


"kdfparams"

The parameters used for stretching the password, stored so that the exact same key can be derived again later.


"dklen":32

Derived key length (in bytes). For certain cipher settings, this must match the block sizes of those.


"n":262144

Work factor (scrypt's CPU/memory cost parameter, often loosely called the iteration count). Defaults to 262144 for geth. Lower for MEW as browsers hate doing something 262,144 times. Basically, this is the number in the (MATH * 262144) from above.


"p":1

Parallelization factor. Defaults to 1. This is a scrypt parameter and has nothing to do with AES. Within one scrypt pass you have to go 0...1...2...262143 in order, because each step depends on the output of the step before it, so you can't run those in parallel. Setting p to 2 asks scrypt to do two such independent passes, which can run side by side if the hardware allows and roughly double the work otherwise.

https://crypto.stackexchange.com/questions/26021/can-aes-in-cbc-mode-be-parallelized


"r":8

Block size for the underlying hash. Defaults to 8.


Okay so this is going to cover scrypt and r and n and p. Ready?

n, r, p values and their effects on CPU vs GPU cracking

https://stealthsploit.com/2018/01/04/ethereum-wallet-cracking-pt-2-gpu-vs-cpu/
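
The RAM problem mentioned earlier falls straight out of n and r. As a rough rule, one scrypt evaluation needs about 128 * n * r bytes of working memory; the sketch below applies that to the parameters in the file above (the function name is made up for this example).

```python
def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate working memory for a single scrypt evaluation."""
    return 128 * n * r


mib = scrypt_memory_bytes(n=262144, r=8) / 2**20
print(mib, "MiB per password guess")
```

A GPU only pays off if it can run many guesses at once, and every parallel guess needs its own copy of that memory, which is how a card runs out of RAM.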


"salt":"cd623230c41b3c8a8a88547e150da6ca1653bff04951cedd13791243d910cb21"

Random salt for the kdf. Size must match the requirements of the KDF (key derivation function). Random number generated via crypto.getRandomBytes if nothing is supplied.


"mac":"cbe1a233297b97518efdbebe4a250bf5b29461537384f0d83d9f016c747eff5f"

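Keccak-256 (the pre-standardization SHA3 that Ethereum uses) of the concatenation of the last 16 bytes of the derived key together with the full ciphertext. When you unlock the file, this value is recomputed and compared with the stored one: if they don't match, the password is wrong (or the file is corrupted).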


I'm out of time...feel free to edit away.
