[Ethereum] Why don’t Ethereum addresses have checksums?


A checksum, like the one used in Bitcoin addresses, mainly lets software catch a mistyped or invalid address before a transaction using it is constructed.

Why don't Ethereum addresses have checksums? Was it an oversight that the designers, auditors, and the community missed until after Frontier was launched? (This question is more about the history than about ongoing and future efforts to rectify it.)

Best Answer

Edited to add: As predicted, with the launch of the Ethereum Name Service (ENS), users and wallets have gradually begun switching over to using strings like "mywallet.eth" instead of the raw hex addresses. Because that name was not known when this answer was written, the answer refers to the same concept as a "namereg".

I can elaborate on this a little bit, because it's not just the fact that end users are eventually expected to use human-readable strings for normal day-to-day transactions. It's that the raw hexadecimal string that you're calling an "Ethereum address" wasn't even intended to be the standard way of representing that information.

You may or may not know that when you send a bitcoin transaction to a "bitcoin address" such as 1Q2TWHE3GMdB6BZKafqwxXtWAWgFt5Jvm3, the actual transaction itself doesn't contain the string "1Q2TWHE3GMdB6BZKafqwxXtWAWgFt5Jvm3". Instead, your wallet decodes that representation into the real address 0xfc916f213a3d7f1369313d5fa30f6168f9446a2d, a pure hexadecimal representation that doesn't waste space on checksums and version bits. Look familiar?

It's true that the pure hexadecimal address itself doesn't contain any checksums. But there's nothing stopping you from writing software which uses the exact same method that Bitcoin does to create an encoding of that string in base 58 with a built-in version number and checksum. It would interoperate perfectly with the network by silently decoding the new "Ethereum address" into raw hexadecimal form. It could even accept both types of formats as long as you were careful to always include the "0x" on the front of the raw ones (which you should be doing anyways). Then you could send and receive with the exact same experience you have in Bitcoin. Perhaps with a different version number so that you don't accidentally mix up the addresses, though.
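To make the idea concrete, here is a minimal Python sketch of Bitcoin-style Base58Check applied to a 20-byte address. The version byte `ETH_VERSION` and the helper `parse_address` are hypothetical names invented for this illustration; no such Ethereum standard exists.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ETH_VERSION = 0x21  # hypothetical version byte, chosen only for this sketch


def _checksum(data: bytes) -> bytes:
    # Base58Check checksum: first 4 bytes of a double SHA-256.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version: int, payload: bytes) -> str:
    raw = bytes([version]) + payload
    raw += _checksum(raw)
    num = int.from_bytes(raw, "big")
    digits = ""
    while num > 0:
        num, rem = divmod(num, 58)
        digits = B58_ALPHABET[rem] + digits
    # Each leading zero byte is written as a literal '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + digits


def base58check_decode(text: str) -> tuple:
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)  # ValueError on a bad character
    pad = len(text) - len(text.lstrip("1"))
    raw = b"\x00" * pad + num.to_bytes((num.bit_length() + 7) // 8, "big")
    if len(raw) < 5 or _checksum(raw[:-4]) != raw[-4:]:
        raise ValueError("bad Base58Check checksum")
    return raw[0], raw[1:-4]  # (version byte, payload)


def parse_address(text: str) -> bytes:
    """Accept raw '0x...' hex or the hypothetical Base58Check form."""
    if text.startswith("0x"):
        addr = bytes.fromhex(text[2:])
    else:
        version, addr = base58check_decode(text)
        if version != ETH_VERSION:
            raise ValueError("not an address of the expected type")
    if len(addr) != 20:
        raise ValueError("address must be 20 bytes")
    return addr
```

If the Bitcoin address quoted earlier is transcribed correctly, `base58check_decode("1Q2TWHE3GMdB6BZKafqwxXtWAWgFt5Jvm3")` should return version 0 together with the 20 raw bytes shown in hexadecimal above. A mistyped character makes the checksum comparison fail, which is exactly the protection the question asks about.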

Vitalik has already pointed out one reason nobody bothered to do this for most Frontier apps. But there's another one, much more relevant. Ethereum apps don't take the Bitcoin approach because there is an even more featureful way of representing raw Ethereum addresses, called the ICAP (Inter exchange Client Address Protocol), which looks like this: "XE7338O073KYGTWWZN0F2WZ0R8PX5ZPPZS". Like the standard Bitcoin address representation, it uses a wider range of alphanumeric characters to save space and includes a checksum. But that's not all, folks!

For one thing, the ICAP is a fully valid International Bank Account Number (or IBAN). That means that existing bank software can understand it and interact with it.

For another, the ICAP doesn't have to use hexadecimal addresses. Instead, once we all do switch over to using namereg contracts, it can just use your actual human-readable string to end up with something like "XE81ETHXREGJEFFCOLEMAN", which still matches bank formats but might be possible to actually remember!
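Because an ICAP string is an IBAN, its two check digits (the "73" or "81" after "XE") follow the standard IBAN mod-97 rule. The sketch below shows that rule in Python; the function names are made up for this illustration, and the hex decoding assumes the form of ICAP in which the part after the check digits is the address written in base 36.

```python
def _iban_remainder(iban: str) -> int:
    # Move the country code and check digits to the end, then map
    # 0-9 to 0-9 and A-Z to 10-35 and read the result as one big integer.
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97


def iban_is_valid(iban: str) -> bool:
    iban = iban.replace(" ", "").upper()
    if len(iban) < 5 or not (iban.isascii() and iban.isalnum()):
        return False
    return _iban_remainder(iban) == 1


def iban_check_digits(country: str, bban: str) -> str:
    # Compute the check digits for a country code and account part.
    return f"{98 - _iban_remainder(country + '00' + bban):02d}"


def icap_to_hex(icap: str) -> str:
    """Decode an ICAP whose body is a base-36 encoded 20-byte address."""
    icap = icap.replace(" ", "").upper()
    if not icap.startswith("XE") or not iban_is_valid(icap):
        raise ValueError("not a valid XE-prefixed IBAN")
    body = icap[4:]
    if len(body) not in (30, 31):
        raise ValueError("body is not a base-36 encoded address")
    return "0x" + format(int(body, 36), "040x")
```

A name-based string such as the "XE81ETHXREG..." example is checked by `iban_is_valid` in the same way, but it has to be resolved through the name registry rather than decoded with `icap_to_hex`.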

Support for the ICAP is gradually growing, including within the official Ethereum clients. Perhaps one day soon, it will no longer be the case that the most common representation of an Ethereum address lacks a checksum!

Edit: As of February 2016, Vitalik has also implemented a transitional checksumming method where capitalisation of the otherwise case-insensitive hex address is used to provide some additional protection against accidental errors while remaining backwards compatible with software that doesn't support the checksum (and will ignore the case differences). Anyone developing software that supports inputting or displaying a raw hexadecimal encoding is strongly advised to implement this "capitals-based checksum" method.

With Vitalik's method, the address:


is compared against the raw binary keccak-256 hash of the address bytes: where a letter sits in the same position as a "1" bit, the letter is capitalised (letters in the position of a "0" bit are left in lowercase, and digits are unchanged). This results in:


Almost all non-checksum-aware code will simply ignore the case differences above and interpret this representation identically to the first one, so there is very little disadvantage to implementing the capitals-based checksum.
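For anyone implementing this, here is a Python sketch of the capitals-based checksum in the form later standardised as EIP-55. Note that this differs in detail from the description above: EIP-55 hashes the lowercase hex string (as ASCII text) and capitalises a letter when the hex digit of the hash at the same position is 8 or higher. Python's standard library has no original Keccak-256 (`hashlib.sha3_256` uses different padding), so a small, unoptimised Keccak is included; the function names are my own.

```python
MASK64 = (1 << 64) - 1


def _rol64(a: int, n: int) -> int:
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64


def _keccak_f1600(lanes):
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated by an LFSR)
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    rate = 136  # bytes absorbed per block for a 256-bit digest
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01  # original Keccak padding, not the SHA-3 suffix
    padded[-1] ^= 0x80
    state = bytearray(200)
    for off in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[off + i]
        lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = _keccak_f1600(lanes)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return bytes(state[:32])


def to_checksum_address(address: str) -> str:
    body = address[2:] if address[:2] in ("0x", "0X") else address
    body = body.lower()
    if len(body) != 40 or any(ch not in "0123456789abcdef" for ch in body):
        raise ValueError("expected 40 hex digits")
    digest = keccak256(body.encode("ascii")).hex()
    return "0x" + "".join(
        ch.upper() if int(digest[i], 16) >= 8 else ch for i, ch in enumerate(body)
    )


def has_valid_checksum(address: str) -> bool:
    body = address[2:] if address[:2] in ("0x", "0X") else address
    if body in (body.lower(), body.upper()):
        return True  # single-case input carries no checksum; accept as legacy
    return "0x" + body == to_checksum_address(body)
```

With this sketch, `to_checksum_address("0x5aaeb6053f3e94c9b9a09f33669435e7ef1beaed")` returns `0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed`, one of the test vectors listed in EIP-55. In production code a maintained Keccak implementation would be a better choice than this hand-rolled one.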