I understand that Merkle tree are Hashes of Hashes, they have the advantage that you can verify only a subtree. But what about Patricia? What does a trie mean? And how is it used in Ethereum?
[Ethereum] ELI5 How does a Merkle-Patricia-trie tree work
blockchainmerkle-patricia-tries
Related Solutions
I think, I have an idea. An important property of the Modified Merkle Patricia Tree is that it’s versioned. You can take a tree version 1 with root hash H1
, insert new key-value pairs into the tree, and you will get a new tree version 2 with root hash H2
. However, H1
will still be in the database, will still be accessible, and will still denote the complete tree version 1. This stricture is also very space-efficient, because most of the nodes will stay the same and will be shared between tree versions. Only the nodes along the path of added key-value pairs will be new.
Now, branch nodes are much bigger (17 elements) than extension nodes (2 elements). (This is, as I understand, the only way to distinguish one from another.) Because of that, we would like to try to keep them unmodified as much as we can when updating the tree to a new version.
Imagine now that you have a combined extension-branch node N1 that has a prefix 12345
. We are inserting a new value with the key 123ffaaee
. We would have to:
- Replace N1 with a new branch node N2 with prefix
123
- Insert a new leaf node N3 for N2 branch
f
(with prefixfaaee
) - Copy N1 to N4, modify only its prefix from
12345
to5
and attach it to N2 branch4
.
Now we have two big extension-branch nodes, N1 and N4, with the same branching part and only different prefixes. If we would have used separate extension and branch nodes, we wouldn’t have to duplicate the branch node, only its extension parent node would have been split into two.
An important architectural decision here is that we much-much-much less care about the performance of the tree than about its footprint (which only grows in size with every transaction, has to be kept on every full Ethereum node, and therefore pose a threat to the scalability of Ethereum).
However, I also see a flaw in my logic: if there is a branch node with a prefix, big chances are that it does not have a lot of branches. In which case it will not have a large footprint, because the RLP
encoding will encode every null
branch with a single byte.
Best Answer
Trie (also called digital tree, prefix trie or radix trie)
An ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. A node's position in the tree defines the key with which it is associated. https://en.wikipedia.org/wiki/Trie
A trie for keys "A","to", "tea", "ted", "ten", "i", "in", and "inn".
Patricia - Practical Algorithm To Retrieve Information Coded In Alphanumeric (source)(orginial paper by Donald R. Morrison). A Patricia trie is a binary radix trie - binary choice at each node when traversing the trie; this is modified in Ethereum.
Source: https://en.wikipedia.org/wiki/File:An_example_of_how_to_find_a_string_in_a_Patricia_trie.png
In ethereum, hexadecimal is used - X characters from an 16 character "alphabet". Hence nodes in the trie have 16 child nodes (the 16 character hex "alphabet") and a maximum depth of X. Note a hex character is referred to as a "nibble".
Merkle Patricia Trie
As described here, the term Merkle implies that
Ethereum Modified Merkle Patricia Trie
The yellow paper describes a modified merkle patricia trie. This defines three different node types; extension, branch and leaf. These are descibed, using a simplified world state, in the diagram below: