Transaction tries and transaction receipt tries are indeed independent data structures. Each has its own root stored in the block header, and they differ in both purpose and content.
Purpose:
Content:
Parameters used in composing a Transaction Trie [details in section 4.3 of the yellow paper]:
- nonce,
- gas price,
- gas limit,
- recipient,
- transfer value,
- transaction signature values, and
- account initialization (if transaction is of contract creation type), or transaction data (if transaction is a message call)
Parameters used in composing a Transaction Receipt Trie [details in section 4.4.1 of the yellow paper]:
- post-transaction state,
- the cumulative gas used,
- the set of logs created through execution of the transaction, and
- the Bloom filter composed from information in those logs
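To make the transaction list concrete, here is a minimal sketch of how one trie entry could be built. It assumes the original (legacy, pre-typed-transaction) layout, where both tries are keyed by the RLP-encoded index of the transaction within the block and the value is the RLP-encoded record. The names `rlp_encode`, `trie_key` and `LegacyTransaction` are made up for illustration and are not geth's API.

```python
from dataclasses import dataclass

def rlp_encode(item):
    """Minimal RLP encoder for bytes, non-negative ints and (nested) lists."""
    if isinstance(item, int):
        # Integers are big-endian byte strings without leading zeros; 0 is empty.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte encodes as itself
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload

def _length_prefix(length, offset):
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes

@dataclass
class LegacyTransaction:  # hypothetical container for the fields listed above
    nonce: int
    gas_price: int
    gas_limit: int
    to: bytes      # empty for contract creation
    value: int
    data: bytes    # init code (creation) or call data (message call)
    v: int         # signature values
    r: int
    s: int

    def encode(self):
        return rlp_encode([self.nonce, self.gas_price, self.gas_limit, self.to,
                           self.value, self.data, self.v, self.r, self.s])

def trie_key(index):
    """Key of the index-th transaction (or receipt) in its trie."""
    return rlp_encode(index)

tx = LegacyTransaction(0, 1, 21000, bytes(20), 5, b"", 27, 1, 1)
entry = (trie_key(0), tx.encode())  # (key, value) pair that goes into the trie
```

The receipt trie uses the same keys, with the encoded receipt as the value instead of the encoded transaction.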
Firstly, you'll want to take a look at this picture from a previous question for reference.
Q1 - Where does geth store these tries - I reckon within the blocks
themselves, i.e. chaindata folder, within the ldb files? Or am I
entirely off-base?
The chain data isn't actually part of the block proper; it's stored separately in a LevelDB database. On your machine, this is what's inside the chaindata folder. The block itself stores only the hashes of the roots of the various tries, the state trie (i.e. the chain data) being one of these.
See:
Q2 - (I am assuming the state trie is indeed kept in each block) Does
the state trie of block N references the state trie of block N-1 (if
no changes to an account's state is found, i.e. only log accounts with
differences), or is the state trie duplicated across blocks?
Your first assumption isn't correct, as per Q1, but yes, the state trie references backwards to prevent duplication. This picture from this previous answer helps visualise this:
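As a rough illustration of that backward referencing, here is a toy sketch. It is not geth's real trie: it assumes a fixed-depth binary trie, SHA-256 in place of Keccak-256, and made-up helpers `store` and `put`. Nodes live in a dict keyed by the hash of their contents, so an unchanged subtree keeps its hash and is simply referenced again by the next block's root.

```python
import hashlib

def store(db, node):
    """Save a node under the hash of its contents and return that hash."""
    key = hashlib.sha256(repr(node).encode()).digest()
    db[key] = node
    return key

def put(db, root, path, value):
    """Return the root hash of a new trie with `value` at `path` (a tuple of 0/1 bits)."""
    if not path:
        return store(db, ("leaf", value))
    # Copy the child hashes of the old branch; an absent subtree is None.
    children = list(db[root][1:]) if root is not None else [None, None]
    children[path[0]] = put(db, children[path[0]], path[1:], value)
    return store(db, ("branch", *children))

db = {}
root_prev = None  # state root of "block N-1"
for path, account in [((0, 0), "alice:10"), ((0, 1), "bob:20"),
                      ((1, 0), "carol:30"), ((1, 1), "dave:40")]:
    root_prev = put(db, root_prev, path, account)

nodes_before = len(db)
root_next = put(db, root_prev, (1, 1), "dave:45")  # "block N" changes one account
new_nodes = len(db) - nodes_before

# The untouched left subtree is shared between both roots.
left_prev, left_next = db[root_prev][1], db[root_next][1]
print(new_nodes, left_prev == left_next)  # prints: 3 True
```

Only the nodes on the path from the changed leaf up to the root are written again; everything else is reused by hash.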
Q3 - While state tree pruning, I reckon that the states are pruned off
since if we have the stateRoot, this effectively verifies that the
state trie nodes are OK, hence safe to discard the state trie
themselves - Would this understanding be correct from high level
perspective?
I'm not entirely sure, but this previous official blog post might help:
https://blog.ethereum.org/2015/06/26/state-tree-pruning/
Q4 - What would the .\ethereum\nodes folder be for?
It's a database of the network nodes (peers) your node knows about. The entries are RLP-encoded blobs, so it's not readily readable. For further details, see "Format of LevelDB files in nodes directory? Trouble pulling contents with python leveldb API".
Best Answer
Low level geth database format is:
To get data out, you have to recursively rebuild the tries from this data. Knowing the hash of the state root, you can look up the state root node. That node gives you the hashes of its children, so you can look those up too, and so on down to the leaves.
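The recursive lookup can be sketched like this. It is a simplified model, not the actual geth node format: it assumes a plain dict standing in for LevelDB, SHA-256 in place of Keccak-256, and nodes that are either `("leaf", value)` or `("branch", child_hash, ...)`. The helper names are hypothetical.

```python
import hashlib

def node_hash(node):
    """Stand-in for hashing an encoded node (geth uses Keccak-256 over RLP)."""
    return hashlib.sha256(repr(node).encode()).digest()

def collect_leaves(db, h):
    """Resolve hash -> node, then recurse into the child hashes until leaves are reached."""
    node = db[h]
    if node[0] == "leaf":
        return [node[1]]
    leaves = []
    for child in node[1:]:
        if child is not None:
            leaves.extend(collect_leaves(db, child))
    return leaves

# Build a tiny hash-keyed store by hand, bottom-up.
db = {}

def add(node):
    h = node_hash(node)
    db[h] = node
    return h

a = add(("leaf", "account A"))
b = add(("leaf", "account B"))
c = add(("leaf", "account C"))
left = add(("branch", a, b))
root = add(("branch", left, c))

# Starting from the root hash alone, every leaf can be recovered.
leaves = collect_leaves(db, root)
```

The only thing needed from outside the database is the root hash, which is what the block header provides.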
Depending on the geth options
--syncmode full|fast|light and --gcmode full|archive
(you can also specify how many blocks you want to remember), geth keeps or discards some of the tries. The different tries are the world state trie (links to accounts), storage tries (account data), and receipt tries (for transaction receipts).
To look up a value such as "sample value" in a trie (for example, by contract address), you walk down the trie along the path given by sha3("sample value"). The hash is 32 bytes long, so the path is at most 64 nibbles (hex characters) deep.
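Here is a small sketch of how such a path could be derived. One loud assumption: Python's `hashlib.sha3_256` is NIST SHA-3, which is not the same function as Ethereum's Keccak-256, so the digits will differ from what geth computes. It is used only as a stand-in with the same 32-byte output size, and `trie_path` is a made-up name.

```python
import hashlib

def trie_path(key):
    """Split the 32-byte hash of `key` into 64 nibbles, one per trie level."""
    digest = hashlib.sha3_256(key).digest()  # stand-in, NOT Ethereum's Keccak-256
    nibbles = []
    for byte in digest:
        nibbles.append(byte >> 4)    # high half of the byte
        nibbles.append(byte & 0x0F)  # low half of the byte
    return nibbles

path = trie_path(b"sample value")
# Each nibble (0..15) selects one of the 16 children of a branch node.
```

In the real trie, runs of nibbles shared by no other key are compressed into extension and leaf nodes, so a lookup usually touches far fewer than 64 nodes.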
To better understand which data is stored in the db and how the tries are built, look at these two pictures: