Code runs on the blockchain in the form of smart contracts. Each smart contract has an address, storage, and code. When a transaction is sent to a contract's address, its code is run on every node, inside the Ethereum Virtual Machine (EVM), and the contract can send Ether, call other contracts, and write to its own storage.
Scaling is handled through the use of gas. Every computational step, even sending a regular transaction or creating a contract, requires some amount of gas (operations that free storage, such as deleting stored data, earn a gas refund instead). The gas is paid for along with the transaction and goes to the miner. If a transaction runs out of gas, all of its state changes are rolled back, but the gas it consumed is still paid.
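As a rough illustration of the rollback rule above, here is a minimal Python sketch. It is not how the EVM is implemented: the `execute` function, the dictionary used as state, and the cost of 20000 gas per write are all assumptions made for the example.

```python
def execute(state, steps, gas_limit):
    """Apply (gas_cost, key, value) steps to a copy of `state`.

    Returns (resulting_state, gas_charged). The costs are hypothetical;
    real opcode costs are defined by the protocol.
    """
    working = dict(state)  # work on a copy so a failure leaves `state` untouched
    gas_used = 0
    for cost, key, value in steps:
        if gas_used + cost > gas_limit:
            # Out of gas: discard every change, but all supplied gas is spent.
            return dict(state), gas_limit
        gas_used += cost
        working[key] = value
    return working, gas_used


state = {"owner": "alice"}
steps = [(20000, "a", 1), (20000, "b", 2)]  # two hypothetical storage writes

print(execute(state, steps, gas_limit=50000))  # both writes applied
print(execute(state, steps, gas_limit=30000))  # second write fails, state unchanged
```

In the second call the first write had already succeeded, yet it is undone as well, which is the all-or-nothing behaviour the answer describes.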
When a transaction is sent, the user specifies the maximum amount of gas they are willing to spend, and the price (in Ether per unit of gas) they are willing to pay for it. Miners can then choose to process the transactions that offer the highest gas price first.
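The arithmetic can be sketched in a few lines of Python. This is only an illustration: the `Tx` class, the greedy `pick_transactions` strategy, and the `block_gas_limit` parameter are assumptions for the example, and real miners are free to use any selection policy.

```python
from dataclasses import dataclass

WEI_PER_ETHER = 10**18


@dataclass
class Tx:
    sender: str
    gas_limit: int  # maximum gas the sender is willing to spend
    gas_price: int  # price per unit of gas, in wei


def max_fee(tx):
    """Upper bound on what the sender can be charged, in wei."""
    return tx.gas_limit * tx.gas_price


def pick_transactions(pool, block_gas_limit):
    """Greedy policy: highest gas price first, while the block has room."""
    chosen, remaining = [], block_gas_limit
    for tx in sorted(pool, key=lambda t: t.gas_price, reverse=True):
        if tx.gas_limit <= remaining:
            chosen.append(tx)
            remaining -= tx.gas_limit
    return chosen


pool = [
    Tx("a", 21000, 2 * 10**9),
    Tx("b", 21000, 5 * 10**9),
    Tx("c", 100000, 3 * 10**9),
]
print(max_fee(pool[0]) / WEI_PER_ETHER, "ether at most for a's transaction")
print([tx.sender for tx in pick_transactions(pool, block_gas_limit=125000)])
```

With a hypothetical block gas limit of 125000, the two highest-priced transactions fill the block and the cheapest one has to wait.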
Miners also get to choose the block gas limit
The gas limit determines how much computation and storage can be used per block. For each block, the miner sets the new gas limit, but may only raise or lower it by at most 1/1024 of the previous block's limit.
This means that miners are able to vote to keep the block sizes small enough to process, and the price of computation is flexible, depending on supply and demand.
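To get a feel for how slowly the limit can move, here is a small Python sketch of the 1/1024 rule. The function name `clamp_gas_limit` is made up for the example, and the sketch treats the bound as inclusive and ignores any minimum gas limit; the exact comparison the protocol uses is not stated above.

```python
def clamp_gas_limit(parent_limit, desired_limit):
    """Return the closest allowed limit to `desired_limit`.

    Assumption: the limit may move by at most parent_limit // 1024 per block.
    """
    max_step = parent_limit // 1024
    low = parent_limit - max_step
    high = parent_limit + max_step
    return max(low, min(high, desired_limit))


# How many consecutive blocks would miners need to double the limit?
limit, target, blocks = 8_000_000, 16_000_000, 0
while limit < target:
    limit = clamp_gas_limit(limit, target)
    blocks += 1
print(blocks, "blocks to go from 8,000,000 to", limit)
```

Even if every miner votes the same way, doubling the limit takes on the order of 700 blocks, which is why the limit changes gradually.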
Other scaling techniques
Off-chain computation:
- By using computation markets, code execution can be done off-chain by default, and only verified on-chain if there is a dispute.
State Channels:
- Similar to payment channels, parties lock some part of the state into a contract that requires signatures from all parties to edit. Parties transact amongst themselves and only submit the updated state if there is a dispute.
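The state channel idea can be sketched as follows. This is a toy model under stated assumptions, not a real channel implementation: HMAC with per-party secrets stands in for real digital signatures (a real contract would check signatures against public keys or addresses), the `ChannelContract` class stands in for the on-chain contract, and the `nonce` field used to order states is an assumption of this sketch.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, state: dict) -> str:
    """Stand-in for a digital signature over a canonical encoding of the state."""
    payload = json.dumps(state, sort_keys=True).encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


class ChannelContract:
    """Toy stand-in for the contract that holds the locked state."""

    def __init__(self, secrets, opening_state):
        self.secrets = secrets  # party name -> key (hypothetical setup)
        self.state = opening_state

    def submit(self, state, signatures):
        # Every party must have signed exactly this state.
        for party, secret in self.secrets.items():
            expected = sign(secret, state)
            if not hmac.compare_digest(expected, signatures.get(party, "")):
                return False
        # Only a strictly newer state may replace the current one.
        if state["nonce"] <= self.state["nonce"]:
            return False
        self.state = state
        return True


secrets = {"alice": b"alice-key", "bob": b"bob-key"}
contract = ChannelContract(secrets, {"nonce": 0, "alice": 5, "bob": 5})

# Off-chain: the parties exchange signed updates without touching the chain.
s1 = {"nonce": 1, "alice": 4, "bob": 6}
s2 = {"nonce": 2, "alice": 2, "bob": 8}
sigs1 = {p: sign(k, s1) for p, k in secrets.items()}
sigs2 = {p: sign(k, s2) for p, k in secrets.items()}

# On dispute: the latest fully signed state wins; a stale one is rejected.
print(contract.submit(s2, sigs2))
print(contract.submit(s1, sigs1))
```

Only the final `submit` calls would cost gas; every intermediate update stays between the parties.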
More theoretical stuff:
Short answer: don't store chunks of files in the blockchain. It's not well suited to this purpose.
Better answer: it helps to consider separate concerns:
- Smart Contracts. Minimalist data that must be true for all users at all times. Focus is on fidelity, as it's extremely difficult to post false data into a contract that's careful about updates.
- Distributed Storage of large objects. Swarm, Storj and IPFS are examples of distributed object storage or distributed file systems. They use various strategies for smearing out copies and shards across participating nodes, but they are not putting all things in all places at all times.
At a very high level, we can consider a combination of these approaches. A distributed storage system can provide resilient storage of a blob, media, or JSON object with a lot of data, known by its file name or path: a unique identifier.
A Smart Contract can track the identifier of the object that is currently valid. It can also hold a validation hash, useful for confirming that the correct data has been loaded from the "other" source.
For example, one might have a contract that says: "There is a movie called HomeMovie, located at 'url ...', its hash is 0x456, and this is the valid object as of this time." If an authorized user changes the file (version), they would update the Smart Contract: "Now the movie we're using is HomeMovie2, located at 'new url ...', and its hash is 0x567."
This keeps the Ethereum storage to a minimum, while providing a way to confirm no corruption of the data in the overall system. There is an extensive assortment of projects working toward a unified view of things; that is, one interface that does it all.
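Here is a minimal Python sketch of that pattern, with a plain class standing in for the contract. Everything named here is hypothetical: the `ObjectRegistry` class, the `"movie"` key, and the placeholder location strings. SHA-256 is used only because it ships with Python; an Ethereum contract would more likely work with keccak256 or the storage system's own content hash.

```python
import hashlib


class ObjectRegistry:
    """Toy stand-in for a contract that tracks the currently valid object."""

    def __init__(self, owner):
        self.owner = owner
        self.records = {}  # key -> (name, location, content hash)

    def set_current(self, caller, key, name, location, content_hash):
        if caller != self.owner:
            raise PermissionError("only the owner may update the registry")
        self.records[key] = (name, location, content_hash)

    def verify(self, key, data: bytes) -> bool:
        """Check bytes fetched from off-chain storage against the stored hash."""
        _, _, expected = self.records[key]
        return hashlib.sha256(data).hexdigest() == expected


registry = ObjectRegistry(owner="alice")

v1 = b"bytes of the first movie file"
registry.set_current("alice", "movie", "HomeMovie", "<location of HomeMovie>",
                     hashlib.sha256(v1).hexdigest())
print(registry.verify("movie", v1))           # data matches the stored hash
print(registry.verify("movie", b"tampered"))  # corrupted or wrong data

v2 = b"bytes of the second movie file"
registry.set_current("alice", "movie", "HomeMovie2", "<location of HomeMovie2>",
                     hashlib.sha256(v2).hexdigest())
print(registry.verify("movie", v1))           # old version is no longer valid
```

The registry stores only a name, a location, and a fixed-size hash per object, regardless of how large the file itself is.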
Hope it helps.
Update: See Blog: Simple Storage Patterns in Solidity
Best Answer
The state gets stored in the blockchain, yes. You are also correct in saying that every full node has a copy of the entire blockchain, including all of the state.
A miner's job is to solve the PoW puzzle. Doing so gives them the right to mine the block, and they choose which transactions to include in that specific block. These transactions define all the state changes that were made, so the miner does not decide the outcome of those changes; it follows from executing the transactions.
In terms of the ASIC question, Ethereum's PoW was designed to be memory-hard, meaning that it is difficult to build ASICs that mine it efficiently.
See my answer above as the answer to this question as well. To summarize, the miner simply chooses which transactions to include. The reason a transaction's effects are not applied multiple times is that only one miner finds the block first and gets rewarded for it. That miner's block, containing the transaction and its contract calls, is the one broadcast to the Ethereum network; other nodes re-run it only to verify the block.
See answer one. You are correct—blockchains (as they stand) are quite inefficient and slow, and each node does hold a copy of this state.
It does not alter the blockchain, as that is impossible. What it does do (without getting too technical) is prevent anybody from calling that contract again. The best example to look at is the Parity wallet exploit, as it is a critical example of the usage of selfdestruct.