Etherscan – Discrepancy Between Opcode View and Opcode Disassembler for USDT Contract

Tags: bytecode, contract-debugging, etherscan, evm, opcode

We have the USDT smart contract, which is located at 0xdac17f958d2ee523a2206206994597c13d831ec7. I am trying to decompile it, or at least view it as opcodes. What I see is that its bytecode begins with
60606040526000 (ABI on etherscan)

[0] PUSH1 0x60 -> 6060
[2] PUSH1 0x40 -> 6040
[4] MSTORE -> 52
[5] PUSH1 0x00 -> 6000

and the opcode tool says the same. But if you switch to the opcode view on the contract page, you will see the following opcodes:

PUSH1 0x60
PUSH1 0x40
MSTORE
PUSH1 0x04
CALLDATASIZE
LT
PUSH2 0x0196
JUMPI
PUSH1 0x00
CALLDATALOAD

which are not the same as the output of the opcode tool:

[1] PUSH1 0x60
[3] PUSH1 0x40
[4] MSTORE
[6] PUSH1 0x00
[7] DUP1
[9] PUSH1 0x14
[12] PUSH2 0x0100
[13] EXP
[14] DUP2
[15] SLOAD
[16] DUP2

The question is: why do these differ, and which one is right? I guess the second variant is right according to the docs, but can Etherscan be mistaken? By the way, if you know of any docs on the structure of a compiled smart contract, it would be nice to share them so that I and everyone else could understand it better.

Best Answer

Your "opcode tool" link leads to a completely different address (0x9e1b57fc92eba6434251a8458811c32690f32c45). If you check opcodes for your original address, you'll see they're the same:

0xdac17f958d2ee523a2206206994597c13d831ec7 (code)

PUSH1 0x60
PUSH1 0x40
MSTORE
PUSH1 0x04
CALLDATASIZE
LT
PUSH2 0x0196
JUMPI
PUSH1 0x00
CALLDATALOAD
PUSH29 0x0100000000000000000000000000000000000000000000000000000000
SWAP1
DIV
...

0xdac17f958d2ee523a2206206994597c13d831ec7 (disassembler)

[1] PUSH1 0x60
[3] PUSH1 0x40
[4] MSTORE
[6] PUSH1 0x04
[7] CALLDATASIZE
[8] LT
[11] PUSH2 0x0196
[12] JUMPI
[14] PUSH1 0x00
[15] CALLDATALOAD
[45] PUSH29 0x0100000000000000000000000000000000000000000000000000000000
[46] SWAP1
[47] DIV 
...
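
To check a listing like this by hand, a minimal disassembler sketch in Python can help. This is an illustration, not Etherscan's implementation: the `disassemble` function and the small `NAMES` table are made up for this example, and the hex prefix is reconstructed from the opcodes listed above rather than fetched from the chain. It reports both the start and the end offset of each instruction, which suggests that the disassembler's bracketed numbers are the offset of the last byte of each instruction.

```python
# Minimal EVM disassembler sketch (illustrative; not Etherscan's code).
# Only the opcodes that occur in this prefix are named; anything else
# is printed as UNKNOWN_0x..
NAMES = {
    0x04: "DIV", 0x10: "LT", 0x35: "CALLDATALOAD", 0x36: "CALLDATASIZE",
    0x52: "MSTORE", 0x57: "JUMPI", 0x90: "SWAP1",
}


def disassemble(code: bytes):
    """Yield (start_offset, end_offset, text) for each instruction."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:           # PUSH1..PUSH32
            n = op - 0x5F                # number of immediate bytes
            imm = code[pc + 1 : pc + 1 + n]
            text = f"PUSH{n} 0x{imm.hex()}"
        else:
            n = 0
            text = NAMES.get(op, f"UNKNOWN_0x{op:02x}")
        yield pc, pc + n, text
        pc += 1 + n                      # skip the opcode and its immediate


# Hex prefix rebuilt from the opcode listing above (an assumption of this
# sketch, not data downloaded from Etherscan).
PREFIX = bytes.fromhex("60 60 60 40 52 60 04 36 10 61 01 96 57 60 00 35")

for start, end, text in disassemble(PREFIX):
    print(f"start={start:2d} end={end:2d} {text}")
```

The start offsets (0, 2, 4, 5, ...) correspond to the numbering used in the question's manual decoding, while the end offsets (1, 3, 4, 6, ...) correspond to the numbers in the disassembler output.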

By the way, if you know any docs on compiled smart contract structure it'd be nice to share, so I and everyone could understand it better.

Check out this multi-part article series from OpenZeppelin: Deconstructing a Solidity Contract, Part I: Introduction.

Note that this just describes the bytecode produced by the Solidity compiler. Currently the EVM does not enforce any structure, so different compilers could do it differently. All the EVM does is start executing the binary blob at position 0 and go wherever the jumps take it.

Some parts of the binary might never be executed. You can, for example, append random junk to any valid bytecode and it will remain valid (and that part won't be executed because there can be no jumps to it). solc uses this fact to create code/data sections (sub-assemblies/sub-objects) and to add a metadata hash at the end. For example, the runtime code to be deployed is a sub-assembly. If your contract deploys other contracts with new, the bytecode of each of these contracts also gets a separate sub-assembly.
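
As a rough illustration of the trailing metadata, here is a sketch that splits it off. It assumes the layout described in the Solidity documentation, where the last two bytes of the bytecode give the big-endian length of the CBOR-encoded metadata that precedes them. The function name `split_solc_metadata` and the example bytes are hypothetical, and the 32-byte hash is a zero placeholder, not a real one. On bytecode that was not produced by solc this heuristic can easily misfire.

```python
def split_solc_metadata(runtime: bytes):
    """Split bytecode into (code, cbor_metadata).

    Assumes solc's trailer layout: <code> <CBOR blob> <2-byte blob length>.
    Returns (runtime, b"") when no plausible trailer is found.
    """
    if len(runtime) < 2:
        return runtime, b""
    meta_len = int.from_bytes(runtime[-2:], "big")   # length of the CBOR blob
    if meta_len + 2 > len(runtime):
        return runtime, b""                          # length does not fit
    cut = len(runtime) - (meta_len + 2)
    return runtime[:cut], runtime[cut:-2]


# Hypothetical example: 4 bytes of code followed by an old-style trailer
# (a1 65 'bzzr0' 58 20 <32-byte hash>) with a zeroed placeholder hash.
CODE = bytes.fromhex("600080fd")
META = bytes.fromhex("a165627a7a72305820") + bytes(32)
EXAMPLE = CODE + META + len(META).to_bytes(2, "big")

code, meta = split_solc_metadata(EXAMPLE)
print(code.hex(), len(meta))
```

Since the EVM never jumps into the trailer, removing or changing it does not affect execution, but it does change the bytecode, which matters when comparing deployed code against a local build.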

This free-form structure has its downsides and is going to change in the future. See EIP-3540: EVM Object Format (EOF) v1, which is a new standard that will make it more rigid.

Related Solutions

Ethereum EVM – List of Available OPCODES

All of the opcodes and their complete descriptions are available in the Ethereum Yellow Paper. For convenience, though, I've made a handy reference list of them all:

0s: Stop and Arithmetic Operations

0x00    STOP        Halts execution
0x01    ADD         Addition operation
0x02    MUL         Multiplication operation
0x03    SUB         Subtraction operation
0x04    DIV         Integer division operation
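0x05    SDIV        Signed integer division operation (truncated)
0x06    MOD         Modulo remainder operation
0x07    SMOD        Signed modulo remainder operation
0x08    ADDMOD      Modulo addition operation
0x09    MULMOD      Modulo multiplication operation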
0x0a    EXP         Exponential operation
0x0b    SIGNEXTEND  Extend length of two's complement signed integer

10s: Comparison & Bitwise Logic Operations

0x10    LT      Less-than comparison
0x11    GT      Greater-than comparison
0x12    SLT     Signed less-than comparison
0x13    SGT     Signed greater-than comparison
0x14    EQ      Equality comparison
0x15    ISZERO  Simple not operator
0x16    AND     Bitwise AND operation
0x17    OR      Bitwise OR operation
0x18    XOR     Bitwise XOR operation
0x19    NOT     Bitwise NOT operation
0x1a    BYTE    Retrieve single byte from word

20s: SHA3

0x20    SHA3    Compute Keccak-256 hash

30s: Environmental Information

0x30    ADDRESS         Get address of currently executing account
0x31    BALANCE         Get balance of the given account
0x32    ORIGIN          Get execution origination address
0x33    CALLER          Get caller address. This is the address of the account that is directly responsible for this execution
0x34    CALLVALUE       Get deposited value by the instruction/transaction responsible for this execution
0x35    CALLDATALOAD    Get input data of current environment
0x36    CALLDATASIZE    Get size of input data in current environment
0x37    CALLDATACOPY    Copy input data in current environment to memory This pertains to the input data passed with the message call instruction or transaction
0x38    CODESIZE        Get size of code running in current environment
0x39    CODECOPY        Copy code running in current environment to memory
0x3a    GASPRICE        Get price of gas in current environment
0x3b    EXTCODESIZE     Get size of an account's code
0x3c    EXTCODECOPY     Copy an account's code to memory

40s: Block Information

0x40    BLOCKHASH   Get the hash of one of the 256 most recent complete blocks
0x41    COINBASE    Get the block's beneficiary address
0x42    TIMESTAMP   Get the block's timestamp
0x43    NUMBER      Get the block's number
0x44    DIFFICULTY  Get the block's difficulty
0x45    GASLIMIT    Get the block's gas limit

50s: Stack, Memory, Storage and Flow Operations

0x50    POP         Remove item from stack
0x51    MLOAD       Load word from memory
0x52    MSTORE      Save word to memory
0x53    MSTORE8     Save byte to memory
0x54    SLOAD       Load word from storage
0x55    SSTORE      Save word to storage
0x56    JUMP        Alter the program counter
0x57    JUMPI       Conditionally alter the program counter
0x58    PC          Get the value of the program counter prior to the increment
0x59    MSIZE       Get the size of active memory in bytes
0x5a    GAS         Get the amount of available gas, including the corresponding reduction
0x5b    JUMPDEST    Mark a valid destination for jumps

60s & 70s: Push Operations

0x60    PUSH1   Place 1 byte item on stack
0x61    PUSH2   Place 2-byte item on stack
…
0x7f    PUSH32  Place 32-byte (full word) item on stack

80s: Duplication Operations

0x80    DUP1    Duplicate 1st stack item
0x81    DUP2    Duplicate 2nd stack item
…
0x8f    DUP16   Duplicate 16th stack item

90s: Exchange Operations

0x90    SWAP1   Exchange 1st and 2nd stack items
0x91    SWAP2   Exchange 1st and 3rd stack items
…
0x9f    SWAP16  Exchange 1st and 17th stack items

a0s: Logging Operations

0xa0    LOG0    Append log record with no topics
0xa1    LOG1    Append log record with one topic
…
0xa4    LOG4    Append log record with four topics
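
The PUSH, DUP, SWAP and LOG rows above are regular enough that the mnemonic can be computed from the opcode byte. A small sketch, where the helper name `family_mnemonic` is made up for this example:

```python
def family_mnemonic(op: int):
    """Return the mnemonic for PUSH/DUP/SWAP/LOG opcodes, else None."""
    if 0x60 <= op <= 0x7F:
        return f"PUSH{op - 0x5F}"    # 0x60 -> PUSH1 ... 0x7f -> PUSH32
    if 0x80 <= op <= 0x8F:
        return f"DUP{op - 0x7F}"     # 0x80 -> DUP1 ... 0x8f -> DUP16
    if 0x90 <= op <= 0x9F:
        return f"SWAP{op - 0x8F}"    # 0x90 -> SWAP1 ... 0x9f -> SWAP16
    if 0xA0 <= op <= 0xA4:
        return f"LOG{op - 0xA0}"     # 0xa0 -> LOG0 ... 0xa4 -> LOG4
    return None                      # not one of the four families


print(family_mnemonic(0x7C))
```

For PUSH opcodes the same number is also the count of immediate bytes that follow the opcode in the bytecode, which is what a disassembler has to skip.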

f0s: System operations

0xf0    CREATE          Create a new account with associated code
0xf1    CALL            Message-call into an account
0xf2    CALLCODE        Message-call into this account with alternative account's code
0xf3    RETURN          Halt execution returning output data
0xf4    DELEGATECALL    Message-call into this account with an alternative account's code, but persisting the current values for `sender` and `value`
0xf5    CREATE2         Create a child contract with a deterministic address

Halt Execution, Mark for deletion

0xff    SELFDESTRUCT    Halt execution and register account for later deletion

Precompiled Contracts vs Native Opcodes in Ethereum – Key Differences

I do not know the real reason, but I will try to guess. These are the considerations I can think of:

Size of the namespace. There are not many possible opcodes, so they need to be allocated very sparingly. The space of contract addresses, on the other hand, is unlimited for all practical purposes.
Risk of name re-use. It is a good software engineering principle not to reuse names (or opcodes), especially in a system where one does not control the upgrades.
Utility. There are some operations that are always useful, like arithmetic operations, bit twiddling, flow control and others. Cryptographic primitives, on the other hand, may in the future be proven inadequate, and something else will be used instead. Making such primitives into opcodes risks spending valuable namespace on something that could become obsolete.
Gentle promotion of popular/useful code. If certain things, for instance zkSNARK operations or Dogecoin PoW verification, start out as Solidity code, are then partially optimised, and become very useful and popular, they might become precompiled contracts. Such a promotion is a much gentler change to the network than introducing a new opcode.
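
To make the namespace argument concrete, here is a toy sketch. An opcode is one byte, so there are 256 slots, while a precompile only occupies one of the 2^160 possible addresses. The dispatch table below is hypothetical and only uses Python stand-ins for two precompiles that exist on mainnet (SHA-256 at address 0x02 and the identity function at 0x04). It is not how any client implements them.

```python
import hashlib

OPCODE_SLOTS = 2 ** 8       # one byte per opcode
ADDRESS_SLOTS = 2 ** 160    # 20-byte account addresses

# Python stand-ins for two mainnet precompiles, keyed by address.
PRECOMPILES = {
    0x02: lambda data: hashlib.sha256(data).digest(),  # SHA-256
    0x04: lambda data: data,                           # identity (copy input)
}


def call_precompile(address: int, data: bytes) -> bytes:
    """Dispatch a call to a precompile by address (toy model)."""
    handler = PRECOMPILES.get(address)
    if handler is None:
        raise LookupError(f"no precompile at address {address:#x}")
    return handler(data)


print(ADDRESS_SLOTS // OPCODE_SLOTS)           # address slots per opcode slot
print(call_precompile(0x02, b"abc").hex())
```

Adding a new entry to such a table uses up one address out of an enormous space, whereas a new opcode permanently consumes one of the few remaining byte values.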
