I'm having a problem understanding the NBT tag format.
I know about the TAG_INT and TAG_CHAR. In fact, I have done some basic programming.
I want to create an inventory editor for Minecraft, but the world data file is in a binary format and I can't find a program (with C source code) that can help me translate all this data.
How does the NBT format work?
Best Answer
To make an inventory editor you don't really need to know the binary structure of NBT, as you could just use one of the existing NBT libraries.
For example, check the following website that lists some libraries for 11 different popular programming languages.
That website also pretty much sums up the basics of the NBT system.
If you do want to build it from scratch, keep in mind that you can encounter NBT data in different formats: uncompressed, or compressed with either gzip or zlib.
As far as I am aware, an NBT file normally starts with a single Compound element; however, this is not required by the specification.
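A rough way to handle the three cases is to look at the first bytes of the file. This is only a sketch: the function name `decompress_nbt` is made up for illustration, and the zlib check relies on the usual `0x78` first byte, which is an assumption that holds for typical zlib streams but is not guaranteed for every one.

```python
import gzip
import zlib


def decompress_nbt(raw: bytes) -> bytes:
    """Return uncompressed NBT bytes, guessing the container from the first bytes."""
    if raw[:2] == b"\x1f\x8b":
        # gzip streams always start with the magic number 1F 8B
        return gzip.decompress(raw)
    if raw[:1] == b"\x78":
        # assumption: typical zlib streams start with 0x78, and 0x78 is
        # not a valid NBT tag type, so this cannot be uncompressed NBT
        return zlib.decompress(raw)
    # otherwise assume the data is already uncompressed
    return raw
```

The result can then be fed to whatever parser you write for the tag structure described below.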
Each element starts with 1 byte, which specifies the type (also called the tag type).
Every tag/element except for the End tag has a name. For a named tag, the tag type byte is followed by 2 bytes (big-endian, e.g. 00 0A (hex) means length 10) specifying the length of the name. This length is then followed by N bytes, which are the bytes of the name string.
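As a small illustration of the header layout just described, here is a sketch that reads the type byte and the name from a byte stream. The helper name `read_tag_header` is hypothetical, and decoding the name as plain UTF-8 is an assumption on my part (NBT uses Java's modified UTF-8, which is identical for ordinary ASCII names).

```python
import io
import struct

TAG_END = 0  # the End tag is the only tag without a name


def read_tag_header(stream):
    """Read the tag type byte and, for every tag except TAG_End, its name."""
    tag_type = stream.read(1)[0]
    if tag_type == TAG_END:
        return tag_type, None
    # 2-byte big-endian length, e.g. 00 0A means 10 name bytes follow
    (name_length,) = struct.unpack(">H", stream.read(2))
    # assumption: plain UTF-8 is good enough for typical names
    name = stream.read(name_length).decode("utf-8")
    return tag_type, name
```

After this call the stream is positioned at the start of the tag's data section.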
These name bytes are then followed by the actual data of the tag. The data of TAG_Byte, TAG_Short, TAG_Int and TAG_Long are big-endian numbers stored in 1, 2, 4 and 8 bytes respectively. Note: Java has no unsigned data types, so assume these are signed types.
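To make the sizes concrete, here is a sketch of decoding these four payloads with Python's `struct` module. The numeric tag ids (1 to 4) come from the NBT format itself rather than from the text above, and `read_integer_payload` is just a name I picked for the example.

```python
import struct

# struct format per integer tag: ">" means big-endian, lowercase means signed
INT_FORMATS = {
    1: ">b",  # TAG_Byte,  1 byte
    2: ">h",  # TAG_Short, 2 bytes
    3: ">i",  # TAG_Int,   4 bytes
    4: ">q",  # TAG_Long,  8 bytes
}


def read_integer_payload(tag_type: int, data: bytes) -> int:
    """Decode the signed big-endian integer at the start of `data`."""
    (value,) = struct.unpack_from(INT_FORMATS[tag_type], data)
    return value
```

Because the formats are signed, a TAG_Byte with the single byte FF decodes to -1 rather than 255.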
The TAG_Float and TAG_Double are 4 and 8 bytes. According to 1 they are stored as big endian IEEE-754 single/double precision floating point numbers. How to parse these might depend on your programming language of choice.
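In Python, for instance, the `struct` module can decode big-endian IEEE-754 values directly, so no manual bit handling is needed. A minimal sketch (the function names are mine, not part of any NBT library):

```python
import struct


def read_float(data: bytes) -> float:
    """Decode a TAG_Float payload: 4 bytes, big-endian single precision."""
    return struct.unpack(">f", data[:4])[0]


def read_double(data: bytes) -> float:
    """Decode a TAG_Double payload: 8 bytes, big-endian double precision."""
    return struct.unpack(">d", data[:8])[0]
```

For example, the bytes 3F 80 00 00 decode to 1.0 as a float. Other languages usually offer something similar, such as reinterpreting the integer bits as a float.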
The data of the array tags (TAG_Byte_Array/TAG_Int_Array) starts with a 4-byte (32-bit) integer which indicates the length of the array. After the length come C*N bytes, where N is the length read and C is the number of bytes needed per element (so 1 for bytes, 4 for integers).
The data of TAG_String is 2 bytes (a short) indicating the length, followed by that many bytes for the string characters.
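The array layout could be read along these lines. This is a sketch with hypothetical helper names; it assumes the elements are signed and big-endian like the other integer tags.

```python
import io
import struct


def read_array(stream, element_code: str, element_size: int):
    """Read a 4-byte length N, then N elements of `element_size` bytes each."""
    (length,) = struct.unpack(">i", stream.read(4))
    raw = stream.read(length * element_size)  # C*N bytes in total
    # e.g. ">3i" unpacks three big-endian signed 32-bit integers at once
    return list(struct.unpack(f">{length}{element_code}", raw))


def read_byte_array(stream):
    return read_array(stream, "b", 1)  # TAG_Byte_Array: 1 byte per element


def read_int_array(stream):
    return read_array(stream, "i", 4)  # TAG_Int_Array: 4 bytes per element
```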
TAG_Compound is essentially a container for multiple nodes. Its data is other tags, and all following tags are children of this tag until a TAG_End is read.
The TAG_List tag is a list of values of one specific type. Its data consists of 1 byte indicating the type (refer to the tags listed above) followed by 4 bytes specifying the number of elements. Each element is read by reading only the data section of the associated tag (so, excluding the tag type byte, the name length and the name characters).
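Putting the pieces together, a reader for the tags described above might look roughly like the following. Treat it as a sketch and not a reference implementation: the numeric tag ids are taken from the NBT format (they are not listed in the text above), the function names are mine, names and strings are decoded as plain UTF-8 (an assumption; NBT uses Java's modified UTF-8), and the input is expected to be already decompressed.

```python
import io
import struct

# fixed-size numeric tags: tag id -> struct format (big-endian, signed)
NUMERIC = {1: ">b", 2: ">h", 3: ">i", 4: ">q", 5: ">f", 6: ">d"}
TAG_END, TAG_BYTE_ARRAY, TAG_STRING = 0, 7, 8
TAG_LIST, TAG_COMPOUND, TAG_INT_ARRAY = 9, 10, 11


def read_string(stream):
    """2-byte length followed by that many bytes (used for names and TAG_String)."""
    (length,) = struct.unpack(">H", stream.read(2))
    return stream.read(length).decode("utf-8")


def read_payload(stream, tag_type):
    """Read only the data section of a tag (no type byte, no name)."""
    if tag_type in NUMERIC:
        fmt = NUMERIC[tag_type]
        return struct.unpack(fmt, stream.read(struct.calcsize(fmt)))[0]
    if tag_type == TAG_STRING:
        return read_string(stream)
    if tag_type in (TAG_BYTE_ARRAY, TAG_INT_ARRAY):
        (count,) = struct.unpack(">i", stream.read(4))
        code, size = ("b", 1) if tag_type == TAG_BYTE_ARRAY else ("i", 4)
        return list(struct.unpack(f">{count}{code}", stream.read(count * size)))
    if tag_type == TAG_LIST:
        element_type = stream.read(1)[0]
        (count,) = struct.unpack(">i", stream.read(4))
        # list elements carry no type byte and no name, only their data
        return [read_payload(stream, element_type) for _ in range(count)]
    if tag_type == TAG_COMPOUND:
        children = {}
        while True:
            child_type = stream.read(1)[0]
            if child_type == TAG_END:  # TAG_End closes the compound
                return children
            name = read_string(stream)
            children[name] = read_payload(stream, child_type)
    raise ValueError(f"unsupported tag type {tag_type}")


def read_named_tag(stream):
    """Read one complete tag: type byte, name, then data."""
    tag_type = stream.read(1)[0]
    name = read_string(stream)
    return name, read_payload(stream, tag_type)
```

Calling `read_named_tag(io.BytesIO(data))` on a decompressed file should return the root name and a nested structure of dicts and lists. Note that this throws away the tag types, so it is not enough on its own for writing a file back; an editor would need to keep the type next to each value.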
To summarize: let's define [NAME_BLOCK] as the 2 bytes containing the length, followed by that many bytes containing the characters.
A few examples:
According to this website, the player files are compressed with GZip. You might also find this page useful.
In case you need an example, here is a Java example (only capable of reading; outputs JSON with tag types etc.; Java 1.8+; requires Gson).