I want to write a parser for NBT (Named Binary Tags) structures. The format is represented like this:
TAG_Compound("hello world"): 1 entries
{
TAG_String("name"): Bananrama
}
In memory (or in the file it is stored in), a hexadecimal view of it looks like this:
0000000: 0a 00 0b 68 65 6c 6c 6f 20 77 6f 72 6c 64 08 00 ...hello world..
0000010: 04 6e 61 6d 65 00 09 42 61 6e 61 6e 72 61 6d 61 .name..Bananrama
0000020: 00 .
0x0a = TAG_Compound
0x00 0x0b = name is 11 characters long
"hello world"
0x08 = TAG_String
0x00 0x04 = name is 4 characters long
"name"
0x00 0x09 = payload is 9 characters long
"Bananrama"
0x00 = TAG_End
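For reference, the byte layout above could be decoded roughly as follows. This is only a minimal sketch: `read_u16_be` and `nbt_read_header` are hypothetical names, only the three tag IDs that appear in the dump are listed, and it assumes that lengths are big-endian unsigned 16-bit values, as the `0x00 0x0b` bytes suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag IDs that appear in the dump above (the full format has more). */
enum { TAG_END = 0x00, TAG_STRING = 0x08, TAG_COMPOUND = 0x0a };

/* Hypothetical helper: decode a big-endian unsigned 16-bit length. */
uint16_t read_u16_be(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: read the header of one named tag from buf.
 * Stores the tag ID in *type and a NUL-terminated copy of the name in name.
 * Returns the number of bytes consumed, or 0 if the buffer is too short
 * or the name does not fit into name_cap bytes.
 */
size_t nbt_read_header(const unsigned char *buf, size_t len,
                       unsigned char *type, char *name, size_t name_cap)
{
    assert(buf != NULL && type != NULL && name != NULL && name_cap > 0);
    if (len < 1)
        return 0;
    *type = buf[0];
    name[0] = '\0';
    if (*type == TAG_END)          /* TAG_End is a single byte, no name */
        return 1;
    if (len < 3)
        return 0;
    size_t n = read_u16_be(buf + 1);
    if (len - 3 < n || name_cap < n + 1)
        return 0;
    memcpy(name, buf + 3, n);      /* name bytes follow the 2-byte length */
    name[n] = '\0';
    return 3 + n;
}
```

Called on the start of the dump, this would consume 14 bytes (one type byte, two length bytes, and the 11-byte name) and leave the read position at the `0x08` of the nested TAG_String.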
It can get more complicated with more and more nested TAG_Compounds, forming a tree structure.
Now, my question is not really about parsing the format, which is easy. I would rather like to know how I could store it efficiently and, more importantly, conveniently for later use.
I know I can't obtain a degree of ease like
tags["hello world"]["name"] = "Bananrama"
But what's the best way to store it while keeping it easy to use? I thought about an nbt_compound structure (because every NBT tree has at least one root compound) that stores how many children it has and contains an array of nbt_value structs, each of which stores the type and content of that value. Is that a good idea?
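As a rough sketch of the layout described above, assuming a tagged union for the payload: only `nbt_compound` and `nbt_value` come from the description, every other identifier (`nbt_type`, `nbt_copy_string`, `nbt_compound_add`, `nbt_compound_get`, `nbt_compound_free`) is hypothetical, and only the string and compound tag types are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Only the two tag types from the example; the numeric IDs match the dump. */
typedef enum { NBT_STRING = 8, NBT_COMPOUND = 10 } nbt_type;

struct nbt_value;

/* A compound knows how many children it has and owns an array of them. */
typedef struct nbt_compound {
    size_t count;
    size_t capacity;
    struct nbt_value *children;
} nbt_compound;

/* One named value: the type field says which union member is valid. */
typedef struct nbt_value {
    nbt_type type;
    char *name;
    union {
        char *string;
        nbt_compound compound;
    } as;
} nbt_value;

/* Heap-allocated copy of s (written out because strdup is not in C99). */
char *nbt_copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/*
 * Append an empty child of the given type and return it, or NULL if
 * allocation fails. The returned pointer is only valid until the next
 * add to the same compound, because the array may be reallocated.
 */
nbt_value *nbt_compound_add(nbt_compound *c, nbt_type type, const char *name)
{
    assert(c != NULL && name != NULL);
    if (c->count == c->capacity) {
        size_t cap = c->capacity ? c->capacity * 2 : 4;
        nbt_value *grown = realloc(c->children, cap * sizeof *grown);
        if (grown == NULL)
            return NULL;
        c->children = grown;
        c->capacity = cap;
    }
    char *copy = nbt_copy_string(name);
    if (copy == NULL)
        return NULL;
    nbt_value *v = &c->children[c->count++];
    v->type = type;
    v->name = copy;
    if (type == NBT_COMPOUND)
        v->as.compound = (nbt_compound){0, 0, NULL};
    else
        v->as.string = NULL;
    return v;
}

/* Linear search by name; returns NULL if no child has that name. */
nbt_value *nbt_compound_get(const nbt_compound *c, const char *name)
{
    assert(c != NULL && name != NULL);
    for (size_t i = 0; i < c->count; i++)
        if (strcmp(c->children[i].name, name) == 0)
            return &c->children[i];
    return NULL;
}

/* Recursively release everything owned by c and reset it to empty. */
void nbt_compound_free(nbt_compound *c)
{
    for (size_t i = 0; i < c->count; i++) {
        nbt_value *v = &c->children[i];
        free(v->name);
        if (v->type == NBT_STRING)
            free(v->as.string);
        else if (v->type == NBT_COMPOUND)
            nbt_compound_free(&v->as.compound);
    }
    free(c->children);
    c->children = NULL;
    c->count = c->capacity = 0;
}
```

With something like this, the closest equivalent of the indexed access would be two chained lookups: `nbt_compound_get(&root, "hello world")`, then `nbt_compound_get` again on that value's `as.compound` with `"name"`. The lookup is linear in the number of children, which is probably acceptable for small compounds.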
Edit: The full specification can be seen here