I want to write a parser for NBT (Named Binary Tags) structures. The format is represented like this:

TAG_Compound("hello world"): 1 entries
{
    TAG_String("name"): Bananrama
}

And in memory (or the file it's stored in) as hexadecimal view:

0000000: 0a 00 0b 68 65 6c 6c 6f 20 77 6f 72 6c 64 08 00  ...hello world..
0000010: 04 6e 61 6d 65 00 09 42 61 6e 61 6e 72 61 6d 61  .name..Bananrama
0000020: 00                                               .
  • 0x0a = TAG_Compound
    • 0x00 0x0b = name is 11 characters long
    • "hello world"
  • 0x08 = TAG_String
    • 0x00 0x04 = name is 4 characters long
    • "name"
    • 0x00 0x09 = payload is 9 characters
    • "Bananrama"
  • 0x00 = TAG_End
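The byte layout listed above can be read with very little code. The following is only a sketch: `read_u16_be` and `read_nbt_string` are hypothetical helper names, only the three tag IDs from the dump are defined, and strings are copied as raw bytes without any character-set handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Tag IDs that appear in the dump above; the other IDs are omitted. */
enum { TAG_END = 0x00, TAG_STRING = 0x08, TAG_COMPOUND = 0x0a };

/* Read a big-endian unsigned 16-bit length prefix. */
static uint16_t read_u16_be(const unsigned char *p)
{
  return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: copy the length-prefixed string at buf[*pos] into a
   newly allocated NUL-terminated C string and advance *pos past it.
   Returns NULL if the buffer is too short or the allocation fails. */
static char *read_nbt_string(const unsigned char *buf, size_t len, size_t *pos)
{
  uint16_t n;
  char *s;

  assert(buf != NULL && pos != NULL);
  if (*pos > len || len - *pos < 2) return NULL;   /* no room for the length */
  n = read_u16_be(buf + *pos);
  if (len - *pos - 2 < (size_t)n) return NULL;     /* string is truncated */

  s = malloc((size_t)n + 1);
  if (!s) return NULL;
  memcpy(s, buf + *pos + 2, n);
  s[n] = '\0';
  *pos += 2 + (size_t)n;
  return s;
}
```

A caller would read one type byte, then call `read_nbt_string` once for the tag name and, for a TAG_String, once more for the payload. The caller owns the returned strings and has to `free` them.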

It can get more complicated with more and more nested TAG_Compounds like a tree structure.

Now, my question is not really about parsing the format, which is easy. I would rather like to know how I could store the result efficiently and, more importantly, conveniently for later use.

I know I can't obtain a degree of ease like

tags["hello world"]["name"] = "Bananrama"

But what's the best way to store it while keeping it easy to use? I thought about a nbt_compound structure (because every NBT tree has at least one root compound), have it store how many children it has and contain an array of nbt_value structs that would store the type and content of that value. Is that a good idea?
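To make that idea concrete, here is one way the layout could look, written as a tagged union. It is only a sketch: the names `nbt_compound` and `nbt_value` come from the description above, while `nbt_type`, the `as` union, and `nbt_find` are assumptions made for illustration, and only string and compound tags are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Only two of the tag types are modelled in this sketch. */
typedef enum { NBT_STRING, NBT_COMPOUND } nbt_type;

struct nbt_value;

/* A compound stores how many children it has and an array of them. */
typedef struct nbt_compound {
  size_t count;
  struct nbt_value *children;
} nbt_compound;

/* Every value carries its type, its name, and a type-dependent payload. */
typedef struct nbt_value {
  nbt_type type;
  char *name;
  union {
    char *string;           /* valid when type == NBT_STRING */
    nbt_compound compound;  /* valid when type == NBT_COMPOUND */
  } as;
} nbt_value;

/* Hypothetical lookup: return the direct child called `name`,
   or NULL if `parent` is not a compound or has no such child. */
static nbt_value *nbt_find(nbt_value *parent, const char *name)
{
  size_t i;

  assert(name != NULL);
  if (!parent || parent->type != NBT_COMPOUND) return NULL;
  for (i = 0; i < parent->as.compound.count; i++)
  {
    nbt_value *child = &parent->as.compound.children[i];
    if (strcmp(child->name, name) == 0) return child;
  }
  return NULL;
}
```

With this layout, `nbt_find(root, "name")->as.string` would yield the payload in the example, provided the lookup result is checked for NULL first. The linear search is fine for small compounds; a hash table per compound would be the usual next step if lookups dominate.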

Edit: The full specification can be seen here

+2  A: 

I am certain this code is broken, but the idea is what I am trying to convey. I think I would use a Tag object, like

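typedef struct TagHeader
{
  TagType type;  /* enum of COMPOUND, STRING, etc. */
  char *name;
} TagHeader;

typedef struct TagCompound
{
  TagHeader header;  /* first member, so a TagCompound * can be read as a TagHeader * */
  int nelems;
  void **values;     /* array of nelems pointers to child tags */
} TagCompound;

typedef struct TagString
{
  TagHeader header;
  char *value;
} TagString;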

With a function, like

/* Needs <string.h> for strcmp and <stddef.h> for NULL. */
void *get_value(void *root, const char *name)
{
  int i;
  if (!root) return NULL;

  if (((TagHeader *) root)->type == COMPOUND)
  {
    TagCompound *c = (TagCompound *) root;
    /* Linear search through the direct children for a matching name. */
    for (i = 0; i < c->nelems; i++)
    {
      if (strcmp(((TagHeader *) c->values[i])->name, name) == 0)
      {
        return c->values[i];
      }
    }
    return NULL;
  }
  /* Handle other tag types here; they have no named children. */
  return NULL;
}

Then access it like:

get_value(get_value(root, "hello world"), "name");
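Nested calls get unwieldy for deep trees, so a small path helper can bring the call site closer to the `tags["hello world"]["name"]` style from the question. This is a sketch under assumptions: `get_child` and `get_path` are hypothetical names, the structs are minimal stand-ins for the ones above, and the name list must end with a `(char *)NULL` sentinel.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-ins for the tag structs sketched in this answer. */
typedef enum { STRING, COMPOUND } TagType;
typedef struct { TagType type; char *name; } TagHeader;
typedef struct { TagHeader header; int nelems; void **values; } TagCompound;
typedef struct { TagHeader header; char *value; } TagString;

/* One-level lookup: the direct child of a compound with the given name. */
static void *get_child(void *root, const char *name)
{
  TagCompound *c;
  int i;

  assert(name != NULL);
  if (!root || ((TagHeader *) root)->type != COMPOUND) return NULL;
  c = (TagCompound *) root;
  for (i = 0; i < c->nelems; i++)
  {
    if (strcmp(((TagHeader *) c->values[i])->name, name) == 0)
      return c->values[i];
  }
  return NULL;
}

/* Hypothetical helper: follow a list of names down the tree.
   The argument list must be terminated with (char *)NULL.
   Returns NULL as soon as one step of the path cannot be resolved. */
static void *get_path(void *root, ...)
{
  va_list ap;
  const char *name;
  void *node = root;

  va_start(ap, root);
  while (node && (name = va_arg(ap, char *)) != NULL)
    node = get_child(node, name);
  va_end(ap);
  return node;
}
```

A lookup then reads `get_path(top, "hello world", "name", (char *)NULL)`, where `top` is assumed to be a compound that contains the "hello world" compound. The explicit cast on the sentinel matters because a bare `NULL` may be passed as an `int` through the variable argument list.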
Looks very close to what I was thinking about, thanks :)
LukeN