I'm wondering whether pure .NET has a way to represent a collection of arbitrary data types. I know about the old, late-bound VB6 Collection, but I'm looking for something like generics that either doesn't require specifying the type at compile time, or lets the code determine the type on its own and then call the appropriate generic class.
Why? I'm bored, and I thought it'd be fun to try and implement my own library for NBT, or NamedBinaryTag. It's the storage format used in the popular Minecraft game. Specification document is here: http://www.minecraft.net/docs/NBT.txt
I know there are existing implementations out there, but there's no point in copying those if I'm doing this solely as a learning experience to get a better grasp on file streams, byte arrays, endian conversion, and general .NET stuff (I used to fiddle with VB6/VBA a lot, so .NET is a huge change).
What's hanging me up is TAG_Compound. Per that specification, it's essentially a collection of objects of any other Tag type, including additional, nested TAG_Compounds. You can do some freaky nesting/recursion with this kind of a format.
I've got a rough outline in my head for the other classes, but I'm drawing a blank on how to store arbitrary types in a stub class (clsTagCompound) so that a generic class (clsNBT(Of T)) can use generic functions to access the payload.
List(Of T) looks like it could work if I could feed it a common interface. But since a generic class will be the main component, its interface is also generic, and that just leads to a nasty generics chain: List(Of clsNBT(Of XXX)).
Thoughts, tips, criticisms about my thinking?
Since this spec works with byte streams, here's a hex dump of an uncompressed NBT file (created with one of the Minecraft editors). It's a TAG_String wrapped in a TAG_Compound. Although the spec doesn't state it explicitly, a TAG_Compound is usually the first tag in an NBT file and encapsulates all the other tags.
0A 00 04 72 6F 6F 74 08 00 06 66 6F 6F 62 61 72 00 07 50 49 52 41 54 45 21 00
From left-to-right:
Byte 1: TagType - specifies TAG_Compound.
Bytes 2-3: Length of string for the name of TAG_Compound.
Bytes 4-7: "root", name of TAG_Compound.
Byte 8: TagType - specifies TAG_String (embedded in TAG_Compound).
Bytes 9-10: Length of string for the name of TAG_String.
Bytes 11-16: "foobar", name of TAG_String.
Bytes 17-18: Length of Payload (TAG_String, so string length).
Bytes 19-25: "PIRATE!", payload of TAG_String (7 bytes, matching the length above).
Byte 26: TagType - specifies TAG_End, marks the end of the TAG_Compound (a TAG_List is length-prefixed instead, so it needs no terminator).
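To make the breakdown above concrete, here is a rough sketch that walks those same 26 bytes with a BinaryReader. The module and helper names (NbtWalkSketch, ParseHex, ReadUInt16BigEndian, ReadNbtString) are made up for illustration, and it assumes the strings are plain UTF-8, which holds for this ASCII sample. It only handles this one fixed layout, not NBT in general.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module NbtWalkSketch

    ' Turns "0A 00 04 ..." into a byte array (hypothetical helper).
    Function ParseHex(ByVal hex As String) As Byte()
        Dim parts As String() = hex.Split(" "c)
        Dim bytes(parts.Length - 1) As Byte
        For i As Integer = 0 To parts.Length - 1
            bytes(i) = Convert.ToByte(parts(i), 16)
        Next
        Return bytes
    End Function

    ' NBT stores multi-byte numbers big-endian, while BinaryReader.ReadUInt16
    ' is little-endian, so the two length bytes are combined by hand.
    Function ReadUInt16BigEndian(ByVal reader As BinaryReader) As Integer
        Dim high As Integer = reader.ReadByte()
        Dim low As Integer = reader.ReadByte()
        Return (high << 8) Or low
    End Function

    ' Two length bytes, then that many bytes of text (assumed UTF-8 here).
    Function ReadNbtString(ByVal reader As BinaryReader) As String
        Dim length As Integer = ReadUInt16BigEndian(reader)
        Return Encoding.UTF8.GetString(reader.ReadBytes(length))
    End Function

    Sub Main()
        Dim sample As Byte() = ParseHex( _
            "0A 00 04 72 6F 6F 74 08 00 06 66 6F 6F 62 61 72 00 07 50 49 52 41 54 45 21 00")

        Using reader As New BinaryReader(New MemoryStream(sample))
            Dim outerType As Byte = reader.ReadByte()          ' byte 1
            Dim outerName As String = ReadNbtString(reader)    ' bytes 2-7
            Dim innerType As Byte = reader.ReadByte()          ' byte 8
            Dim innerName As String = ReadNbtString(reader)    ' bytes 9-16
            Dim payload As String = ReadNbtString(reader)      ' bytes 17-25
            Dim endType As Byte = reader.ReadByte()            ' byte 26

            Console.WriteLine("Tag type {0}, name ""{1}""", outerType, outerName)
            Console.WriteLine("Tag type {0}, name ""{1}"", payload ""{2}""", innerType, innerName, payload)
            Console.WriteLine("Tag type {0}", endType)
        End Using
    End Sub

End Module
```

Run against the dump above, this should print tag type 10 named "root", tag type 8 named "foobar" with payload "PIRATE!", and finally tag type 0.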
Same basic principle applies to the other tag types. Very simple design, yet seems really powerful. Probably one of the reasons why a game at alpha-level code runs quite well, especially in Java.
EDIT: Here's a link to the level specification. It gives a more understandable way of seeing how these tags work together:
http://www.minecraftwiki.net/wiki/Alpha_Level_Format#level.dat_Format
NOTE: I'm not too interested in doing any mods for the game; I just find the NBT format neat and simple enough to be potentially useful. I'm already pondering how to extend the format to handle unsigned types in the tags (e.g., TAG_UInteger), and maybe prefixing the uncompressed stream with a magic number (the way ELF executables on Linux/Unix start with the four bytes 0x7F "ELF"). That would prevent issues when some of these tools are used to open arbitrary or unexpected data formats (and I will probably pass such ideas back to the game's developer, too).
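As a rough illustration of the magic-number idea, a reader could check a fixed prefix before parsing anything else. The four bytes below (0x89 followed by "NBT") and the names MagicNumberSketch and HasMagic are invented for this sketch; nothing like this exists in the NBT spec.

```vbnet
Imports System
Imports System.IO

Module MagicNumberSketch

    ' Hypothetical signature: one non-text byte followed by "NBT".
    Private ReadOnly Magic As Byte() = New Byte() {&H89, &H4E, &H42, &H54}

    ' Returns True only if the stream starts with the hypothetical signature.
    Function HasMagic(ByVal stream As Stream) As Boolean
        Dim header(Magic.Length - 1) As Byte
        Dim total As Integer = 0

        ' Stream.Read may return fewer bytes than requested, so loop until full.
        While total < header.Length
            Dim n As Integer = stream.Read(header, total, header.Length - total)
            If n = 0 Then Return False   ' stream ended before the signature did
            total += n
        End While

        For i As Integer = 0 To Magic.Length - 1
            If header(i) <> Magic(i) Then Return False
        Next
        Return True
    End Function

    Sub Main()
        Dim prefixed As Byte() = New Byte() {&H89, &H4E, &H42, &H54, &HA, &H0, &H4}
        Dim plain As Byte() = New Byte() {&HA, &H0, &H4, &H72}

        Console.WriteLine(HasMagic(New MemoryStream(prefixed)))  ' True
        Console.WriteLine(HasMagic(New MemoryStream(plain)))     ' False
    End Sub

End Module
```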
EDIT2: So I changed things up. clsNamedBinaryTag is now an abstract (MustInherit) generic class that implements the methods defined in a generic interface:
Imports System.IO

Friend Interface INbt(Of T)
    ...
    ' Returns the payload, typed per tag.
    Function GetPayload() As T
    ' ByVal is enough here: BinaryReader is a reference type, so the callee
    ' still advances the caller's reader.
    Function SetPayload(ByVal data As BinaryReader) As Boolean
End Interface

Friend MustInherit Class clsNamedBinaryTag(Of T)
    Implements INbt(Of T)
    ...
    Protected Friend MustOverride _
        Function GetPayload() As T _
        Implements INbt(Of T).GetPayload
    Protected Friend MustOverride _
        Function SetPayload(ByVal data As BinaryReader) As Boolean _
        Implements INbt(Of T).SetPayload
End Class
GetPayload is the part that depends on T, since it fetches and returns payloads of arbitrary types. That's great for simple things like strings, but not so great once we run into TAG_Compound.
What I'm thinking of doing is making all derived classes implement INbt(Of T). For clsTagCompound, the SetPayload method would start walking the byte stream once the compound's name field has been parsed. For each new TagType it encounters, it would theoretically call DirectCast on a temp variable declared as INbt(Of T) to convert it to the class defining that particular TagType.
But this doesn't work as planned. I believe my catch-22 is that to even use clsTagCompound, I still have to define T, and that's where I get stuck again. I somehow need an interface that is NOT generic, yet can be applied to the classes for all the various tag types and still lets me call their GetPayload function to return the payload specific to a particular tag.
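One shape this might take, as a rough and untested sketch: a non-generic interface that every tag shares, with the generic INbt(Of T) layered on top of it. INbtTag, GetPayloadAsObject, clsTagString, and the constructors are names made up here; SetPayload and the stream parsing are left out to keep it short. The compound then stores its children as INbtTag, so its own T is simply List(Of INbtTag).

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical non-generic view shared by every tag type.
Friend Interface INbtTag
    ReadOnly Property Name() As String
    Function GetPayloadAsObject() As Object
End Interface

' Generic view for callers that know the payload type.
Friend Interface INbt(Of T)
    Inherits INbtTag
    Function GetPayload() As T
End Interface

Friend MustInherit Class clsNamedBinaryTag(Of T)
    Implements INbt(Of T)

    Private ReadOnly _name As String

    Protected Sub New(ByVal name As String)
        _name = name
    End Sub

    Public ReadOnly Property Name() As String Implements INbtTag.Name
        Get
            Return _name
        End Get
    End Property

    Public MustOverride Function GetPayload() As T Implements INbt(Of T).GetPayload

    ' Bridges the typed payload to the non-generic interface (boxes value types).
    Public Function GetPayloadAsObject() As Object Implements INbtTag.GetPayloadAsObject
        Return GetPayload()
    End Function
End Class

Friend Class clsTagString
    Inherits clsNamedBinaryTag(Of String)

    Private ReadOnly _value As String

    Public Sub New(ByVal name As String, ByVal value As String)
        MyBase.New(name)
        _value = value
    End Sub

    Public Overrides Function GetPayload() As String
        Return _value
    End Function
End Class

' The compound's payload is a list of the non-generic interface,
' so it can hold any tag type, including nested compounds.
Friend Class clsTagCompound
    Inherits clsNamedBinaryTag(Of List(Of INbtTag))

    Private ReadOnly _children As New List(Of INbtTag)

    Public Sub New(ByVal name As String)
        MyBase.New(name)
    End Sub

    Public Overrides Function GetPayload() As List(Of INbtTag)
        Return _children
    End Function
End Class

Module CompoundSketch
    Sub Main()
        Dim root As New clsTagCompound("root")
        root.GetPayload().Add(New clsTagString("foobar", "PIRATE!"))
        root.GetPayload().Add(New clsTagCompound("nested"))

        For Each tag As INbtTag In root.GetPayload()
            ' TryCast recovers the typed view when the caller expects a String payload.
            Dim asString As INbt(Of String) = TryCast(tag, INbt(Of String))
            If asString IsNot Nothing Then
                Console.WriteLine("{0} = {1}", tag.Name, asString.GetPayload())
            Else
                Console.WriteLine("{0} (not a string tag)", tag.Name)
            End If
        Next
    End Sub
End Module
```

The trade-off is that code walking a compound only sees Object (or has to TryCast) until it knows which tag type it is holding; the generic interface still gives typed access wherever the type is known up front.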