Which serialization should I use?

I need to store a large Dictionary with 100000+ elements, and I just need to save and load this data directly without caring whether it's binary or whether it's formatted or not.

Right now I am using BinaryFormatter, but I'm not sure whether it's the most efficient option.

Please suggest better alternatives in the .NET standard libraries or an external library, preferably free.

EDIT: This is for serializing to disk and deserializing from it. The app is also single-threaded.

+2  A: 

This is entirely a guess since I haven't profiled this (i.e. which is what you should do to truly get your answer).

But my guess is that the binary serializer would give you the best performance, both in size and speed.

Joel Martinez
+4  A: 

Well, it will depend on what's in the dictionary - but if Protocol Buffers is flexible enough for you (you have to define your own types to serialize - it doesn't do all .NET types or anything like that), it's pretty darned fast.

For example, in protocol buffers I'd represent the dictionary as a message with a repeated key/value pair field. For ultimate speed you could use the CodedOutputStream and CodedInputStream to serialize/deserialize the dictionary directly rather than reading it all into memory separately first. Again, it'll depend on what the key/value types are though.
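The "write each key/value pair straight to the stream" idea can be sketched without Protocol Buffers at all, using only `System.IO.BinaryWriter` and `BinaryReader` from the standard library. This is a stand-in for illustration, not the Protocol Buffers API or its wire format. The `Media` class, its field names, and the `MediaStore` helper are hypothetical, assuming a value type with one int and two strings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical value type: one int and two strings.
public sealed class Media
{
    public int Id;
    public string Title;
    public string Path;
}

// Hypothetical helper: a hand-rolled format, not Protocol Buffers.
public static class MediaStore
{
    // Writes the entry count, then each key/value pair in turn.
    public static void Save(Stream output, Dictionary<int, Media> items)
    {
        using (var writer = new BinaryWriter(output, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(items.Count);
            foreach (var pair in items)
            {
                writer.Write(pair.Key);
                writer.Write(pair.Value.Id);
                // BinaryWriter cannot write null strings, so null becomes "".
                writer.Write(pair.Value.Title ?? string.Empty);
                writer.Write(pair.Value.Path ?? string.Empty);
            }
        }
    }

    // Reads the count, then rebuilds the dictionary one entry at a time.
    public static Dictionary<int, Media> Load(Stream input)
    {
        using (var reader = new BinaryReader(input, Encoding.UTF8, leaveOpen: true))
        {
            int count = reader.ReadInt32();
            var items = new Dictionary<int, Media>(count);
            for (int i = 0; i < count; i++)
            {
                int key = reader.ReadInt32();
                int id = reader.ReadInt32();
                string title = reader.ReadString();
                string path = reader.ReadString();
                items[key] = new Media { Id = id, Title = title, Path = path };
            }
            return items;
        }
    }
}
```

To use it against a file, pass `File.Create(path)` to `Save` and `File.OpenRead(path)` to `Load`. The format has no version marker, so any change to `Media` would break previously saved files; that is one of the things a library like Protocol Buffers handles for you.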

Jon Skeet
Thanks Jon. It's basically <int, Media> where Media has 1 int and 2 string public members and 1 private static member. Does it care about private members? Also with PBs, do you just mark stuff as serializable or is it more complicated than that, as in you can serialize the same thing in 10 different ways?
Joan Venge
You describe the types involved in "messages" and then the classes are autogenerated for you (as partial classes, so you can add your own logic if you want).
Jon Skeet
Thanks Jon, I will definitely try this. If you wrote this, it can't be wrong.
Joan Venge
+1  A: 

This is a bit of an open-ended question. Are you storing this in memory or writing it to disk? Does this execute in a multi-threaded (and perhaps concurrent-access) environment? Context is important.

BinaryFormatter is generally going to be pretty fast, and there are external libs that produce more compact output, such as Protocol Buffers. I've personally had good success with DataContractSerializer.

The great thing about all these options is that you can try all of them (relatively painlessly) to learn for yourself what works in your environment and for your workload.
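A minimal sketch of how such a comparison could be set up, assuming a `Dictionary<int, string>` as sample data: it times one save/load round trip through a `MemoryStream` and reports the serialized size. The `SerializerBenchmark` class and its method names are hypothetical; only `DataContractSerializer`, `Stopwatch`, and `MemoryStream` are real .NET types. Other serializers can be plugged in by passing different save and load delegates to `Measure`.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical harness for comparing serializers on the same data.
public static class SerializerBenchmark
{
    // Runs one in-memory save/load round trip and reports size and elapsed time.
    public static (long Bytes, TimeSpan Elapsed, T Result) Measure<T>(
        Action<Stream, T> save, Func<Stream, T> load, T data)
    {
        var watch = Stopwatch.StartNew();
        using (var buffer = new MemoryStream())
        {
            save(buffer, data);
            long bytes = buffer.Length;
            buffer.Position = 0;
            T result = load(buffer);
            watch.Stop();
            return (bytes, watch.Elapsed, result);
        }
    }

    // Builds sample data; the shape is an assumption, substitute your own types.
    public static Dictionary<int, string> MakeSample(int count)
    {
        var data = new Dictionary<int, string>(count);
        for (int i = 0; i < count; i++)
        {
            data[i] = "item " + i;
        }
        return data;
    }

    // Measures DataContractSerializer on the sample dictionary.
    public static (long Bytes, TimeSpan Elapsed, Dictionary<int, string> Result)
        MeasureDataContract(Dictionary<int, string> data)
    {
        var serializer = new DataContractSerializer(typeof(Dictionary<int, string>));
        return Measure<Dictionary<int, string>>(
            (stream, value) => serializer.WriteObject(stream, value),
            stream => (Dictionary<int, string>)serializer.ReadObject(stream),
            data);
    }
}
```

A single run includes one-time warm-up costs, so repeating the measurement several times and discarding the first run gives more trustworthy numbers. Timing against a `MemoryStream` also leaves out disk I/O; swapping in a `FileStream` would be closer to the save-to-disk case.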

jro
Thanks, writing to disk. It's single threaded.
Joan Venge