I'm currently using XML serialization, but it is very slow. I'm looking for a way to save/load information from a file very quickly. I'm not really interested in how it looks on disk (if anything, I want it to be obscured, as I don't want manual editing).

I'm thinking of a binary format; however, I am not sure whether it would be able to serialize properties that may be of a custom type, etc.

Any ideas?

+1  A: 

Binary serialization certainly works with properties of Custom Types and typically produces smaller files than XML serialization. It's certainly an approach you should consider if file size is an important factor for your situation.

JaredPar
Yeah keeping size down would really help. Thanks.
James
+6  A: 

What exactly is the data?

With xml, the obvious answer would be to use something like GZipStream to compress it - making it smaller and obscure. You could use BinaryFormatter but it is brittle and IMO unsuitable for long-term storage. I would say "protocol buffers" (maybe protobuf-net), but it depends what the "custom data" is. But if you are using XmlSerializer at the moment, protobuf-net may work virtually without changes (maybe add a few attributes) - and it is (in every case I've seen to date) both smaller and faster than BinaryFormatter.

Here's the steep learning curve (see also: Getting Started):

using ProtoBuf; // protobuf-net

[ProtoContract]
public class Person {
    [ProtoMember(1)] // field numbers must be unique within the type
    public int Id {get;set;}

    [ProtoMember(2)]
    public string Name {get;set;}

    //...
}
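
To round-trip a type like this to a file, a minimal sketch might look like the following. It assumes the protobuf-net package is referenced; `PersonStore`, `Save` and `Load` are hypothetical names made up for illustration, while `Serializer.Serialize` and `Serializer.Deserialize<T>` are the library's static entry points.

```csharp
using System.IO;
using ProtoBuf; // protobuf-net package (assumed to be referenced)

[ProtoContract]
public class Person
{
    [ProtoMember(1)]
    public int Id { get; set; }

    [ProtoMember(2)]
    public string Name { get; set; }
}

// Hypothetical helper, not part of protobuf-net.
public static class PersonStore
{
    // Write one Person to disk in the protobuf wire format.
    public static void Save(string path, Person person)
    {
        using (var file = File.Create(path))
        {
            Serializer.Serialize(file, person);
        }
    }

    // Read it back; the type argument tells protobuf-net what to build.
    public static Person Load(string path)
    {
        using (var file = File.OpenRead(path))
        {
            return Serializer.Deserialize<Person>(file);
        }
    }
}
```

The resulting file is binary, so it is not something a user will casually hand-edit, which matches the "obscured" requirement in the question.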

To be fair, it can get a little trickier if you are using inheritance - not much though. In many cases you can actually use your existing attributes - it'll work with xml / wcf attributes if an explicit element order is included.
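
For the GZipStream route mentioned at the top of this answer, here is a rough sketch that keeps XmlSerializer and just wraps the file stream in compression. `CompressedXml` and the sample `Person` class are hypothetical names used only for illustration; everything else is in the .NET base class library.

```csharp
using System.IO;
using System.IO.Compression;
using System.Xml.Serialization;

// Sample type for illustration only.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper: XML serialization through a gzip stream.
public static class CompressedXml
{
    public static void Save<T>(string path, T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var file = File.Create(path))
        using (var gzip = new GZipStream(file, CompressionMode.Compress))
        {
            // The XML is compressed as it is written, so no plain-text copy hits the disk.
            serializer.Serialize(gzip, value);
        }
    }

    public static T Load<T>(string path)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        {
            return (T)serializer.Deserialize(gzip);
        }
    }
}
```

Note that this only obscures the data: anyone with a gzip tool can still unpack and read the XML, and it does nothing to make the XML parsing itself faster.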

Marc Gravell
Marc, was actually looking at and considering your protobuf-net solution. However, not sure about the learning curve involved? The data is going to be *record* type information e.g. Person.ID, Person.Name etc
James
I'll add the learning curve above, then ;-p
Marc Gravell
Marc, thanks appreciate the sample to get me up and running. Yeah pretty much all the classes inherit from a base class.
James
@James - note that *at the moment* it requires your base-class to know about the derived classes (very much like `[XmlInclude(...)]`). However, a new version is on the way that lets you express subclasses at runtime rather than in the code.
Marc Gravell
@Marc, any benefits of protobuf-net over SQLite?
James
They both have their uses. For example, if your existing code *doesn't* issue data queries, it could be far simpler to just switch the serialization engine and leave the rest of the code alone. Either would be fine, though.
Marc Gravell
@Marc, yeah thanks for the advice. I think for now I will go with the SQLite approach as I have an add-in for Visual Studio and can set up the DB at design time (your add-in for VS for protobuf didn't seem to install properly, does it work on win7 x64?). However, I will defo come back to protobuf if I find SQLite isn't working out for me. Thanks again.
James
You're welcome - good luck.
Marc Gravell
+7  A: 

You can try using SQLite. It is very fast, and will give you a complete database implementation with SQL queries on a single file.

If you are thinking of trying binary formats, I suggest you try this first.

It can also be used with an ORM, and the file can be compressed and encrypted.
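
As a rough illustration of what "SQL queries on a file" looks like from C#, here is a sketch using the Microsoft.Data.Sqlite provider. The provider choice is an assumption (System.Data.SQLite is another option with a similar ADO.NET shape), and `PersonDb`, the `Person` table and its columns are hypothetical names.

```csharp
using Microsoft.Data.Sqlite; // assumed provider; needs the Microsoft.Data.Sqlite package

// Hypothetical helper around a single-file SQLite database.
public static class PersonDb
{
    public static void Save(string dbPath, int id, string name)
    {
        using (var conn = new SqliteConnection("Data Source=" + dbPath))
        {
            conn.Open(); // creates the file if it does not exist yet

            using (var create = conn.CreateCommand())
            {
                create.CommandText =
                    "CREATE TABLE IF NOT EXISTS Person (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL)";
                create.ExecuteNonQuery();
            }

            using (var insert = conn.CreateCommand())
            {
                // Parameters keep user data out of the SQL text.
                insert.CommandText =
                    "INSERT OR REPLACE INTO Person (Id, Name) VALUES ($id, $name)";
                insert.Parameters.AddWithValue("$id", id);
                insert.Parameters.AddWithValue("$name", name);
                insert.ExecuteNonQuery();
            }
        }
    }

    // Returns null when no row has the given id. Assumes Save has created the table.
    public static string LoadName(string dbPath, int id)
    {
        using (var conn = new SqliteConnection("Data Source=" + dbPath))
        {
            conn.Open();
            using (var query = conn.CreateCommand())
            {
                query.CommandText = "SELECT Name FROM Person WHERE Id = $id";
                query.Parameters.AddWithValue("$id", id);
                return query.ExecuteScalar() as string;
            }
        }
    }
}
```

When saving many records, wrapping the inserts in one transaction is usually much faster than committing each row separately.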

Am
Trying to avoid having anyone install anything on their machine. Although I suppose it is VERY minimal.
James
It's a single dll, you can bundle it with your app
Am
Less than 1 MB.
Am
@Am, ah cool, not really familiar with SQLite. Defo consider this. The reason I said "non-database" solutions was because I wanted to avoid any setup on a client machine. However, if SQLite is not an installer then this might actually be the best solution.
James
@James, it is a very popular engine, very robust. I use it a lot, and I'm pleased.
Am
@Am, decided to go with the SQLite approach for the time being, I will re-word my question as it seems a little ironic that I accepted a database answer for a non-database question...
James
lol, good luck.
Am
I used SQLite for a Ruby project - blew my mind how easy to install and functional it was. Didn't get around to profiling it for large datasets but am looking for an excuse to use it on my next C++ project (and I'm not even a DB programmer).
Justicle
+1  A: 

I agree with Am about using an embedded database like SQLite. It comes with significant benefits. The ability to layer an ORM on top of it is probably the most significant.

XML Serialization is handy, particularly when you need to be able to edit the XML by hand or process it with other XML tools like XSLT, but it also has some unavoidable performance problems. One important technique when using XML Serialization in .NET is to cache the XML serializers, or to have them generated by sgen at build time.

The reason to cache the XmlSerializer is that the .NET runtime will automatically generate, compile and load an assembly containing a serializer if it can't find one in an already loaded assembly. This process can be really slow. Constructing a new XmlSerializer instance can also be quite slow, which is why you should cache it. Be careful when caching the serializer, though, as different XmlSerializer constructors can produce different serializer implementations which behave differently, particularly with respect to namespaces.
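
A minimal sketch of such a cache, assuming you only ever use the plain `XmlSerializer(Type)` constructor so that the type alone is a safe cache key. `SerializerCache` and `For` are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

// Hypothetical helper: one XmlSerializer per type for the life of the process.
public static class SerializerCache
{
    private static readonly ConcurrentDictionary<Type, XmlSerializer> Cache =
        new ConcurrentDictionary<Type, XmlSerializer>();

    public static XmlSerializer For(Type type)
    {
        // The key is the type only. If you pass root overrides, extra types or
        // namespaces to the constructor, they would need to be part of the key.
        return Cache.GetOrAdd(type, t => new XmlSerializer(t));
    }
}
```

Usage is then `SerializerCache.For(typeof(Person)).Serialize(stream, person)`. The instances can be shared between threads, since XmlSerializer is documented as thread safe.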

Then of course there are the usual performance implications of parsing a lot of text. Unfortunately that isn't easy to avoid with XML.

One of the reasons SQLite is a better choice than XML is that, at its core, it stores data in fixed-size pages (the records themselves are variable length). Any binary file with fixed-length records or blocks is going to be fast to read, index and scan. Fixed block size file formats are almost always screamingly fast to read and write. I would recommend implementing one at some point for your own education.
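
As a starting point for that exercise, here is a small sketch of a fixed-length record file. The layout (a 4-byte id followed by a 32-byte zero-padded UTF-8 name) and the names `FixedRecordFile`, `Write` and `Read` are assumptions made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical fixed-length record file: every record is exactly RecordSize bytes.
public static class FixedRecordFile
{
    private const int NameBytes = 32;                       // fixed space for the name
    public const int RecordSize = sizeof(int) + NameBytes;  // 36 bytes per record

    public static void Write(string path, int index, int id, string name)
    {
        var nameField = new byte[NameBytes];                // zero-padded
        byte[] encoded = Encoding.UTF8.GetBytes(name);
        if (encoded.Length > NameBytes)
            throw new ArgumentException("name does not fit in the fixed field");
        Array.Copy(encoded, nameField, encoded.Length);

        using (var file = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
        using (var writer = new BinaryWriter(file))
        {
            // Record N always starts at N * RecordSize, so no scanning is needed.
            file.Seek((long)index * RecordSize, SeekOrigin.Begin);
            writer.Write(id);
            writer.Write(nameField);
        }
    }

    public static (int Id, string Name) Read(string path, int index)
    {
        using (var file = File.OpenRead(path))
        using (var reader = new BinaryReader(file))
        {
            file.Seek((long)index * RecordSize, SeekOrigin.Begin);
            int id = reader.ReadInt32();
            byte[] nameField = reader.ReadBytes(NameBytes);
            string name = Encoding.UTF8.GetString(nameField).TrimEnd('\0');
            return (id, name);
        }
    }
}
```

The speed comes from the arithmetic in `Seek`: any record can be reached in one jump. The cost is wasted space for short names and a hard limit on long ones.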

If you still want a text-based format (for ease of interoperability) and don't need the benefits of an ORM, then consider using the FileHelpers library.

orj
James