I'm working on a compact framework application and need to boost performance. The app currently works offline by serializing objects to XML and storing them in a database. Using a profiling tool I could see this was quite a big overhead, slowing the app. I thought if I switched to a binary serialization the performance would increase, but because this is not supported in the compact framework I looked at protobuf-net. The serialization seems quicker, but deserialization much slower and the app is doing more deserializing than serializing.

Should binary serialization be faster, and if so, what can I do to improve performance? Here's a snippet of how I'm using both XML and binary:

XML serialization:

public string Serialize(T obj)
{
  UTF8Encoding encoding = new UTF8Encoding();
  // Note: a new XmlSerializer is constructed on every call.
  XmlSerializer serializer = new XmlSerializer(typeof(T));
  using (MemoryStream stream = new MemoryStream())
  {
    // Serialize straight into the in-memory buffer, then decode it as UTF-8 text.
    serializer.Serialize(stream, obj);
    return encoding.GetString(stream.ToArray(), 0, Convert.ToInt32(stream.Length));
  }
}
public T Deserialize(string xml)
{
  UTF8Encoding encoding = new UTF8Encoding();
  XmlSerializer serializer = new XmlSerializer(typeof(T));
  MemoryStream stream = new MemoryStream(encoding.GetBytes(xml));            
  return (T)serializer.Deserialize(stream);
}

Protobuf-net Binary serialization:

public byte[] Serialize(T obj)
{
  byte[] raw;
  using (MemoryStream memoryStream = new MemoryStream())
  {
    Serializer.Serialize(memoryStream, obj);
    raw = memoryStream.ToArray();
  }

  return raw;            
}

public T Deserialize(byte[] serializedType)
{
  T obj;
  using (MemoryStream memoryStream = new MemoryStream(serializedType))
  {
    obj = Serializer.Deserialize<T>(memoryStream);
  }
  return obj;
}
+1  A: 

Interesting... thoughts:

  • what version of CF is this; 2.0? 3.5? In particular, CF 3.5 has Delegate.CreateDelegate, which allows protobuf-net to access properties much faster than it can in CF 2.0
  • are you annotating fields or properties? Again, in CF the reflection optimisations are limited; you can get better performance in CF 3.5 with properties, as with a field the only option I have available is FieldInfo.SetValue
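
To illustrate the difference between the two access paths mentioned above, here is a minimal sketch. It is not protobuf-net's actual code: the Person class, the Setter delegate type and the SetterDemo helper are hypothetical names made up for the example, and it is written against the desktop .NET API on the assumption that CF 3.5 exposes the same Delegate.CreateDelegate overload.

```csharp
using System;
using System.Reflection;

// Hypothetical entity, used only for this illustration.
public class Person
{
    public string Name { get; set; }
}

// Declared locally so the sketch does not depend on System.Action<T1, T2>.
public delegate void Setter<TTarget, TValue>(TTarget target, TValue value);

public static class SetterDemo
{
    // Reflection path: every assignment goes through PropertyInfo.SetValue.
    public static void SetViaReflection(Person p, string value)
    {
        PropertyInfo prop = typeof(Person).GetProperty("Name");
        prop.SetValue(p, value, null);
    }

    // Delegate path: bind the property's set method to a typed delegate once,
    // then reuse the delegate for every subsequent assignment.
    public static Setter<Person, string> BuildSetter()
    {
        MethodInfo setMethod = typeof(Person).GetProperty("Name").GetSetMethod();
        return (Setter<Person, string>)Delegate.CreateDelegate(
            typeof(Setter<Person, string>), setMethod);
    }

    public static void Main()
    {
        Person p = new Person();

        SetViaReflection(p, "set via reflection");
        Console.WriteLine(p.Name);

        Setter<Person, string> setName = BuildSetter();
        setName(p, "set via delegate");
        Console.WriteLine(p.Name);
    }
}
```

The point of the delegate path is that the reflection lookup happens once, when the delegate is built; fields have no set method to bind, which is why they fall back to FieldInfo.SetValue on every call.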

There are a number of other things that simply don't exist in CF, so it has to make compromises in a few places. For overly complex models there is also a known issue with the generics limitations of CF. A fix is underway, but it is a big change, and is taking "a while".

For info, some metrics on regular (full) .NET comparing various formats (including XmlSerializer and protobuf-net) are here.

Marc Gravell
I'm using CF2.0, and I've added attributes to the properties for the objects I need to serialize.
Charlie
Is it possible to try it in CF 3.5 (with the CF 3.5 binary) just to see if that fixes it?
Marc Gravell
Ok, I've just run my test on CF3.5 and see significant performance increases from CF2; binary performs a lot quicker for both serialization and deserialization. Unfortunately I'm tied to CF2 though so might have to rethink things.
Charlie
Just to clarify my wording above.. I mean I see significant performance increases in CF3.5; CF2 is slower.
Charlie
Sorry, scratch that, I read the perf report wrong! Here's what I get testing a simple entity with 3 properties: XML Serialize: 317ms; XML Deserialize: 7ms; Binary Serialize: 147ms; Binary Deserialize: 19ms.
Charlie
Is that averaged over a number of iterations? I'm also not sure whether those numbers are CF2 or CF3.5
Marc Gravell
That's just the result from one test, but it comes out very similar each time. That's on 3.5.
Charlie
For info, the first iteration has the overhead of building the model - subsequent calls may be quicker... I'm intrigued that it is slower than XmlSerializer. I'd love to pull it apart ;-(
Marc Gravell
A: 

Have you tried creating custom serialization classes for your classes, instead of using XmlSerializer, which is a general-purpose serializer (it creates a bunch of classes at runtime)? There's a tool for doing this (sgen). You run it during your build process and it generates a custom assembly that can be used in place of XmlSerializer.

If you have Visual Studio, the option is available under the Build tab of your project's properties.

Tundey
A: 

Some possible related information and ideas regarding serialization performance.

Mr Carl Franklin did some performance measurements regarding manual serialization vs. BinaryFormatter. His findings were interesting, but I don't know if it could be useful for the compact framework.

Magnus Johansson
A: 

Is the performance hit in serializing the objects, or writing them to the database? Since writing them is likely hitting some kind of slow storage, I'd imagine it to be a much bigger perf hit than the serialization step.

Keep in mind that the perf measurements posted by Marc Gravell are testing the performance over 1,000,000 iterations.

What kind of database are you storing them in? Are the objects serialized in memory or straight to storage? How are they being sent to the db? How big are the objects? When one is updated, do you send all of the objects to the database, or just the one that has changed? Are you caching anything in memory at all, or re-reading from storage each time?

kyoryu
The objects are being stored in a SQLCe database, but I can clearly see that the serialization and deserialization is the performance hit, not the database interaction. Stuff is being cached in memory too, but I need to store stuff in a DB so that it can be retrieved between sessions of the app.
Charlie
A: 

I'm going to correct myself on this. Marc Gravell pointed out that the first iteration has the overhead of building the model, so I've done some tests taking the average of 1000 iterations of serialization and deserialization for both XML and binary. I ran my tests with the v2 Compact Framework DLL first, and then with the v3.5 DLL. Here's what I got; times are in ms:

.NET 2.0
================================ XML ====== Binary ===
Serialization 1st Iteration      3236       5508
Deserialization 1st Iteration    1501       318
Serialization Average            9.826      5.525
Deserialization Average          5.525      0.771

.NET 3.5
================================ XML ====== Binary ===
Serialization 1st Iteration      3307       5598
Deserialization 1st Iteration    1386       200
Serialization Average            10.923     5.605
Deserialization Average          5.605      0.279
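
For anyone wanting to repeat this kind of measurement, the approach described above (time the first call separately, then average a batch of further calls) can be sketched roughly as follows. This is not the exact harness used for the numbers above: the Bench class, the Work delegate and the placeholder workload are hypothetical, and Environment.TickCount is used only because it is a coarse timer that needs nothing beyond the base class library.

```csharp
using System;

// Hypothetical delegate type for the operation being measured.
public delegate void Work();

public static class Bench
{
    // Returns the elapsed milliseconds for `iterations` calls of `work`.
    public static int TimeMs(Work work, int iterations)
    {
        int start = Environment.TickCount;
        for (int i = 0; i < iterations; i++)
        {
            work();
        }
        // unchecked: TickCount can wrap around, the difference is still valid.
        return unchecked(Environment.TickCount - start);
    }

    public static void Report(string label, Work work, int iterations)
    {
        int first = TimeMs(work, 1);           // includes any one-time setup cost
        int total = TimeMs(work, iterations);  // later calls, after setup
        double average = (double)total / iterations;
        Console.WriteLine("{0}: first = {1} ms, average = {2:F3} ms",
            label, first, average);
    }

    public static void Main()
    {
        // Placeholder workload; substitute the Serialize/Deserialize call under test.
        int counter = 0;
        Report("placeholder", delegate { counter++; }, 1000);
    }
}
```

Because the timer resolution is coarse, the average only becomes meaningful when the batch as a whole runs for much longer than a single timer tick.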
Charlie
A: 

Binary serialization is faster, and is more secure from the prying eyes of your customers. The main drawback shows up when you release new versions or bug fixes: if the internals of your class change, you'll have a heck of a time deserializing "old" data.

Tangurena
I wouldn't use 'secure' to describe this since the BinaryFormatter format is publicly documented (http://msdn.microsoft.com/en-us/library/cc236844(PROT.10).aspx) and strings and other data are stored in the clear.
alexdej
A: 

XML is often slow to process and takes up a lot of space. There have been a number of different attempts to tackle this, and the most popular today seems to be to just drop the lot in a compressed archive, as the Open Packaging Conventions do.

The W3C has shown the gzip approach to be less than optimal, and they and various other groups have been working on a better binary serialisation suitable for fast processing and compression, for transmission.
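
As a rough illustration of the compress-the-XML approach mentioned above, here is a minimal sketch using GZipStream from System.IO.Compression. The XmlGzip class and its method names are hypothetical, and whether System.IO.Compression is available on a given Compact Framework version is an assumption to check before relying on it.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class XmlGzip
{
    // Compresses an XML string (encoded as UTF-8) into a gzip byte array.
    public static byte[] Compress(string xml)
    {
        byte[] raw = Encoding.UTF8.GetBytes(xml);
        using (MemoryStream output = new MemoryStream())
        {
            // The GZipStream must be closed before the buffer is read,
            // otherwise the final compressed block is not flushed.
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Reverses Compress: inflates the bytes and decodes them as UTF-8 text.
    public static string Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
            byte[] raw = output.ToArray();
            return Encoding.UTF8.GetString(raw, 0, raw.Length);
        }
    }

    public static void Main()
    {
        string xml = "<items><item>one</item><item>two</item></items>";
        byte[] packed = Compress(xml);
        Console.WriteLine(Decompress(packed) == xml);
    }
}
```

Compression addresses the size of the stored XML, not the cost of parsing it; it adds CPU work on both the write and the read path.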

IanGilham