I have the following code that serializes a List to a byte array for transport via Web Services. The code runs relatively fast on smaller entities, but this is a list of 60,000 or so items, and the formatter.Serialize call takes several seconds to execute. Is there any way to speed this up?

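    // Requires: using System.IO; using System.Runtime.Serialization;
    //           using System.Runtime.Serialization.Formatters.Binary;
    public static byte[] ToBinary(Object objToBinary)
    {
        using (MemoryStream memStream = new MemoryStream())
        {
            BinaryFormatter formatter = new BinaryFormatter(null, new StreamingContext(StreamingContextStates.Clone));
            formatter.Serialize(memStream, objToBinary);
            // ToArray copies the whole stream regardless of Position, so no Seek is needed.
            return memStream.ToArray();
        }
    }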
+1  A: 

It would probably be much faster to serialize the entire array (or collection) of 60,000 items in one shot, into a single large byte[] array, instead of in separate chunks. Is it a requirement of other parts of your system that each individual object be represented by its own byte[] array? Also, are the actual types of the objects known? If you used a specific type (perhaps a common base class for all 60,000 objects), the framework would not have to do as much casting and searching for your prebuilt serialization assemblies. Right now you're only giving it Object.

Clay Fowler
+2  A: 

If you want some real serialization speed, consider using protobuf-net, which is the C# version of Google's Protocol Buffers. It's supposed to be an order of magnitude faster than BinaryFormatter.

geva30
+1 for this. I've found that for serialising an 80,000 item List<ulong> it takes 7 seconds for BinaryFormatter and 800ms for protobuf-net.
Callum Rogers
I'd love to do this, but converting the entire project to Protocol Buffers would take forever and is probably not worth the 4 seconds of performance gained.
AngryHacker
Well, I recently converted quite a few classes to Proto, and it is very simple to do using attributes.
geva30
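
To illustrate the attribute-based conversion mentioned above, here is a minimal sketch. It assumes the protobuf-net NuGet package is referenced; `ProtoItem` and `ProtoHelper` are hypothetical names, not the asker's types, and the field numbers are arbitrary.

```csharp
using System.IO;
using ProtoBuf; // from the protobuf-net package (assumed to be installed)

// Hypothetical entity: mark the class and number each serialized member.
[ProtoContract]
public class ProtoItem
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}

public static class ProtoHelper
{
    // Serializes any protobuf-net contract type (or a List<T> of one) to bytes.
    public static byte[] ToBinary<T>(T value)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, value);
            return ms.ToArray();
        }
    }

    // Reads the value back from the byte array.
    public static T FromBinary<T>(byte[] data)
    {
        using (var ms = new MemoryStream(data))
        {
            return Serializer.Deserialize<T>(ms);
        }
    }
}
```

With this in place, a call such as `ProtoHelper.ToBinary(myList)` on a `List<ProtoItem>` would stand in for the BinaryFormatter-based `ToBinary`; the receiving side would need the same contract types.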
+1  A: 

.ToArray() creates a new array; it may be more efficient to copy the data to an existing array using unsafe methods (such as accessing the stream's memory using fixed, then copying the memory using MemCopy() via DllImport).

Also consider using a faster custom formatter.

Danny Varod
+1  A: 

This related question recommends dynamic method serialization:

Faster deep cloning

0xA3
+3  A: 

The inefficiency you're experiencing comes from several sources:

  1. The default serialization routine uses reflection to enumerate object fields and get their values.
  2. The binary serialization format stores things in associative lists keyed by the string names of the fields.
  3. You've got a spurious ToArray in there (as Danny mentioned).

You can get a pretty big improvement off the bat by implementing ISerializable on the object type that is contained in your List. That will cut out the default serialization behavior that uses reflection.

You can get a little more speed if you cut down the number of elements in the associative array that holds the serialized data. Make sure the elements you do store in that associative array are primitive types.

Finally, you can eliminate the ToArray but I doubt you'll even notice the bump that gives you.
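
A rough sketch of the ISerializable suggestion, assuming a .NET Framework project where BinaryFormatter is still available. `Item` and its two properties are hypothetical stand-ins for whatever the List actually holds, and the short keys "i" and "n" are arbitrary.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical list element; replace with the real entity type.
[Serializable]
public sealed class Item : ISerializable
{
    public int Id { get; set; }
    public string Name { get; set; }

    public Item() { }

    // Deserialization constructor: the formatter calls this instead of
    // populating fields through reflection.
    private Item(SerializationInfo info, StreamingContext context)
    {
        Id = info.GetInt32("i");
        Name = info.GetString("n");
    }

    // Called by the formatter during Serialize. Only the values that are
    // needed are written, under short keys, and all are primitives or strings.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("i", Id);
        info.AddValue("n", Name);
    }
}
```

The existing `ToBinary` method would not need to change; the formatter picks up the ISerializable implementation on each element automatically. How much this helps depends on the real entity, so it is worth timing before and after.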

Kennet Belenky
ToArray effectively takes zero time. .Serialize is the big resource drain.
AngryHacker