I'm trying to serialize a large amount of binary data to a custom file format using System.IO.Packaging.Package and PackagePart. The plan is to use a BinaryFormatter to write a set of detailed medical imaging datasets to distinct parts within a single file/package.

I can use the BinaryFormatter to write all my data directly to a FileStream (not using System.IO.Packaging at all), and my sample data produces about 140 MB of output in around 12 seconds. That is fast enough and not a bad solution, but I'd prefer a more flexible format that supports compression and lets me store additional data alongside it.

Getting a stream via _packagePart.GetStream() and serializing to that stream with a BinaryFormatter causes the serialization to take about 5 to 10 minutes (and this is with compression turned off).

The System.IO.Packaging.Package class is somewhat of a black box that I don't have significant experience with. Any idea why streaming data to this format vs a direct binary formatter to a file would differ so radically in performance? I know my object can be serialized relatively quickly to a binary format. Why so long to write?

A: 

Perhaps it is because PackagePart uses compression.

Try lowering the level of compression

http://msdn.microsoft.com/en-us/library/system.io.packaging.compressionoption.aspx

http://msdn.microsoft.com/en-us/library/ms568067.aspx

Try NotCompressed first to see if you get an improvement.
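As a rough sketch of what that looks like: the compression level is chosen per part when the part is created. The helper name CreateUncompressedPart and the "application/octet-stream" content type below are made up for illustration; only Package.CreatePart, PackUriHelper.CreatePartUri and CompressionOption come from System.IO.Packaging.

```csharp
using System;
using System.IO;
using System.IO.Packaging;

public static class CompressionDemo
{
    // Hypothetical helper: creates a part that is stored without compression.
    // The content type is an assumption; use whatever fits your data.
    public static PackagePart CreateUncompressedPart(Package package, string path)
    {
        // Part URIs must be relative and start with a forward slash.
        Uri partUri = PackUriHelper.CreatePartUri(new Uri(path, UriKind.Relative));
        return package.CreatePart(partUri, "application/octet-stream",
                                  CompressionOption.NotCompressed);
    }
}
```

Swapping CompressionOption.NotCompressed for Fast, Normal or Maximum makes it easy to compare timings with everything else held constant.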

Simon
+1  A: 

I did try to turn off compression (NotCompressed) with very little difference in speed. But I did, ultimately, find a workable solution.

Knowing that the BinaryFormatter seems to work OK when not going directly to a Package, I instead serialize the data to a MemoryStream first. Then, after resetting the MemoryStream's position to the start, I use the CopyStream function below to copy it over to the stream returned by the PackagePart.

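    // Copies all remaining bytes from input to output in 32 KB chunks.
    // On .NET 4.0 and later, input.CopyTo(output) does the same thing.
    public static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[32768];
        int read;
        // Read returns 0 once the end of the input stream is reached.
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
        }
    }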
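For anyone wanting the whole thing in one place, here is a minimal sketch of the buffer-then-copy approach. It is not my exact code: the names PackageWriter and WritePartBuffered are invented for this example, the serializer is passed in as a delegate so the helper does not depend on BinaryFormatter, and it assumes .NET 4.0 or later for Stream.CopyTo.

```csharp
using System;
using System.IO;
using System.IO.Packaging;

public static class PackageWriter
{
    // Hypothetical helper: serializes into memory first, then copies the
    // result into a new package part in large chunks.
    public static void WritePartBuffered(Package package, Uri partUri,
        string contentType, CompressionOption compression,
        Action<Stream> serialize)
    {
        using (var buffer = new MemoryStream())
        {
            // The serializer's many small writes land in memory here.
            serialize(buffer);

            // Rewind, otherwise the copy starts at the end and writes nothing.
            buffer.Position = 0;

            PackagePart part = package.CreatePart(partUri, contentType, compression);
            using (Stream partStream = part.GetStream(FileMode.Create, FileAccess.Write))
            {
                // Same 32 KB chunk size as CopyStream.
                buffer.CopyTo(partStream, 32768);
            }
        }
    }
}
```

With a BinaryFormatter the call would look something like WritePartBuffered(package, uri, "application/octet-stream", CompressionOption.Normal, s => new BinaryFormatter().Serialize(s, dataset)), where dataset stands in for whatever object you are saving. The trade-off is that each part is held fully in memory before it is written.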

This solution gets my serialization speed down to 10-15 seconds total (compared to 10 minutes) and, the great thing is, I can turn on the Normal or High compression options and get about 50% compression on my data.

I don't really have a good explanation for why this makes such a huge difference. I was simply restructuring my code so that I had more visibility into the loop writing to the Package and could profile it better.

Kevin Grossnicklaus