I wrote a program to serialize a 'Person' class using XmlSerializer, BinaryFormatter and protobuf-net. I expected protobuf-net to be faster than the other two. It was faster than XML serialization but much slower than binary serialization. Is my understanding incorrect? Please help me understand this. Thank you for the help.

EDIT: I changed the code (updated below) to time only the serialization calls, not the creation of the streams, and I still see the difference. Could someone tell me why?

This is the output:

Person got created using protocol buffer in 347 milliseconds

Person got created using XML in 1462 milliseconds

Person got created using binary in 2 milliseconds

Code below

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using ProtoBuf;
using System.IO;
using System.Diagnostics;
using System.Runtime.Serialization.Formatters.Binary;
namespace ProtocolBuffers
{
    class Program
    {
        static void Main(string[] args)
        {

            string folderPath  = @"E:\Ashish\Research\VS Solutions\ProtocolBuffers\ProtocolBuffer1\bin\Debug";
            string XMLSerializedFileName = Path.Combine(folderPath,"PersonXMLSerialized.xml");
            string ProtocolBufferFileName = Path.Combine(folderPath,"PersonProtocalBuffer.bin");
            string BinarySerializedFileName = Path.Combine(folderPath,"PersonBinary.bin");

            if (File.Exists(XMLSerializedFileName))
            {
                File.Delete(XMLSerializedFileName);
                Console.WriteLine(XMLSerializedFileName + " deleted");
            }
            if (File.Exists(ProtocolBufferFileName))
            {
                File.Delete(ProtocolBufferFileName);
                Console.WriteLine(ProtocolBufferFileName + " deleted");
            }
            if (File.Exists(BinarySerializedFileName))
            {
                File.Delete(BinarySerializedFileName);
                Console.WriteLine(BinarySerializedFileName + " deleted");
            }

            var person = new Person
            {
                Id = 12345,
                Name = "Fred",
                Address = new Address
                {
                    Line1 = "Flat 1",
                    Line2 = "The Meadows"
                }
            };

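            // Create the stopwatch stopped. Stopwatch.StartNew() would start timing
            // immediately and include File.Create in the first measurement.
            Stopwatch watch = new Stopwatch();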

            using (var file = File.Create(ProtocolBufferFileName))
            {
                watch.Start();
                Serializer.Serialize(file, person);
                watch.Stop();
            }

            //Console.WriteLine(watch.ElapsedMilliseconds.ToString());
            Console.WriteLine("Person got created using protocol buffer in " + watch.ElapsedMilliseconds.ToString() + " milliseconds ");

            watch.Reset();

            System.Xml.Serialization.XmlSerializer x = new System.Xml.Serialization.XmlSerializer(person.GetType());
            using (TextWriter w = new StreamWriter(XMLSerializedFileName))
            {
                watch.Start();
                x.Serialize(w, person);
                watch.Stop();
            }

            //Console.WriteLine(watch.ElapsedMilliseconds.ToString());
            Console.WriteLine("Person got created using XML in " + watch.ElapsedMilliseconds.ToString() + " milliseconds");

            watch.Reset();

            using (Stream stream = File.Open(BinarySerializedFileName, FileMode.Create))
            {
                BinaryFormatter bformatter = new BinaryFormatter();
                //Console.WriteLine("Writing Employee Information");
                watch.Start();
                bformatter.Serialize(stream, person);
                watch.Stop();
            }

            //Console.WriteLine(watch.ElapsedMilliseconds.ToString());
            Console.WriteLine("Person got created using binary in " + watch.ElapsedMilliseconds.ToString() + " milliseconds");

            Console.ReadLine();



        }
    }


    [ProtoContract]
    [Serializable]
    public class Person
    {
        [ProtoMember(1)]
        public int Id { get; set; }
        [ProtoMember(2)]
        public string Name { get; set; }
        [ProtoMember(3)]
        public Address Address { get; set; }
    }
    [ProtoContract]
    [Serializable]
    public class Address
    {
        [ProtoMember(1)]
        public string Line1 { get; set; }
        [ProtoMember(2)]
        public string Line2 { get; set; }
    }
}
+3  A: 

I replied to your e-mail; I didn't realise you'd also posted it here. The first question I have is: which version of protobuf-net? The reason I ask is that the development trunk of "v2" deliberately has auto-compilation disabled, so that I can use my unit tests to test both the runtime and pre-compiled versions. So if you are using "v2" (only available in source), you need to tell it to compile the model - otherwise it is running 100% reflection.

In either "v1" or "v2" you can do this with:

Serializer.PrepareSerializer<Person>();
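
As a rough sketch of where that call would sit relative to the timing (the class name WarmUpDemo, the trimmed-down Person type and the use of a MemoryStream instead of a file are assumptions made for illustration, not the code from the e-mail):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using ProtoBuf; // protobuf-net

[ProtoContract]
public class Person
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}

// Hypothetical demo class; only Serializer.* comes from protobuf-net.
public static class WarmUpDemo
{
    // Serializes one Person to memory and prints how long the call took.
    public static byte[] SerializeTimed(Person person)
    {
        using (var ms = new MemoryStream())
        {
            var watch = Stopwatch.StartNew();
            Serializer.Serialize(ms, person);
            watch.Stop();
            Console.WriteLine("Serialize took " + watch.ElapsedMilliseconds + " ms");
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        // One-off cost paid up front, outside the timed region.
        Serializer.PrepareSerializer<Person>();

        SerializeTimed(new Person { Id = 12345, Name = "Fred" });
    }
}
```

Writing to a MemoryStream here is only meant to keep disk I/O out of the number; it is not required by protobuf-net.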

Having done this, the numbers I get (from the code in your e-mail; I haven't checked if the above is the same sample):

10
Person got created using protocol buffer in 10 milliseconds
197
Person got created using XML in 197 milliseconds
3
Person got created using binary in 3 milliseconds

The other factor is the repeats; 3-10ms is frankly nothing; you can't compare numbers around this level. Upping it to repeat 5000 times (re-using the XmlSerializer / BinaryFormatter instances; no false costs introduced) I get:
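
A minimal sketch of that kind of repeat loop, assuming a hypothetical helper called Bench.Measure (not part of any library) and a stand-in workload in place of the real serializer call:

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration only.
public static class Bench
{
    // Runs the action once untimed as a warm-up, then times `iterations`
    // further runs under a single Stopwatch and returns elapsed milliseconds.
    public static long Measure(Action action, int iterations)
    {
        action(); // warm-up: JIT compilation, serializer generation, caches

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Stand-in workload; in the real test this would be something like
        // () => Serializer.Serialize(stream, person) with a reused stream.
        long ms = Measure(() => Guid.NewGuid().ToString(), 5000);
        Console.WriteLine("5000 iterations took " + ms + " milliseconds");
    }
}
```

Creating the XmlSerializer / BinaryFormatter once outside the lambda, as described above, keeps their construction cost out of the loop.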

110
Person got created using protocol buffer in 110 milliseconds
329
Person got created using XML in 329 milliseconds
133
Person got created using binary in 133 milliseconds

Taking this to sillier extremes (100000):

1544
Person got created using protocol buffer in 1544 milliseconds
3009
Person got created using XML in 3009 milliseconds
3087
Person got created using binary in 3087 milliseconds

So ultimately:

  • when you have virtually no data to serialize, most approaches will be very fast (including protobuf-net)
  • as you add data, the differences become more obvious; protobuf generally excels here, either for individual large graphs, or lots of small graphs

Note also that in "v2" the compiled model can be fully static-compiled (to a dll that you can deploy), removing even the (already small) spin-up costs.

Marc Gravell
Marc, absolutely! 'protobuf-serialized' files are smaller, and when you test with a larger number of files, the time is significantly less than with binary serialization. Thank you for your time. :-)
ydobonmai
+2  A: 

I have a slightly different opinion from the marked answer. I think the numbers from these tests reflect the meta-data overhead of BinaryFormatter. BinaryFormatter writes meta-data about the class before writing the data, while protobuf writes only the data.

For the very small object (one Person object) in your test, the meta-data cost of BinaryFormatter weighs more heavily than it would in real cases, because it is writing more meta-data than data. So when you increase the repeat count, the meta-data cost is multiplied along with it, reaching the same level as XML serialization in the extreme case.

If you serialize a Person array, and the array is large enough, then the meta-data cost will be trivial relative to the total cost. BinaryFormatter should then perform similarly to protobuf in your extreme repeat test.

PS: I found this page because I'm evaluating different serializers. I also found a blog post, http://blogs.msdn.com/b/youssefm/archive/2009/07/10/comparing-the-performance-of-net-serializers.aspx, whose test results show DataContractSerializer with a binary XmlDictionaryWriter performing several times better than BinaryFormatter. It also tested with very small data. When I ran the test myself with large data, I was surprised to find that the result was very different. So do test with the real data you will actually use.

Dudu
+1 Dudu, I will take a look.
ydobonmai