I'm implementing a client-server application and am looking into ways to serialize and transmit data. I began with XML serializers, which worked rather well but serialize slowly and produce large payloads, which matters when the data has to be sent over the network. So I started looking into Protobuf and protobuf-net.

My problem is that protobuf doesn't send type information with the data. With XML serializers, I was able to build a wrapper that could send and receive any (serializable) object over the same stream, since an object serialized to XML contains its type name.

ObjectSocket socket = new ObjectSocket();
socket.AddTypeHandler(typeof(string));  // Tells the socket the types
socket.AddTypeHandler(typeof(int));     // of objects we will want
socket.AddTypeHandler(typeof(bool));    // to send and receive.
socket.AddTypeHandler(typeof(Person));  // When it gets data, it looks for
socket.AddTypeHandler(typeof(Address)); // these types in the Xml, then uses
                                        // the appropriate serializer.

socket.Connect(_host, _port);
socket.Send(new Person() { ... });
socket.Send(new Address() { ... });
...
Object o = socket.Read();
Type oType = o.GetType();

if (oType == typeof(Person))
    HandlePerson(o as Person);
else if (oType == typeof(Address))
    HandleAddress(o as Address);
...

I've considered a few solutions to this. The first is a master "state" class that is the only type of object sent over my socket. That moves away from the functionality I've worked out with XML serializers, though, so I'd like to avoid that direction.

The second option would be to wrap protobuf objects in some type of wrapper, which defines the type of object. (This wrapper would also include information such as packet ID, and destination.) It seems silly to use protobuf-net to serialize an object, then stick that stream between Xml tags, but I've considered it. Is there an easy way to get this functionality out of protobuf or protobuf-net?


I've come up with a third solution, and posted it below, but if you have a better one, please post it too!


Information on field bounds bug (using System.String):

Hashing:

protected static int ComputeTypeField(Type type) // System.String
{
    byte[] data = ASCIIEncoding.ASCII.GetBytes(type.FullName);
    MD5CryptoServiceProvider md5 = new MD5CryptoServiceProvider();
    return Math.Abs(BitConverter.ToInt32(md5.ComputeHash(data), 0));
}

Serialization:

using (MemoryStream stream = new MemoryStream())
{
    Serializer.NonGeneric.SerializeWithLengthPrefix
        (stream, o, PrefixStyle.Base128, field);  // field = 600542181
    byte[] data = stream.ToArray();
    _pipe.Write(data, 0, data.Length);
}

Deserialization:

using (MemoryStream stream = new MemoryStream(_buffer.Peek()))
{
    lock (_mapLock)
    {
        success = Serializer.NonGeneric.TryDeserializeWithLengthPrefix
            (stream, PrefixStyle.Base128, field => _mappings[field], out o);
    }
    if (success)
        _buffer.Clear((int)stream.Position);
    else
    {
        int len;
        if (Serializer.TryReadLengthPrefix(stream, PrefixStyle.Base128, out len))
            _buffer.Clear(len);
    }
}

field => _mappings[field] throws a KeyNotFoundException while looking for 63671269.

If I replace ToInt32 with ToInt16 in the hash function, the field value is set to 29723 and it works. It also works if I explicitly define System.String's field to 1. Explicitly defining the field to 600542181 has the same effect as using the hash function to define it. The value of the string being serialized does not change the outcome.
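
The two numbers reported above are consistent with the limit the protobuf wire format places on field numbers: a field number must fit in 29 bits, so the valid range is 1 to 536,870,911 (2^29 - 1), and 600542181 with everything above the low 29 bits dropped is exactly 63671269. A minimal sketch of a hash-to-field helper that stays inside that range follows. The class name `TypeFieldSketch`, its method, and the choice to read the hash bytes in a fixed little-endian order are assumptions for illustration, not part of protobuf-net.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class TypeFieldSketch
{
    // Largest field number the protobuf wire format can carry (2^29 - 1).
    public const int MaxFieldNumber = (1 << 29) - 1;

    // Hypothetical helper: derives a stable field number from a type name.
    public static int ComputeTypeField(string fullTypeName)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(fullTypeName));
            // Read the first four bytes in a fixed order so the result
            // does not depend on the machine's endianness.
            int value = hash[0] | (hash[1] << 8) | (hash[2] << 16) | (hash[3] << 24);
            int field = value & MaxFieldNumber; // keep only the low 29 bits
            return field == 0 ? 1 : field;      // field numbers start at 1
        }
    }

    static void Main()
    {
        // The field from the question, truncated to 29 bits.
        Console.WriteLine(600542181 & MaxFieldNumber); // prints 63671269
        Console.WriteLine(ComputeTypeField(typeof(string).FullName));
    }
}
```

Masking leaves fewer distinct values than a full 32-bit hash, so a check for duplicate field numbers when the types are registered is still worthwhile.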

A: 

I've come up with another solution, but I decided to put it as an answer, instead of in the question, because that makes more sense to me. It's pretty ugly, in my opinion, and I've been warned against using reflection, so please comment on it or provide better answers if you have them. Thanks!


using System;
using System.IO;
using System.Reflection;
using ProtoBuf;

class Program
{
    static void Main(string[] args)
    {
        Person person = new Person
        {
            Id = 12345,
            Name = "Fred",
            Address = new Address
            {
                Line1 = "Flat 1",
                Line2 = "The Meadows"
            }
        };
        object value;
        using (Stream stream = new MemoryStream())
        {
            Send<Person>(stream, person);
            stream.Position = 0;
            value = Read(stream);
            person = value as Person;
        }
    }

    static void Send<T>(Stream stream, T value)
    {
        Header header = new Header()
        {
            Guid = Guid.NewGuid(),
            Type = typeof(T)
        };
        Serializer.SerializeWithLengthPrefix<Header>(stream, header, PrefixStyle.Base128);
        Serializer.SerializeWithLengthPrefix<T>(stream, value, PrefixStyle.Base128);
    }

    static object Read(Stream stream)
    {

        Header header;
        header = Serializer.DeserializeWithLengthPrefix<Header>
            (stream, PrefixStyle.Base128);
        MethodInfo m = typeof(Serializer).GetMethod("DeserializeWithLengthPrefix",
            new Type[] {typeof(Stream), typeof(PrefixStyle)}).MakeGenericMethod(header.Type);
        Object value = m.Invoke(null, new object[] {stream, PrefixStyle.Base128} );
        return value;
    }
}

[ProtoContract]
class Header
{
    public Header() { }

    [ProtoMember(1, IsRequired = true)]
    public Guid Guid { get; set; }

    [ProtoIgnore]
    public Type Type { get; set; }
    [ProtoMember(2, IsRequired = true)]
    public string TypeName
    {
        get { return this.Type.FullName; }
        set { this.Type = Type.GetType(value); }
    }
}

[ProtoContract]
class Person {
    [ProtoMember(1)]
    public int Id { get; set; }
    [ProtoMember(2)]
    public string Name { get; set; }
    [ProtoMember(3)]
    public Address Address { get; set; }
}

[ProtoContract]
class Address {
    [ProtoMember(1)]
    public string Line1 { get; set; }
    [ProtoMember(2)]
    public string Line2 { get; set; }
}
Daniel Rasmussen
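
One thing to watch in the header approach above: `Type.GetType(string)` resolves a namespace-qualified name only when the type lives in the calling assembly or the core library, and returns null otherwise. Below is a sketch of a header name that survives the round trip, assuming both ends load the same assemblies. `TypeNameSketch`, `ToWireName`, and `FromWireName` are hypothetical names.

```csharp
using System;

static class TypeNameSketch
{
    // Hypothetical helper: the name to store in the header.
    // The assembly-qualified name tells the receiver where to look.
    public static string ToWireName(Type type)
    {
        return type.AssemblyQualifiedName;
    }

    // Hypothetical helper: resolves a header name and fails loudly
    // instead of handing back a null Type.
    public static Type FromWireName(string wireName)
    {
        Type type = Type.GetType(wireName, false);
        if (type == null)
            throw new InvalidOperationException("Unknown type: " + wireName);
        return type;
    }

    static void Main()
    {
        // System.Uri is used as an example of a type outside the calling assembly.
        Type original = typeof(Uri);
        Type resolved = FromWireName(ToWireName(original));
        Console.WriteLine(resolved == original); // prints True
    }
}
```

Since the name arrives from the network, it is probably safer to check it against a list of expected types before resolving it.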
+2  A: 

This functionality is actually built in, albeit not obviously.

In this scenario, it is anticipated that you would designate a unique number per message type. The overload you are using passes them all in as "field 1", but there is an overload that lets you include this extra header information (it is still the job of the calling code to decide how to map numbers to types, though). You can then specify different types as different fields in the stream (note: this only works with the base-128 prefix style).

I'll need to double check, but the intention is that something like the following should work:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using ProtoBuf;
static class Program
{
    static void Main()
    {
        using (MemoryStream ms = new MemoryStream())
        {
            WriteNext(ms, 123);
            WriteNext(ms, new Person { Name = "Fred" });
            WriteNext(ms, "abc");

            ms.Position = 0;

            while (ReadNext(ms)) { }            
        }
    }
    // *** you need some mechanism to map types to fields
    static readonly IDictionary<int, Type> typeLookup = new Dictionary<int, Type>
    {
        {1, typeof(int)}, {2, typeof(Person)}, {3, typeof(string)}
    };
    static void WriteNext(Stream stream, object obj) {
        Type type = obj.GetType();
        int field = typeLookup.Single(pair => pair.Value == type).Key;
        Serializer.NonGeneric.SerializeWithLengthPrefix(stream, obj, PrefixStyle.Base128, field);
    }
    static bool ReadNext(Stream stream)
    {
        object obj;
        if (Serializer.NonGeneric.TryDeserializeWithLengthPrefix(stream, PrefixStyle.Base128, field => typeLookup[field], out obj))
        {
            Console.WriteLine(obj);
            return true;
        }
        return false;
    }
}
[ProtoContract] class Person {
    [ProtoMember(1)]public string Name { get; set; }
    public override string ToString() { return "Person: " + Name; }
}

Note that this doesn't currently work in the v2 build (since the "WithLengthPrefix" code is incomplete), but I'll go and test it on v1. If it works, I'll add the above scenario to the test suite to ensure it does work in v2.

Edit:

Yes, it does work fine on "v1", with output:

123
Person: Fred
abc
Marc Gravell
Added to test suite as promised: http://code.google.com/p/protobuf-net/source/browse/trunk/Examples/MultiTypesWithLengthPrefix.cs
Marc Gravell
Shame on me for underestimating the comprehensiveness of protobuf-net! Would using `obj.GetType().GetHashCode()` be a bad idea for generating `field` numbers, if I wanted to avoid a magic predefined dictionary?
Daniel Rasmussen
@Daniel - as long as you have some scheme to mitigate against the *slim* chance of hash conflicts...
Marc Gravell
@Daniel - realised overnight; there are good reasons **not** to use the hash-code: trivially, the number needs to be `>= 1` (which is easy to fix), but more important: hash-codes should not really be trusted outside of a given app-domain. They can change; for example, the string hash algorithm changed between 1.1 and 2.0 - and could legitimately change again. It would be better to use something like an MD5 hash of the full type name. Or: make your first message (with field 1 or similar) the set of mappings between field numbers and type-names.
Marc Gravell
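
The second suggestion above, sending the table of field numbers and type names as the first message, could be sketched roughly as follows. `TypeRegistry` and its members are hypothetical, and how the exported table is actually serialized and sent (for example as field 1) is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry: assigns field numbers to types and can export the table.
sealed class TypeRegistry
{
    private readonly Dictionary<int, Type> _byField = new Dictionary<int, Type>();
    private readonly Dictionary<Type, int> _byType = new Dictionary<Type, int>();
    private int _next = 2; // field 1 is kept free for the mapping message itself

    // Returns the existing field number for a type, or assigns the next free one.
    public int Register(Type type)
    {
        int field;
        if (_byType.TryGetValue(type, out field)) return field;
        field = _next++;
        _byField.Add(field, type);
        _byType.Add(type, field);
        return field;
    }

    public int FieldFor(Type type) { return _byType[type]; }
    public Type TypeFor(int field) { return _byField[field]; }

    // The table one side could send as its first message.
    public Dictionary<int, string> Export()
    {
        var table = new Dictionary<int, string>();
        foreach (var pair in _byField)
            table.Add(pair.Key, pair.Value.AssemblyQualifiedName);
        return table;
    }

    // Rebuilds a registry on the receiving side from an exported table.
    public static TypeRegistry Import(IDictionary<int, string> table)
    {
        var registry = new TypeRegistry();
        foreach (var pair in table)
        {
            Type type = Type.GetType(pair.Value, true); // throws if unknown
            registry._byField.Add(pair.Key, type);
            registry._byType.Add(type, pair.Key);
            registry._next = Math.Max(registry._next, pair.Key + 1);
        }
        return registry;
    }
}

static class RegistryDemo
{
    static void Main()
    {
        var server = new TypeRegistry();
        server.Register(typeof(int));
        int stringField = server.Register(typeof(string));

        // Pretend the exported table travelled over the wire.
        TypeRegistry client = TypeRegistry.Import(server.Export());
        Console.WriteLine(client.TypeFor(stringField)); // prints System.String
    }
}
```

Keeping a dictionary in each direction also avoids scanning the whole table on every write.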
@Marc - I didn't know the hash algorithm had changed; an MD5 hash of the full type name does sound like a better solution for that. As does a server-side type map, sent out to clients on connect. The only problem with that method, though, is that if the client wants to send an object which the server is unfamiliar with, it won't know what number to give it. (Of course, I can't think of a time when this would actually happen - the server should probably be more up-to-date than the clients.)
Daniel Rasmussen
@Daniel - obviously it'll need some munging to get from MD5 to an Int32 - but the important point is: a known, constant algorithm.
Marc Gravell
@Marc - Thanks for all your advice! On a slightly different note, can you point me to an explanation of what sort of fields protobuf-net can serialize? Private properties and properties with mixed access levels seem to serialize just fine, and I'm curious as to how. (If this is better suited for a separate question, let me know.)
Daniel Rasmussen
@Daniel - that depends on the platform. On full .NET it can get to everything. If it is compact-framework / silverlight etc (or if you are using "v2" to generate a standalone pre-compiled dll) then it only has access to public types and members. In "v1" it demands a parameterless constructor; in "v2" you can (optionally) bypass this (WCF-style).
Marc Gravell
@Marc - It appears that the field breaks for very large field indexes. `600542181` comes back as `63671269`. `29723` works just fine, though. What are the bounds on the field? I'll let you know if I have more problems or information.
Daniel Rasmussen
@Daniel - is that in protobuf-net? If so, which version?
Marc Gravell
@Marc - Yes, I'm using protobuf-net v1.0.0.282 (runtime v2.0.50727). I'll add some of my code to my question, because it won't fit here.
Daniel Rasmussen