Hi All.

Using Google Protocol Buffers, can I set a maximum size for all messages I encode?

If I know that what I encode is never larger than X bytes, can Protocol Buffers always produce a buffer of a fixed size Y, padding the output up to Y when I give it less data?

+1  A: 

The wire format for protocol buffers doesn't make this trivial, and I'm not aware of anything built in that does it; one option would be to serialize into a buffer with your own length header and pad with extra data as needed.

You need to add a length prefix because one is not added by default, and without it the reader would run into garbage at the end of your buffer. Even trailing 0s would not be legal (the reader would be looking for a field number, and 0 is not a valid one).
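To illustrate why zero padding is not harmless, here is a small C++ sketch of how a single-byte field key splits into a field number and a wire type. The names `Tag`, `make_tag` and `decode_tag` are made up for this example and are not part of any protobuf library; it also only covers keys that fit in one byte (field numbers 1 to 15).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types/helpers for illustration only.
struct Tag {
    std::uint32_t field_number;
    std::uint32_t wire_type;
};

// A field key is (field_number << 3) | wire_type.
std::uint32_t make_tag(std::uint32_t field_number, std::uint32_t wire_type) {
    assert(wire_type <= 7);  // the wire type occupies the low 3 bits
    return (field_number << 3) | wire_type;
}

// Reverse of make_tag: low 3 bits are the wire type, the rest is the field number.
Tag decode_tag(std::uint32_t key) {
    return Tag{key >> 3, key & 0x7};
}

// Field numbers start at 1, so a padding byte of 0x00 decodes to an illegal field.
bool is_valid_field_number(std::uint32_t field_number) {
    return field_number != 0;
}
```

So a reader that walks past the real data into 0x00 padding sees "field 0", which is not a legal field, rather than a clean end of message.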

I can't comment on the C++ or Jon's C# version, but for my C# version (protobuf-net), you should be able to do something like (untested):

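// needs: using System; using System.IO; using ProtoBuf;
using(var ms = new MemoryStream(fixedLength)) {
     ms.SetLength(fixedLength); // pre-size the stream; bytes we don't write stay as zero padding
     Serializer.SerializeWithLengthPrefix(ms, obj); // writes from position 0: length prefix, then the data
     if(ms.Length > fixedLength) { // the stream grew, so the message did not fit
          throw new InvalidOperationException("Message is larger than fixedLength");
     }
     byte[] arr = ms.ToArray(); // use this; always exactly fixedLength bytes
}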

This should deserialize fine if also using DeserializeWithLengthPrefix.


Re the questions (comments); SerializeWithLengthPrefix is a protobuf-net-specific method; there may be something in the C++ version, but it is pretty simple. The easiest way to implement this from scratch is:

  • assume we will leave a fixed-length (4 byte) header to indicate how much actual data we have
  • skip 4 bytes (or write 00-00-00-00)
  • now serialize to the rest of the buffer
  • find how many bytes you just wrote
  • write that value back at the start of the buffer
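The write steps above could be sketched in C++ roughly as follows. This is independent of any protobuf library: `frame_fixed` is a hypothetical helper, the payload is assumed to be bytes you have already serialized (for example via the C++ library's `SerializeToString`), and the little-endian byte order of the header is my assumption; it just has to match whatever the reader expects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: produces a buffer of exactly fixed_length bytes laid out as
// [4-byte little-endian payload length][payload][zero padding].
std::vector<std::uint8_t> frame_fixed(const std::string& payload, std::size_t fixed_length) {
    const std::size_t header_size = 4;
    if (fixed_length < header_size || payload.size() > fixed_length - header_size) {
        throw std::length_error("payload does not fit in fixed_length");
    }

    // Skip the header for now: the whole buffer, header included, starts as zeros.
    std::vector<std::uint8_t> buffer(fixed_length, 0);

    // "Serialize to the rest of the buffer": here that is just a copy of the bytes.
    std::copy(payload.begin(), payload.end(), buffer.begin() + header_size);

    // Write the number of bytes actually used back at the start of the buffer.
    const std::uint32_t used = static_cast<std::uint32_t>(payload.size());
    for (std::size_t i = 0; i < header_size; ++i) {
        buffer[i] = static_cast<std::uint8_t>((used >> (8 * i)) & 0xFF);
    }

    assert(buffer.size() == fixed_length);  // the output size never depends on the payload
    return buffer;
}
```

For example, `frame_fixed("abc", 16)` gives 16 bytes: 03 00 00 00, then 'a' 'b' 'c', then nine zero bytes of padding.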

In reverse, obviously:

  • read 4 bytes and interpret as an int
  • deserialize that much as data
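A minimal C++ sketch of those two read steps, again not tied to any protobuf library. `unframe_fixed` is a hypothetical helper, and it assumes the 4-byte header was written little-endian; the bytes it returns are what you would then hand to the real parser (for example the C++ library's `ParseFromString`), and everything after them is padding to be ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads a 4-byte little-endian length header and returns
// exactly that many payload bytes, ignoring any padding after them.
std::string unframe_fixed(const std::vector<std::uint8_t>& buffer) {
    const std::size_t header_size = 4;
    if (buffer.size() < header_size) {
        throw std::length_error("buffer is shorter than the header");
    }

    // Read 4 bytes and interpret them as an int.
    std::uint32_t used = 0;
    for (std::size_t i = 0; i < header_size; ++i) {
        used |= static_cast<std::uint32_t>(buffer[i]) << (8 * i);
    }
    if (used > buffer.size() - header_size) {
        throw std::length_error("declared length exceeds the buffer");
    }
    assert(header_size + used <= buffer.size());  // guaranteed by the check above

    // "Deserialize that much as data": return only the real payload bytes.
    return std::string(buffer.begin() + header_size,
                       buffer.begin() + header_size + used);
}
```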

It is a little more complex in protobuf-net, as it offers a few more options: how the int should be encoded, and whether or not to wrap this so that the entire thing can still be treated as a 100% valid protobuf stream. In particular, I suspect I've just described the behaviour if I asked SerializeWithLengthPrefix to use fixed-width encoding and "field 0".
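On "how the int should be encoded": the alternative to a fixed 4-byte header is the variable-length (base-128 "varint") encoding that protobuf uses for integers. A rough C++ sketch, with `encode_varint` being a name made up for this example:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encodes value as a base-128 varint, 7 bits per byte,
// least significant group first; the high bit of a byte means "more bytes follow".
std::vector<std::uint8_t> encode_varint(std::uint32_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));  // last byte, high bit clear
    assert(!out.empty() && out.size() <= 5);          // a 32-bit value needs 1 to 5 bytes
    return out;
}
```

A length of 300 comes out as the two bytes AC 02. The header is smaller for short messages, but it is no longer a fixed width, which matters if you are relying on fixed offsets in a padded buffer.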

Marc Gravell
Thanks a lot Marc, always helpful. Can you explain in more detail what SerializeWithLengthPrefix actually does? Thanks again!
Roey
One more thing: I am deserializing with the C++ version of Protocol Buffers. Does DeserializeWithLengthPrefix exist in some form in the C++ version?
Roey
@Roey - I'll edit that info in...
Marc Gravell