What are the biggest pros and cons of Apache Thrift vs Google's Protocol Buffers?

+3  A: 

Protocol Buffers seems to have a more compact representation, but that's only an impression I get from reading the Thrift whitepaper. In their own words:

We decided against some extreme storage optimizations (i.e. packing small integers into ASCII or using a 7-bit continuation format) for the sake of simplicity and clarity in the code. These alterations can easily be made if and when we encounter a performance-critical use case that demands them.
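
For reference, the "7-bit continuation format" mentioned in the quote is the idea behind the base-128 varints Protocol Buffers uses for integers: each byte carries 7 bits of payload, and the high bit says whether another byte follows. A minimal sketch of that idea (the function names are mine, not taken from either library):

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer with 7 payload bits per byte."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low_bits = value & 0x7F      # lowest 7 bits go out first
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


print(encode_varint(1).hex())    # 01
print(encode_varint(300).hex())  # ac02
```

Small values fit in one byte instead of a fixed four or eight, which is where much of the size difference for integer-heavy messages would come from.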

Also, it may just be my impression, but Protocol Buffers seems to have some thicker abstractions around struct versioning. Thrift does have some versioning support, but it takes a bit of effort to make it happen.
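
As a rough illustration of what struct versioning relies on in both systems: every field is written with a numeric ID, so an older reader can skip IDs it does not know about. The sketch below uses a made-up toy layout (one byte of field ID, one byte of length, then the payload); it is not the wire format of either Thrift or Protocol Buffers, and all names in it are hypothetical.

```python
from __future__ import annotations


def encode_fields(fields: dict[int, bytes]) -> bytes:
    """Write each field as: id byte, length byte, payload (toy format)."""
    out = bytearray()
    for field_id, payload in fields.items():
        if not (0 <= field_id < 256 and len(payload) < 256):
            raise ValueError("toy format: ids and lengths must fit in one byte")
        out += bytes([field_id, len(payload)]) + payload
    return bytes(out)


def decode_known(data: bytes, schema: dict[int, str]) -> dict[str, bytes]:
    """Read all fields, keeping only the ids this reader's schema knows."""
    record = {}
    pos = 0
    while pos < len(data):
        field_id, length = data[pos], data[pos + 1]
        payload = data[pos + 2 : pos + 2 + length]
        pos += 2 + length
        name = schema.get(field_id)
        if name is not None:       # unknown ids are skipped, not an error
            record[name] = payload
    return record


V1_SCHEMA = {1: "name"}                 # old reader
V2_SCHEMA = {1: "name", 2: "email"}     # new reader, field 2 added later

message_v2 = encode_fields({1: b"Ada", 2: b"ada@example.com"})
print(decode_known(message_v2, V1_SCHEMA))  # {'name': b'Ada'}
```

The old reader still parses the newer message because the length prefix tells it how far to skip; that is the property that makes adding fields safe as long as IDs are never reused.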

Daniel Spiewak
+17  A: 

They both offer many of the same features; however, there are some differences:

  • Thrift supports 'exceptions'
  • Protocol Buffers have much better documentation/examples
  • Thrift has a builtin Map and Set type
  • Protocol Buffers allow "extensions" - you can extend an external proto to add extra fields, while still allowing external code to operate on the values. There is no way to do this in Thrift
  • I find Protocol Buffers much easier to read

Basically, they are fairly equivalent (with Protocol Buffers slightly more efficient from what I have read).

hazzen
+7  A: 
  • Protobuf serialized objects are about 30% smaller than Thrift's.
  • Most actions you may want to perform on protobuf objects (create, serialize, deserialize) are much slower than with Thrift.
  • Thrift has richer data structures (Map, Set).
  • The protobuf API looks cleaner, though the generated classes are all packed as inner classes, which is not so nice.
  • Thrift enums are not real Java enums, i.e. they are just ints. Protobuf has real Java enums.

For a closer look at the differences, check out the source code diffs at this open source project.

eishay
That's "much slower when not optimising for speed"...
Jon Skeet
Here's the re-test with optimizing on: http://eishay.blogspot.com/2008/11/protobuf-with-option-optimize-for-speed.html
Scott Bilas
Quick suggestion: it'd be neat if there were another non-binary format (XML or JSON?) used as the baseline. There haven't been good tests that show general trends; the assumption is that PB and Thrift are more efficient, but whether they are, and by how much, is mostly an open question.
StaxMan
0.02 seconds?! I don't have that kind of time to spare
Chris S
+5  A: 

One obvious thing not yet mentioned, which can be either a pro or a con (and is the same for both), is that they are binary protocols. This allows for a more compact representation and possibly better performance (pros), but at the cost of reduced readability (or rather, debuggability), a con.
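
To make the trade-off concrete, here is a small sketch that writes the same record once as JSON text and once as a fixed binary layout using Python's standard `struct` module. The record and its layout are invented for illustration; neither Thrift nor Protocol Buffers encodes data exactly this way.

```python
import json
import struct

# Hypothetical record used only for this comparison.
record = {"id": 12345, "score": 0.75, "active": True}

# Text form: self-describing and readable in any editor or log file.
text_bytes = json.dumps(record).encode("utf-8")

# Binary form: "<" = little-endian without padding, "I" = 4-byte unsigned int,
# "d" = 8-byte float, "?" = 1-byte bool. Field names live in the schema, not the data.
binary_bytes = struct.pack("<Id?", record["id"], record["score"], record["active"])

print(text_bytes)          # human-readable
print(binary_bytes.hex())  # opaque without the schema
print(len(text_bytes), len(binary_bytes))

# Reading the binary form back requires knowing the layout up front.
decoded = struct.unpack("<Id?", binary_bytes)
```

The binary form is 13 bytes here and the JSON form is several times larger, but only the JSON can be inspected without the schema in hand, which is the debuggability cost described above.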

Also, both have a bit less tool support than standard formats like XML (and maybe even JSON).

(EDIT) Here's an interesting comparison that tackles both size and performance differences, and includes numbers for some other formats (XML, JSON) as well.

StaxMan
+12  A: 

Another important difference is the set of languages supported by default.

  • protobuf: Java, C++, Python
  • Thrift: Java, C++, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, OCaml

Both could be extended to other platforms, but these are the language bindings available out of the box.

Mike Gray
+6  A: 

And according to the wiki the Thrift runtime doesn't run on Windows.

Stuart
+3  A: 

RPC is another key difference. Thrift generates code to implement RPC clients and servers, whereas Protocol Buffers seems mostly designed as a data-interchange format alone.
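
For anyone unfamiliar with what "generating RPC clients and servers" buys you, here is a hand-written sketch of the two pieces such generated code typically provides: a client stub that turns a method call into a message, and a server-side dispatcher that turns the message back into a call. Everything here is hypothetical (the `Calculator` service, the JSON encoding, the in-process "transport"); it does not use or imitate the actual Thrift API.

```python
import json


class CalculatorHandler:
    """Server-side business logic for a hypothetical service."""

    def add(self, a: int, b: int) -> int:
        return a + b


def serve_request(handler, request_bytes: bytes) -> bytes:
    """Server side: decode the request, dispatch to the handler, encode the reply."""
    request = json.loads(request_bytes)
    method = getattr(handler, request["method"])  # a real server would validate this name
    result = method(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class CalculatorClient:
    """Client stub: looks like a local object, but each call becomes a message."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking bytes and returning bytes

    def add(self, a: int, b: int) -> int:
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = self._transport(request)
        return json.loads(reply)["result"]


# Wire the stub straight to the dispatcher; a socket would normally sit in between.
handler = CalculatorHandler()
client = CalculatorClient(lambda req: serve_request(handler, req))
print(client.add(2, 3))  # 5
```

With a data-interchange format alone you get the encode and decode steps, and the stub, dispatcher, and transport are left to you or to a separate library.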

saidimu
That's not true. Protocol Buffers defines an RPC service API, and there are some libraries available to implement the message passing.
Stephen
I didn't say Protobuf does not have RPC defined, just that it doesn't seem to have been designed for that, at least not the external release everyone has access to. Read this Google engineer's comment [here](http://steve.vinoski.net/blog/2008/07/13/protocol-buffers-leaky-rpc/#comment-1093)
saidimu
More importantly, Thrift has RPC support built in. Protobuf currently relies on third-party libraries, meaning fewer eyes, less testing, less reliable code.
Alec Thomas
+1  A: 

I was able to get better performance with a text-based protocol than with protobuf on Python. However, the text-based protocol has no type checking or other fancy features such as UTF-8 conversion, which protobuf offers.

So, if serialization/deserialization is all you need, then you can probably use something else.

http://dhruvbird.blogspot.com/2010/05/protocol-buffers-vs-http.html

dhruvbird
