It seems like there used to be way more binary protocols because of the very slow internet speeds of the time (dial-up). Lately I've been seeing everything being replaced by HTTP and SOAP/REST/XML. Why is this?

Are binary protocols really dead or are they just less popular? Why would they be dead or less popular?

+19  A: 

Binary protocols will always be more space efficient than text protocols. Even as internet speeds drastically increase, so do the amount and complexity of the information we wish to convey.

The text protocols you reference are outstanding in terms of standardization, flexibility and ease of use. However, I suspect there will always be applications where the efficiency of binary transport will outweigh those factors.

A great deal of information is binary in nature and will probably never be replaced by a text protocol. Video streaming comes to mind as a clear example.

Eric J.
I don't know, I can see it... <video><image timecode="0"><row rowIndex="0"><pixel colIndex="0"><red>5</red><green>100</green><blue>25</blue></pixel><pixel colIndex="1"><red>7</red><green>95</green><blue>25</blue></pixel>...
Jeffrey L Whitledge
@Jeffrey: At least you will be able to debug problems with notepad or vi ;-)
Eric J.
Are they always much better than text + gzip?
Martin Beckett
Except for compressed video with codecs where the compression is so bound to the data
Martin Beckett
@Martin: General-purpose compression is always less efficient than custom binary protocols. General algorithms must make assumptions about the variety and frequency of data. Extreme example: say you want to transmit a completely random series of 1's and 0's. A binary protocol will be 100% efficient (except for any header and routing info). If you represent that in, say, XML the best you could hope for is something like <b>1</b><b>0</b>... Not only do you have extra characters surrounding data, but a general compression algorithm can't assume you won't throw in other letters or numbers.
Eric J.
@Eric, but enough better to outweigh the disadvantages? For JPEG or movies you will always want binary; seeing the numbers is useless. But for most protocols, especially those with a lot of redundant text like XML, gzip does very well. If you are designing a protocol, consider a simple text stream, then gzip it and see if that is good enough before going binary.
Martin Beckett
@Martin: Always is a dangerous concept. It really depends on the application's needs. Certainly there is enough interest in binary protocols that vendors such as Microsoft continue to support them in current product releases (e.g. WCF). I suspect vendor decisions (at least collectively) are driven by market demand.
Eric J.
In the embedded world, there's often not enough memory available to process XML.
Tobias Langner
+2  A: 

There will always be a need for binary protocols in some applications, such as very-low-bandwidth communications. But there are huge advantages to text-based protocols. For example, I can use Firebug to easily see exactly what is being sent and received from each HTTP call made by my application. Good luck doing that with a binary protocol :)

Another advantage of text protocols is that even though they are less space efficient than binary, text data compresses very well, so the data may be automatically compressed to get the best of both worlds. See HTTP Compression, for example.
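
As a rough illustration, here is a minimal Python sketch (the repetitive XML payload is made up for the example):

    import gzip

    # A small, repetitive XML payload of the kind HTTP compression handles well.
    xml = b"<items>" + b"".join(
        b"<item><id>%d</id><name>widget</name></item>" % i for i in range(1000)
    ) + b"</items>"

    compressed = gzip.compress(xml)
    print(f"plain: {len(xml)} bytes, gzipped: {len(compressed)} bytes")
    # The redundant tags compress away; markup this repetitive typically
    # shrinks by an order of magnitude.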

Justin Ethier
If the protocol is documented, the equivalent of Firebug is feasible, just complicated.
Adriano Varoli Piazza
What's wrong with netcat hooked up to a hex editor? :)
Earlz
A protocol analyser such as [Wireshark](http://www.wireshark.org/) (previously Ethereal) can be used to understand binary protocols. If you're writing code for a binary protocol, it makes sense to make an analyser too.
Craig McQueen
+2  A: 

Binary protocols are not dead. It is much more efficient to send binary data in many cases.

WCF supports binary encoding using TCP. http://msdn.microsoft.com/en-us/library/ms730879.aspx

Raj Kaimal
A: 

Binary protocols will continue to live wherever efficiency is required. Mostly, they will live at the lower levels, where hardware implementations are more common than software ones. Speed isn't the only factor; simplicity of implementation is also important. Making a chip process binary data messages is much easier than making it parse text messages.
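
To make that concrete, here is a minimal Python sketch contrasting the two (the message layout is invented for the example):

    import struct

    # Hypothetical fixed-layout message: 1-byte type, 2-byte sensor id,
    # 4-byte reading, big-endian. Hardware can decode this with fixed
    # offsets and no parser state at all.
    msg = struct.pack(">BHI", 0x01, 42, 123456)
    msg_type, sensor_id, reading = struct.unpack(">BHI", msg)

    # The text equivalent needs tokenizing and integer parsing:
    text = b"TYPE=1;SENSOR=42;READING=123456"
    fields = dict(f.split(b"=") for f in text.split(b";"))
    reading_txt = int(fields[b"READING"])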

M.A. Hanin
A: 

Depends on the application... I think in real-time environments (FireWire, USB, field buses...) there will always be a need for binary protocols.

chrmue
+3  A: 

Facebook, Last.fm, and Evernote use the Thrift binary protocol.

Don Reba
A: 

So far the answers all focus on space and time efficiency. No one has mentioned what I feel is the number one reason for so many text-based protocols: sharing of information. It's the whole point of the Internet, and it's far easier to do with text-based, human-readable protocols that are also easily processed by machines. You rid yourself of language-dependent, application-specific, platform-biased programming with text data interchange.

Link in whatever XML/JSON/*-parsing library you want to use, find out the structure of the information, and snip out the pieces of data you're interested in.
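
For instance, a minimal Python sketch of that workflow (the payload and field names are hypothetical):

    import json

    # Any peer on any platform can produce this; we only take what we need.
    payload = '{"user": {"id": 7, "name": "ada", "roles": ["admin"]}, "ts": 1}'
    doc = json.loads(payload)
    print(doc["user"]["name"])  # snip out just the piece we care about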

Jonathon
+5  A: 

A parallel with programming languages is probably very relevant.

While high-level languages are the preferred tools for most programming jobs, and have been made possible (in part) by increases in CPU speed and storage capacity, they haven't removed the need for assembly language.

In a similar fashion, non-binary protocols introduce more abstraction and more extensibility, and are therefore the vehicle of choice for application-level communication in particular. They too have benefited from increases in bandwidth and storage capacity. Yet at lower levels it is still impractical to be so wasteful.

Furthermore, unlike with programming languages, where there are strong incentives to "take the performance hit" in exchange for added simplicity, speed of development, etc., the ability to structure communication in layers makes the complexity and "binary-ness" of the lower layers transparent to the application level. For example, as long as the SOAP messages one receives are OK, the application doesn't need to know that they were compressed in transit over the wire.

mjv
+2  A: 

Some binary protocols I've seen in the wild for Internet applications:

  • Google Protocol Buffers, which are used for internal communications but also in, for example, Google Chrome bookmark syncing (a sketch of the varint encoding they use follows this list)
  • Flash AMF, which is used for communication with Flash and Flex applications. Both Flash and Flex can communicate via REST or SOAP, but the AMF format is much more efficient for Flex, as some benchmarks show
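
To give a taste of where the efficiency comes from, here is a minimal sketch of the base-128 varint encoding Protocol Buffers uses for integers (hand-rolled for illustration; real code would use the generated protobuf classes):

    def encode_varint(n: int) -> bytes:
        """Protocol Buffers base-128 varint: 7 bits per byte, low group
        first, high bit set on every byte except the last."""
        out = bytearray()
        while True:
            byte = n & 0x7F
            n >>= 7
            if n:
                out.append(byte | 0x80)
            else:
                out.append(byte)
                return bytes(out)

    # 300 fits in two bytes -- versus the three ASCII digits plus
    # surrounding tags an XML encoding would spend.
    assert encode_varint(300) == b"\xac\x02"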
Jeduan Cornejo
+1  A: 

Are binary protocols dead?

Two answers:

  1. Let's hope so.
  2. No.

At least a binary protocol is better than XML, which provides all the readability of a binary protocol combined with even less efficiency than a well-designed ASCII protocol.

Norman Ramsey
+2  A: 

I rarely see this talked about, but binary protocols, block protocols especially, can greatly simplify the complexity of server architectures.

Many text protocols are implemented in such a way that the parser has no basis for inferring how much more data is needed before a logical unit has been received (XML and JSON can both indicate the minimum bytes necessary to finish, but can't provide a meaningful estimate of the total). This means the parser may have to periodically cede control to the socket-receiving code to retrieve more data. This is fine if your sockets are in blocking mode, not so easy if they're not. It generally means that all parser state has to be kept on the heap, not the stack.

If you have a binary protocol where very early in the receive process you know exactly how many bytes you need to complete the packet, then your receiving operations don't need to be interleaved with your parsing operations. As a consequence, the parser state can be held on the stack, and the parser can execute once per message and run straight through without pausing to receive more bytes.
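
A minimal sketch of that receive-then-parse split, assuming a hypothetical frame format with a 4-byte big-endian length prefix:

    import socket
    import struct

    def recv_exact(sock: socket.socket, n: int) -> bytes:
        """Block until exactly n bytes have arrived (or the peer closes)."""
        buf = bytearray()
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed mid-frame")
            buf += chunk
        return bytes(buf)

    def read_message(sock: socket.socket) -> bytes:
        # The prefix tells us exactly how much to receive...
        (length,) = struct.unpack(">I", recv_exact(sock, 4))
        # ...so the parser later gets one complete message and never has
        # to pause mid-parse to wait for more bytes.
        return recv_exact(sock, length)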

Kennet Belenky
A: 

Surely this depends entirely on the application? There have been two general types of example so far: XML/HTML-related answers and video/audio. One is designed to be 'shared', as noted by Jonathon, and the other to be efficient in its transfer of data (and without Matrix vision, 'reading' a movie would never be useful the way reading an HTML document is).

Ease of debugging is not a reason to choose a text protocol over a 'binary' one; the requirements of the data transfer should dictate that. I work in the aerospace industry, where the majority of communications are high-speed, predictable data flows such as altitude and radio frequencies, so they are assigned bits on a stream and no human-readable wrapper is required. This is also highly efficient to transfer and, other than interference detection, requires no metadata or protocol processing.
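
As an illustration of "assigned bits on a stream" (the field widths below are invented, not taken from any real avionics bus):

    # Hypothetical 32-bit word: 16 bits of altitude (feet), 12 bits of
    # radio channel, 4 bits of status flags.
    def pack_word(altitude: int, channel: int, status: int) -> int:
        return (altitude & 0xFFFF) << 16 | (channel & 0xFFF) << 4 | (status & 0xF)

    def unpack_word(word: int) -> tuple:
        return word >> 16, (word >> 4) & 0xFFF, word & 0xF

    assert unpack_word(pack_word(35000, 118, 0b0001)) == (35000, 118, 1)
    # 32 bits, no delimiters, no parsing: the receiver reads fixed bit
    # positions, exactly as a hardware decoder would.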

So certainly I would say that they are not dead.

I would agree that people's choices are probably affected by the fact that they have to debug them, but will also heavily depend on the reliability, bandwidth, data type, and processing time required (and power available!).

Kurucu
Absolutely so, please see my answer as well. The fact that many people enjoy debugging things in "deciphered" mode has today turned into the bad habit of having the computer decode from binary to text and re-encode back into binary, a redundant task once the human is no longer in the loop!
Etamar L.
+2  A: 

I'm really glad you have raised this question, as non-binary protocols have multiplied in usage manyfold since the introduction of XML. Ten years ago, you would see virtually everybody touting their "compliance" with XML-based communications. However, this approach, one of several text-based alternatives to binary protocols, has many deficiencies.

One of the touted values, for example, was readability. But readability matters mainly for debugging, when a human actually needs to read the transaction. Text protocols are very inefficient compared with binary transfers, because the XML is itself transmitted as a byte stream that has to be translated by another layer into textual fragments ("tokens"), and then back into binary for the contained data.

Another value people found was extensibility. But extensibility can easily be maintained if a protocol version number for the binary stream is placed at the beginning of the transaction. Instead of sending XML tags, one could send binary indicators. If the version number is unknown, the receiving end can download the "dictionary" for that version. The dictionary could, for example, be an XML file. But downloading the dictionary is a one-time operation, instead of a cost on every single transaction!
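
A minimal sketch of that versioned-binary idea, with an invented frame layout and field dictionary:

    import struct

    # Field dictionaries keyed by protocol version. In the scheme above, an
    # unknown version would trigger a one-time dictionary download (perhaps
    # as an XML file); here version 1 is simply pre-loaded.
    DICTIONARIES = {1: {0x01: "customer_id", 0x02: "order_total"}}

    def decode(frame: bytes) -> dict:
        # Frame layout (invented): 2-byte version, then repeated pairs of
        # (1-byte field indicator, 4-byte unsigned value).
        (version,) = struct.unpack_from(">H", frame, 0)
        fields = DICTIONARIES[version]
        result, offset = {}, 2
        while offset < len(frame):
            tag, value = struct.unpack_from(">BI", frame, offset)
            result[fields[tag]] = value
            offset += 5
        return result

    frame = struct.pack(">HBIBI", 1, 0x01, 42, 0x02, 1999)
    assert decode(frame) == {"customer_id": 42, "order_total": 1999}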

So efficiency could be kept together with extensibility, and very easily! There are a good number of "compiled XML" protocols out there which do just that.

Last, but not least, I have even heard people say that XML is a good way to overcome the little-endian and big-endian differences between binary systems, for example Sun computers vs. Intel computers. But this is incorrect: if both sides can interpret XML (ASCII) correctly, surely both sides can interpret binary correctly, since XML and ASCII are themselves transmitted as binary.
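
Indeed, a binary protocol just has to pick a byte order and say so; for example, with Python's struct module:

    import struct

    value = 0x12345678
    big = struct.pack(">I", value)     # network/big-endian byte order
    little = struct.pack("<I", value)  # little-endian byte order

    # Both sides agree on the wire format, so each decodes correctly
    # regardless of its native CPU endianness.
    assert struct.unpack(">I", big)[0] == value
    assert struct.unpack("<I", little)[0] == value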

Hope you find this interesting reading!

Etamar L.
A: 

Interesting discussion indeed, but does anyone have a good definition of what a binary protocol actually is? And what is a text protocol? How do the two compare in terms of bits sent on the wire?

Here's what Wikipedia says about binary protocols:

A binary protocol is a protocol which is intended or expected to be read by a machine rather than a human being (http://en.wikipedia.org/wiki/Binary_protocol)

oh come on!

To be more clear: if I want to send the string "ABC" on the wire, how would a binary protocol do that, and how would a text one, in terms of bits sent on the wire?
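
For instance, just to make the question concrete (both framings below are ones I'm imagining, not standards):

    # Text framing: the literal ASCII characters plus a CRLF terminator.
    text_frame = b"ABC\r\n"             # 41 42 43 0D 0A -> 5 bytes

    # Binary framing: a 1-byte length prefix, then the raw bytes.
    binary_frame = bytes([3]) + b"ABC"  # 03 41 42 43 -> 4 bytes

    # The payload bytes for "ABC" are identical either way; only the
    # framing around them differs.
    print(text_frame.hex(" "), "/", binary_frame.hex(" "))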

salutations!

der_grosse