Yesterday, I had a discussion with my colleagues about HTTP. Someone asked why HTTP was designed as a plain-text protocol. Surely it could have been designed as a binary protocol, just like TCP, using flags to represent the different methods (POST, GET) and variables (HTTP headers). So why was HTTP designed this way? Are there any technical or historical reasons?

+5  A: 

Many Internet application protocols use more or less plain text for the protocol (see FTP, POP, SMTP, etc.).

It makes interoperability and troubleshooting much easier.

Michael Burr
Especially when you can open a telnet session and fake one or both sides of the conversation for debugging.
Paul Tomblin
+2  A: 

So it's easier to "read" the traffic or create a client or server?

You can debate whether it actually makes it easier, but surely that was the intent.

Eli
It sure does - you look at the raw traffic, and you see what's going on. Most humans are better at interpreting strings than raw binary bytes, whereas computers are fast enough that the performance hit is negligible.
Piskvor
+5  A: 

HTTP stands for "Hypertext Transfer Protocol".

It was initially devised as a way to serve text documents, hence the text based protocol.

What we do with HTTP now is far beyond its original intent.

FlySwat
So are a lot of other internet protocols that are not "hypertext".
icelava
+20  A: 

A reason that's both technical and historical is that text protocols are almost always preferred in the Unix world.

Well, this is not really a reason but a pattern. The rationale behind it is that text protocols allow you to see what's going on on the network just by dumping everything that goes through. You don't need a specialized analyzer as you do for TCP/IP. This makes them easier to debug and easier to maintain.

Not only HTTP, but many protocols are text based (e.g., SMTP).

You might want to take a look at The Art of Unix Programming for a much more detailed explanation of this Unix thing.

PolyThinker
The discussion in TAOUP on why text protocols are good is very, very relevant. Also, look at the number of bugs in implementations of protocols described in things like ASN.1. (And they're often easier than binary protocols not described in ASN.1!)
Jonathan Leffler
A: 

TCP and other lower layer protocols need to be very fast, and thus, they can't be verbose.

The other reason is that binary protocols are nearly impossible to test manually and harder to implement for a few reasons, including endianness.

maurycy
+4  A: 

As per RFC 2616 section 3.7.1 for HTTP 1.1, the key delimiter for a command line or header is the text line break CRLF. Text-based application protocols make it easier to carry out a conversation (for troubleshooting) purely with a Telnet client. They also make it easier to program with ReadLine() calls and text string matching.
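
A minimal sketch of that line-oriented parsing, written in Python rather than with the ReadLine() calls mentioned above. The function name parse_request and the sample request are made up for illustration, and the sketch ignores details such as header folding, repeated headers, and message bodies:

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.x request head into its request line and headers."""
    # The head ends at the first empty line, i.e. CRLF followed by CRLF.
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # First line: method, request target, protocol version.
    method, target, version = lines[0].split(" ", 2)

    # Remaining lines: one "Name: value" pair per CRLF-terminated line.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


# Hypothetical request, typed the way one might type it into a Telnet session.
sample = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: text/html\r\n\r\n"
method, target, version, headers = parse_request(sample)
print(method, target, version)
print(headers)
```

Everything here is plain string splitting and matching; no bit offsets or byte-order handling is involved.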

The CRLF line break also allows nearly unlimited arbitrary header extensions, unlike fixed-size TCP or IP headers, where fields are hard-coded at bit offsets.
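
A rough Python sketch of the contrast. The fixed binary layout below is invented purely for this example (it is not TCP, IP, or any real protocol), and the X-Example-Trace header is likewise hypothetical:

```python
import struct

# Hypothetical fixed layout, invented for this sketch:
# 1-byte method code, 1-byte version, 2-byte big-endian path length.
FIXED_HEADER = struct.Struct("!BBH")


def parse_fixed(raw: bytes):
    """Read fields at hard-coded offsets; the layout has no slot for new fields."""
    method_code, version, path_len = FIXED_HEADER.unpack_from(raw, 0)
    start = FIXED_HEADER.size
    path = raw[start:start + path_len].decode("ascii")
    return method_code, version, path


def parse_text_headers(block: str):
    """Read CRLF-separated "Name: value" lines into a dict."""
    headers = {}
    for line in block.split("\r\n"):
        if line:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
    return headers


# A made-up extension header needs no parser change on the text side.
old = parse_text_headers("Host: example.com\r\n")
new = parse_text_headers("Host: example.com\r\nX-Example-Trace: abc123\r\n")
print(old)
print(new)

# The fixed layout only understands the three fields it was defined with.
packet = FIXED_HEADER.pack(1, 1, 5) + b"/docs"
print(parse_fixed(packet))
```

Adding a field to the fixed layout would mean changing the format string and every peer that reads it, whereas the text parser simply picks up one more key.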

icelava
+6  A: 

With HTTP, the content of a request is almost always orders of magnitude larger than the protocol overhead. Converting the protocol into a binary one would save very little bandwidth, and the easy debuggability that a text protocol offers easily trumps the minor bandwidth savings of a binary protocol.
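
A back-of-the-envelope check of that ratio in Python, using a response as the example message. The header values and the 100 KiB body size are assumptions chosen for illustration, not measurements of real traffic:

```python
# Made-up example response head; the header set is an assumption.
header_block = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 102400\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
).encode("ascii")

body_size = 100 * 1024  # assumed body size in bytes
header_size = len(header_block)

# Share of all bytes on the wire that is protocol text rather than content.
overhead = header_size / (header_size + body_size)

print(f"headers: {header_size} bytes, body: {body_size} bytes")
print(f"protocol overhead: {overhead:.2%} of the bytes on the wire")
```

Even if a binary encoding shrank the header block considerably, the saving would be bounded by that small share. The ratio looks different for tiny messages with many headers, so this is only one illustrative data point.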

Adam Rosenfield
+1  A: 

Historically, it all starts from RFC 822 (STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES), whose latest version is RFC 5322 (Internet Message Format). SMTP (RFC 821) was one of the most popular protocols based on RFC 822. And HTTP was born out of SMTP (your mail protocol).

yogman