In the chapter "9.3. Old Faithful: RS-232C", the book "Designing Embedded Hardware" claims that email is still sent in a 7-bit character set because of RS-232C:

It's also not unheard of to see RS-232C systems still using 7-bit data frames (another leftover from the '60s), rather than the more common 8-bit. In fact, this is one of the reasons why you'll still see email being sent on the Internet limited to a 7-bit character set, just in case the packets happen to be routed via a serial connection that supports only 7-bit transmissions.

How can I confirm this claim?

+3  A: 

Check out the spec. The original RFC 822, for ARPA Internet Text Messages, explicitly states:

A message consists of header fields and, optionally, a body. The body is simply a sequence of lines containing ASCII characters.

Since ASCII is 7-bit, voilà.

Note, however, that there are many additions to that original spec, notably the MIME extensions, which add message header extensions for non-ASCII text.
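As a rough illustration of the two points above, here is a minimal Python sketch. It assumes Python 3 and its standard `email.header` module; the helper name `is_seven_bit` and the sample strings are made up for this example. It checks whether a byte string fits in 7 bits, then shows how a MIME encoded-word wraps a non-ASCII header value in pure ASCII.

```python
from email.header import Header, decode_header


def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte has its high bit clear (value below 0x80)."""
    return all(b < 0x80 for b in data)


# Plain ASCII text fits in 7 bits per character.
plain = "Hello, world".encode("ascii")

# UTF-8 encodes the accented letter as bytes above 0x7F, so this does not fit.
accented = "café".encode("utf-8")

# A MIME encoded-word represents the same text using ASCII characters only.
encoded_header = Header("café", "utf-8").encode()

# Decoding recovers the original bytes and the declared charset.
decoded_bytes, charset = decode_header(encoded_header)[0]

print(is_seven_bit(plain))
print(is_seven_bit(accented))
print(encoded_header)
print(decoded_bytes.decode(charset))
```

The encoded header is what actually travels in the message, so a transport that only passes 7-bit characters can still carry it.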

jvenema
+1  A: 

The Quoted-printable MIME encoding is specifically designed to encode 8-bit data in 7-bit characters. This encoding is widely used to encode email.
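To see roughly what that looks like, here is a small sketch using Python's standard `quopri` module (assuming Python 3; the sample text is invented for illustration). Bytes above 0x7F become `=XX` hex escapes, so the encoded form contains only 7-bit characters and decodes back to the original bytes.

```python
import quopri

# UTF-8 text containing bytes with the high bit set.
raw = "café costs 5€".encode("utf-8")

# Quoted-printable replaces each 8-bit byte with an "=XX" escape.
encoded = quopri.encodestring(raw)

# Decoding reverses the escapes and restores the original bytes.
decoded = quopri.decodestring(encoded)

print(encoded)
print(all(b < 0x80 for b in encoded))
print(decoded == raw)
```

Mostly-ASCII text stays readable in this encoding, which is one reason it suits email bodies better than base64 for ordinary prose.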

Note also that the text you quoted says "in case the packets happen to be routed via a serial connection", which is misleading, especially if they're talking in the context of IP packets. IP packets assume an 8-bit data path, and cannot be sent directly over a 7-bit RS-232 link without additional encoding (and then it's not a 7-bit data path anymore, it's 8-bit).

Greg Hewgill
Actually, the context is not necessarily IP packets. Emails are sufficiently well enveloped that they could be transmitted on their own as a data transmission protocol (albeit one without error correction). One example currently in use is the mbox file format, which is just a concatenation of raw email envelopes. This is the mailbox format used by Thunderbird. I have personally seen an email relay using RS-232 to send email to non-networked PCs.
slebetman
+1  A: 

The systems that were restricted to 7 bits were already old when email first became popular. The chances that you will find one today approach zero.

Since certain characters have special meaning to email programs (most notably the end-of-line character), it still makes sense to limit the character set.

Mark Ransom
Surely it makes sense to follow the standard? That would be a stronger reason than accommodating old systems.
John Saunders
When it comes down to it, the full stop (`.`) is still the most important character when it comes to email transmission. In any case, there were still 7-bit RS-232 links in commonplace use 10 years ago - think serial port concentrators and WYSE terminals in warehouses.
D.Shawley
@John Saunders: The standards were developed to accommodate the old systems, not the other way around. Of course the standards must be maintained, but I didn't think that was the question.
Mark Ransom
@D.Shawley: Certainly there were still serial ports in use, but were they really restricted to 7 bits? That wasn't my experience. And even if they were, were they routing email traffic?
Mark Ransom
@Mark: When I was at Uni way back in '97, a friend of mine ran an email relay service in his building using RS-232 and RS-422 (for Macs) for people whose machines didn't have networking capability. But the link was 1200,8N1, not 7-bit.
slebetman