views: 1613
answers: 7

Possible Duplicate:
What is the difference between \r and \n?

I'd like to know the difference (with examples if possible) between CR LF (Windows), LF (Unix) and CR (Macintosh) line break types.

+3  A: 

It's really just about which bytes are stored in a file. CR is the character code for carriage return (from the days of typewriters) and LF, similarly, for line feed. They are simply the bytes placed as end-of-line markers.
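For example, here's a quick Python sketch (my illustration, not part of the answer itself) showing exactly which bytes each convention stores:

    # Write the same two lines with each end-of-line convention and look at
    # the raw bytes: CR is 0x0D (b'\r'), LF is 0x0A (b'\n').
    for name, eol in [("Windows", b"\r\n"), ("Unix", b"\n"), ("old Mac", b"\r")]:
        print(name, b"line one" + eol + b"line two" + eol)

    # Windows b'line one\r\nline two\r\n'
    # Unix b'line one\nline two\n'
    # old Mac b'line one\rline two\r'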

Way more information, as always, on Wikipedia.

Peter
Thanks for making it simple to understand (for a noob). :)
Nimbuz
+4  A: 

CR and LF are control characters, coded 0x0D (13 decimal) and 0x0A (10 decimal) respectively.

They are used to mark line breaks in text files. As you indicated, Windows uses the two-character CR LF sequence; Unix uses only LF and the classic Macintosh only CR.

An apocryphal historical perspective
As indicated by Peter, CR = Carriage Return and LF = Line Feed, two expressions that have their roots in old typewriters / TTYs. LF moved the paper up (but kept the horizontal position identical) and CR brought back the "carriage" so that the next character typed would land at the leftmost position on the paper (but on the same line). CR+LF did both, i.e. prepared to type a new line. As time went by, the physical semantics of the codes no longer applied, and since memory and floppy disk space were at a premium, some OS designers decided to use only one of the characters; they just didn't communicate very well with one another ;-)

Most modern text editors and text-oriented applications offer options/settings that allow automatic detection of a file's end-of-line convention and display it accordingly.
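As a rough illustration (a heuristic sketch of my own, not any particular editor's algorithm), detection can be as simple as counting the markers:

    def detect_eol(data: bytes) -> str:
        """Heuristic: guess the end-of-line convention by counting markers."""
        crlf = data.count(b"\r\n")
        lf = data.count(b"\n") - crlf   # bare LFs only
        cr = data.count(b"\r") - crlf   # bare CRs only
        count, style = max((crlf, "CR LF (Windows)"),
                           (lf, "LF (Unix)"),
                           (cr, "CR (old Mac)"))
        return style if count else "no line breaks found"

    # file name is illustrative; assumes the file fits in memory
    print(detect_eol(open("example.txt", "rb").read()))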

mjv
Well, now I have to vote you *down* since you reverted my fix. 0x0c is 12 decimal, *not* 13.
paxdiablo
@Pax, thanks for catching the typo. The overwrite was accidental, probably happened while I was typing the trip to the 20th century ;-)
mjv
And there's your two rep (and my one) back. Cheers.
paxdiablo
LOL and thanks. We always get plenty of reps on these basic questions.
mjv
A: 

Systems based on ASCII or a compatible character set use either LF (Line feed, 0x0A, 10 in decimal) or CR (Carriage return, 0x0D, 13 in decimal) individually, or CR followed by LF (CR+LF, 0x0D 0x0A). These characters are based on printer commands: the line feed indicated that one line of paper should feed out of the printer, and a carriage return indicated that the printer carriage should return to the beginning of the current line.
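As a side note (my own Python illustration, relying on CPython's documented universal-newline behavior), text mode already recognizes all three conventions on read:

    # Text mode normalizes CR LF, bare CR and bare LF to '\n' on read;
    # writing in binary keeps the stored bytes explicit.
    with open("demo.txt", "wb") as f:        # file name is illustrative
        f.write(b"a\r\nb\nc\r")              # mixed CR LF, LF and bare CR

    with open("demo.txt", "r") as f:
        print(f.readlines())                 # ['a\n', 'b\n', 'c\n']
        print(f.newlines)                    # ('\r', '\n', '\r\n') - all seen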

Here are the details.

pierr
+3  A: 

This is a good summary I found:

The Carriage Return (CR) character (0x0D, \r) moves the cursor to the beginning of the line without advancing to the next line. This character is used as a new line character in Commodore and early Macintosh operating systems (Mac OS 9 and earlier).

The Line Feed (LF) character (0x0A, \n) moves the cursor down to the next line without returning to the beginning of the line. This character is used as a new line character in UNIX-based systems (Linux, Mac OS X, etc.).

The End of Line (EOL) sequence (0x0D 0x0A, \r\n) is actually two ASCII characters, a combination of the CR and LF characters. It moves the cursor both down to the next line and to the beginning of that line. This sequence is used as a new line marker in most other non-Unix operating systems, including Microsoft Windows, Symbian OS and others.

Source
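Building on that summary, here is a small Python sketch (my own illustration) that normalizes any of the three conventions to a chosen style; note that CR LF must be matched before bare CR or LF:

    import re

    def convert_eol(data: bytes, eol: bytes = b"\n") -> bytes:
        """Normalize CR LF, bare CR and bare LF endings to one style."""
        return re.sub(rb"\r\n|\r|\n", eol, data)   # CR LF matched first

    print(convert_eol(b"a\r\nb\rc\n", b"\r\n"))    # b'a\r\nb\r\nc\r\n'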

Taylor Leese
+1  A: 

CR - ASCII code 13.

LF - ASCII code 10.

Theoretically, CR returns the cursor to the first position (on the left) and LF feeds one line, moving the cursor one line down. This is how printers and text-mode monitors were controlled in the old days. These characters are usually used to mark the end of lines in text files, and different operating systems adopted different conventions: as you pointed out, Windows uses the CR/LF combination, while classic Macs use just CR, and so on.
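You can still see that cursor behavior in a terminal today. A minimal Python sketch (assumes a terminal that honors a bare \r, which most do):

    import sys
    import time

    # A bare CR returns the cursor to column one without advancing a line,
    # which is how simple console progress indicators overwrite themselves.
    for pct in range(0, 101, 20):
        sys.stdout.write("\rprogress: %3d%%" % pct)
        sys.stdout.flush()
        time.sleep(0.2)
    sys.stdout.write("\n")   # finally an LF to move on to the next line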

DmitryK
A: 

The sad state of "record separators" or "line terminators" is a legacy of the dark ages of computing.

Now, we take it for granted that anything we want to represent is in some way structured data and conforms to various abstractions that define lines, files, protocols, messages, markup, whatever.

But once upon a time this wasn't exactly true. Applications built in control characters and device-specific processing. The brain-dead systems that required both CR and LF simply had no abstraction for record separators or line terminators. The CR was necessary to get the teletype or video display to return to column one, and the LF (today NL, same code) was necessary to get it to advance to the next line. I guess the idea of doing something other than dumping the raw data to the device was too complex.

Unix and Mac actually specified an abstraction for the line end, imagine that. Sadly, they specified different ones. (Unix, ahem, came first.) And naturally, they used a control code that was already "close" to S.O.P.

Since almost all of our operating software today is a descendant of Unix, Mac, or MS operating software, we are stuck with the line-ending confusion.

DigitalRoss
+1  A: 

Jeff Atwood has a recent blog post about this: The Great Newline Schism

Here is the essence from Wikipedia:

The sequence CR+LF was in common use on many early computer systems that had adopted teletype machines, typically an ASR33, as a console device, because this sequence was required to position those printers at the start of a new line. On these systems, text was often routinely composed to be compatible with these printers, since the concept of device drivers hiding such hardware details from the application was not yet well developed; applications had to talk directly to the teletype machine and follow its conventions. The separation of the two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in one-character time. That is why the sequence was always sent with the CR first. In fact, it was often necessary to send extra characters (extraneous CRs or NULs, which are ignored) to give the print head time to move to the left margin. Even after teletypes were replaced by computer terminals with higher baud rates, many operating systems still supported automatic sending of these fill characters, for compatibility with cheaper terminals that required multiple character times to scroll the display.

Manu
+1 for coding horror reference :)
Mauricio