I assume everyone here is familiar with the adage that all text files should end with a newline. I've known of this "rule" for years but I've always wondered — why?
Presumably simply that some parsing code expected it to be there.
I'm not sure I would consider it a "rule", and it certainly isn't something I adhere to religiously. Most sensible code will know how to parse text (including encodings) line-by-line (any choice of line endings), with-or-without a newline on the last line.
Indeed - if you end with a new line: is there (in theory) an empty final line between the EOL and the EOF? One to ponder...
Basically there are many programs which will not process files correctly if they don't get the final EOL EOF.
GCC warns you about this because it's expected as part of the C standard. (section 5.1.1.2 apparently)
http://stackoverflow.com/questions/72271/no-newline-at-end-of-file-compiler-warning
I personally like new lines at the end of source code files.
It may have its origin with Linux or all UNIX systems for that matter. I remember there compilation errors (gcc if I'm not mistaken) because source code files did not end with an empty new line. Why was it made this way one is left to wonder.
Each line should be terminated in a newline character, including the last one. Some programs have problems processing the last line of a file if it isn't newline terminated.
GCC warns about it not because it can't process the file, but because it has to as part of the standard.
The C language standard says A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character.
Since this is a "shall" clause, we must emit a diagnostic message for a violation of this rule.
This is in section 2.1.1.2 of the ANSI C 1989 standard. Section 5.1.1.2 of the ISO C 1999 standard (and probably also the ISO C 1990 standard).
Reference: The GCC/GNU mail archive.
It may be related to the difference between:
- text file (each line is supposed to end in an end-of-line)
- binary file (there are no true "lines" to speak of, and the length of the file must be preserved)
If each line does end in with an end-of-line, this avoids, for instance, that concatenating two text files would make the last line of the one run into the first line of the other.
Plus, an editor can check at load whether the file ends in and end-of-line, saves it in its local option 'eol', and uses that when writing the file.
A few years back (2005), many editors (ZDE, Eclipse, Scite, ...) did "forget" that final EOL, which was not very appreciated.
Not only that, but they interpreted that final EOL incorrectly, as 'start a new line', and actually start to display another line as if it already existed.
This was very visible with a 'proper' text file with a well-behaved text editor like vim, and open it in one of the above editors. It displayed an extra line below the real last line of the file. You see something like this:
1 first line
2 middle line
3 last line
4
Imagine that the file is being processed while the file is still being generated by another process.
It might have to do with that? A flag that indicates that the file is ready to be processed.
IMHO, it's a matter of personal style and opinion.
In olden days, I didn't put that newline. A character saved means more speed through that 14.4K modem.
Later, I put that newline so that it's easier to select the final line using shift+downarrow.
This originates from the very early days when simple terminals were used. The newline char was used to trigger a 'flush' of the transferred data.
Today, the newline char isn't required anymore. Sure, many apps still have problems if the newline isn't there, but I'd consider that a bug in those apps.
If however you have a text file format where you require the newline, you get simple data verification very cheap: if the file ends with a line that has no newline at the end, you know the file is broken. With only one extra byte for each line, you can detect broken files with high accuracy and almost no CPU time.
There's at least one hard advantage to this guideline when working on a terminal emulator: when more
ing the file on the command line (more
displays the file) or cat
ting multiple files (cat
= concatenate input files), it results in a correct display. Otherwise the result is something like this:
konrad$ more testfile.txt
Line one
Line twokonrad$
This has tripped me up repeatedly and while it's no big deal in general, it can make working with cat
to concatenate files harder.
Files should not necessarily end with a new line.
For example, my signature files used by my email client don't end with a new line because if they would, my email message would have an empty line at the end and I don't like it.
I was always under the impression the rule came from the days when parsing a file without an ending newline was difficult. That is, you would end up writing code where an end of line was defined by the EOL character or EOF. It was just simpler to assume a line ended with EOL.
However I believe the rule is derived from C compilers requiring the newline. And as pointed out on “No newline at end of file” compiler warning, #include will not add a newline.