Just a follow-up on the Carriage Return, then Line Feed research post.
Notepad detects an end-of-line (the cursor moves to column 0, one line down) when it finds the pattern CR+LF. This is the general format used by CP/M, MS-DOS, and Win32.
Unix detects an end-of-line when it finds an LF.
Apple (classic Mac OS) detects an end-of-line when it finds a CR.
From a Unicode perspective there is also a control character called NEXT LINE (NEL, U+0085), just to make the situation even more complex.
With the C programming language, why does it write out carriage return + line feed when you give it a newline character? For example:
printf("hello World\n");
The C specification says that the stdio library routines are supposed to convert the newline character to whatever it takes to actually go to the beginning of the next line on that platform, e.g. carriage return + line feed for Win32.
So when you write the newline character in either C or C++, on either Windows or Linux, the stdio library determines which bytes need to be output for an end of line on that platform.
This is evident when creating a binary file versus a text file in a C program. If you specify that you're writing a binary file, the stdio library leaves the output unchanged: when it comes across the newline character while writing the data to the file, it does not insert the platform-dependent characters for a new line.
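The text versus binary difference can be checked with a small experiment. The following is only a minimal sketch: the helper name write_and_measure and the file names used with it are made up for illustration. It writes a string through fopen in the requested mode and then counts the raw bytes that landed on disk.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write `text` to `path` using the given fopen mode
   ("w" = text mode, "wb" = binary mode) and return the number of raw bytes
   that ended up in the file, or -1 on error. */
static long write_and_measure(const char *path, const char *mode, const char *text)
{
    assert(path != NULL && mode != NULL && text != NULL);

    FILE *out = fopen(path, mode);
    if (out == NULL)
        return -1;
    fputs(text, out);
    if (fclose(out) != 0)
        return -1;

    /* Reopen in binary mode so no newline translation hides bytes from us. */
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    long size = 0;
    while (fgetc(in) != EOF)
        size++;
    fclose(in);
    return size;
}
```

Calling write_and_measure("demo.bin", "wb", "hello World\n") should always report 12 bytes, because binary mode stores the \n as a single LF. With mode "w" I would expect 13 bytes on Win32 (the LF becomes CR+LF) and 12 on Linux, where text and binary mode behave the same.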
Now for the conclusion I came to after all this.
Say you take the Win32 rules for Carriage Return + Line Feed literally and write, for example, the following into a file as a pure binary file, so each \n is stored as a lone LF with no CR.
MyText \n MyText \n MyText
And you assume it will render like this in your text editor, since a lone LF should only move down one line without returning to column 0.
MyText
       MyText
               MyText
Most editors instead will render it like this.
MyText
MyText
MyText
The confusion is mostly because the C standard uses the \n newline character for two different meanings. First, as a new line indicator for the stdio library to convert to the operating system's newline format (CR+LF on Win32, LF on Linux, and CR for Apple). Second, as just the hex value of line feed (0x0A).
Well, after 10 revisions and trying out different approaches on Win 3.1, 95, 98, and XP, I have come to the conclusion that I couldn't find an application that treats CR and LF independently and can use a combination of them in the same document. Most text editors will show a square when they hit a single CR or LF. Smarter text editors will switch the file format depending on whether they find CR+LF, LF, or CR, to match the appropriate platform.
Most if not all editors are only concerned with rendering a new line to the user and will switch between the different file formats. So if you're writing a lexer or string tokenizer any time soon and are worried about when to detect a new line, it's best for the lower levels to detect the file format (CR+LF Win32, LF Linux, CR Apple) and increment the line number accordingly. Or use ReadLine functionality that takes this into account.
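For the lexer case, the line counting could look something like the sketch below. The function name count_line_breaks is my own invention, not from any library. It treats CR+LF, a lone LF, and a lone CR each as exactly one end-of-line, so the line number stays correct whichever platform the file came from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the line breaks in a buffer, treating CR+LF,
   a lone LF, and a lone CR each as exactly one end-of-line. */
static size_t count_line_breaks(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t breaks = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            /* A CR directly followed by LF is one break: skip the LF. */
            if (i + 1 < len && buf[i + 1] == '\n')
                i++;
            breaks++;
        } else if (buf[i] == '\n') {
            breaks++;
        }
    }
    return breaks;
}
```

A lexer would do the same check one character at a time and bump its line counter instead of a local variable. The one subtle point is the look-ahead after a CR, which is what stops a Win32 file from being counted with twice as many lines.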
It is puzzling, to say the least, why Carriage Return + Line Feed was adopted by IBM and Win32 as the standard for instructing the text editor to render a new line, when in fact it's redundant. I could not find a single application that rendered or used Carriage Return and Line Feed independently, doing what their names suggest.
So if you're a university student writing the new text editor to amaze the world: automatically detect the file format, and don't worry about the actual technical meaning given to CR+LF.
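As a rough idea of what "automatically detect the file format" might mean in practice, here is a minimal sketch. The names eol_style and detect_eol are assumptions of mine, and real editors may well use different heuristics. It tallies each kind of line ending in a buffer and reports the most common one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for the detector below. */
typedef enum { EOL_NONE, EOL_CRLF, EOL_LF, EOL_CR } eol_style;

/* Hypothetical helper: guess a buffer's line-ending convention by counting
   CR+LF pairs, lone LFs, and lone CRs, then picking the most frequent.
   Ties are broken in the order CR+LF, LF, CR (an arbitrary choice). */
static eol_style detect_eol(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t crlf = 0, lf = 0, cr = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                crlf++;
                i++; /* the LF belongs to this CR+LF pair */
            } else {
                cr++;
            }
        } else if (buf[i] == '\n') {
            lf++;
        }
    }

    if (crlf == 0 && lf == 0 && cr == 0)
        return EOL_NONE;
    if (crlf >= lf && crlf >= cr)
        return EOL_CRLF;
    if (lf >= cr)
        return EOL_LF;
    return EOL_CR;
}
```

Running this over the first few kilobytes of a file is probably enough for an editor to decide which convention to display and save with. Counting instead of stopping at the first line ending is a deliberate choice here, so that one stray CR in an otherwise LF file does not flip the result.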