I have a multi-line ASCII string coming from some (Windows/UNIX/...) system. I know that the newline sequence differs between Windows and UNIX (CR+LF vs. LF), and I want to scan this string for both the CR and LF characters to detect which newline sequence it uses. To do that, I need to know what "\n" means in VS6 C++.

My question is: if I write a piece of code in Visual Studio 6 for Windows:

bool FindNewline (const string & inputString) {
    // find() returns string::npos when no LF character is present
    return inputString.find ("\n") != string::npos;
}

does this search for CR+LF or only LF? Should I write "\r\n", or does the compiler interpret "\n" as CR+LF?

+1  A: 
inputString.find ("\n");

will search for the LF character (alone).

Library routines may 'translate' between CR/LF and '\n' when I/O is performed on a text stream, but inside the realm of your program code, '\n' is just a line-feed.
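To make that concrete, here is a minimal sketch. The helper names FindLf and FindCrLf are hypothetical and not part of the question's code. It assumes the string is already in memory with its bytes untouched, so no I/O translation is involved.

```cpp
#include <string>

// Hypothetical helper: index of the first LF (0x0A), or std::string::npos.
// "\n" is a one-character literal, so this never matches a CR.
std::string::size_type FindLf(const std::string& text) {
    return text.find("\n");
}

// Hypothetical helper: index of the first CR+LF pair, or std::string::npos.
std::string::size_type FindCrLf(const std::string& text) {
    return text.find("\r\n");
}
```

For the string "ab\r\ncd", FindLf returns 3 (the LF itself) and FindCrLf returns 2 (the CR that starts the pair). For "ab\ncd", FindLf returns 2 and FindCrLf returns std::string::npos.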

Michael Burr
+1  A: 

"\n" means "\n". Nothing else. So you search for LF only. However, the Microsoft CRT does some conversions for you when you read a file in text mode, so you can sometimes write simpler code.

ybungalobill
+1  A: 

Apart from the VS6 part (you really, really want to upgrade this, the compiler is way out of date and Microsoft doesn't really support it anymore), the answer to the question depends on how you are getting the string.

For example, if you read it from a file in text mode, the runtime library will translate \r\n into \n. So if all your text strings are read in text mode via the usual file-based APIs, your search for \n (i.e., newline only) would be sufficient.

If the strings originate in files that are read in binary mode on Windows and are known to contain the DOS/Windows line separator \r\n, then you're better off searching for that character sequence.
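As a rough illustration of the binary-mode case, the sketch below reads a whole file with std::ios::binary so that any \r\n pairs reach the program unchanged. ReadFileBinary is a hypothetical helper name, and the code assumes the file fits comfortably in memory.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: read the whole file without newline translation.
// std::ios::binary tells the runtime to pass CR and LF through as-is.
// Returns an empty string if the file cannot be opened or is empty.
std::string ReadFileBinary(const char* path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy every byte of the file into the buffer
    return buffer.str();
}
```

Without the std::ios::binary flag, a Windows runtime would hand back \n for each \r\n in the file, and a search for "\r\n" in the result would find nothing.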

EDIT: If you do get it in binary form, yes, ideally you'd have to check for both \r\n and \n. However, I would expect that the two aren't mixed within one string unless it's a really messed up data format. I would probably check for \r\n first and then \n second if the strings are short enough that scanning them twice doesn't make much of a difference. If it does, I'd write some code that checks for both \r\n and a single \n in one pass.
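A single-pass check along those lines might look like the sketch below. The enum NewlineStyle and the function DetectNewlineStyle are hypothetical names chosen for illustration. The code assumes the string was obtained in binary form and treats a lone \r (with no following \n) as ordinary data.

```cpp
#include <string>

// Hypothetical result type for the illustration.
enum NewlineStyle { NoNewline, LfOnly, CrLfOnly, Mixed };

// Scan the text once, classifying every LF by whether a CR precedes it.
NewlineStyle DetectNewlineStyle(const std::string& text) {
    bool sawLf = false;    // an LF not preceded by CR (UNIX style)
    bool sawCrLf = false;  // an LF preceded by CR (Windows style)
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] != '\n')
            continue;
        if (i > 0 && text[i - 1] == '\r')
            sawCrLf = true;
        else
            sawLf = true;
    }
    if (sawLf && sawCrLf) return Mixed;
    if (sawCrLf) return CrLfOnly;
    if (sawLf) return LfOnly;
    return NoNewline;
}
```

With this, "a\r\nb\r\n" is classified as CrLfOnly, "a\nb" as LfOnly, and "a\r\nb\nc" as Mixed, which is the "messed up data format" case mentioned above.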

Timo Geusch
Since I am getting the string in a binary form, I want to check both cases (the LF and CR+LF case). In this environment I should search for "\n" and "\r\n", right? Does "\n" in my case mean LF and does "\r\n" mean CR+LF ?
kliketa
@Timo: And VS6 is not my choice :)
kliketa
+1  A: 

All translation between "\n" and "\r\n" happens during I/O. At all other times, "\n" is just that and nothing more.

Somehow: return (found != string::npos ? true : false); reminds me of another answer I wrote a while back.

Jerry Coffin
Since I am getting the string in a binary form, I want to check both cases (the LF and CR+LF case). In this environment I should search for "\n" and "\r\n", right? Does "\n" in my case mean LF and does "\r\n" mean CR+LF ?
kliketa
@Kliketa: yes to both your questions.
Jerry Coffin