What's the best way in C# to determine the line endings used in a text file (Unix, Windows, Mac)?

A: 

There is Environment.NewLine, though that only tells you what the current system uses and won't help with reading files from various sources.

When reading, I usually look for \n (Edit: apparently some files use only \r) and assume that the line ends there.

Don
hmm, I'd switch your paragraphs around--the second paragraph is an answer--not sure that `Environment.NewLine` is terribly relevant
STW
As far as I could see, the question doesn't mention whether this is for reading from various sources or writing to multiple targets, and the only tag at the time was C#. I considered `Environment.NewLine` useful if the question related to writing "correctly" on other platforms (Mono, etc.), for example. Either way, I didn't spend much time considering the ordering of the paragraphs.
Don
A: 

I would imagine you can't know for sure; you would have to set this in the editor. You could use some AI. The algorithm would be:

  1. Search for each type of line ending (you'd search for those specific characters).
  2. Measure the distances between them.
  3. If one type tends to repeat, assume that's the type. Count the repeats and use some measure of dispersion.

So, for example, if you had repeats of CRLF at 38, 40, 45, and that was within tolerance you'd default to assuming the line end was CRLF.

Curtis White
A: 

If it were me, I'd just read the file one char at a time until I came across the first \r or \n. This assumes you have sensible input.

zildjohn01
+1  A: 

I'd just search the file for the first \r or \n. If it was a \r, I'd look at the next character to see if it's a \n; if so, the line ending is \r\n, otherwise it's whichever character was found.
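A minimal sketch of this first-line-break check, not the poster's own code. The class and method names (`FirstLineEnding`, `Detect`) are made up for illustration. Because a forward scan of a Windows file hits the \r first, the sketch checks the character after a \r. It assumes the first line break is representative of the whole file.

```csharp
using System;
using System.IO;

// Hypothetical helper, named here only for illustration.
public static class FirstLineEnding
{
    // Returns "\r\n", "\r", or "\n" for the first line break found,
    // or an empty string when the text contains no line break at all.
    public static string Detect(TextReader reader)
    {
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\n') return "\n";
            if (c == '\r')
            {
                // A CR directly followed by LF is a Windows line ending.
                return reader.Peek() == '\n' ? "\r\n" : "\r";
            }
        }
        return string.Empty;
    }
}
```

For a file on disk you could pass a `StreamReader` opened on the path; for a quick check, `FirstLineEnding.Detect(new StringReader(text))` works on an in-memory string.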

ho1
A: 

When reading most textual formats, I usually look for \n and then Trim() the whole string (whitespace at the beginning and end is often redundant).

Yossarian
A: 

Here is some advanced guesswork: read the file, count CRs and LFs

if (CR > LF*2) then "Mac" 
else if (LF > CR*2) then "Unix"
else "Windows"

Also note that newer Macs (Mac OS X) use Unix line endings.
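The counting rule above could be wrapped in a small function along these lines. This is only a sketch: `LineEndingGuess` and `Guess` are hypothetical names, and it assumes the text is already in memory as a string.

```csharp
using System;

// Hypothetical helper, named here only for illustration.
public static class LineEndingGuess
{
    // Counts CR and LF characters and applies the 2:1 ratio rule from the answer.
    public static string Guess(string text)
    {
        int cr = 0, lf = 0;
        foreach (char c in text)
        {
            if (c == '\r') cr++;
            else if (c == '\n') lf++;
        }
        if (cr > lf * 2) return "Mac";
        if (lf > cr * 2) return "Unix";
        return "Windows";
    }
}
```

One consequence of the rule as written: text with no line breaks at all (both counts zero) falls through to "Windows", so a caller may want to treat that case separately.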

unbeli
+2  A: 

Note that text files may have inconsistent line endings, and your program should not choke on that. Using ReadLine on a StreamReader (and similar methods) handles \n, \r, and \r\n automatically.
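To illustrate, here is a small sketch that reads text with mixed line endings through `TextReader.ReadLine`. The wrapper class `ReadLineDemo` and its `ReadAll` method are hypothetical names; a `StringReader` stands in for a `StreamReader` so the example needs no file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical wrapper, named here only for illustration.
public static class ReadLineDemo
{
    // Collects every line returned by ReadLine until the reader is exhausted.
    public static List<string> ReadAll(TextReader reader)
    {
        var lines = new List<string>();
        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

Calling `ReadLineDemo.ReadAll(new StringReader("a\r\nb\nc\rd"))` yields the four lines a, b, c, and d, even though each break uses a different convention.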

If you manually read lines from a file, make sure to accept any line endings, even if inconsistent. In practice, this is quite easy using the following algorithm:

  • Scan ahead until you find either CR or LF.
  • If you read CR, peek ahead at the next character;
  • If the next character is LF, consume it (otherwise, put it back).
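The steps above can be sketched roughly as follows. This is one possible reading of the algorithm, not the poster's code, and `TolerantLineReader` and `ReadLines` are hypothetical names. It uses `Peek` so that nothing has to be put back when the character after a CR is not an LF.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, named here only for illustration.
public static class TolerantLineReader
{
    // Yields lines, accepting CR, LF, or CRLF as a terminator, even when mixed.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        var current = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\r' || c == '\n')
            {
                // After a CR, peek: a following LF belongs to the same line break.
                if (c == '\r' && reader.Peek() == '\n') reader.Read();
                yield return current.ToString();
                current.Clear();
            }
            else
            {
                current.Append((char)c);
            }
        }
        // Emit a final line that has no terminator.
        if (current.Length > 0) yield return current.ToString();
    }
}
```

Empty lines are preserved (two consecutive line breaks yield an empty string), while a trailing line break at the end of the input does not produce an extra empty line.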
Konrad Rudolph