I have a bunch (hundreds) of files that are supposed to have Unix line endings. I strongly suspect that some of them have Windows line endings, and I want to programmatically figure out which ones do.

I know I can just run

flip -u
or something similar in a script to convert everything, but I want to be able to identify those files that need changing first.

A: 

Windows uses characters 13 and 10 (CR LF) for line endings; Unix uses only character 10 (LF). So you can replace each 13, 10 pair with a single 10.

TcKs
+3  A: 

You can use the file tool, which reports the type of line ending (for example, "ASCII text, with CRLF line terminators" for a file with Windows line endings). Or, you could just use dos2unix -U which will convert everything to Unix line endings, regardless of what it started with.

Adam Rosenfield
+1  A: 

Unix uses one byte, 0x0A (Line Feed), while Windows uses two bytes, 0x0D 0x0A (Carriage Return, Line Feed).

If you never see a 0x0D, then it's very likely Unix. If you see 0x0D 0x0A pairs, then it's very likely MS-DOS/Windows.
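
The byte rule above can be turned into a rough per-file check. The following is only a sketch: classify_line_endings is a hypothetical helper name (not an existing tool), and it assumes a text file and a standard grep. It counts lines that end in 0x0D before the line feed and compares that with the total line count.

```sh
# classify_line_endings is a hypothetical helper, not an existing command.
# Prints "unix", "dos", or "mixed" for the file named in $1.
classify_line_endings() {
    cr=$(printf '\r')   # a literal carriage return (0x0D)
    # Lines whose last character before the LF is a CR.
    # grep -c exits non-zero when the count is 0, hence "|| true".
    crlf=$(grep -c "${cr}\$" -- "$1" || true)
    # All lines: the empty pattern matches every line.
    total=$(grep -c '' -- "$1" || true)
    if [ "$crlf" -eq 0 ]; then
        echo unix
    elif [ "$crlf" -eq "$total" ]; then
        echo dos
    else
        echo mixed
    fi
}
```

A call such as classify_line_endings somefile.cpp prints one word. A file whose last line has no terminator at all is reported as mixed when its other lines end in CR LF, which may or may not be what you want.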

Adam Davis
+3  A: 

You could use grep

egrep -l $'\r\n' *
stimms
Note: the command above must be run from bash (or another shell that supports $'...' quoting).
ΤΖΩΤΖΙΟΥ
For some reason, when I run this command in a Mac OS X shell, I get a list of all files in the directory, even one that I newly generated with: echo "test" >torderform6.cpp. Any idea what might be going wrong?
Adrian Grigore
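
One likely cause of the behaviour described in the comment above: grep treats a newline inside the pattern as a separator between patterns, so the trailing \n can leave an empty pattern that matches every line. A sketch that avoids putting a newline in the pattern is shown below. list_crlf_files is a hypothetical helper name, and the carriage return is produced with printf so the sketch does not depend on bash-only quoting.

```sh
# list_crlf_files is a hypothetical helper, not an existing command.
# Prints the names of the given files that contain at least one line ending in CR LF.
list_crlf_files() {
    cr=$(printf '\r')   # a literal carriage return (0x0D)
    # The pattern is "CR at end of line"; it contains no newline,
    # so grep sees exactly one pattern.
    grep -l "${cr}\$" -- "$@"
}
```

Used as list_crlf_files * in the directory to check. The exit status is grep's own, so it is non-zero when no file matches.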
+1  A: 

Something along the lines of:

perl -p -e 's[\r\n][WIN\n]; s[(?<!WIN)\n][UNIX\n]; s[\r][MAC\n]g;' FILENAME

though some of that regexp may need refining and tidying up.

That'll output your file with WIN, MAC, or UNIX at the end of each line. Good if your file is somehow a dreadful mess (or a diff) and has mixed endings.
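
For a file with mixed endings, it can also help to list only the line numbers that carry a Windows ending instead of annotating every line. This is a rough sketch under stated assumptions: crlf_line_numbers is a hypothetical helper name, it takes a single file, and it looks only for CR LF (it does not detect old Mac CR-only endings).

```sh
# crlf_line_numbers is a hypothetical helper, not an existing command.
# Prints the line number of every line in file $1 that ends in CR LF.
crlf_line_numbers() {
    cr=$(printf '\r')   # a literal carriage return (0x0D)
    # grep -n prefixes each matching line with "N:"; cut keeps only N.
    grep -n "${cr}\$" -- "$1" | cut -d: -f1
}
```

Calling crlf_line_numbers FILENAME prints one number per line, and prints nothing for a file that has only Unix endings.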

joachim