I have a bunch of files. Some have Unix line endings, many have DOS line endings. I'd like to test each file to see if it is DOS formatted before I switch the line endings.
How would I do this? Is there a flag I can test for? Something similar?
You could search the string for \r\n. That's the DOS-style line ending.
As a complete Python newbie & just for fun, I tried to find some minimalistic way of checking this for one file. This seems to work:
if "\r\n" in open("/path/file.txt", "rb").read():
    print "DOS line endings found"
Edit: simplified as per John Machin's comment (no need to use regular expressions).
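For a whole batch of files, the same check can be wrapped in a small function. This is only a sketch, assuming Python 3; the names has_dos_line_endings and find_dos_files are made up for illustration. It reads each file in binary mode, so nothing translates the line endings first, and it stops at the first CRLF instead of loading the whole file into memory:

```python
def has_dos_line_endings(path):
    """Return True as soon as a line ending in CRLF is found in the file at `path`."""
    # Binary mode is essential: text mode would translate "\r\n" to "\n"
    # before we get a chance to look at it.
    with open(path, "rb") as f:
        for line in f:  # binary files are split on b"\n" only
            if line.endswith(b"\r\n"):
                return True
    return False


def find_dos_files(paths):
    """Return the subset of `paths` that contain at least one CRLF line ending."""
    return [p for p in paths if has_dos_line_endings(p)]
```

Calling find_dos_files on a list of file names then gives you the ones that still need converting.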
If you just want to read text files, either DOS or Unix-formatted, this works:
print open('myfile.txt', 'U').read()
That is, Python's "universal" file reader will automatically recognize all the different end-of-line markers, translating them to "\n".
Python knows how to find the newline endings in a file, and you can access them through the newlines attribute:
f = open('myfile.txt', 'U')
f.readline() # Reads a line
# This is the newline ending of the first line.
# It can be "\r\n" (Windows), "\n" (Unix), "\r" (old Mac OS), or None (no newline termination found):
print repr(f.newlines)
This gives the newline ending of the first line (Unix, DOS, etc.), if any. As John M. pointed out, if by any chance you have a pathological file that uses more than one newline convention, f.newlines is a tuple with all the newline conventions found so far, after reading many lines.
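On Python 3 the 'U' flag is not needed (it was removed in Python 3.11): universal newlines is the default for text mode, and the newlines attribute is still there. Here is a rough sketch of reading it after going through the whole file. It assumes the files decode as UTF-8, and detect_newlines is just a name picked for this example:

```python
def detect_newlines(path, encoding="utf-8"):
    """Return the set of newline conventions seen in the file at `path`."""
    # newline=None (the default) turns on universal-newline translation,
    # which is what makes the `newlines` attribute get filled in.
    with open(path, "r", encoding=encoding, newline=None) as f:
        for _ in f:  # read to the end so every line ending is seen
            pass
        seen = f.newlines
    if seen is None:  # no line ending found at all
        return set()
    if isinstance(seen, str):  # exactly one convention used
        return {seen}
    return set(seen)  # a tuple: the file mixes conventions
```

A file is DOS formatted if "\r\n" is in the returned set; more than one element in the set means the file mixes conventions.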
If you just want to convert all files, you can simply do:
text = open('myfile.txt', 'U').read() # Automatic conversion of newlines to "\n"
open('myfile.txt', 'w').write(text) # Writes newlines for your platform
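If you only want to touch the files that are actually DOS formatted, testing and converting can be combined. The following is a sketch under a few assumptions: Python 3, files small enough to read into memory, and Unix endings as the target regardless of platform (unlike the snippet above, which writes your platform's newlines). convert_to_unix is a hypothetical helper name:

```python
def convert_to_unix(path):
    """Rewrite `path` with LF line endings; return True if the file was changed."""
    # Work on raw bytes so neither the encoding nor the platform's
    # newline translation can interfere.
    with open(path, "rb") as f:
        data = f.read()
    if b"\r\n" not in data:
        return False  # no CRLF present, leave the file untouched
    with open(path, "wb") as f:
        f.write(data.replace(b"\r\n", b"\n"))
    return True
```

Files without any CRLF are never rewritten, so their modification times stay as they were. Bare "\r" characters (old Mac OS endings) are deliberately left alone.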
Using grep & bash:
grep -c -m 1 $'\r$' file
echo $'\r\n\r\n' | grep -c $'\r$' # test
echo $'\r\n\r\n' | grep -c -m 1 $'\r$'