Hi,

How can I read a file in reverse order using Python? I want to read the file from the last line to the first line. Can anyone help me?

Thanks in advance, Nimmy

+6  A: 
for line in reversed(open("filename").readlines()):
    print(line.rstrip())
Matt Joiner
Thanks a lot...
Nimmy
Alas, this doesn't work if you can't fit the whole file in memory.
vy32
yes, alas. if computers handled corner cases for us, they'd be even slower.
Matt Joiner
+4  A: 
for line in reversed(open("file").readlines()):
    print(line.rstrip())

If you are on Linux, you can use the tac command.

$ tac file

You can also find two recipes on ActiveState here and here.

ghostdog74
I wonder if reversed() consumes the whole sequence before iteration. Docs say a `__reversed__()` method is needed, but python2.5 doesn't complain on a custom class without it.
muhuk
Thanks a lot...
Nimmy
@muhuk, it probably has to cache it somehow. I suspect it generates a new list in reverse order and then returns an iterator over that.
Matt Joiner
@Matt: that would be ridiculous. It simply goes from the back to the front: len(L)-1 is the back, 0 is the front. You can picture the rest.
Devin Jeanpierre
@muhuk: Sequences aren't meaningfully consumed (you can iterate over the whole sequence, but it doesn't matter very much). A `__reversed__` method is also not necessary, and there didn't use to be such a thing. If an object provides `__len__` and `__getitem__` it will work just fine (minus some exceptional cases, such as dict).
Devin Jeanpierre
@Devin Jeanpierre: Only if readlines() returns an object that provides `__reversed__`?
Matt Joiner
@Matt: It returns a list. Anyway, if an object provides just `__len__` and `__getitem__`, it works just as described (e.g.: http://codepad.org/aglIbcXy ). Lists also work just as described (see the definition of list_reverse (`__reversed__`) at http://svn.python.org/view/python/trunk/Objects/listobject.c?view=markup ).
Devin Jeanpierre
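To illustrate the point about `__len__` and `__getitem__`, here is a minimal sketch. The class `Items` is hypothetical and exists only for this example; it deliberately defines no `__reversed__` method, so `reversed()` falls back to the sequence protocol.

```python
class Items:
    """Hypothetical sequence that provides only __len__ and __getitem__."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)
        return "item %d" % index


# reversed() asks for len(obj), then fetches obj[len-1], ..., obj[0].
backwards = list(reversed(Items(3)))
print(backwards)  # ['item 2', 'item 1', 'item 0']
```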
@Devin: Then my original statement is correct. The full file must be read before the list can be iterated in reverse order.
Matt Joiner
@Matt: Your original statement was not exactly what I would call correct (or if it was correct, I did not read it as intended). It would read the full file regardless of whether `reversed()` is called on it, because this is what readlines() does. It does not construct the list in reverse order; rather, it creates an iterator which iterates over the list (which is in regular order) backwards.
Devin Jeanpierre
@Devin: Indeed, I think we're both agreeing on the behaviour here (and that it's not ideal): readlines() generates a list immediately (by reading the whole file). reversed() doesn't have to generate the reversed list immediately; rather, it creates an iterator using len and getitem. Finally, a reversed operator could be provided for File objects, or for readlines() return values, that actually reads the file from back to front, but that is not the case presently, and may not even be ideal.
Matt Joiner
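A small sketch of the behaviour agreed on above, using an in-memory list in place of the result of readlines() (the sample strings are made up for illustration). It suggests that `reversed()` does not copy the list: a change made to the list after the iterator is created is still visible when iterating.

```python
lines = ["first\n", "second\n", "third\n"]

rev = reversed(lines)          # an iterator over the existing list, not a new list
is_list = isinstance(rev, list)

lines[0] = "changed\n"         # mutate the list before consuming the iterator
result = list(rev)

print(is_list)  # False
print(result)   # ['third\n', 'second\n', 'changed\n']
```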
But if readlines() was returning an iterator with a proper `__reversed__()` it would be cool. In fact both answers using `readlines()` are horribly inefficient for big files.
muhuk
@muhuk: I use iterators absolutely everywhere possible: Here they're not possible (except to iterate readlines in reverse), but otherwise you can just use `for line in open("file")`
Matt Joiner
+1  A: 
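import os
import re
import sys

def filerev(somefile, buffer=0x20000):
  # Find the file size, then walk backwards one buffer-sized chunk at a time.
  somefile.seek(0, os.SEEK_END)
  size = somefile.tell()
  lines = ['']
  # The first read also covers the partial chunk at the end of the file.
  rem = size % buffer
  pos = max(0, (size // buffer - 1) * buffer)
  while pos >= 0:
    somefile.seek(pos, os.SEEK_SET)
    # Append the held-back first line of the later chunk so that lines
    # spanning a chunk boundary are reassembled.
    data = somefile.read(rem + buffer) + lines[0]
    rem = 0
    # findall() returns each line plus one trailing empty match.
    lines = re.findall('[^\n]*\n?', data)
    ix = len(lines) - 2
    # Yield complete lines last to first; lines[0] may be incomplete, so hold it back.
    while ix > 0:
      yield lines[ix]
      ix -= 1
    pos -= buffer
  else:
    # All chunks processed: what remains is the first line of the file.
    yield lines[0]

with open(sys.argv[1], 'r') as f:
  for line in filerev(f):
    sys.stdout.write(line)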
Ignacio Vazquez-Abrams
This appears to produce the wrong output for files larger than buffer. It won't correctly handle lines that span the buffer-sized chunks you read in, as I understand it. I posted another similar answer (to another similar question).
Darius Bacon
@Darius: Ah yes, I seem to have missed a bit. Should be fixed now.
Ignacio Vazquez-Abrams
Looks right. I'd still prefer my own code because this does O(N^2) work on a big file that's all one long line. (In the similar answers to the other question that I tested this caused a serious genuine slowdown on such files.)
Darius Bacon
Well the question didn't mention performance, so I can't nitpick the performance disaster that is regular expressions :P
Matt Joiner
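The two concerns raised in these comments (lines that span chunk boundaries, and repeated rescanning of one very long line) can be illustrated with a sketch along the following lines. This is not Darius Bacon's code or anyone else's from this thread; the name `reverse_lines` and the `block_size` parameter are hypothetical. It assumes Python 3, opens the file in binary mode so that offsets are plain byte offsets, and yields each line as bytes without its trailing newline.

```python
import os


def reverse_lines(path, block_size=4096):
    """Yield the lines of the file at `path` from last to first, as bytes."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        pieces = []    # fragments of the line being assembled, in read order
        first = True
        while pos > 0:
            size = min(block_size, pos)
            pos -= size
            f.seek(pos)
            block = f.read(size)
            if first and block.endswith(b"\n"):
                block = block[:-1]   # drop the newline that ends the last line
            first = False
            parts = block.split(b"\n")
            if len(parts) > 1:
                # The last part completes the line carried over so far.
                pieces.append(parts[-1])
                yield b"".join(reversed(pieces))
                # Parts in the middle are complete lines on their own.
                for line in reversed(parts[1:-1]):
                    yield line
                # The first part may continue into the preceding block.
                pieces = [parts[0]]
            else:
                # No newline in this block: keep collecting fragments.
                pieces.append(parts[0])
        if pieces:
            yield b"".join(reversed(pieces))
```

Fragments of an unfinished line are kept in a list and joined only once, when the start of the line is found, so a file consisting of one long line is not rescanned for every block. Something like `for line in reverse_lines("file"): print(line.decode())` would then print the file last line first, assuming the content decodes with the default encoding.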
A: 

If you are on Mac OS X, the tac command is not available by default; use tail -r instead.

# We need a command to reverse the line order of the file. On Linux this
# is 'tac', on OSX it is 'tail -r'
# 'tac' is not supported on osx, 'tail -r' is not supported on linux.

if sys.platform == "darwin":
    command += "|tail -r"
elif sys.platform.startswith("linux"):  # "linux2" on Python 2, "linux" on Python 3.3+
    command += "|tac"
else:
    raise EnvironmentError('Platform %s not supported' % sys.platform)
jeorgen
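The snippet above only builds part of a shell pipeline. As a self-contained sketch of the same idea, the following picks the command by platform and runs it on a file. The function names `reverse_command` and `reversed_lines` are hypothetical, and it assumes `tac` (Linux) or `tail -r` (OS X) is installed and on the PATH.

```python
import subprocess
import sys


def reverse_command(platform=sys.platform):
    """Return the argv prefix of a line-reversing command for `platform`."""
    if platform == "darwin":
        return ["tail", "-r"]
    if platform.startswith("linux"):
        return ["tac"]
    raise EnvironmentError("Platform %s not supported" % platform)


def reversed_lines(path):
    """Run the platform's command on `path` and return its lines as bytes."""
    output = subprocess.check_output(reverse_command() + [path])
    return output.splitlines()
```

Passing the command as a list avoids building a shell string, so file names containing spaces need no quoting.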
A: 

I've posted an answer that reads the file incrementally, in response to "Most efficient way to search the last x lines of a file in python".

Darius Bacon