In Python, after
fh = open('file.txt')
one may do the following to iterate over lines:
for l in fh:
    pass
Then why do we have fh.readlines()?
I would imagine that it dates from before files were iterators and is maintained for backwards compatibility. Even for a one-liner, it's fairly [1] redundant, as list(fh) will do the same thing in a more intuitive way. That also gives you the freedom to do set(fh), tuple(fh), etc.
[1] See gnibbler's answer.
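To make the equivalence concrete, here is a minimal sketch. The temporary file and its contents are made up purely so the snippet runs on its own; any text file would behave the same way.

```python
import os
import tempfile

# Hypothetical sample file, created only so this sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\nalpha\n")
    path = tmp.name

with open(path) as fh:
    via_readlines = fh.readlines()   # explicit method call

with open(path) as fh:
    via_list = list(fh)              # consumes the file's iterator

with open(path) as fh:
    unique_lines = set(fh)           # any iterable consumer works

os.remove(path)

print(via_readlines == via_list)
print(sorted(unique_lines))
```

Both approaches keep the trailing newline on each line, so the two lists compare equal.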
readlines() returns a list of lines, which you may want if you don't plan on iterating through each line.
Mostly it is there for backward compatibility. readlines was there long before file objects were iterable.
Using readlines with its optional size hint argument is also one of the fastest ways to read from files, because it reads a batch of lines in one hit but doesn't need to allocate memory for the entire file all at once.
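A rough sketch of that batched pattern follows. The helper name iter_line_batches and the hint values are my own choices for illustration, and io.StringIO stands in for a real open file so the snippet is runnable as-is.

```python
import io

def iter_line_batches(fh, hint=64 * 1024):
    """Yield lists of lines, each list holding roughly `hint` characters/bytes."""
    while True:
        batch = fh.readlines(hint)   # stops once the lines read so far reach about `hint`
        if not batch:                # an empty list means end of file
            break
        yield batch

# Stand-in for open('file.txt'); a real file object works the same way.
source_lines = [f"line {i}\n" for i in range(10)]
fh = io.StringIO("".join(source_lines))

batches = list(iter_line_batches(fh, hint=20))
flattened = [line for batch in batches for line in batch]

print(len(batches), len(flattened))
```

The hint is only approximate: each batch ends at a line boundary, so whole lines are never split.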
The magic behind this is iterators. If you check dir(fh), you will see that it contains a special method called __iter__. So when you use a for or in statement over a variable, Python internally calls its __iter__ method.
__iter__: Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements. This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.
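As an illustration, the loop below does by hand roughly what a for statement does with a file object. It is a sketch, with io.StringIO used as a stand-in for open('file.txt') so that it runs without a file on disk.

```python
import io

# Stand-in for open('file.txt'); real file objects expose the same protocol.
fh = io.StringIO("first\nsecond\n")

has_iter = "__iter__" in dir(fh)   # the special method is present
same_object = iter(fh) is fh       # a file is its own iterator

# Roughly what `for line in fh:` does under the hood.
lines = []
it = iter(fh)                      # calls fh.__iter__()
while True:
    try:
        line = next(it)            # calls it.__next__()
    except StopIteration:          # raised when the file is exhausted
        break
    lines.append(line)

print(has_iter, same_object, lines)
```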