ansaurus

Question

In Python, why do we need readlines() when we can iterate over the file handle itself?

Answer 1

+9 A:

I would imagine that it's from before files were iteratators and is maintained for backwards compatibility. Even for a one-liner, it's ~~totally~~¹ fairly redundant as list(fh) will do the same thing in a more intuitive way. That also gives you the freedom to do set(fh), tuple(fh), etc.

¹ See gnibbler's answer.

aaronasterling 2010-10-14 12:50:30

+1 Good point on `collection_constructor(fh)`. I must admit, it never occured to me.

delnan 2010-10-14 12:52:13

When I was beginning with Python I used to `list(fh)` before I realized it was redundant. Are you sure it is for backward compatibility? I mean they did break a lot of that with Python 3, so why not clean this up too!

Ashish 2010-10-14 12:55:04

@Ashish, They really didn't break that much and no I'm not sure it's for backwards compatibility. I started programming python when 2.6 was not quite fairly new so I'm something of a noob. I've never even written code for 2.5 for instance so I don't know the history at all.

aaronasterling 2010-10-14 12:58:40

+1, gnibbler made a good point like you pointed out. which was in fact first pointed out in one of the comments by someone else.

Ashish 2010-10-14 13:59:06

Answer 2

A:

readlines() returns a list of lines, which you may want if you don't plan on iterating through each line.

2010-10-14 12:52:13

`list(fh)` will do the same.

Ashish 2010-10-14 12:53:26

@Ashish: `readlines([size])` takes an optional argument.

MattH 2010-10-14 12:57:25

@MattH, note that the size is a hint, and more or less lines may be returned

gnibbler 2010-10-14 12:59:33

@MattH: That makes it not redundant. Could you post it as an answer? That way it will help readers spot your point clearly. :)

Ashish 2010-10-14 13:03:35

@Ashish, posting that as an answer now would be redundant. Let Mr Gnibbler take the points.

MattH 2010-10-14 13:45:56

Answer 3

+6 A:

Mostly it is there for backward compatibility. readlines was there way before file objects were iterable

Using readlines with the size argument is also one of the fastest ways to read from files because it reads a bunch of data in one hit, but doesn't need to allocate memory for the entire file all at once

gnibbler 2010-10-14 13:22:18

Answer 4

A:

the magic behind this is iterators, if you check dir(fh).
It contains special method called __iter__

so when you try to use for or in statements over variable; internally its using __iter__ method !

__iter__ : Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements. This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.

Tumbleweed 2010-10-14 13:35:18

ansaurus

tags:

views:

answers:

In Python, why do we need readlines() when we can iterate over the file handle itself?

related questions