ansaurus

Question

Is python automagically parallelizing IO- and CPU- or memory-bound sections?

Answer 1

+1 A:

How big are the lines in your file? If they're not very long (anything under about 1K probably qualifies) then you're likely seeing performance gains because of input buffering.

David Locke 2009-05-13 23:37:27

Answer 2

+1 A:

Why do you think list.append() would be a slower operation? It is extremely fast, considering the internal pointer arrays used by lists to hold references to the objects in them are allocated in increasingly large blocks, so that every append does not actually re-allocate the array, and most can simply increment the length counter and set a pointer and incref.

ironfroggy 2009-05-13 23:52:30

Well, it's allocating memory, and that's not usually a light process. I don't know the details of python VM, but it's either allocating 65150224 small objects (maybe fast, but repeated too much times) or more likely it's doubling the allocation size ~ 26 times during the whole program (2^26 ~ 64M) in increasingly larger chunks of memory, possibly harder to find due to memory fragmentation (or de-fragmentation?)

Davide 2009-05-18 23:23:10

Answer 3

+5 A:

Obviously data.append() is happening in parallel with the IO.

I'm afraid not. It is possible to parallelize IO and computation in Python, but it doesn't happen magically.

One thing you could do is use posix_fadvise(2) to give the OS a hint that you plan to read the file sequentially (POSIX_FADV_SEQUENTIAL).

In some rough tests doing "wc -l" on a 600 meg file (an ISO) the performance increased by about 20%. Each test was done immediately after clearing the disk cache.

For a Python interface to fadvise see python-fadvise.

Benji York 2009-05-13 23:58:13

This is solid information.

Joe Koberg 2009-05-14 03:32:00

Answer 4

+1 A:

I don't see any evidence that "data.append() is happening in parallel with the IO." Like Benji, I don't think this is automatic in the way you think. You showed that doing data.append(line[-1]) takes about the same amount of time as lc = lc + 1 (essentially no time at all, compared to the IO and line splitting). It's not really surprising that data.append(line[-1]) is very fast. One would expect the whole line to be in a fast cache, and as noted append prepares buffers ahead of time and only rarely has to reallocate. Moreover, line[-1] will always be '\n', except possibly for the last line of the file (no idea if Python optimizes for this).

The only part I'm a little surprised about is that the xrange is so variable. I would expect it to always be faster, since there's no IO, and you're not actually using the counter.

Matthew Flaschen 2009-05-14 00:02:32

Answer 5

+1 A:

If your run times are varying by that amount for the second example, I'd suspect your method of timing or outside influences (other processes / system load) to be skewing the times to the point where they don't give any reliable information.

Doug 2009-05-14 04:11:56

Of course you might be right. But it could also be memory fragmentation, couldn't it?

Davide 2009-05-18 23:23:50

I'd be hesitant to blame memory fragmentation on such drastic differences. I can't say for sure, but my inclination is to believe that many data structures in Python suffer from fragmentation due to the nature of the language. For this specific case, I might believe that swapping is a more likely culprit than fragmentation.

Doug 2009-05-21 17:35:09

ansaurus

tags:

views:

answers:

Is python automagically parallelizing IO- and CPU- or memory-bound sections?

related questions