World = {}
for i in vr_world.getNodeNames():
    if i != "_error_":
        World[i] = vr_world.getChild(i)

vr_world.getNodeNames() returns a gigantic list, and vr_world.getChild(i) returns a specific type of object.

This takes a long time to run. Is there any way to make it more efficient? I have seen one-line loops before that are supposed to be faster. Ideas?

A: 
World = dict((i, vr_world.getChild(i)) for i in vr_world.getNodeNames() if i != "_error_")

This is a one-liner, but not necessarily much faster than your solution...
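To check whether the one-liner buys anything, you can time both versions with `timeit`. This is only a rough sketch: `FakeWorld` is a made-up stand-in for `vr_world` (the real object isn't shown in the question), so the numbers only compare the looping overhead, not your actual `getChild` cost.

```python
import timeit

class FakeWorld(object):
    """Hypothetical stand-in for vr_world; the real API is not shown here."""
    def __init__(self, n):
        self._names = ["node%d" % k for k in range(n)] + ["_error_"]

    def getNodeNames(self):
        return list(self._names)

    def getChild(self, name):
        return name.upper()  # cheap placeholder for the real lookup

vr_world = FakeWorld(10000)

def with_loop():
    world = {}
    for i in vr_world.getNodeNames():
        if i != "_error_":
            world[i] = vr_world.getChild(i)
    return world

def with_genexp():
    return dict((i, vr_world.getChild(i))
                for i in vr_world.getNodeNames() if i != "_error_")

# Both variants must build the same mapping before timing means anything.
assert with_loop() == with_genexp()

loop_time = timeit.timeit(with_loop, number=20)
genexp_time = timeit.timeit(with_genexp, number=20)
print("loop:   %.4f s" % loop_time)
print("genexp: %.4f s" % genexp_time)
```

If the two times come out close, the loop form is not your problem and the cost is inside the calls themselves.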

eumiro
Is that actually more efficient or just more compact?
GWW
It really didn't make matters any better - or worse. Maybe it is in the getChild where the problem really is.
relima
@GWW, a generator expression will _normally_ be 10% to 100% faster than an equivalent `for` loop, though I've seen cases where they're slower.
aaronasterling
That's good to know. That could be why some of my file-parsing code takes forever. Thanks!
GWW
A: 

Maybe you can use a filter and a map, though I don't know if this would be any faster:

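# list() so the names can be used twice (filter is lazy in Python 3)
valid = list(filter(lambda i: i != "_error_", vr_world.getNodeNames()))
# pair each name with its child so World is a dict, as in the question
World = dict(zip(valid, map(vr_world.getChild, valid)))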

Also, as you'll see a lot around here: profile first, then optimize; otherwise you may be wasting time. You call two functions there, and they may be the slow parts rather than the iteration.
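Profiling the loop could look roughly like this with the standard `cProfile` module. `SlowWorld` is a made-up stand-in for `vr_world` with an artificially slow `getChild`, purely so the sketch runs on its own; with the real object you would profile your actual loop instead.

```python
import cProfile
import io
import pstats
import time

class SlowWorld(object):
    """Hypothetical stand-in for vr_world with a deliberately slow getChild."""
    def getNodeNames(self):
        return ["node%d" % k for k in range(200)] + ["_error_"]

    def getChild(self, name):
        time.sleep(0.0005)  # pretend each lookup is expensive
        return object()

vr_world = SlowWorld()

def build_world():
    world = {}
    for i in vr_world.getNodeNames():
        if i != "_error_":
            world[i] = vr_world.getChild(i)
    return world

profiler = cProfile.Profile()
profiler.enable()
World = build_world()
profiler.disable()

# Sort by cumulative time so the most expensive call chain is listed first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

The report shows how much time was spent in `getNodeNames` versus `getChild`, which tells you where to look before you touch the loop itself.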

korbes
Sadly, in Python, `filter`/`map` solutions are almost always slower than list comprehensions.
Daenyth
If speed is a factor in the first line: `valid = filter('_error_'.__ne__, vr_world.getNodeNames())`
eumiro
Also, lambda functions are slower than regular ones or, as in the case above, inline logic.
kaloyan
And for a `huge` list, you should _really_ be using `ifilter` and `imap` from itertools to process it lazily and not have to keep everything in memory at once (in Python 3, the built-in `filter` and `map` are already lazy). This can be a huge timesaver for really large lists.
aaronasterling
A: 

I don't think you can make it faster than what you have there. Yes, you can put the whole thing on one line, but that will not make it any faster. The bottleneck obviously is getNodeNames(). If you can make it a generator, you will start populating the World dict with results sooner (if that matters to you), and if you make it filter out the "_error_" values, you will not have to deal with that at a later stage.

kaloyan
A: 

kaloyan suggests using a generator. Here's why that may help.

If getNodeNames() builds a list, then your loop is basically going over the list twice: once to build it, and once when you iterate over the list.

If getNodeNames() is a generator, then the list is never built: instead of creating each item and appending it to a list, getNodeNames() creates the item and yields it straight to the caller's loop.
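A toy illustration of the difference, assuming you could rewrite the function yourself. `node_names_list`, `node_names_gen`, `get_child` and the `raw` names are all invented for this sketch; they are not the real vr_world API.

```python
def node_names_list(raw_names):
    """Builds the complete list before the caller sees any name."""
    names = []
    for name in raw_names:
        names.append(name)
    return names

def node_names_gen(raw_names):
    """Yields names one at a time and skips "_error_" at the source."""
    for name in raw_names:
        if name != "_error_":
            yield name

def get_child(name):
    # Hypothetical placeholder for vr_world.getChild(name).
    return "child of " + name

raw = ["root", "_error_", "camera", "light"]

# The dict is filled as each name is yielded; no intermediate list exists.
World = dict((name, get_child(name)) for name in node_names_gen(raw))
print(sorted(World))
```

With the generator version the caller also no longer needs its own `"_error_"` check, since the filtering happens where the names are produced.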

Whether or not this helps is contingent on a couple of things. First, it has to be possible to implement getNodeNames() as a generator. We don't know anything about the implementation details of that function, so it's not possible to say if that's the case. Next, the number of items you're iterating over needs to be pretty big.

Of course, none of this will have any effect at all if it turns out that the time-consuming operation in all of this is vr_world.getChild(). That's why you need to profile your code.

Robert Rossney