I want to sort objects by one of their attributes. At the moment, I am doing it in the following way:

USpeople.sort(key=lambda person: person.utility[chosenCar],reverse=True)

This works fine, but I have read that using operator.attrgetter() might be a faster way to achieve this sort. First, is this correct? Assuming that it is correct, how do I use operator.attrgetter() to achieve this sort?

I tried,

 keyFunc=operator.attrgetter('utility[chosenCar]')
 USpeople.sort(key=keyFunc,reverse=True)

However, I get an error saying that there is no attribute 'utility[chosenCar]'.

The problem is that the attribute by which I want to sort is in a dictionary. For example, the utility attribute is in the following form:

utility={chosenCar:25000,anotherCar:24000,yetAnotherCar:24500}

I want to sort by the utility of the chosenCar using operator.attrgetter(). How could I do this?

Thanks in advance.

+1  A: 

To access the chosenCar item, you'd have to use:

>>> import operator
>>> P.utility={'chosenCar':25000,'anotherCar':24000,'yetAnotherCar':24500}
>>> operator.itemgetter('chosenCar')(operator.attrgetter('utility')(P))
25000

For the key function, you'll have to do the following:

>>> def keyfunc(P):
...     util = operator.attrgetter('utility')(P)
...     return operator.itemgetter('chosenCar')(util)
...
>>> USpeople.sort(key=keyfunc,reverse=True)

However, your main claim about the better performance of this approach seems poorly researched. I'd suggest using the timeit module to test the performance of both approaches on your own data.
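
One way such a timeit comparison might look is sketched below. The `Person` class, the generated utilities and the helper `time_sort` are hypothetical stand-ins rather than the asker's real code, and the measured times will vary by machine and Python version, so treat the output as indicative only.

```python
import operator
import timeit


class Person:
    """Hypothetical stand-in for the asker's person objects."""

    def __init__(self, utility):
        self.utility = utility


def key_plain(person):
    # Same logic as the lambda in the question
    return person.utility['chosenCar']


def key_operator(person):
    # Same logic as keyfunc above, built from attrgetter and itemgetter
    util = operator.attrgetter('utility')(person)
    return operator.itemgetter('chosenCar')(util)


# Made-up data: 1000 people with scattered utilities for 'chosenCar'
people = [Person({'chosenCar': (i * 7919) % 1000}) for i in range(1000)]


def time_sort(key, number=200):
    # sorted() builds a new list each time, so every run sorts the same input
    return timeit.timeit(
        lambda: sorted(people, key=key, reverse=True), number=number
    )


t_plain = time_sort(key_plain)
t_operator = time_sort(key_operator)
print('plain key:    %.4f s' % t_plain)
print('operator key: %.4f s' % t_operator)
```

Using `sorted()` instead of `list.sort()` inside the timed statement matters here: sorting in place would leave the list already ordered after the first run, and later runs would measure a much cheaper case.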

SilentGhost
Thanks for pointing out that the speed will not increase. However, I would like to understand your solution using operator.attrgetter for future reference. The person objects that I have are in a list, for example people=[P1,P2,P3,P4], where P1, P2, P3 and P4 are persons. I want to sort this list using attrgetter. What do I write in place of the (P) that you have in your second line? Thanks.
Curious2learn
@Curious: updated
SilentGhost
Of course, `operator.attrgetter('utility')(P)` is just a silly way of writing `P.utility`. The reason to use `operator.attrgetter` and `operator.itemgetter` is when you can use them *directly*. If you're going to write your own key function, it would just be `return P.utility['chosenCar']`, just like the lambda originally was.
Thomas Wouters
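
To illustrate the point made in the comment above, here is a minimal sketch of where `operator.attrgetter` can be used directly and where it cannot. The `Person` class, the `name` attribute and the sample values are assumptions made up for the example; only `utility` and `'chosenCar'` come from the question.

```python
import operator


class Person:
    """Hypothetical person class; 'name' is invented for this example."""

    def __init__(self, name, utility):
        self.name = name
        self.utility = utility


people = [
    Person('Cara', {'chosenCar': 25000}),
    Person('Abe', {'chosenCar': 24000}),
    Person('Bo', {'chosenCar': 24500}),
]

# Direct use: the sort key is a plain attribute, so attrgetter fits as-is
by_name = sorted(people, key=operator.attrgetter('name'))

# Attribute plus dictionary lookup: a small function or lambda is the clear option
by_utility = sorted(people, key=lambda p: p.utility['chosenCar'], reverse=True)

# attrgetter treats its argument as an attribute name (dots are allowed,
# subscripts are not), which is why the attempt in the question failed
error_message = None
try:
    operator.attrgetter('utility[chosenCar]')(people[0])
except AttributeError as exc:
    error_message = str(exc)

print([p.name for p in by_name])
print([p.name for p in by_utility])
print(error_message)
```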
+2  A: 

No, attrgetter will not be any faster than the lambda - it's really just another way of doing the same thing.

You may have been confused by a recommendation to use key instead of cmp, which is indeed significantly faster, but you're already doing that.
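
A rough sketch of why key tends to beat cmp: the key function runs once per element, while a comparison function runs once per comparison. The `cmp` argument to `sort` only exists in Python 2, so this example emulates it with `functools.cmp_to_key`; the data and the counting helpers are made up for illustration.

```python
import functools

# Made-up data: 100 distinct values in a scrambled order
utilities = [(i * 37) % 101 for i in range(100)]

# Call counters, kept in a dict so the helper functions can update them
calls = {'key': 0, 'cmp': 0}


def key_func(value):
    calls['key'] += 1
    return value


def cmp_func(a, b):
    calls['cmp'] += 1
    # Classic cmp-style result: negative, zero or positive
    return (a > b) - (a < b)


by_key = sorted(utilities, key=key_func, reverse=True)
by_cmp = sorted(utilities, key=functools.cmp_to_key(cmp_func), reverse=True)

print('key function calls:', calls['key'])
print('cmp function calls:', calls['cmp'])
```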

Daniel Roseman
Daniel, thanks for the clarification.
Curious2learn
`attrgetter` is faster than the lambda, since it's implemented in C. The difference is only about 15% though according to my testing.
interjay
+1  A: 
  • Never, ever, ever optimize based on something you've read. Going into your code and making random changes from what you have to something you think should be faster is not a working optimization strategy.

  • Here is how you optimize if you want to improve your code.

    1. Don't. It's often a waste of time.
    2. Make a working, testable program.
    3. Determine performance metrics—be able to answer "Is this code fast enough?"
    4. Realize that your code is already fast enough.
    5. If you weren't able to do step (4), profile your code for realistic input to determine where it spends its time. In Python you can use http://docs.python.org/library/profile.html to do this. Bottlenecks occur at unexpected places, and this will tell you where you actually have to put in the effort.
    6. Examine the time-consuming code for algorithmic suboptimality. This sometimes occurs at the level you are looking at, but often occurs several levels out too. Improving your algorithm is almost always the biggest chance at a speedup.
    7. If you cannot improve your algorithm, test various pieces of code that do the same thing and see how they perform. Use http://docs.python.org/library/timeit.html to test snippets (this is harder to get right than people realise, so be careful), then re-run your performance tests and profile.

      It can be tempting to try to do this step upfront, but this often proves unfruitful. You need to know that what you're optimizing makes sense.

    I hope this provides some insight into how to speed up your code (and when not to bother). I've seen lots of people try replacing random code with rule-of-thumb optimizations, but I haven't seen those people producing great, fast software. Optimization must be done scientifically, using theory (such as the computer science in step 6) and experimentation (such as the timing in step 7).

  • In this specific case, I would bet money that SilentGhost's code ultimately is slower than yours. I of course don't know for sure, but neither do you unless you time it.

    (And I don't think you should bother timing it, I think you should go with the clearest approach, your original one.)
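
As a minimal sketch of the profiling step (5) above, the standard library's `cProfile` and `pstats` modules can be combined roughly like this. The functions `slow_step` and `workload` are hypothetical placeholders for real program code, and the report contents depend on the machine and Python version.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical stand-in for an expensive part of a real program
    return sorted(range(n), key=lambda x: -x)


def workload():
    # Hypothetical top-level routine to be profiled
    total = 0
    for _ in range(20):
        total += len(slow_step(5000))
    return total


profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Collect the report in a string and show the five entries
# with the largest cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time points at the call paths where the program spends most of its time, which is usually a better guide for where to look than guessing at individual lines.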

Mike Graham
Thanks for your comment. I profiled my code after I asked the question and found the function that is taking most of the time. I am going to think about whether I can use a better algorithm. If I cannot, I will come back to Stack Overflow for ideas.
Curious2learn