I am attempting to take a list of objects, and turn that list into a dict. The dict values would be each object in the list, and the dict keys would be a value found in each object.

Here is some code representing what I'm doing:

class SomeClass(object):

    def __init__(self, name):
        self.name = name

object_list = [
    SomeClass(name='a'),
    SomeClass(name='b'),
    SomeClass(name='c'),
    SomeClass(name='d'),
    SomeClass(name='e'),
]

object_dict = {}
for an_object in object_list:
    object_dict[an_object.name] = an_object

That code works, but it's a bit ugly and a bit slow. Could anyone give an example of something that's faster or "better"?

edit: Alright, thanks for the replies. I must say I am surprised to see the more Pythonic ways turning out slower than the handmade loop.

edit2: I updated the test code to make it a bit more readable, now that there are so many tests.

Here is where we are in terms of code. I credited the authors in the docstrings; if I got any wrong, please let me know.

from itertools import izip
import timeit

class SomeClass(object):

    def __init__(self, name):
        self.name = name

object_list = []

for i in range(5):
    object_list.append(SomeClass(name=i))

def example_1():
    'Original Code'
    object_dict = {}
    for an_object in object_list:
        object_dict[an_object.name] = an_object

def example_2():
    'Provided by hyperboreean'
    d = dict(zip([o.name for o in object_list], object_list))

def example_3():
    'Provided by Jason Baker'
    d = dict([(an_object.name, an_object) for an_object in object_list])

def example_4():
    "Added izip to hyperboreean's code, suggested by Chris Cameron"
    d = dict(izip([o.name for o in object_list], object_list))

def example_5():
    'zip, improved by John Fouhy'
    d = dict(zip((o.name for o in object_list), object_list))

def example_6():
    'izip, improved by John Fouhy'
    d = dict(izip((o.name for o in object_list), object_list))

def example_7():
    'Provided by Jason Baker, removed brackets by John Fouhy'
    d = dict((an_object.name, an_object) for an_object in object_list)

timeits = []
for example_index in range(1, 8):
    timeits.append(
        timeit.Timer(
            'example_%s()' % example_index,
            'from __main__ import example_%s' % example_index)
    )

for i in range(7):
    timeit_object = timeits[i]
    print 'Example #%s Result: "%s"' % (i+1, timeit_object.repeat(2))

With 5 objects in the list, I am getting these results (in seconds, two repeats per example):

    Example #1 Result: "[1.2428441047668457, 1.2431108951568604]"
    Example #2 Result: "[3.3567759990692139, 3.3188660144805908]"
    Example #3 Result: "[2.8346641063690186, 2.8344728946685791]"
    Example #4 Result: "[3.0710639953613281, 3.0573830604553223]"
    Example #5 Result: "[5.2079918384552002, 5.2170760631561279]"
    Example #6 Result: "[3.240635871887207, 3.2402129173278809]"
    Example #7 Result: "[3.0856869220733643, 3.0688989162445068]"

and with 50:

    Example #1 Result: "[9.8108220100402832, 9.9066231250762939]"
    Example #2 Result: "[16.365023136138916, 16.213981151580811]"
    Example #3 Result: "[15.77024507522583, 15.771029949188232]"
    Example #4 Result: "[14.598290920257568, 14.591825008392334]"
    Example #5 Result: "[20.644147872924805, 20.64064884185791]"
    Example #6 Result: "[15.210831165313721, 15.212569952011108]"
    Example #7 Result: "[17.317100048065186, 17.359367847442627]"

And lastly, with 500 objects:

    Example #1 Result: "[96.682723999023438, 96.678673028945923]"
    Example #2 Result: "[137.49416589736938, 137.48705387115479]"
    Example #3 Result: "[136.58069896697998, 136.5823769569397]"
    Example #4 Result: "[115.0344090461731, 115.1088011264801]"
    Example #5 Result: "[165.08325910568237, 165.06769108772278]"
    Example #6 Result: "[128.95187497138977, 128.96077489852905]"
    Example #7 Result: "[155.70515990257263, 155.74126601219177]"

Thanks to all who replied! I'm very surprised by the results. If there are any other tips for a faster method, I would love to hear them. Thanks all!
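
For anyone rerunning this on Python 3, here is a rough sketch of the same kind of comparison. It is not the original benchmark: the function names (loop_version, zip_version, genexp_version), the list size of 50, and the number=10000 setting are all assumptions chosen to keep the run short, and timings will differ by machine and interpreter version. Note that itertools.izip no longer exists in Python 3, where the built-in zip is already lazy.

```python
import timeit


class SomeClass(object):

    def __init__(self, name):
        self.name = name


# Assumed size; the question used 5, 50 and 500 objects.
object_list = [SomeClass(name=i) for i in range(50)]


def loop_version():
    # The original explicit loop from the question.
    object_dict = {}
    for an_object in object_list:
        object_dict[an_object.name] = an_object
    return object_dict


def zip_version():
    # zip() over a generator expression; no izip needed on Python 3.
    return dict(zip((o.name for o in object_list), object_list))


def genexp_version():
    # dict() fed by a generator expression of (key, value) pairs.
    return dict((o.name, o) for o in object_list)


candidates = [loop_version, zip_version, genexp_version]

# Sanity check: every variant must build the same mapping.
expected = loop_version()
for func in candidates:
    assert func() == expected

# timeit.repeat accepts a callable; keep the best of two runs.
timings = {}
for func in candidates:
    timings[func.__name__] = min(timeit.repeat(func, number=10000, repeat=2))

for name, seconds in timings.items():
    print('%s: %.4f s' % (name, seconds))
```

Taking the minimum of the repeats is the usual way to reduce noise from other processes; the absolute numbers are not comparable to the Python 2 results above.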

+6  A: 
d = dict(zip([o.name for o in object_list], object_list))
hyperboreean
+1 A good way to do this in Python 2.
Jason Baker
+1: Also, if you use izip you may save time and space, which may be important if the list is very long.
Chris Cameron
+12  A: 

In Python 3.0 you can use a dict comprehension:

{an_object.name : an_object for an_object in object_list}

This is also possible in Python 2, but it's a bit uglier:

dict([(an_object.name, an_object) for an_object in object_list])
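
As a self-contained illustration of the dict comprehension, here is a minimal sketch that reuses the question's SomeClass. It assumes Python 3 (the same syntax also works on Python 2.7, where dict comprehensions were backported).

```python
class SomeClass(object):

    def __init__(self, name):
        self.name = name


object_list = [SomeClass(name=n) for n in 'abcde']

# Key each object by its name attribute.
object_dict = {an_object.name: an_object for an_object in object_list}

print(sorted(object_dict))                   # ['a', 'b', 'c', 'd', 'e']
print(object_dict['c'] is object_list[2])    # True
```

If two objects share the same name, the later one silently replaces the earlier one, exactly as in the explicit loop.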
Jason Baker
Hey, nice, didn't know they added dict comprehension in 3.0
hyperboreean
No need (in Python 2.4 or above) to create a list only to throw it away. Instead of giving the dict constructor a list comprehension, use a generator expression which only iterates the sequence once: dict((item.name, item) for item in object_list)
bignose
@bignose - yes, you're correct. See John Fouhy's answer for a way to do that.
Jason Baker
Well, in this test I am focusing strictly on speed, so I'm not taking memory or anything else into account. But Jason's method, with the added list, performs faster. Compare example_3 to example_7: 3 is always faster in the above tests.
Lee Olayvar
+5  A: 

If you're concerned with speed, then we can improve things slightly. Your "verbose" solution (which is really fine) creates no intermediate data structures. On the other hand, hyperboreean's solution,

d = dict(zip([o.name for o in object_list], object_list))

creates two unnecessary lists: [o.name for o in object_list] creates a list, and zip(_, _) creates another list. Both these lists serve only to be iterated over once in the creation of the dict.

We can avoid the creation of one list by replacing the list comprehension with a generator expression:

d = dict(zip((o.name for o in object_list), object_list))

Replacing zip with itertools.izip will return an iterator and avoid creating the second list:

import itertools
d = dict(itertools.izip((o.name for o in object_list), object_list))

We could modify Jason Baker's solution in the same way, by simply deleting the square brackets:

d = dict((an_object.name, an_object) for an_object in object_list)
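
A note for Python 3 readers, as a small sketch rather than part of the Python 2 advice above: there, the built-in zip already returns a lazy iterator and itertools.izip has been removed, so the plain zip form creates no intermediate list. The is_lazy variable below is just an illustrative name.

```python
class SomeClass(object):

    def __init__(self, name):
        self.name = name


object_list = [SomeClass(name=n) for n in 'abcde']

# On Python 3, zip() yields pairs on demand instead of building a list.
pairs = zip((o.name for o in object_list), object_list)
is_lazy = not isinstance(pairs, list)

# dict() consumes the iterator exactly once.
object_dict = dict(pairs)

print(is_lazy)            # True on Python 3
print(len(object_dict))   # 5
```

Because the iterator is consumed by dict(), pairs is empty afterwards and cannot be reused.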
John Fouhy
That's true, nice.
hyperboreean