views:

212

answers:

6

Hello Everybody,

I have a list of values and I want to put them in a dictionary that would map each value to it's index.

I can do it this way:

>>> t = (5,6,7)
>>> d = dict(zip(t, range(len(t))))
>>> d
{5: 0, 6: 1, 7: 2}

this is not bad, but I'm looking for something more elegant.

I've come across the following, but it does the opposite of what I need:

>>> d = dict(enumerate(t))
>>> d
{0: 5, 1: 6, 2: 7}

Please share your solutions,
Thank you

EDIT: Python 2.6.4

For lists containing 1000 elements the dict(zip) version is the fastest, the generator and the list comprehension versions are virtually identical and they are ~1.5 times slower and the functional map(reversed) is considerably slower.

$ python -mtimeit -s"t = range(int(1e3))" "d = dict(zip(t, range(len(t))))"
1000 loops, best of 3: 277 usec per loop

$ python -mtimeit -s"t = range(int(1e3))" "d = dict([(y,x) for x,y in enumerate(t)])"
1000 loops, best of 3: 426 usec per loop

$ python -mtimeit -s"t = range(int(1e3))" "d = dict((y,x) for x,y in enumerate(t))"
1000 loops, best of 3: 437 usec per loop

$ python -mtimeit -s"t = range(int(1e3))" "d = dict(map(reversed, enumerate(t)))"
100 loops, best of 3: 3.66 msec per loop

I tried running the same tests for longer and for shorter lists (1e2, 1e4, 1e5) and the time per loop scales linearly with the length of the list.

Could somebody time py 2.7+ version?

+11  A: 

You can use a list comprehension (or a generator, depending on your python version) to perform a simple in-place swap for your second example.


Using a list comprehension:

d = dict([(y,x) for x,y in enumerate(t)])

Using a generator expression (Python 2.4 and up):

d = dict((y,x) for x,y in enumerate(t))
kibibu
You don't need the `[]` in there. `dict` works fine with a generator expression (saves making an intermediate list)
gnibbler
Yeah, that's why I wrote "depending on your python version". Generators have been around a long time though (since 2.4), so I will include both
kibibu
@kibibu: Where Python < 2.4 is used?
J.F. Sebastian
@J.F. Sebastian In deployed systems developed prior to 2004? Python has been around for quite a while now. It's not hard to imagine having to work on some Python 2.0 application, I mean some people *still* have to work in VB6.
kibibu
+4  A: 
>>> dict((x,i) for i,x in enumerate(t))
{5: 0, 6: 1, 7: 2}
>>>
S.Mark
+2  A: 

Are all your elements unique (i.e. your list would never be 5,6,7,7)? The dict solution will only work if all your elements are unique.

By storing the index, you're essentially duplicating information, since you could simply query the current index of the item in the list. Duplicating information is usually not the best idea, because it allows the possibility for one set of data to get out of sync with the other.

If the list is being modified, there's also nothing preventing you from accidentally assigning the same index to more than one item.

Why are you trying to store the index value, when you could simply get the index from the list?

Brendan Abel
all list elements are unique. I'm storing the index for fast lookup in a different data structure.
Dragan Chupacabrovic
That sounds like a useless level of indirection if all elements are unique, test for membership with `in` and index with `index()`. I'm guessing that you imagine that a hash-map backed dictionary will give you faster lookup than `index()` will. In Python premature optimization truly is evil because your intuitions about "faster" are often wrong until actually timed. Make it work, then find out where you are slow, added complexity isn't worth it.
msw
@Dragan, are you modifying your list or does it remain static?
Hamish Grubijan
I'm not modifying the list
Dragan Chupacabrovic
@msw valid concern, in general i agree w/ you, in this case I think it's worth it python -mtimeit -s"t = range(int(1e2))" "truthVal = (55 in t)" # 3.11 usec per loop python -mtimeit -s"t = range(int(1e2)); d = dict(zip(t, range(len(t))))" "truthVal = (55 in d)" #0.138 usec per loop python -mtimeit -s"t = range(int(1e2))" "idxVal = t.index(55)" #3.46 usec per loop python -mtimeit -s"t = range(int(1e2)); d = dict(zip(t, range(len(t))))" "indexVal = d[55]" #0.136 usec per loop
Dragan Chupacabrovic
I apologize for the way it got formatted
Dragan Chupacabrovic
@dragan: assertion confirmed, even when I biased your test wildly against dictionaries by the use of "5" as a target dicts still blasted the doors off lists by a factor between 3 and 7 times faster. I thought cache hits/misses may have been a confound, it doesn't appear to be for my system even with list sized of 1e7 that forced it out to swap. (Python 2.6.5 Linux 2.6)
msw
A: 

I like the dict(zip(t, range(len(t)))) best.

Arafangion
@Arafangion: Why? Is it faster?
Hamish Grubijan
It's short, simple, and to the point.
Arafangion
+12  A: 

In Python2.7+ you can write it like this

>>> t = (5,6,7)
>>> d = {x:i for i,x in enumerate(t)}
>>> print d
{5: 0, 6: 1, 7: 2}
gnibbler
+1 for showcasing the beauty of Py3k's dict comprehensions.
Dustin
+2  A: 

As everybody has already written, in Python 2.6 I would consider the following as the most pythonic:

>>> dict((x, i) for i, x in enumerate(t))
{5: 0, 6: 1, 7: 2}

Still, in a moment of functional frenzy I would write:

>>> dict(map(reversed, enumerate(t)))
{5: 0, 6: 1, 7: 2}
krawyoti