ansaurus

Question

Answer 1

+1 A:

Iterating over a dictionary is no different from iterating over a list in Python:

for key in dic:
    print("dic[%s] = %s" % (key, dic[key]))

This will print all of the keys and values of your dictionary.
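A slightly more idiomatic variant iterates over key/value pairs directly with items(). This is only a sketch: the contents of dic below are made up for illustration, since the post does not show them.

```python
# Hypothetical sample data; the original post does not show dic's contents.
dic = {"a": 1, "b": 2}

# items() yields (key, value) pairs, so no second lookup by key is needed.
lines = ["dic[%s] = %s" % (key, value) for key, value in dic.items()]

for line in lines:
    print(line)
```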

Avihu Turzion 2009-08-06 20:01:37

While you're right, this was handled in the comments, and doesn't answer his question, which was deducible.

Triptych 2009-08-06 20:29:19

Answer 2

+1 A:

I assume that your unique id will be the key.
Probably not very beautiful, but returns a dict with your unique values:

>>> dict_ = {'1': ['first/dir', 'hello.txt'],
'3': ['first/dir', 'foo.txt'], 
'2': ['second/dir', 'foo.txt'], 
'4': ['second/dir', 'foo.txt']}  
>>> dict((v[0]+"/"+v[1],k) for k,v in dict_.iteritems())
{'second/dir/foo.txt': '4', 'first/dir/hello.txt': '1', 'first/dir/foo.txt': '3'}

I've seen you updated your post:

>>> a
{'324234324': ('third/dir', 'dog.txt'), 
'2323221383': ('second/dir', 'foo.txt'), 
'3434221': ('first/dir', 'hello.txt'), 
'2323232838': ('first/dir', 'hello.txt'), 
'32232334': ('first/dir', 'hello.txt')}
>>> dict((v[0]+"/"+v[1],k) for k,v in a.iteritems())
{'second/dir/foo.txt': '2323221383', 
'first/dir/hello.txt': '32232334', 
'third/dir/dog.txt': '324234324'}
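One thing to keep in mind is that inverting the mapping this way keeps only one id per path, because ids that share a path overwrite each other. A minimal sketch of that behaviour, written with items() instead of iteritems() on the assumption that it should also run on Python 3:

```python
a = {"324234324": ("third/dir", "dog.txt"),
     "2323221383": ("second/dir", "foo.txt"),
     "3434221": ("first/dir", "hello.txt"),
     "2323232838": ("first/dir", "hello.txt"),
     "32232334": ("first/dir", "hello.txt")}

# Build path -> id; ids that share a path overwrite each other.
by_path = dict((v[0] + "/" + v[1], k) for k, v in a.items())

# Five ids collapse to three distinct paths.
print(len(a), len(by_path))
```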

buster 2009-08-06 20:39:14

that's not what OP has asked for at all.

SilentGhost 2009-08-06 20:59:09

Yours isn't either. The OP had a different version in the beginning, which confused me. Triptych's version seems to be alright, though.

buster 2009-08-06 21:04:54

Answer 3

+8 A:

The code below will result in two variables, matched and remainder. matched is a list of dictionaries, one for each group of items in the original dictionary that share the same value. remainder will contain, as in your example, a dictionary of all the unmatched items.

Note that in your example, there is only one set of matching values: ('first/dir', 'hello.txt'). If there were more than one set, each would have a corresponding entry in matched.

import itertools

# Original dict
a = {"2323232838": ("first/dir", "hello.txt"),
     "2323221383": ("second/dir", "foo.txt"),
     "3434221": ("first/dir", "hello.txt"),
     "32232334": ("first/dir", "hello.txt"),
     "324234324": ("third/dir", "dog.txt")}

# Convert dict to sorted list of items
a = sorted(a.items(), key=lambda x:x[1])

# Group by value of tuple
groups = itertools.groupby(a, key=lambda x:x[1])

# Pull out matching groups of items, and combine items   
# with no matches back into a single dictionary
remainder = []
matched   = []

for key, group in groups:
    group = list(group)
    if len(group) == 1:
        remainder.append(group[0])
    else:
        matched.append(dict(group))

remainder = dict(remainder)

Output:

>>> matched
[
  {
    '3434221':    ('first/dir', 'hello.txt'), 
    '2323232838': ('first/dir', 'hello.txt'), 
    '32232334':   ('first/dir', 'hello.txt')
  }
]

>>> remainder
{
  '2323221383': ('second/dir', 'foo.txt'), 
  '324234324':  ('third/dir', 'dog.txt')
}
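The sort step matters because itertools.groupby only groups consecutive items with equal keys. A minimal sketch on made-up data (the values list is purely illustrative and not from the question):

```python
import itertools

# Illustrative data: the two "x" entries are not adjacent.
values = ["x", "y", "x"]

# Without sorting, the separated "x" entries end up in separate groups.
unsorted_groups = [(k, len(list(g))) for k, g in itertools.groupby(values)]

# After sorting, equal entries are adjacent and fall into one group.
sorted_groups = [(k, len(list(g))) for k, g in itertools.groupby(sorted(values))]

print(unsorted_groups)  # [('x', 1), ('y', 1), ('x', 1)]
print(sorted_groups)    # [('x', 2), ('y', 1)]
```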

As a newbie, you're probably being introduced to a few unfamiliar concepts in the code above, such as sorted(), lambda functions, and itertools.groupby().

Triptych 2009-08-06 20:49:46

Nice. I can see now that I misunderstood the question with my answer. Anyway, looks good to me :)

buster 2009-08-06 20:58:09

Thank you, I will need to read up on groups, but that's all good, thanks a million. Also thanks for editing my question!

2009-08-06 21:06:25

Note, len(group) is 1 should read len(group) == 1. While the identity test ("is") works here in CPython due to small integer caching, it's a bad habit to get into. You want an equality test.
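A small sketch of the difference, using an arbitrary larger integer chosen for illustration. The result of the identity test is an implementation detail, so no output is claimed for it here:

```python
a = 1000
b = int("1000")  # same value, but built at runtime

# Equality compares values, which is what a length check needs.
print(a == b)  # True

# Identity compares objects; for integers the result depends on the
# interpreter's caching behaviour and should not be relied upon.
print(a is b)
```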

Ned Deily 2009-08-06 22:50:59

Answer 4

A:

If you know what value you want to filter out:

known_tuple = 'first/dir','hello.txt'
b = {k:v for k, v in a.items() if v == known_tuple}

Then a would become:

a = dict(a.items() - b.items())

This is py3k notation, but I'm sure something similar can be implemented in legacy versions. If you don't know what known_tuple is, you'd first need to find it, for example like this:

c = list(a.values())
for i in set(c):
    c.remove(i)
known_tuple = c[0]
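The loop above picks a single repeated value and fails with an IndexError if nothing repeats. As an alternative sketch (not part of the original approach), collections.Counter can collect every repeated value at once; the names repeated, matched and remainder are chosen here for illustration:

```python
from collections import Counter

a = {"2323232838": ("first/dir", "hello.txt"),
     "2323221383": ("second/dir", "foo.txt"),
     "3434221": ("first/dir", "hello.txt"),
     "32232334": ("first/dir", "hello.txt"),
     "324234324": ("third/dir", "dog.txt")}

# Count how often each value tuple occurs.
counts = Counter(a.values())

# Values seen more than once play the role of known_tuple.
repeated = {value for value, n in counts.items() if n > 1}

# Split the original dict into repeated and unique entries.
matched = {k: v for k, v in a.items() if v in repeated}
remainder = {k: v for k, v in a.items() if v not in repeated}
```

If no value repeats, repeated and matched are simply empty and remainder equals a.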

SilentGhost 2009-08-06 20:52:47

No, it can very well be "third/dir", "something.txt", I don't know.

2009-08-06 20:58:56

Answer 5

+4 A:

What you're asking for is called an "Inverted Index" -- the distinct items are recorded just once with a list of keys.

>>> from collections import defaultdict
>>> a = {"2323232838": ("first/dir", "hello.txt"),
...      "2323221383": ("second/dir", "foo.txt"),
...      "3434221": ("first/dir", "hello.txt"),
...      "32232334": ("first/dir", "hello.txt"),
...      "324234324": ("third/dir", "dog.txt")}
>>> invert = defaultdict( list )
>>> for key, value in a.items():
...     invert[value].append( key )
... 
>>> invert
defaultdict(<type 'list'>, {('first/dir', 'hello.txt'): ['3434221', '2323232838', '32232334'], ('second/dir', 'foo.txt'): ['2323221383'], ('third/dir', 'dog.txt'): ['324234324']})

The inverted dictionary has the original values associated with a list of 1 or more keys.

Now, to get your revised dictionaries from this.

Filtering:

>>> [ invert[multi] for multi in invert if len(invert[multi]) > 1 ]
[['3434221', '2323232838', '32232334']]
>>> [ invert[uni] for uni in invert if len(invert[uni]) == 1 ]
[['2323221383'], ['324234324']]

Expanding:

>>> [ (i,multi) for multi in invert if len(invert[multi]) > 1 for i in invert[multi] ]
[('3434221', ('first/dir', 'hello.txt')), ('2323232838', ('first/dir', 'hello.txt')), ('32232334', ('first/dir', 'hello.txt'))]
>>> dict( (i,multi) for multi in invert if len(invert[multi]) > 1 for i in invert[multi] )
{'3434221': ('first/dir', 'hello.txt'), '2323232838': ('first/dir', 'hello.txt'), '32232334': ('first/dir', 'hello.txt')}

A similar (but simpler) treatment works for the items which occur once.
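A minimal sketch of that simpler case, rebuilding the inverted index so the block runs on its own. The name remainder is an assumption borrowed from the other answers:

```python
from collections import defaultdict

a = {"2323232838": ("first/dir", "hello.txt"),
     "2323221383": ("second/dir", "foo.txt"),
     "3434221": ("first/dir", "hello.txt"),
     "32232334": ("first/dir", "hello.txt"),
     "324234324": ("third/dir", "dog.txt")}

# Inverted index: value tuple -> list of keys that carry it.
invert = defaultdict(list)
for key, value in a.items():
    invert[value].append(key)

# A value with exactly one key maps straight back to one (key, value) pair,
# so no nested loop over the key list is needed.
remainder = dict((keys[0], value) for value, keys in invert.items() if len(keys) == 1)
```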

S.Lott 2009-08-06 22:08:42

Huh, very simple. I've got to use the Python standard library more. Thanks for this.

2009-08-06 22:52:56

Ah, nice, too. It's amazing what you can do with simple standard calls :)

buster 2009-08-07 07:41:12

How to filter a dictionary by value?
