ansaurus

Question

Filtering dictionaries and creating sub-dictionaries based on keys/values in Python?

Answer 1

+1 A:

The answer is too simple, so I guess we are missing some information. Anyway:

result = []
for datadict in data:
    # Keep the dictionaries whose 'matchkey' entry has the wanted value;
    # replace this test with whatever filtering you need.
    if datadict.get('matchkey') == 'matchvalue':
        result.append(datadict)

Also, your "main dictionary" is not a dictionary but a list. Just wanted to clear that up.
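
For comparison, the same filter can be sketched as a list comprehension. This assumes `data` is a list of dictionaries; the sample records below are made up for illustration.

```python
# Hypothetical sample data: a list of dictionaries, as in the question.
data = [
    {"matchkey": "matchvalue", "other": 1},
    {"matchkey": "something else"},
    {"unrelated": "matchvalue"},
]

# Keep the dictionaries whose 'matchkey' entry equals 'matchvalue'.
# dict.get returns None for a missing key, so dictionaries without
# 'matchkey' are skipped instead of raising KeyError.
result = [d for d in data if d.get("matchkey") == "matchvalue"]

print(result)
```

Only the first record survives here, because it is the only one that stores 'matchvalue' under 'matchkey'.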

Lennart Regebro 2009-07-31 21:39:32

Answer 2

+6 A:

Here is a not so pretty way of doing it. The result is a generator, but if you really want a list you can surround it with a call to list(). Mostly it doesn't matter.

The predicate is a function that decides, for each key/value pair, whether a dictionary in the list makes the cut. The default one accepts everything. A dictionary is rejected if none of its key/value pairs satisfies the predicate.

def filter_data(data, predicate=lambda k, v: True):
    for d in data:
        for k, v in d.items():
            if predicate(k, v):
                yield d
                break  # one matching pair is enough; don't yield d again


test_data = [{"key1":"value1", "key2":"value2"}, {"key1":"blabla"}, {"key1":"value1", "eh":"uh"}]
list(filter_data(test_data, lambda k, v: k == "key1" and v == "value1"))
# [{'key1': 'value1', 'key2': 'value2'}, {'key1': 'value1', 'eh': 'uh'}]
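
A sketch of a variant built on any(), which yields each dictionary at most once however many of its pairs satisfy the predicate. The name filter_data_once is made up for this example and is not part of the answer above.

```python
def filter_data_once(data, predicate=lambda k, v: True):
    """Yield each dictionary that has at least one pair accepted by predicate."""
    for d in data:
        # any() stops at the first accepted pair, so d is yielded at most once.
        if any(predicate(k, v) for k, v in d.items()):
            yield d


test_data = [
    {"key1": "value1", "key2": "value2"},
    {"key1": "blabla"},
    {"key1": "value1", "eh": "uh"},
]

# A predicate on values only: two pairs match in the first dictionary,
# but it still appears a single time in the result.
matches = list(filter_data_once(test_data, lambda k, v: v.startswith("value")))
print(matches)
```

Note that any() over an empty dictionary is False, so empty dictionaries are rejected even by the default predicate.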

Skurmedel 2009-07-31 21:44:29

"not so pretty"? Disagree. This is very nice.

S.Lott 2009-07-31 21:51:52

Thank you :). I tend to think staircase functions like that are ugly.

Skurmedel 2009-07-31 21:55:28

@Skurmedel: Your function is elegant and it's easy to see how it does the job in simple steps; it saves the readers having to parse a complicated one-liner in their heads.

John Machin 2009-08-02 00:25:30

Wow, that's pretty much exactly what I was looking for... and I have to disagree on the 'not so pretty' comment too.

Crazy Serb 2009-08-04 16:20:11

Answer 3

+2 A:

Net of the issues already pointed out in other comments and answers (multiple identical keys can't be in a dict, etc etc), here's how I'd do it:

def select_sublist(list_of_dicts, **kwargs):
    # Keep the dicts in which every keyword argument matches the stored value.
    return [d for d in list_of_dicts
            if all(d.get(k) == kwargs[k] for k in kwargs)]

subdata = select_sublist(data, key1='value1')
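
A short usage sketch with more than one criterion. The sample data is invented for illustration, and the function is repeated so the snippet runs on its own. Keys that are not valid Python identifiers cannot be written as keyword arguments directly, but they can be passed by unpacking a dictionary.

```python
def select_sublist(list_of_dicts, **kwargs):
    # Keep the dicts in which every keyword argument matches the stored value.
    return [d for d in list_of_dicts
            if all(d.get(k) == kwargs[k] for k in kwargs)]


# Hypothetical sample data.
data = [
    {"key1": "value1", "key2": "value2"},
    {"key1": "blabla"},
    {"key1": "value1", "eh": "uh", "my key": 1},
]

# All criteria must hold at once (logical AND).
both = select_sublist(data, key1="value1", eh="uh")

# A key containing a space is passed by unpacking a dict.
spaced = select_sublist(data, **{"my key": 1})

print(both)
print(spaced)
```

Because the lookup uses d.get(k), a criterion of None also matches dictionaries that lack the key altogether; whether that is wanted depends on the data.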

Alex Martelli 2009-07-31 22:24:15

Answer 4

A:

Inspired by Skurmedel's answer, I split this into a recursive scheme to work with a database of nested dictionaries. In this case, a "record" is the subdictionary at the trunk. The predicate defines which records we are after: those that match some (key, value) pair, where the pair may be deeply nested.

def filter_dict(the_dict, predicate=lambda k, v: True):
    # Yield the (key, record) pairs whose record matches somewhere inside.
    for k, v in the_dict.items():
        if isinstance(v, dict) and _filter_dict_sub(predicate, v):
            yield k, v

def _filter_dict_sub(predicate, the_dict):
    # True if any pair in the_dict, at any nesting depth, satisfies predicate.
    for k, v in the_dict.items():
        if isinstance(v, dict) and _filter_dict_sub(predicate, v):
            return True
        if predicate(k, v):
            return True
    return False

Since this is a generator, you may need to wrap with dict(filter_dict(the_dict)) to obtain a filtered dictionary.
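
A usage sketch on a small nested database. The records and the predicate are invented for illustration, and the two functions are repeated (with .items() so they run on Python 3) to keep the snippet self-contained.

```python
def filter_dict(the_dict, predicate=lambda k, v: True):
    # Yield the (key, record) pairs whose record matches somewhere inside.
    for k, v in the_dict.items():
        if isinstance(v, dict) and _filter_dict_sub(predicate, v):
            yield k, v


def _filter_dict_sub(predicate, the_dict):
    # True if any pair in the_dict, at any nesting depth, satisfies predicate.
    for k, v in the_dict.items():
        if isinstance(v, dict) and _filter_dict_sub(predicate, v):
            return True
        if predicate(k, v):
            return True
    return False


# Hypothetical database: top-level keys map to records (nested dictionaries).
db = {
    "alice": {"age": 30, "address": {"city": "Oslo"}},
    "bob": {"age": 25, "address": {"city": "Bergen"}},
    "version": 1,  # not a dictionary, so never treated as a record
}

# Select the records that contain the pair ("city", "Oslo") at any depth.
in_oslo = dict(filter_dict(db, lambda k, v: k == "city" and v == "Oslo"))
print(in_oslo)
```

Top-level values that are not dictionaries, such as "version" here, are skipped regardless of the predicate.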

2010-08-30 16:33:07
