Python newb here looking for some assistance...

For a variable number of dicts in a python list like:

list_dicts = [
{'id':'001', 'name':'jim', 'item':'pencil', 'price':'0.99'},
{'id':'002', 'name':'mary', 'item':'book', 'price':'15.49'},
{'id':'002', 'name':'mary', 'item':'tape', 'price':'7.99'},
{'id':'003', 'name':'john', 'item':'pen', 'price':'3.49'},
{'id':'003', 'name':'john', 'item':'stapler', 'price':'9.49'},
{'id':'003', 'name':'john', 'item':'scissors', 'price':'12.99'},
]

I'm trying to find the best way to group dicts where the value of key "id" is equal, then add/merge any unique key:value and create a new list of dicts like:

list_dicts2 = [
{'id':'001', 'name':'jim', 'item1':'pencil', 'price1':'0.99'},
{'id':'002', 'name':'mary', 'item1':'book', 'price1':'15.49', 'item2':'tape', 'price2':'7.99'},
{'id':'003', 'name':'john', 'item1':'pen', 'price1':'3.49', 'item2':'stapler', 'price2':'9.49', 'item3':'scissors', 'price3':'12.99'},
]

So far, I've figured out how to group the dicts in the list with:

myList = itertools.groupby(list_dicts, operator.itemgetter('id'))

But I'm struggling with how to build the new list of dicts to:

1) Add the extra keys and values to the first dict instance that has the same "id"

2) Set the new name for "item" and "price" keys (e.g. "item1", "item2", "item3"). This seems clunky to me, is there a better way?

3) Loop over each "id" match to build up a string for later output

I've chosen to return a new list of dicts only because it is convenient to pass a dict to a templating function, where setting variables by a descriptive key is helpful (there are many vars). If there is a cleaner, more concise way to accomplish this, I'd be curious to learn. Again, I'm pretty new to Python and to working with data structures like this.

A: 

I imagine it would be easier to combine the items in list_dicts into something that looks more like this:

list_dicts2 = [
    {'id':1, 'name':'jim', 'items':[{'itemname':'pencil','price':'0.99'}]},
    {'id':2, 'name':'mary', 'items':[{'itemname':'book','price':'15.49'}, {'itemname':'tape','price':'7.99'}]},
]

You could also use a list of tuples for 'items' or perhaps a named tuple.
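
As a rough illustration of the named tuple idea, here is a minimal sketch. The type names Item and Person are made up for this example, and the ids are kept as the strings from the question:

```python
from collections import namedtuple

# Hypothetical record types; the names Item and Person are illustrative only.
Item = namedtuple('Item', ['itemname', 'price'])
Person = namedtuple('Person', ['id', 'name', 'items'])

list_dicts2 = [
    Person('001', 'jim', [Item('pencil', '0.99')]),
    Person('002', 'mary', [Item('book', '15.49'), Item('tape', '7.99')]),
]

for person in list_dicts2:
    # Fields are reachable by name instead of by position or dict key.
    print(person.name, [item.itemname for item in person.items])
```

Named tuples are immutable, so this fits best when the grouped data is built once and then only read, for example by a template.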

Mark
A: 

This looks very much like a homework problem.

As the above poster mentioned, there are a few more appropriate data structures for this kind of data. Some variant of the following might be reasonable:

[ ('001', 'jim', [('pencil', '0.99')]), 
('002', 'mary', [('book', '15.49'), ('tape', '7.99')]), 
('003', 'john', [('pen', '3.49'), ('stapler', '9.49'), ('scissors', '12.99')])]

This can be made with the relatively simple:

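import itertools
import operator

list2 = []
# groupby only merges adjacent records, so list_dicts must already be sorted by 'id'
for person_id, group in itertools.groupby(list_dicts, operator.itemgetter('id')):
    records = list(group)
    list2.append((person_id, records[0]['name'], [(z['item'], z['price']) for z in records]))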

The interesting thing about this question is the difficulty in extracting 'name' when using groupby, without iterating past the item.
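
One possible way around that, sketched here under the assumption that every record with the same 'id' also carries the same 'name', is to group on both fields at once. The name then arrives as part of the group key and nothing has to be read out of the group itself. The variable name grouped is just for this example:

```python
import itertools
import operator

list_dicts = [
    {'id': '001', 'name': 'jim', 'item': 'pencil', 'price': '0.99'},
    {'id': '002', 'name': 'mary', 'item': 'book', 'price': '15.49'},
    {'id': '002', 'name': 'mary', 'item': 'tape', 'price': '7.99'},
    {'id': '003', 'name': 'john', 'item': 'pen', 'price': '3.49'},
    {'id': '003', 'name': 'john', 'item': 'stapler', 'price': '9.49'},
    {'id': '003', 'name': 'john', 'item': 'scissors', 'price': '12.99'},
]

# With two field names, itemgetter returns an ('id', 'name') tuple as the key.
key = operator.itemgetter('id', 'name')

grouped = []
# groupby only merges adjacent records, so sort by the same key first.
# sorted() is stable, so items keep their original order within each person.
for (person_id, name), group in itertools.groupby(sorted(list_dicts, key=key), key):
    grouped.append((person_id, name, [(d['item'], d['price']) for d in group]))

print(grouped)
```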

To get back to the original goal though, you could use code like this (as the OP suggested):

list3 = []
for person_id, name, itemList in list2:
    newitem = {'id': person_id, 'name': name}
    # number the keys from 1: item1/price1, item2/price2, ...
    for index, (item, price) in enumerate(itemList, 1):
        newitem['item' + str(index)] = item
        newitem['price' + str(index)] = price
    list3.append(newitem)
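
If the intermediate list of tuples is not needed, the flattened dicts could also be built in one pass with a plain dict keyed by id. This is only a sketch of an alternative, not the groupby approach above. It assumes each record contributes exactly one item/price pair, and the name merged is made up for the example. It does not need the input to be sorted:

```python
list_dicts = [
    {'id': '001', 'name': 'jim', 'item': 'pencil', 'price': '0.99'},
    {'id': '002', 'name': 'mary', 'item': 'book', 'price': '15.49'},
    {'id': '002', 'name': 'mary', 'item': 'tape', 'price': '7.99'},
    {'id': '003', 'name': 'john', 'item': 'pen', 'price': '3.49'},
    {'id': '003', 'name': 'john', 'item': 'stapler', 'price': '9.49'},
    {'id': '003', 'name': 'john', 'item': 'scissors', 'price': '12.99'},
]

merged = {}  # id -> flattened dict (dicts keep insertion order in Python 3.7+)
for record in list_dicts:
    entry = merged.setdefault(
        record['id'], {'id': record['id'], 'name': record['name']})
    # Every earlier item added two keys, so the next index follows from
    # how many keys the entry holds beyond 'id' and 'name'.
    n = (len(entry) - 2) // 2 + 1
    entry['item%d' % n] = record['item']
    entry['price%d' % n] = record['price']

list_dicts2 = list(merged.values())
print(list_dicts2)
```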
jkerian
+2  A: 

Try to avoid complex nested data structures. I believe people tend to grok them only while they are intensively using the data structure. After the program is finished, or is set aside for a while, the data structure quickly becomes mystifying.

Objects can be used to retain or even add richness to the data structure in a saner, more organized way. For instance, it appears the item and price always go together. So the two pieces of data might as well be paired in an object:

class Item(object):
    def __init__(self,name,price):
        self.name=name
        self.price=price

Similarly, a person seems to have an id and name and a set of possessions:

class Person(object):
    def __init__(self,id,name,*items):
        self.id=id
        self.name=name
        self.items=set(items)

If you buy into the idea of using classes like these, then your list_dicts could become

list_people = [
    Person('001','jim',Item('pencil',0.99)),
    Person('002','mary',Item('book',15.49)),
    Person('002','mary',Item('tape',7.99)),
    Person('003','john',Item('pen',3.49)),
    Person('003','john',Item('stapler',9.49)),
    Person('003','john',Item('scissors',12.99)), 
]

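Then, to merge the people based on id, you could use Python's reduce function (a builtin in Python 2; in Python 3 it lives in functools). Keep in mind that groupby only merges adjacent entries, so list_people should already be ordered by id. The merging itself is done by take_items, which takes (merges) the items from one person and gives them to another: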

def take_items(person,other):
    '''
    person takes other's items.
    Note however, that although person may be altered, other remains the same --
    other does not lose its items.    
    '''
    person.items.update(other.items)
    return person

Putting it all together:

import itertools
import operator
from functools import reduce  # required on Python 3; also available on Python 2.6+

class Item(object):
    def __init__(self,name,price):
        self.name=name
        self.price=price
    def __str__(self):
        return '{0} {1}'.format(self.name,self.price)

class Person(object):
    def __init__(self,id,name,*items):
        self.id=id
        self.name=name
        self.items=set(items)
    def __str__(self):
        return '{0} {1}: {2}'.format(self.id,self.name,list(map(str,self.items)))

list_people = [
    Person('001','jim',Item('pencil',0.99)),
    Person('002','mary',Item('book',15.49)),
    Person('002','mary',Item('tape',7.99)),
    Person('003','john',Item('pen',3.49)),
    Person('003','john',Item('stapler',9.49)),
    Person('003','john',Item('scissors',12.99)), 
]

def take_items(person,other):
    '''
    person takes other's items.
    Note however, that although person may be altered, other remains the same --
    other does not lose its items.    
    '''
    person.items.update(other.items)
    return person

list_people2 = [reduce(take_items,g)
                for k,g in itertools.groupby(list_people, lambda person: person.id)]
for person in list_people2:
    print(person)
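
If the starting point is still the original list_dicts, the Person objects could also be built once per group instead of being merged afterwards. The following is only a sketch of that variant. It assumes a Person constructor that takes a single iterable of items (unlike the *items signature above) and stores them in a list so that order is kept; the helper name by_id is made up for the example:

```python
import itertools

class Item(object):
    def __init__(self, name, price):
        self.name = name
        self.price = price

class Person(object):
    # Variant signature: one iterable of items rather than *items.
    def __init__(self, id, name, items):
        self.id = id
        self.name = name
        self.items = list(items)  # a list keeps the original order, unlike a set

list_dicts = [
    {'id': '001', 'name': 'jim', 'item': 'pencil', 'price': '0.99'},
    {'id': '002', 'name': 'mary', 'item': 'book', 'price': '15.49'},
    {'id': '002', 'name': 'mary', 'item': 'tape', 'price': '7.99'},
    {'id': '003', 'name': 'john', 'item': 'pen', 'price': '3.49'},
    {'id': '003', 'name': 'john', 'item': 'stapler', 'price': '9.49'},
    {'id': '003', 'name': 'john', 'item': 'scissors', 'price': '12.99'},
]

def by_id(record):
    return record['id']

list_people2 = []
# Sort first: groupby only merges records that sit next to each other.
for _, group in itertools.groupby(sorted(list_dicts, key=by_id), by_id):
    records = list(group)
    items = [Item(r['item'], float(r['price'])) for r in records]
    list_people2.append(Person(records[0]['id'], records[0]['name'], items))

for person in list_people2:
    print(person.id, person.name, [(i.name, i.price) for i in person.items])
```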
unutbu