An example list of lists:

[
["url","name","date","category"],
["hello","world","2010","one category"],
["foo","bar","2010","another category"],
["asdfasdf","adfasdf","2010","one category"],
["qwer","req","2010","another category"]
]

What I wish to do is create a dictionary -> category : [ list of entries ].

The resultant dictionary would be:

{"category" : [["url","name","date","category"]],
"one category" : [["hello","world","2010","one category"],["asdfasdf","adfasdf","2010","one category"]],
"another category" : [["foo","bar","2010","another category"], ["qwer","req","2010","another category"]]}
+7  A: 
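import itertools, operator
key = operator.itemgetter(3)
dict((category, list(group))
     for category, group in itertools.groupby(sorted(l, key=key), key))

The main thing here is the use of itertools.groupby. It only groups consecutive items, so the list has to be sorted by category first. It also yields iterators rather than lists, which is why each group is wrapped in list(group). Writing dict(itertools.groupby(...)) directly is not a safe shortcut: each group shares the underlying iterator with groupby, so by the time the dict is built the groups are already exhausted.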
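As a rough illustration of why the ordering of the input matters to groupby, here is a small sketch using the question's sample data. The names `rows`, `by_category`, `unsorted_result` and `sorted_result` are made up for this example and are not part of the answer.

```python
import itertools
import operator

rows = [
    ["url", "name", "date", "category"],
    ["hello", "world", "2010", "one category"],
    ["foo", "bar", "2010", "another category"],
    ["asdfasdf", "adfasdf", "2010", "one category"],
    ["qwer", "req", "2010", "another category"],
]
by_category = operator.itemgetter(3)  # key function: the fourth column

# Unsorted input: every run of equal keys becomes its own group, so a later
# run with the same key overwrites the earlier one in the dict.
unsorted_result = {
    key: list(group) for key, group in itertools.groupby(rows, by_category)
}

# Sorted input: all rows of a category are adjacent, so nothing is lost.
sorted_result = {
    key: list(group)
    for key, group in itertools.groupby(sorted(rows, key=by_category), by_category)
}

print(len(unsorted_result["one category"]))  # 1
print(len(sorted_result["one category"]))    # 2
```

Because sorted() is stable, the rows inside each category keep their original relative order.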

abyx
You can use `operator.itemgetter(3)` instead of that lambda.
Ignacio Vazquez-Abrams
@Ignacio - you're right, I always forget that one.
abyx
I never know if these 1/2 liners that utilize itertools/lambdas/whatnot are better than the more verbose/explicit versions. For someone reading this without seeing an example of what's happening it's very hard to understand.
Idan K
@Idan - After reading once what `groupby` does, this is pretty straightforward. And if these 2 lines are in a method called `group_by_categories` it's even better.
abyx
+5  A: 
import collections
newdict = collections.defaultdict(list)
for entry in biglist:
  newdict[entry[3]].append(entry)
Ignacio Vazquez-Abrams
Looking up newdict['category that does not exist'] adds a new element to newdict. This might be fine with the original poster, but these are very specific semantics.
EOL
@EOL: It only picks up categories that are in the original list, so I don't see an issue here.
Ignacio Vazquez-Abrams
@Ignacio: In general, there is no reason for newdict['category that does not exist'] to be set to [] when 'category that does not exist' is not in biglist. For instance, the existence of some categories could be tested with `try: newdict['example category'] except KeyError:…` If newdict is a collections.defaultdict, no exception will be raised, whereas a dict would raise an exception. I just wanted to give a caveat: collections.defaultdicts do not behave exactly like dicts, and the original poster wanted a dict.
EOL
In short, you can't make it behave like a defaultdict during initialization and like a dict afterwards.
Robert Rossney
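For readers following this caveat, here is a minimal sketch of the behaviour being discussed, plus two standard-library ways to get plain-dict lookups once the grouping is finished. The variable names (`demo`, `grouped`, `plain`) and the shortened sample list are assumptions made for illustration only.

```python
import collections

biglist = [
    ["hello", "world", "2010", "one category"],
    ["foo", "bar", "2010", "another category"],
]

# The caveat: merely looking up a missing key creates it on a defaultdict.
demo = collections.defaultdict(list)
demo["category that does not exist"]
print("category that does not exist" in demo)  # True

# Group as in the answer above.
grouped = collections.defaultdict(list)
for entry in biglist:
    grouped[entry[3]].append(entry)

# Option 1: copy into a regular dict, which raises KeyError on missing keys.
plain = dict(grouped)
try:
    plain["category that does not exist"]
    plain_raises = False
except KeyError:
    plain_raises = True

# Option 2: switch off the factory; the defaultdict then raises KeyError too.
grouped.default_factory = None
try:
    grouped["category that does not exist"]
    grouped_raises = False
except KeyError:
    grouped_raises = True

print(plain_raises, grouped_raises)  # True True
```

Option 1 makes a shallow copy, so the lists inside `plain` are the same objects as in `grouped`.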
+1  A: 
list_of_lists=[
["url","name","date","category"],
["hello","world","2010","one category"],
["foo","bar","2010","another category"],
["asdfasdf","adfasdf","2010","one category"],
["qwer","req","2010","another category"]
]
d={}
for li in list_of_lists:
    d.setdefault(li[-1], [])
    d[ li[-1] ].append(li)
for i, j in d.items():
    print(i, j)
ghostdog74
+1, but see my answer, which makes use of the fact that setdefault() returns a value.
EOL
+1  A: 

d = {}
for e in l:
    if e[3] in d:
        d[e[3]].append(e)
    else:
        d[e[3]] = [e]
Dyno Fu
People really do not like straightforward...
Dyno Fu
I forgot that a list is non-hashable...
Dyno Fu
What's wrong with this? It's straightforward, and it does work. L[0][3] is "category", and so on.
telliott99
A: 
>>> l = [
... ["url","name","date","category"],
... ["hello","world","2010","one category"],
... ["foo","bar","2010","another category"],
... ["asdfasdf","adfasdf","2010","one category"],
... ["qwer","req","2010","another category"],
... ]
# Intermediate list that puts the data into a more dictionary-oriented shape
>>> dl = [ (li[3],li[:3]) for li in l ]
>>> dl
[('category', ['url', 'name', 'date']), 
 ('one category', ['hello', 'world', '2010']), 
 ('another category', ['foo', 'bar', '2010']), 
 ('one category', ['asdfasdf', 'adfasdf', '2010']), 
 ('another category', ['qwer', 'req', '2010'])]
#Final dictionary
>>> d = {}
>>> for cat, data in dl:
...   if cat in d:
...     d[cat] = d[cat] + [ data ]
...   else:
...     d[cat] = [ data ]
...
>>> d
{'category': [['url', 'name', 'date']], 
 'one category': [['hello', 'world', '2010'], ['asdfasdf', 'adfasdf', '2010']], 
 'another category': [['foo', 'bar', '2010'], ['qwer', 'req', '2010']]}

The final data is a little different, as I haven't included the category in each entry (that seems quite pointless to me), but you can add it easily if needed.

Khelben
+2  A: 

A variation on ghostdog74's answer, which makes full use of the semantics of setdefault:

result={}
for li in list_of_lists:
    result.setdefault(li[-1], []).append(li)
EOL
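To make the point about setdefault's return value concrete, here is a small sketch; the names `first` and `second` are made up for this example. dict.setdefault(key, default) returns the value now stored under the key, so the result can be appended to directly.

```python
result = {}

# Key missing: setdefault stores the new list and returns that same list.
first = result.setdefault("one category", [])
first.append("a")

# Key present: the default is ignored and the existing list is returned.
second = result.setdefault("one category", [])

print(second is first)  # True
print(result)           # {'one category': ['a']}
```

One trade-off: the default list is constructed on every call, even when the key already exists and the new list is thrown away.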