In my first version, it looks like I misinterpreted your question. So if I've got this correct, you're trying to process a list of files so that you can easily access all the filenames with a given extension, or all the filenames with a given base ("base" being the part before the period)?
If that's the case, I would recommend this way:
from itertools import groupby
def group_by_name(filenames):
'''Puts the filenames in the given iterable into a dictionary where
the key is the first component of the filename and the value is
a list of the filenames with that component.'''
keyfunc = lambda f: f.split('.', 1)[0]
return dict( (k, list(g)) for k,g in groupby(
sorted(filenames, key=keyfunc), key=keyfunc
) )
For instance, given the list
>>> test_data = [
... exia.frame, exia.head, exia.swords, exia.legs,
... exia.arms, exia.pilot, exia.gn_drive, lockon_stratos.data,
... tieria_erde.data, ribbons_almark.data, otherstuff.dada
... ]
that function would produce
>>> group_by_name(test_data)
{'exia': ['exia.arms', 'exia.frame', 'exia.gn_drive', 'exia.head',
'exia.legs', 'exia.pilot', 'exia.swords'],
'lockon_stratos': ['lockon_stratos.data'],
'otherstuff': ['otherstuff.dada'],
'ribbons_almark': ['ribbons_almark.data'],
'tieria_erde': ['tieria_erde.data']}
If you wanted to index the filenames by extension instead, a slight modification will do that for you:
def group_by_extension(filenames):
'''Puts the filenames in the given iterable into a dictionary where
the key is the last component of the filename and the value is
a list of the filenames with that extension.'''
keyfunc = lambda f: f.split('.', 1)[1]
return dict( (k, list(g)) for k,g in groupby(
sorted(filenames, key=keyfunc), key=keyfunc
) )
The only difference is in the keyfunc = ...
line, where I changed the key from 0 to 1. Example:
>>> group_by_extension(test_data)
{'arms': ['exia.arms'],
'dada': ['otherstuff.dada'],
'data': ['lockon_stratos.data', 'ribbons_almark.data', 'tieria_erde.data'],
'frame': ['exia.frame'],
'gn_drive': ['exia.gn_drive'],
'head': ['exia.head'],
'legs': ['exia.legs'],
'pilot': ['exia.pilot'],
'swords': ['exia.swords']}
If you want to get both those groupings at the same time, though, I think it'd be better to avoid a list comprehension, because that can only process them one way or another, it can't construct two different dictionaries at once.
from collections import defaultdict
def group_by_both(filenames):
'''Puts the filenames in the given iterable into two dictionaries,
where in the first, the key is the first component of the filename,
and in the second, the key is the last component of the filename.
The values in each dictionary are lists of the filenames with that
base or extension.'''
by_name = defaultdict(list)
by_ext = defaultdict(list)
for f in filenames:
name, ext = f.split('.', 1)
by_name[name] += [f]
by_ext[ext] += [f]
return by_name, by_ext