I've just read in a file that is something like:

name: john, jane
car: db9, m5
food: pizza, lasagne

The entries in each row (name, car, food) are in the same order, so position tells you who owns what. John owns the car 'DB9' and his favourite food is 'Pizza'; likewise, Jane's car is an 'M5' and her favourite food is 'Lasagne'.

I effectively have:

>>> name['Name']="John"
>>> namesL.append(name)
>>> name['Name']="Jane"
>>> namesL.append(name)
>>> car['Car']="DB9"
>>> cars.append(car)
>>> car['Car']="M5"
>>> cars.append(car)
>>> food['Food']="Pizza"
>>> foodL.append(food)
>>> food['Food']="Lasagne"
>>> foodL.append(food)

>>> ultimateList.append(foodL)
...

However, I want each person's details to be in their own dictionary, something like this:

>>> PersonalDict
{'Name': 'John', 'Car': 'DB9', 'Food': 'Pizza'}

I've been staring at it for a while and can't work out how I should approach this. Can anyone offer some ideas, or should I just do this some other way?

+1  A: 

Try:

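result = []
with open('filename.txt') as f:
    for line in f:
        # Split on the first colon only, so values may contain ':'.
        key, values = line.split(':', 1)
        # strip() (not rstrip()) so the first value loses its leading space.
        values = values.strip().split(', ')
        for i, value in enumerate(values):
            try:
                result[i][key] = value
            except IndexError:
                # First row seen: create one dict per person.
                result.append({key: value})

print result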
oggy
+3  A: 

Looks like you want something like:

import collections

data = '''name: john, jane
car: db9, m5
food: pizza, lasagne
'''

personal_list = collections.defaultdict(dict)

for line in data.splitlines():
  key, _, info = line.partition(':')
  infos = info.split(',')
  key = key.strip().title()
  for i, item in enumerate(infos):
    item = item.strip().title()
    personal_list[i][key] = item

for i in personal_list:
  print personal_list[i]

That doesn't do exactly what you specify (the capitalization of the B in DB9 seems totally weird for example -- how would the code know to capitalize that particular second letter and not any other second letter?!) but it seems pretty close.
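
To make the capitalization point concrete, here is a small sketch (written for Python 3, unlike the code above) of what `str.title()` does to these values. The `FORMATTERS` table and `format_value` helper are hypothetical, not part of the answer; they only illustrate one way you could tell the code which fields to upper-case.

```python
# str.title() capitalizes the first letter of each run of letters,
# so 'db9' becomes 'Db9', not 'DB9'.
print('db9'.title())    # Db9
print('m5'.title())     # M5
print('john'.title())   # John

# Hypothetical per-key rules (an assumption for illustration only):
# car model codes are upper-cased, everything else is title-cased.
FORMATTERS = {'Car': str.upper}

def format_value(key, value):
    """Format a value using the rule registered for its key."""
    return FORMATTERS.get(key, str.title)(value)

print(format_value('Car', 'db9'))    # DB9
print(format_value('Name', 'john'))  # John
```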

Alex Martelli
I'm slightly confused. How would you include reading in the file with this?
day_trader
To read from a file, use `for line in f.readlines()`. I'm happy to see that the version I wrote is almost identical to Alex's, except that the first line in my for loop is `(k, _, d) = map(str.title, map(str.strip, line.partition(":")))`
Robert Rossney
You don't need to use `for line in f.readlines()` to read a file; it is enough to just say `for line in f`. The `.readlines()` method will slurp in all the lines in the file and make a list in memory; the file object returned by `open()` will act as an iterator, yielding up one line at a time.
steveha
Better: `for line in f:` -- absolutely no need to add `.readlines()` to that `f`.
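
A minimal sketch of how the two pieces fit together, assuming Python 3: the loop takes any iterable of lines, so a file object can be passed in directly without `.readlines()`. The name `build_people` is made up for this example, and `io.StringIO` stands in for a real file so the snippet runs on its own.

```python
import collections
import io

def build_people(lines):
    """Build one dict per person from lines like 'name: john, jane'."""
    people = collections.defaultdict(dict)
    for line in lines:          # a file object or any iterable of strings
        key, sep, info = line.partition(':')
        if not sep:             # skip blank or malformed lines (no colon)
            continue
        key = key.strip().title()
        for i, item in enumerate(info.split(',')):
            people[i][key] = item.strip().title()
    # Return the per-person dicts in column order.
    return [people[i] for i in sorted(people)]

# io.StringIO stands in for a real file; with a file on disk you would write
#     with open('filename.txt') as f:
#         people = build_people(f)
sample = io.StringIO('name: john, jane\ncar: db9, m5\nfood: pizza, lasagne\n')
people = build_people(sample)
print(people)
# [{'Name': 'John', 'Car': 'Db9', 'Food': 'Pizza'}, {'Name': 'Jane', 'Car': 'M5', 'Food': 'Lasagne'}]
```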
Alex Martelli
Could you please explain what the first line of the first for loop does, particularly the underscore?
day_trader
`_` is a traditional name for a placeholder variable -- one you don't care about, but have to have **some** name there (we're unpacking a tuple with 3 items so we need exactly 3 variables on the left of `=`). `partition` is a string method added in Python 2.5, see http://docs.python.org/library/stdtypes.html?highlight=partition#str.partition . Briefly, we're setting `key` to what's before the ':', `info` to what's after the ':' (and `_` to exactly ':', but we don't care;-).
Alex Martelli
+1  A: 

Split the initial data into index/key/value triples and go from there.

def parse_data(lines):
    for line in lines:
        key, _, data = line.partition(':')
        for i, datum in enumerate(x.strip() for x in data.split(',')):
            yield i, key, datum

From there you can aggregate the data using Alex's defaultdict approach (probably best), or sort the triples and write some extra code to build individual dictionaries on demand.
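
A rough sketch of the defaultdict aggregation, assuming Python 3. `parse_data` is repeated so the snippet runs on its own; `aggregate` is a hypothetical helper name, and the input is a plain list of strings rather than a file.

```python
import collections

def parse_data(lines):
    """Yield (index, key, value) triples, one per comma-separated value."""
    for line in lines:
        key, _, data = line.partition(':')
        for i, datum in enumerate(x.strip() for x in data.split(',')):
            yield i, key, datum

def aggregate(triples):
    """Group triples by index into one dict per person."""
    people = collections.defaultdict(dict)
    for i, key, datum in triples:
        people[i][key] = datum
    return [people[i] for i in sorted(people)]

lines = ['name: john, jane', 'car: db9, m5', 'food: pizza, lasagne']
print(aggregate(parse_data(lines)))
# [{'name': 'john', 'car': 'db9', 'food': 'pizza'}, {'name': 'jane', 'car': 'm5', 'food': 'lasagne'}]
```

Keys and values are left exactly as they appear in the input here; add `.title()` or similar if you want them capitalized.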

Corey Porter
+1  A: 

An homage to generators:

#!/usr/bin/env python
data=(zip(*([elt.strip().title() for elt in line.replace(':',',',1).split(',')]
            for line in open('filename.txt','r'))))
personal_list=[dict(zip(data[0],datum)) for datum in data[1:]]
print(personal_list)

# [{'Food': 'Pizza', 'Car': 'Db9', 'Name': 'John'}, {'Food': 'Lasagne', 'Car': 'M5', 'Name': 'Jane'}]

To understand how the script works, we break it apart:

First we load filename.txt into a list of lines:

In [41]: [line for line in open('filename.txt','r')]
Out[41]: ['name: john, jane\n', 'car: db9, m5\n', 'food: pizza, lasagne\n']

Next we replace the first colon (:) with a comma (,):

In [42]: [line.replace(':',',',1) for line in open('filename.txt','r')]
Out[42]: ['name, john, jane\n', 'car, db9, m5\n', 'food, pizza, lasagne\n']

Then we split each line on commas:

In [43]: [line.replace(':',',',1).split(',') for line in open('filename.txt','r')]
Out[43]: 
[['name', ' john', ' jane\n'],
 ['car', ' db9', ' m5\n'],
 ['food', ' pizza', ' lasagne\n']]

For each element in each line, we strip off beginning/ending whitespace and capitalize the string like a title:

In [45]: [[elt.strip().title() for elt in line.replace(':',',',1).split(',')] for line in open('filename.txt','r')]
Out[45]: [['Name', 'John', 'Jane'], ['Car', 'Db9', 'M5'], ['Food', 'Pizza', 'Lasagne']]

Now we collect the first element of each list, then the second, and so forth:

In [47]: data=(zip(*([elt.strip().title() for elt in line.replace(':',',',1).split(',')] for line in open('filename.txt','r'))))

In [48]: data
Out[48]: [('Name', 'Car', 'Food'), ('John', 'Db9', 'Pizza'), ('Jane', 'M5', 'Lasagne')]

data[0] now holds the keys for a dict.

In [49]: data[0]
Out[49]: ('Name', 'Car', 'Food')

Each tuple in data[1:] holds the values for one dict.

In [50]: data[1:]
Out[50]: [('John', 'Db9', 'Pizza'), ('Jane', 'M5', 'Lasagne')]

Here we zip up the keys with the values:

In [52]: [ zip(data[0],datum) for datum in data[1:]]
Out[52]: 
[[('Name', 'John'), ('Car', 'Db9'), ('Food', 'Pizza')],
 [('Name', 'Jane'), ('Car', 'M5'), ('Food', 'Lasagne')]]

Finally, we turn it into a list of dicts:

In [54]: [dict(zip(data[0],datum)) for datum in data[1:]]
Out[54]: 
[{'Car': 'Db9', 'Food': 'Pizza', 'Name': 'John'},
 {'Car': 'M5', 'Food': 'Lasagne', 'Name': 'Jane'}]
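
The session above is Python 2, where `zip` returns a list. In Python 3 `zip` is lazy, so `data[0]` would fail unless the result is wrapped in `list()`. Here is a sketch of the same steps under that assumption; `rows_to_dicts` is a made-up name, and `io.StringIO` stands in for filename.txt.

```python
import io

def rows_to_dicts(lines):
    """Turn 'key: v1, v2' rows into one dict per column of values."""
    # Replace the first colon with a comma, split, then strip and title-case.
    rows = ([elt.strip().title() for elt in line.replace(':', ',', 1).split(',')]
            for line in lines)
    # Transpose rows into columns; list() is needed because zip is lazy in Python 3.
    data = list(zip(*rows))
    keys = data[0]
    return [dict(zip(keys, datum)) for datum in data[1:]]

sample = io.StringIO('name: john, jane\ncar: db9, m5\nfood: pizza, lasagne\n')
personal_list = rows_to_dicts(sample)
print(personal_list)
# [{'Name': 'John', 'Car': 'Db9', 'Food': 'Pizza'}, {'Name': 'Jane', 'Car': 'M5', 'Food': 'Lasagne'}]
```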
unutbu
+1, but I'd never stand for something like this in production code.
Robert Rossney