ansaurus

Question

Answer 1

+1 A:

Try:

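result = []
with open('filename.txt') as f:
    for line in f:
        # Split on the first colon only, so values may contain colons.
        key, values = line.split(':', 1)
        # strip() drops the trailing newline and the space after the colon.
        values = values.strip().split(', ')
        for i, value in enumerate(values):
            try:
                result[i][key] = value
            except IndexError:
                # First value seen for column i: start a new dict for it.
                result.append({key: value})

print(result)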

oggy 2009-11-05 18:26:02

Answer 2

+3 A:

Looks like you want something like:

import collections

data = '''name: john, jane
car: db9, m5
food: pizza, lasagne
'''

personal_list = collections.defaultdict(dict)

for line in data.splitlines():
  key, _, info = line.partition(':')
  infos = info.split(',')
  key = key.strip().title()
  for i, item in enumerate(infos):
    item = item.strip().title()
    personal_list[i][key] = item

for i in sorted(personal_list):
  print(personal_list[i])

That doesn't do exactly what you specify (the capitalization of the B in DB9 seems totally weird for example -- how would the code know to capitalize that particular second letter and not any other second letter?!) but it seems pretty close.

Alex Martelli 2009-11-05 18:26:38
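To illustrate the capitalization point above: `str.title()` uppercases only the first letter of each run of letters, so it cannot produce `DB9`. A small sketch, written for Python 3, with sample values taken from the data in the answer:

```python
# Sample values taken from the data in the answer above.
samples = ["john", "db9", "m5", "lasagne"]

# title() capitalizes the first letter of each run of letters only.
titled = [s.title() for s in samples]
print(titled)

# upper() would give 'DB9', but it would also give 'JOHN', so it is
# not a drop-in fix for the capitalization the question asks for.
uppered = [s.upper() for s in samples]
print(uppered)
```

Getting `DB9` but `John` would need some extra rule (for example a lookup table of special cases), which the question does not specify.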

I'm slightly confused. How would you include reading in the file with this?

day_trader 2009-11-05 18:53:20

To read from a file, use `for line in f.readlines()`. I'm happy to see that the version I wrote is almost identical to Alex's, except that the first line in my for loop is `(k, _, d) = map(str.title, map(str.strip, line.partition(":")))`

Robert Rossney 2009-11-05 20:14:55

You don't need to use `for line in f.readlines()` to read a file; it is enough to just say `for line in f`. The `.readlines()` method will slurp in all the lines in the file and make a list in memory; the file object returned by `open()` will act as an iterator, yielding up one line at a time.

steveha 2009-11-05 20:24:14

Better: `for line in f:` -- absolutely no need to add `.readlines()` to that `f`.

Alex Martelli 2009-11-06 04:18:58
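Putting the comments above together, here is a rough sketch of the defaultdict approach reading straight from a file object with `for line in f`. It is written for Python 3. The function name `read_records` is made up for this example, and `io.StringIO` stands in for `open('filename.txt')` so the snippet runs without a file on disk.

```python
import collections
import io

def read_records(f):
    """Build one dict per column from lines like 'name: john, jane'."""
    records = collections.defaultdict(dict)
    for line in f:  # the file object yields one line at a time
        if not line.strip():
            continue  # skip blank lines
        key, _, info = line.partition(':')
        key = key.strip().title()
        for i, item in enumerate(info.split(',')):
            records[i][key] = item.strip().title()
    # Return the dicts in column order.
    return [records[i] for i in sorted(records)]

# Stand-in for: with open('filename.txt') as f: people = read_records(f)
sample = io.StringIO("name: john, jane\ncar: db9, m5\nfood: pizza, lasagne\n")
people = read_records(sample)
print(people)
```

Because the function only iterates over its argument, it accepts a real file, a `StringIO`, or a plain list of lines.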

Could you please explain what the first line of the first for loop does, particularly the underscore?

day_trader 2009-11-06 10:08:42

`_` is a traditional name for a placeholder variable -- one you don't care about, but have to have **some** name there (we're unpacking a tuple with 3 items so we need exactly 3 variables on the left of `=`). `partition` is a string method added in Python 2.5, see http://docs.python.org/library/stdtypes.html?highlight=partition#str.partition . Briefly, we're setting `key` to what's before the ':', `info` to what's after the ':' (and `_` to exactly ':', but we don't care;-).

Alex Martelli 2009-11-06 15:26:35

Answer 3

+1 A:

Split the initial data into index/key/value triples and go from there.

def parse_data(lines):
    for line in lines:
        key, _, data = line.partition(':')
        for i, datum in enumerate(x.strip() for x in data.split(',')):
            yield i, key, datum

From there you can aggregate the data using Alex's defaultdict approach (probably best), or sort the triples and write a bunch of extra code to build individual dictionaries on demand.

Corey Porter 2009-11-05 19:15:17
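One way the aggregation step might look, as a sketch for Python 3. `parse_data` is repeated from the answer so the snippet is self-contained; the names `by_index` and `records` are made up for this example.

```python
import collections

def parse_data(lines):
    # Same generator as in the answer above.
    for line in lines:
        key, _, data = line.partition(':')
        for i, datum in enumerate(x.strip() for x in data.split(',')):
            yield i, key, datum

lines = ["name: john, jane", "car: db9, m5", "food: pizza, lasagne"]

# Group the (index, key, value) triples by index: one dict per column.
by_index = collections.defaultdict(dict)
for i, key, datum in parse_data(lines):
    by_index[i][key] = datum

records = [by_index[i] for i in sorted(by_index)]
print(records)
```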

Answer 4

+1 A:

An homage to generators:

#!/usr/bin/env python
data=(zip(*([elt.strip().title() for elt in line.replace(':',',',1).split(',')]
            for line in open('filename.txt','r'))))
personal_list=[dict(zip(data[0],datum)) for datum in data[1:]]
print(personal_list)

# [{'Food': 'Pizza', 'Car': 'Db9', 'Name': 'John'}, {'Food': 'Lasagne', 'Car': 'M5', 'Name': 'Jane'}]

To understand how the script works, we break it apart:

First we load filename.txt into a list of lines:

In [41]: [line for line in open('filename.txt','r')]
Out[41]: ['name: john, jane\n', 'car: db9, m5\n', 'food: pizza, lasagne\n']

Next we replace the first colon (:) with a comma (,)

In [42]: [line.replace(':',',',1) for line in open('filename.txt','r')]
Out[42]: ['name, john, jane\n', 'car, db9, m5\n', 'food, pizza, lasagne\n']

Then we split each line on commas:

In [43]: [line.replace(':',',',1).split(',') for line in open('filename.txt','r')]
Out[43]: 
[['name', ' john', ' jane\n'],
 ['car', ' db9', ' m5\n'],
 ['food', ' pizza', ' lasagne\n']]

For each element in each line, we strip off beginning/ending whitespace and capitalize the string like a title:

In [45]: [[elt.strip().title() for elt in line.replace(':',',',1).split(',')] for line in open('filename.txt','r')]
Out[45]: [['Name', 'John', 'Jane'], ['Car', 'Db9', 'M5'], ['Food', 'Pizza', 'Lasagne']]

Now we collect the first element of each list, then the second, and so forth:

In [47]: data=(zip(*([elt.strip().title() for elt in line.replace(':',',',1).split(',')] for line in open('filename.txt','r'))))

In [48]: data
Out[48]: [('Name', 'Car', 'Food'), ('John', 'Db9', 'Pizza'), ('Jane', 'M5', 'Lasagne')]

data[0] now holds the keys for a dict.

In [49]: data[0]
Out[49]: ('Name', 'Car', 'Food')

Each tuple in data[1:] holds the values for one dict.

In [50]: data[1:]
Out[50]: [('John', 'Db9', 'Pizza'), ('Jane', 'M5', 'Lasagne')]

Here we zip up the keys with the values:

In [52]: [ zip(data[0],datum) for datum in data[1:]]
Out[52]: 
[[('Name', 'John'), ('Car', 'Db9'), ('Food', 'Pizza')],
 [('Name', 'Jane'), ('Car', 'M5'), ('Food', 'Lasagne')]]

Finally, we turn it into a list of dicts:

In [54]: [dict(zip(data[0],datum)) for datum in data[1:]]
Out[54]: 
[{'Car': 'Db9', 'Food': 'Pizza', 'Name': 'John'},
 {'Car': 'M5', 'Food': 'Lasagne', 'Name': 'Jane'}]

unutbu 2009-11-05 19:55:19
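A note for anyone running this on Python 3: there `zip` returns an iterator, so `data[0]` and `data[1:]` fail unless the result is wrapped in `list(...)`. A sketch of the same idea with that one change, using an in-memory list of lines in place of `open('filename.txt','r')` so it runs without the file:

```python
# Stand-in for the lines of filename.txt.
lines = ["name: john, jane\n", "car: db9, m5\n", "food: pizza, lasagne\n"]

# One cleaned-up list per line: the key first, then its values.
rows = [[elt.strip().title() for elt in line.replace(':', ',', 1).split(',')]
        for line in lines]

# Transpose rows into columns; list() is needed on Python 3 for indexing.
data = list(zip(*rows))

keys = data[0]
personal_list = [dict(zip(keys, datum)) for datum in data[1:]]
print(personal_list)
```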

+1, but I'd never stand for something like this in production code.

Robert Rossney 2009-11-05 20:18:02

Sorting Lists of List of Dictionaries