ansaurus

Question

using Python to import a CSV (lookup table) and add GPS coordinates to another output CSV

Answer 1

+1 A:

Reading the python tutorial, it seems like {dictionary} is what I need, although I've read on here that tuples might be better. I don't know.

They're both fine choices for this task.

print row.keys() The output looks like:

{'LATITUDE': '-1.311467078',

No it doesn't! This is the output from print row, most definitely NOT print row.keys(). Please don't supply disinformation in your questions, it makes them really hard to answer effectively (being a newbie makes no difference: surely you can check that the output you provide actually comes from the code you also provide!).

I'm a newbie (and not a programmer). Question is how do I use the keys to pluck out the corresponding row data and match it against words in the body of the element in the other set?

Since you give us absolutely zero information on the structure of "the other set", you make it of course impossible to answer this question. Guessing wildly, if for example the entries in "the other set" are also dicts each with a key of KEYWORD, you want to build an auxiliary dict first, then merge (some of) its entries in the "other set":

import csv

# floc: the open lookup CSV file; otherset: the dicts of "the other set" (guessed)
l = csv.DictReader(floc)
dloc = dict((d['KEYWORD'], d) for d in l)
for d in otherset:
  d.update(dloc.get(d['KEYWORD'], ()))

This will leave the location missing from the other set when not present in a corresponding keyword entry in the CSV -- if that's a problem you may want to use a "fake location" dictionary as the default for missing entries instead of that () in the last statement I've shown. But, this is all wild speculation anyway, due to the dearth of info in your Q.
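As a rough illustration of the "fake location" default mentioned above, here is a minimal sketch. The sample CSV contents, the LONGITUDE column, the FAKE_LOCATION placeholder, and the merge_locations helper are all assumptions made for the example, not part of the original question.

```python
import csv
import io

# Sample lookup table, invented for illustration.
LOOKUP_CSV = "KEYWORD,LATITUDE,LONGITUDE\nkibera,-1.31,36.78\n"

# Placeholder merged in when a keyword has no row in the lookup table (assumed).
FAKE_LOCATION = {"LATITUDE": "", "LONGITUDE": ""}


def merge_locations(lookup_file, otherset):
    """Copy the lookup columns into each dict in otherset, matched on KEYWORD."""
    dloc = dict((d["KEYWORD"], d) for d in csv.DictReader(lookup_file))
    for d in otherset:
        # Fall back to the placeholder so every entry ends up with the same keys.
        d.update(dloc.get(d["KEYWORD"], FAKE_LOCATION))
    return otherset


otherset = [
    {"KEYWORD": "kibera", "TITLE": "a"},
    {"KEYWORD": "unknown", "TITLE": "b"},
]
merged = merge_locations(io.StringIO(LOOKUP_CSV), otherset)
```

With a default like this, every output row has LATITUDE and LONGITUDE keys, which tends to make writing the result back out to CSV simpler.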

Alex Martelli 2010-08-06 14:56:32

My bad. The command I entered was "for row in l: print row" without the .keys(). I'm sorry. I'm debugging and fiddled with various types of output to understand how the data is getting stored.

2010-08-06 15:22:40

Answer 2

A:

If you dump the DictReader into a list (data = [row for row in csv.DictReader(file)]), and you have unique keywords for each row, convert that list of dictionaries into a dictionary of dictionaries, using that keyword as the key.

>>> data = [row for row in csv.DictReader(open('C:\\my.csv'),
...                                       ('num','time','time2'))]
>>> len(data)  # lots of old data :P
1410
>>> data[1].keys()
['time2', 'num', 'time']
>>> keyeddata = {}
>>> for row in data[2:]:  # I have some junk rows
...     keyeddata[row['num']] = row
...
>>> keyeddata['32']
{'num': '32', 'time2': '8', 'time': '13269'}

Once you have the keyword pulled out, you can iterate through your other list, grab the keyword from it, and use it as the index for the lat/long list. Pull out the lat/long from that index and add it to the other list.
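The steps just described could look something like the sketch below. The sample rows, the column names KEYWORD, LATITUDE and LONGITUDE, and the shape of the "other list" are assumptions for illustration only.

```python
# Sample rows standing in for what csv.DictReader would yield (invented values).
data = [
    {"KEYWORD": "kibera", "LATITUDE": "-1.31", "LONGITUDE": "36.78"},
    {"KEYWORD": "mathare", "LATITUDE": "-1.26", "LONGITUDE": "36.86"},
]

# Dictionary of dictionaries, keyed on the unique keyword.
keyeddata = {}
for row in data:
    keyeddata[row["KEYWORD"]] = row

# A second subscript picks one field out of the inner dictionary.
x = keyeddata["kibera"]["LATITUDE"]

# Hypothetical "other list": (keyword, title) pairs.
other = [("kibera", "story one"), ("nowhere", "story two")]

output = []
for keyword, title in other:
    loc = keyeddata.get(keyword)  # None when the keyword is not in the table
    if loc is None:
        output.append((keyword, title, "", ""))
    else:
        output.append((keyword, title, loc["LATITUDE"], loc["LONGITUDE"]))
```

Using .get() rather than a plain subscript avoids a KeyError for keywords that are missing from the lookup table.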

Nick T 2010-08-06 14:58:10

Thanks! I will test this out later today - unfortunately stuck in meetings for a while first.

2010-08-06 15:23:09

Thanks - I have your code working, but I still don't know the syntax for referring to the 'location' part of the dictionary that matches the keyword 'kibera', for example. In your example, keyeddata['32'] returns the {dict} for 'num' = '32'. How would you assign x = the 'time' corresponding to ('num' = '32')?

2010-08-06 19:10:53

Answer 3

A:

Thanks -

Alex: My code for the other set is working, and the only relevant part is that I have a string that may or may not contain the 'keyword' that is in this dictionary.

Structurally, this is how I organized it:

def main():
    # Raw strings keep the backslashes in Windows paths literal
    f = open(r'c:\python\ggce.sms', 'r')
    sensetree = etree.parse(f)
    senses = sensetree.getiterator('SenseMakingItem')
    bodies = sensetree.getiterator('Body')
    stories = []
    for body in bodies:
        fix_body(body)
        storybyte = unicode(body.text)
        storybit = storybyte.encode('ascii', 'ignore')
        stories.append(storybit)
    # ids, titles and locations are built by code omitted from this post
    rows = [ids, titles, locations, stories]
    out = map(None, *rows)
    print out[120:121]
    write_data(out, r'c:\python\output_test.csv')

(I omitted the code for getting ids, titles, locations because they work and will not be used to get the real locations from the data within stories)
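One way the matching of story text against the lookup keywords might be sketched is shown below. The find_location helper, the sample data, and the whole-word, case-insensitive matching rule are all assumptions; the real data may call for something looser or stricter.

```python
import re

# Invented sample lookup, keyed on keyword as in the answers above.
keyeddata = {
    "kibera": {"KEYWORD": "kibera", "LATITUDE": "-1.31", "LONGITUDE": "36.78"},
}


def find_location(story, keyeddata):
    """Return the lookup row for a keyword that appears in story, else None."""
    # Split into lowercase words so punctuation and case do not block a match.
    words = set(re.findall(r"\w+", story.lower()))
    for keyword, row in keyeddata.items():
        if keyword.lower() in words:
            return row
    return None


stories = ["Flooding reported in Kibera today.", "No place is named here."]
locations = [find_location(story, keyeddata) for story in stories]
```

Stories with no matching keyword get None, so the code that writes the output rows can decide what to put in the latitude and longitude columns for them.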

Hope this helps.

2010-08-06 15:33:44

the code didn't seem to format correctly

2010-08-06 15:34:02

Just indent it by four spaces (or mark it and press Ctrl-K).

Tim Pietzcker 2010-08-06 15:36:35
