ansaurus

Question

Answer 1

A:

I'd use a regex to chop off everything after the first ] (and hang on to it), then another regex to explode the remaining string into an array. After that, merge the arrays from the different files however you need to; piecing it all back together shouldn't be too hard. I'll leave the regexes as an exercise for the reader :-)
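
For readers who would rather skip the exercise, here is a minimal sketch of the splitting step. It assumes every line looks like `[a,b,c] 1,2,3`, which is the format implied by the other answers; the names `LINE_RE` and `split_line` are made up for illustration.

```python
import re

# Assumed line format: "[a,b,c] 1,2,3".
# Group 1 is the bracketed part, group 2 is everything after the first "]".
LINE_RE = re.compile(r"^\[([^\]]*)\]\s*(.*)$")

def split_line(line):
    """Return (list of bracketed items, text after the first ']')."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unexpected line format: %r" % line)
    # Second regex: split the bracketed part on commas, ignoring stray spaces.
    items = re.split(r"\s*,\s*", match.group(1))
    return items, match.group(2)
```

Merging and reassembling are then plain list and string operations on the two return values.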

fredley 2010-10-28 19:56:39

Answer 2

A:

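# f1 and f2 are the two open input files, read in lockstep.
for l, m in zip(f1, f2):
    # Drop the newline and the leading "[", then split once at the first "]".
    l_head, l_tail = l.strip().lstrip("[").split("]", 1)
    m_head, m_tail = m.strip().lstrip("[").split("]", 1)

    l_head = l_head.split(",")
    m_head = m_head.split(",")
    assert len(l_head) == len(m_head)

    # The tail still starts with the space that followed "]".
    l_tail = l_tail.strip().split(",")
    m_tail = m_tail.strip().split(",")
    assert len(l_tail) == len(m_tail)

    ...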

I haven't given your variables good names because I don't know what they represent. You should rename them to something more descriptive.

I also haven't written the code for reassembling the lines. It shouldn't be too hard...
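
One way the reassembly might look is sketched below. It assumes the merged line should pair the bracketed items with `^` and keep the tail unchanged, as in `[a^a2,b^b2] 13,4,6`. That output format is borrowed from the other answer on this page, the check that both tails are identical is an extra assumption, and `merge_lines` is a hypothetical helper name.

```python
def merge_lines(l, m, sep="^"):
    """Merge two lines of the assumed form "[a,b,c] 1,2,3" into one."""
    l_head, l_tail = l.strip().lstrip("[").split("]", 1)
    m_head, m_tail = m.strip().lstrip("[").split("]", 1)

    l_items = l_head.split(",")
    m_items = m_head.split(",")
    if len(l_items) != len(m_items):
        raise ValueError("lines have different numbers of items")
    # Assumption: two lines are only merged when their tails are identical.
    if l_tail.strip() != m_tail.strip():
        raise ValueError("lines have different tails")

    # Pair the items position by position and rebuild the line.
    paired = ",".join(a + sep + b for a, b in zip(l_items, m_items))
    return "[{0}] {1}".format(paired, l_tail.strip())
```

Inside the loop above, this would be called as `merge_lines(l, m)` for each pair of lines.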

katrielalex 2010-10-28 20:00:50

Answer 3

+3 A:

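def read_file(fp, hash):
    # hash maps the text after "]" (the key) to the bracketed values.
    # Assumes every key appears exactly once in each of the two files.
    for l in fp:
        p = l[1:].find(']')           # position of "]" within l[1:]
        k = l[p+3:].rstrip("\r\n")    # key: text after "] ", minus the newline
        v = l[1:p+1].split(",")       # values between "[" and "]"
        if k not in hash:
            hash[k] = v
        else:
            hash[k] = zip(hash[k], v)

hash = {}

for fname in ('f1.txt', 'f2.txt'):
    with open(fname) as fp:
        read_file(fp, hash)

for k, v in hash.items():
    print("[{0}] {1}".format(",".join("^".join(vv) for vv in v), k))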

This is a basic way to do it. If you need the output lines in the order they were read from the files, you'll have to do a bit more work.

Here's the output I get:

[a^a2,b^b2,c^c2,d^d,e^e,f^f] 13,4,6
[dog^banana,cat^cat2,monkey^monkey2] 1,2,3
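
On the point about preserving read order: one option is to swap the plain dict for `collections.OrderedDict`, which remembers the order in which keys were first inserted. The sketch below is a variant of `read_file`, not the code that produced the output above. `read_lines` is a made-up name, and the sample lines are invented and held in memory so the block runs on its own.

```python
from collections import OrderedDict

def read_lines(lines, merged):
    """Same logic as read_file above, for any iterable of lines."""
    for l in lines:
        p = l[1:].find(']')
        k = l[p+3:].strip()
        v = l[1:p+1].split(",")
        if k not in merged:
            merged[k] = v
        else:
            merged[k] = list(zip(merged[k], v))

merged = OrderedDict()  # keys come back in first-seen order
read_lines(["[dog,cat,monkey] 1,2,3\n", "[a,b,c] 13,4,6\n"], merged)
read_lines(["[a2,b2,c2] 13,4,6\n", "[banana,cat2,monkey2] 1,2,3\n"], merged)

output = ["[{0}] {1}".format(",".join("^".join(vv) for vv in v), k)
          for k, v in merged.items()]
```

Here `output` lists the `1,2,3` line before the `13,4,6` line, matching the order in which the keys first appeared.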

Edit:

This also assumes that each key (e.g. 13,4,6) appears only once per file. If a key can appear multiple times, you'll have to change hash[k] = zip(hash[k], v) to something more elaborate, such as:

if k not in hash:
    hash[k] = [[vv] for vv in v]
else:
    for i,vv in enumerate(v):
        hash[k][i].append(vv)
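
Put together, the per-position lists let the merge handle any number of sources as well as repeated keys. The following is a self-contained sketch under a few assumptions: the inputs are in-memory lists standing in for `f1.txt` and `f2.txt`, the sample lines are invented, and `merge_sources` and `format_line` are hypothetical names.

```python
def merge_sources(sources):
    """Merge lines of the form "[a,b,c] key" from several line iterables."""
    merged = {}
    for lines in sources:
        for line in lines:
            head, _, key = line.strip().partition("]")
            key = key.strip()
            values = head.lstrip("[").split(",")
            if key not in merged:
                # One list per position, so later values can be appended.
                merged[key] = [[v] for v in values]
            else:
                for i, v in enumerate(values):
                    merged[key][i].append(v)
    return merged

def format_line(key, columns):
    """Join each position's values with "^" and rebuild the line."""
    return "[{0}] {1}".format(",".join("^".join(c) for c in columns), key)

file1 = ["[a,b,c] 13,4,6\n", "[dog,cat] 1,2,3\n"]
file2 = ["[a2,b2,c2] 13,4,6\n", "[a3,b3,c3] 13,4,6\n"]  # key repeats in one file
merged = merge_sources([file1, file2])
```

Because every position holds a list, a key that appears in only one file is written back unchanged instead of being split into characters by the `^` join.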

GWW 2010-10-28 20:16:08

This is how I looked at it too. Alternatively, I wonder if there's merit to skipping the split(",") and just storing the value as the raw string from the file, e.g. hash[k] = hash[k] + "," + v

gbc 2010-10-28 20:22:56

If you skip splitting the value, it's messier to merge it with other values later on. However, the key doesn't have to be split.
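
A small illustration of the difference, using made-up values: concatenating raw strings appends the second file's items to the end, so the positional pairing needed for the "^" join is lost unless the string is split again later.

```python
first = "a,b,c"      # bracketed part from one file (invented sample)
second = "a2,b2,c2"  # bracketed part for the same key from another file

# Storing raw strings: items are appended, not paired by position.
raw = first + "," + second

# Storing split lists: items can be paired by position right away.
paired = ",".join(x + "^" + y
                  for x, y in zip(first.split(","), second.split(",")))
```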

GWW 2010-10-28 20:24:04

Indeed, I see what you mean. I skimmed over the important bit of joining with "^"!

gbc 2010-10-28 20:29:50


quick data processing with python?
