Hi,

I have a CSV file similar to the following:

title  title2  h1  h2  h3 ... 
l1.1     l1     1   1   0  
l1.2     l1     0   1   0
l1.3     l1     1   0   1
l2.1     l2     0   0   1
l2.2     l2     1   0   1
l3.1     l3     0   1   1
l3.2     l3     1   1   0
l3.3     l3     1   1   0
l3.4     l3     1   1   0    

I want to add up the columns in the following manner:
h1 (l1.1 + l1.2 + l1.3) = 2
h1 (l2.1 + l2.2) = 1
h1 (l3.1 + l3.2 + l3.3 + l3.4) = 3
and so on for every column. I want the final counts for every group as a summarised table:

title2  h1  h2  h3...
l1     2   2   1
l2     1   0   2
l3     3   4   1

How do I implement this?

A: 

Have a look at the csv module. Open the file with a csv.reader, then iterate over it one row at a time, accumulating the results of the additions in a temporary list. When you are done, write that list out with a csv.writer.

You might need to define a dialect as you are not really using CSV but some tab-delimited format.
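
For tab-separated input, the csv module already ships an "excel-tab" dialect, so defining a custom one may not be necessary. A minimal sketch, assuming a tab-delimited version of the first rows of the sample data (the `sample` string is made up here for illustration and stands in for the real file):

```python
import csv
import io

# Hypothetical tab-delimited sample standing in for the real file.
sample = (
    "title\ttitle2\th1\th2\th3\n"
    "l1.1\tl1\t1\t1\t0\n"
    "l1.2\tl1\t0\t1\t0\n"
)

# "excel-tab" is a dialect built into the csv module; it splits on tabs.
reader = csv.reader(io.StringIO(sample), dialect="excel-tab")
rows = list(reader)

header, data = rows[0], rows[1:]
print(header)
print(data)
```

If the file turns out to be comma-separated after all, the default "excel" dialect should handle it without any extra arguments.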

Ranieri
Even if I read one row at a time, how do I accumulate the results for l1 alone, l2 alone, and so on? Also, the file really is comma-separated; I only laid out the table like this for readability.
newbie
I can, but I won't. Accumulation is basic programming practice.
Ranieri
I've edited the table above and have now managed to get the data into that format. Will this help?
newbie
Can you at least give me more instructions on how to do this? I really can't get the hang of it.
newbie
I don't mean to read it row-wise; I want to read it column-wise. Can this be done by comparing the data in the title2 column?
newbie
+1  A: 

Something like this should work. It takes an input in the form

title,title2,h1,h2,h3
l1.1,l1,1,1,0
l1.2,l1,0,1,0
l1.3,l1,1,0,1
l2.1,l2,0,0,1
l2.2,l2,1,0,1
l3.1,l3,0,1,1
l3.2,l3,1,1,0
l3.3,l3,1,1,0
l3.4,l3,1,1,0

and outputs

title2,h1,h2,h3
l1,2,2,1
l2,1,0,2
l3,3,4,1

Tested with Python 3.1.2. In Python 2.x you'll need to change the open() calls to use binary mode and drop the newline="" argument. You can also drop the call to list(), since in Python 2.x map() already returns a list.

import csv
import operator

reader = csv.reader(open("test.csv", newline=""), dialect="excel")
result = {}  # maps each title2 value to its running column sums

for pos, entry in enumerate(reader):
    if pos == 0:
        headers = entry  # first row holds the column names
    else:
        if entry[1] in result:  # group seen before: add element-wise
            result[entry[1]] = list(map(operator.add, result[entry[1]], [int(i) for i in entry[2:]]))
        else:  # first row of this group: start its sums
            result[entry[1]] = [int(i) for i in entry[2:]]

writer = csv.writer(open("output.txt", "w", newline=""), dialect="excel")
writer.writerow(headers[1:])  # drop the per-row "title" column

keys = sorted(result.keys())
for key in keys:
    output = [key]
    output.extend(result[key])
    writer.writerow(output)
Tim Pietzcker
I'm using Python 2.7. I've changed the open() part to "open('test.csv','rb')", but I am getting a syntax error on line 12, "result[entry[1]] = list(map(operator.add, result[entry[1]], [int(i) for i in entry[2:]]))". Is there any modification I need to make?
newbie
Ah yes, remove the `list()`; that's only needed in Python 3, since `map` returns an iterator instead of a list there.
Tim Pietzcker
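
One way to sidestep the `map()`/`list()` difference between Python 2 and 3 altogether is to do the element-wise addition with `zip()` inside a list comprehension, which produces a list on both versions. A minimal sketch of the same per-group accumulation follows; the function name `sum_by_group`, its parameters, and the in-memory `rows` list are illustrative assumptions, not part of the answer above:

```python
def sum_by_group(rows, key_col=1, first_value_col=2):
    """Sum the numeric columns of all rows that share the same key column."""
    totals = {}
    for row in rows:
        key = row[key_col]
        values = [int(v) for v in row[first_value_col:]]
        if key in totals:
            # Element-wise addition; a list comprehension yields a list
            # on both Python 2 and Python 3, so no list() call is needed.
            totals[key] = [a + b for a, b in zip(totals[key], values)]
        else:
            totals[key] = values
    return totals


# Data rows from the question, with the header row already removed.
rows = [
    ["l1.1", "l1", "1", "1", "0"],
    ["l1.2", "l1", "0", "1", "0"],
    ["l1.3", "l1", "1", "0", "1"],
    ["l2.1", "l2", "0", "0", "1"],
    ["l2.2", "l2", "1", "0", "1"],
    ["l3.1", "l3", "0", "1", "1"],
    ["l3.2", "l3", "1", "1", "0"],
    ["l3.3", "l3", "1", "1", "0"],
    ["l3.4", "l3", "1", "1", "0"],
]

totals = sum_by_group(rows)
for key in sorted(totals):
    print([key] + totals[key])
```

In practice the `rows` argument could be the remaining rows of a csv.reader once the header row has been read off, and each printed list could be passed to `writer.writerow()` instead.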