I have a CSV file that is actually a matrix of 0's and 1's.

I need to exclude the columns that contain 0's, keep only the columns that contain 1's, and copy those to another CSV file, using Python.

I tried this:

    reader = csv.DictReader(open("test1.csv", "r"), [])

    for data in reader:
        if data == 1:
            print data

Please help. I badly need this, as I have a deadline today.

A: 

If you need to exclude all columns that have any zeroes in them, then first you need to read the whole file into memory -- because only after having seen every row will you know which columns have any zeroes! This is a logical need -- whatever language you use, the need will remain, because it's intrinsic to the problem.

So, for example:

    allrows = list(reader)

Now, allrows is a list of dictionaries, whose values are strings, presumably '0' or '1'. Now, you could do:

    keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]

...not the fastest approach, but hopefully very, very simple to understand!
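
To make that one-liner concrete, here is a minimal sketch on hypothetical in-memory data (the column names `a`, `b`, `c` and the sample rows are assumptions for illustration, not taken from the question). It also spells out the same logic as an explicit loop, which may be easier to follow if comprehensions are unfamiliar.

```python
# Hypothetical sample data: each row is a dict of strings,
# shaped like the rows csv.DictReader would yield.
allrows = [
    {"a": "1", "b": "0", "c": "1"},
    {"a": "1", "b": "1", "c": "1"},
    {"a": "1", "b": "1", "c": "0"},
]

# Keep a column only if no row holds '0' in it.
keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]

# The same logic written as an explicit loop.
keepcols_loop = []
for c in allrows[0]:
    has_zero = False
    for r in allrows:
        if r[c] == '0':
            has_zero = True
            break
    if not has_zero:
        keepcols_loop.append(c)

print(keepcols)  # ['a']
```

Note that `keepcols` ends up as a list of column names, not of cell values: only column `a` survives here, because `b` and `c` each contain a zero somewhere.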

Once you know which columns you want to keep, prepare a DictWriter instance w with those columns as its fieldnames (the headers) and the extrasaction='ignore' argument (so it will ignore "extra" keys in the dicts passed to it), and finally call:

    w.writerows(allrows)
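
Putting the steps together, an end-to-end version might look like the sketch below. It is written for Python 3 and assumes the input has a header row naming the columns; the function name `drop_columns_with_zeros` and the sample data are hypothetical. `writeheader()` is called explicitly because `writerows()` only writes the data rows.

```python
import csv
import io


def drop_columns_with_zeros(infile, outfile):
    """Copy infile to outfile, keeping only the columns that contain no '0'."""
    reader = csv.DictReader(infile)   # the first row is taken as the header
    allrows = list(reader)            # every row must be seen before deciding
    if not allrows:
        return []
    keepcols = [c for c in reader.fieldnames
                if all(r[c] != '0' for r in allrows)]
    # extrasaction='ignore' lets us pass the full row dicts; keys that are
    # not in fieldnames are silently skipped instead of raising ValueError.
    w = csv.DictWriter(outfile, fieldnames=keepcols, extrasaction='ignore')
    w.writeheader()
    w.writerows(allrows)
    return keepcols


# Hypothetical in-memory input standing in for test1.csv.
sample = io.StringIO("a,b,c\n1,0,1\n1,1,1\n1,1,0\n")
result = io.StringIO()
kept = drop_columns_with_zeros(sample, result)
print(kept)
print(result.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by `open("test1.csv", newline="")` and `open("output1.csv", "w", newline="")`. If the file has no header row, `DictReader` will treat the first data row as column names, so a plain `csv.reader` with column indices would be the safer choice in that case.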

If you mean something different than "exclude all columns which have any zeroes in them", then please clarify exactly what you do mean by "i need exclude those columns that have 0's" because I can't interpret it differently.

Alex Martelli
You have interpreted it correctly: I mean to "exclude all columns which have any zeroes in them". But I did not understand the extrasaction and headers part. Please elaborate. Thank you so much :)
Anand
Not much to elaborate -- just read http://docs.python.org/library/csv.html?highlight=dictreader#csv.DictReader and use `fieldnames` for what I called `headers`.
Alex Martelli
Please read the code I've written based on what you gave me, and help. I'm really stuck.
Anand
The syntax "c for c in allrows[0]" seems to be wrong.
Anand
@Anand, may *seem* wrong to you, but after assigning `allrows` a list of dicts I just copied and pasted it to my Python 2.6 interpreter and it doesn't _seem_ wrong to **it**... and I suspect a Python interpreter is more likely to be correct than you, when it comes to right or wrong syntax.
Alex Martelli
A: 

I tried the following based on what you sent me, Alex, but I'm not sure I got it right.

Please help me:

    keepcols = []

    for c in allrows[0]:
        for r in allrows:
            if r[c] == '1':
                keepcols = r[c]

    print keepcols
    writer = csv.DictWriter(open("output1.csv", "w"), [], extrasaction='ignore')
    writer.writerows(keepcols)

I don't even think I did it right. I'm very new to programming.

Anand
No, those four hugely nested lines (and the silly `keepcols = []` at the start) are **TOTALLY** different from what I wrote, which is `keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]` -- a single line. Use the code I just copied AGAIN, instead of those weird six lines you use; use `keepcols` instead of that absurd `[]`; maybe, also, learn **SOME** very, **VERY** elementary Python, if you're supposed to program in it -- the absurd way you've distorted my code, and that crazy `[]` where I said to use `keepcols`, suggest the latter course of action is sorely necessary.
Alex Martelli
File "prg1.py", line 24 `keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]` ^ SyntaxError: invalid syntax -- this is what I get when I type in your code.
Anand
`keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]` SyntaxError: invalid syntax -- this is what I get when I execute the code.
Anand