Assume I have a csv.DictReader object and I want to write it out as a CSV file. How can I do this?

I thought of the following:

dr = csv.DictReader(open(f), delimiter='\t')
# process my dr object
# ...
# write out object
output = csv.DictWriter(open(f2, 'w'), delimiter='\t')
for item in dr:
  output.writerow(item)

Is that the best way?

More importantly, how can I write out a header too, in this case the .fieldnames property of the "dr" object?

thanks.

+4  A: 

Instantiating DictWriter requires a fieldnames argument.
From the documentation:

The fieldnames parameter identifies the order in which values in the dictionary passed to the writerow() method are written to the csvfile.

Put another way: the fieldnames argument is required because Python dicts do not guarantee any key order (since Python 3.7 dicts preserve insertion order, but DictWriter still requires fieldnames).
Here is an example of how you'd write the header and data to a file:

dr = csv.DictReader(open(f), delimiter='\t')

# dr.fieldnames contains values from first row of `f`.
dw = csv.DictWriter(open(fou,'w'), delimiter='\t', fieldnames=dr.fieldnames)

# headers must be a dict (this is, after all, a DictWriter)
headers = {}
for n in dw.fieldnames:
    headers[n] = n
dw.writerow(headers)
for row in dr:
    dw.writerow(row)

As @FM mentions in a comment, you can condense header-writing to a one-liner, e.g.:

dw = csv.DictWriter(open(fou,'w'), delimiter='\t', fieldnames=dr.fieldnames)
dw.writerow(dict((fn,fn) for fn in dr.fieldnames))
for row in dr:
    dw.writerow(row)

Edit:
John Machin's answer provides a simpler method of writing the header row.
In Python 2.7 / 3.2 and later, DictWriter also has a writeheader() method.
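
For readers on 2.7 / 3.2 or later, here is a minimal sketch of the writeheader() route, written for Python 3. The in-memory io.StringIO buffers and the sample data are assumptions made for illustration; they stand in for the real files `f` and `fou`.

```python
import csv
import io

# Hypothetical tab-separated input standing in for the file `f`.
src = io.StringIO("foo\tbar\n1\t2\n3\t4\n")
dst = io.StringIO()  # stands in for the output file `fou`

dr = csv.DictReader(src, delimiter='\t')
dw = csv.DictWriter(dst, delimiter='\t', fieldnames=dr.fieldnames)

dw.writeheader()   # writes the fieldnames as the first row
dw.writerows(dr)   # writes every remaining row from the reader

result = dst.getvalue()
```

With real files in Python 3, the csv documentation recommends opening them with newline='' so that line endings are not translated twice.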

Adam Bernier
+1 Yet another way to write the header: `dw.writerow( dict((f,f) for f in dr.fieldnames) )`.
FM
Thanks, FM. I prefer the `dict()` syntax (as in your example) for its clarity.
Adam Bernier
@Adam: for a shorter one-liner, see my answer.
John Machin
@John: +1 to your answer; simply utilising "the underlying writer instance" is certainly preferable to "laborious identity-mapping".
Adam Bernier
+3  A: 

A few options:

(1) Laboriously make an identity-mapping (i.e. do-nothing) dict out of your fieldnames so that csv.DictWriter can convert it back to a list and pass it to a csv.writer instance.

(2) The documentation mentions "the underlying writer instance" ... so just use it (example at the end).

dw.writer.writerow(dw.fieldnames)

(3) Avoid the csv.DictWriter overhead and do it yourself with csv.writer.

Writing data:

w.writerow([d[k] for k in fieldnames])

or

w.writerow([d.get(k, restval) for k in fieldnames])
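
Put together, option (3) might look like the following sketch (Python 3). The sample `rows`, the `restval` value and the in-memory buffer are illustrative assumptions, not part of the original question.

```python
import csv
import io

fieldnames = ['foo', 'bar', 'zot']
restval = 'Huh?'                       # filler for keys missing from a dict
rows = [{'foo': 'oof'}, {'foo': 1, 'bar': 2, 'zot': 3}]

buf = io.StringIO()                    # stands in for a real output file
w = csv.writer(buf)

w.writerow(fieldnames)                 # the header is just a plain list
for d in rows:
    # Pull the values out in fieldnames order, filling gaps with restval.
    w.writerow([d.get(k, restval) for k in fieldnames])

output = buf.getvalue()
```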

Instead of the extrasaction "functionality", I'd prefer to code it myself; that way you can report ALL "extras" with the keys and values, not just the first extra key. A real nuisance with DictWriter is that if you've already verified the keys yourself as each dict was being built, you need to remember to use extrasaction='ignore'; otherwise it's going to SLOWLY (fieldnames is a list) repeat the check:

wrong_fields = [k for k in rowdict if k not in self.fieldnames]
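
A hand-rolled check along those lines could be sketched as follows. The helper name `find_extras` and the sample row are hypothetical; the only point is that a set built once makes each membership test cheap, and that every extra key is returned together with its value.

```python
fieldnames = ['foo', 'bar', 'zot']
known = frozenset(fieldnames)          # built once, not once per row

def find_extras(rowdict, known_fields=known):
    """Return every (key, value) pair whose key is not a known field."""
    return [(k, v) for k, v in rowdict.items() if k not in known_fields]

row = {'foo': 1, 'baz': 2, 'qux': 3}   # hypothetical row with two extras
extras = find_extras(row)
```

If `extras` is non-empty you can log or raise with the full list, then write the row with extrasaction='ignore' (or with a plain csv.writer) so the check is not repeated.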

============

>>> f = open('csvtest.csv', 'wb')
>>> import csv
>>> fns = 'foo bar zot'.split()
>>> dw = csv.DictWriter(f, fns, restval='Huh?')
# dw.writefieldnames(fns) -- no such animal
>>> dw.writerow(fns) # no such luck, it can't imagine what to do with a list
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python26\lib\csv.py", line 144, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "C:\python26\lib\csv.py", line 141, in _dict_to_list
    return [rowdict.get(key, self.restval) for key in self.fieldnames]
AttributeError: 'list' object has no attribute 'get'
>>> dir(dw)
['__doc__', '__init__', '__module__', '_dict_to_list', 'extrasaction', 'fieldnames', 'restval', 'writer', 'writerow', 'writerows']
# eureka
>>> dw.writer.writerow(dw.fieldnames)
>>> dw.writerow({'foo':'oof'})
>>> f.close()
>>> open('csvtest.csv', 'rb').read()
'foo,bar,zot\r\noof,Huh?,Huh?\r\n'
>>>
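
The same trick as a Python 3 sketch. It relies on the `writer` attribute seen in the dir(dw) listing above, which CPython's DictWriter still has; the in-memory buffer is an assumption standing in for csvtest.csv.

```python
import csv
import io

buf = io.StringIO()                    # stands in for csvtest.csv
fns = 'foo bar zot'.split()
dw = csv.DictWriter(buf, fns, restval='Huh?')

dw.writer.writerow(dw.fieldnames)      # header via the underlying csv.writer
dw.writerow({'foo': 'oof'})            # missing keys are filled with restval

contents = buf.getvalue()
```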
John Machin