ansaurus

Question

How to parse a single line csv string without the csv.reader iterator in python?

Answer 1

+2 A:

You can loop over the lines of a file like this:

with open('path/to/my/file.csv', 'r') as f:
    for line in f:
        print(line)  # here you can convert the encoding and save the line

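But if you want to convert the encoding of a whole file, you can also call iconv. Write to a different output file (the name below is only an example), because redirecting onto the input file would truncate it before iconv reads it:

$ iconv -f WINDOWS-1250 -t UTF-8 file.csv > file.utf8.csv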

Edit: So where is the problem?

with open('path/to/my/file.csv', 'r') as f:
    for line in f:
        line = line.decode('windows-1250').encode('utf-8')
        elements = line.split(",")

Dejw 2010-02-25 14:05:11

I do not want to read/write the file twice. The iconv solution is lame; I want it done in code, not by some external tool. I need to create a tool that will prepare the files in one process, not a set of instructions for doing that.

WooYek 2010-02-25 14:31:17

Again, this gives no CSV parsing at the same time; plain line splitting just won't cut it.

WooYek 2010-02-25 14:39:53
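
The point about line splitting can be illustrated with a quoted field that contains the delimiter. The following is a minimal sketch in Python 3 (an assumption; the snippets in this thread are Python 2), and the helper name parse_csv_line is made up for illustration. It relies on the fact that csv.reader accepts any iterable of strings, so a one-element list is enough to parse a single line:

```python
import csv

def parse_csv_line(line, delimiter=",", quotechar='"'):
    """Parse one CSV-formatted string into a list of fields (hypothetical helper)."""
    # csv.reader takes any iterable of strings; wrap the single line in a list
    # and pull the first (and only) parsed row out with next().
    return next(csv.reader([line], delimiter=delimiter, quotechar=quotechar))

line = 'a,"b,c",d'

# Naive splitting cuts the quoted field in two and keeps the quote characters.
naive = line.split(",")

# The csv module keeps "b,c" together as one field.
parsed = parse_csv_line(line)
```

Here naive has four elements while parsed has three, which is why split(",") is not a substitute for a real CSV parser.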

Answer 2

+1 A:

Thanks for the answers. The wrapping one gave me an idea:

import csv

def reencode(file):
    # Re-encode each line from Windows-1250 to UTF-8 before csv.reader sees it.
    for line in file:
        yield line.decode('windows-1250').encode('utf-8')

# Python 2: the csv module expects files opened in binary mode.
csv_writer = csv.writer(open(outfilepath, 'wb'), delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
csv_reader = csv.reader(reencode(open(filepath, 'rb')), delimiter=';', quotechar='"')
for c in csv_reader:
    l = c  # rearrange columns here
    csv_writer.writerow(l)

That's exactly what I was going for: re-encoding each line just before it gets parsed by the csv_reader.
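
For readers on Python 3, the same one-pass idea might look roughly like the sketch below. It assumes that open() is allowed to do the decoding and encoding through its encoding argument, so no separate re-encoding generator is needed. The function name convert, the file names, and the sample data are made up for illustration:

```python
import csv
import os
import tempfile

def convert(src_path, dst_path, src_encoding="windows-1250", dst_encoding="utf-8"):
    """Read a ';'-delimited file in one encoding, write a ','-delimited file in another."""
    # newline="" leaves line-ending handling to the csv module.
    with open(src_path, newline="", encoding=src_encoding) as src, \
         open(dst_path, "w", newline="", encoding=dst_encoding) as dst:
        reader = csv.reader(src, delimiter=";", quotechar='"')
        writer = csv.writer(dst, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL)
        for row in reader:
            writer.writerow(row)  # rearrange columns here if needed

# Small demonstration with temporary files (sample data is invented).
tmpdir = tempfile.mkdtemp()
src_path = os.path.join(tmpdir, "in.csv")
dst_path = os.path.join(tmpdir, "out.csv")
with open(src_path, "wb") as f:
    # "\u0142\u00f3d\u017a" is a short Polish word with Windows-1250 characters.
    f.write('\u0142\u00f3d\u017a;"a;b"\r\n'.encode("windows-1250"))

convert(src_path, dst_path)

with open(dst_path, "rb") as f:
    result = f.read().decode("utf-8")
```

The file is still read and written only once, and each row is parsed by the csv module rather than by splitting on the delimiter.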

WooYek 2010-02-25 14:38:00

Answer 3

+1 A:

At the very bottom of the Python 2 csv documentation is a pair of example classes (UnicodeReader and UnicodeWriter) that implement Unicode support for csv:

# UnicodeReader and UnicodeWriter must be copied from the csv documentation examples.
rfile = open('input.csv', 'rb')
wfile = open('output.csv', 'wb')
csv_reader = UnicodeReader(rfile, encoding='windows-1250')
csv_writer = UnicodeWriter(wfile, encoding='utf-8')
for c in csv_reader:
    # process Unicode rows
    csv_writer.writerow(c)
rfile.close()
wfile.close()

Mark Tolonen 2010-02-25 19:12:07
