I have a CSV file of about 120 columns by 4500 rows. I read the "customer name" field in the first column of the first row, then look for that name in a second CSV file containing "customer name" and "customer ID". I write a new CSV file with "customer name", "customer ID", and the remaining 119 columns, and continue until the end of the first file.

This is working, but I have special characters everywhere in the first two CSV files, and I don't want to get 'Montr\xe9al-Nord' instead of Montréal-Nord, or 'Val\xe9rie Lamarche' instead of 'Valérie Lamarche', in the resulting CSV file.

Here is a test code example:

# -*- coding: utf-8 -*-


import  types
import  wx
import sys
import os, os.path
import win32file
import shutil
import string
import  wx.lib.dialogs
import re
import EmailAttache
import StringIO,csv
import time
import csv

outputfile=open(os.path.join(u"c:\\transales","Resultat-second_contact_act.csv"), "wb")

resultat = csv.writer (outputfile )

def Writefile ( info1, info2 ):
    print info1, info2
    resultat.writerow( [ `info1`,`info2` ,`line[1]`,`line[2]`,`line[3]`,`line[4]`,`line[5]`,`line[6]`,`line[7]`,`line[8]`,`line[9]`,`line[10]`,`line[11]`,`line[12]`,`line[13]`,`line[14]`,`line[15]`,`line[16]`,`line[17]` ] )


data = open(os.path.join(u"c:\\transales","SECONDARY_CONTACTS.CSV"),"rb")
data2 = open(os.path.join(u"c:\\transales","AccountID+ContactID.csv"),"rb")

source1 = csv.reader(data)
source2 = csv.reader(data2)



for line in source1:
    name= line[0]
    data2.seek(0)
    for line2 in source2:
        if line[0] == line2[0]:    
            Writefile(line[0],line2[1])
            break

outputfile.close()

Any help?

Regards, francois

A: 

Although I am not familiar with csv.reader or writer, I have been dealing with utf-8 file reading recently and perhaps using the codecs module might help you out.

Instead of,

data = open(..., "rb")

try,

import codecs

and then for all your utf-8 files, use,

data = codecs.open(..., "rb", "utf-8")

This decodes each file from utf-8 into unicode as it is read, and opening the output file the same way might write it out correctly.
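
One caveat worth checking before relying on this: the codec name has to match how the file was actually saved. The 'Montr\xe9al' shown in the question stores é as the single byte 0xE9, which looks like Latin-1 or cp1252 rather than utf-8 (where é is two bytes). The sketch below is only an illustration under that assumption; the sample bytes and the temporary file are made up, not the OP's data.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Hypothetical sample: "é" stored as the single byte 0xE9, as in the question.
raw = b"Montr\xe9al-Nord,Val\xe9rie Lamarche\n"

# Write the raw bytes to a throwaway file (stand-in for the real CSV).
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "wb") as f:
    f.write(raw)

# Decoding with a matching single-byte codec yields real text.
with codecs.open(path, "r", "latin-1") as f:
    text = f.read()

# Decoding the same bytes as utf-8 is rejected, because 0xE9 followed
# by "a" is not a valid utf-8 sequence.
try:
    with codecs.open(path, "r", "utf-8") as f:
        f.read()
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(text))
print(utf8_ok)
```

If the utf-8 attempt fails like this on the real files, a different codec name (for example "latin-1" or "cp1252") is probably the one to pass to codecs.open.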

emish
thanks, you solved my issue with "foreign characters"
Uku Loskit
@sheepz: Glad it helped you -- it's certainly not a solution to the OP's problem !-)
John Machin
A: 

The problem is in this line:

resultat.writerow( [ `info1`,`info2` ,`line[1]`,`line[2]`,`line[3]`,`line[4]`,`line[5]`,`line[6]`,`line[7]`,`line[8]`,`line[9]`,`line[10]`,`line[11]`,`line[12]`,`line[13]`,`line[14]`,`line[15]`,`line[16]`,`line[17]` ] )

Wrapping an expression in "back-ticks" aka "grave accents" is an old-fashioned and deprecated way of saying repr(expression).

Please consider the following:

>>> s = "Montréal"
>>> print s
Montréal
>>> print repr(s)
'Montr\xe9al'
>>> ord(s[5])
233
>>> hex(233)
'0xe9'
>>> s == "Montr\xe9al"
True
>>> `s` == repr(s)
True

The offending (in 3 ways) line should be simply replaced by

resultat.writerow([info1, info2] + [line[1:18]]) # WRONG (sorry!)
resultat.writerow([info1, info2] + line[1:18]) # RIGHT
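
To see the three variants side by side, here is a small sketch. It is written for Python 3 (an assumption; the code in the question is Python 2), so ascii() stands in for the Python 2 back-ticks/repr(), because Python 3's repr() leaves é alone. The sample row, the "ID42" value and the to_csv helper are all made up for illustration.

```python
import csv
import io

# Hypothetical row from the first file, plus an ID found in the second file.
line = ["Valérie Lamarche", "Montréal-Nord", "514"]
info1, info2 = line[0], "ID42"

def to_csv(row):
    """Return the CSV text that csv.writer produces for one row."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return buf.getvalue()

# RIGHT: list concatenation keeps every value in its own column.
right = to_csv([info1, info2] + line[1:3])

# WRONG: wrapping the slice in another list adds the whole slice as one
# element, so the writer turns that list into a single string field.
wrong_nested = to_csv([info1, info2] + [line[1:3]])

# WRONG: writing the repr-style text puts quotes and \xe9 escapes
# into the file instead of the characters themselves.
wrong_repr = to_csv([ascii(info1), ascii(info2)])

print(right)
print(wrong_nested)
print(wrong_repr)
```

The nested version is what produced the "all info in one column" result: the slice goes in as one element, and the writer has to stringify the list as a whole.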
John Machin
That's not working! line[1:18] writes all the info in one column, and the accents are still wrong.
francois
@francois: Fixed; please try again.
John Machin
Thank you John, you just saved my holidays. With your help, I will be able to import all those files before tonight. Regards, francois
francois