This data comes from an Oracle database and is extracted to flat files encoded as 'WE8MSWIN1252'.

I want to parse the data and do some analysis. I need to see the text fields but will not publish the results to any other system, so it is not a problem if some characters are not converted perfectly.

I just do not want my parsing to fail with a decode error, which is what I get if I use:

inputFile = codecs.open(dataFileName, "r", "utf-8")

+2  A: 

From the last few characters, I'd guess that this encoding is what the rest of the world calls windows-1252. So try:

inputFile = codecs.open(dataFileName, "r", "windows-1252")
Daniel Roseman
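
Since the question says imperfect conversion is acceptable, here is a minimal sketch of a more forgiving read. It is not part of the answer above. A few byte values (0x81, for example) are left undefined in Python's windows-1252 codec and would still raise a decode error under the default strict mode, so the sketch passes errors="replace" to substitute U+FFFD for them. The helper name read_lenient and the sample bytes are made up for illustration, and io.open is used here because it behaves the same on Python 2 and 3; codecs.open accepts the same errors argument.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_lenient(path, encoding="windows-1252"):
    """Read a text file, replacing undecodable bytes instead of raising."""
    # errors="replace" turns any byte the codec cannot map into U+FFFD.
    with io.open(path, "r", encoding=encoding, errors="replace") as f:
        return f.read()


# Hypothetical sample data: 0xE9 is e-acute and 0x80 is the euro sign in
# windows-1252, while 0x81 has no mapping in Python's codec.
raw = b"caf\xe9 \x80 \x81\n"

fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(raw)

text = read_lenient(demo_path)
os.remove(demo_path)

print(repr(text))
```

If dropping the bad bytes is preferable to marking them, errors="ignore" can be passed instead.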