ansaurus

Question

CSV File read. Special Chars problem.

Answer 1

A:

I think the key really is the encoding. What is the text encoding of the input data?

jsight 2009-07-27 17:38:38

Answer 2

A:

What if you read the whole file in and split on \r\n?
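As a rough sketch of that idea: the bytes below stand in for the whole file, and both the `cp1252` encoding and the `;` delimiter are assumptions for illustration, not details taken from the question. The point is that the decode step still needs the right encoding; splitting on \r\n alone does not fix the characters.

```python
# Sample bytes standing in for the whole file (assumed cp1252, ';' delimiter).
raw = b"id;name\r\n1;Jo\xe3o\r\n2;Concei\xe7\xe3o\r\n"

# Decode once with the assumed encoding, then split into lines and fields.
text = raw.decode("cp1252")
rows = [line.split(";") for line in text.split("\r\n") if line]

for row in rows:
    print(row)
```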

Jon 2009-07-27 17:38:40

Answer 3

+2 A:

You've declared the input file as being ASCII, which it clearly is not. Change it to something like iso-8859-1 or CP-1252 (Windows Latin-1) and you might have better luck...

This doesn't fix the fundamental problem that there is no equivalent for ó ã ç in ASCII, so what are you going to do with them? Simply throw them away? Or should you make sure that you're using a more universal encoding like UTF-8 for your output instead?
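To make the point concrete, here is a small Python sketch (Python is used only for illustration; the question's own language is not shown here). It tries to encode the three characters as ASCII, shows what a forced conversion throws away, and compares that with CP-1252 and UTF-8.

```python
text = "\u00f3 \u00e3 \u00e7"  # the characters ó ã ç, separated by spaces

# ASCII only covers code points 0-127, so a strict encode fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Forcing the conversion replaces every unsupported character with "?".
lossy = text.encode("ascii", errors="replace").decode("ascii")

# Both of these encodings can represent all three characters.
as_cp1252 = text.encode("cp1252")  # one byte per character
as_utf8 = text.encode("utf-8")     # two bytes per accented character

print(ascii_ok)
print(lossy)
print(len(as_cp1252), len(as_utf8))
```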

The best thing to do is to find out from your source which encoding was used for this file, and to ask your file's recipient what is acceptable for the output. The only way to find out is to ASK, because several encodings look similar on the surface.

Galghamon 2009-07-27 17:38:51

Answer 4

+1 A:

There are two places where this can go wrong:

While reading (which inevitably corrupts the next step)
While writing

Check the source file's encoding (you could try Notepad2, whose status bar shows the encoding) and use it when reading the source file.

After successfully reading the file, write it out as UTF-8 to preserve those characters in the output file.
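A minimal sketch of those two steps in Python, assuming the source turns out to be CP-1252. The function name `convert_to_utf8` and the file names are hypothetical; substitute whatever encoding the source file really uses.

```python
import os
import tempfile


def convert_to_utf8(src_path, dst_path, src_encoding):
    """Read src_path using src_encoding and rewrite it as UTF-8."""
    # newline="" disables newline translation, so line endings are preserved.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
    return text


# Demo with a throwaway file; cp1252 is an assumed source encoding.
workdir = tempfile.mkdtemp()
src_file = os.path.join(workdir, "input.csv")
dst_file = os.path.join(workdir, "output.csv")
with open(src_file, "wb") as f:
    f.write("nome\r\nJo\u00e3o\r\n".encode("cp1252"))

converted = convert_to_utf8(src_file, dst_file, "cp1252")
print(converted)
```

If the wrong source encoding is passed in, the read step either fails or silently produces wrong characters, and writing UTF-8 afterwards cannot repair that.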

huseyint 2009-07-27 17:47:42

Answer 5

+1 A:

From what you've said:

You're managing to read the data correctly, that is, you've made the correct assumption about the encoding of the input file (not that assuming the encoding is a good thing). This is because you've stated that you can write the string to the console and it matches the input.
The output file data somehow isn't right when you view it.

But, since you've read the data in correctly, and the output encoding that you are using (Windows-1252) does in fact support the characters that you've stated (are there others?), namely, ó, ã and ç, then there is no reason why the file shouldn't be written correctly.

So, how about the way in which you're drawing the conclusion that the output file is being written incorrectly? Is the tool you're using to view the output assuming a certain encoding?
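A quick Python sketch of why the viewer matters: the same bytes look correct or garbled depending on which encoding the tool assumes. The pairing of UTF-8 and CP-1252 is just an example; the actual viewer and file in the question may differ.

```python
text = "\u00f3\u00e3\u00e7"  # the characters ó, ã and ç

# A correctly written UTF-8 file, opened by a viewer that assumes CP-1252,
# shows two wrong characters for each accented letter.
utf8_bytes = text.encode("utf-8")
misread = utf8_bytes.decode("cp1252")

# A correctly written CP-1252 file is not valid UTF-8 at all.
cp1252_bytes = text.encode("cp1252")
try:
    cp1252_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(misread)
print(valid_utf8)
```

In both cases the file on disk is fine; only the interpretation is wrong.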

Eric Smith 2009-07-27 20:10:56
