Hi guys. I have to edit a CSV file; the problem is that my special characters such as ó, ã and ç get garbled.

Here is a piece of my code:

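static void ReadFromFile()
{
        // Requires: using System.IO; using System.Text;
        // The using blocks dispose both streams, which flushes the writer's buffer to disk.
        using (StreamReader SR = new StreamReader("c:\\Users\\Levy\\Documents\\Vale\\Base\\Desknote.csv", Encoding.Default))
        using (StreamWriter SW = new StreamWriter("c:\\Users\\Levy\\Documents\\Vale\\Base\\Desknote_Ed.csv", true, Encoding.GetEncoding("Windows-1252")))
        {
                // Read the first line and write it to the output file unchanged.
                string S = SR.ReadLine();
                SW.Write(S);
        }
}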

UPDATE:

Well, I'm able to read the characters by using Encoding.Default on the StreamReader object (I can display them on the console).

I've tried ISO 8859-1 and CP-1252 on the writer, but my output is still garbled.

Thanks, all.

A: 

I think the key really is the encoding. What is the text encoding of the input data?

jsight
A: 

What if you read the whole file in and split on \r\n?
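
A minimal sketch of that idea, assuming the file uses Windows line endings and that you already know its encoding. The class and method names (WholeFileReader, SplitLines, ReadLines) are made up for illustration, and ISO-8859-1 in Main is only a placeholder for whatever the file really uses:

```csharp
using System;
using System.IO;
using System.Text;

static class WholeFileReader
{
    // Splits already-decoded text on Windows line endings (\r\n).
    public static string[] SplitLines(string content)
    {
        return content.Split(new string[] { "\r\n" }, StringSplitOptions.None);
    }

    // Reads the whole file with an explicit encoding, then splits it into lines.
    public static string[] ReadLines(string path, Encoding encoding)
    {
        return SplitLines(File.ReadAllText(path, encoding));
    }

    static void Main()
    {
        // Assumption: ISO-8859-1 stands in for the file's real encoding.
        Encoding encoding = Encoding.GetEncoding("iso-8859-1");
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "a;b\r\nc;d", encoding);
        Console.WriteLine(ReadLines(path, encoding).Length);
    }
}
```

Note that this only changes how the lines are split; the encoding passed to File.ReadAllText still has to match the file.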

Jon
+2  A: 

You've declared the input file as being ASCII, which it clearly is not. Change it to something like iso-8859-1 or CP-1252 (Windows Latin-1) and you might have better luck...

This doesn't fix the fundamental problem that there is no equivalent for ó ã ç in ASCII, so what are you going to do with them? Simply throw them away? Or should you make sure that you're using a more universal encoding like UTF-8 for your output instead?
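
To illustrate the point about ASCII, here is a small sketch that encodes the three characters and decodes them again. AsciiLossDemo and RoundTrip are hypothetical names, not part of the original code. With Encoding.ASCII each accented character comes back as a question mark, while UTF-8 returns the original text:

```csharp
using System;
using System.Text;

static class AsciiLossDemo
{
    // Encodes the text with the given encoding and decodes it again,
    // returning whatever survives the round trip.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    static void Main()
    {
        // Unicode escapes for "ó ã ç", so the result does not depend on
        // how this source file itself is saved.
        string sample = "\u00F3 \u00E3 \u00E7";

        // ASCII has no code for these characters, so each becomes '?'.
        Console.WriteLine(RoundTrip(sample, Encoding.ASCII) == "? ? ?");

        // UTF-8 can represent them, so the text is unchanged.
        Console.WriteLine(RoundTrip(sample, Encoding.UTF8) == sample);
    }
}
```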

The best thing to do is find out from your source what the encoding is that was used for this file, and ask your file's recipient what is acceptable for the output. The only way to find out is to ASK, because there are various encodings that look similar on the surface.

Galghamon
+1  A: 

There are two places where things may be going wrong:

  1. While reading (which inevitably breaks the next step)
  2. While writing

Check the source file's encoding (you could try Notepad2, whose status bar shows the encoding) and use that when reading the source file.

After successfully reading the file, write with UTF-8 to preserve those characters in the output file.
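
Roughly, the two steps could look like the sketch below. CsvTranscoder and ToUtf8 are hypothetical names, and ISO-8859-1 in Main is only an assumed source encoding; substitute whatever your editor reports for the real file:

```csharp
using System;
using System.IO;
using System.Text;

static class CsvTranscoder
{
    // Copies a text file line by line, decoding it with sourceEncoding
    // and writing it back out as UTF-8 with a byte order mark.
    public static void ToUtf8(string inputPath, string outputPath, Encoding sourceEncoding)
    {
        using (StreamReader reader = new StreamReader(inputPath, sourceEncoding))
        using (StreamWriter writer = new StreamWriter(outputPath, false, new UTF8Encoding(true)))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
            }
        }
    }

    static void Main()
    {
        // Assumption: the source file is ISO-8859-1.
        Encoding source = Encoding.GetEncoding("iso-8859-1");
        string input = Path.GetTempFileName();
        string output = Path.GetTempFileName();

        // Create a small sample input containing "descrição".
        File.WriteAllText(input, "nome;descri\u00E7\u00E3o\r\n", source);
        ToUtf8(input, output, source);

        // Reading the result back as UTF-8 should give the same text.
        Console.WriteLine(File.ReadAllText(output, Encoding.UTF8).Trim() == "nome;descri\u00E7\u00E3o");
    }
}
```

The byte order mark is a design choice here: it helps many Windows tools recognise the file as UTF-8, but some consumers do not expect it, in which case new UTF8Encoding(false) avoids it.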

huseyint
+1  A: 

From what you've said:

  1. You're managing to read the data correctly, that is, you've made the correct assumption about the encoding of the input file (not that assuming the encoding is a good thing). This is because you've stated that you can write the string to the console and it matches the input.

  2. The output file data somehow isn't right when you view it.

But, since you've read the data in correctly, and the output encoding that you are using (Windows-1252) does in fact support the characters that you've stated (are there others?), namely, ó, ã and ç, then there is no reason why the file shouldn't be written correctly.

So, how about the way in which you're drawing the conclusion that the output file is being written incorrectly? Is the tool you're using to view the output assuming a certain encoding?
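
One way to take the viewer out of the equation is to look at the raw bytes. The sketch below (ByteInspector and ToHex are made-up names) prints the bytes each encoding produces for ó, ã and ç. It uses ISO-8859-1 as a stand-in because these three characters have the same byte values there as in Windows-1252, and because newer .NET runtimes need an extra code page provider registered before Windows-1252 is available:

```csharp
using System;
using System.Text;

static class ByteInspector
{
    // Returns the bytes an encoding produces for the text,
    // formatted as space-separated hex pairs.
    public static string ToHex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text)).Replace("-", " ");
    }

    static void Main()
    {
        string sample = "\u00F3\u00E3\u00E7"; // ó ã ç

        // One byte per character in ISO-8859-1: F3 E3 E7
        Console.WriteLine(ToHex(sample, Encoding.GetEncoding("iso-8859-1")));

        // Two bytes per character in UTF-8: C3 B3 C3 A3 C3 A7
        Console.WriteLine(ToHex(sample, Encoding.UTF8));
    }
}
```

If the output file contains F3, E3 and E7 where those characters should be, it was written correctly as Windows-1252, and the tool displaying it is probably assuming a different encoding. You can get the file's bytes with File.ReadAllBytes and format them the same way with BitConverter.ToString.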

Eric Smith