views:

302

answers:

1

HI, I am creating a file like so.

FileStream temp = File.Create( this.FileName );

Then putting data in the file like so.

this.Writer = new StreamWriter( this.Stream );
this.Writer.WriteLine( strMessage );

That code is encapsulated in a class hierarchy but that is the meat and potatoes of it.

My problem is this. MSDN says that the default encoding for creating a file this way is UTF8. And when I write a french character such as é Textpad interprets the file as a UTF 8 file, but notepad++ says it's "ANSI as UTF8" or maybe it's an ansi file but is reading it as UTF8. When I create a file the same way without the french character both textpad and notepad++ read the file as an ansi file even though according to msdn it should be a utf 8 file still.

Which program should be trusted. Notepad++ or textpad - Notepad++ seems to be more consistant, but is still the oppossite to what MSDN says it should be. My problem is that we create files that get sent off to another company and depending on whether there are french characters the encoding seems to keep changing.

Or is there a better way to determine the encoding of a file. I've read about byte order marks and preambles but as far as I understand neither are guaranteed to be there.

We initially thought that all the files we were building were ansi. Also please note that both ansi and utf8 should handle the french characters appropriately as the characters are part of both character sets.

A: 

as far as i know, "ansi" character encoding is another name for ascii-us.

if there are no characters in the file that aren't in the ascii charset then the file is valid ascii and valid utf8, there's no way to distinguish them. so your program can write it as utf8 and any other program would be correct in seeing it as ascii (ansi) just as it would be seeing it as utf8.

fsb