Is there an option to easily view the current file's encoding (UTF-8, UTF-8 without BOM, ASCII, Western, etc.)?

I work mostly on web applications, so this is crucial to me.

I can't find this anywhere besides the "Save as" dialog. When you save a file in Visual Studio, you can choose "Save MyClass.cs as", then click the down arrow and choose "Save with Encoding...". There you can see and change the currently selected encoding and line-ending options. But that is 4-5 clicks, far too much work for such a simple piece of information!

There is a "View.ChooseEncoding" command, but I had no luck running it; all I get is "Command "View.ChooseEncoding" is not available."

I have tried to display it in the status bar with my own add-in, but no luck: I can't find the encoding information anywhere in the automation API. I used EnvDTE.Document to access the current file's information when opening/saving.

+1  A: 

Bokka,

what exactly do you want to do? As far as I'm aware, Visual Studio (2005 at least) will just use the local codepage of the operating system you are using. If you're an English speaker, that's probably Latin1, ISO-8859-1, Extended-ASCII, whatever. If you have no accented/European characters, they're all "pretty much" the same representation.
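To illustrate the "pretty much the same" point, here is a minimal Python sketch (not anything Visual Studio does itself). The helper name `same_bytes_everywhere` and the list of encodings are my own choices for illustration; it simply checks whether a string encodes to identical bytes in several common encodings.

```python
def same_bytes_everywhere(text, encodings=("iso-8859-1", "cp1252", "utf-8")):
    """Return True if the text encodes to the same bytes in every listed encoding."""
    # errors="replace" keeps the comparison from raising on characters
    # that an encoding cannot represent; they simply compare as different.
    return len({text.encode(name, errors="replace") for name in encodings}) == 1


print(same_bytes_everywhere("public class MyClass {}"))  # True
print(same_bytes_everywhere("café"))  # False
```

Pure ASCII source is byte-for-byte identical in all three, so the encoding only starts to matter once an accented or other non-ASCII character appears.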

We saw some behaviour at work where Visual Studio 2005 on 64-bit operating systems (e.g. Vista64) was saving in Unicode (UTF-8 with BOM) by default; the commenter above suggests that Visual Studio 2008 behaves the same way.

Which brings me back to the question: what do you want to do? Are you considering using the web.config globalization section? It has an option to specify fileEncoding as well as the request and response encodings.

   <system.web>
      <globalization 
         fileEncoding="iso-8859-1"
         requestEncoding="utf-8"
         responseEncoding="utf-8" />
   </system.web>

For any particular stream of bytes you receive, it's difficult to "know" what encoding was used, unless it's UTF-8 with BOM (in which case it's easy to check the first few bytes) or it was created on your PC (in which case it uses your default codepage).
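Checking the first few bytes can be sketched like this in Python, using the BOM constants from the standard `codecs` module. The function name `sniff_bom` and the returned labels are assumptions made for this example; this is only a rough sketch of the idea, not a full encoding detector.

```python
import codecs

# Longer signatures come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

For a file, reading the first four bytes in binary mode is enough, e.g. `sniff_bom(open(path, "rb").read(4))`. A result of `None` only means there is no signature; it says nothing about which codepage the file actually uses.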

A long while back I wrote NCharDet, which attempts to determine encodings (mainly for different Asian languages), but I'm not sure this is what you need either (besides which, it is a little out of date). MLang used to be the main 'API' Microsoft provided for stuff like this.

Sorry I can't provide an actual answer...

CraigD
Visual Studio 2005 saves to ISO-8859-1 by default; if the file contains characters that are not covered by ISO-8859-1, it saves to UTF-16 by default, which is great because Subversion will treat it as binary. I think the default in 2008 is UTF-8, but I'm not sure.
DrJokepu
A: 

Look at this: How to Determine Text File Encoding. But this is applicable only to Unicode files saved with a signature (byte order mark).

macropas