Easy one here, I think: can you confirm that each code page is implemented as a separate and unique subclass of System.Text.Encoding in .NET 2.0?

+2  A: 

I'm not sure I understand the context of your question, but yes: Encoding.UTF8, Encoding.Unicode (UTF-16), and anything returned from Encoding.GetEncoding() inherit from System.Text.Encoding. It cannot be any other way: Encoding is an abstract class and Encoding.GetEncoding() is declared to return an Encoding, so the only thing it can return is an instance of a subclass.
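As a rough illustration of that point, the sketch below prints whether Encoding is abstract and the runtime type of a few instances. It assumes the .NET Framework, where code page 1252 is available through GetEncoding() without extra setup; the class name EncodingBaseCheck is made up for this example, and the concrete type names printed may differ between runtime versions.

```csharp
using System;
using System.Text;

class EncodingBaseCheck
{
    static void Main()
    {
        // Encoding is declared abstract, so every instance is of a derived type.
        Console.WriteLine(typeof(Encoding).IsAbstract);

        // Code page 1252 is an assumption: it is available on the .NET Framework on Windows.
        Encoding[] samples = { Encoding.UTF8, Encoding.Unicode, Encoding.GetEncoding(1252) };

        foreach (Encoding e in samples)
        {
            // Print the encoding's name next to the concrete class that implements it.
            Console.WriteLine("{0} -> {1}", e.WebName, e.GetType().FullName);
        }
    }
}
```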

Dean Harding
I mean, there are hundreds of code pages. Are there hundreds of subclasses, or do some code pages share a class?
g_g
@Gary, oh right. No, there is not a one-to-one mapping of code pages to classes. Many of the code pages are driven by a simple table mapping bytes to Unicode code points, so they all share the same subclass of `Encoding`, just with a different mapping table.
Dean Harding
+1  A: 

Looking through Reflector, there are only a handful of classes that are subclasses of System.Text.Encoding.

Public:

  • System.Text.ASCIIEncoding
  • System.Text.UnicodeEncoding
  • System.Text.UTF32Encoding
  • System.Text.UTF7Encoding
  • System.Text.UTF8Encoding

Internal:

  • System.Text.Base64Encoding
  • System.Text.BinHexEncoding
  • System.Text.EncodingNLS
  • System.Xml.Ucs4Encoding

The GetEncoding() method uses variations of these to return the hundreds of other supported code pages.
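One way to check which classes actually back the code pages on a given machine is to enumerate them and group by runtime type. This is only a sketch: the class name EncodingTypeSurvey is hypothetical, and the set of encodings returned by Encoding.GetEncodings() (and therefore the output) depends on the runtime and operating system.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

class EncodingTypeSurvey
{
    static void Main()
    {
        // Maps each concrete Encoding subclass name to the number of code pages using it.
        Dictionary<string, int> counts = new Dictionary<string, int>();

        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            Encoding encoding = info.GetEncoding();
            string typeName = encoding.GetType().FullName;

            // TryGetValue leaves current at 0 when the type has not been seen yet.
            int current;
            counts.TryGetValue(typeName, out current);
            counts[typeName] = current + 1;
        }

        foreach (KeyValuePair<string, int> pair in counts)
        {
            Console.WriteLine("{0}: {1} code page(s)", pair.Key, pair.Value);
        }
    }
}
```

If many code pages share one class, that class shows up with a large count, which answers the question directly for the installed runtime.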

rossisdead
When I run `Console.WriteLine(Encoding.GetEncoding("Korean").GetType().ToString());` I get "System.Text.DBCSCodePageEncoding" as output on the console.
g_g
Doh! I didn't look far enough. There are quite a few more subclasses, and some of those have subclasses of their own!
rossisdead
+1  A: 

No. Some encodings are supported natively by the .NET Framework (listed below) and these have their own subclasses. Any other code page is stored as a property of the object returned by GetEncoding()[1] and support is provided by the underlying operating system.

The native (to .NET Framework) encodings are:

  • ASCIIEncoding
  • UTF7Encoding
  • UTF8Encoding
  • UnicodeEncoding (uses UTF-16)
  • UTF32Encoding

(Information from http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx)

1: For non-native code pages this is potentially a subclass of Encoding, but I haven't checked. The documentation linked above seems to suggest that an instance of Encoding is used.

Zooba