Is each code page implemented as a subclass of System.Text.Encoding in .NET?

tags:

.net
c#


Easy one here, I think: can you confirm that each code page is implemented as a separate and unique subclass of System.Text.Encoding in .NET 2.0?

+2 A:

I'm not sure I understand the context of your question, but yes: Encoding.UTF8, Encoding.Unicode (UTF-16), and anything that is returned from Encoding.GetEncoding() inherit from System.Text.Encoding. It cannot be any other way: Encoding is an abstract class and Encoding.GetEncoding() is declared to return an Encoding, so the only thing it can return is an instance of a subclass.
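A minimal sketch of how this could be checked at run time. The class and method names (`EncodingSubclassCheck`, `IsProperSubclass`) are made up for illustration, and the concrete type names printed will vary between .NET versions, so no particular output is assumed.

```csharp
using System;
using System.Text;

public static class EncodingSubclassCheck
{
    // True when the runtime type of enc derives from Encoding
    // without being Encoding itself.
    public static bool IsProperSubclass(Encoding enc)
    {
        return enc.GetType().IsSubclassOf(typeof(Encoding));
    }

    public static void Main()
    {
        Encoding[] samples =
        {
            Encoding.ASCII,
            Encoding.UTF8,
            Encoding.Unicode,
            Encoding.GetEncoding("utf-8")
        };

        // Encoding is abstract, so it can never be instantiated directly.
        Console.WriteLine("Encoding is abstract: " + typeof(Encoding).IsAbstract);

        foreach (Encoding enc in samples)
        {
            Console.WriteLine(enc.WebName + " -> " + enc.GetType().FullName
                + " (subclass: " + IsProperSubclass(enc) + ")");
        }
    }
}
```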

Dean Harding 2010-09-20 23:25:04

I mean, there are hundreds of code pages. Are there hundreds of subclasses, or do some code pages share a class?

g_g 2010-09-20 23:30:52

@Gary, oh right. No, there is not a one-to-one mapping of code pages to classes. Many of the code pages are driven by a simple table mapping bytes to Unicode code points, so they all share the same subclass of `Encoding`, just with a different mapping table.
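One way to see how many code pages share a class is to group everything Encoding.GetEncodings() reports by its runtime type. This is only a sketch: `CodePageClassSurvey` and `GroupCodePagesByClass` are hypothetical names, and the set of encodings listed depends on the runtime and operating system (newer .NET versions, as far as I know, list only a few unless `CodePagesEncodingProvider` from System.Text.Encoding.CodePages is registered).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CodePageClassSurvey
{
    // Maps each concrete class name to the code pages that were served by it.
    public static Dictionary<string, List<int>> GroupCodePagesByClass()
    {
        Dictionary<string, List<int>> byClass = new Dictionary<string, List<int>>();

        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            Encoding enc;
            try
            {
                enc = info.GetEncoding();
            }
            catch (NotSupportedException)
            {
                continue; // listed, but not available on this machine
            }
            catch (ArgumentException)
            {
                continue; // listed, but rejected by the runtime
            }

            string className = enc.GetType().FullName;
            List<int> codePages;
            if (!byClass.TryGetValue(className, out codePages))
            {
                codePages = new List<int>();
                byClass.Add(className, codePages);
            }
            codePages.Add(info.CodePage);
        }

        return byClass;
    }

    public static void Main()
    {
        foreach (KeyValuePair<string, List<int>> entry in GroupCodePagesByClass())
        {
            Console.WriteLine(entry.Key + ": " + entry.Value.Count + " code page(s)");
        }
    }
}
```

If several code pages show up under one class name, those code pages share an implementation and differ only in their data.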

Dean Harding 2010-09-20 23:51:52

+1 A:

Looking through Reflector, there are only a handful of classes that subclass System.Text.Encoding.

Public:

System.Text.ASCIIEncoding
System.Text.UnicodeEncoding
System.Text.UTF32Encoding
System.Text.UTF7Encoding
System.Text.UTF8Encoding

Internal:

System.Text.Base64Encoding
System.Text.BinHexEncoding
System.Text.EncodingNLS
System.Xml.Ucs4Encoding

The GetEncoding() method uses variations of each of these to give back the hundreds of other code pages that are supported.

rossisdead 2010-09-20 23:44:39

When I run `Console.WriteLine(Encoding.GetEncoding("Korean").GetType().ToString());` I get "System.Text.DBCSCodePageEncoding" on the console.

g_g 2010-09-20 23:53:35

Doh! I didn't look far enough. There are quite a few more subclasses, and some of those are subclassed even further!

rossisdead 2010-09-21 21:18:39

+1 A:

No. Some encodings are supported natively by the .NET Framework (listed below), and these have their own subclasses. For any other encoding, the code page is stored as a property of the object returned by GetEncoding() [1], and support is provided by the underlying operating system.

The native (to .NET Framework) encodings are:

ASCIIEncoding
UTF7Encoding
UTF8Encoding
UnicodeEncoding (uses UTF-16)
UTF32Encoding

(Information from http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx)

[1] For non-native code pages this is potentially a subclass of Encoding, but I haven't checked. The documentation linked above seems to suggest that an instance of Encoding is used.
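The point about the code page being a property of the returned object can be sketched as follows. `CodePageProperty` and `RoundTrip` are names invented for this example, and the code pages used (20127, 28591, 65001, 1200) are ones I would expect to be available without extra setup; the printed class names are not assumed and will differ between runtimes.

```csharp
using System;
using System.Text;

public static class CodePageProperty
{
    // Asks for an encoding by code page number and reads the number back
    // from the CodePage property of whatever object was returned.
    public static int RoundTrip(int codePage)
    {
        return Encoding.GetEncoding(codePage).CodePage;
    }

    public static void Main()
    {
        int[] codePages = { 20127, 28591, 65001, 1200 };

        foreach (int cp in codePages)
        {
            Encoding enc = Encoding.GetEncoding(cp);
            Console.WriteLine(cp + " -> CodePage=" + enc.CodePage
                + ", WebName=" + enc.WebName
                + ", class=" + enc.GetType().FullName);
        }
    }
}
```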

Zooba 2010-09-20 23:45:17
