Just wondering: why is the 'char' type 2 bytes in C# (.NET), unlike 1 byte in other programming languages?

+8  A: 

A char is Unicode in C#, so the number of possible characters exceeds the 256 values one byte can hold. That is why you need two bytes.

Extended ASCII, for example, has a 256-character set and can therefore be stored in a single byte. That is also the whole purpose of the System.Text.Encoding class: different systems can have different charsets and character sizes. C# can therefore handle encodings that use one, four, or some other number of bytes per character, but UTF-16 is the default.
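A minimal sketch of that point, assuming a two-character sample string chosen only for illustration: the in-memory char is always 2 bytes, while the byte count of the same text depends on which encoding is asked.

```csharp
using System;
using System.Text;

// One char is one UTF-16 code unit, so sizeof(char) is always 2.
int charSize = sizeof(char);

// Sample text (an assumption for illustration): 'A' followed by U+00E9 (e with acute accent).
string sample = "A\u00E9";

// The same two characters need a different number of bytes in each encoding.
int utf8Bytes = Encoding.UTF8.GetByteCount(sample);     // 1 byte for 'A', 2 bytes for U+00E9
int utf16Bytes = Encoding.Unicode.GetByteCount(sample); // Encoding.Unicode is UTF-16 little-endian
int utf32Bytes = Encoding.UTF32.GetByteCount(sample);   // 4 bytes per code point

Console.WriteLine($"sizeof(char) = {charSize}");
Console.WriteLine($"UTF-8:  {utf8Bytes} bytes");
Console.WriteLine($"UTF-16: {utf16Bytes} bytes");
Console.WriteLine($"UTF-32: {utf32Bytes} bytes");
```

For this sample the counts are 3, 4, and 8 bytes respectively, while sizeof(char) stays at 2.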

Jan Jongboom
+1  A: 

Because strings in .NET are encoded as 2-byte Unicode characters.

JohnM2
(a) Strings are sequences of characters. (b) There are no 2-byte Unicode characters. You may be looking for the terms *code unit* and *code point*. And even code points are not 16 bits wide; they need 21.
Joey
UTF-8/16/32 != Unicode
Lucas
So what is the relation between a C# character and Unicode code point?
JohnM2
+1  A: 

Actually, the size of char in C#, or more accurately in the CLR, is consistent with most other managed languages. Managed languages, like Java, tend to be newer and have features like Unicode support built in from the ground up. The natural extension of supporting Unicode strings is to have Unicode chars.

Older languages like C/C++ started out ASCII-only and added Unicode support only later.

JaredPar
+1  A: 

C actually has two different character types: char and wchar_t. A char is always one byte long (sizeof(char) is 1 by definition); a wchar_t is not necessarily.

In C# (and .NET, for that matter), all character strings are encoded as Unicode in UTF-16. That's why a char in .NET represents a single UTF-16 code unit, which may be a whole code point or one half of a surrogate pair (and then not actually a character).
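A short sketch of the code unit versus code point distinction, assuming a sample string that contains U+1F600 (a code point outside the Basic Multilingual Plane, picked only for illustration):

```csharp
using System;

// 'a' fits in one UTF-16 code unit; U+1F600 needs a surrogate pair (two code units).
string text = "a\U0001F600";

// Length counts chars, i.e. UTF-16 code units, not code points.
int codeUnits = text.Length;

// The two chars starting at index 1 form one surrogate pair.
bool isPair = char.IsSurrogatePair(text, 1);

// Combine the pair back into the single code point it encodes.
int codePoint = char.ConvertToUtf32(text, 1);

// Count code points by stepping over the second half of each surrogate pair.
int codePoints = 0;
for (int i = 0; i < text.Length; i++)
{
    if (char.IsSurrogatePair(text, i))
    {
        i++; // skip the low surrogate
    }
    codePoints++;
}

Console.WriteLine($"code units:  {codeUnits}");
Console.WriteLine($"code points: {codePoints}");
Console.WriteLine($"pair at index 1: {isPair}, code point U+{codePoint:X}");
```

Here the string holds 3 chars but only 2 code points, which is the case where a single char is not a complete character on its own.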

Joey
+1  A: 

Because a C# string uses the UTF-16 encoding of Unicode by default, and each UTF-16 code unit is 2 bytes.

Bob Moore