views: 102

answers: 5

That is, why does unsigned short var = L'ÿ'; work, but unsigned short var[] = L"ÿ"; does not?

A: 

From what I remember of C:

  • 'Y' or the like is a character constant, i.e. an integer value, so it can be converted to another integer type, with or without the L prefix,
  • "y" is a string constant, and you can't convert it into an integer value.
Eineki
`unsigned short var[]` is not an integer value. It's an array of `short`s. :-P
Chris Jester-Young
+8  A: 

L'ÿ' is of type wchar_t, which can be implicitly converted into an unsigned short. L"ÿ" is of type wchar_t[2], which cannot be implicitly converted into unsigned short[2].
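
A minimal sketch of this distinction, assuming a C99 or later compiler. The function names are made up for illustration, and ÿ is written as the universal character name \u00FF so that the source file encoding does not matter:

```c
#include <wchar.h>

/* Hypothetical helper: a wide character constant has type wchar_t and
   converts implicitly to unsigned short, like any integer conversion. */
unsigned short wide_char_as_ushort(void)
{
    unsigned short var = L'\u00FF';   /* OK: wchar_t -> unsigned short */
    return var;
}

/* Hypothetical helper: an array initialized from a wide string literal
   must have an element type compatible with wchar_t. */
size_t wide_string_length(void)
{
    wchar_t var[] = L"\u00FF";        /* OK: var has type wchar_t[2] */

    /* unsigned short bad[] = L"\u00FF";
       Rejected unless wchar_t happens to be unsigned short
       on this implementation. */

    return sizeof var / sizeof var[0]; /* one character plus the terminator */
}
```

The array counts two elements because the literal includes the terminating null wide character.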

Chris Jester-Young
This is the right answer.
Dervin Thunk
+3  A: 

L is the prefix for wide character literals and wide-character string literals. This is part of the language and not a header. It's also not GCC-specific. They would be used like so:

wchar_t some_wchar = L'ÿ';
wchar_t *some_wstring = L"ÿ"; // or wchar_t some_wstring[] = L"ÿ";

You can write unsigned short something = L'ÿ'; because a conversion is defined from wchar_t to unsigned short. No such conversion is defined from a wchar_t array (or wchar_t*) to an array of unsigned short.

Steve M
and this works as well... pretty much same as Chris.
Dervin Thunk
+2  A: 

wchar_t is just a typedef for one of the standard integer types. The compiler implementor chooses a type that is large enough to hold all wide characters. If you don't include the header, this is still true and L'ß' is well defined; you as a programmer just don't know what type it has.

Your initialization of an integer type works because there are rules for converting one integer type into another. Assigning a wide character string (i.e., the address of the first element of a wide character array) to an integer pointer is only possible if you correctly guess the integer type to which wchar_t corresponds. There is no automatic conversion between pointers of different types, unless one of them is void*.
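
Rather than guessing, one way to see which integer type wchar_t corresponds to on a given implementation is C11's _Generic selection. This is only a sketch, assuming a C11 compiler; the macro and function names are invented for illustration, and the list of candidate types is not exhaustive:

```c
#include <wchar.h>

/* _Generic picks a branch based on the type of its controlling
   expression. In C, wchar_t is a typedef, so it matches the
   underlying integer type directly. */
#define WCHAR_UNDERLYING_NAME(x) _Generic((x), \
    unsigned short: "unsigned short",          \
    int:            "int",                     \
    unsigned int:   "unsigned int",            \
    long:           "long",                    \
    unsigned long:  "unsigned long",           \
    default:        "some other integer type")

/* Hypothetical helper: name of the type behind wchar_t here. */
const char *wchar_type_name(void)
{
    return WCHAR_UNDERLYING_NAME((wchar_t)0);
}
```

Code that relies on the answer is not portable, since the result differs between implementations.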

Jens Gustedt
+1  A: 

Chris has already given the correct answer, but I'd like to offer some thoughts on why you may have made the mistake to begin with. On Windows, wchar_t was defined as 16-bit way back in the early days of Unicode, when it was intended to be a 16-bit character set. Unfortunately this turned out to be a bad decision (it makes it impossible for the C compiler to support non-BMP Unicode characters in a way that conforms to the C standard), but they were stuck with it.

Unix systems from the beginning have used 32-bit wchar_t, which of course means short * and wchar_t * are incompatible pointer types.
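
To check what a particular platform does, the width of wchar_t can be measured directly. A small sketch with hypothetical helper names:

```c
#include <limits.h>
#include <wchar.h>

/* Hypothetical helper: number of bits in wchar_t on this implementation. */
unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Nonzero when wchar_t and unsigned short have the same size.
   Equal size alone does not make the pointer types compatible. */
int wchar_matches_ushort_size(void)
{
    return sizeof(wchar_t) == sizeof(unsigned short);
}
```

Following the description above, one would expect wchar_bits() to report 16 on Windows and 32 on typical Unix systems, but the value is implementation-defined.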

R..