Are L'A' and 'A' exactly the same?

If you write 'A', and that value gets converted to wchar_t, then on Microsoft compilers at least, it will have the same value as if you'd written L'A' or _T('A').
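As a rough sketch of that conversion, assuming an implementation where the narrow and wide encodings of 'A' agree (Microsoft compilers, and most ASCII-based toolchains). The helper names `widen_by_conversion` and `narrow_a_matches_wide_a` are made up for illustration:

```cpp
#include <cassert>

// Hypothetical helper: relies on the implicit integral conversion
// from char to wchar_t, which simply carries the numeric value over.
inline wchar_t widen_by_conversion(char c) {
    return c;  // equivalent to static_cast<wchar_t>(c)
}

// True when the converted narrow literal compares equal to the wide literal.
// Assumption: the implementation encodes 'A' identically in both character sets.
inline bool narrow_a_matches_wide_a() {
    return widen_by_conversion('A') == L'A';
}

inline void demo() {
    wchar_t w = 'A';  // narrow literal converted on assignment
    assert(w == L'A');
    assert(narrow_a_matches_wide_a());
}
```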

The same can't be said of string literals, since there is no useful conversion from const char* to const wchar_t*. I think this means it's rather less important to get the type of a character literal right than the type of a string literal.
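To illustrate the difference, here is a small sketch. The helper `widen_ascii` is hypothetical, and it assumes the input contains only characters whose narrow and wide codes agree (plain ASCII, say); it is not a real decoder:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: widens each char individually by integral conversion.
// Assumption: every input character has the same code in the narrow and
// wide character sets, so no decoding is attempted.
inline std::wstring widen_ascii(const std::string& narrow) {
    std::wstring wide;
    wide.reserve(narrow.size());
    for (char c : narrow) {
        // Go through unsigned char so negative char values do not sign-extend.
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return wide;
}

inline void demo() {
    // const wchar_t* p = "abc";  // does not compile: no pointer conversion exists
    const wchar_t* ok = L"abc";   // a wide string literal is required here
    assert(widen_ascii("abc") == ok);
}
```

So a narrow string has to be widened element by element (or decoded properly), whereas a single narrow character converts on its own.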

It's easy to write code that behaves differently according to whether a character literal is wide or narrow - just have an overloaded function that does something completely different. But in practice, sensible functions overloaded to take both types of character are going to end up doing the same thing with 'A' that they do with L'A'. And functions which aren't overloaded, and only take wchar_t, can take 'A' just fine.
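Something like the following shows both cases. The function names are invented for the example, and the comparisons assume 'A' has the same code in both character sets:

```cpp
#include <cassert>

// Hypothetical overload pair that treats both types as "a character".
inline bool is_upper_a(char c)    { return c == 'A'; }
inline bool is_upper_a(wchar_t c) { return c == L'A'; }

// Hypothetical function that only has a wide version.
inline bool is_wide_upper_a(wchar_t c) { return c == L'A'; }

inline void demo() {
    assert(is_upper_a('A'));       // exact match: the char overload
    assert(is_upper_a(L'A'));      // exact match: the wchar_t overload
    assert(is_wide_upper_a('A'));  // 'A' is converted implicitly to wchar_t
    assert(!is_upper_a('B'));
    assert(!is_upper_a(L'B'));
}
```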

I don't immediately see anything in the standard to require that L'A' == (wchar_t)'A', so in theory non-Microsoft compilers might do something completely different. But you'd normally expect the wide character set to be an extension of the narrow character set, just as Unicode extends ISO-8859-1. To be specific about what "extension" means: code points which are equal as integers designate the "same character".

'A' isn't necessarily ASCII and L'A' isn't necessarily Unicode.

John Burton 2010-03-17 10:55:07

Yes, L'A' isn't necessarily Unicode. But I can't understand what "'A' isn't necessarily ASCII" means.

Benjamin 2010-03-17 11:01:26

It means, "`'A'` (and in general the `char` type) isn't necessarily ASCII". For example, it might be EBCDIC. But not on Microsoft compilers, which is what Johannes is (reasonably IMO) talking about, on account of you mentioning `_T` in the question.

Steve Jessop 2010-03-17 12:21:26

thanks Steve :)

Benjamin 2010-03-17 15:16:05

My understanding is that having `L'A' == (wchar_t)'A'` is mandated in both C and C++. The constraint no longer holds in C (since TC2; some details have been modified in TC3) if `__STDC_MB_MIGHT_NEQ_WC__` is 1. C++0x imports the C TC3 solution. For the C discussion see http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_321.htm
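A hedged sketch of how code might check that macro. The function name `macro_allows_mismatch` is made up, and treating "macro absent" as "values match" is an assumption that only makes sense on implementations that know about the macro at all:

```cpp
#include <cassert>

// True when the implementation defines __STDC_MB_MIGHT_NEQ_WC__ as 1,
// meaning a basic character may have different narrow and wide values.
inline bool macro_allows_mismatch() {
#if defined(__STDC_MB_MIGHT_NEQ_WC__) && __STDC_MB_MIGHT_NEQ_WC__ == 1
    return true;
#else
    return false;
#endif
}

inline void demo() {
    // Only rely on the equality when the macro does not withdraw it.
    if (!macro_allows_mismatch()) {
        assert(L'A' == static_cast<wchar_t>('A'));
    }
}
```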

AProgrammer 2010-03-17 12:49:21

Assuming your understanding is correct, I still can't actually *find* it in the C++ standard, either in the definition of the executable wide character set, the definition of wide character literals, or the definition of `wchar_t`. Doesn't mean it's not there.

Steve Jessop 2010-03-17 13:40:46

The POSIX standard requires narrow and wide characters to have the same numeric value *for characters that are in the POSIX portable character set*. So, while L'A' == (wchar_t)'A' is guaranteed, L'€' == (wchar_t)'€' isn't.

dan04 2010-03-19 04:09:12

I disagree that functions overloaded to take char and wchar_t should do the same thing. If I want to read a file's contents into a char string, I'll just read it. If I want to read a file's contents into a wchar_t, I have to decode it, because nobody uses UTF-16.

dan04 2010-03-19 04:15:22

@dan04. You can't read a file's contents into a `wchar_t`. What I said about overloading on `char` vs. `wchar_t` applies only to that, not also to overloading on `char*` vs. `wchar_t*`. But for example `std::isspace` should return the same for a `char` and a `wchar_t` that represent the "same character". That requirement about the POSIX portable character set rules out a system which is EBCDIC, but uses Unicode for wide chars. Hence the proposal to remove the equivalent restriction from C99, provided the implementation sets `__STDC_BTOWC_NEQ_WCTOB__`, as in AProgrammer's link above.
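For instance, using the locale-aware `std::isspace` template from `<locale>`, a sketch along these lines gives the same answer for both types. The wrapper `is_space_classic` is a made-up name, and the classic "C" locale is chosen only to keep the example deterministic:

```cpp
#include <cassert>
#include <locale>

// Hypothetical wrapper: classifies a character of either type using the
// std::isspace template from <locale> with the classic "C" locale.
template <class CharT>
bool is_space_classic(CharT c) {
    return std::isspace(c, std::locale::classic());
}

inline void demo() {
    // The "same character" gets the same classification in both types.
    assert(is_space_classic(' ') && is_space_classic(L' '));
    assert(!is_space_classic('A') && !is_space_classic(L'A'));
}
```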

Steve Jessop 2010-03-19 10:27:32

Anyway, by "do the same thing", I sort of mean "a function which claims to treat both types as characters via overloads, will do the same thing in the context of the function, from the user's POV". `std::cout << 'A';` and `std::cout << L'A';` don't "do the same thing", but then that's because `std::cout` is a stream of `char`, and only supports `wchar_t` because it supports everything in the universe and somehow coerces it to one or more chars. A stream of wide characters would hopefully print `A` in both cases.
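A quick way to see the wide-stream case, using a `std::wostringstream` as a stand-in for `std::wcout` so the result can be inspected (the function name is invented for the example):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Writes a narrow 'A' and a wide L'A' to a wide stream and returns the text.
inline std::wstring print_both_to_wide_stream() {
    std::wostringstream out;
    out << 'A';   // the narrow char is widened by the stream
    out << L'A';  // the wide char is written as it is
    return out.str();
}

inline void demo() {
    assert(print_both_to_wide_stream() == L"AA");
}
```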

Steve Jessop 2010-03-19 11:04:56
