ansaurus

Question

isalpha equivalent for wchar_t

Answer 1

+4 A:

iswalpha, iswalnum. Same usage.

Docs - Windows (msdn)

Docs - Linux (opengroup.org)

Krevan 2010-07-31 13:47:43

Answer 2

A:

The header is <wctype.h>. The basic macro/function names have a 'w' in them:

int iswalpha(wint_t wc);
int iswalnum(wint_t wc);

Etc.

There are also the functions:

wctype_t wctype(const char *property);
int iswctype(wint_t wc, wctype_t desc);

You could write, for example:

if (iswctype(wc, wctype("alnum")))
    ...process a wide alphanumeric...

Or you could simply write:

if (iswalnum(wc))
    ...process a wide alphanumeric...
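
As a rough sketch of how the two forms above fit together, the following uses the `<cwctype>` equivalents from C++. The helper name `count_alnum` is hypothetical and only for illustration; classification follows whatever C locale is currently set (the default "C" locale unless `setlocale` has been called).

```cpp
#include <cstddef>
#include <cwctype>

// Hypothetical helper: counts the wide characters in s that the current
// C locale (LC_CTYPE) classifies as alphanumeric.
std::size_t count_alnum(const wchar_t* s) {
    // Look the class up once by name; this is equivalent to calling iswalnum.
    const std::wctype_t alnum = std::wctype("alnum");
    std::size_t n = 0;
    for (; *s != L'\0'; ++s) {
        if (std::iswctype(static_cast<std::wint_t>(*s), alnum)) {
            ++n;
        }
    }
    return n;
}
```

Looking the class up by name is mainly useful when the property is chosen at run time; for a fixed property, iswalnum or iswalpha is shorter.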

Jonathan Leffler 2010-07-31 13:50:20

Answer 3

+2 A:

Take a look at std::isalpha<charT> from <locale>. You can use it as std::isalpha<wchar_t>, passing a std::locale as the second argument.
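
A minimal sketch of the `<locale>` approach, assuming the classic "C" locale as the default. `is_alpha_in` is a hypothetical helper name, not part of the standard library. Note that the `<locale>` overload of std::isalpha takes the locale to consult as its second argument.

```cpp
#include <locale>

// Hypothetical helper: true when c is alphabetic according to loc.
// The template argument of std::isalpha is deduced as wchar_t here.
bool is_alpha_in(wchar_t c, const std::locale& loc = std::locale::classic()) {
    return std::isalpha(c, loc);
}
```

Passing a different std::locale object changes the classification without touching the global C locale, which is the main difference from iswalpha.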

Dmitry 2010-07-31 13:58:21

Answer 4

A:

It depends on how you define “equivalent.” The C character classes are quite simple-minded compared to Unicode character classes. For example, if you want to test whether a given code point usually represents a letter (for some definition of “letter”), you could test for the general category L; if you want to check whether a given string is a valid identifier, you could use UAX #31, etc. iswalnum and iswalpha might give the intended result, depending on the current “locale” setting.

Philipp 2010-08-01 13:57:29

Answer 5

A:

You included the tag "localization" in your question. When writing an international application, you should clearly define what you mean by alphabetical or numerical characters. If you write programs for Windows, I recommend using the GetStringTypeEx function (see http://msdn.microsoft.com/en-us/library/dd318118.aspx). For example:

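BOOL bSuccess;
WORD wCharType = 0;

// Classify one character using the CT_CTYPE1 (character class) flags.
bSuccess = GetStringTypeExW (LOCALE_USER_DEFAULT, CT_CTYPE1, L"a", 1, &wCharType);
// The parentheses are required: == binds more tightly than &.
if (bSuccess && (wCharType & C1_ALPHA) == C1_ALPHA) {
    // alphabetic
}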

You can also use CT_CTYPE3 or CT_CTYPE2 to determine whether a character is ideographic or whether it is a European number.

To be more exact, try the functions iswalpha, IsCharAlphaW, iswalnum, iswdigit and GetStringTypeExW on the following characters: L'a', L'ü', L'á', L'я' (a Russian character), L'ノ' (a Japanese Katakana character), L'一' (1 in Japanese). You will see that

iswalpha (L'ノ') returns alpha
IsCharAlphaW (L'ノ') returns NOT alpha
iswalnum (L'一') returns alpha or digit
iswdigit (L'一') returns NOT digit

The code

bSuccess = GetStringTypeExW (LOCALE_USER_DEFAULT, CT_CTYPE2, L"一", 1, &wCharType);
// CT_CTYPE2 results are mutually exclusive values, not bit flags, so compare directly.
if (bSuccess && wCharType == C2_EUROPENUMBER) {
    // numeric
}

tells you that L"一" is NOT a European number. You can use GetStringTypeExW to distinguish European numbers from, for example, Arabic numbers.

So I recommend that you specify your requirements more exactly and then choose the API based on them. In general, the C API is not the best choice for an international application.

Oleg 2010-08-01 16:49:04

Answer 6

A:

Strictly speaking, this is not possible under Visual Studio/Windows, because wchar_t is 2 bytes on this platform and cannot hold every Unicode code point.
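
To illustrate the limitation: code points above U+FFFF are stored in UTF-16 as two 16-bit units (a surrogate pair), so a single 16-bit unit passed to a classification function may be only half of a character. The sketch below uses char16_t so that it behaves the same on every platform; the function names are hypothetical.

```cpp
// True when u is a UTF-16 high (lead) surrogate, U+D800..U+DBFF.
constexpr bool is_high_surrogate(char16_t u) {
    return u >= 0xD800 && u <= 0xDBFF;
}

// True when u is a UTF-16 low (trail) surrogate, U+DC00..U+DFFF.
constexpr bool is_low_surrogate(char16_t u) {
    return u >= 0xDC00 && u <= 0xDFFF;
}

// Combines a valid surrogate pair into the code point it encodes.
// The caller is expected to have checked both halves first.
constexpr char32_t combine_surrogates(char16_t hi, char16_t lo) {
    return 0x10000
         + ((static_cast<char32_t>(hi) - 0xD800) << 10)
         + (static_cast<char32_t>(lo) - 0xDC00);
}
```

For example, U+1D400 (MATHEMATICAL BOLD CAPITAL A) is stored as the pair 0xD835, 0xDC00; neither unit on its own is a letter.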

What you really need is a function accepting char*. You have one in ICU AFAIK.

See also http://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmful

Pavel Radzivilovsky 2010-08-23 09:10:17

I'm pretty sure that half the point of Unicode is the whole multi-byte thing, where you don't need to be able to hold one codepoint in one character type, because it has the whole carry-over thing.

DeadMG 2010-08-23 09:16:56

DeadMG, what thing are you talking about? I lost you somewhere...If you cannot pass a character to a function, you cannot know if it's a letter. So far so good - and you really cannot, this is life.

Pavel Radzivilovsky 2010-08-23 09:23:35
