What is the equivalent of isalpha or isalnum for wchar_t?

Is it something in wctype?

An example would be nice, too.

Thanks.

+4  A: 

iswalpha, iswalnum. Same usage.


Docs - Windows (msdn)

Docs - Linux (opengroup.org)

Krevan
A: 

The header is <wctype.h>. The basic macro/function names have a 'w' in them:

int iswalpha(wint_t wc);
int iswalnum(wint_t wc);

Etc.

There are also the functions:

wctype_t wctype(const char *property);
int iswctype(wint_t wc, wctype_t desc);

You could write, for example:

if (iswctype(wc, wctype("alnum")))
    ...process a wide alphanumeric...

Or you could simply write:

if (iswalnum(wc))
    ...process a wide alphanumeric...
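
Since the question asked for an example, here is a minimal sketch of both forms in C++ (via <cwctype>, which wraps <wctype.h>). The helper names count_alnum and is_alnum_by_property are made up for illustration, and the classification depends on the current C locale:

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helper: counts the characters of `text` that iswalnum
// classifies as alphanumeric in the current C locale.
std::size_t count_alnum(const std::wstring& text) {
    std::size_t count = 0;
    for (wchar_t ch : text) {
        if (std::iswalnum(static_cast<std::wint_t>(ch))) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: the same test expressed through the generic
// wctype/iswctype pair, looking the class up by its property name.
bool is_alnum_by_property(wchar_t ch) {
    return std::iswctype(static_cast<std::wint_t>(ch), std::wctype("alnum")) != 0;
}
```

For plain ASCII input, count_alnum(L"ab1 -") counts the three characters 'a', 'b' and '1'. Results for non-ASCII characters vary with the locale and the platform.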
Jonathan Leffler
+2  A: 

Take a look at std::isalpha<charT> from <locale>. You could use it as std::isalpha<wchar_t>; unlike the C function, it takes the character and a std::locale.
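
A rough sketch of how the <locale> overload might be used. The function name all_alpha is hypothetical, and defaulting to the classic "C" locale is an assumption made only to keep the example deterministic:

```cpp
#include <locale>
#include <string>

// Hypothetical helper: returns true if every character of `text` is
// alphabetic according to `loc`. An empty string yields true.
// Assumption: the classic "C" locale is used unless the caller passes another.
bool all_alpha(const std::wstring& text,
               const std::locale& loc = std::locale::classic()) {
    for (wchar_t ch : text) {
        // The template argument is deduced as wchar_t here.
        if (!std::isalpha(ch, loc)) {
            return false;
        }
    }
    return true;
}
```

Passing a different std::locale object changes the classification without touching the global C locale.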

Dmitry
A: 

It depends on how you define “equivalent.” The C character classes are quite simple-minded compared to Unicode character classes. For example, if you want to test whether a given code point usually represents a letter (for some definition of “letter”), you could test for the general category L; if you want to check whether a given string is a valid identifier, you could use UAX #31, etc. iswalnum and iswalpha might give the intended result, depending on the current “locale” setting.
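
To make the locale dependence concrete, here is a small sketch that classifies one character under a named locale and then restores the previous setting. The function name alpha_under_locale and its three-way return value are invented for this example; which locale names exist (other than "C") depends on the platform:

```cpp
#include <clocale>
#include <cwctype>
#include <string>

// Hypothetical helper: classifies `wc` with iswalpha while LC_CTYPE is
// temporarily set to `locale_name`.
// Returns 1 if alphabetic, 0 if not, -1 if the locale is not available.
int alpha_under_locale(const char* locale_name, wchar_t wc) {
    // Copy the current setting; the returned pointer may be overwritten
    // by the next setlocale call.
    const char* current = std::setlocale(LC_CTYPE, nullptr);
    const std::string saved = current ? current : "C";

    if (std::setlocale(LC_CTYPE, locale_name) == nullptr) {
        return -1;
    }
    const int result = std::iswalpha(static_cast<std::wint_t>(wc)) ? 1 : 0;

    // Restore whatever was active before the call.
    std::setlocale(LC_CTYPE, saved.c_str());
    return result;
}
```

Calling it for the same non-ASCII character with "C" and with a UTF-8 locale installed on your system is a quick way to see whether the two answers differ on your platform.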

Philipp
A: 

You included the tag "localization" in your question. When writing an international application, you should clearly define what you mean by alphabetic or numeric characters. If you are writing programs for Windows, I recommend the GetStringTypeEx function (see http://msdn.microsoft.com/en-us/library/dd318118.aspx). For example, the code

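BOOL bSuccess;
WORD wCharType = 0;

// Query the CT_CTYPE1 classification of a single character.
bSuccess = GetStringTypeExW (LOCALE_USER_DEFAULT, CT_CTYPE1, L"a", 1, &wCharType);
// The parentheses are required: == binds more tightly than &.
if (bSuccess && (wCharType & C1_ALPHA) == C1_ALPHA) {
    // alphabetic
}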

You can also use CT_CTYPE3 or CT_CTYPE2 to determine whether a character is ideographic or whether it is a European number.

To be more precise, try the functions iswalpha, IsCharAlphaW, iswalnum, iswdigit and GetStringTypeExW on the following characters: L'a', L'ü', L'á', L'я' (a Russian character), L'ノ' (a Japanese Katakana character), L'一' (1 in Japanese). You will see that

  • iswalpha (L'ノ') reports alpha
  • IsCharAlphaW (L'ノ') reports NOT alpha
  • iswalnum (L'一') reports alpha or digit
  • iswdigit (L'一') reports NOT digit

The code

bSuccess = GetStringTypeExW (LOCALE_USER_DEFAULT, CT_CTYPE2, L"一", 1, &wCharType);
// CT_CTYPE2 results are mutually exclusive codes, not bit flags, so compare directly.
if (bSuccess && wCharType == C2_EUROPENUMBER) {
    // numeric
}

tells you that L"一" is NOT a European number. You can use GetStringTypeExW to distinguish a European number from, for example, an Arabic number.

So I recommend that you specify your requirements more precisely and then choose the API based on them. In general, the C API is not the best choice for an international application.

Oleg
A: 

Strictly speaking, this is not possible under Visual Studio/Windows, because wchar_t is 2 bytes on this platform and cannot hold an arbitrary Unicode code point: anything above U+FFFF takes two UTF-16 code units.
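
As an illustration of why a single 16-bit unit is not always enough, here is a sketch of how a UTF-16 surrogate pair maps to one code point. The function names are hypothetical, and the caller is assumed to have validated the pair first:

```cpp
// True if `unit` is a UTF-16 high (lead) surrogate.
constexpr bool is_high_surrogate(char16_t unit) {
    return unit >= 0xD800 && unit <= 0xDBFF;
}

// True if `unit` is a UTF-16 low (trail) surrogate.
constexpr bool is_low_surrogate(char16_t unit) {
    return unit >= 0xDC00 && unit <= 0xDFFF;
}

// Combines a surrogate pair into the code point it encodes.
// Assumption: is_high_surrogate(high) and is_low_surrogate(low) both hold.
constexpr char32_t combine_surrogates(char16_t high, char16_t low) {
    return 0x10000
         + ((static_cast<char32_t>(high) - 0xD800) << 10)
         + (static_cast<char32_t>(low) - 0xDC00);
}
```

For example, the pair 0xD834, 0xDD1E combines to U+1D11E. A classifier that sees only one wchar_t at a time cannot recover that code point.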

What you really need is a function accepting char*. You have one in ICU AFAIK.

See also http://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmful

Pavel Radzivilovsky
I'm pretty sure that half the point of Unicode is the whole multi-byte thing, where you don't need to be able to hold one codepoint in one character type, because it has the whole carry-over thing.
DeadMG
DeadMG, what thing are you talking about? I lost you somewhere... If you cannot pass a character to a function, you cannot know whether it's a letter. So far so good, and you really cannot; this is life.
Pavel Radzivilovsky