What is the equivalent of isalpha or isalnum for wchar_t?
wctype?
An example would be nice too.
Thanks.
The header is <wctype.h>. The basic macro/function names have a 'w' in them:
int iswalpha(wint_t wc);
int iswalnum(wint_t wc);
Etc.
There are also the functions:
wctype_t wctype(const char *property);
int iswctype(wint_t wc, wctype_t desc);
You could write, for example:
if (iswctype(wc, wctype("alnum")))
...process a wide alphanumeric...
Or you could simply write:
if (iswalnum(wc))
...process a wide alphanumeric...
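Since an example was asked for, here is a minimal sketch of both forms wrapped in small helpers. It is written as C++ using <cwctype>, the C++ spelling of <wctype.h>. The helper names (is_wide_alnum, is_wide_alnum_by_name, count_wide_alnum) are made up for illustration, and the results depend on the current C locale; only plain ASCII input is assumed to behave the same everywhere.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Direct test: a nonzero result from iswalnum means "alphanumeric"
// according to the current C locale.
bool is_wide_alnum(wchar_t wc) {
    return std::iswalnum(static_cast<std::wint_t>(wc)) != 0;
}

// Same test through the generic lookup; "alnum" is one of the
// standard property names accepted by wctype().
bool is_wide_alnum_by_name(wchar_t wc) {
    const std::wctype_t desc = std::wctype("alnum");
    return std::iswctype(static_cast<std::wint_t>(wc), desc) != 0;
}

// Hypothetical helper: count the alphanumeric characters in a wide string.
std::size_t count_wide_alnum(const std::wstring& text) {
    std::size_t n = 0;
    for (wchar_t wc : text) {
        if (is_wide_alnum(wc)) {
            ++n;
        }
    }
    return n;
}
```

Calling count_wide_alnum(L"ab 12!") counts the two letters and two digits and skips the space and the exclamation mark.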
Take a look at std::isalpha<charT> from <locale>. You could use it as std::isalpha<wchar_t>; note that it takes the locale as a second argument.
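A minimal sketch of the <locale> version, assuming you are happy to pass the locale explicitly. The wrapper names is_alpha_in and is_alnum_in are hypothetical, and the default of std::locale::classic() (the "C" locale) is only a placeholder choice; a real program would more likely pass the user's locale.

```cpp
#include <locale>

// Classify a wide character as alphabetic under the given locale.
// std::isalpha here is the two-argument template from <locale>,
// not the one-argument function from <cctype>.
bool is_alpha_in(wchar_t wc, const std::locale& loc = std::locale::classic()) {
    return std::isalpha(wc, loc);
}

// Same idea for alphanumeric characters.
bool is_alnum_in(wchar_t wc, const std::locale& loc = std::locale::classic()) {
    return std::isalnum(wc, loc);
}
```

Unlike iswalpha, this form does not depend on the global C locale, so two parts of a program can classify characters under different locales at the same time.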
It depends on how you define “equivalent.” The C character classes are quite simple minded compared to Unicode character classes. For example, if you want to test whether a given code point usually represents a letter (for some definition of “letter”), you could test for the general category L; if you want to check whether a given string comprises a valid identifier, you could use UAX #31, etc. iswalnum and iswalpha might give the intended result depending on the current “locale” setting.
You included the tag "localization" in your question. When writing an international application, you should clearly define what you mean by alphabetic or numeric characters. If you write programs for Windows, I recommend the GetStringTypeEx function (see http://msdn.microsoft.com/en-us/library/dd318118.aspx). For example, the code
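BOOL bSuccess;
WORD wCharType = 0;
bSuccess = GetStringTypeExW (LOCALE_USER_DEFAULT, CT_CTYPE1, L"a", 1, &wCharType);
// Parenthesize the mask: == binds tighter than &, so the unparenthesized
// form would test (wCharType & 1) instead of the C1_ALPHA bit.
if (bSuccess && (wCharType & C1_ALPHA) == C1_ALPHA) {
// alphabetic
}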
You can also use CT_CTYPE3 or CT_CTYPE2 to determine whether a character is an ideograph or whether it is a European number.
To be more exact, just try the functions iswalpha, IsCharAlphaW, iswalnum, iswdigit and GetStringTypeExW on the following characters: L'a', L'ü', L'á', L'я' (a Russian character), L'ノ' (a Japanese Katakana character), L'一' (1 in Japanese). You will see that
The code
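bSuccess = GetStringTypeExW (LOCALE_USER_DEFAULT, CT_CTYPE2, L"一", 1, &wCharType);
// CT_CTYPE2 results are mutually exclusive values, not bit flags,
// so compare for equality instead of masking.
if (bSuccess && wCharType == C2_EUROPENUMBER) {
// numeric
}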
tells you that L"一" is NOT a European number. You can use GetStringTypeExW to distinguish a European number from, for example, an Arabic number.
So I recommend that you specify your requirements more precisely and then choose the API based on them. In general, the C API is not the best choice for an international application.
Strictly speaking, this is not possible under Visual Studio/Windows, because wchar_t is 2 bytes on this platform and cannot hold every Unicode code point: anything outside the Basic Multilingual Plane takes two wchar_t values (a surrogate pair).
What you really need is a function accepting char*. You have one in ICU AFAIK.
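To illustrate the 2-byte limitation, here is a rough sketch of what a per-character test has to do first on UTF-16 data: combine surrogate pairs into full code points before classifying anything. It uses char16_t rather than wchar_t so that it behaves the same on every platform. The function names are made up for this example, and replacing unpaired surrogates with U+FFFD is just one possible policy.

```cpp
#include <cstddef>
#include <string>

// First (high) and second (low) halves of a UTF-16 surrogate pair.
inline bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool is_low_surrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Decode UTF-16 code units into Unicode code points.
// Unpaired surrogates become U+FFFD (an assumption of this sketch).
std::u32string utf16_to_code_points(const std::u16string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char16_t u = in[i];
        if (is_high_surrogate(u) && i + 1 < in.size() && is_low_surrogate(in[i + 1])) {
            const char32_t hi = static_cast<char32_t>(u - 0xD800);
            const char32_t lo = static_cast<char32_t>(in[i + 1] - 0xDC00);
            out.push_back(0x10000 + (hi << 10) + lo);
            ++i;  // the low surrogate was consumed as part of the pair
        } else if (is_high_surrogate(u) || is_low_surrogate(u)) {
            out.push_back(0xFFFD);  // stray surrogate with no partner
        } else {
            out.push_back(static_cast<char32_t>(u));  // BMP character, one unit
        }
    }
    return out;
}
```

For instance, U+1D400 (MATHEMATICAL BOLD CAPITAL A) is stored as the two units 0xD835 0xDC00, so a function that receives a single 16-bit wchar_t sees only half of the character and cannot classify it.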
See also http://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmful