I want to know how the "isupper" macro is defined in C/C++. Could you please provide the definition or point me to available resources? I tried looking at ctype.h but couldn't figure it out.
The standard specifies it as a function, although an implementation may additionally provide it as a macro. The definition of isupper() differs depending on things like the locale and the current character set, which is why there's a function specifically for this purpose.
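Because the definition varies, portable code normally just calls the library's isupper() rather than copying its internals. A minimal sketch follows; count_upper is a made-up helper name, and the cast to unsigned char is there because passing a negative char value (other than EOF) to isupper() is undefined behavior:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the characters in s that the current
 * locale classifies as upper case. */
static size_t count_upper(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        /* Cast first: plain char may be signed on this platform. */
        if (isupper((unsigned char)*s))
            ++n;
    }
    return n;
}
```

In the default "C" locale only 'A' through 'Z' count, so count_upper("Hello World") gives 2 there.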
For ASCII, because of the way the letters are assigned, it's actually quite easy to test for this. If the ASCII code of the character falls between 0x41 and 0x5A inclusive, then it is an upper case letter.
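The ASCII range test described above could be written roughly like this. The name ascii_isupper is hypothetical, and the code assumes an ASCII-compatible character set and ignores the locale entirely:

```c
/* ASCII-only check (hypothetical helper): 'A' is 0x41 and 'Z' is 0x5A,
 * and the upper case letters are contiguous between them. */
static int ascii_isupper(int c)
{
    return c >= 0x41 && c <= 0x5A;
}
```

Characters just outside the range, such as '@' (0x40) and '[' (0x5B), are rejected.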
It's implementation defined -- every vendor can, and usually does, do it differently.
The most common approach involves a "traits" table: an array with one element for each character, where the value of each element is a collection of flags that indicate details about the character. An example would be:
traits[(int) 'C'] = ALPHA | UPPER | PRINTABLE;
In which case, isupper() would be something like:
#define isupper(c) ((traits[(int)(c)] & UPPER) == UPPER)
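To make the table idea concrete, here is a small sketch. The flag values, the init_traits function, and the my_isupper name are all made up for illustration (real libraries pick their own layout), and the initialization assumes 'A' through 'Z' are contiguous, as in ASCII:

```c
#include <limits.h>

/* Hypothetical flag bits; a real library chooses its own values. */
enum { UPPER = 1 << 0, ALPHA = 1 << 1, PRINTABLE = 1 << 2 };

/* One entry per possible unsigned char value, zero-initialized. */
static unsigned char traits[UCHAR_MAX + 1];

/* Mark 'A'..'Z' as upper case letters (assumes they are contiguous). */
static void init_traits(void)
{
    for (int c = 'A'; c <= 'Z'; ++c)
        traits[c] = ALPHA | UPPER | PRINTABLE;
}

/* Named my_isupper so it does not clash with <ctype.h>. Indexing with
 * unsigned char keeps negative char values from reading out of bounds. */
#define my_isupper(c) ((traits[(unsigned char)(c)] & UPPER) == UPPER)
```

After calling init_traits() once, my_isupper('C') is true and my_isupper('c') is false. The argument is evaluated only once, so this macro form does not suffer from double evaluation.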
It's implementation-specific. One obvious way to implement it would be:
extern char *__isupper;
#define isupper(x) ((int)__isupper[(x)])
Where __isupper points to an array of 0's and 1's determined by the locale. However, this sort of technique has gone out of favor, since accessing global variables in shared libraries is rather inefficient and creates permanent ABI requirements, and since it's incompatible with POSIX thread-local locales.
Another obvious way to implement it on ASCII-only or UTF-8-only implementations is:
#define isupper(x) ((unsigned)(x)-'A'<='Z'-'A')
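The trick in that macro is that the subtraction is done in unsigned arithmetic: anything below 'A' wraps around to a huge value, so a single comparison covers both ends of the range. A function form of the same idea, with a hypothetical name and again assuming an ASCII-compatible encoding:

```c
/* Single-comparison range check (hypothetical helper).
 * For x < 'A', (unsigned)x - 'A' wraps to a very large value, so the
 * one comparison rejects values on both sides of the range. */
static int ascii_isupper_fast(int x)
{
    return (unsigned)x - 'A' <= (unsigned)('Z' - 'A');
}
```

Negative inputs such as EOF are rejected too, because they convert to large unsigned values before the subtraction.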
The real definition is actually fairly complicated, for instance in glibc (the C library commonly used with GCC). But a simple implementation of isupper, which has a double-evaluation bug because the argument appears twice, could be:
#define isupper(c) (((c) >= 'A') & ((c) <= 'Z'))
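To show what the double-evaluation bug means in practice, here is a small sketch. NAIVE_ISUPPER, safe_isupper, next_char, and calls are all made-up names; the macro is a fully parenthesized version of the simple range test:

```c
/* Naive macro: the argument appears twice in the expansion, so an
 * argument with side effects runs those side effects twice. */
#define NAIVE_ISUPPER(c) (((c) >= 'A') & ((c) <= 'Z'))

/* Function version: the argument is evaluated exactly once. */
static int safe_isupper(int c)
{
    return c >= 'A' && c <= 'Z';
}

/* Counts its own invocations to expose the double evaluation. */
static int calls = 0;

static int next_char(void)
{
    ++calls;
    return 'Q';
}
```

NAIVE_ISUPPER(next_char()) calls next_char() twice, whereas safe_isupper(next_char()) calls it once. With something like getchar() as the argument, the macro would consume two characters of input per test.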
glibc (the C library commonly used with GCC) specifically checks whether bit 0 is set in the current locale's entry for the character:
(*__ctype_b_loc ())[(int) (c)] & (unsigned short int) (1 << (0))
Where __ctype_b_loc() is a function that returns a pointer to a pointer into an array of unsigned short flag words for the current locale, with one entry describing the characteristics of each character in the current character set.