Look at std::numeric_limits<char>::min() and max(). Or CHAR_MIN and CHAR_MAX if you don't like typing, or if you need an integer constant expression.

If CHAR_MAX == UCHAR_MAX and CHAR_MIN == 0 then chars are unsigned (as you expected). If CHAR_MAX != UCHAR_MAX and CHAR_MIN < 0 they are signed (as you're seeing).
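For instance, a quick check along these lines (just a sketch that prints the limits and the verdict) shows which case you're in:

```cpp
#include <climits>
#include <iostream>
#include <limits>

int main() {
    // numeric_limits and CHAR_MIN/CHAR_MAX report the same thing;
    // the macros are also usable where an integer constant expression
    // is required.
    std::cout << "min = " << static_cast<int>(std::numeric_limits<char>::min())
              << ", max = " << static_cast<int>(std::numeric_limits<char>::max())
              << '\n';

    if (CHAR_MIN == 0)
        std::cout << "plain char is unsigned\n";   // CHAR_MAX == UCHAR_MAX
    else
        std::cout << "plain char is signed\n";     // CHAR_MIN < 0
}
```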
The standard (3.9.1/1) ensures that there are no other possibilities: "... a plain char can take on either the same values as a signed char or an unsigned char; which one is implementation-defined."
This tells you whether char is signed or unsigned, and that's what's confusing you. You certainly can't call anything to modify it: from the POV of a program it's baked into the compiler, even if the compiler has ways of changing it (GCC certainly does: -fsigned-char and -funsigned-char).
The usual way to deal with this is: if you're going to cast a char to int, cast it through unsigned char first. So in your example, (int)(unsigned char)mystring[a]. This ensures you get a non-negative value.
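As a sketch of the difference (the string contents here are just an assumption for illustration):

```cpp
#include <iostream>
#include <string>

int main() {
    std::string mystring = "cafe";
    mystring += static_cast<char>(0xE9);   // a byte above 127, for illustration

    for (std::string::size_type a = 0; a < mystring.size(); ++a) {
        std::cout << (int)mystring[a]                 // may be negative if char is signed
                  << " vs "
                  << (int)(unsigned char)mystring[a]  // always 0..255
                  << '\n';
    }
}
```

On a signed-char implementation the last byte prints as -23 with the direct cast, but 233 when routed through unsigned char.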
It doesn't actually tell you what charset your implementation uses for char, but I don't think you need to know that. On Microsoft compilers, the answer is essentially that commonly-used character encoding "ISO-8859-mutter-mutter". This means that chars with 7-bit ASCII values are represented by that value, while values outside that range are ambiguous and will be interpreted by a console or other recipient according to how that recipient is configured. ISO Latin 1 unless told otherwise.
Properly speaking, the way characters are interpreted is locale-specific, and the locale can be modified and interrogated using a whole bunch of stuff towards the end of the C++ standard that personally I've never gone through and can't advise on ;-)
Note that if there's a mismatch between the charset in effect, and the charset your console uses, then you could be in for trouble. But I think that's separate from your issue: whether chars can be negative or not is nothing to do with charsets, just whether char is signed.