Hi All
How can I detect the presence of extended ASCII values (128 to 255) in a C++ character array?
Thank you
Iterate over the array and check whether any character falls in the 128 to 255 range?
Please remember that there is no such thing as extended ASCII. ASCII was and is only defined between 0 and 127. Everything above that is either invalid or needs to be in a defined encoding other than ASCII (for example ISO-8859-1).
Other than that: what's wrong with iterating over it and checking for any value > 127 (or < 0 when using signed chars)?
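// Returns true if the null-terminated string contains a byte with the high bit set.
bool detect(const signed char* x) {
    while (*x > 0) ++x;  // skip 7-bit characters; stops at '\0' or at a negative value
    return *x != 0;      // stopped before the terminator, so a high-bit byte was found
}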
Endianness does not matter here, since each char is a single byte. Just check the highest bit with a bitwise AND mask:
if (ch & 128) {
// high bit is set
} else {
// looks like a 7-bit value
}
But there are probably locale functions you should be using for this. Better yet, KNOW what character encoding the data is coming in as. Trying to guess it is like trying to guess the format of the data going into your database fields. It might go in, but garbage in, garbage out.
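For what it's worth, the high-bit test can be wrapped into a small scan over a buffer. This is only a sketch: the function name first_high_bit and the (pointer, length) interface are made up for illustration, and it assumes 8-bit chars. It reports where the first offending byte is, which helps when you want to log or reject the input.

```cpp
#include <cstddef>

// Hypothetical helper (not from any library): returns the index of the first
// byte whose high bit is set, or len if every byte is a 7-bit value.
std::size_t first_high_bit(const char* data, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        // Cast to unsigned char so the test does not depend on whether
        // plain char is signed or unsigned on this platform.
        if (static_cast<unsigned char>(data[i]) & 0x80u) {
            return i;
        }
    }
    return len;
}
```

Taking an explicit length rather than relying on a terminator means embedded '\0' bytes are scanned like any other byte. A return value equal to len means nothing above 127 was found.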
Char can be signed or unsigned. This doesn't really matter, though. You actually want to check if each character is valid ASCII. This is a positive, non-ambiguous check. You simply check if each char is both >=0 and <= 127. Anything else (whether positive or negative, "Extended ASCII" or UTF-8) is invalid.
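A minimal sketch of that positive check, assuming the array length is known. The name is_ascii is hypothetical. Each char is widened to int first, so the same comparison works whether plain char is signed or unsigned.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: true only if every char in [data, data + len)
// is a valid ASCII value, i.e. in the range 0..127.
bool is_ascii(const char* data, std::size_t len) {
    return std::all_of(data, data + len, [](char c) {
        const int v = c;  // negative if char is signed and the high bit is set
        return v >= 0 && v <= 127;
    });
}
```

An empty buffer counts as valid here, since std::all_of returns true for an empty range. To answer the original question, the array contains values above 127 exactly when is_ascii returns false.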