I'm given a string hex_txt that holds a 4-digit hex number as two raw bytes, one per array entry, as shown in the code below. I need to convert it to decimal. This is how I'm doing it at the moment:

unsigned char hex_txt[] = "\xAB\xCD";
char hex_num[5]; /* sprintf and sscanf take char *, not unsigned char * */
unsigned int dec_num;

sprintf(hex_num, "%.2x%.2x", (int)hex_txt[0], (int)hex_txt[1]);
printf("%s\n", hex_num);
sscanf(hex_num, "%x", &dec_num);
printf("%u\n", dec_num); /* dec_num is unsigned, so %u rather than %d */

Is there a faster or more efficient way of doing this? This is my current ad hoc solution, but I'd like to know if there's a proper way to do it.

A: 

Here's what I do:

n = 0;
while (*p){
    if (ISDIGIT(*p)) n = n*16 + (*p++ - '0');
    else if (ISALPHA(*p)) n = n*16 + (LOWERCASE(*p++) - 'a' + 10);
    else break;
}

And (you'll hate this but it works for ASCII) I cheat:

#define LOWERCASE(c) ((c) | ' ')

ADDED: Sorry, I just re-read your question. Why not do:

(unsigned int)(unsigned char)p[0] * 256 + (unsigned int)(unsigned char)p[1]
Mike Dunlavey
Using 'ISALPHA' (why not 'isalpha'?) leaves you vulnerable to converting 'z'. The validation is not stringent enough. If I were you, I'd remove the first part of your answer and leave the second only. Also, the 'unsigned int' casts are not necessary, though they do no harm.
Jonathan Leffler
@Jonathan: If you multiply an unsigned char by 256, don't you get nothing? And if you cast a signed char to unsigned int, I wanted to be sure it wouldn't accidentally propagate the sign bit.
Mike Dunlavey
Mike: Intermediate results in C are always at least `int` - if you do calculations with all `unsigned char` operands, they are promoted to either `int` or `unsigned int`, depending on the relative size of `int` and `char`. If you store the result back in an `unsigned char`, *that's* when the result is reduced modulo `UCHAR_MAX + 1` - but in this case the result is being stored into an `int`, so all is well.
caf
BTW, rather than `isalpha` you probably want its little-known cousin `isxdigit`. Also, if `p` is a `char *`, you have to cast `*p` to `unsigned char` before passing it to `isalpha` and friends (which is not a widely known fact).
caf
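A minimal sketch of the stricter loop these comments point toward, using `isxdigit` and the `unsigned char` cast. The name `parse_hex` and its return convention are hypothetical, chosen only for illustration; it is not the answerer's code, and it does no overflow checking.

```c
#include <ctype.h>   /* isxdigit, isdigit, tolower */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: parse the leading hex digits of s into *out.
   Returns the number of digits consumed and stops at the first
   character that is not a hex digit (so 'z' is rejected).
   No overflow check: the unsigned value simply wraps. */
size_t parse_hex(const char *s, unsigned int *out)
{
    unsigned int n = 0;
    size_t used = 0;

    /* Cast to unsigned char before calling the <ctype.h> functions. */
    while (isxdigit((unsigned char)s[used])) {
        unsigned char c = (unsigned char)s[used];
        if (isdigit(c))
            n = n * 16 + (unsigned int)(c - '0');
        else
            n = n * 16 + (unsigned int)(tolower(c) - 'a' + 10);
        used++;
    }
    *out = n;
    return used;
}
```

For example, `parse_hex("abcd", &v)` returns 4 and sets `v` to 43981, while `parse_hex("1Fz", &v)` returns 2 and sets `v` to 31.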
@caf: thanks for those points.
Mike Dunlavey
+4  A: 
int result = (unsigned char)hex_txt[0] * 256 + (unsigned char)hex_txt[1];

The string hex_txt contains two bytes; I'm guessing that the order is big-endian (reverse the subscripts if it is little-endian). The character code for hex_txt[0] is 0xAB, for hex_txt[1] is 0xCD. The use of unsigned char casts ensures that you don't get messed up by signed characters.

Or, to do it all at once:

printf("%d\n", (unsigned char)hex_txt[0] * 256 + (unsigned char)hex_txt[1]);
Jonathan Leffler
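For reference, a small sketch of the same idea written with shifts, covering both byte orders the answer mentions. The function names are hypothetical, and the cast to `unsigned int` before the shift is my own choice to keep the arithmetic unsigned even where `int` is 16 bits.

```c
/* Hypothetical helpers: combine two raw bytes into one value. */

/* b[0] is the high byte (big-endian order, as the answer guesses). */
unsigned int bytes_to_uint_be(const unsigned char *b)
{
    return ((unsigned int)b[0] << 8) | b[1];
}

/* b[1] is the high byte (little-endian order). */
unsigned int bytes_to_uint_le(const unsigned char *b)
{
    return ((unsigned int)b[1] << 8) | b[0];
}
```

With the bytes 0xAB, 0xCD from the question, `bytes_to_uint_be` gives 43981 (0xABCD) and `bytes_to_uint_le` gives 52651 (0xCDAB).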
Exactly what I was looking for, thanks.
v64
How can you be sure (unsigned char)hex_txt[0] * 256 isn't zero?
Mike Dunlavey
@Mike: hex_txt[0] is 0xAB or 171; 171 * 256 = 43776. If 'int' were a 16-bit quantity, then that would be a negative non-zero result (or, possibly, an overflow signal, but I've not heard of a machine that generates an overflow signal for signed arithmetic overflow); if it is an unsigned 16-bit quantity, it would still be a non-zero result. There is no way I can see for the multiplication to yield zero. And, further, my answer did not depend on the multiplication yielding a non-zero result - leaving me puzzled about where your question came from.
Jonathan Leffler
The `(unsigned char)` casts are unnecessary if the array is defined as `unsigned char`, as it is.
caf
@Jonathan: I was just afraid (unsigned char)hex_txt[0] * 256 would leave an (unsigned char) result, which would be zero, but caf assures me the intermediate result will be the full size of an int, and thus the shifted bits will not be lost.
Mike Dunlavey
@caf - oh, yes! @Mike: the first thing C does is promote a variable of type char to integer - so the expression is calculated with 'int' values - hence my unconcern about overflow. Now I see why you were worried, but you need not be.
Jonathan Leffler
@Jonathan: Funny, I've been using C for 25 years, and that little subtlety never registered, or maybe it did and I forgot it.
Mike Dunlavey
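The promotion point from this thread can be sketched in a few lines. The function names are hypothetical, the `256u` literal is my own choice to keep the arithmetic unsigned on a 16-bit `int`, and the second result assumes 8-bit `char`.

```c
/* Hypothetical helpers illustrating integer promotion. */

/* hi is promoted before the multiplication, so the product is
   computed at (at least) int width and no bits are lost. */
unsigned int product_in_unsigned(unsigned char hi)
{
    return hi * 256u;
}

/* Only the conversion back to unsigned char reduces the value
   modulo UCHAR_MAX + 1 (256 when char is 8 bits wide). */
unsigned char product_stored_in_char(unsigned char hi)
{
    return (unsigned char)(hi * 256u);
}
```

With `hi` set to 0xAB, the first function returns 43776, and the second returns 0 only because the result is forced back into an `unsigned char`.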
A: 

I might be babbling but if you use:

 char hex_txt[] = "\xAB\xCD";

then in effect you just define 2 bytes:

 char hex_txt[] = {0xab, 0xcd};

so to convert this to an int you can do this:

 int number = (int) ((short *) *hex_text);
Toad
Drop the * from in front of 'hex_text' and the second e inside it. Then you're just left with an indeterminate result; you get one answer on a big-endian machine and another answer on a little-endian machine. You are also vulnerable to a SIGBUS error if the compiler is cruel enough to put hex_txt on an odd byte boundary and your CPU objects to reading short integers from odd addresses. Oh, and the first declaration defines an array of length 3, not 2.
Jonathan Leffler
You can run into 1) endian-issues, and 2) alignment issues, no?
Mike Dunlavey
details ;^)
Toad
Unfortunately, overlooking details like these leads compilers to object at compile time, and programs to crash or give erroneous results at run time. None of these are good outcomes.
Jonathan Leffler
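If reinterpreting the bytes is really what is wanted, one common way to sidestep the alignment problem is to copy them with `memcpy` instead of casting the pointer. This is only a sketch with a hypothetical function name; the byte-order dependence described in the comments remains, so the arithmetic form is still the portable choice.

```c
#include <stdint.h>  /* uint16_t */
#include <string.h>  /* memcpy */

/* Hypothetical helper: read two bytes as a 16-bit value in the
   host's byte order. memcpy avoids the misaligned read that the
   pointer cast can cause, but the result still depends on
   whether the machine is big-endian or little-endian. */
uint16_t load_u16_host(const unsigned char *b)
{
    uint16_t v;
    memcpy(&v, b, sizeof v);
    return v;
}
```

For the bytes 0xAB, 0xCD this returns 0xABCD on a big-endian host and 0xCDAB on a little-endian one.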
A: 

You might not need it, though there's a piece of C++ code I published here.

t.g.