ansaurus

Question

Efficiently convert between Hex, Binary, and Decimal in C/C++

Answer 1

+1 A:

That depends on what you're optimizing for, what do you mean by "efficient"? Is it important that the conversions be fast, use little memory, little programmer time, fewer WTFs from other programmers reading the code, or what?

For readability and ease of implementation, you should at least implement both Dec2Hex() and Dec2Binary() by just calling strotul(). That makes them into one-liners, which is very efficient for at least some of the above interpretations of the word.

unwind 2009-05-04 09:49:09

By "efficiency" I mean time efficiency: Shortest execution time. Thanks for clarifying that.

Igor Oks 2009-05-04 09:55:44

Answer 2

+1 A:

Sounds very much like a homework problem, but what the heck...

The short answer is for converting from long int to your strings use two lookup tables. Each table should have 256 entries. One maps a byte to a hex string: 0 -> "00", 1 -> "01", etc. The other maps a byte to a bit string: 0 -> "00000000", 1 -> "00000001".

Then for each byte in your long int you just have to look up the correct string, and concatenate them.

To convert from strings back to long you can simply convert the hex string and the bit string back to a decimal number by multiplying the numeric value of each character by the appropriate power of 16 or 2, and summing up the results.

EDIT: You can also use the same lookup tables for backwards conversion by doing binary search to find the right string. This would take log(256) = 8 comparisons of your strings. Unfortunately I don't have time to do the analysis whether comparing strings would be much faster than multiplying and adding integers.

Dima 2009-05-04 10:01:19

Regarding the strings to long conversion: Would it work faster than strotul() ?

Igor Oks 2009-05-04 10:05:12

Don't know... Try it.

Dima 2009-05-04 10:06:46

Answer 3

+3 A:

I would suggest just using sprintf and sscanf.

Also, if you're interested in how it's implemented you can take a look at the source code for glibc, the GNU C Library.

Robert S. Barnes 2009-05-04 10:04:18

Wouldn't it work slower than the other solutions?

Igor Oks 2009-05-04 10:07:22

Two answers: 1. Test all solutions and see which is faster.2. Remember the code in the C Standard Library is typically expertly written and highly optimized - problems like these are the entire reason standard libraries exist, so programmers have access to expertly written solutions to extremely common problems and don't have to go and constantly reinvent the wheel themselves.

Robert S. Barnes 2009-05-04 13:00:51

Remember also that sprintf and sscanf have been tested extensively, and aren't going to have the little bugs you might introduce trying to do the conversion yourself.

David Thornley 2009-05-04 16:28:06

Unfortunately, %b is not a standard printf conversion specifier for binary. I would still consider using it, though.

Matthew Flaschen 2009-05-05 02:59:19

Answer 4

+1 A:

Why not just use a Macro to also take the format as an input. If you are in C at least.

#define TO_STRING( string, format, data) \
sprintf( string, "##format##", data)
// Int
TO_STRING(buf,%d,i);
// Hex ( Two char representation )
TO_STRING(buf,%02x,i);
// Binary
TO_STRING(buf,%b,i);

Or you can use sprintf directly: Or you can have multiple macroes.

#define INT_STRING( buf, data) \
sprintf( buf, "%d", data)
#define HEX_STRING( buf, data) \
sprintf( buf, "%x", data)
#define BIN_TO_STRING( buf, data) \
sprintf( buf, "%b", data)

BIN_TO_STRING( loc_buf, my_bin );

eaanon01 2009-05-04 16:09:15

Answer 5

+1 A:

Why do these routines have to be so time-efficient? That sort of claim always makes me wonder. Are you sure the obvious conversion methods like strtol() are too slow, or that you can do better? System functions are usually pretty efficient. They are sometimes slower to support generality and error-checking, but you need to consider what to do with errors. If a bin argument has characters other than '0' and '1', what then? Abort? Propagate massive errors?

Why are you using "Dec" to represent the internal representation? Dec, Hex, and Bin should be used to refer to the string representations. There's nothing decimal about an unsigned long. Are you dealing with strings showing the number in decimal? If not, you're confusing people here and are going to confuse many more.

The transformation between binary and hex text formats can be done quickly and efficiently, with lookup tables, but anything involving decimal text format will be more complicated.

David Thornley 2009-05-04 16:45:09

Answer 6

+1 A:

Let's think about half of task for a moment - converting from a string-ized base n to unsigned long, where n is a power of 2 (base 2 for binary and base 16 for hex).

If your input is sane, then this work is nothing more than a compare, a subract, a shift and an or per digit. If your input is not sane, well, that's where it gets ugly, doesn't it? Doing the conversion superfast is not hard. Doing it well under all circumstances is the challenge.

So let's assume that your input is sane, then the heart of your conversion is this:

unsigned long PowerOfTwoFromString(char *input, int shift)
{
    unsigned long val = 0;
    char upperLimit = 'a' + (1 << shift)
    while (*input) {
        char c = tolower(*input++);
        unsigned long digit = (c > 'a' && c < upperLimit) ? c - 'a' + 10 : c - '0';
        val = (val << shift) | digit;
    }
    return val;
 }

 #define UlongFromBinaryString(str) PowerOfTwoFromString(str, 1)
 #define UlongFromHexString(str) PowerOfTwoFromString(str, 4)

See how easy that is? And it will fail on non-sane inputs. Most of your work is going to go into making your input sane, not performance.

Now, this code takes advantage of power of two shifting. It's easy to extend to base 4, base 8, base 32, etc. It won't work on non-power of two bases. For those, your math has to change. You get

val = (val * base) + digit

which is conceptually the same for this set of operations. The multiplication by the base is going to be equivalent to the shift. So I'd be as likely to use a fully general routine instead. And sanitize the code while sanitizing the inputs. And at that point, strtoul is probably your best bet. Here's a link to a version of strtoul. Nearly all the work is handling edge conditions - that should clue you in on where you energies should be focused: correct, resilient code. The savings for using bit shifts is going to be minimal compared to the savings of say, not crashing on bad input.

plinth 2009-05-04 17:28:53

Answer 7

+3 A:

As others have pointed out, I would start with sscanf(), printf() and/or strtoul. They are fast enough for most applications, and they are less likely to have bugs. I will say, however, that these functions are more generic than you might expect, as they have to deal with non-ASCII character sets, with numbers represented in any base and so forth. For some domains it is possible to beat the library functions.

So, measure first, and if the performance of these conversion is really an issue, then:

1) In some applications / domains certain numbers appear very often, for example zero, 100, 200, 19.95, may be so common that it makes sense to optimize your functions to convert such numbers with a bunch of if() statements, and then fall back to the generic library functions. 2) Use a table lookup if the most common 100 numbers, and then fall back on a library function. Remember that large tables may not fit in your cache and may require multiple indirections for shared libraries, so measure these things carefully to make sure you are not decreasing performance.

You may also want to look at boost lexical_cast functions, though in my experience the latter are relatively compared to the good old C functions.

Tough many have said it, it is worth repeating over and over: do not optimize these conversions until you have evidence that they are a problem. If you do optimize, measure your new implementation to make sure it is faster and make sure you have a ton of unit tests for your own version, because you will introduce bugs :-(

coryan 2009-05-04 21:23:14

ansaurus

tags:

views:

answers:

Efficiently convert between Hex, Binary, and Decimal in C/C++

related questions