ansaurus

Question

Answer 1

A:

Seems reasonable to me on first glance.

kenny 2009-01-05 12:31:29

Answer 2

+8 A:

It seems like a lot of templated code to achieve very little, given you have direct hex conversion in the standard C scanf and printf functions. Why bother?

Shane MacLaughlin 2009-01-05 12:32:19

I was thinking the same thing.

Daniel 2009-01-05 12:33:12

sscanf/printf are not type-safe. The stream << and >> operators are.

xtofl 2009-01-05 12:37:44

True, but you could wrap sscanf and sprintf with very simple functions to achieve the type safety. The above templated code is way over the top for something this simple, and the use of templates incorrectly suggests the code is type friendly.

Shane MacLaughlin 2009-01-05 12:51:40

I think this answer comes from misunderstanding the purpose of the functions (the mistake being in the question, not your reading of it - I've added a bit to the question). I can't see any 'nice' way of using printf for the purpose of changing a byte array into printable ascii and back.

Patrick 2009-01-05 13:44:07

In fact I can't see how to turn 0x30 into 0x00 using sprintf at all. Do you agree or am I being dumb?

Patrick 2009-01-05 13:48:17

Check out the printf %x specifier at http://www.cplusplus.com/reference/clibrary/cstdio/printf.html, e.g. sprintf(str,"%02x",i). Similarly for sscanf, see http://www.cplusplus.com/reference/clibrary/cstdio/scanf.html, as in sscanf("%x",

Shane MacLaughlin 2009-01-05 14:05:35

Which turns 0x00 into 0x30 but how to do the reverse?

Patrick 2009-01-05 14:25:38

Shane MacLaughlin 2009-01-05 14:36:39

Printf is great but there are times when one might need functions like this in an environment where sscanf/printf are not desired (or perhaps not available): embedded, low-level debugging, etc.

jwfearn 2009-01-05 15:53:16

@jwfearn True, but there are still better ways than the example above. Particularly in embedded, where resources are an issue, I would tend to go for a streamed output rather than resizing a buffer.

Shane MacLaughlin 2009-01-05 16:20:13

@smacl: what if s = "12abdc34d5ba990345326234656". When I code this with scanf it just looks very wrong.

Patrick 2009-01-05 16:56:00

@smacl: don't resize a buffer in embedded due to resources? I'm confused. A stream provides a neat API for a buffer but the stream still holds the data in a buffer.

Patrick 2009-01-05 16:59:58

@smacl: ignore the scanf looks wrong comment. I just needed to think about it for a bit

Patrick 2009-01-05 17:11:23
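
For reference, the sprintf/sscanf route discussed in this thread could look roughly like the sketch below. It assumes C++11 and a std::vector<unsigned char> as the byte container; the names toHex and fromHex are made up for illustration and are not from the question. Each pair is checked with isxdigit before being handed to sscanf, because "%2x" on its own would accept a single digit followed by a non-hex character.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Encode each byte as two lowercase hex digits using snprintf's %02x.
std::string toHex(const std::vector<unsigned char>& bytes)
{
    std::string out;
    out.reserve(bytes.size() * 2);
    char buf[3];                                   // two digits plus terminator
    for (unsigned char b : bytes) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(b));
        out += buf;
    }
    return out;
}

// Decode pairs of hex digits using sscanf's %2x.
// Returns false on odd length or on any non-hex character.
bool fromHex(const std::string& hex, std::vector<unsigned char>& bytes)
{
    if (hex.size() % 2 != 0) return false;
    bytes.clear();
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const unsigned char c0 = static_cast<unsigned char>(hex[i]);
        const unsigned char c1 = static_cast<unsigned char>(hex[i + 1]);
        if (!std::isxdigit(c0) || !std::isxdigit(c1)) return false;
        unsigned value = 0;
        if (std::sscanf(hex.c_str() + i, "%2x", &value) != 1) return false;
        bytes.push_back(static_cast<unsigned char>(value));
    }
    return true;
}
```

The reverse direction asked about above (0x30 0x30 back to 0x00) is the fromHex half: sscanf reads at most two characters per call and the loop advances by two.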

Answer 3

+6 A:

My main comment about it is that it's very difficult to read.

Especially:

*outit = ((( (*it > '9' ? *it - 0x07 : *it)  - 0x30) << 4) & 0x00f0) + 
            (((*(it+1) > '9' ? *(it+1) - 0x07 : *(it+1)) - 0x30) & 0x000f)

It would take my brain a little while to grok that, and annoy me if I inherited the code.

Stephen Cox 2009-01-05 12:35:38

Likewise, confusing for sure.

Shane MacLaughlin 2009-01-05 12:56:24

"A little while"? Either genius or understatement.

Anthony 2009-01-05 13:01:06

Ok, a long while. ;-)

Stephen Cox 2009-01-05 13:15:25

That line definitely needs a comment but I can't see any way of simplifying it once this approach to conversion is chosen. Splitting it over multiple lines...

Patrick 2009-01-05 13:18:15

Agreed, it's not clear how to improve it. Possibly by moving some of the checks into submethods. Still, it makes my eyes bug out.

Stephen Cox 2009-01-05 13:26:43

I'd certainly replace "((*it > '9' ? *it - 0x07 : *it) - 0x30)" with an inline function 'getHexValue', and call that twice. I'm also unsure of the value of the masks: if the input string is garbage, who cares what value is output? If the input string is valid, the masks have no effect.

Steve Jessop 2009-01-06 14:10:37

The result is "*outit = (getHexValue(it) << 4) + getHexValue(it+1);", which I'd settle for.

Steve Jessop 2009-01-06 14:16:49
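
A minimal sketch of that refactoring, assuming getHexValue takes the character rather than the iterator, and using a hypothetical wrapper hexToBytesUnchecked in place of the question's asciihex. The helper keeps the arithmetic of the original line, so it only handles '0'-'9' and uppercase 'A'-'F' and does no validation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Value of one hex digit, using the same arithmetic as the original line.
// Assumes c is '0'-'9' or uppercase 'A'-'F'; anything else gives garbage.
inline unsigned char getHexValue(char c)
{
    return static_cast<unsigned char>((c > '9' ? c - 0x07 : c) - 0x30);
}

// Hypothetical stand-in for asciihex(): two hex digits in, one byte out.
// A trailing unpaired digit is ignored; input is not validated.
std::vector<unsigned char> hexToBytesUnchecked(const std::string& in)
{
    std::vector<unsigned char> out(in.size() / 2);
    for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
        out[i / 2] = static_cast<unsigned char>(
            (getHexValue(in[i]) << 4) + getHexValue(in[i + 1]));
    }
    return out;
}
```

With the helper in place the masks from the original expression are gone, in line with the observation that they have no effect on valid input.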

Answer 4

+3 A:

I don't really object to it. It's generic (within limits), it uses consts, references where needed, etc... It lacks a bit of documentation, and the asciihex *outit assignment is not quite clear at first sight.

resize initializes the output's elements unnecessarily (use reserve instead).

Maybe the genericity is somewhat too flexible: you can feed the algorithms with any datatype you like, while you should only give it hex numbers (not e.g. a vector of doubles)

And indeed, it may be a bit overkill, given the presence of good library functions.

xtofl 2009-01-05 12:39:09

The code will not work if you use reserve. If out.size() is 0 before you call reserve, it will still be 0 after you call reserve, so the loops wouldn't execute. See http://www.gotw.ca/gotw/074.htm

Joel 2009-01-05 17:59:35

Indeed, you would need additional adaptations. My point was that objects got constructed unnecessarily.

xtofl 2009-01-07 09:55:14
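
One possible shape for those "additional adaptations", as a sketch only: reserve the capacity, then append through std::back_inserter so that the loop is driven by the input range rather than by out.size(). The function name encodeHex and the choice of std::string as output are assumptions, not the question's code.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for hexascii(): one byte in, two hex digits out.
std::string encodeHex(const std::vector<unsigned char>& in)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size() * 2);   // allocates capacity only; size() is still 0
    std::back_insert_iterator<std::string> outit = std::back_inserter(out);
    for (std::vector<unsigned char>::const_iterator it = in.begin();
         it != in.end(); ++it) {
        *outit++ = hexDigits[*it >> 4];     // high nibble
        *outit++ = hexDigits[*it & 0x0F];   // low nibble
    }
    return out;
}
```

Nothing is value-initialized up front here, and the output grows exactly as elements are written.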

Answer 5

+4 A:

What is it supposed to do? There is no well-known accepted meaning of hexascii or asciihex, so the names should change.

[edit] Converting from binary to hex notation should often not be called ascii..., as ascii is a 7-bit format.

Stephan Eggermont 2009-01-05 12:41:07

Good call, I only just noticed that now. At a glance, I assumed it was a function for converting between various types of integers and hex. On re-reading, it appears to be for converting between blocks of binary (in bytes) and hex.

Shane MacLaughlin 2009-01-05 12:59:12

Edited the question: the aim is to change a binary array into a printable format and back again.

Patrick 2009-01-05 13:31:11

Answer 6

+2 A:

Some problems that I see:

This will work only if the input container stores 8-bit types, e.g. char or unsigned char. For example, the following code will fail if used with a 32-bit type whose value after the right shift is greater than 15. I recommend always applying a mask so that the lookup index stays within range.

*outit++ = hexDigits[*it >> 4];

What is the expected behavior if you pass in a container containing unsigned longs? For this to be a generic class it should probably be able to handle the conversion of 32-bit numbers to hex strings as well.

This only works when the input is a container - what if I just want to convert a single byte? A suggestion here is to refactor the code into a core function that can convert a single byte (hex=>ascii and ascii=>hex) and then provide additional functions that use this core function for converting containers of bytes etc.

In asciihex(), bad things will happen if the size of the input container is not divisible by 2. The use of:

it != in.end(); it += 2

is dangerous since if the container size is not divisible by 2 then the increment by two will advance the iterator past the end of the container and the comparison against end() will never work. This is somewhat protected against via the assert call but assert can be compiled out (e.g. it is often compiled out in release builds) so it would be much better to make this an if statement.

Stephen Doyle 2009-01-05 12:47:38

The assert on the second line of asciihex() checks that the input size is divisible by 2, so +=2 is safe in this case - I agree though that it at least looks dangerous, and I think I'd code it a bit differently myself.

jrb 2009-01-05 13:11:11

@jrbushell, assert will only be included in debug versions. It has no effect on release code, and is an aid to testing rather than run-time checking

Shane MacLaughlin 2009-01-05 13:20:00

assert doesn't only check in debug code in our build environment (though you can configure it to do so).

Patrick 2009-01-05 13:27:19
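
The points in this answer can be sketched as follows, under some assumptions: the names appendHex, hexDigitValue and hexToBytes are invented here, errors are reported with std::invalid_argument (the question does not say how errors should be handled), and for types wider than 8 bits only the low byte is encoded.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Encode one value as two hex digits. The masks keep both lookup indices
// in 0..15 even when T is wider than 8 bits (only the low byte is encoded).
template <typename T>
void appendHex(T value, std::string& out)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    out += hexDigits[(value >> 4) & 0x0F];
    out += hexDigits[value & 0x0F];
}

// Value of one hex digit, or -1 if c is not a hex digit.
inline int hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Decode with run-time checks instead of assert, so the checks still run
// in builds where NDEBUG is defined.
std::vector<unsigned char> hexToBytes(const std::string& in)
{
    if (in.size() % 2 != 0)
        throw std::invalid_argument("hex input must have an even number of digits");
    std::vector<unsigned char> out;
    out.reserve(in.size() / 2);
    for (std::size_t i = 0; i < in.size(); i += 2) {
        const int hi = hexDigitValue(in[i]);
        const int lo = hexDigitValue(in[i + 1]);
        if (hi < 0 || lo < 0)
            throw std::invalid_argument("non-hex character in input");
        out.push_back(static_cast<unsigned char>((hi << 4) | lo));
    }
    return out;
}
```

Because the length is checked before the loop and the loop is index-based, stepping by two can never move past the end of the input, whatever the assert configuration.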

Answer 7

+3 A:

What's wrong with

*outit = hexDigits[*it]

Why can't these two functions share a common list of hexDigits and eliminate the complex (and slow) calculation of an ASCII character?

S.Lott 2009-01-05 12:47:59

How do you link 0x30 to 0x00? You could use a map I suppose but seems like overkill.

Patrick 2009-01-05 13:28:24

The map is a trivial array lookup. There will only be a few bytes (256 in the crazy worst case; 128 is more realistic). And it will perform instantly using an add and a multiply and nothing more.

S.Lott 2009-01-05 13:50:39
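
A rough sketch of that array lookup, assuming an ASCII-compatible character set. The reverse table is built once from the shared hexDigits list; HexTable and hexValue are hypothetical names, and the -1 marker for non-hex characters and the lowercase entries are additions made here for convenience.

```cpp
#include <climits>
#include <cstddef>

static const char hexDigits[] = "0123456789ABCDEF";   // shared by both directions

// Reverse table: character code -> digit value, -1 for non-hex characters.
struct HexTable {
    signed char value[UCHAR_MAX + 1];
    HexTable()
    {
        for (std::size_t i = 0; i <= UCHAR_MAX; ++i) value[i] = -1;
        for (int d = 0; d < 16; ++d) {
            const unsigned char c = static_cast<unsigned char>(hexDigits[d]);
            value[c] = static_cast<signed char>(d);
            // Also accept the lowercase form of 'A'-'F' (ASCII assumed).
            if (c >= 'A') value[c - 'A' + 'a'] = static_cast<signed char>(d);
        }
    }
};

// Look up the value of one hex digit; returns -1 for anything else.
inline int hexValue(char c)
{
    static const HexTable table;   // built on first use
    return table.value[static_cast<unsigned char>(c)];
}
```

This answers the 0x30 to 0x00 question directly: hexValue('0') indexes entry 0x30 of the table, which holds 0.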

Answer 8

+3 A:

Code has assert statements instead of proper handling of an error condition (and if your assert is turned off, the code may blow up)
for loop has dangerous double-increase of iterator (it+=2). Especially in case your assert did not fire. What happens when your iterator is already at the end and you ++ it?
Code is templated, but what you're doing is simply converting characters to numbers or the other way round. It's cargo cult programming. You hope that the blessings of template programming will come upon you by using templates. You even tagged this as a template question although the template aspect is completely irrelevant in your functions.
the *outit= line is too complicated.
code reinvents the wheel. In a big way.

Thorsten79 2009-01-05 14:05:50

Thanks Thorsten, I was hoping you'd see this. On the cargo cult point, the code allows me to process 2 containers from different libraries without having to write 2 functions which differ only in their parameters, one of the purposes of templates. I didn't consider all the other containers :(.

Patrick 2009-01-05 14:44:06

I value your thoughts about being interoperable between libraries. But why are you fiddling with character constants and obscure ASCII values? It's code "halfway between the gutter and the stars". It tries to be elegant but fails to deliver on that promise. You could easily use C++ strings here.

Thorsten79 2009-01-05 15:47:23

Answer 9

A:

The reason I would consider it toy code is there is no error checking.

I could pass it two vectors and it would happily try to do something, making a complete mess and generating random gibberish.

Martin York 2009-01-05 16:03:56

Answer 10

+1 A:

Problems I spot:

hexascii does not check if sizeof(T2::value_type)==1

hexascii dereferences it twice, asciihex even more. There's no reason for this, as you can store the result. This means you can't use an istream_iterator.

asciihex needs a random-access iterator as input, because (it+1) and (it+=2) are used. The algorithm could work on a forward iterator if you use only (++it).

(*it > '9' ? *it - 0x07 : *it) - 0x30 can be simplified to *it - (*it > '9' ? 0x37 : 0x30) so there is only one unconditional subtraction left. Still, an array lookup would be more efficient. Subtract 0x30. '0' will become 0; 'A' will become 0x11 and 'a' will become 0x31. Mask with 0x1f to make it case-insensitive, and you can do the resulting lookup in a char[0x20] without overflow risks. Non-hex chars will just give you weird values.

MSalters 2009-01-05 16:07:52
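
The last two points can be sketched together, as an illustration rather than a drop-in replacement: a 32-entry table indexed by (c - 0x30) & 0x1f, and a decode loop that uses only ++ and dereferences each input element once. The names hexValue32 and decodeHex are invented here, ASCII is assumed, and unused table slots are filled with 0, so non-hex input decodes to meaningless values instead of being rejected.

```cpp
#include <iterator>
#include <string>
#include <vector>

// (c - 0x30) & 0x1f maps '0'-'9' to 0..9 and both 'A'-'F' and 'a'-'f'
// to 0x11..0x16, so a 32-entry table covers every possible index.
inline unsigned char hexValue32(char c)
{
    static const unsigned char table[0x20] = {
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0,        // 0x00-0x0f
        0, 10, 11, 12, 13, 14, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0   // 0x10-0x1f
    };
    return table[(c - 0x30) & 0x1f];
}

// Decode using only ++ and one dereference per input element, so forward
// (and single-pass input) iterators are enough.
// A trailing unpaired digit is ignored.
template <typename InIt, typename OutIt>
OutIt decodeHex(InIt first, InIt last, OutIt out)
{
    while (first != last) {
        const unsigned char hi = hexValue32(*first);
        ++first;
        if (first == last) break;
        const unsigned char lo = hexValue32(*first);
        ++first;
        *out++ = static_cast<unsigned char>((hi << 4) | lo);
    }
    return out;
}
```

A call such as decodeHex(s.begin(), s.end(), std::back_inserter(bytes)) works for a std::string s and a std::vector<unsigned char> bytes; since nothing beyond ++ and a single read per element is needed, a std::istream_iterator<char> pair should work as input too.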


Converting binary data to printable hex
