+4  Q: 

C++ Hex Parsing

I'm wondering how to convert a hex string to a human-readable string, if that makes sense. This is my first real encounter with hex values, so I'm still learning about them and how to manage them.

I have a program that reads data from a file containing raw packet data (hex), and I need to parse this information so it's human-readable.

An example of what I need is what this site does: http://home2.paulschou.net/tools/xlate/, where you can enter hex and have it converted to text.

A: 
fprintf(file, "%x", thing); // %x prints an unsigned int in hexadecimal; "%h" is not a valid conversion

Something along those lines?

Tom R
A: 

The C++-ish way to get a string containing the hexadecimal representation of a given number is to use the hex modifier for streams, as in this example:

const int i = 0xdeadbeef;
cout << "0x" << hex << i << endl; // prints "0xdeadbeef"

You can use the same modifier on string streams in case you need to have the hexadecimal representation in a string variable:

const int i = 0xdeadc0de;
ostringstream stream;
stream << "0x" << hex << i;

const string s = stream.str(); // s now contains "0xdeadc0de"

UPDATE:

If your input data is given as a string containing the hexadecimal representation of the characters of a string, you will need to know the encoding of the input string in order to display it correctly. In the simplest case, the string is something like ASCII, which maps one byte to one character. So in a given input "414243", every two characters ("41", "42", "43") map to an ASCII value (65, 66, 67), which in turn maps to a character ("A", "B", "C").

Here's how to do that in C++:

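// needs <cassert>, <cstdlib>, <sstream>, <string>, <vector> and "using namespace std;"
const string hexData = "414243";

assert( hexData.size() % 2 == 0 );

ostringstream asciiStream;
istringstream hexDataStream( hexData );
vector<char> buf( 3 ); // two chars for the hex char, one for trailing zero
// get() reads at most two characters and appends the trailing zero. Test its
// result rather than good(): reading the last pair may already set eofbit,
// and a good() check would then drop that pair.
while ( hexDataStream.get( &buf[0], buf.size() ) ) {
    asciiStream << static_cast<char>( std::strtol( &buf[0], 0, 16 ) );
}

const string asciiData = asciiStream.str(); // asciiData == "ABC"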

Using std::strtol from <cstdlib> makes this easy; if you insist on using a template class for this, use std::stringstream to convert the individual substrings (like "41") to numeric values (65).
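The std::stringstream alternative mentioned above could look roughly like the following. This is a minimal sketch, not part of the original answer; the helper names parseHexPair and decodeHexString are made up for illustration. Note that stream extraction is lenient (it may accept a leading sign or whitespace), so this is not a strict validator.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses one two-character substring such as "41".
// Returns false if extraction fails or leaves characters unread.
bool parseHexPair(const std::string& pair, unsigned int& value)
{
    std::istringstream in(pair);
    in >> std::hex >> value;
    // The extraction must succeed and must consume the whole substring.
    return !in.fail() && in.peek() == std::char_traits<char>::eof();
}

// Hypothetical helper: decodes text like "414243" into "ABC", one pair at a time.
// Returns false on odd length or on a pair that does not parse.
bool decodeHexString(const std::string& hexData, std::string& decoded)
{
    decoded.clear();
    if (hexData.size() % 2 != 0)
        return false;
    for (std::string::size_type i = 0; i < hexData.size(); i += 2) {
        unsigned int value = 0;
        if (!parseHexPair(hexData.substr(i, 2), value))
            return false;
        decoded.push_back(static_cast<char>(value));
    }
    return true;
}
```

With these helpers, decodeHexString("414243", result) returns true and leaves "ABC" in result, while an input such as "41zz" returns false.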

Frerich Raabe
I'm a bit confused: all that seems to do is store the hex in a string? I need to convert the hex data to a readable string, like the website http://home2.paulschou.net/tools/xlate/ does.
Undawned
It gives the hexadecimal representation of a number, right. What do you mean with 'hex data'? Is the input data a string with the characters "414243" and you want to convert that to e.g. "ABC" (because that's what the website does)?
Frerich Raabe
Yes, I want to convert it to "ABC". This data, 1062000000000002000100024177616b656e65642d4465760036372e3232382e35302e3232333a38303835000000000009022c010000576f575472616e63652d4177616b656e696e670036372e3232382e34392e39303a3830383500000000000a, contains some strings and an IP address of the server I got the data from. I'd like to convert it to a format in which I can read the values it holds.
Undawned
This is similar to what I was trying earlier, but it yields the same result: for some reason only the first two characters appear when the string is printed back out. I'm not sure why, since the string is definitely longer than what gets printed. Could the data I'm parsing contain a terminating character that interferes with the printout?
Undawned
If I use the (long) example string from your comment in my updated code, I can indeed see some strings and an IP address. I simply used my above code and added a 'cout << asciiData << endl;' to the end. Maybe you're using a C API to print the string, and the third byte, '00' (a NUL byte!), terminates the string output.
Frerich Raabe
Oh yeah, that worked. I had .c_str() on the output. Thanks a lot, this is really going to help me move forward with my app =).
Undawned
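The effect discussed in the comments above can be reproduced in isolation: a std::string may hold zero bytes, but anything that treats c_str() as a C string stops at the first one. The following is a small sketch with made-up helper names (lengthSeenByCApi, makePrintable), assuming the decoded packet is stored in a std::string.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: how many characters a C API (printf("%s"), strlen, ...)
// would see, i.e. everything before the first zero byte.
std::size_t lengthSeenByCApi(const std::string& data)
{
    return std::strlen(data.c_str());
}

// Hypothetical helper: replaces every byte outside the printable ASCII range
// (' ' to '~') with '.', so the whole buffer can be printed safely.
std::string makePrintable(const std::string& data)
{
    std::string result(data);
    for (std::string::size_type i = 0; i < result.size(); ++i) {
        if (result[i] < ' ' || result[i] > '~')
            result[i] = '.';
    }
    return result;
}
```

For a buffer that starts with the bytes 10 62 00, as the example packet does, lengthSeenByCApi reports 2 even though size() is much larger, which matches the "only the first two characters" symptom. Printing makePrintable(asciiData) with cout shows the full length, with dots in place of the unprintable bytes.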
A: 

Hex is a way of displaying binary data. It is not "raw data" as you say. If the raw data you have contains a string, you should be able to see the string (possibly among other garbage) when you output it to the screen.

Here's a loop to print the ASCII characters in a block of data. To get anything else, you will have to deal with its format.

char binary_data[ BUFFER_SIZE ]; // a buffer of bytes, not an array of pointers
size_t len = BUFFER_SIZE;
len = get_a_packet( binary_data, len ); // or however you get data

for ( char *text_ptr = binary_data; text_ptr != binary_data + len; ++ text_ptr ) {
    if ( * text_ptr <= '~' && * text_ptr >= ' ' ) { // if it's printable ASCII
        cerr << * text_ptr; // print it out
    }
}

cerr << endl;
Potatoswatter
+4  A: 

Taken from the Strtk library, the following should suffice. Note that out should point to a block of memory at least half the size of std::distance(begin,end), and that the values in the range [begin,end) must be 0-9, A-F, or a-f.

inline bool convert_hex_to_bin(const unsigned char* begin, 
                               const unsigned char* end, 
                               unsigned char* out)
    {
       if (std::distance(begin,end) % 2)
          return false;
       static const std::size_t symbol_count = 256;
       static const unsigned char hex_to_bin[symbol_count] = {
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x00 - 0x07
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x08 - 0x0F
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x10 - 0x17
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x18 - 0x1F
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x20 - 0x27
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x28 - 0x2F
                    0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, // 0x30 - 0x37
                    0x08, 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x38 - 0x3F
                    0x00, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0x00, // 0x40 - 0x47
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x48 - 0x4F
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x50 - 0x57
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x58 - 0x5F
                    0x00, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0x00, // 0x60 - 0x67
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x68 - 0x6F
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x70 - 0x77
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x78 - 0x7F
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x80 - 0x87
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x88 - 0x8F
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x90 - 0x97
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0x98 - 0x9F
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xA0 - 0xA7
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xA8 - 0xAF
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xB0 - 0xB7
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xB8 - 0xBF
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xC0 - 0xC7
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xC8 - 0xCF
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xD0 - 0xD7
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xD8 - 0xDF
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xE0 - 0xE7
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xE8 - 0xEF
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 0xF0 - 0xF7
                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  // 0xF8 - 0xFF
                  };

       const unsigned char* itr = begin;
       while (end != itr)
       {
          (*out)  = static_cast<unsigned char>(hex_to_bin[*(itr++)] << 4);
          (*out) |= static_cast<unsigned char>(hex_to_bin[*(itr++)]     );
          ++out;
       }
       return true;
    }
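One thing to be aware of with the table above: every character that is not a hex digit maps to 0x00, so invalid input is decoded silently rather than rejected. If the input is not trusted, a checked variant may be preferable. The following is a sketch only; hexDigitValue and decodeHexChecked are hypothetical names and are not part of Strtk.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit, or -1 if c is not 0-9, A-F or a-f.
int hexDigitValue(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: decodes hex text into out. Returns false (leaving out
// empty) on odd length or on any character that is not a hex digit.
bool decodeHexChecked(const std::string& hex, std::vector<unsigned char>& out)
{
    out.clear();
    if (hex.size() % 2 != 0)
        return false;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int high = hexDigitValue(static_cast<unsigned char>(hex[i]));
        const int low  = hexDigitValue(static_cast<unsigned char>(hex[i + 1]));
        if (high < 0 || low < 0) {
            out.clear();
            return false;
        }
        // Combine the two 4-bit values into one byte, high nibble first.
        out.push_back(static_cast<unsigned char>((high << 4) | low));
    }
    return true;
}
```

The trade-off is a few comparisons per character instead of a single table lookup, in exchange for an explicit failure on input such as "4g".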
Beh Tou Cheh