views:

261

answers:

3

I have two hex strings, accompanied by masks, that I would like to merge into a single string value/mask pair. The strings may have bytes that overlap but after applying masks, no overlapping bits should contradict what the value of that bit must be, i.e. value1 = 0x0A mask1 = 0xFE and value2 = 0x0B, mask2 = 0x0F basically says that the resulting merge must have the upper nibble be all '0's and the lower nibble must be 01011

I've done this already using straight c, converting strings to byte arrays and memcpy'ing into buffers as a prototype. It's tested and seems to work. However, it's ugly and hard to read and doesn't throw exceptions for specific bit requirements that contradict. I've considered using bitsets, but is there another way that might not demand the conversion overhead? Performance would be nice, but not crucial.


EDIT: More detail, although writing this makes me realize I've made a simple problem too difficult. But, here it is, anyway.

I am given a large number of inputs that are binary searches of a mixed-content document. The document is broken into pages, and pages are provided by an api the delivers a single page at a time. Each page needs to be searched with the provided search terms.

I have all the search terms prior to requesting pages. The input are strings representing hex digits (this is what I mean by hex strings) as well a mask to indicate bits that are significant in the input hex string. Since I'm given all input up-front I wanted to improve the search of each page returned. I wanted to pre-process merge these hex strings together. To make the problem more interesting, every string has an optional offset into the page where they must appear and a lack of an offset indicates that the string can appear anywhere in a page requested. So, something like this:

class Input {
  public:
    int input_id;
    std::string value;
    std::string mask;
    bool offset_present;
    unsigned int offset;
};

If a given Input object has offset_present = false, then any value assigned to offset is ignored. If offset_present is false, then it clearly can't be merged with other inputs.

To make the problem more interesting, I want to report an output that provides information about what was found (input_id that was found, where the offset was, etc). Merging some input (but not others) makes this a bit more difficult.

I had considered defining a CompositeInput class and was thinking about the underlying merger be a bitset, but further reading about about bitsets made me realize it wasn't what I really thought. My inexperience made me give up on the composite idea and go brute force. I necessarily skipped some details about other input types an additional information to be collected for the output (say, page number, parag. number) when an input is found. Here's an example output class:

class Output {
  public:
    Output();
    int id_result;
    unsigned int offset_result;
};

I would want to product N of these if I merge N hex strings, keeping any merger details hidden from the user.

A: 

it sounds like |, & and ~ would work?

Will
+1  A: 

I don't know what a hexstring is... but other than that it should be like this:

 outcome = (value1 & mask1) | (value2 & mask2);
Toad
A: 
const size_t prefix = 2; // "0x"
const size_t bytes  = 2;
const char* value1 = "0x0A";
const char* mask1  = "0xFE";
const char* value2 = "0x0B";
const char* mask2  = "0x0F";
char output[prefix + bytes + 1] = "0x";

uint8_t char2int[] = { /*zeroes until index '0'*/ 0,1,2,3,4,5,6,7,8,9 /*...*/ 10,11,12,13,14,15 };
char int2char[] = { '0', /*...*/ 'F' };

for (size_t ii = prefix; ii != prefix + bytes; ++ii)
{
    uint8_t result1 = char2int[value1[ii]] & char2int[mask1[ii]];
    uint8_t result2 = char2int[value2[ii]] & char2int[mask2[ii]];
    if (result1 & result2)
        throw invalid_argument("conflicting bits");
    output[ii] = int2char[result1 | result2];
}
John Zwinck