views:

55

answers:

2

Hi everyone, I'm facing the same problem than Eduardo (http://stackoverflow.com/questions/539311/generate-a-hash-sum-for-several-integers) but mine is a little bit different as said in the title.

I have four 32bits integers and I need to generate a 64bits unique key. What I have done for now is to generate a string concatenation of the fours integers separated by a '/' and then, generate a CRC with the string.

char id_s[64];
sprintf(id_s, "%d/%d/%d/%d", a, b, c, d);
uint64_t id = CRC(id_s);

But the problem is that I have to do it several million times so it appears to be very CPU consuming. So I was thinking about directly storing the four integers into a single integer.

This can be done easily if the four integers where 16bits integers. It could just be done using bit shift operator.

uint64_t id = a << 48 + b << 32 + c << 16 + d;

With four 32bits integers, I have to put 128bits into a single 64bits integer.

Does someone have any suggestion?

A: 

Depending on the nature of your input data, something almost like what you suggested might work just fine:

uint64_t id = static_cast<uint64_t>(a & 0xFFFFu) << 48 + static_cast<uint64_t>(b & 0xFFFFu) << 32 + static_cast<uint64_t>(c & 0xFFFFu) << 16 + static_cast<uint64_t>(d & 0xFFFFu);

As long as the upper bits of the values are fairly constant and the lower bits relatively random this should get you close. You're trying to cram 128 bits of data into 64 bits so you have to throw away data somewhere. It's just a matter of which bits to discard, and how you do it.

Mark B
Yup, I already thought about throwing away some bits since some input integers just take small values. Two of them can be stored in an 16bits and the two others in 48bits but still remaining 16bits.
+1  A: 

I think your best bet would be to use xor:

  uint64_t makeId(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
  {
     uint64_t id = a;
     id <<=11;
     id ^= b;
     id <<=11;
     id ^= c;
     id <<=10;
     id ^=d;

     return id;
  }

This will work fairly well if your inputs are well spread and use all 32 bits. Like Mark said, you can't turn 128bits into 64bits without duplication.

Dolphin
+1: much better than MarkB because discarding bits means higher chances of hash collision.
Matthieu M.