I have the following structures defined (names are anonymised, but data types are correct):
    Public Type ExampleDataItem
        Limit As Integer        ' could be any value 0-999
        Status As Integer       ' could be any value 0-2
        ValidUntil As Date      ' always a valid date
    End Type

    Public Type ExampleData
        Name As String          ' could be 5-20 chars long
        ValidOn As Date         ' could be valid date or 1899-12-30 representing "null"
        Salt As Integer         ' random value 42-32767
        Items(0 To 13) As ExampleDataItem
    End Type
I would like to generate a 32-bit hash code for an ExampleData instance. Minimising hash collisions is important; performance and data order are not.
So far I have got (in pseudocode):
- Serialise all members into one byte array.
- Loop through the byte array, reading 4 bytes at a time into a Long value.
- XOR all the Long values together.
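Since the real code depends on the serialisation utilities, here is only a rough sketch of the XOR step from the list above. XorFold is a hypothetical name, and Data() is assumed to be the byte array the serialisation step has already produced. It XORs byte-by-byte into four accumulators, which should give the same result as XORing whole little-endian Long values while sidestepping signed-overflow errors in VB6.

```vb
' Sketch only. XorFold is a hypothetical name, and Data() is assumed to be
' the byte array produced by the serialisation step (not shown).
Public Function XorFold(ByRef Data() As Byte) As Long
    Dim Lane(0 To 3) As Byte   ' one XOR accumulator per byte position in a Long
    Dim i As Long
    Dim Slot As Long

    ' XORing each byte into its lane is equivalent to XORing
    ' successive 4-byte (little-endian) Long values together.
    For i = LBound(Data) To UBound(Data)
        Slot = (i - LBound(Data)) Mod 4
        Lane(Slot) = Lane(Slot) Xor Data(i)
    Next i

    ' Assemble the four lanes into one Long. The top bit is set separately
    ' because Long is signed and 128 * &H1000000 would overflow.
    XorFold = Lane(0) Or (Lane(1) * &H100&) Or (Lane(2) * &H10000) _
              Or ((Lane(3) And &H7F) * &H1000000)
    If (Lane(3) And &H80) <> 0 Then XorFold = XorFold Or &H80000000
End Function
```

If the array length is not a multiple of 4, this sketch effectively treats the missing trailing bytes as zero.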
I can't really post my code because it's heavily dependent on utility classes to do the serialisation, but if anyone wants to see it regardless then I will post it.
Will this be OK, or can anyone suggest a better way of doing it?
EDIT:
This code is being used to implement part of a software licensing system. The purpose of the hash is to confirm whether the data entered by the end user equals the data entered by the tech support person. The hash must therefore:
- Be very short. That's why I thought 32 bits would be most suitable, because it can be rendered as a 10-digit decimal number on screen. This is easy, quick and unambiguous to read over the telephone and type in.
- Be derived from all the fields in the data structure, with no extra artificial keys or any other trickery.
The hash is not required for lookup, uniqueness testing, or to store ExampleData instances in any kind of collection, but only for the one purpose described above.
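For the 10-digit on-screen rendering mentioned above, something along these lines is what I have in mind. HashToDecimal is a hypothetical helper name; it assumes the hash is held in a signed Long and should be displayed as an unsigned value padded with leading zeros.

```vb
' Sketch only. HashToDecimal is a hypothetical helper that renders a
' signed 32-bit Long as a zero-padded, 10-digit unsigned decimal string.
Public Function HashToDecimal(ByVal Hash As Long) As String
    Dim Unsigned As Double

    ' A Double holds every 32-bit value exactly, so widen first and
    ' then shift negative values up by 2^32 into the unsigned range.
    Unsigned = Hash
    If Unsigned < 0 Then Unsigned = Unsigned + 4294967296#

    ' Pad to 10 digits, the width of the largest 32-bit value (4294967295).
    HashToDecimal = Format$(Unsigned, "0000000000")
End Function
```

Padding to a fixed width keeps every code the same length, which makes it easier to check that nothing was dropped when it is read out over the telephone.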