ansaurus

Question

Answer 1

+2 A:

Hmm, with only 256 possible values, and since you will parse your source code to find all possible functions anyway, maybe the best way to do it would be to assign a number to each of your functions?

A real hash function probably won't work, because you have only 256 possible hashes but want to map at least 26^15 possible values (assuming letter-only, case-insensitive function names). Even if you restricted the number of possible strings (by applying some mandatory formatting), you would be hard pressed to get both meaningful names and a valid hash function.
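
To put rough numbers on this, here is a small Python sketch. It is only an illustration and assumes an ideal, uniformly distributed 8-bit hash and names of exactly 15 letters; the function name collision_probability is made up for the example. It compares the size of the name space with the 256 available hash values and estimates how likely a collision is for a given number of function names (the classic birthday problem).

```python
HASH_VALUES = 256        # an 8-bit hash has 2**8 possible outputs
NAME_SPACE = 26 ** 15    # letter-only, case-insensitive names of 15 characters


def collision_probability(n_names: int, n_hashes: int = HASH_VALUES) -> float:
    """Chance that at least two of n_names share a hash, for an ideal uniform hash."""
    if n_names > n_hashes:
        return 1.0  # pigeonhole principle: more names than hash values
    p_all_unique = 1.0
    for i in range(n_names):
        # The (i+1)-th name must avoid the i values that are already taken.
        p_all_unique *= (n_hashes - i) / n_hashes
    return 1.0 - p_all_unique


if __name__ == "__main__":
    print(f"possible names per hash value: {NAME_SPACE // HASH_VALUES:.3e}")
    for n in (10, 20, 50, 100):
        print(n, "names ->", round(collision_probability(n), 3))
```

Under these assumptions a collision becomes more likely than not at around 20 function names, well before the 256 values are used up.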

Ksempac 2009-08-05 13:02:19

Answer 2

+2 A:

No, there isn't.

You can't make a collision-free hash code, or even come close to one, with just an eight-bit hash. If you allow strings that are longer than one character, you have more possible strings than there are possible hash codes.

Why not just extract the function names and give each function name an id? Then you only need a lookup table on each side of the wire.

(As others have shown you can generate a hash algorithm without collisions if you already have all the function names, but then it's easier to just assign a number to each name to make a lookup table...)
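
A minimal sketch of the lookup-table idea in Python, assuming the full list of function names has already been extracted from the source. The function names and the helper build_id_table are hypothetical. Sorting the names before numbering is just one way to make both sides of the wire arrive at the same ids.

```python
def build_id_table(function_names):
    """Assign each distinct name a one-byte id; sorted so both sides agree."""
    names = sorted(set(function_names))
    if len(names) > 256:
        raise ValueError("more than 256 functions will not fit in one byte")
    name_to_id = {name: i for i, name in enumerate(names)}
    id_to_name = names  # the list index is the id
    return name_to_id, id_to_name


# Hypothetical function names, standing in for the ones extracted from the code.
extracted = ["init_device", "read_sensor", "write_log", "shutdown"]
name_to_id, id_to_name = build_id_table(extracted)

# Sender: put the one-byte id on the wire instead of the name.
payload = bytes([name_to_id["read_sensor"]])

# Receiver: look the name up again from the same table.
decoded = id_to_name[payload[0]]
```

Both sides have to be built from the same list of names; if a function is added, the table needs to be regenerated on both ends.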

Guffa 2009-08-05 13:05:26

Why the downvotes? If you don't say what it is that you don't like, it's really pointless.

Guffa 2009-08-05 15:18:20

Answer 3

+7 A:

Try minimal perfect hashing:

Minimal perfect hashing guarantees that n keys will map to 0..n-1 with no collisions at all.

C code is included.
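
To illustrate the property (not the algorithm from the linked C code, and not what gperf does), here is a brute-force sketch in Python: it tries seeds until a seeded hash maps the known keys onto 0..n-1 without collisions. The key names and the helpers seeded_hash and find_minimal_perfect_seed are assumptions for the example, and the search is only practical for small key sets.

```python
import hashlib


def seeded_hash(seed: int, key: str) -> int:
    """Deterministic hash of key that changes with seed (SHA-256 is used for convenience)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def find_minimal_perfect_seed(keys, max_seed=1_000_000):
    """Return a seed for which seeded_hash(seed, k) % n is distinct for every key."""
    n = len(keys)
    if len(set(keys)) != n:
        raise ValueError("keys must be distinct")
    for seed in range(max_seed):
        slots = {seeded_hash(seed, k) % n for k in keys}
        if len(slots) == n:  # n distinct values in 0..n-1 means no collisions
            return seed
    raise RuntimeError("no suitable seed found; try a larger max_seed")


# Hypothetical function names known in advance.
keys = ["init_device", "read_sensor", "write_log", "shutdown", "reset"]
seed = find_minimal_perfect_seed(keys)
ids = {k: seeded_hash(seed, k) % len(keys) for k in keys}
```

The chance that a random seed works shrinks quickly as n grows, which is why real minimal perfect hash generators use more structured constructions.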

Martin B 2009-08-05 13:06:29

Also see gperf: http://www.gnu.org/software/gperf/

Hasturkun 2009-08-05 13:18:49

That doesn't work without first getting all the function names.

Guffa 2009-08-05 15:20:42

Yes, you can only do perfect hashing if you know all of the strings in advance. If that's not the case, one approach is to use a hash table to handle the collisions, then transmit the index of the entry in the hash table.
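
A rough sketch of that fallback in Python, assuming a 256-slot open-addressing table with linear probing so that the slot index fits in one byte. The class NameTable and its methods are hypothetical, and zlib.crc32 is just a convenient deterministic hash, not a recommendation.

```python
import zlib

TABLE_SIZE = 256  # slot indices 0..255 fit in a single byte


class NameTable:
    """Open-addressing hash table; the slot index is what gets transmitted."""

    def __init__(self):
        self.slots = [None] * TABLE_SIZE

    def index_of(self, name: str) -> int:
        """Return the slot index for name, inserting it if it is new."""
        start = zlib.crc32(name.encode("utf-8")) % TABLE_SIZE
        for step in range(TABLE_SIZE):
            i = (start + step) % TABLE_SIZE  # linear probing on collision
            if self.slots[i] is None:
                self.slots[i] = name
                return i
            if self.slots[i] == name:
                return i
        raise OverflowError("table is full: more than 256 distinct names")

    def name_at(self, index: int):
        """Look up the name stored in a slot (None if the slot is empty)."""
        return self.slots[index]


table = NameTable()
first = table.index_of("read_sensor")   # inserts the name
again = table.index_of("read_sensor")   # finds the same slot
```

Because probing depends on insertion order, the receiving side needs either a copy of the table or the same sequence of insertions to turn an index back into a name.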

Martin B 2009-08-05 15:39:08

You might also be able to coax gperf or a similar tool into inlining the lookup at compile time, reducing the computation cost to 0.

Hasturkun 2009-08-05 16:35:14

Answer 4

+1 A:

DrJokepu 2009-08-05 13:10:27

Answer 5

+2 A:

If you have a way to track the functions within your code (i.e. a text file generated at run-time) you can just use the memory locations of each function. Not exactly a byte, but smaller than the entire name and guaranteed to be unique. This has the added benefit of low overhead. All you would need to 'decode' the address is the text file that maps addresses to actual names; this could be sent to the remote location or, as I mentioned, stored on the local machine.

ezpz 2009-08-05 13:14:44

This is how I'd do it. You should be able to use the debug information in the compiled binary to extract the function name, without needing an additional table.

Brooks Moses 2009-08-05 19:09:56


Hash function for short strings
