ansaurus

Question

Most Efficient Unicode Hash Function for Delphi 2009

Answer 1

+6 A:

ASM output is not a good indication of algorithm speed. Also, from what I can see, the two pieces of code are doing almost identical work. The biggest differences seem to be the memory access strategy and that the first uses rotate-left (rol) instead of the equivalent pair of instructions (shl | shr -- most higher-level programming languages leave out the rotate operators). The latter may pipeline better than the former.

ASM optimization is black magic and sometimes more instructions execute faster than fewer.

To be sure, benchmark both and pick the winner. If you like the output of the second but the first is faster, plug the second's values into the first.

rol edx,5 { edx := (edx shl 5) or (edx shr 27)... }
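The equivalence in that comment can be sketched in C, which has no rotate operator either, so the shift-and-or form has to be written out. The name rol32 is made up for illustration. Most optimizing compilers recognise this pattern and emit a single rol, but check the generated code rather than take that on faith.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is taken modulo 32).
   rol32 is a hypothetical helper name, not part of any library. */
static uint32_t rol32(uint32_t x, unsigned n)
{
    n &= 31;
    /* The second shift count is masked so that n == 0 never shifts by 32,
       which would be undefined behaviour in C. */
    return (x << n) | (x >> ((32 - n) & 31));
}
```

With n = 5 this is exactly the (edx shl 5) or (edx shr 27) form shown above.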

Note that different machines will run the code in different ways, so if speed is REALLY of the essence then benchmark it on the hardware that you plan to run the final application on. I'm willing to bet that over megabytes of data the difference will be a matter of milliseconds -- which is far less than the operating system is taking away from you.
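A minimal timing harness for that kind of comparison might look like the sketch below, written in C for brevity. Both hash functions are stand-ins I made up (a rotate-xor loop and a multiply-by-31 loop); they are not the two routines from the question, so substitute your own candidates.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*hash_fn)(const uint16_t *s, size_t len);

/* Hypothetical candidate A: rotate left by 5, then xor in each UTF-16 unit. */
static uint32_t hash_rol(const uint16_t *s, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++) {
        h = (h << 5) | (h >> 27);
        h ^= s[i];
    }
    return h;
}

/* Hypothetical candidate B: multiply by 31 and add each UTF-16 unit. */
static uint32_t hash_mul(const uint16_t *s, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + s[i];
    return h;
}

/* Call f `rounds` times and return the elapsed CPU time in seconds.
   The results are folded into *sink to make it harder for the compiler
   to discard the calls as dead code. */
static double time_hash(hash_fn f, const uint16_t *s, size_t len,
                        unsigned long rounds, uint32_t *sink)
{
    clock_t start = clock();
    uint32_t acc = 0;
    for (unsigned long r = 0; r < rounds; r++)
        acc ^= f(s, len) + (uint32_t)r;
    *sink = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Run each candidate over keys that look like your real data, several times, on the target hardware, and compare the totals rather than a single run.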

PS. I'm not convinced this algorithm creates an even distribution, something you explicitly called out (have you run the histograms?). You may look at porting this hash function to Delphi. It may not be as fast as the above algorithm, but it appears to be quite fast and also gives good distribution. Again, we're probably talking on the order of milliseconds of difference over megabytes of data.
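Running the histograms could look something like this rough C sketch. Everything in it is an assumption for illustration: the bucket count of 256, the generated digit-string keys, and the rotate-xor hash_rol, which stands in for whatever function you are testing. Real keys from your application would be a better test set.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUCKETS 256

/* Hypothetical hash under test: rotate left by 5, xor in each UTF-16 unit. */
static uint32_t hash_rol(const uint16_t *s, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++) {
        h = (h << 5) | (h >> 27);
        h ^= s[i];
    }
    return h;
}

/* Hash `count` distinct digit-string keys and count the hits per bucket.
   Each key is the decimal form of k, least significant digit first. */
static void fill_histogram(unsigned count, unsigned hist[BUCKETS])
{
    memset(hist, 0, BUCKETS * sizeof hist[0]);
    for (unsigned k = 0; k < count; k++) {
        uint16_t key[16];
        size_t len = 0;
        unsigned v = k;
        do {
            key[len++] = (uint16_t)('0' + v % 10);
            v /= 10;
        } while (v);
        hist[hash_rol(key, len) % BUCKETS]++;
    }
}

/* Chi-squared statistic against a perfectly uniform spread. */
static double chi_squared(const unsigned hist[BUCKETS], unsigned count)
{
    double expected = (double)count / BUCKETS;
    double chi = 0.0;
    for (int i = 0; i < BUCKETS; i++) {
        double d = (double)hist[i] - expected;
        chi += d * d / expected;
    }
    return chi;
}
```

For a well-behaved hash the statistic should land somewhere near BUCKETS - 1; a value many times larger suggests the keys are clumping into a few buckets.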

Talljoe 2009-06-17 04:02:26

I can't agree with this enough. On modern processors, trying to hand-optimize assembler is very nearly if not actually a thing of the past.

Lee 2009-06-17 04:10:01

I do appreciate your ideas. I don't really intend to try to go crazy optimizing the assembler code. But I would like to eliminate obvious overhead. One run of my program can call the hash function hundreds of millions of times, as it's used for almost everything.

lkessler 2009-06-17 04:42:16

@lkessler, There isn't much overhead to eliminate here. You'll probably find greater optimizations figuring out places to cache the value than you will squeeze out of a couple of microseconds of execution in the hash function. When you profile your application and see that most of your time is being spent in the hash method there are two options -- optimize the hash function (not much further to go) or figure out how to call it less. Your best bet right now is the latter.
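One way to call it less is to compute the hash once per key and keep it alongside the string. A C sketch of the idea, where the hashed_key struct, the hash_calls counter and the rotate-xor hash are all hypothetical names invented for this example:

```c
#include <stddef.h>
#include <stdint.h>

static unsigned long hash_calls; /* counts real computations, for demonstration */

/* Hypothetical hash: rotate left by 5, xor in each UTF-16 unit. */
static uint32_t hash_rol(const uint16_t *s, size_t len)
{
    uint32_t h = 0;
    hash_calls++;
    for (size_t i = 0; i < len; i++) {
        h = (h << 5) | (h >> 27);
        h ^= s[i];
    }
    return h;
}

/* A key that remembers its own hash after the first request. */
typedef struct {
    const uint16_t *chars;
    size_t len;
    uint32_t hash;
    int hash_valid; /* 0 until the hash has been computed */
} hashed_key;

static uint32_t key_hash(hashed_key *k)
{
    if (!k->hash_valid) {
        k->hash = hash_rol(k->chars, k->len);
        k->hash_valid = 1;
    }
    return k->hash;
}
```

This only pays off if the same key object is looked up repeatedly, and the cached value has to be invalidated whenever the string changes.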

Talljoe 2009-06-17 05:18:21

I found this: http://landman-code.blogspot.com/2008/06/superfasthash-from-paul-hsieh.html

lkessler 2009-06-17 05:27:45

Answer 2

+4 A:

There has been a bit of discussion in the Delphi/BASM forum that may be of interest to you. Have a look at the following:

http://forums.embarcadero.com/thread.jspa?threadID=13902&tstart=0

PhiS 2009-06-17 12:45:49

Answer 3

+5 A:

We held a nice little contest a while back, improving on a hash called "MurmurHash". Quoting Wikipedia:

It is noted for being exceptionally fast, often two to four times faster than comparable algorithms such as FNV, Jenkins' lookup3 and Hsieh's SuperFastHash, with excellent distribution, avalanche behavior and overall collision resistance.
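For reference, here is a C sketch of the 32-bit MurmurHash2 variant as I understand Austin Appleby's public-domain reference code. The function name murmur2_32 is my own, the code reads four bytes at a time in native byte order (so results differ between little-endian and big-endian machines), and it should be checked against the reference source before you rely on it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit MurmurHash2 sketch; murmur2_32 is a hypothetical name. */
static uint32_t murmur2_32(const void *key, size_t len, uint32_t seed)
{
    const uint32_t m = 0x5bd1e995u; /* mixing constant */
    const int r = 24;               /* mixing shift */
    const unsigned char *data = (const unsigned char *)key;
    uint32_t h = seed ^ (uint32_t)len;

    /* Mix four bytes at a time into the hash. */
    while (len >= 4) {
        uint32_t k;
        memcpy(&k, data, 4); /* avoids unaligned-access problems */
        k *= m;
        k ^= k >> r;
        k *= m;
        h *= m;
        h ^= k;
        data += 4;
        len -= 4;
    }

    /* Handle the last one to three bytes. */
    switch (len) {
    case 3: h ^= (uint32_t)data[2] << 16; /* fall through */
    case 2: h ^= (uint32_t)data[1] << 8;  /* fall through */
    case 1: h ^= data[0];
            h *= m;
    }

    /* Final avalanche so the last bytes affect every bit. */
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return h;
}
```

For a Delphi 2009 UnicodeString you would pass the character data and a byte length of Length(s) * 2, since each character is a 16-bit unit.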

You can download the submissions for that contest here.

One thing we learned was that sometimes optimizations don't improve results on every CPU. My contribution was tweaked to run well on AMD, but performed not so well on Intel. The reverse happened too (Intel optimizations running suboptimally on AMD).

So, as Talljoe said: measure your optimizations, as they might actually be detrimental to your performance!

As a side-note: I don't agree with Lee; Delphi is a nice compiler and all, but sometimes I see it generating code that just isn't optimal (even when compiling with all optimizations turned on). For example, I regularly see it clearing registers that had already been cleared just two or three statements before. Or EAX is put into EBX, only to have it shifted and put back into EAX. That sort of thing. I'm just guessing here, but hand-optimizing that sort of code will surely help in tight spots.

Above all, though: first analyze your bottleneck, then see if a better algorithm or data structure can be used, then try to optimize the Pascal code (e.g. reduce memory allocations; avoid reference counting, finalization, try/finally and try/except blocks, etc.), and then, only as a last resort, optimize the assembly code.

PatrickvL 2009-06-21 22:01:39

Answer 4

+4 A:

I've written two assembly "optimized" functions in Delphi, or rather implemented known fast hash algorithms in both fine-tuned Pascal and Borland Assembler. The first was an implementation of SuperFastHash, and the second was a MurmurHash2 implementation, triggered by a request from Tommi Prami on my blog to translate my C# version to Pascal. This spawned a discussion, continued on the Embarcadero Discussion BASM forums, that in the end resulted in about 20 implementations (check the latest benchmark suite). It ultimately showed that it would be difficult to select the best implementation, due to the big differences in cycle times per instruction between Intel and AMD.

So, try one of those, but remember: getting the fastest every time would probably mean changing the algorithm to a simpler one, which would hurt your distribution. Fine-tuning an implementation takes lots of time, and you had better create a good validation and benchmarking suite to check your implementations.

Davy Landman 2009-07-17 13:06:11

Davy: It's nice to hear from the person who did the work. I noted your implementation in my comment to Talljoe's answer, and the discussion was pointed out by PhiS. It looks like SuperFastHash has a lot of code, especially when you compare it to the six lines of Pascal in the HashOf function of my question. I'm wondering what would make SuperFastHash faster than HashOf, and if it is faster, then by how much?

lkessler 2009-07-17 14:08:23

@lkessler: your questions all point to what has been mentioned in every answer: create a benchmarking program to simulate your expected usage of the hash function, measure both speed and distribution, and you might find the reason why SuperFastHash/MurmurHash2 are probably slower than HashOf. For small strings (10 chars) I would *expect* HashOf to be faster; for larger strings the other functions have unrolled loops to take advantage of.
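To illustrate what unrolling buys on longer strings, here is a C sketch of the same simple hash written both ways. The rotate-xor hash is a stand-in I chose for the example; it is not HashOf, SuperFastHash or MurmurHash2, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One mixing step: rotate left by 5, then xor in a UTF-16 unit. */
static uint32_t step(uint32_t h, uint16_t c)
{
    return ((h << 5) | (h >> 27)) ^ c;
}

/* Plain loop: one loop test and branch per character. */
static uint32_t hash_simple(const uint16_t *s, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = step(h, s[i]);
    return h;
}

/* Unrolled by four: one loop test and branch per four characters,
   followed by a short loop for the zero to three left over. */
static uint32_t hash_unrolled(const uint16_t *s, size_t len)
{
    uint32_t h = 0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        h = step(h, s[i]);
        h = step(h, s[i + 1]);
        h = step(h, s[i + 2]);
        h = step(h, s[i + 3]);
    }
    for (; i < len; i++)
        h = step(h, s[i]);
    return h;
}
```

Both return the same value for every input. On a 10-character string the unrolled body runs only twice, so the setup cost can outweigh the saving, which is why short keys may favour the simple loop.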

Davy Landman 2009-07-17 21:25:23
