Float versus Integer arithmetic performance on modern chips

+1 Q:

Consider a Viterbi decoder on an additive model. It spends its time doing additions and comparisons. Now consider two versions: one with C/C++ float as the data type, and another with int. On modern chips, would you expect the int version to run significantly faster than the float version? Or will the wonders of pipelining (and the absence of multiplication and division) make it all come out about even?
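To make the comparison concrete, here is a minimal sketch of the add-compare-select inner loop such a decoder spends its time in, templated on the metric type so the same code can be built for float and for int. The function name `acs_step` and the dense `trans` matrix layout are assumptions made for illustration, not part of any particular decoder.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One add-compare-select (ACS) step on an additive (e.g. log-domain) model.
// Metric is the data type under test: float or int.
// prev[i]     : best path metric ending in state i at the previous step
// trans[i][j] : additive cost of moving from state i to state j
// emit[j]     : additive cost of the current observation in state j
template <typename Metric>
std::vector<Metric> acs_step(const std::vector<Metric>& prev,
                             const std::vector<std::vector<Metric>>& trans,
                             const std::vector<Metric>& emit) {
    const std::size_t n = prev.size();
    std::vector<Metric> next(n);
    for (std::size_t j = 0; j < n; ++j) {
        // Only additions and comparisons: no multiply, no divide.
        Metric best = prev[0] + trans[0][j];
        for (std::size_t i = 1; i < n; ++i) {
            best = std::max(best, prev[i] + trans[i][j]);
        }
        next[j] = best + emit[j];
    }
    return next;
}
```

Instantiating `acs_step<float>` and `acs_step<int>` on the same data and timing both on the target machine is probably the most reliable way to answer the question for a specific chip and compiler.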

+1 A:

Depends on what you mean by significantly. I usually expect to see ints perform about 2x faster, but it all depends on what else is going on. Modern processors that can handle the AMD64 (AMD/Core2) instruction set can usually do effectively 1 float operation per cycle if they can keep the pipeline fed.

They can also usually do 2 or 3 integer operations in the same amount of time, and can even do both at once.

But it's not that hard to write code that stalls the pipeline. You have to avoid using the result of a calculation immediately after it's computed, or the pipeline will stall and you get more like 3 cycles per multiply rather than 1.
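As a rough illustration of that point, the sketch below sums an array two ways: once with a single accumulator, where every addition has to wait for the previous one, and once with four independent accumulators, so consecutive additions do not depend on each other. The names `sum_chained` and `sum_split` are made up for this example, and whether the second form is actually faster depends on the compiler, its flags, and the chip, so measure before relying on it.

```cpp
#include <cstddef>
#include <vector>

// Single dependency chain: each += needs the result of the previous one.
float sum_chained(const std::vector<float>& v) {
    float acc = 0.0f;
    for (float x : v) {
        acc += x;
    }
    return acc;
}

// Four independent chains: the additions inside one iteration do not
// depend on each other, so they can overlap in the pipeline.
// Note: float addition is not associative, so for general input the result
// can differ from sum_chained in the last bits.
float sum_split(const std::vector<float>& v) {
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    for (; i < n; ++i) {  // leftover elements when n is not a multiple of 4
        a0 += v[i];
    }
    return (a0 + a1) + (a2 + a3);
}
```

The same restructuring applies to an int accumulator, where reordering the additions does not change the result (as long as nothing overflows).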

Instructions per cycle on the PowerPC are the same as or better than on AMD/Intel in most cases.

Addendum:

By the way, you may discover that the comparisons (or rather the branches that the comparisons imply) end up costing a lot more than the additions. Mispredicted branches are expensive, especially on the Pentium 4 processor.

John Knoeller 2010-01-06 01:08:28

Some compilers implement comparisons using the `SETcc` instructions rather than the `Jcc` instructions, in which case no branching is involved and you don't get the branch misprediction penalty.

Chris Jester-Young 2010-01-09 17:14:21
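For illustration, here is a sketch of the compare-select written with and without a branch in the source. The helper names `max_branchy` and `max_branchless` are invented for this example. Nothing here guarantees which instructions come out: a compiler may turn the first version into branch-free code on its own, or the second into something else entirely, so check the generated assembly for your compiler and flags.

```cpp
#include <cstdint>

// Written with an explicit branch. The compiler may or may not emit a jump.
inline std::int32_t max_branchy(std::int32_t a, std::int32_t b) {
    if (a > b) {
        return a;
    }
    return b;
}

// Written without a branch in the source: the comparison result (0 or 1)
// is turned into a mask of all zeros or all ones (two's complement assumed)
// and used to select a or b with bitwise operations.
inline std::int32_t max_branchless(std::int32_t a, std::int32_t b) {
    const std::int32_t take_a = -static_cast<std::int32_t>(a > b);
    return (a & take_a) | (b & ~take_a);
}
```

The mask trick works directly on int only; for float the usual starting point is `std::max` or a conditional expression, leaving the choice of instruction to the compiler.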
