Hey everyone,

  1. I have some misconceptions about measuring FLOPS. On Intel architecture, is a FLOP one addition and one multiplication together? I read this somewhere online and have found nothing that contradicts it. I know that FLOP has a different meaning on different types of CPU.

  2. How do I calculate my theoretical peak FLOPS? I am using an Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz. What exactly is the relationship between GHz and FLOPS? (Even Wikipedia's entry on FLOPS does not specify how to do this.)

  3. I will be using the following method to measure the actual performance of my computer (in terms of FLOPS): the inner product of two vectors. For two vectors of size N, is the number of flops 2N(N - 1) (if one addition or one multiplication is counted as 1 flop)? If not, how should I go about calculating this?

I know there are better ways to do this, but I would like to know whether my proposed calculations are right. I read somewhere about LINPACK as a benchmark, but I would still like to know how it's done.

+1  A: 

This article shows some theory on FLOPS numbers for x86 CPUs. It's only current up to Pentium 4, but perhaps you can extrapolate.

unwind
+1  A: 

FLOP stands for floating-point operation.

It means the same on any architecture that supports floating-point operations, and performance is usually quoted as the number of operations that can be completed in one second (FLOPS: floating-point operations per second).

Here you can find tools to measure your computer's FLOPS.
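As a rough illustration of what such a measurement involves, here is a minimal Python sketch for the inner product from the question. It assumes each addition and each multiplication counts as one flop, so two vectors of length N need N multiplications and N - 1 additions, i.e. 2N - 1 flops. The names `dot` and `dot_flops` and the vector length are made up for the example, and interpreted Python will land far below the hardware peak, so treat the result as a demonstration of the bookkeeping rather than a benchmark.

```python
import time

def dot(x, y):
    """Inner product of two equal-length, non-empty sequences."""
    total = x[0] * y[0]                 # first multiplication, no addition yet
    for a, b in zip(x[1:], y[1:]):
        total += a * b                  # one multiplication and one addition
    return total

def dot_flops(n):
    """Flop count for vectors of length n: n multiplies plus n - 1 adds."""
    return 2 * n - 1

n = 100_000                             # illustrative vector length
x = [1.0] * n
y = [2.0] * n

start = time.perf_counter()
result = dot(x, y)
elapsed = time.perf_counter() - start

# Measured rate = counted flops / wall-clock time
print(f"{dot_flops(n) / elapsed / 1e6:.1f} MFLOPS")
```

Repeating the timing several times and keeping the best run gives a steadier number, since a single run is easily disturbed by whatever else the machine is doing.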

dsm
A: 

Intel's data sheets contain GFLOPS numbers, and your processor has a claimed 22.4 GFLOPS:

http://www.intel.com/support/processors/sb/CS-023143.htm

Since your machine is dual core, that means 11.2 GFLOPS per core at 2.8 GHz. Dividing 11.2 GFLOPS by 2.8 GHz gives 4, so Intel claims that each core can do 4 floating-point operations per cycle.
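The same division written out in Python, just to make the units explicit. The variable names are only for illustration; the 22.4 GFLOPS figure is the one claimed on the Intel page linked above.

```python
# Figures taken from this answer; names are illustrative only.
claimed_gflops = 22.4   # Intel's claimed peak for the whole chip
cores = 2               # the E7400 is dual core
clock_ghz = 2.8         # clock rate in GHz (10^9 cycles per second)

gflops_per_core = claimed_gflops / cores
flops_per_cycle = gflops_per_core / clock_ghz   # GFLOPS / GHz = operations per cycle

print(f"{gflops_per_core:.1f} GFLOPS per core")
print(f"{flops_per_cycle:.1f} floating-point operations per cycle per core")
```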

Andrew
+6  A: 

As for your 2nd question, the theoretical FLOPS calculation isn't too hard. It can be broken down into roughly:

(Number of cores) * (Number of execution units / core) * (cycles / second) * (Execution unit operations / cycle) * (floats-per-register / Execution unit operation)

A Core 2 Duo has 2 cores and 1 execution unit per core. An SSE register is 128 bits wide and a float is 32 bits wide, so one register holds 4 floats. I assume the execution unit does 1 SSE operation per cycle, with the clock rate taken in GHz. So it should be:

2 * 1 * 2.8 * 1 * 4 = 22.4 GFLOPS

which matches: http://www.intel.com/support/processors/sb/cs-023143.htm

This number is obviously a purely theoretical best case. Real-world performance will most likely not come close to it, for a variety of reasons. It's probably not worth trying to directly correlate FLOPS with actual application runtime; you'd be better off timing the computations your application actually uses.
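If you want to play with the factors, the formula is easy to wrap in a small Python function. This is only a sketch of the arithmetic above: `peak_gflops` and its parameter names are made up for the example, and the values plugged in are the assumptions stated in this answer (1 SSE operation per cycle, single-precision floats).

```python
def peak_gflops(cores, units_per_core, clock_ghz, ops_per_cycle, floats_per_op):
    """Theoretical peak in GFLOPS: the product of the five factors in the formula."""
    return cores * units_per_core * clock_ghz * ops_per_cycle * floats_per_op

# A 128-bit SSE register holds 128 / 32 = 4 single-precision floats
floats_per_register = 128 // 32

# Core 2 Duo E7400 with the values assumed in this answer
estimate = peak_gflops(cores=2, units_per_core=1, clock_ghz=2.8,
                       ops_per_cycle=1, floats_per_op=floats_per_register)
print(f"{estimate:.1f} GFLOPS")
```

Changing `ops_per_cycle` or `floats_per_op` (for example, 2 doubles per 128-bit register instead of 4 floats) shows how sensitive the headline number is to those assumptions.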

Falaina
That's exactly what I need, thank you so much. BTW, where did you find that equation?
confused
The Core2 can actually issue an SSE multiply and add each cycle, so the calculation for single precision FLOPS is 2*1*2.8*2*4 = 44.8 GFLOPS; I believe that the Intel link is listing double-precision FLOPS (2*1*2.8*2*2 = 22.4).
Stephen Canon