views: 127
answers: 2

I hope this has not been covered before, but if I compile a 32-bit program in C++ that uses 64-bit floating point numbers (double) and run it on a 64-bit OS, will it still take as many clock cycles to move the 64-bit float between the CPU and RAM as it would on a 32-bit OS, because it's compiled for 32 bits? Or would it take fewer clock cycles because the OS moves 64 bits at a time, even though the program was built with a 32-bit compiler? The reason I ask is that I'm using VS Express, which is 32-bit only, and I'm wondering whether I can use 64-bit floats while maintaining speed, or whether 32-bit floats will be faster even though I'm on a 64-bit OS. And trust me, the program I want to write will use tens of thousands of floating point numbers with many calculations and bitwise operations performed on them (I'm looking into neural networks).

Thank you.

A: 

The 32 vs 64 bits you're hearing about is how many bits are in the address. It has little to do with how many bits are used to represent a double. In particular, 32-bit programs still represent a double in 64 bits, and modern processors have hardware that can process 64-bit floats natively (even if they can only process 32-bit integers natively).

So to answer your question: no, it shouldn't matter. The speed of floating point operations should not depend on the 32- or 64-bitness of either the OS or the compiler.
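
For example, a quick check like the one below (an illustrative sketch, not part of the original answer) prints the same sizes for `float` and `double` whether it is built as a 32-bit or a 64-bit executable; only the pointer size changes.

```cpp
// Illustrative sketch: float/double sizes don't depend on 32- vs 64-bit builds.
#include <iostream>

int main() {
    std::cout << "sizeof(float)  = " << sizeof(float)  << '\n';  // 4 in both builds
    std::cout << "sizeof(double) = " << sizeof(double) << '\n';  // 8 in both builds
    std::cout << "sizeof(void*)  = " << sizeof(void*)  << '\n';  // 4 (32-bit) vs 8 (64-bit)
    return 0;
}
```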

Keith Randall
Oh, thank you. And if it's not too much to ask, could you point me in the right direction, like a book or something, for understanding how data is transferred between the processor and RAM?
JAKE6459
Data gets between the processor and RAM through a multitude of caches these days; check out http://en.wikipedia.org/wiki/CPU_cache for a primer.
Keith Randall
A: 

Doubles are slower than floats as a general rule.

A) It takes longer for the data to be moved in and out of memory. Buses are really wide nowadays, but it still takes more time to move more data.

B) Doubles take more time to compute, since the hardware tends to operate on groups of bits at a time when doing math, not all at once.

C) 4-byte floats have had more use and therefore have had more gates thrown at them. Note that SSE on x86 chips operates on 4 floats at once, usually in 4 clocks per instruction. There are some SSE instructions that use doubles, but only two at a time (see the sketch below).
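
As a rough illustration of that last point (a hedged sketch, not from the answer), the SSE2 intrinsics below add four packed floats with one `_mm_add_ps`, but only two packed doubles with one `_mm_add_pd`:

```cpp
// Hedged sketch: SSE adds 4 packed floats per instruction, but only 2 packed doubles.
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdio>

int main() {
    float  fa[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float  fb[4] = {0.5f, 0.5f, 0.5f, 0.5f};
    double da[2] = {1.0, 2.0};
    double db[2] = {0.5, 0.5};

    // One _mm_add_ps adds four single-precision values at once...
    __m128  fsum = _mm_add_ps(_mm_loadu_ps(fa), _mm_loadu_ps(fb));
    // ...while one _mm_add_pd adds only two double-precision values.
    __m128d dsum = _mm_add_pd(_mm_loadu_pd(da), _mm_loadu_pd(db));

    float  fout[4];
    double dout[2];
    _mm_storeu_ps(fout, fsum);
    _mm_storeu_pd(dout, dsum);

    std::printf("floats : %g %g %g %g\n", fout[0], fout[1], fout[2], fout[3]);
    std::printf("doubles: %g %g\n", dout[0], dout[1]);
    return 0;
}
```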

However, if you need the extra precision, then use doubles, and eat the performance cost.

You may want to look into GPGPU for crunching large datasets. Only the latest hardware does doubles, but general GPU float performance is outstanding.

Tim
Note that this is very, very dubious. Common hardware such as x86 (x87) works on doubles, not floats, and therefore doubles are faster. Sure, SSE can execute one instruction on 4 floats at a time, compared to 2 doubles. But if you only have one float or double, it doesn't matter whether SSE is 75% or 50% idle. Realistically, use doubles for the calculations, but store the end results as `float[]`.
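
A minimal sketch of that suggestion (illustration only; the function name and the running-sum computation are made up): do the arithmetic in `double`, then narrow the stored results to `float`.

```cpp
// Hypothetical example: accumulate in double precision, store results as float.
#include <cstddef>
#include <vector>

std::vector<float> running_scaled_sums(const std::vector<float>& in, float scale) {
    std::vector<float> out;
    out.reserve(in.size());
    double running = 0.0;                            // do the math in double...
    for (std::size_t i = 0; i < in.size(); ++i) {
        running += static_cast<double>(in[i]) * scale;
        out.push_back(static_cast<float>(running));  // ...but store each result as float
    }
    return out;
}
```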
MSalters
I have been looking into CUDA. I have a GTS 250, so it should do rather well if I take that route.
JAKE6459