I want to know the relative performances of a normal C++ application in the following scenarios:

  1. Built as 32-bit app, run on Intel 64-bit processor (x86-64)
  2. Built as 32-bit app, run on Intel 32-bit processor (x86)
  3. Built as 64-bit app.

Also, what factors should I consider when modifying / developing the application to make it to run faster on 64-bit processors?

+3  A: 

The performance will very likely depend on your application, and can vary a lot, depending on whether or not you use libraries that have optimizations for 64-bit environments. If you want a dependable speed-up, you should focus on improving your algorithms rather than on the instruction set architecture.

As for preparing/developing for 64-bit... the key thing is to not make assumptions with regard to types and their respective sizes. If you need a type with a specific size, use the types defined in <stdint.h>. Whenever you see functions that use size_t or ptrdiff_t, you should use the typedefs rather than some other type.
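To make the point about types concrete, here is a minimal sketch. The function names `sum_all` and `distance_between` are made up for illustration, and `<cstdint>` is used as the C++ counterpart of `<stdint.h>`. It uses fixed-width types where the size matters, and `size_t` / `ptrdiff_t` for indices and pointer differences, so the same source behaves the same way in 32-bit and 64-bit builds.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-width types have the same size in 32-bit and 64-bit builds.
static_assert(sizeof(std::int32_t) == 4, "int32_t is always 4 bytes");
static_assert(sizeof(std::int64_t) == 8, "int64_t is always 8 bytes");

// Hypothetical helper: the index is a size_t because that is what
// vector::size() returns, so it is wide enough on either build.
std::int64_t sum_all(const std::vector<std::int32_t>& values) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// Hypothetical helper: a pointer difference is a ptrdiff_t, not an int.
// Both pointers must point into the same array.
std::ptrdiff_t distance_between(const std::int32_t* first,
                                const std::int32_t* last) {
    return last - first;
}
```

Storing the result of `values.size()` or a pointer difference in an `int` is the kind of assumption that tends to work in a 32-bit build and truncate in a 64-bit one.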

Michael Aaron Safyan
+7  A: 

Short answer: you probably won't notice much of a difference.

Longer answer: 64-bit x86 has more general-purpose registers, which gives the compiler more opportunity to keep local variables in registers for faster access. The compiler can also assume more modern features, e.g. it doesn't have to optimize code for a 386, and it can assume your CPU has stuff like SSE instead of the old x87 FPU for floating-point math. But pointers will be twice as wide, which is worse for the cache.
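A rough sketch of the pointer-width cost, using a hypothetical `ListNode` type and an assumed 64-byte cache line (typical for current x86 CPUs, but an assumption here). Compile it once as 32-bit and once as 64-bit and compare the values.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical node type, for illustration only: two pointers
// and a small payload, as in a doubly linked list.
struct ListNode {
    ListNode* next;
    ListNode* prev;
    std::int32_t value;
};

// Bytes per node spent on pointers alone.
constexpr std::size_t kPointerBytes = 2 * sizeof(ListNode*);

// Total bytes per node, including any padding the compiler adds.
constexpr std::size_t kNodeBytes = sizeof(ListNode);

// How many whole nodes fit in one cache line.
// The 64-byte default is an assumption, not something queried from the CPU.
constexpr std::size_t nodes_per_cache_line(std::size_t line_bytes = 64) {
    return line_bytes / sizeof(ListNode);
}
```

With typical compilers the node grows when pointers go from 4 to 8 bytes, so fewer nodes fit per cache line in the 64-bit build. The exact sizes depend on the compiler and ABI.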

asveikau
+1 for mentioning the pointer length. This can make a huge difference if large pointer structures are used.
swegi
You can get some of those compiler optimizations (SSE scalar float math, etc.) even in 32-bit builds by specifying some compiler options, like `/arch:SSE2`.
Crashworks
...or for the gcc users, something like `-march=core2 -msse2 -mfpmath=sse`
Tom
Sure, you can enable that on 32-bit code, but you can't assume that everyone you send the binary to has a CPU capable of that. With amd64 you can.
asveikau
+1  A: 

In general, you won't find equivalent processors that differ only in their support for 64-bit operation, so it'll be hard to give any concrete comparisons between 1) and 2). On the other hand, the difference between building for 32 and 64 bit mode is entirely dependent on the application. A 64-bit version might be slightly slower or slightly faster than the 32-bit version. If your application uses a lot of temporary variables, then the increased register set of 64-bit mode can make a very large difference in performance.

Mark Bessey
+1  A: 

From experience, I've found that a 64-bit recompile of a 32-bit application generally makes things about 30% faster. It's a rough figure, but it holds for quite a number of applications I've ported to 64-bit. Basically it's for the reasons explained above. You have more registers, which is a godsend and allows for much less swapping in and out of memory (which will probably be cached anyway, making the win quite small). Certain optimisations can be made much more easily as well. HOWEVER, you do suffer from larger pointers, which wipes out some of the gain, not to mention that a context switch requires more memory due to the larger register set.

Careful hand optimisation in 64-bit can provide HUGE performance wins, however.

Your best plan is to recompile as 64-bit and profile, i.e. see which is better.
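If you don't have a profiler handy, a crude timing harness is enough for a first comparison. This is only a sketch: `workload` is a made-up stand-in for your real hot path, and `time_ms` and `pointer_bits` are hypothetical helper names. Build the same source as 32-bit and as 64-bit, run each several times, and compare.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical workload: fill a vector with 0..n-1 and sum it.
// Replace this with the code you actually care about.
std::uint64_t workload(std::size_t n) {
    std::vector<std::uint64_t> data(n);
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = i;
    }
    std::uint64_t total = 0;
    for (std::uint64_t x : data) {
        total += x;
    }
    return total;
}

// Run a callable once and return the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Pointer width of this build, assuming 8-bit bytes (32 or 64 on x86).
constexpr unsigned pointer_bits() {
    return static_cast<unsigned>(sizeof(void*) * 8);
}
```

Use the result of the workload somewhere (print it, for example), otherwise the optimizer may remove the work you are trying to time. Printing `pointer_bits()` next to each timing makes it obvious which build produced which number.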

Goz
+4  A: 

CPU-intensive programs might be noticeably faster on 64-bit. The processor has 16 instead of 8 general purpose registers available which are also twice as wide (64 instead of 32 bits).

Also, the number of registers for SSE instructions is doubled from 8 to 16, which helps multimedia applications and other applications that do a lot of floating-point computation.

For details see x86-64 on Wikipedia.

One thing that has not been mentioned yet is that 64-bit versions of operating systems such as Windows and Linux use a different calling convention for function calls; instead of passing arguments on the stack, arguments are (preferably) passed in registers, which is in principle faster. So software can be faster because there is less function call overhead.

Jesper
A: 

Do you guys know anything about multi-channel memory controllers (MC) with concurrent data bus bursts, integrated memory controllers (IMC), and the multi-core features of new x86_64 architectures? At the very least, memcpy can be made faster in 64-bit mode because it uses a 64-bit bus and 64-bit registers, regardless of concurrent bursts. New architectures are also able to prefetch data from multiple memory modules into cache concurrently. And more...

EffoStaff Effo
Refer to http://code.google.com/p/effogpled/downloads/list, document name EffoDesign_MemTest.pdf, for some ideas.
EffoStaff Effo
A: 

Do you have any requirement for more than 4 GB of memory? Exploiting gobs of memory is really the big reason to go 64-bit.
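The 4 GB figure is just the size of a 32-bit address space (2^32 bytes). A tiny sketch of that arithmetic, with hypothetical helper names; in practice a 32-bit process gets less than this, because the OS reserves part of the address space for itself.

```cpp
#include <cstdint>

// Size of a 32-bit address space: 2^32 bytes = 4 GiB.
constexpr std::uint64_t kAddressSpace32 = std::uint64_t{1} << 32;

// Hypothetical check: could a working set of this many bytes even be
// addressed by 32-bit pointers? This is an upper bound only; the usable
// amount in a real 32-bit process is smaller.
constexpr bool fits_in_32bit_address_space(std::uint64_t bytes) {
    return bytes < kAddressSpace32;
}

// True when this translation unit was built with 64-bit (or wider) pointers.
constexpr bool is_64bit_build() {
    return sizeof(void*) >= 8;
}
```

If the working set you expect does not pass `fits_in_32bit_address_space`, the performance comparison is moot and you need the 64-bit build anyway.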

Peeter Joot