Knuth recently objected to 64-bit systems, saying that for programs which fit in 4 gigs of memory, "they effectively throw away half of the cache" because the pointers are twice as big as on a 32-bit system.

My question is: can this problem be avoided by installing a 32-bit operating system on a 64-bit machine? And are there any bandwidth-intensive benchmarks which demonstrate the advantage in this case?

+4  A: 

The answer is: yes it can to a certain extent, although the performance difference is unlikely to be great.

Any benchmark to test this will have to do a lot of pointer resolution, which will be difficult to separate out from the noise. Designing a benchmark that the compiler will not optimise away is also difficult. This article about flawed Java benchmarks was posted in response to another question, but many of the principles described in it apply here as well.
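
As a rough illustration (my own sketch, not part of the original answer) of what such a benchmark tends to look like: a pointer-chasing loop over a randomly permuted ring of nodes. Printing the final pointer keeps the chain of dependent loads from being optimised away, and compiling the same source as 32-bit and 64-bit (e.g. with GCC's -m32/-m64 on a multilib system) exercises exactly the pointer-density effect in question.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* One pointer per node; the traversal is a chain of dependent loads. */
    struct node { struct node *next; };

    int main(void) {
        enum { N = 1 << 20 };                      /* ~1M nodes */
        struct node *nodes = malloc(N * sizeof *nodes);
        size_t *order = malloc(N * sizeof *order);
        if (!nodes || !order) return 1;

        /* Shuffle the visit order so successive hops jump around memory
           instead of walking it sequentially. */
        for (size_t i = 0; i < N; i++) order[i] = i;
        srand(1);
        for (size_t i = N - 1; i > 0; i--) {
            size_t j = (size_t)rand() % (i + 1);
            size_t t = order[i]; order[i] = order[j]; order[j] = t;
        }
        for (size_t i = 0; i < N; i++)
            nodes[order[i]].next = &nodes[order[(i + 1) % N]];

        /* Chase pointers; the running pointer is the data dependency that
           stops the compiler from deleting the loop. */
        struct node *p = &nodes[order[0]];
        clock_t t0 = clock();
        for (long i = 0; i < 100L * N; i++) p = p->next;
        clock_t t1 = clock();

        printf("final %p, %.2f s\n", (void *)p,
               (double)(t1 - t0) / CLOCKS_PER_SEC);
        free(nodes);
        free(order);
        return 0;
    }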

ConcernedOfTunbridgeWells
+6  A: 

Bandwidth is not really the correct term here. What Knuth was really talking about was data density, as it relates to cache footprint. Imagine that you have a 16KB L1 data cache: if you're purely storing pointers, you can store 2^14/2^2 = 2^12 = 4096 32-bit pointers, but only 2048 64-bit pointers. If the performance of your application depends on being able to keep track of over 2K different buffers, you may see a real performance benefit from a 32-bit address space.

However, most real code is not this way, and real performance benefits from a caching system often come from being able to cache common integer and floating-point data structures, not huge quantities of pointers. If your working set is not pointer-heavy, the downside of 64-bit becomes negligible, and the upside becomes much more obvious if you're performing a lot of 64-bit integer arithmetic.
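
(As an illustration, not part of the answer: a pointer-heavy node compiled as 32-bit vs 64-bit, e.g. with GCC's -m32 and -m64, shows the footprint difference directly — roughly 12 bytes vs 24 bytes per node, so about half as many nodes fit in the same cache.)

    #include <stdio.h>

    /* A pointer-heavy node, as in a doubly-linked list. */
    struct node {
        struct node *prev;
        struct node *next;
        int value;
    };

    int main(void) {
        /* Typically 12 bytes in a 32-bit build, 24 bytes in a 64-bit (LP64)
           build once pointer size and padding are accounted for. */
        printf("sizeof(void *)      = %zu\n", sizeof(void *));
        printf("sizeof(struct node) = %zu\n", sizeof(struct node));
        return 0;
    }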

Matt J
Thanks for the clarification.
fivebells
+2  A: 

I've seen it said somewhere that the best mix (on x86 CPUs) is to use a 64-bit OS and 32-bit applications.

With a 64-bit OS you get:

  • the ability to handle more than 4GB of address space
  • more, bigger registers to help in data-copying operations

With a 32-bit app you get:

  • smaller pointers
  • fewer, smaller registers to save on context switches

Cons:

  • all libraries must be duplicated; tiny by hard-disk standards
  • all loaded libraries are duplicated in RAM; not so tiny...

Surprisingly, there seems to be no overhead when switching modes. I guess that trapping from userspace into the kernel costs the same no matter the bitness of the userspace.

Of course, there are some applications that benefit from a big address space, but for everything else you can get an extra 5% performance by staying 32-bit.

And no, I don't care about this small speedup, but it doesn't "offend" me to run 32-bit Firefox on a 64-bit Kubuntu machine (as I've seen on some forums).
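
For what it's worth, a quick way to see which mode a binary was built for is to compile the same source twice (on a multilib setup, roughly gcc -m32 ... and gcc -m64 ...) and let it report its own target. A minimal sketch (my own, not from the answer), using GCC/Clang's predefined target macros:

    #include <stdio.h>

    int main(void) {
        /* __x86_64__ and __i386__ are predefined by GCC/Clang for the
           respective x86 targets. */
    #if defined(__x86_64__)
        puts("64-bit build: 8-byte pointers, 16 general-purpose registers");
    #elif defined(__i386__)
        puts("32-bit build: 4-byte pointers, 8 general-purpose registers");
    #else
        puts("some other target");
    #endif
        printf("sizeof(void *) = %zu\n", sizeof(void *));
        return 0;
    }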

Javier
May or may not prove true on all architectures. I'm not familiar with x64 machine code, but 16-bit code on a 32-bit OS (x86) incurs a performance hit compared with running in pure 16-bit mode.
Brian Knoblauch
The only example I know of 16-bit apps running on 32-bit x86 is old Windows systems. Back then there wasn't a real kernel/userspace separation, so any mode switching was a noticeable overhead. Nowadays a context switch is mandatory anyway, and 32-on-64 doesn't seem to be (much?) worse than 32-on-32 or 64-on-64.
Javier
@Brian: Also, 16-bit code was a bit of a nightmare because it did not use a linear address space. It had to use a lot of extra code for segment registers. Additionally, the CPU had to do a very slow mode switch between the 32-bit linear addressed OS system mode and 16-bit segment addressed user mode. The switch between 32-bit and 64-bit linear modes was designed to be much faster on current CPUs because a majority of software is still 32-bit.
Adisak
+3  A: 

I don't think Knuth objected to 64-bit systems. He just said that using 64-bit pointers on a system with less than 4GB of RAM is idiotic (at least if you have lots of pointers, like the ones in a doubly-linked list). I can't say that I agree with him; here are three different approaches that can be taken. Let's assume you have a 64-bit-capable CPU that can also run in 32-bit mode, like an Intel Core 2 Duo.

1 - Everything is 32-bit: the OS, the apps, all of it. You have 32-bit pointers, but you cannot use the extra registers/instructions that are available in 64-bit mode.

2 - Everything is 64-bit: the OS, the apps, all of it. You have 64-bit pointers and you can use the extra registers/instructions available in 64-bit mode. But since you have less than 4GB of RAM, using 64-bit pointers seems idiotic. But is it?

3 - The OS is 64-bit, and it makes sure that all code/data pointers stay in the 0x00000000 - 0xFFFFFFFF range (virtual memory!). The ABI works in an unusual way: all code/data pointers kept in memory/files are 32 bits wide, but they are zero-extended when loaded into 64-bit registers. When there is a code location to jump to, the compiler/ABI does the necessary fix-ups and performs the actual 64-bit jump. This way pointers are 32-bit, but the apps can be 64-bit, meaning they can make use of the 64-bit registers and instructions. This process is something like thunking, I think ;-P
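
For what it's worth, a program can approximate the spirit of option 3 at the application level without any special ABI support, by storing 32-bit indices into a pre-allocated pool instead of raw 64-bit pointers. A minimal sketch (my own illustration, not one of the options above):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NIL UINT32_MAX          /* "null" index */

    /* Links are 4-byte indices into the pool rather than 8-byte pointers,
       so each node stays 12 bytes even in a 64-bit build. */
    struct node {
        uint32_t prev;
        uint32_t next;
        int      value;
    };

    int main(void) {
        struct node *pool = malloc(1000 * sizeof *pool);
        if (!pool) return 1;

        /* A tiny two-element list expressed with indices. */
        pool[0] = (struct node){ .prev = NIL, .next = 1,   .value = 10 };
        pool[1] = (struct node){ .prev = 0,   .next = NIL, .value = 20 };

        /* Traversal turns an index back into an address with &pool[i]. */
        for (uint32_t i = 0; i != NIL; i = pool[i].next)
            printf("%d\n", pool[i].value);

        printf("sizeof(struct node) = %zu\n", sizeof(struct node));
        free(pool);
        return 0;
    }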

My conclusion:

The 3rd option seemed doable to me, but it is not an easy problem. In theory it can work, but I do not think it is feasible in practice. I also think his claim that "When such pointer values appear inside a struct, they not only waste half the memory, they effectively throw away half of the cache" is exaggerated...

Malkocoglu