RAM seems to be the new disk, which also means that access to main memory is now considered slow, much as disk access has always been. So I want to maximize locality of reference in memory for high-performance applications. For example, in a sorted index I want adjacent values to be close together (unlike in, say, a hashtable), and I want the data the index points to to be close by as well.
In C, I can whip up a data structure with a specialized memory manager, as the developers of the (immensely complex) Judy array did. With direct control over the pointers, they even went so far as to encode additional information in the pointer value itself. When working in Python, Java or C#, I am deliberately one (or more) level(s) of abstraction away from this kind of solution, and I entrust the JIT compilers and optimizing runtimes with doing the clever low-level tricks for me.
Still, I guess, even at this high level of abstraction, there are things that can be semantically considered "closer" and therefore are likely to be actually closer at the low levels. For example, I was wondering about the following (my guess in parentheses):
- Can I expect an array to be an adjacent block of memory (yes)?
- Are two integers in the same instance closer than two in different instances of the same class (probably)?
- Does an object occupy a contiguous region in memory (no)?
- What's the difference between an array of objects with only two `int` fields and a single object with two `int[]` fields? (this example is probably Java specific)
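To make the last bullet concrete, here is a Java sketch of the two layouts (class names are mine, purely illustrative). On current JVMs, an object array holds references, so each `Point` is a separate heap object that may land anywhere, while an `int[]` is one contiguous block of primitives:

```java
// "Array of objects": the array holds references; each Point is its own
// heap object, so the two ints of one Point are adjacent, but consecutive
// Points may be scattered across the heap.
class Point {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

// "Object of arrays": each field is a single contiguous int[] block,
// so all x values are adjacent in memory, and likewise all y values.
class Points {
    final int[] xs, ys;
    Points(int n) { xs = new int[n]; ys = new int[n]; }
}

public class LayoutSketch {
    public static void main(String[] args) {
        int n = 4;

        Point[] aos = new Point[n];            // n + 1 heap objects
        for (int i = 0; i < n; i++) aos[i] = new Point(i, -i);

        Points soa = new Points(n);            // 3 heap objects total
        for (int i = 0; i < n; i++) { soa.xs[i] = i; soa.ys[i] = -i; }

        // Both layouts hold the same logical data; they differ only in
        // how that data is scattered or packed on the heap.
        for (int i = 0; i < n; i++) {
            System.out.println(aos[i].x == soa.xs[i] && aos[i].y == soa.ys[i]);
        }
    }
}
```

The second layout is what cache-conscious code tends to prefer when it scans one field across many elements, since a traversal of `xs` touches consecutive memory.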
I started wondering about these in a Java context, but my wondering has become more general, so I'd suggest not treating this as a Java-only question.