ansaurus

Question

Efficient reordering of large dataset to maximize memory cache effectiveness

Answer 1

A:

I think the answer to this question is going to depend very heavily on the exact access pattern of the paired objects. As you said, simply sorting the pointers would be best in the simple, non-paired case. In a more complex case it may still make sense to sort by one half of the pair if the access pattern makes locality for those values more important (for example, if these are key/value pairs and you are doing a lot of searches, locality for the keys matters far more than locality for the values).

So, really, my answer is that this question can't be answered in the general case.

For storing your structure, what you actually want is probably a B-tree. These are designed for what you're talking about--keeping track of large collections where you don't want to (or can't) keep the whole thing in memory.

SoapBox 2009-01-31 21:29:24

The cost of accessing either the first or the second object is the same. I still remain optimistic that there is a way to answer this question in the general case - just as problems like the minimum graph-cut do have general case solutions.

sanity 2009-01-31 22:44:01

Answer 2

+1 A:

To start, you could mmap the list. That works as long as there is enough address space (as opposed to physical memory), e.g. on 64-bit CPUs. This makes it easier to access the elements in order.
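As a rough illustration of the mmap idea, here is a minimal Python sketch. The on-disk layout (two little-endian unsigned 64-bit indices per record) and the helper names `write_pairs`, `read_pair` and `demo` are assumptions made for this example; the question does not specify how the list is stored.

```python
import mmap
import os
import struct
import tempfile

# Assumed record layout: two unsigned 64-bit indices per pair.
PAIR = struct.Struct("<QQ")

def write_pairs(path, pairs):
    """Write the pair list to a flat binary file (hypothetical format)."""
    with open(path, "wb") as f:
        for a, b in pairs:
            f.write(PAIR.pack(a, b))

def read_pair(mm, i):
    """Read the i-th pair straight out of the mapped file."""
    return PAIR.unpack_from(mm, i * PAIR.size)

def demo():
    pairs = [(3, 7), (100, 205), (2, 9)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_pairs(path, pairs)
        # The OS pages data in on demand; only address space is reserved up front.
        with open(path, "rb") as f, \
                mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = len(mm) // PAIR.size
            return [read_pair(mm, i) for i in range(count)]
    finally:
        os.remove(path)
```

Calling `demo()` round-trips the three sample pairs through a temporary file. With a real dataset the same `read_pair` access would work on a file much larger than RAM, provided the address space is large enough.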

You could sort that list according to a minimum distance in cache which considers both elements, which works well if the objects are in a contiguous space. The distance function could be something like: distance((a, b), (c, d)) = |a - c| + |b - d| (which is a Manhattan distance; without the absolute values the two differences could cancel each other out). Then you pull in slices of the object store and process them according to the list.
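One possible reading of this suggestion, sketched in Python: treat each pair as a point and order the list so that every pair is close to the one before it. The greedy nearest-neighbour strategy and the names `pair_distance` and `greedy_order` are assumptions for illustration, not something the answer spells out, and the quadratic loop is only suitable for small examples.

```python
def pair_distance(p, q):
    """Manhattan (L1) distance between two index pairs."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def greedy_order(pairs):
    """Order pairs so each one is close to its predecessor.

    O(n^2) and purely illustrative; a large dataset would need a
    cheaper scheme (e.g. bucketing by cache-sized blocks).
    """
    if not pairs:
        return []
    remaining = sorted(pairs)
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        # Pick the remaining pair nearest to the one just emitted.
        i = min(range(len(remaining)),
                key=lambda k: pair_distance(last, remaining[k]))
        ordered.append(remaining.pop(i))
    return ordered

pairs = [(100, 205), (3, 7), (101, 200), (2, 9), (50, 60)]
ordered = greedy_order(pairs)
```

On this sample input the pairs touching low addresses are processed first and the two pairs near (100, 200) end up adjacent, so consecutive accesses tend to land in the same region of the object store.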

EDIT: fixed a mistake in the distance.

Eduard - Gabriel Munteanu 2009-01-31 21:35:55

Answer 3

+1 A:

Even though you're not just sorting this list, the general pattern of a multiway merge sort might be applicable - that is, consider some kind of (possibly recursive) breakdown of the set into smaller sets that can be dealt with in memory separately, and then a second phase where small chunks of the previously dealt-with sets can all be combined together. Even not knowing the specific nature of what you're doing with the pairs, it's safe to say that many algorithmic problems are made much more straightforward when you're dealing with sorted data (including graph problems, which might be what you have on your hands here).
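The two-phase pattern described above can be sketched in Python roughly as follows. This is a simplified in-memory stand-in: the function name `chunked_sort` and the chunk size are assumptions, and in a real out-of-core setting each sorted run would be written to disk and streamed back during the merge.

```python
import heapq

def chunked_sort(items, chunk_size):
    """Sort fixed-size chunks independently, then multiway-merge the runs."""
    # Phase 1: each chunk is small enough to be sorted in memory on its own.
    runs = [sorted(items[i:i + chunk_size])
            for i in range(0, len(items), chunk_size)]
    # Phase 2: heapq.merge lazily combines the sorted runs, holding
    # only one element per run at a time.
    return list(heapq.merge(*runs))

pairs = [(5, 1), (3, 9), (8, 2), (1, 4), (9, 0), (2, 6), (7, 3)]
result = chunked_sort(pairs, chunk_size=3)
```

Here the pairs are ordered by their first element (ties broken by the second), which is just Python's default tuple ordering. Any other key that suits the access pattern could be used, as long as the same key is passed to both `sorted` and `heapq.merge`.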

Ian Varley 2009-01-31 21:48:12

Answer 4

+1 A:

Your problem is related to a similar one for computer graphics hardware:

When rendering indexed vertices in a triangle mesh, the hardware typically has a cache of the most recently transformed vertices (~128 the last time I had to worry about it, but I suspect the number is larger these days). Vertices that are not cached need a relatively expensive transform operation. "Mesh optimisation", i.e. restructuring triangle meshes to optimise cache usage, used to be a pretty hot research topic. Googling vertex cache optimisation (or optimization :^) might find you some interesting material relevant to your problem. As other posters suggest, I suspect doing this effectively will depend on exploiting any inherent coherence in your data.

Another thing to bear in mind: as an LRU cache becomes overloaded it can be well worth changing to an MRU replacement strategy to at least hold some of the items in memory (rather than turning over the entire cache each pass). I seem to remember John Carmack has written some good material on this subject in connection with Direct3D texture caching strategies.
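To make the LRU-versus-MRU point concrete, here is a small Python simulation. The function `count_hits` and the toy workload (repeated passes over 5 items with room for only 4) are assumptions chosen to show the effect; they are not taken from the material mentioned above.

```python
from collections import OrderedDict

def count_hits(accesses, capacity, policy):
    """Count cache hits for an access sequence under "LRU" or "MRU" eviction."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                # last=True evicts the most recently used entry (MRU),
                # last=False evicts the least recently used one (LRU).
                cache.popitem(last=(policy == "MRU"))
            cache[key] = True
    return hits

accesses = list(range(5)) * 4  # four sequential passes over five items
lru_hits = count_hits(accesses, capacity=4, policy="LRU")
mru_hits = count_hits(accesses, capacity=4, policy="MRU")
```

With this cyclic workload LRU always evicts exactly the item that will be needed next, so it never scores a hit, while MRU keeps most of the working set resident from one pass to the next.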

timday 2009-02-01 10:30:33
