The following article discusses an alternative heap structure that takes into account that most servers are virtualized, and that memory is therefore likely to be paged to disk.

http://queue.acm.org/detail.cfm?id=1814327

Can (or should) a .NET developer implement a B-heap data structure so that parent-child relationships are maintained within the same virtual memory page? How or where would this be implemented?

Clarification
In other words, is this type of data structure needed within .NET as a primitive type? If so, it would presumably have to be implemented either natively in the CLR or via P/Invoke.

When a server administrator deploys my .NET app within a virtual machine, does this binary-heap optimization make sense? If so, when does it make sense (number of objects, etc.)?
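For concreteness, below is a rough sketch of the kind of index arithmetic such a structure needs. This is a simplified variant of the article's idea, not Varnish's actual layout; `PageSize`, the names, and the unused-slot-0 convention are all illustrative assumptions.

```csharp
// Sketch only: a simplified B-heap-style index mapping, not the article's
// exact layout. Each "page" of PageSize slots holds a complete binary
// subtree (slot 0 of every page is deliberately unused to keep the maths
// simple), so descending several tree levels touches one VM page rather
// than one page per level.
static class BHeapIndex
{
    const int PageSize = 512;           // illustrative; tune to pageBytes / entryBytes
    const int HalfPage = PageSize / 2;

    public static long Parent(long i)
    {
        long page = i / PageSize, local = i % PageSize;
        if (local > 1) return page * PageSize + local / 2;  // stays in this page
        if (page == 0) return -1;                           // global root
        long slot = (page - 1) % PageSize;                  // which child page are we?
        long parentPage = (page - 1) / PageSize;
        return parentPage * PageSize + HalfPage + slot / 2; // a leaf of the parent page
    }

    public static long Child(long i, int which)             // which = 0 or 1
    {
        long page = i / PageSize, local = i % PageSize;
        if (local < HalfPage)
            return page * PageSize + 2 * local + which;     // stays in this page
        long childPage = page * PageSize + (local - HalfPage) * 2 + which + 1;
        return childPage * PageSize + 1;                    // root slot of a child page
    }
}
```

A real heap would layer the usual sift-up/sift-down on top of this mapping; the point is only that walking several levels costs one potential page fault rather than one per level.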

+1  A: 

If you have an especially specialized scenario and algorithm, then you might benefit from that kind of optimization.

But generally speaking, when reimplementing core parts of the CLR framework (on top of the CLR, I might add, i.e. in managed code), your chances of doing it more efficiently than the CLR team did are incredibly slim. So I wouldn't recommend it unless you have already profiled the heck out of your current implementation and have positively identified issues related to locality of data in memory. And even then, you will get more bang for your buck by tweaking your algorithm to work better with the CLR memory-management scheme than by trying to bypass or work around it.
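As a hedged illustration of what "working with the CLR memory-management scheme" can mean (the type and member names here are invented for the example): keep hot data in one contiguous array of value types rather than in a graph of reference-type nodes.

```csharp
using System;

// Illustrative sketch, not a drop-in component: an array-of-structs heap.
// Entries live contiguously in a single array, so the GC and the OS pager
// both see one dense block of memory instead of scattered node objects.
struct Entry { public long Key; public int Value; }

sealed class ArrayBackedHeap
{
    private Entry[] _items = new Entry[16];
    private int _count;

    public void Push(long key, int value)
    {
        if (_count == _items.Length)
            Array.Resize(ref _items, _items.Length * 2); // amortised growth
        int i = _count++;
        while (i > 0 && _items[(i - 1) / 2].Key > key)   // sift up
        {
            _items[i] = _items[(i - 1) / 2];
            i = (i - 1) / 2;
        }
        _items[i] = new Entry { Key = key, Value = value };
    }
}
```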

Addys
The question is about a B-heap data structure, not about (replacing) the managed heap.
Henk Holterman
I think it would be nice to have CLR support for this scenario, where the implementation detail is abstracted from the .NET developer. Maybe it could be implemented in the CLR; maybe it could be implemented in a library. Either way, it is an interesting optimization.
MakerOfThings7
+1  A: 

To at least a certain extent, BCL collections do seem to take paging concerns into account. They also take CPU cache concerns into account (which overlaps in some regard, as locality of memory can affect both, though in different ways).

Consider that `Queue<T>` uses arrays for internal storage. In purely random-access terms (that is to say, where there is never any cost for paging or CPU cache flushing) this is a poor choice; the queue will almost always be added to at one point and removed from at another, and hence an internal implementation as a singly linked list would win in almost every way (even for iterating through the queue, which it also supports, a linked list shouldn't do much worse than an array in a pure random-access situation). Where the array-based implementation fares better than a singly linked list is precisely when paging and CPU cache are considered. That MS went for a solution that is worse in the pure random-access situation but better in the real-world case where paging matters suggests that they are paying attention to the effects of paging.
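To make the comparison concrete, here is a rough sketch of the circular-buffer strategy described above (the names are mine, and the real `Queue<T>` internals differ in detail): every element lives in one contiguous array, so a single page fault pulls in many neighbouring elements, whereas each linked-list node could land on a different page.

```csharp
using System;

// Sketch of an array-backed queue (circular buffer), assuming nothing
// about the real Queue<T> beyond what the answer describes.
sealed class RingQueue<T>
{
    private T[] _buffer = new T[8];
    private int _head, _count;

    public void Enqueue(T item)
    {
        if (_count == _buffer.Length) Grow();
        _buffer[(_head + _count++) % _buffer.Length] = item;
    }

    public T Dequeue()
    {
        if (_count == 0) throw new InvalidOperationException("empty");
        T item = _buffer[_head];
        _buffer[_head] = default!;                // drop the reference for the GC
        _head = (_head + 1) % _buffer.Length;
        _count--;
        return item;
    }

    private void Grow()                           // copy into a bigger array
    {
        var bigger = new T[_buffer.Length * 2];
        for (int i = 0; i < _count; i++)
            bigger[i] = _buffer[(_head + i) % _buffer.Length];
        _buffer = bigger;
        _head = 0;
    }
}
```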

Of course, from the outside that isn't obvious - and shouldn't be. From the outside we want something that works like a queue; making the inside efficient is a different concern.

These concerns are also met in other ways. The way the GC works, for example, minimises the amount of paging necessary: its moving of objects not only makes for less fragmentation, but also makes for fewer page faults. Other collections are also implemented in ways that make paging less frequent than the most obvious implementation would.

That's just a few things that stand out to me from things I have looked at; I'd bet good money such concerns are also considered in many other places in the .NET team's work. Likewise with other frameworks. Consider that the one big performance concern Cliff Click mentions repeatedly regarding his Java lock-free hashtable (I really must finish checking my C# implementation), apart from those of lock-free concurrency (the whole point of the exercise), is cache lines; and it's also the one other performance concern he doesn't dismiss!

Consider also, that most uses of most collections are going to fit in one page anyway!
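That is easy to sanity-check with back-of-envelope arithmetic; the numbers below are assumptions (a typical 4 KiB page and 64-bit references), not BCL facts:

```csharp
const int PageBytes = 4096;                     // common, not universal, VM page size
const int ReferenceBytes = 8;                   // 64-bit object reference

int refsPerPage = PageBytes / ReferenceBytes;   // 512 references per page
int intsPerPage = PageBytes / sizeof(int);      // 1024 ints per page
// So the backing array of a List<T> of reference types fits in one page up
// to ~512 elements, and an int[] up to 1024 (array headers aside).
```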

If you are implementing your own collections, or putting a standard collection to particularly heavy use, then these are things you need to think about (sometimes "nah, not an issue" is enough thinking, sometimes it isn't), but that doesn't mean they aren't already thought about in what we get from the BCL.

Jon Hanna
Thank you, +1. Is the memory layout aligned with what may ultimately be paged to disk? +1 for MSFT if they already implemented this and never told us. I wonder what Mono's implementation is like.
MakerOfThings7
About alignment I don't know, but in the cases where I've looked at the details, they are working to take advantage of paging and cache lines in a manner appropriate for that level of abstraction, just like the B*-tree-like B-heap does over a plain binary heap in the article linked to. `Queue<T>` is a classic example, because a singly linked list would kick an array's ass on pure random access, but the array wins on paged machines, and an array is what they went for. Mono also pays attention to the fact that paging and cache lines matter, from what I've looked at of theirs (`HashSet<T>` in particular, because I needed something...
Jon Hanna
... based on `HashSet<T>` but with some custom requirements (I wanted to be able to get the existing reference when an `Add()` would have failed due to duplication, for a sort of intern-pool-style collection), so I looked at it quite a bit. With hash sets and dictionaries, the fact that both use reprobing rather than open hashing (chaining) could be influenced by those approaches dealing better with paging and CPU caches (though there are other factors in their favour). There's other stuff I just don't know about, but it's clear both MS and Mono pay serious attention to the issue.
Jon Hanna
Now I think of it, I'd be surprised if objects smaller than page size were allowed to cross page boundaries (I don't know for sure, though). Once you are bigger than page size, you're going to cross one anyway (indeed, one of the things about B+-tree design is that while big blocks give better locality, you don't want them crossing pages). Since keeping I/O buffers smaller than page size is something juniors learn, I'd be surprised if this wasn't either already considered, or deemed unimportant by the people working at that level as an *informed* decision based on their knowing more than me on that score.
Jon Hanna