Ok, I'd like to thank you all for your answers - very interesting and helpful. :)
PriorityQueue is definitely the right term I was searching for - thanks for that.
Now it's all about implementation.
Here is what I think:
Let N be the size of the queue and M the average number of events per timestamp ("concurrent" events, so to speak) at the time of processing. The density of events will not be evenly distributed: the "far future" is much more sparse, but as time moves on, that area of time becomes much denser (actually, I think the maximum density will be somewhere 4 to 12 hours in the future). I am looking for a scalable solution that performs well for considerably big M. The goal is to really process those M due events within one second, so I want to spend as little time as possible on finding them.
- Going for the simple tree approach, as suggested several times, I'll be having O(log N) insertion, which is quite good, I guess. The cost of processing one timestamp would be O(M*log N), if I am right, which is not so good anymore.
- An alternative would be to have a tree with lists of events instead of single events. It should be feasible to implement some getlistForGivenStampAndCreateIfNoneExists-operation that'd be a little faster than going down the tree twice if no list exists. But anyway, as M grows, this shouldn't even matter too much. Thus insertion would be O(log N), as before, and processing would be O(M+log N), which is also good, I think.
- The hash-of-lists-of-events approach I formulated. This should have O(1) insertion and O(M) processing cost, although that is not too trivial to achieve with hashes. Sounds cool, actually. Or am I missing something? Of course it is not so easy to make a hash perform well, but apart from that, are there any problems? Or is the hash the problem? Wikipedia states:
"In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at constant average (indeed, amortized) cost per operation."
A quick benchmark showed that the standard implementation for my platform seems to match this.
- The array-of-lists-of-events approach provided by DVK. This has O(1) insertion. Now that is good. But if I understand correctly, it has O(M+T) processing cost, with T being the size of the array (the number of time slots if you will), because removal from arrays comes at linear cost. Also, this only works if there is a maximum time offset.
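To make the hash-of-lists-of-events variant from the list above a bit more concrete, here is a minimal sketch. It assumes integer timestamps in seconds; the class and method names (HashSchedule, insert, pop_due) are made up for illustration and are not taken from any library.

```python
from collections import defaultdict


class HashSchedule:
    """Hypothetical sketch: maps an integer timestamp to a list of events."""

    def __init__(self):
        self._slots = defaultdict(list)

    def insert(self, timestamp, event):
        # Average O(1): one hash lookup plus a list append.
        self._slots[timestamp].append(event)

    def pop_due(self, timestamp):
        # Average O(1) to find and remove the slot; the caller then
        # spends O(M) on the M events that are returned.
        return self._slots.pop(timestamp, [])


schedule = HashSchedule()
schedule.insert(100, "a")
schedule.insert(100, "b")
schedule.insert(200, "c")
due = schedule.pop_due(100)  # the events stored for second 100
```

One thing to keep in mind with this sketch: the hash has no ordering, so pop_due has to be called for every single second. If a second is skipped, its events stay in the table until someone asks for exactly that timestamp.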
Actually, I would like to discuss the array approach. O(M+T) is not good. Not at all. But I put some brains into it, and this is what I came up with:
First Idea: Laziness
The O(T) could be crunched down by an arbitrary factor by introducing a bit of laziness, but in the end it would stay O(T). But how bad is that? Let's take T=2419200, which is 28 days in seconds. Then, once a day, I'd clean it up (preferably while low load is expected). That would waste less than 5% of the array (at most one stale day out of 28, about 3.6%). On my target platform, the copy operation takes 31 ms on a fairly old 2 GHz core, so it doesn't seem such a bad idea after all.
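A rough sketch of how the lazy cleanup could look, assuming one array slot per second and integer timestamps. All names here (LazyArraySchedule, horizon, slack, compact) are hypothetical; slack is the number of stale slots tolerated between two cleanups, e.g. one day's worth of seconds.

```python
class LazyArraySchedule:
    """Hypothetical sketch: one slot per second, compacted only occasionally."""

    def __init__(self, start, horizon, slack):
        self._base = start  # timestamp represented by self._slots[0]
        self._slots = [None] * (horizon + slack)

    def insert(self, timestamp, event):
        # O(1): index arithmetic plus a list append.
        index = timestamp - self._base
        if not 0 <= index < len(self._slots):
            raise ValueError("timestamp outside the array window")
        if self._slots[index] is None:
            self._slots[index] = []
        self._slots[index].append(event)

    def pop_due(self, timestamp):
        # O(1) to find the slot; the slot is only cleared, never removed,
        # so no copying happens during normal processing.
        index = timestamp - self._base
        if not 0 <= index < len(self._slots):
            return []
        events = self._slots[index]
        self._slots[index] = None
        return events or []

    def compact(self, now):
        # The O(T) step, done rarely (say, once a day): drop the stale
        # prefix and append the same number of empty slots at the end.
        stale = min(now - self._base, len(self._slots))
        if stale <= 0:
            return
        self._slots = self._slots[stale:] + [None] * stale
        self._base = now
```

The array length stays constant, and events in slots that are dropped by compact without having been popped are lost, so this sketch relies on every second being processed before the cleanup reaches it.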
Second Idea: Chunks
After thinking a little, I thought of this solution: a hash-of-intervals, an interval (i.e. a given time frame) in turn being an array-of-lists-of-events. The intervals are all of equal size, preferably something simple, like days or maybe hours.
For insertion, I look up the right interval through the hash (creating it if none exists), then the right list-of-events within the interval (again creating it if none exists), and then just insert the event, which is O(1).
For processing, I simply take the current interval and process the due events by processing the currently due list-of-events and then disposing of it. The array stays of constant length, so we are at O(M) (which is quite the best you can get for processing M elements). Once the current interval is entirely processed (that is, once the interval represents the "past"), I simply dispose of it at O(1). I can keep an extra reference to the current interval, eliminating the need to look it up, but I guess this doesn't provide any noticeable improvement.
It seems to me the second optimization is really the best solution, since it is fast and unbounded. Choosing a good size for the intervals allows optimizing memory overhead vs. hash lookup overhead. I don't know whether I should worry about the hash lookup time at all. For high M, it shouldn't really matter, should it? Thus I'd choose an interval size of 1, which leads me back to approach number 3.
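The insertion and processing steps described above might be sketched like this, assuming integer timestamps in seconds and intervals aligned to multiples of the interval size. The names (ChunkedSchedule, interval_size, pop_due) are hypothetical, and the extra reference to the current interval is left out for brevity.

```python
class ChunkedSchedule:
    """Hypothetical sketch: hash of fixed-size intervals of event lists."""

    def __init__(self, interval_size=3600):
        self._size = interval_size
        self._intervals = {}  # interval number -> array of per-second slots

    def insert(self, timestamp, event):
        number, offset = divmod(timestamp, self._size)
        interval = self._intervals.get(number)
        if interval is None:
            # Allocating a new interval costs O(interval_size), which is
            # a constant for a fixed interval size.
            interval = self._intervals[number] = [None] * self._size
        if interval[offset] is None:
            interval[offset] = []
        interval[offset].append(event)

    def pop_due(self, timestamp):
        number, offset = divmod(timestamp, self._size)
        interval = self._intervals.get(number)
        if interval is None:
            return []
        events = interval[offset]
        interval[offset] = None
        if offset == self._size - 1:
            # Last second of the interval: it is now entirely in the
            # past, so the whole interval is dropped from the hash.
            del self._intervals[number]
        return events or []
```

With interval_size=1 every interval holds exactly one slot, and the structure degenerates into a plain hash of event lists.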
I'd be really grateful for any input on that.