views:

345

answers:

7

I'm implementing a sliding window over a stream of events, in Java. So I want a data structure which allows me to do the following:

  1. add to the end of the data structure when new events occur;

  2. remove from the start of the data structure when old events are processed;

  3. get standard random access (size(), get(i)) to the elements of the data structure; in general, typical List "read" operations;

  4. is efficient for all of the above operations;

  5. is unbounded.

No other access is required. And no thread safety is required.

I'm currently doing this with an ArrayList, to get things up and running. But I want something more efficient; the remove(0) method (2. above) is inefficient with an ArrayList.

Numbers 1. and 2. are standard Queue-style operations. However, the implementations of Queue in the JDK (such as ArrayDeque) don't allow for get(i) in 3.

So, I'm wondering if there are any libraries out there which have such an implementation, and are suitable for commercial use.

If not, I guess I'll resort to writing my own...

A: 

The only library I can think of that would implement such an interface would be a LinkedList but frankly I'm not sure what the performance characteristics are.

Jasoon
This won't allow for efficient random access - calling get(i).
Calum
Not good for random access.
Alexander Pogrebnyak
Very true. I'd be interested to see what comes up here, since I'm not sure how you could effectively roll both efficient random access with the kind of speed you get from a bog standard FIFO queue...
Jasoon
ArrayDeque in the JDK (a Queue) actually has an array of elements (which can be resized) and head and tail fields. So it *is* a suitable implementation, but it doesn't expose elements in a random access way - it only exposes the head and tail.s the head and tail of the
Calum
+2  A: 

Seems like a task for a Circular Buffer - as long as it's okay if the queue has a fixed capacity. I don't know of any standard implementation though. But here is a nice recipe to roll your own.

sfussenegger
Forgot to say that it should be unbounded - have now updated the question.
Calum
You couod increde the capacity if needed - like arraylist does for instance. It's expensive though
sfussenegger
+2  A: 

If you have a large enough array you can implement a queue with a basic array, and just use an index as to the head of the list, and use the mod operator so you can wrap around if needed.

So, you basically have a circular array that supports the insert and remove functions.

UPDATE:

It is a quick operation to copy an array to a larger array, so just double in size, perhaps, when getting close to the end, and just copy the array, as a step in doing insert. Overall you will still have very fast access, as the norm should not be to increase and copy.

James Black
+2  A: 

How fast are the events coming in and out of this queue?

On the one hand, you can have a "sufficiently large" circular buffer.

While it is technically "bounded", you can make it "unbounded" by growing it as necessary.

By the same token, you can "shrink" it down in terms of overall capacity when it's "quiet".

But, for many applications a 100, 1000, or even 10000 item capacity circular buffer is effectively unbounded.

Will Hartung
A: 

Just throwing this out there as an alternative to rolling your own, so please take with a grain of salt: Depending on how frequently you need the random access get(i) and what performance you need from it (and how big your queue size will generally be), you could always use ArrayDeque.toArray()[i] when you need to access an element. The toArray() uses System.arraycopy() under the covers which should be pretty fast for small queue sizes and occasional usage. Would help to understand why you need random access to a queue and how often it is needed -- possibly there's a different way to implement your algorithm without it.

Chris B.
A: 

A binomial heap can have O(1) amortized insert and O(log n) amortized delete min; I believe that it can also have O(log**2 n) amortized random access. Queue-push would insert the element into the heap, with successive integers as keys.

With a rbtree, you can do queue-push with O(log n) pessimistic for all of insert, delete min and random access. That is because the tree will have a contiguos set of integers as the keys, and the k-th element of the queue will be the element in the tree with the k-th key.

Adrian Panasiuk
A: 

If it truly must be unbounded, then something like ConcurrentSkipListMap may be useful, if you assign an incrementing sequence to each event to use as the key in the map. It provided methods such as pollFirst/LastEntry. If you can sacrifice the unbounded nature of it, then a ring buffer maybe what you need.