views:

238

answers:

8

I'm looking for a collection that offers list semantics, but also allows array semantics. Say I have a list with the following items:

apple orange carrot pear 

then my container array would:

container[0] == apple 
container[1] == orangle 
container[2] == carrot 

Then say I delete the orange element:

container[0] == apple 
container[1] == carrot 

I want to collapse gaps in the array without having to do an explicit resizing, Ie if I delete container[0], then the container collapses, so that container[1] is now mapped as container[0], and container[2] as container[1], etc. I still need to access the list with array semantics, and null values aren't allow (in my particular use case).

EDIT:

To answer some questions - I know O(1) is impossible, but I don't want a container with array semantics approaching O(log N). Sort of defeats the purpose, I could just iterate the list.

I originally had some verbiage here on sort order, I'm not sure what I was thinking at the time (Friday beer-o-clock most likely). One of the use-cases is Qt list that contains images - deleting an image from the list should collapse the list, not necessary take the last item from the list and throw it in it's place. In this case, yet, I do want to preserve list semantics.

The key differences I see as separating list and array: Array - constant-time access List - arbitrary insertion

I'm also not overly concerned if rebalancing invalidates iterators.

+3  A: 

You should use a HashMap, the you will have O(1)- Expected insertion time, just do a mapping from integers to whatever.

luke
I don't think he wants to update the keys on insert/delete.
Hank Gay
Deletion is O(n) with this.
Moron
The deletion semantics don't work quite right - there's a hole in the indices (unless one goes through and updates all the other `key=>value` mappings - which is definitely not O(1)). That might be fine, but it's not consistent with the array/list api.
Carl
+3  A: 

If the order isn't important, then a vector will be fine. Access is O(1), as is insertion using push_back, and removal like this:

swap(container[victim], container.back());
container.pop_back();

EDIT: just noticed the question is tagged C++ and Java. This answer is for C++ only.

Mike Seymour
By order, i think he meant sort order. I would expect position in the array would matter, and swapping is not what he wants.
Moron
From the OP: "I don't particularly care if sort order is maintained"
Peter Alexander
@Peter: That is what I said. sort order position != position in array. Since OP wants array semantics, the position in array would definitely matter, wouldn't it?
Moron
btw, since many people seem to be interpreting it as you did, +1. Will revoke/let it stay once the OP clarifies the intent.
Moron
Regardless of the OPs intent, this is the only way to get constant time insertion, deletion, and random access. If he wants references to remain valid after deletions, then he is straight out of luck.
Dennis Zickefoose
@Dennis: It said O(1)-ish (see title). So I suppose OP is willing to relax that a bit.
Moron
@Dennis: I'm willing to relax constant-time insertion/deletion. I just want as close to O(1) access as I can get. There's an acceptable amount of effort that's going to have to go on for bookkeeping in managing the array semantics, and I accept that. iterator validation is not a concern.
Chris Kaminski
+2  A: 

O(1) might be too much to ask for.

Is O(logn) insert/delete/access time ok? Then you can have a balanced red-black tree with order statistics: http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-seven/

It allows you to insert/delete/access elements by position.

As Micheal was kind enough to point out, Java Treemap supports it: http://java.sun.com/j2se/1.5.0/docs/api/java/util/TreeMap.html

Also, not sure why you think O(logN) will be as bad as iterating the list!

From my comments to you on some other answer:

For 1 million items, using balanced red-black trees, the worst case is 2log(n+1) i.e ~40. You need to do no more than 40 compares to find your element and that is the absolute worst case. Red-black trees also cater to the hole/gap disappearing. This is miles ahead of iterating the list (~ 1/2 million on average!).

With AVL trees instead of red-black trees, the worst case guarantee is even better: 1.44 log(n+1), which is ~29 for a million items.

Moron
...and is part of the Java standard API with java.util.TreeMap
Michael Borgwardt
@Micheal: Thanks! I have added a link to that to my answer.
Moron
+6  A: 

You could do an ArrayList/Vector (Java/C++) and when you delete, instead swap the last element with the deleted element first. So if you have A B C D E, and you delete C, you'll end up with A B E D. Note that references to E will have to look at 2 instead of 4 now (assuming 0 indexed) but you said sort order isn't a problem.

I don't know if it handles this automatically (optimized for removing from the end easily) but if it's not you could easily write your own array-wrapper class.

glowcoder
Array/list semantics = keep the relative positions the same, isn't it? Swapping on deletion isn't list/array semantics. Perhaps OP should clarify...
Moron
By array semantics i believe he means accessing by index in O(1). He was specific about sort order not being an issue.
glowcoder
He also said list semantics... Anyway, this is pointless and OP seems disinterested anyway. I am done with this question. Please pardon me if I don't respond to any more comments.
Moron
@moron: Sorry, real-life called. :-) items shouldn't swap in that fashion - the hole/gap just needs to disappear. However @glowcoders answer would work for my immediate needs. I'm not sure where I was going wrt sort order. I'll have to ponder my question a bit more.
Chris Kaminski
@Chris: Real life? What is that? ;-) Anyway, after posting a question it might be good to wait around a bit to clarify anything ppl need :-). And question for you: why do you think O(log N) is same as iterating the list? For 1 million items, using balanced red-black trees, the worst case is 2log(n+1) i.e ~40. You need to do no more than 40 compares to find your element and that is the absolute worst case. Red-black trees also cater to the hole/gap disappearing. This is miles ahead of iterating the list (~ 1/2 million on average!). Not sure what made you think it might not be worth it...
Moron
...and with AVL trees instead of red-black trees, the worst case guarantee is even better: 1.44 log(n+1), which is ~29 for a million items.
Moron
For the record, while my solution answers your question, I believe a lg n approach better solves your problem. If I were in your shoes, I would be using trees.
glowcoder
+3  A: 

I'm not aware of any data structure that provides O(1) random access, insertion, and deletion, so I suspect you'll have to accept some tradeoffs.

LinkedList in Java provides O(1) insertion/deletion from the head or tail of the list is O(1), but random access is O(n).

ArrayList provides O(1) random access, but insertion/deletion is only O(1) at the tail of the list. If you insert/delete from the middle of the list, it has to move around the remaining elements in the list. On the bright side, it uses System.arraycopy to move elements, and it's my understanding that this is essentially O(1) on modern architectures because it literally just copies blocks of memory around instead of processing each element individually. I say essentially because there is still work to find enough contiguous blocks of free space, etc. and I'm not sure what the big-O might be on that.

Hank Gay
A: 

glowcoder's answer is the best. (I'd post this in a comment, but I'm not awesome enough to comment yet, apparently.)

Using an ArrayList and swapping the value to the end before deletion should give you O(1) performance for deletion.
Note: This only works because you stated that you didn't care if sort order was maintained. It will mess up the sort order, and that is why ArrayList can't just swap items to the end before deletion.

Kyle_Solo
@Kyle_Solo:Did you not read Mike Seymour's answer, which appeared at least 15 minutes(eternity in SO time) prior to glowcoder's answer? You need at least 50 rep to comment, check out the FAQ.
Moron
I've checked the FAQ. I must have skimmed over Mike Seymour's answer though, as on a closer look it does what glowcoder described.
Kyle_Solo
+2  A: 

Since you seem to want to insert at arbitrary positions in (near) constant time, I think using a std::deque is your best bet in C++. Unlike the std::vector, a deque (double-ended queue) is implemented as a list of memory pages, i.e. a chunked vector. This makes insertion and deletion at arbitrary positions a constant-time operation (depending only on the page size used in the deque). The data structure also provides random access (“array access”) in near-constant time – it does have to search for the correct page but this is a very fast operation in practice.

Java’s standard container library doesn’t offer anything similar but the implementation is straightforward.

Konrad Rudolph
A: 

Does the data structure described at http://research.swtch.com/2008/03/using-uninitialized-memory-for-fun-and.html do anything like what you want?

Zack