I'm sure there's a good reason, but could someone please explain why the java.util.Set interface lacks get(int Index), or any similar get() method?

It seems that sets are great for putting things into, but I can't find an elegant way of retrieving a single item from it.

If I know I want the first item, I can use set.iterator().next(), but otherwise it seems I have to cast to an Array to retrieve an item at a specific index?

What are the appropriate ways of retrieving data from a set? (other than using an iterator)

I'm sure the fact that it's excluded from the API means there's a good reason for not doing this -- could someone please enlighten me?

EDIT: Some great answers here, and a few asking for more context. The specific scenario was a dbUnit test, where I could reasonably assert that the set returned from a query had only 1 item, and I was trying to access that item.
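
For the single-item case described above, a minimal sketch might look like the following. The helper name onlyElement and the class name are made up for illustration, and Collections.singleton merely stands in for whatever set the query returns. It still uses the iterator under the hood, since the Set interface offers nothing else.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class SingleElementDemo {

    // Hypothetical helper: returns the sole element, failing loudly if the size is not 1.
    public static <T> T onlyElement(Set<T> set) {
        if (set.size() != 1) {
            throw new IllegalStateException("expected exactly 1 element, found " + set.size());
        }
        return set.iterator().next();
    }

    public static void main(String[] args) {
        // Stand-in for the set returned by a query.
        Set<String> result = Collections.singleton("row-1");
        System.out.println(onlyElement(result)); // prints "row-1"

        // Index access needs a copy into a List; for a general Set the index
        // only reflects the order in which the iterator happened to return elements.
        List<String> copy = new ArrayList<>(result);
        System.out.println(copy.get(0)); // prints "row-1"
    }
}
```

Checking the size first turns a wrong assumption into an immediate failure instead of silently picking an arbitrary element.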

However, the question is more valid without the scenario, as it remains more focused: what's the difference between a set and a list?

Thanks to all for the fantastic answers below.

+54  A: 

Because sets are not ordered. Some implementations are, but that is not a general property of sets.

If you're trying to use sets this way, you should consider using a list instead.

(Edit: Several commenters thought I wasn't strong enough on this last point. From the comments: "Consider it, then do it.")

Michael Myers
I'd revise that last comment to "you should use a list instead"
matt b
@matt b: No, I think he should consider it. Thinking is good. ;)
Michael Myers
Consider it, then do it.
Joe Philllips
"Consider" is the correct phrasing. There are two possible problems (a) He is using a set when he should be using something else, or (b) He is trying to do things with Sets that they don't support but that he could do a different way. It is good to *consider* which of these is the case.
kenj0418
There's your guru badge. :)
raven
+1  A: 

That is because Set only guarantees uniqueness and says nothing about optimal access or usage patterns. That is, a Set can be backed by a list-like or a map-like structure, each of which has very different retrieval characteristics.

jsight
+9  A: 

Just adding one point that was not mentioned in mmyers' answer.

If I know I want the first item, I can use set.iterator().next(), but otherwise it seems I have to cast to an Array to retrieve an item at a specific index?

What are the appropriate ways of retrieving data from a set? (other than using an iterator)

You should also familiarise yourself with the SortedSet interface (whose most common implementation is TreeSet).

A SortedSet is a Set (i.e. elements are unique) that is kept ordered by the natural ordering of the elements or using some Comparator. You can easily access the first and last items using first() and last() methods. A SortedSet comes in handy every once in a while, when you need to keep your collection both duplicate-free and ordered in a certain way.

Edit: If you need a Set whose elements are kept in insertion-order (much like a List), take a look at LinkedHashSet.
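
A small sketch of the difference between the two, using only java.util classes. The class name and sample strings are arbitrary.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class OrderedSetsDemo {
    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "pear", "fig");

        // TreeSet keeps elements in their natural (here: alphabetical) order.
        SortedSet<String> sorted = new TreeSet<>(input);
        System.out.println(sorted);         // [apple, fig, pear]
        System.out.println(sorted.first()); // apple
        System.out.println(sorted.last());  // pear

        // LinkedHashSet keeps elements in the order they were first inserted.
        Set<String> insertionOrdered = new LinkedHashSet<>(input);
        System.out.println(insertionOrdered); // [pear, apple, fig]
    }
}
```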

Jonik
I like LinkedHashSet myself. But yes, this is good to mention. +1
Michael Myers
Thanks, I tweaked the answer a little. (Seems I had some aspects of TreeSet confused with those of LinkedHashSet.)
Jonik
+3  A: 

The only reason I can think of for using a numerical index in a set would be for iteration. For that, use

for(A a : set) { 
   visit(a); 
}
Hugo
Not true, what about accessing a random element?
Jeremybub
Ha, ha. good point :) but that would be highly prone to misuse, i'm sure.
Hugo
+5  A: 

I'm not sure if anybody has spelled it out exactly this way, but you need to understand the following:

There is no "first" element in a set.

Because, as others have said, sets have no ordering. A set is a mathematical concept that specifically does not include ordering.

Of course, your computer can't really keep a list of stuff that's not ordered in memory. It has to have some ordering. Internally it's an array or a linked list or something. But you don't really know what it is, and it doesn't really have a first element; the element that comes out "first" comes out that way by chance, and might not be first next time. Even if you took steps to "guarantee" a particular first element, it's still coming out by chance, because you just happened to get it right for one particular implementation of a Set; a different implementation might not work that way with what you did. And, in fact, you may not know the implementation you're using as well as you think you do.

People run into this ALL. THE. TIME. with RDBMS systems and don't understand. An RDBMS query returns a set of records. This is the same type of set from mathematics: an unordered collection of items, only in this case the items are records. An RDBMS query result has no guaranteed order at all unless you use the ORDER BY clause, but all the time people assume it does and then trip themselves up some day when the shape of their data or code changes slightly and triggers the query optimizer to work a different way and suddenly the results don't come out in the order they expect. These are typically the people who didn't pay attention in database class (or when reading the documentation or tutorials) when it was explained to them, up front, that query results do not have a guaranteed ordering.

skiphoppy
Heh, and of course the ordering usually changes right after the code goes into production, when it's too slow, so they add an index to speed up the query. Now the code runs fast, but gives the wrong answers. And nobody notices for three or four days...if you're lucky. If you're not lucky, nobody notices for a month...
TMN
+7  A: 

This kind of leads to the question when you should use a set and when you should use a list. Usually, the advice goes:

  1. If you need ordered data, use a list
  2. If you need unique data, use a set
  3. If you need both, use a sorted set

A fourth case that appears often is that you need neither. In this case you see some programmers go with lists and some with sets. Personally I find it very harmful to see set as a list without ordering - because it is really a whole other beast. Unless you need stuff like set uniqueness or set equality, always favor lists.
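
To illustrate why a set is "a whole other beast", here is a minimal sketch of how uniqueness and equality differ between the two. The class name and values are arbitrary.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ListVsSetDemo {
    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3);
        List<Integer> b = Arrays.asList(3, 2, 1, 2);

        // List equality is positional: same elements in the same order.
        System.out.println(a.equals(b)); // false

        // Set drops duplicates, and equality ignores order entirely.
        Set<Integer> sa = new HashSet<>(a);
        Set<Integer> sb = new HashSet<>(b);
        System.out.println(sa.size());     // 3
        System.out.println(sa.equals(sb)); // true
    }
}
```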

waxwing
If you are unspecific, accept Collection<T> or even Iterable<T> and initialize as a List.
Andreas Petersson
A: 

Some data structures are missing from the standard Java collections:

Bag (like a set, but can contain an element multiple times)

UniqueList (an ordered list that can contain each element only once)

It seems you would need a UniqueList in this case.

If you need flexible data structures, you might be interested in Google Collections.

Andreas Petersson
A: 

That's true: elements in a Set are not ordered, by definition of the Set collection, so they can't be accessed by an index.

But why don't we have a get(Object) method that takes, instead of an index, an object equal to the one we are looking for? That way we could access the data of the element stored in the Set just by knowing the attributes used by its equals method.
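
One common workaround, sketched here under assumptions, is to keep a Map from each element to itself, so that a lookup by an equal object returns the stored instance. The User class below is purely hypothetical; its equality is based on id only, so two equal objects can carry different data.

```java
import java.util.HashMap;
import java.util.Map;

public class CanonicalLookupDemo {

    // Hypothetical element type: equals/hashCode consider only the id.
    public static final class User {
        public final int id;
        public final String name;

        public User(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof User && ((User) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    public static void main(String[] args) {
        // Each element is stored as both key and value.
        Map<User, User> users = new HashMap<>();
        User stored = new User(42, "Ada");
        users.put(stored, stored);

        // The probe equals the stored element but lacks its data.
        User probe = new User(42, null);
        User found = users.get(probe);
        System.out.println(found.name); // Ada
    }
}
```

The map's keySet() still behaves like the set of elements, while get() supplies the missing lookup.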

walls
+2  A: 

Actually this is a recurring question when writing JavaEE applications which use Object-Relational Mapping (for example with Hibernate); and from all the people who replied here, Andreas Petersson is the only one who understood the real issue and offered the correct answer to it: Java is missing a UniqueList! (or you can also call it OrderedSet, or IndexedSet).

waxwing mentioned this use-case (in which you need ordered AND unique data) and he suggested the SortedSet, but this is not what Marty Pitt really needed.

This "IndexedSet" is NOT the same as a SortedSet - in a SortedSet the elements are sorted by using a Comparator (or using their "natural" ordering).

But instead it is closer to a LinkedHashSet (which others also suggested), or even more so to a (likewise nonexistent) "ArrayListSet", because it guarantees that the elements are returned in the same order as they were inserted.

But the LinkedHashSet is an implementation, not an interface! What is needed is an IndexedSet (or ListSet, or OrderedSet, or UniqueList) interface! This will allow the programmer to specify that he needs a collection of elements that have a specific order and without duplicates, and then instantiate it with any implementation (for example an implementation provided by Hibernate).
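
As a rough sketch of what such an interface could look like: the names IndexedSet and ArrayListSet are taken from the wording above and are hypothetical, not JDK types, and the implementation is just one simple possibility (an ArrayList for order plus a HashSet for membership), not a tuned one.

```java
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical interface: a Set that also exposes positional access.
interface IndexedSet<E> extends Set<E> {
    E get(int index);
}

// Hypothetical implementation: insertion order in a list, membership in a hash set.
class ArrayListSet<E> extends AbstractSet<E> implements IndexedSet<E> {
    private final List<E> order = new ArrayList<>();
    private final Set<E> members = new HashSet<>();

    @Override
    public boolean add(E e) {
        if (!members.add(e)) {
            return false; // duplicate: keep the original position
        }
        order.add(e);
        return true;
    }

    @Override
    public E get(int index) {
        return order.get(index);
    }

    @Override
    public boolean contains(Object o) {
        return members.contains(o);
    }

    @Override
    public int size() {
        return order.size();
    }

    @Override
    public Iterator<E> iterator() {
        Iterator<E> it = order.iterator();
        return new Iterator<E>() {
            private E last;

            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public E next() {
                last = it.next();
                return last;
            }

            @Override
            public void remove() {
                it.remove();          // removes from the ordered list
                members.remove(last); // keeps the membership set in sync
            }
        };
    }
}

public class IndexedSetDemo {
    public static void main(String[] args) {
        IndexedSet<String> s = new ArrayListSet<>();
        s.add("b");
        s.add("a");
        s.add("b"); // ignored, already present
        System.out.println(s.get(0)); // b
        System.out.println(s.size()); // 2
    }
}
```

Code written against the interface would then not care whether the instance comes from this class or from a framework-provided implementation.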

Since JDK is open-source, maybe this interface will be finally included in Java 7...

Sorin Postelnicu
A: 

To get an element from a Set, I use the following:

import java.util.Set;
import java.util.TreeSet;

// Returns the instance stored in the set that equals the given element,
// or null if the set contains no such element.
public static <T> T getElement(Set<T> set, T element) {
    if (set instanceof TreeSet<?>) {
        // A TreeSet can locate the candidate in O(log n) via floor().
        T floor = ((TreeSet<T>) set).floor(element);
        return (floor != null && floor.equals(element)) ? floor : null;
    }
    // Any other Set: fall back to a linear scan.
    for (T current : set) {
        if (current.equals(element)) {
            return current;
        }
    }
    return null;
}
lala
A: 

I ran into situations where I actually wanted a *Sorted*Set with access via index (I concur with other posters that accessing an unsorted Set with an index makes no sense). An example would be a tree where I wanted the children to be sorted and duplicate children were not allowed.

I needed the access via index to display them and the set attributes came in handy to efficiently eliminate duplicates.

Finding no suitable collection in java.util or Google Collections, I found it straightforward to implement it myself. The basic idea is to wrap a SortedSet and create a List when access via index is required (and forget the list when the SortedSet is changed). Of course, this only works efficiently when modifications of the wrapped SortedSet and accesses to the list happen in separate phases of the collection's lifetime. Otherwise it behaves like a list which is sorted often, i.e. too slow.

With large numbers of children, this improved performance a lot over a list I kept sorted via Collections.sort.
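
A minimal sketch of the wrapping idea described above, not the actual implementation: the class name IndexableSortedSet and its small API are invented here, and a TreeSet is assumed as the wrapped SortedSet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical wrapper: a sorted, duplicate-free collection with index access.
public class IndexableSortedSet<E extends Comparable<? super E>> {
    private final SortedSet<E> set = new TreeSet<>();
    private List<E> snapshot; // null whenever the set changed since the last indexed read

    public boolean add(E e) {
        boolean changed = set.add(e);
        if (changed) {
            snapshot = null; // forget the list, it is stale now
        }
        return changed;
    }

    public boolean remove(E e) {
        boolean changed = set.remove(e);
        if (changed) {
            snapshot = null;
        }
        return changed;
    }

    public E get(int index) {
        if (snapshot == null) {
            snapshot = new ArrayList<>(set); // O(n) rebuild, only after a change
        }
        return snapshot.get(index);
    }

    public int size() {
        return set.size();
    }

    public static void main(String[] args) {
        IndexableSortedSet<Integer> children = new IndexableSortedSet<>();
        children.add(3);
        children.add(1);
        children.add(2);
        children.add(1); // duplicate, ignored
        System.out.println(children.get(0)); // 1
        System.out.println(children.get(2)); // 3
    }
}
```

Every indexed read after a modification pays for one full copy, which is why interleaving writes and reads makes this slow.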

A: 

Hi everybody and excuse the level of my English.

About the question "but could someone please explain why the java.util.Set interface lacks get(int Index), or any similar get() method":

Notice that if you already have an object, there is no reason to search for it: you already have it.

Generally, to use get() you must provide an index or a key, and you get back the object at that index or mapped to that key.

Even with List implementations, suppose you want to get an object from an ArrayList: how do you know the position at which it is stored, so that you can pass that index to the get method? The same holds even if we use a ListIterator instead of an Iterator, which gives us methods such as nextIndex() and previousIndex(). I think the get method is suited only to Map implementations. Thanks.