views:

685

answers:

6

What is the simplest way to find if two Lists contain exactly the same elements, in the standard Java libraries?

It shouldn't matter if the two Lists are the same instance or not, and it shouldn't matter if the type parameter of the Lists are different.

e.g.

List list1
List<String> list2; 
// ... construct etc

list1.add("A");
list2.add("A"); 
// the function, given these two lists, should return true

There's probably something staring me in the face I know :-)


EDIT: To clarify, I was looking for the EXACT same elements and number of elements, in order.


EDIT: Thanks for pointing out the obvious answer I couldn't see for looking :-)

Although all the answers given so far are correct, some are more correct than others, so I'll wait a while for the best rounded-off answer before accepting.

+1  A: 

The equals method on List will do this, Lists are ordered, so to be equal two Lists must have the same elements in the same order.

return list1.equals(list2);
daveb
Lists are not ordered unless you sort them.
Michael Myers
Sigh@Myself. Such an obvious answer. You know it's been too long a day when you can't even Ctrl+F a web page any longer.:)
Grundlefleck
@mmyers: items *in* lists are not ordered unless you sort them. Lists themselves have an implicit ordering of items (by index), which don't change unless you change the items in the list. (vs. Sets or Collections where there's not a guarantee of consistent ordering if you iterate through them twice)
Jason S
I think what daveb means by saying lists are ordered is that List.equals takes the order of the elements into consideration to determine equality. See the Javadoc.
laz
What I mean is that a list containing {"A", "B"} and a list containing {"B", "A"} would be unequal with this method. That may very well be what is intended, but I wanted to make sure no one was overlooking it.
Michael Myers
+1  A: 
list1.equals(list2);

If your list contains a custom Class MyClass, this class must override the equals function.

 class MyClass
  {
  int field=0;
  @0verride
  public boolean equals(Object other)
        {
        if(this==other) return true;
        if(other==null || !(other instanceof MyClass)) return false;
        return this.field== MyClass.class.cast(other);
        }
  }

Note :if you want to test equals on a java.util.Set rather than a java.util.List, then your object must override the hashCode function.

Pierre
A: 

It depends on what concrete List class you are using. The abstract class AbstractCollection has a method called containsAll(Collection) that takes another collection ( a List is a collection) and:

Returns true if this collection contains all of the elements in the specified collection.

So if an ArrayList is being passed in you can call this method to see if they are exactly the same.

       List foo = new ArrayList();
 List bar = new ArrayList();
 String str = "foobar";

 foo.add(str);
 bar.add(str);

 foo.containsAll(bar);

The reason for containsAll() is because it iterates through the first list looking for the match in the second list. So if they are out of order equals() will not pick it up.

EDIT: I just want to make a comment here about the amortized running time of performing the various options being offered. Is running time important? Sure. Is it the only thing you should consider? No.

The cost of copying EVERY single element from your lists into other lists takes time, and it also takes up a good chunk of memory (effectively doubling the memory you are using).

So if memory in your JVM isn't a concern (which it should generally be) then you still need to consider the time it takes to copy every element from two lists into two TreeSets. Remember it is sorting every element as it enters them.

My final advice? You need to consider your data set and how many elements you have in your data set, and also how large each object in your data set is before you can make a good decision here. Play around with them, create one each way and see which one runs faster. It's a good exercise.

amischiefr
Wouldn't it have to befoo.containsAll(bar) ?
Carl Manaster
No, it goes through every element in foo and sees if bar contains that element. It then ensures that the length is the same of the two lists. If for every foo there is an element in bar such that foo.element == bar.element and foo.length == bar.length then they contain the same elements.
amischiefr
do we know if there is an efficiency guarantee? or is this typically O(n^2)?
Tom
Like any other array that iterates through looking for a matching element the worst case running time is going to be O(n^2). In this case, it looks like the implementation is indeed iterating through one element at a time looking for the match. I won't speculate on the amortized running time, but yes the worst case is O(n^2).
amischiefr
+10  A: 

If you care about order, then just use the equals method:

list1.equals(list2)

From the javadoc:

Compares the specified object with this list for equality. Returns true if and only if the specified object is also a list, both lists have the same size, and all corresponding pairs of elements in the two lists are equal. (Two elements e1 and e2 are equal if (e1==null ? e2==null : e1.equals(e2)).) In other words, two lists are defined to be equal if they contain the same elements in the same order. This definition ensures that the equals method works properly across different implementations of the List interface.

If you want to check independent of order, you could copy all of the elements to Sets and use equals on the resulting Sets:

Set<Object> set1 = new Set<Object>();
set1.addAll(list1);
Set<Object> set2 = new Set<Object>();
set2.addAll(list2);
set1.equals(set2)

One caveat with this approach is that it won't check duplicates exactly. eg: if list1 was ["A", "B", "A"] and list2 was ["A", "B", "B"] the Set appraoch would say they were equal.

If you don't want these to be treated as equal but you don't care about order you can either sort both lists before comparing them or you could do the same thing as the Set approach but with a Multiset (not part of the standard libraries, but Google Collections has a good one.

Laurence Gonsalves
Couldn't you use containsAll if you want to check independent of order?
laz
I don't know about the implementation details of containsAll, but it seems like it could be bad. If containsAll calls contains() over and over, you will have an O(n^2) alg. The sets overall should be O(nlogn)
Tom
Yes, HOWEVER if the lists may not be in the same order, there isn't much more you can do. You have to iterate through list B with every element of list A to see if it has that element.
amischiefr
Actually, if the sets are just going to be O(nlogn), another approach is to call Collections.sort() on a list, and then use equals. If you want to preserve order though, you would need to copy the list, and that may be expensive and favor the set solution... so you have to think about your situation :-).
Tom
@amischiefr: are you suggesting O(n^2) is the best you can do?
Tom
I am saying that if he is using Lists (possibly ArrayLists) for his implementation, then calling containsAll() is going to be faster than trying to add every element from each list into separate sets and trying to sort them. UNLESS the data set is sufficiently large. You can't just look at the big O running value here. You have to factor in the time it takes to copy each list to a set as well. If the data set is sufficiently large, then yes your way is better, if not...
amischiefr
@amischiefr: what you say is partly true: for very small lists the containsAll solution may be faster than the Set solution, similar to the way selection sort beats quicksort for very small lists. If you care about the performance for small Lists you could benchmark to verify, but I imagine the switchover (when the Sets approach beats containsAll) is quite low. Copying items from a List into a Set in Java is just copying references, not deep-copying arbitrarily large objects, so the constant multiple is quite small. By the way, if you do use containsAll, remember to check both directions.
Laurence Gonsalves
+3  A: 

I posted a bunch of stuff in comments I think it warrants its own answer.

As everyone says here, using equals() depends on the order. If you don't care about order, you have 3 options.

Option 1

Use containsAll(). This option is not ideal, in my opinion, because it offers worst case performance, O(n^2).

Option 2

There are two variations to this:

2a) If you don't care about maintaining the order ofyour lists... use Collections.sort() on both list. Then use the equals(). This is O(nlogn), because you do two sorts, and then an O(n) comparison.

2b) If you need to maintain the lists' order, you can copy both lists first. THEN you can use solution 2a on both the copied lists. However this might be unattractive if copying is very expensive.

This leads to:

Option 3

If your requirements are the same as part 2b, but copying is too expensive. You can use a TreeSet to do the sorting for you. Dump each list into its own TreeSet. It will be sorted in the set, and the original lists will remain intact. Then perform an equals() comparison on both TreeSets. The TreeSetss can be built in O(nlogn) time, and the equals() is O(n).

Take your pick :-).

EDIT: I almost forgot the same caveat that Laurence Gonsalves points out. The TreeSet implementation will eliminate duplicates. If you care about duplicates, you will need some sort of sorted multiset.

Tom
If you care about duplicates you can always test that the size of the collections are equal before any other tests.
laz
More specifically, if having duplicates indicates inequality, the size of the lists must be the same before any equality check has a chance to succeed.
laz
@laz: checking the size won't work if different elements are duplicated in the two lists. eg: [A, A, B] vs [A, B, B] are equal size.
Laurence Gonsalves
@Laurence: I agree that laz's post is a bit confusing (I read it a few times before I understood it). I take it that he is just trying to provide a "shortcut" for the special case when 2 conditions hold: (1) duplicates matter, and (2) the list sizes are different. In your example, I think laz is still saying it is necessary to do all the same checks we discussed. (At least that's how I read it). If duplicates DO NOT matter, then you can't use size as a special case check. But when the 2 conditions hold, you can just say "if (list1.size() != list2.size()) return false;.
Tom