I have an ArrayList of Strings, and I want to remove repeated strings from it. How can I do this?

A: 

If you have any control over how the list is created, you might want to consider using a Map instead. Alternatively, you could copy the elements from your ArrayList into a Map.

Simon
Why would you use a Map and not a Set?
Luc Touraille
+9  A: 

If you don't want duplicates in a Collection, you should consider why you're using a Collection that allows duplicates. The easiest way to remove repeated elements is to add the contents to a Set (which will not allow duplicates) and then add the Set back to the ArrayList:

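import java.util.ArrayList;
import java.util.HashSet;

ArrayList<String> al = new ArrayList<String>();
// add elements to al, including duplicates
HashSet<String> hs = new HashSet<String>();
hs.addAll(al);  // the Set silently drops repeated elements
al.clear();
al.addAll(hs);  // copy the unique elements back into the list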

Of course, this destroys the ordering of the elements in the ArrayList...

jonathan-stafford
See also LinkedHashSet, if you wish to retain the order.
volley
+16  A: 

If you don't want duplicates, use a Set instead of a List. To convert a List to a Set you can use the following code:

// list is some List of Strings
Set<String> s = new HashSet<String>(list);

If really necessary you can use the same construction to convert a Set back into a List.
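
As a rough sketch of that round trip (the class and method names here are made up for illustration, not part of any library), converting to a HashSet and back might look like this. Note that a HashSet makes no promise about iteration order, so the resulting list can come out in any order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetRoundTrip {
    // Hypothetical helper: returns a new list holding each distinct element once.
    // The order of the result is unspecified because HashSet is unordered.
    static List<String> distinct(List<String> list) {
        Set<String> s = new HashSet<String>(list);  // List -> Set drops duplicates
        return new ArrayList<String>(s);            // Set -> List, same kind of constructor
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(distinct(list).size());  // prints 3
    }
}
```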

Bno
+15  A: 

Converting the ArrayList to a HashSet effectively removes duplicates, but if you need to preserve insertion order, I'd suggest using this variant instead:

// list is some List of Strings
Set<String> s = new LinkedHashSet<String>(list);

Then, if you need a List reference again, you can use the conversion constructor once more.
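
For what it's worth, here is a small sketch of the whole thing end to end. The names OrderedDedup and distinctInOrder are just placeholders I picked for the example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedup {
    // Hypothetical helper: keeps the first occurrence of each element,
    // in the order the elements first appeared in the input list.
    static List<String> distinctInOrder(List<String> list) {
        return new ArrayList<String>(new LinkedHashSet<String>(list));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(distinctInOrder(list));  // prints [b, a, c]
    }
}
```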

abahgat
A: 

As said before, you should use a class implementing the Set interface instead of List to be sure the elements are unique. If you need the elements kept in sorted order, you can use the SortedSet interface; the TreeSet class implements it. Note that this is the elements' natural (or Comparator-defined) order, not the order in which they were inserted.
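
To make the ordering concrete, here is a minimal sketch (SortedDedup and distinctSorted are names invented for this example). A TreeSet hands the elements back in their natural sorted order, which is generally not the order they were added in:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class SortedDedup {
    // Hypothetical helper: removes duplicates and returns the elements
    // sorted by String's natural (lexicographic) ordering.
    static List<String> distinctSorted(List<String> list) {
        return new ArrayList<String>(new TreeSet<String>(list));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(distinctSorted(list));  // prints [apple, fig, pear]
    }
}
```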

Vinze
A: 

Probably a bit overkill, but I enjoy this kind of isolated problem. :)

This code uses a temporary Set (for the uniqueness check) but removes elements directly inside the original list. Since removing an element from the middle of an ArrayList can induce a huge amount of array copying, remove(int) is only ever called on the last element, where nothing has to be shifted.

public static <T> void removeDuplicates(ArrayList<T> list) {
    int size = list.size();
    int out = 0;
    {
        final Set<T> encountered = new HashSet<T>();
        for (int in = 0; in < size; in++) {
            final T t = list.get(in);
            final boolean first = encountered.add(t);
            if (first) {
                list.set(out++, t);
            }
        }
    }
    while (out < size) {
        list.remove(--size);
    }
}

While we're at it, here's a version for LinkedList (a lot nicer!):

public static <T> void removeDuplicates(LinkedList<T> list) {
    final Set<T> encountered = new HashSet<T>();
    for (Iterator<T> iter = list.iterator(); iter.hasNext(); ) {
        final T t = iter.next();
        final boolean first = encountered.add(t);
        if (!first) {
            iter.remove();
        }
    }
}

Use the RandomAccess marker interface to present a unified solution for any List:

public static <T> void removeDuplicates(List<T> list) {
    if (list instanceof RandomAccess) {
        // use first version here
    } else {
        // use other version here
    }
}
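
Filled in, the dispatch could look roughly like the sketch below. The class name ListDedup and the two private helper names are my own placeholders; the helpers take a plain List rather than ArrayList/LinkedList so no casting is needed. It assumes the list is modifiable and resizable (a fixed-size list such as the one returned by Arrays.asList would throw on removal).

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.RandomAccess;
import java.util.Set;

class ListDedup {
    // Picks a strategy based on the RandomAccess marker interface.
    static <T> void removeDuplicates(List<T> list) {
        if (list instanceof RandomAccess) {
            removeDuplicatesIndexed(list);
        } else {
            removeDuplicatesSequential(list);
        }
    }

    // For random access lists: compact the unique elements to the front,
    // then trim the leftover tail from the end.
    private static <T> void removeDuplicatesIndexed(List<T> list) {
        Set<T> encountered = new HashSet<T>();
        int size = list.size();
        int out = 0;
        for (int in = 0; in < size; in++) {
            T t = list.get(in);
            if (encountered.add(t)) {  // add() returns true for a first occurrence
                list.set(out++, t);
            }
        }
        while (out < size) {
            list.remove(--size);       // always removes the last element
        }
    }

    // For sequential lists: drop repeats through the iterator.
    private static <T> void removeDuplicatesSequential(List<T> list) {
        Set<T> encountered = new HashSet<T>();
        for (Iterator<T> iter = list.iterator(); iter.hasNext(); ) {
            if (!encountered.add(iter.next())) {
                iter.remove();
            }
        }
    }
}
```

Both branches keep the first occurrence of each element and preserve the original order.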

EDIT: I guess the generics-stuff doesn't really add any value here.. Oh well. :)

volley
Why use ArrayList in parameter? Why not just List? Will that not work?
Shervin
A List will absolutely _work_ as the parameter type for the first method listed. The method is however _optimized_ for use with a random access list such as ArrayList, so if a LinkedList is passed instead you will get poor performance. For example, setting the nth element in a LinkedList takes O(n) time, whereas setting the nth element in a random access list (such as ArrayList) takes O(1) time. Again, though, this is probably overkill... If you need this kind of specialized code it will hopefully be in an isolated situation.
volley
A: 
public static <T> void removeDuplicates(ArrayList<T> aList){
    for (int i = 0; i < aList.size(); i++) {
        removeDuplicates(aList, aList.get(i));
    }
}
Where is the other removeDuplicates method defined?
Michael Myers