I have an ArrayList of Strings, and I want to remove repeated strings from it. How can I do this?
If you have any control over the creation of your list, you might want to consider using a Map instead. Alternatively, you could copy the elements from your ArrayList into a Map.
If you don't want duplicates in a Collection, you should consider why you're using a Collection that allows duplicates. The easiest way to remove repeated elements is to add the contents to a Set (which will not allow duplicates) and then add the Set back to the ArrayList:
ArrayList<String> al = new ArrayList<String>();
// add elements to al, including duplicates
HashSet<String> hs = new HashSet<String>();
hs.addAll(al);   // the Set silently drops repeated elements
al.clear();
al.addAll(hs);
Of course, this destroys the ordering of the elements in the ArrayList, since a HashSet makes no guarantee about its iteration order.
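As a minimal, self-contained sketch of the Set round trip described above: the class name `HashSetDedupe` and the method name `dedupeUnordered` are made up for illustration, and the list is assumed to be modifiable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashSetDedupe {
    // Removes duplicates in place. The resulting order is unspecified,
    // because HashSet does not define an iteration order.
    static <T> void dedupeUnordered(List<T> list) {
        Set<T> unique = new HashSet<T>(list);
        list.clear();
        list.addAll(unique);
    }

    public static void main(String[] args) {
        List<String> al = new ArrayList<String>(Arrays.asList("b", "a", "b", "c", "a"));
        dedupeUnordered(al);
        System.out.println(al.size()); // prints 3
        System.out.println(al);        // a, b and c in some order
    }
}
```

Only the size and the set of elements can be relied on afterwards, not their positions.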
If you don't want duplicates, use a Set instead of a List. To convert a List to a Set you can use the following code:
// list is some List of Strings
Set<String> s = new HashSet<String>(list);
If really necessary, you can use the same construction to convert a Set back into a List.
Although converting the ArrayList to a HashSet effectively removes duplicates, if you need to preserve insertion order, I'd suggest using this variant instead:
// list is some List of Strings
Set<String> s = new LinkedHashSet<String>(list);
Then, if you need to get back a List reference, you can use the conversion constructor again.
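For illustration, here is a small sketch of that round trip through LinkedHashSet and back into a List. The class name `OrderedDedupe` and the method name `dedupeKeepingOrder` are hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedupe {
    // Returns a new list without duplicates, keeping the first occurrence
    // of each element in its original position order.
    static <T> List<T> dedupeKeepingOrder(List<T> list) {
        return new ArrayList<T>(new LinkedHashSet<T>(list));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(dedupeKeepingOrder(list)); // prints [b, a, c]
    }
}
```

This version leaves the input untouched and returns a copy, which also makes it usable with fixed-size or unmodifiable lists.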
As said before, you should use a class implementing the Set interface instead of List to guarantee the uniqueness of elements. If you need the elements kept in sorted order, the SortedSet interface can be used; the TreeSet class implements that interface. Note that this is sorted order, not insertion order; to keep insertion order, use LinkedHashSet instead.
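A brief sketch of the TreeSet variant, assuming the natural ordering of String is what you want (natural ordering also means null elements are rejected). The names `SortedDedupe` and `dedupeSorted` are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class SortedDedupe {
    // Returns a new list with duplicates removed and the remaining
    // elements in their natural (here: lexicographic) order.
    static List<String> dedupeSorted(List<String> list) {
        SortedSet<String> sorted = new TreeSet<String>(list);
        return new ArrayList<String>(sorted);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(dedupeSorted(list)); // prints [apple, fig, pear]
    }
}
```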
Probably a bit overkill, but I enjoy this kind of isolated problem. :)
This code uses a temporary Set (for the uniqueness check) but removes elements directly inside the original list. Since removing elements from the middle of an ArrayList can induce a huge amount of array copying, remove(int) is only called on the tail of the list, where nothing needs to be shifted.
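// requires java.util.ArrayList, java.util.HashSet, java.util.Set
public static <T> void removeDuplicates(ArrayList<T> list) {
    int size = list.size();
    int out = 0; // next write position for a first-seen element
    {
        final Set<T> encountered = new HashSet<T>();
        for (int in = 0; in < size; in++) {
            final T t = list.get(in);
            // add() returns true only the first time t is seen
            final boolean first = encountered.add(t);
            if (first) {
                list.set(out++, t); // compact unique elements toward the front
            }
        }
    }
    // trim the leftover tail, removing from the end so nothing is shifted
    while (out < size) {
        list.remove(--size);
    }
}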
While we're at it, here's a version for LinkedList (a lot nicer!):
public static <T> void removeDuplicates(LinkedList<T> list) {
    final Set<T> encountered = new HashSet<T>();
    for (Iterator<T> iter = list.iterator(); iter.hasNext(); ) {
        final T t = iter.next();
        final boolean first = encountered.add(t);
        if (!first) {
            iter.remove();
        }
    }
}
Use the RandomAccess marker interface to present a unified solution for List:
public static <T> void removeDuplicates(List<T> list) {
    if (list instanceof RandomAccess) {
        // use first version here
    } else {
        // use other version here
    }
}
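With the two placeholders filled in, the unified method might look roughly like the sketch below. It simply inlines the two versions shown earlier; the wrapper class name `ListDedupe` is made up for this example, and the list is assumed to be modifiable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;
import java.util.Set;

class ListDedupe {
    static <T> void removeDuplicates(List<T> list) {
        final Set<T> encountered = new HashSet<T>();
        if (list instanceof RandomAccess) {
            // Index-based version: compact first-seen elements to the front,
            // then trim the tail from the end.
            int size = list.size();
            int out = 0;
            for (int in = 0; in < size; in++) {
                final T t = list.get(in);
                if (encountered.add(t)) {
                    list.set(out++, t);
                }
            }
            while (out < size) {
                list.remove(--size);
            }
        } else {
            // Iterator-based version: unlink repeated elements as they are met.
            for (Iterator<T> iter = list.iterator(); iter.hasNext(); ) {
                if (!encountered.add(iter.next())) {
                    iter.remove();
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<String>(Arrays.asList("b", "a", "b", "c", "a"));
        List<String> l = new LinkedList<String>(Arrays.asList("b", "a", "b", "c", "a"));
        removeDuplicates(a);
        removeDuplicates(l);
        System.out.println(a); // prints [b, a, c]
        System.out.println(l); // prints [b, a, c]
    }
}
```

Both branches keep the first occurrence of each element, so the result order is the same regardless of the list implementation.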
EDIT: I guess the generics-stuff doesn't really add any value here.. Oh well. :)
public static <T> void removeDuplicates(ArrayList<T> aList) {
    for (int i = 0; i < aList.size(); i++) {
        removeDuplicates(aList, aList.get(i));
    }
}