In Java, how do I convert an array of strings to an array of unique values?
If I have this array of Strings:
String[] test = {"1","1","1","2"};
And I want to end up with:
String[] uq = {"1","2"};
An easy way is to create a set, add each element in the array to it, and then convert the set to an array.
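// Requires java.util.Arrays, java.util.HashSet, java.util.List, java.util.Set
List<String> list = Arrays.asList(test);
Set<String> set = new HashSet<>(list);
// The no-argument toArray() returns Object[], so pass a typed array to get a String[]
String[] uq = set.toArray(new String[0]);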
A quick but somewhat inefficient way would be:
Set<String> temp = new HashSet<String>(Arrays.asList(test));
String[] uq = temp.toArray(new String[temp.size()]);
String[] test = {"1","1","1","2"};
// Building a set from the array drops the duplicates
java.util.Set<String> result = new java.util.HashSet<>(java.util.Arrays.asList(test));
System.out.println(result);
An alternative to the HashSet approach would be to:
1. Sort the input array.
2. Count the number of non-duplicate values in the sorted array.
3. Allocate the output array.
4. Iterate over the sorted array, copying the non-duplicate values to it.
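The steps above could be sketched roughly as follows. This is only an illustration, not a tuned implementation: the class and method names (SortedUnique, unique) are made up for the example, it sorts a copy so the caller's array is left untouched, and it assumes the input contains no null elements.

```java
import java.util.Arrays;

class SortedUnique {
    // Hypothetical helper: returns the distinct values of input in sorted order.
    // Assumes input has no null elements; the input array itself is not modified.
    static String[] unique(String[] input) {
        if (input.length == 0) {
            return new String[0];
        }

        // Step 1: sort a copy so that equal values end up next to each other.
        String[] sorted = input.clone();
        Arrays.sort(sorted);

        // Step 2: count the values that differ from their predecessor.
        int count = 1;
        for (int i = 1; i < sorted.length; i++) {
            if (!sorted[i].equals(sorted[i - 1])) {
                count++;
            }
        }

        // Step 3: allocate the output array with exactly that many slots.
        String[] result = new String[count];

        // Step 4: copy the first value, then every value that starts a new run.
        result[0] = sorted[0];
        int next = 1;
        for (int i = 1; i < sorted.length; i++) {
            if (!sorted[i].equals(sorted[i - 1])) {
                result[next++] = sorted[i];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] test = {"1", "1", "1", "2"};
        System.out.println(Arrays.toString(unique(test))); // prints [1, 2]
    }
}
```

Note that, unlike the HashSet versions, the result comes back in sorted order rather than in an unspecified order.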
The HashSet approach is O(N) on average, assuming that 1) you preallocate the HashSet with the right size and 2) the (non-duplicate) values in the input array hash roughly evenly. (But if the value hashing is pathological, the worst case is O(N**2)!)
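For what it's worth, preallocating might look something like the sketch below. The names (PresizedUnique, unique) are invented for illustration, and the capacity formula assumes HashSet's default load factor of 0.75. As far as I know, the HashSet(Collection) constructor used in the earlier snippets already sizes itself from the collection, so explicit sizing mostly matters when you add the elements yourself.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class PresizedUnique {
    // Hypothetical helper: de-duplicates input using a HashSet sized up front.
    static String[] unique(String[] input) {
        // With the default load factor of 0.75, this capacity lets input.length
        // elements be added without the backing table having to grow.
        int capacity = (int) (input.length / 0.75f) + 1;
        Set<String> seen = new HashSet<>(capacity);

        // Add every element; duplicates are silently ignored by the set.
        Collections.addAll(seen, input);

        // The order of the resulting array is unspecified.
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] test = {"1", "1", "1", "2"};
        System.out.println(unique(test).length); // prints 2
    }
}
```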
The sorting approach is O(N log N) on average.
The HashSet approach takes more memory on average.
If you are doing this infrequently OR for really large "well behaved" input arrays, the HashSet approach is probably better. Otherwise, it could be a toss-up which approach is better.