Your code is, I'm sorry to say, very C-like.
A Java String
is not a char[]
. You say you want to remove duplicates from a String
, but you take a char[]
instead.
Is this char[]
\0
-terminated? Doesn't look like it because you take the whole .length
of the array. But then your algorithm tries to \0
-terminate a portion of the array. What happens if the arrays contains no duplicates?
Well, as it is written, your code actually throws an ArrayIndexOutOfBoundsException
on the last line! There is no room for the \0
because all slots are used up!
You can add a check not to add \0
in this exceptional case, but then how are you planning to use this code anyway? Are you planning to have a strlen
-like function to find the first \0
in the array? And what happens if there isn't any? (due to all-unique exceptional case above?).
What happens if the original String
/char[]
contains a \0
? (which is perfectly legal in Java, by the way, see JLS 10.9 An Array of Characters is Not a String)
The result will be a mess, and all because you want to do everything C-like, and in place without any additional buffer. Are you sure you really need to do this? Why not work with String
, indexOf
, lastIndexOf
, replace
, and all the higher-level API of String
? Is it provably too slow, or do you only suspect that it is?
"Premature optimization is the root of all evils". I'm sorry but if you can't even understand what the original code does, then figuring out how it will fit in the bigger (and messier) system will be a nightmare.
My minimal suggestion is to do the following:
- Make the function takes and returns a
String
, i.e. public static String removeDuplicates(String in)
- Internally, works with
char[] str = in.toCharArray();
- Replace the last line by
return new String(str, 0, tail);
This does use additional buffers, but at least the interface to the rest of the system is much cleaner.
Alternatively, you can use StringBuilder
as such:
static String removeDuplicates(String s) {
StringBuilder noDupes = new StringBuilder();
for (int i = 0; i < s.length(); i++) {
String si = s.substring(i, i + 1);
if (noDupes.indexOf(si) == -1) {
noDupes.append(si);
}
}
return noDupes.toString();
}
Note that this is essentially the same algorithm as what you had, but much cleaner and without as many little corner cases, etc.