Is there a way to check whether two strings contain the same characters? For example:

abc, bca -> true
aaa, aaa -> true
aab, bba -> false
abc, def -> false
+11  A: 

Turn each string into a char[], sort that array, then compare the two. Simple.

private boolean sameChars(String firstStr, String secondStr) {
  char[] first = firstStr.toCharArray();
  char[] second = secondStr.toCharArray();
  Arrays.sort(first);
  Arrays.sort(second);
  return Arrays.equals(first, second);
}
GaryF
...and remove duplicates before comparing
testalino
No, if we were removing duplicates then "aab, bba" would return true and it's specified as returning false.
GaryF
yeah, you are right
testalino
`Arrays.equals`, not `equal`.
aioobe
Thanks, aioobe. Updated.
GaryF
Thank you very much. It was simpler than I had thought. I'm surprised this isn't already a method inside the String class.
Brent
@Brent: re: not already a method-- really? :) I have never needed to do this particular task in a decade of professional string manipulation. Plus little problems like this are nice to keep in the fun domain of interview questions and Stack Overflow quickies over morning tea.
quixoto
+2  A: 

You can convert each string into a char array, sort the arrays, and then compare them:

String str1 = "abc";                 
String str2 = "acb";
char[] chars1 = str1.toCharArray();
char[] chars2 = str2.toCharArray();
Arrays.sort(chars1);
Arrays.sort(chars2);

if(Arrays.equals(chars1,chars2)) {
        System.out.println(str1 + " and " + str2 + " are anagrams");
} else {
        System.out.println(str1 + " and " + str2 + " are not anagrams");
}
codaddict
A: 

here:

    String str1 = "abc";
    String str2 = "cba";

    /* old buggy code: Arrays.sort returns void, so it cannot be passed to the String constructor
    String sorted_str1 = new String( java.util.Arrays.sort(str1.toCharArray()) );
    String sorted_str2 = new String( java.util.Arrays.sort(str2.toCharArray()) );
    */

    /* the new one: sort the arrays in place, then build the sorted strings */
    char[] arr1 = str1.toCharArray();
    char[] arr2 = str2.toCharArray();
    java.util.Arrays.sort(arr1);
    java.util.Arrays.sort(arr2);
    String sorted_str1 = new String(arr1);
    String sorted_str2 = new String(arr2);

    if (sorted_str1.equals(sorted_str2)) {
        /* true */
    } else {
        /* false */
    }
Erhan Bagdemir
Arrays.sort(..) has a return type of void, so you cannot use it directly in the String constructor.
GaryF
You are right. I have corrected the code and posted it again.
Erhan Bagdemir
+3  A: 

A very easy, but not very efficient, way to do this is to convert your Strings to char arrays, call java.util.Arrays.sort on them, turn them back into Strings, and compare those for equality. If your strings are under a few thousand characters, that should be perfectly fine.

If your strings are several megabytes long, you may want to create an array holding a count for each character (using its code as the index). Make one pass over the first string, incrementing the count for each char, and one pass over the second string, decrementing it. If a count drops below 0 at any point during the second pass, the strings do not have the same characters. If you get through the second string without that happening, and the two strings have the same length (which you should have checked first anyway), they have the same characters.
This second method is more complicated than sorting the strings, and it requires a big array if you want to work with Unicode strings (65,536 counters to cover every Java char), but it is perfectly good if you only need the 128 characters of the ASCII set, and it is much faster on long inputs.
Do NOT bother with that unless your strings have several million characters. Sorting the strings is much easier, and not significantly slower on strings of only a couple dozen chars.
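
A minimal sketch of the counting approach described above, assuming one counter per possible Java char value. The class and method names are made up for illustration, and the code counts UTF-16 code units, not full Unicode characters:

```java
final class CharCountCompare {
    // Hypothetical helper: true if both strings contain the same chars
    // with the same multiplicities.
    static boolean sameChars(String a, String b) {
        // Different lengths can never match, so check that first.
        if (a.length() != b.length()) {
            return false;
        }
        // Assumption: one counter per possible char value (65,536 entries).
        int[] counts = new int[Character.MAX_VALUE + 1];
        // First pass: add one for every char of the first string.
        for (int i = 0; i < a.length(); i++) {
            counts[a.charAt(i)]++;
        }
        // Second pass: remove one for every char of the second string.
        // Dropping below zero means b has a char that a lacks.
        for (int i = 0; i < b.length(); i++) {
            if (--counts[b.charAt(i)] < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameChars("abc", "bca")); // true
        System.out.println(sameChars("aab", "bba")); // false
    }
}
```

Both passes are linear in the string length, which is where the advantage over sorting comes from on very long inputs.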

Jean
+1 for pointing out pros and cons of different solutions
sleske
+2  A: 

As a (nitpicking ;-) ) side note:

Be aware that the solutions proposed here only work for strings composed of characters from the Basic Multilingual Plane (BMP) of Unicode.

Characters outside the BMP are represented as a pair of char values (a surrogate pair) in a String, so you need to take extra care to keep each pair together. See the Javadocs of java.lang.Character for the gory details.

Fortunately, most characters outside the BMP are rather exotic. Even most of Japanese and Chinese is in the BMP...
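
One way to keep surrogate pairs together is to sort code points instead of char values. This is only a sketch: it assumes Java 8 or later (for `String.codePoints()`), the class and method names are made up for illustration, and it does nothing about combining characters.

```java
import java.util.Arrays;

final class CodePointCompare {
    // Hypothetical helper: compares the sorted code points of both strings,
    // so a surrogate pair is always treated as one unit.
    static boolean sameCodePoints(String a, String b) {
        int[] first = a.codePoints().sorted().toArray();
        int[] second = b.codePoints().sorted().toArray();
        return Arrays.equals(first, second);
    }

    public static void main(String[] args) {
        // U+1F600 followed by 'a', versus 'a' followed by U+1F600.
        System.out.println(sameCodePoints("\uD83D\uDE00a", "a\uD83D\uDE00")); // true
    }
}
```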

sleske
Actually, the solutions here will work outside the BMP just fine. The problem is that they won't work on non-normalized strings; the issue is that "é" can be written as either a single character or a composition of "e" and an accent. (This is a problem for a number of European languages, and a few others too.)
Donal Fellows
@Donal Fellows: How can they work outside the BMP? A character from outside the BMP will be represented as a pair of surrogates, i.e. as two `char`. If you then invoke e.g. `Arrays.sort(chars1)`, the sort function, which does not know about surrogates, will happily tear apart the surrogates and produce junk data. Or am I missing something?
sleske
@Donal Fellows: But of course you are right that the problem will also occur with combining characters. And BTW, using a normalized string is not enough, because there are several different normalizations, and some use combining characters.
sleske
@sleske: re non-BMP: Damn, just realized that I'm wrong as it's possible to have two non-BMP character groups confused. I doubt it will happen in practice at the moment though; the amount of non-BMP characters defined and in use is fairly small and they're typically sparsely used. A normalized string is enough, but everything must be normalized the same way (i.e., to NFC or NFD, not a mix!)
Donal Fellows
NFD will *not* work. It will decompose characters, and then you'll compare the decomposed parts individually. That would mean that e.g. aé and áe would compare as equal, as both decompose to "acute", "a", "e". This is (probably) not what is intended. Just goes to show that Unicode (or rather, character sets in general) has its pitfalls...
sleske
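
For anyone who wants to experiment with the normalization forms discussed in these comments, here is a small sketch using `java.text.Normalizer`. The class and method names are made up for illustration. In Java, `Normalizer.Form.NFD` decomposes "é" into "e" plus a combining acute accent, while `Normalizer.Form.NFC` composes that pair back into a single char.

```java
import java.text.Normalizer;
import java.util.Arrays;

final class NormalizedCompare {
    // Hypothetical helper: normalize both strings to the given form,
    // then apply the sort-and-compare approach to the resulting chars.
    static boolean sameCharsAfter(Normalizer.Form form, String a, String b) {
        char[] first = Normalizer.normalize(a, form).toCharArray();
        char[] second = Normalizer.normalize(b, form).toCharArray();
        Arrays.sort(first);
        Arrays.sort(second);
        return Arrays.equals(first, second);
    }

    public static void main(String[] args) {
        String aThenEAcute = "a\u00E9"; // "aé"
        String aAcuteThenE = "\u00E1e"; // "áe"
        // Decomposed, both become the same bag of 'a', 'e' and U+0301.
        System.out.println(sameCharsAfter(Normalizer.Form.NFD, aThenEAcute, aAcuteThenE)); // true
        // Composed, the accented letters stay distinct.
        System.out.println(sameCharsAfter(Normalizer.Form.NFC, aThenEAcute, aAcuteThenE)); // false
    }
}
```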
A: 

Consider creating a signature for a given String, built from each character and its count:

a-count:b-count:c-count:.....:z-count: (extend for upper case if you want).

Then compare the signatures. This should scale better for very large Strings.

As a shortcut, check the lengths first. If they do not match, return false right away.
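
A rough sketch of such a signature, assuming the input contains only the lowercase letters a to z. The class and method names, and the choice to reject any other character, are assumptions made for illustration:

```java
final class CharSignature {
    // Hypothetical helper: builds "a-count:b-count:...:z-count:" for a string.
    static String signature(String s) {
        int[] counts = new int[26];
        for (char c : s.toCharArray()) {
            // Assumption: anything outside 'a'..'z' is rejected instead of ignored,
            // so two different strings can never share a signature by accident.
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("unsupported character: " + c);
            }
            counts[c - 'a']++;
        }
        StringBuilder sb = new StringBuilder();
        for (int n : counts) {
            sb.append(n).append(':');
        }
        return sb.toString();
    }

    static boolean sameChars(String a, String b) {
        // Length check as a shortcut before building the signatures.
        return a.length() == b.length() && signature(a).equals(signature(b));
    }

    public static void main(String[] args) {
        System.out.println(sameChars("abc", "bca")); // true
        System.out.println(sameChars("aab", "bba")); // false
    }
}
```

A signature can also be computed once and stored, which helps when one string is compared against many others.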

Jayan
A: 

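Maybe it's not the fastest answer, but it is the shortest. Note that it only checks that every character of one string occurs somewhere in the other and ignores how often it occurs, so "aab" and "bba" compare as true.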

boolean hasSameChar(String str1, String str2){
  for(char c : str1.toCharArray()){
    if(str2.indexOf(c) < 0 ) return false;
  }
  for(char c : str2.toCharArray()){
    if(str1.indexOf(c) < 0 ) return false;
  }
  return true;
}
guilin 桂林
A: 

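You could use the commutative property of addition and treat each character numerically, adding each element of each array into a total figure.

If the totals differ, the strings cannot contain the same characters. Equal totals are only a necessary condition, though: "ad" and "bc" both add up to 197, so a match still has to be confirmed with an exact comparison.

As a pre-check, this can avoid a possibly expensive sort on each char array when the strings clearly differ.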

boolean equal = false;
if(strA.length() == strB.length()) {
   long totalA = 0;
   long totalB = 0;
   for(int i = 0; i < strA.length(); i++) {
     totalA += (int)strA.charAt(i);
     totalB += (int)strB.charAt(i);
   }
   equal = totalA == totalB;
}
return equal;
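
Since different strings can produce the same total, the sum is safest as a cheap rejection test in front of an exact comparison. The sketch below combines it with the sort-and-compare approach; the class and method names are made up for illustration:

```java
import java.util.Arrays;

final class SumPrecheck {
    // Hypothetical helper: sum of the numeric values of all chars.
    static long charSum(String s) {
        long total = 0;
        for (int i = 0; i < s.length(); i++) {
            total += s.charAt(i);
        }
        return total;
    }

    static boolean sameChars(String a, String b) {
        // Cheap rejections first: different length or different sum.
        if (a.length() != b.length() || charSum(a) != charSum(b)) {
            return false;
        }
        // Equal sums prove nothing on their own, so confirm by sorting.
        char[] first = a.toCharArray();
        char[] second = b.toCharArray();
        Arrays.sort(first);
        Arrays.sort(second);
        return Arrays.equals(first, second);
    }

    public static void main(String[] args) {
        System.out.println(charSum("ad") == charSum("bc")); // true: 97+100 == 98+99
        System.out.println(sameChars("ad", "bc"));          // false
        System.out.println(sameChars("abc", "bca"));        // true
    }
}
```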
Adrian Regan