Is there a native way to sort a String by its contents in Java? E.g.

String s = "edcba"  ->  "abcde"
+15  A: 

toCharArray followed by Arrays.sort followed by a String constructor call:

import java.util.Arrays;

public class Test
{
    public static void main(String[] args)
    {
        String original = "edcba";
        char[] chars = original.toCharArray();
        Arrays.sort(chars);
        String sorted = new String(chars);
        System.out.println(sorted);
    }
}

EDIT: As tackline points out, this will fail if the string contains surrogate pairs or indeed combining characters (an accent and an e as separate chars) etc. At that point it gets a lot harder... hopefully you don't need this :) In addition, this is just ordering by ordinal char value, without taking capitalisation, accents or anything else into account.

Jon Skeet
The correct way would be to sort the code points. Unfortunately there is no String.toCodePointArray. (What order should we be sorting into, btw?)
Tom Hawtin - tackline
The ICU project describes a code point order UTF-16 sort method: http://icu-project.org/docs/papers/utf16_code_point_order.html . I don't think Arrays.sort will destroy any supplementary characters due to the way the ranges are defined, but don't quote me.
McDowell
It may not destroy anything, but the sort order is not ideal if you want to take uppercase and accents into account, for example. This algorithm will sort "éDedCBcbAàa" as "ABCDabcdeàé", while in the English (US) locale, for example, it would be more desirable to obtain "aAàbBcCdDeé".
eljenso
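On the code point sort suggested in the comments above: a minimal sketch, assuming Java 8 or later, where String.codePoints() fills the gap left by the missing toCodePointArray. The class and method names are made up for illustration. It still orders by plain code point value, so the caveats about case, accents and combining characters apply unchanged. For what it's worth, sorting the raw char[] does break strings that hold more than one supplementary character, because every high surrogate (U+D800 to U+DBFF) sorts before every low surrogate (U+DC00 to U+DFFF).

```java
import java.util.Arrays;

class CodePointSort
{
    // Sorts by Unicode code point value, so a surrogate pair moves as one unit.
    static String sortByCodePoint(String s)
    {
        int[] codePoints = s.codePoints().toArray(); // Java 8+
        Arrays.sort(codePoints);
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args)
    {
        System.out.println(sortByCodePoint("edcba")); // abcde
    }
}
```

The int[] holds one entry per code point rather than one per char, which is why the pairs survive the sort.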
+12  A: 

No, there is no built-in String method. You can convert it to a char array, sort it using Arrays.sort and convert that back into a String.

 String test= "edcba";
 char[] ar = test.toCharArray();
 Arrays.sort(ar);
 String sorted = String.valueOf(ar);

Or, when you want to deal correctly with locale-specific stuff like uppercase and accented characters:

import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class Test
{
  public static void main(String[] args)
  {
    // A Collator compares strings according to the rules of the given locale
    Collator collator = Collator.getInstance(Locale.FRANCE);
    String original = "éDedCBcbAàa";
    // One single-char String per element (note: this splits surrogate pairs)
    String[] split = original.split("");
    Arrays.sort(split, collator);
    String sorted = String.join("", split);
    System.out.println(sorted); // "aAàbBcCdDeé"
  }
}
eljenso
FYI: this method will split supplementary code points in two (Unicode characters with a value greater than 0xFFFF, which Java stores as a pair of chars), creating strings with invalid values. Not an issue for French, but may cause problems for some locales.
McDowell
See Character.isHighSurrogate(char)
McDowell
Somehow I think this will do... unless he wants to sort Strings containing Swahili or something :)
eljenso
"I think this will do... unless he wants to sort Strings containing Swahili" -- I can see the slogan -- Unicode: when you want an easy way of localizing and translating your applications to *some* languages. Bzzt. Fail. Doing things *almost* right means you *almost* don't have a bug to fix later.
Jonas Kölker
@Jonas I said I *think*, unless the OP wants to specify that it's absolutely necessary to support Swahili. I even prefer the simple solution without the locale, again unless the OP states that it is not sufficient. Ever heard of the YAGNI principle?
eljenso
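On the surrogate-splitting issue raised in the comments above: a rough sketch of one way around it, assuming Java 8 or later. Instead of split(""), it builds one String per code point and hands those to the Collator. The class and method names are invented for illustration, and combining sequences (a base letter followed by a separate accent) are still treated as separate elements.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollatedCodePointSort
{
    static String sort(String s, Locale locale)
    {
        // One String per code point, so a surrogate pair is never cut in half.
        String[] parts = s.codePoints()
                          .mapToObj(cp -> new String(Character.toChars(cp)))
                          .toArray(String[]::new);
        // Collator implements Comparator, so it can drive the object sort directly.
        Arrays.sort(parts, Collator.getInstance(locale));
        return String.join("", parts);
    }

    public static void main(String[] args)
    {
        System.out.println(sort("dDcCbBaA", Locale.FRANCE));
    }
}
```

Whether this extra care is worth it depends on the input, as the YAGNI comment above argues; the simple version is fine when the text is known to stay within the Basic Multilingual Plane.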
+4  A: 
    String a = "dgfa";
    char[] c = a.toCharArray();
    Arrays.sort(c);
    return new String(c);

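Note that this will not work as expected if it is a mixed case String (It'll put uppercase before lowercase). Arrays.sort has no overload that takes a comparator for a char[], so to change that you have to sort an array of objects (a Character[] or String[], say) with a comparator.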
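A sketch of the comparator route, under the assumption that "as expected" means letters grouped regardless of case. Since Arrays.sort only accepts a Comparator for object arrays, the chars are boxed into a Character[] first. The class name, method name and the tie-break rule (uppercase before lowercase for the same letter) are illustrative choices, not something the original answer specifies.

```java
import java.util.Arrays;
import java.util.Comparator;

class CaseInsensitiveCharSort
{
    static String sortIgnoringCase(String s)
    {
        // Box the chars: the Comparator overload of Arrays.sort needs objects.
        Character[] boxed = new Character[s.length()];
        for (int i = 0; i < s.length(); i++)
        {
            boxed[i] = s.charAt(i);
        }

        // Compare by lowercased value first, then by raw value to break ties.
        Comparator<Character> byLetterThenCase = (x, y) -> {
            int byLetter = Character.compare(
                Character.toLowerCase(x.charValue()),
                Character.toLowerCase(y.charValue()));
            return byLetter != 0 ? byLetter : Character.compare(x, y);
        };
        Arrays.sort(boxed, byLetterThenCase);

        StringBuilder sorted = new StringBuilder(boxed.length);
        for (Character ch : boxed)
        {
            sorted.append(ch.charValue());
        }
        return sorted.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(sortIgnoringCase("dCbA")); // AbCd
    }
}
```

This only folds case per char; for accents or locale-specific rules a Collator is the better tool.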

A: 

Note that this will not work as expected if it is a mixed case String (It'll put uppercase before lowercase)

Why wouldn't you expect a sequence of numbers to be sorted by their numerical value unless you specify otherwise?

Jonas Kölker