tags:

views:

788

answers:

7

I am facing a strange problem while sorting a list of strings with integer values. However some values could be prefixed with some characters.

e.g.

// B1, 5, 50, A10, 7, 72, B3, A1, A2

There are basically page numbers and should be sorted like:

// A1, A2, A10, B1, B3, 5, 7, 50, 72

But if I use default string sorting then these will be sorted like

// A1, A10, A2, B1, B3, 5, 50, 7, 72

Any solution for this in C#?

+13  A: 

You're looking for the Alphanum algorithm. Fortunately for you, a number of implementations exist already. See here.

John Feminella
Alphanum will return // 5, 7, 50, 72, A1, A2, A10, B1, B3 instead of // A1 ... 5
Carra
If you go through some of the code samples, it's outlined how you'd change it to accomodate subtly different scenarios.
John Feminella
A: 

Well, you could always pInvoke the Win32 API function StrCmpLogicalW which does exactly what you want (it's what Explorer uses to sort filenames). The only possible downside is that the sort is case insensitive.

Skizz

Skizz
+1  A: 

You can use this NaturalStringComparer that I put together and cleaned up a bit (Don't remember where I got the basis for it). It uses the Win32 function StrCmpLogicalW that Skizz mentions.

http://my.opera.com/Svishy/blog/2009/03/02/natural-sorting

Svish
+3  A: 

This is how I solved it for our application, order will be like in a windows directory:

public class NaturalSortComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        return StrCmpLogicalW(x, y);
    }

    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    public static extern int StrCmpLogicalW(string x, string y);
}

Usage:

  NaturalSortComparer comparer = new NaturalSortComparer();
  return comparer.Compare(string1, string2);

But it's probably not exactly what you want:

// A1, A2, A10, B1, B3, 5, 7, 50, 72

This will give

// 5, 7, 50, 72, A1, A2, A10, B1, B3

Carra
A: 

Not sure about performance, and sure this can be optimized, but it do the work:

string[] sort(string[] data)
{
 return data
  .OrderBy(s => Regex.Match(s, @"^\D").Length == 0)
  .ThenBy(s => Regex.Match(s, @"\D*").Value)
  .ThenBy(s => Int32.Parse(Regex.Match(s, @"\d+").Value)).ToArray();
}

var result = sort(new string[] { "B1", "5", "50", "A10", "7", "72", "B3", "A1", "A2" });

Kamarey
+3  A: 

What you are looking for is a Natural Sort.

Jeff Atwood made a great post on his blog once, explaining the concept and linking to various other sources with algorithms you could take as an example.

Sorting for Humans : Natural Sort Order

lowglider
+1  A: 

Here's a custom comparer that will sort into your required order. Note that there are no error/sanity checks in this code: it assumes that all the strings will be in the correct format.

public class MyComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        Match xMatch = Regex.Match(x, @"^(\D*)(\d+)$");
        Match yMatch = Regex.Match(y, @"^(\D*)(\d+)$");

        string xChars = xMatch.Groups[1].Value;
        string yChars = yMatch.Groups[1].Value;

        if ((xChars.Length == 0) && (yChars.Length > 0))
        {
            return 1;
        }
        else if ((xChars.Length > 0) && (yChars.Length == 0))
        {
            return -1;
        }
        else
        {
            int charsResult = xChars.CompareTo(yChars);

            return (charsResult != 0)
                ? charsResult
                : int.Parse(xMatch.Groups[2].Value)
                    .CompareTo(int.Parse(yMatch.Groups[2].Value));
        }
    }
}

You can use it like so:

List<string> testList =
    new List<string>() { "B1","5","50","A10","7","72","B3","A1","A2" };

testList.Sort(new MyComparer());    // A1, A2, A10, B1, B3, 5, 7, 50, 72
LukeH