tags:
views: 899
answers: 5

I took the plunge this afternoon and began studying LINQ, so far just mucking around with LINQ on collections. One of the first things I tried was to implement QSort.

Now -- ignoring the fact that I could just use OrderBy and that this is a very silly qsort implementation -- what I came up with was this:

public class lqsort
{
    public static List<int> QSLinq(List<int> _items)
    {
        if (_items.Count <= 1)
            return _items;

        int _pivot = _items[0];

        List<int> _less = (from _item in _items where _item < _pivot select _item).ToList();
        List<int> _same = (from _item in _items where _item == _pivot select _item).ToList();
        List<int> _greater = (from _item in _items where _item > _pivot select _item).ToList();

        return (QSLinq(_less).Concat(_same.Concat(QSLinq(_greater)))).ToList();
    }
}

The only thing that really bugs me is all of the casting involved. Are there any LINQ tricks I might use? Or am I just using LINQ for things it wasn't intended for?

+3  A: 

How about this? (If I understand correctly, you don't like the .ToList calls.)

public static IEnumerable<int> QSLinq(IEnumerable<int> items)
{
    if (items.Count() <= 1)
        return items;

    int pivot = items.First();

    return QSLinq(items.Where(i => i < pivot))
        .Concat(items.Where(i => i == pivot))
        .Concat(QSLinq(items.Where(i => i > pivot)));
}

Disclaimer based on Jon's answer: do NOT use this algorithm for a real problem -- it is very inefficient.

Panos
Unfortunately it's still very inefficient. Take the "less than" recursive call to QSLinq - it will have to iterate through its parameter three times, once for each Where call. Iterating through those means iterating through the original list. The same is true of the "more than" recursive call.
Jon Skeet
(Continued) Now the recursive "less than" call within the second level of recursion needs to iterate through the results of the first level of recursion, etc. Basically it ends up iterating through the list a hideous number of times. I'm too sleepy to work out the complexity now, but it's nasty.
Jon Skeet
I agree Jon, but the problem here is not the efficiency. No one is going to use this algorithm. We can always use .OrderBy() (which I hope does not use this :)
Panos
If efficiency is of no concern then I have no problems with it :) I think it's worth considering *why* efficiency is so hard to come by when looking at sequences though.
Jon Skeet
"Do NOT use this algorithm in a real problem. It is very inefficient." Very inefficient is putting it mildly -- it's O(n^3*log(n)) complexity :)
Pop Catalin
Since items is IEnumerable, it would be much better to use Any() instead of Count().
David B
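A standalone sketch of David B's point: Count() has to walk the entire sequence just to answer "is there more than one element?", while an Any()-based check stops after two elements at most. The Counted helper below is purely for illustration -- it is not part of any answer here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class AnyVsCountDemo
{
    // Wraps an array as a lazy sequence and records how many
    // elements the caller actually pulls.
    public static IEnumerable<int> Counted(int[] values, int[] pulls)
    {
        foreach (var v in values)
        {
            pulls[0]++;
            yield return v;
        }
    }

    public static void Main()
    {
        var data = new[] { 10, 20, 30, 40, 50 };

        var pulls = new int[1];
        bool viaCount = Counted(data, pulls).Count() > 1;
        Console.WriteLine("Count() pulled " + pulls[0] + " elements");      // all 5

        pulls[0] = 0;
        bool viaAny = Counted(data, pulls).Skip(1).Any();
        Console.WriteLine("Skip(1).Any() pulled " + pulls[0] + " elements"); // just 2
    }
}
```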
+2  A: 

All the casting involved? I don't see any casting. What I do see is calls to "ToList" which are hideously inefficient. Basically LINQ is designed to work over sequences, which intrinsically don't allow you to work in place (or in a duplicate space) in the way that quicksort tends to. Basically you've got an awful lot of data copying going on here :(
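For contrast, the in-place approach Jon is describing looks something like this -- a standard Lomuto-partition sketch (not taken from any answer here), which swaps elements within one array instead of building new sequences:

```csharp
using System;

public class InPlaceQuicksort
{
    // Classic in-place quicksort: partitioning swaps elements within
    // the same array, so no intermediate lists are allocated or copied.
    public static void Sort(int[] a, int lo, int hi)
    {
        if (lo >= hi)
            return;

        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++)
        {
            if (a[j] < pivot)
            {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++;
            }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;

        Sort(a, lo, i - 1);
        Sort(a, i + 1, hi);
    }

    public static void Main()
    {
        var data = new[] { 3, 1, 4, 1, 5, 9, 2, 6 };
        Sort(data, 0, data.Length - 1);
        Console.WriteLine(string.Join(", ", data)); // 1, 1, 2, 3, 4, 5, 6, 9
    }
}
```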

Jon Skeet
Yeah, I should have said type conversion instead of casting. Thanks for the tips. I knew I was bastardizing linq, but I'm mainly playing around with the syntax right now.
Dana
+5  A: 

Just change the type of the parameter to IEnumerable<int> and use var instead of List<int> for your local variables.

This will make your QSLinq method better because it will accept more types of parameters, for example int[], as well as List<int>.

See the new method:

    public static IEnumerable<int> QSLinq(IEnumerable<int> _items)
    {
        if (_items.Count() <= 1)
            return _items;

        var _pivot = _items.First();

        var _less = from _item in _items where _item < _pivot select _item;
        var _same = from _item in _items where _item == _pivot select _item;
        var _greater = from _item in _items where _item > _pivot select _item;

        // Recurse only on _less and _greater; recursing on _same never
        // terminates once the input contains duplicate values.
        return QSLinq(_less).Concat(_same).Concat(QSLinq(_greater));
    }

Hope this helps.
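A quick check of the "more types of parameters" claim, as a standalone sketch. Note the equal-to-pivot group is concatenated directly here rather than recursed on -- recursing on it would never terminate once duplicates appear:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class QSLinqDemo
{
    public static IEnumerable<int> QSLinq(IEnumerable<int> items)
    {
        if (items.Count() <= 1)
            return items;

        var pivot = items.First();

        var less    = from i in items where i < pivot select i;
        var same    = from i in items where i == pivot select i;
        var greater = from i in items where i > pivot select i;

        // The equal-to-pivot group is already "sorted", so concat it
        // directly instead of recursing on it.
        return QSLinq(less).Concat(same).Concat(QSLinq(greater));
    }

    public static void Main()
    {
        // The IEnumerable<int> parameter accepts arrays and lists alike.
        Console.WriteLine(string.Join(", ", QSLinq(new[] { 3, 1, 2 })));          // 1, 2, 3
        Console.WriteLine(string.Join(", ", QSLinq(new List<int> { 5, 4, 4, 6 }))); // 4, 4, 5, 6
    }
}
```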

Alfred B. Thordarson
This is the worst thing -- while I was typing my answer, Panos posted his, exactly the same as mine :-(
Alfred B. Thordarson
Is there any difference other than syntax between this explicit LINQ representation as opposed to the Lambda expressions in Panos' version? Is either more efficient than the other?
Skeolan
I'm going to accept this answer, 'cause it's basically what I wanted. But the other answers were really cool!
Dana
+3  A: 

Fun question! This doesn't outperform OrderBy, but it does limit the repeated enumerations somewhat.

    public static IEnumerable<int> QSort2(IEnumerable<int> source)
    {
        if (!source.Any())
            return source;
        int first = source.First();
        return source
            .GroupBy(i => i.CompareTo(first))
            .OrderBy(g => g.Key)
            .SelectMany(g => g.Key == 0 ? g : QSort2(g));
    }

I inadvertently generated a StackOverflowException during development, by recursing into QSort2 for the Key == 0 group as well.
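To see how the CompareTo grouping partitions around the pivot, here's a standalone sketch (my own example data, not from the answer):

```csharp
using System;
using System.Linq;

public class PartitionDemo
{
    public static void Main()
    {
        int[] source = { 5, 3, 8, 5, 1 };
        int first = source.First();

        // For int, CompareTo returns -1, 0, or 1, so a single GroupBy
        // pass yields the three quicksort partitions at once.
        var groups = source.GroupBy(i => i.CompareTo(first))
                           .OrderBy(g => g.Key);

        foreach (var g in groups)
            Console.WriteLine(g.Key + ": " + string.Join(", ", g));
        // -1: 3, 1
        //  0: 5, 5
        //  1: 8
    }
}
```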


Just for fun, I tested these solutions. I committed the cardinal performance-testing sin (testing in debug mode), but I don't think that affects the big-O effect we'll see. Here is the input (reversed input is the worst case for a first-element pivot):

IEnumerable<int> source = Enumerable.Range(0, 1000).Reverse().ToList();
  • The triple concat-where solution took 71000 milliseconds.
  • My solution took 330 milliseconds
  • OrderBy.ToArray took 15 milliseconds (the timer's resolution, so actual time is probably less)
David B
+2  A: 

Here's another solution using Aggregate. I call it: Group and Go Fish. This one takes 170 ms on my 1000-reversed-elements test.

    public static IEnumerable<int> QSort3(IEnumerable<int> source)
    {
        if (!source.Any())
            return source;
        int first = source.First();

        QSort3Helper myHelper = 
          source.GroupBy(i => i.CompareTo(first))
          .Aggregate(new QSort3Helper(), (a, g) =>
              {
                  if (g.Key == 0)
                      a.Same = g;
                  else if (g.Key == -1)
                      a.Less = g;
                  else if (g.Key == 1)
                      a.More = g;
                  return a;
              });
        IEnumerable<int> myResult = Enumerable.Repeat(1, 0);
        if (myHelper.Less != null)
            myResult = myResult.Concat(QSort3(myHelper.Less));
        if (myHelper.Same != null)
            myResult = myResult.Concat(myHelper.Same);
        if (myHelper.More != null)
            myResult = myResult.Concat(QSort3(myHelper.More));

        return myResult;
    }

    public class QSort3Helper
    {
        public IEnumerable<int> Less;
        public IEnumerable<int> Same;
        public IEnumerable<int> More;
    }

Why is this faster than my solution using OrderBy? I guess it's a little more resistant to the worst case.

David B
The line IEnumerable<int> myResult = Enumerable.Repeat(1, 0); could be replaced by Enumerable.Empty<int>().
Patrik Hägne
That would be clearer code, yes.
David B