views:

260

answers:

8

So I was writing a mergesort in C# as an exercise and although it worked, looking back at the code, there was room for improvement.

Basically, the second part of the algorithm requires a routine to merge two sorted lists.

Here is my way too long implementation that could use some refactoring:

private static List<int> MergeSortedLists(List<int> sLeft, List<int> sRight)
{
    if (sLeft.Count == 0 || sRight.Count == 0)
    {
        sLeft.AddRange(sRight);
        return sLeft;
    }
    else if (sLeft.Count == 1 && sRight.Count == 1)
    {
        if (sLeft[0] <= sRight[0])
            sLeft.Add(sRight[0]);
        else
            sLeft.Insert(0, sRight[0]);
        return sLeft;
    }
    else if (sLeft.Count == 1 && sRight.Count > 1)
    {
        for (int i=0; i<sRight.Count; i++)
        {
            if (sLeft[0] <= sRight[i])
            {
                sRight.Insert(i, sLeft[0]);
                return sRight;
            }
        }
        sRight.Add(sLeft[0]);
        return sRight;
    }
    else if (sLeft.Count > 1 && sRight.Count == 1)
    {
        for (int i=0; i<sLeft.Count; i++)
        {
            if (sRight[0] <= sLeft[i])
            {
                sLeft.Insert(i, sRight[0]);
                return sLeft;
            }
        }
        sLeft.Add(sRight[0]);
        return sLeft;
    }
    else
    {
        List<int> list = new List<int>();
        if (sLeft[0] <= sRight[0])
        {
            list.Add(sLeft[0]);
            sLeft.RemoveAt(0);
        }
        else
        {
            list.Add(sRight[0]);
            sRight.RemoveAt(0);
        }

        list.AddRange(MergeSortedLists(sLeft, sRight));
        return list;
    }       
}

Surely this routine can be improved/shortened by removing recursion, etc. There are even other ways to merge 2 sorted lists. So any refactoring is welcome.

Although I do have an answer, I'm curious as to how would other programmers would go about improving this routine.

Thank you!

+3  A: 

Are you really sure your code works at all? Without testing it, i see the following:

...
else if (sLeft.Count > 1 && sRight.Count == 0)  //<-- sRight is empty
{
    for (int i=0; i<sLeft.Count; i++)
    {
        if (sRight[0] <= sLeft[i]) //<-- IndexError?
        {
            sLeft.Insert(i, sRight[0]);
            return sLeft;
        }
    }
    sLeft.Add(sRight[0]);
    return sLeft;
}
...
dkson
it will never execute the inside of that branch (see my comment to the question) and therefore the out of bounds will not make the code fail. The catch all would probably even yield the correct result
Rune FS
@Rune FS Yes, you are right. The code is a little strange at all :-)
dkson
Sorry about that, see my comment to Rune FS (he made the same point first)
tzup
@tzup no need to say sorry. You asked for help, and here you will get some :-)
dkson
I know, the intention wasn't to slide in a syntax error and see how fast it gets spotted, so I'm sorry for having mislead you guys.
tzup
+4  A: 

My take on this would be:

private static List<int> MergeSortedLists(List<int> sLeft, List<int> sRight)
{
    List<int> result = new List<int>();
    int indexLeft = 0;
    int indexRight = 0;

    while (indexLeft < sLeft.Count || indexRight < sRight.Count)
    {
        if (indexRight == sRight.Count ||
            (indexLeft < sLeft.Count && sLeft[indexLeft] < sRight[indexRight]))
        {
            result.Add(sLeft[indexLeft]);
            indexLeft++;
        }
        else
        {
            result.Add(sRight[indexRight]);
            indexRight++;
        }
    }
    return result;
}

Exactly what I'd do if I had to do it by hand. =)

Jens
Although that "if" statement if very clever (the shortcircuiting logic), I like Carra's answer better because it is a lot easier to follow.
tzup
+2  A: 

As a starting point, I would remove your special cases for when either of the lists has Count == 1 - they can be handled by your more general (currently recursing) case.

The if (sLeft.Count > 1 && sRight.Count == 0) will never be true because you've checked for sRight.Count == 0 at the start - so this code will never be reached and is redundant.

Finally, instead of recursing (which is very costly in this case due to the number of new Lists you create - one per element!), I'd do something like this in your else (actually, this could replace your entire method):

List<int> list = new List<int>();

while (sLeft.Count > 0 && sRight.Count > 0)
{
    if (sLeft[0] <= sRight[0])
    {
        list.Add(sLeft[0]);
        sLeft.RemoveAt(0);
    }
    else
    {
        list.Add(sRight[0]);
        sRight.RemoveAt(0);
    }
}

// one of these two is already empty; the other is in sorted order...
list.AddRange(sLeft);
list.AddRange(sRight);
return list;

(Ideally I'd refactor this to use integer indexes against each list, instead of using .RemoveAt, because it's more performant to loop through the list than destroy it, and because it might be useful to leave the original lists intact. This is still more efficient code than the original, though!)

Dan Puzey
Great thinking! Eventually I got to this same solution and had the exact same thoughts about RemoveAt.
tzup
+14  A: 

Merging two sorted lists can be done in O(n).

List<int> lList, rList, resultList;
int r,l = 0;

while(l < lList.Count && r < rList.Count)
{
  if(lList[l] < rList[r]
    resultList.Add(lList[l++]);
  else
    resultList.Add(rList[r++]);
}
//And add the missing parts.
while(l < lList.Count)
  resultList.Add(lList[l++]);
while(r < rList.Count)
  resultList.Add(rList[r++]);
Carra
nice and clean. I like.
Yves M.
I think you meant O(m+n) where m is size of list1 and n is size of list2
Hasan Khan
Good answer! Two things: 1. that "l" variable is confusing and usually should be avoided; 2. Adding the missing parts could be done using AddRange as Dan Puzey showed in his answer. Why did you prefer the while?
tzup
@Hasan Kahn: O(n) should be read as "linear to the size of the input". Your finer level of detail in O(m+n) is in no way wrong, but neither is simply saying O(n).
Jakob
Carra
What is `iList`? I suspect that you mean `rList` in the first case and `lList` in the second.
Svante
@Carra: Typo on the "if" line?
John Pirie
Fixet the typo.
Carra
Although your logic is very slick and by far the most popular, I'll give the answer to Dan Puzey because he has a very similar answer but has added some insight on his thinking process which makes it more complete.
tzup
+1  A: 

Merge list (by theory, input lists are sorted in advance) sorting could be implemented in following way:

List<int> MergeSorting(List<int> a, List<int> b)
    {
        int apos = 0;
        int bpos = 0;
        List<int> result = new List<int>();
        while (apos < a.Count && bpos < b.Count)
        {
            int avalue = int.MaxValue;
            int bvalue = int.MaxValue;
            if (apos < a.Count)
                avalue = a[apos];
            if (bpos < b.Count)
                bvalue = b[bpos];
            if (avalue < bvalue)
            {
                result.Add(avalue);
                apos++;
            }
            else
            {
                result.Add(bvalue);
                bpos++;
            }
        }
        return result;
    }

In case you start with not sorted list you need to split it by sorted subsequence and than marge them using function above

Sergey Osypchuk
You have some redundant assignments in there and you're missing adding the "leftovers" in the result list after the while (see Carra's answer), so you're almost there.
tzup
Sergey Osypchuk
Ok, please edit to make your answer a proper one.
tzup
tzup
+2  A: 

You were asking for differrent approaches as well. I might do as below depending on the usage. The below code is lazy so it will not sort the entire list at once but only when elements are requested.

class MergeEnumerable<T> : IEnumerable<T>
    {
        public IEnumerator<T> GetEnumerator()
        {
            var left = _left.GetEnumerator();
            var right = _right.GetEnumerator();
            var leftHasSome = left.MoveNext();
            var rightHasSome = right.MoveNext();
            while (leftHasSome || rightHasSome)
            {
                if (leftHasSome && rightHasSome)
                {
                  if(_comparer.Compare(left.Current,right.Current) < 0)
                  {
                    yield return returner(left);
                  } else {
                    yield return returner(right);
                  }
                }
                else if (rightHasSome)
                {
                    returner(right);
                }
                else
                {
                    returner(left);
                }
            }
        }

        private T returner(IEnumerator<T> enumerator)
        {
            var current = enumerator.Current;
            enumerator.MoveNext();
            return current;
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return ((IEnumerable<T>)this).GetEnumerator();
        }

        private IEnumerable<T> _left;
        private IEnumerable<T> _right;
        private IComparer<T> _comparer;

        MergeEnumerable(IEnumerable<T> left, IEnumerable<T> right, IComparer<T> comparer)
        {
            _left = left;
            _right = right;
            _comparer = comparer;
        }
    }

EDIT: It's basically the same implementatin as Sergey Osypchuk his will from start to finish when looking only at the sorting be fastest but the latency will be higher as well due to the fact of sorting the entire list upfront. So as I said depending on the usage I might go with this approach and an alternative would be something similar to Sergey Osypchuk

Rune FS
+1  A: 

Often you can use a stack instead of use recursion

Hungary1234
A: 

I never use recursion for merge sort. You can make iterative passes over the input, taking advantage of the fact that the sorted block size doubles with every merge pass. Keep track of the block size and the count of items you've processed from each input list; when they're equal, the list is exhausted. When both lists are exhausted you can move on to the next pair of blocks. When the block size is greater than or equal to your input size, you're done.

Edit: Some of the information I had left previously was incorrect, due to my misunderstanding - a List in C# is similar to an array and not a linked list. My apologies.

Mark Ransom