views:

692

answers:

5

I've written recursive functions with the guidance of a friend who is teaching me C++ (as a first language). However, I don't really understand what is going on. He helped me (and the SO community, as well) write a merge sort function.

std::vector<int> mergeSort(std::vector<int> original)

//code here to create two vectors, farray and sarray, out of the 
//original std::vector<int> original ; each is half the length,
//note: if the size % 2 != 0, sarray takes the odd int

farray = mergeSort(farray);
sarray = mergeSort(sarray);

//code here merges the two into a single, final sorted std::vector

In this function, I assign:

farray = mergeSort(farray);
sarray = mergeSort(sarray);

What exactly is happening here? It calls mergeSort with farray and sarray as parameters and changes the value. How far does mergeSort execute into itself recursively? Just upto the recursive function call?

+13  A: 

Each time you call a function recursively, it effectively makes a new copy of the information it needs, and goes on.

You could have a program that recurs "infinitely", ie, until it runs out of resources, usually stack space — the space in which those copies are going. That would look like

void recur(){
    recur();
}

int main(){
    recur();
    exit(0); /* won't ever really get here.*/
}

Obviously, this isn't very useful, so you want to write a program that has some limit on how often it recurs. Here's a really simple program that manages that:

#include <iostream>
using namespace std;
void recurSome(int count){
    cout << "RecurSome called with " << count << endl;
    if (count > 0){
        recurSome(count-1);
        cout << "Back at " << count << endl;
    }else{
        cout << "Bottom." << endl;
    }
    return;
}

int main(){
    recurSome(10);
    exit(0);  /* now it will get here. */
}

If you compile and run that, say with:

bash $ g++ -Wall -o rc recursome.cpp
bash $ ./rc

You'll get the results:

RecurSome called with 10
RecurSome called with 9
RecurSome called with 8
RecurSome called with 7
RecurSome called with 6
RecurSome called with 5
RecurSome called with 4
RecurSome called with 3
RecurSome called with 2
RecurSome called with 1
RecurSome called with 0
Bottom.
Back at 1
Back at 2
Back at 3
Back at 4
Back at 5
Back at 6
Back at 7
Back at 8
Back at 9
Back at 10
bash $

See how it gets called for 10, then 9, and so on, and then after it reaches the bottom, it shows it coming back for 1, then 2, and so on back up to 10?

The basic rule is that every recursive function should have something that makes a base case, one which does call itself again. In this one, the base case is count == 0 and in fact we could have written this as a recursive definition

recursome:
if c = 0 : print bottom
if c > 0 : print count, and recursome(c-1)

You'll see many recursive definitions of that sort as you move on in math.

Here's a somewhat niftier C version with better output:

#include <stdio.h>
#include <stdlib.h>

int max = 10;

void recurSome(int count){
    printf("RecurSome %*c Called with %d\n", max-count+1, ' ', count);
    if (count > 0){
        recurSome(count-1);
        printf("RecurSome %*c Back at %d\n", max-count+1, ' ', count);
    }else{
        printf("RecurSome %*c Bottom.\n", 2*max, ' ');
        printf("RecurSome %*c Back at %d\n", max-count+1, ' ', count);
    }
    return;
}

int main(){
    recurSome(max);
    exit(0);  /* now it will get here. */
}

Output:

RecurSome   Called with 10
RecurSome    Called with 9
RecurSome     Called with 8
RecurSome      Called with 7
RecurSome       Called with 6
RecurSome        Called with 5
RecurSome         Called with 4
RecurSome          Called with 3
RecurSome           Called with 2
RecurSome            Called with 1
RecurSome             Called with 0
RecurSome                      Bottom.
RecurSome             Back at 0
RecurSome            Back at 1
RecurSome           Back at 2
RecurSome          Back at 3
RecurSome         Back at 4
RecurSome        Back at 5
RecurSome       Back at 6
RecurSome      Back at 7
RecurSome     Back at 8
RecurSome    Back at 9
RecurSome   Back at 10
Charlie Martin
Great example: +1. I changed the language to C++, since that's the language the OP was asking about. Hope you don't mind!
e.James
No biggy. Mine compiles as C++ as well. Thanks for doing the work to change it.
Charlie Martin
"Bottom" is a nice touch.
JeffH
Excellent! This is exactly what I needed. Thanks!
Hooked
@Charlie Martin: I know it would have compiled in C++, but I figured, since Hooked was just learning C++ as a first language, he might be more familiar with iostream instead of stdio. Anyway, it is an excellent example. I'd like to give it +2
e.James
A: 

For a recursive function to not be infinite, there needs to be some condition where it returns without calling itself. For some algorithms, that condition is the point at which performing the call on the data no longer makes sense.

In your case, if you split the passed-in vector and end up with two vectors that each contain only 1 item, does it make sense to call mergeSort() on them? No.

You could handle this by inspecting the size of farray and sarray and deciding whether to call mergeSort() on either or both before combining them and returning the combination.

What if one has 2 items and one has 1? Call mergeSort() on the size 2 but not on the size 1.

When a call to mergeSort() doesn't call mergeSort() on either farray or sarray before returning, the recursion will start unwinding.

JeffH
+2  A: 

Check it out in the dictionary:

recursion: noun. see recursion

Now, being serious, in the joke above, the definition of recursion is given in terms of recursion itself. That is recursion.

A recursive algorithm is an algorithm whose implementation is based on the algorithm itself. The process of developing such an algorithm starts with the most basic case, whose solution is either known in advance or can be trivially calculated. Then you define the algorithm in terms of itself.

As a simple example, calculating the n-th power of a given integer i could be a function power( int number, int power ). How could you implement it? in many ways. The simplest being a call to a library, followed by a loop, or you could define the function in terms of itself:

int power( int number, unsigned int pow )
{
   // Basic trivial case, cuts the recursion:
   if ( pow == 0 ) return 1;

   // Implement power in terms of itself:
   return number * power( number, pow-1 );
}
int main()
{
   int power = power( 2, 10 );
}

We have defined the function in terms of itself. You start with the most basic case (n^0 = 1). If we are not in the simplest case, you can express your algorithm in terms of itself.

The program would start in main by calling power( 2, 10 ) that would recurse and call power( 2, 9 ) reducing the problem to a smaller problem and would then compose the final answer in terms of the simpler problem.

The actuall call trace would be:

power( 2, 5 )
  power( 2, 4 )
    power( 2, 3 )
      power( 2, 2 )
        power( 2, 1 )
          power( 2, 0 )    // simplest case: return 1
        return 2 * 1 -> 2  // obtain next solution in terms of previous
      return 2 * 2 -> 4
    return 2 * 4 -> 8
  return 2 * 8 -> 16
return 2 * 16 -> 32

While developing recursive algorithms it usually helped me believing that I already had the algorithm up and running and just work on the reduction/composition of the new result.

David Rodríguez - dribeas
Excellent "cuts the recursion" comment and helpful visual trace!
JeffH
Nice. Someone should do a LISP answer now ;-)
Charlie Martin
A: 

Have a look at the wikipedia page for merge sort for additional information on what you're trying to do.

As a sidenote, be aware that you're making a copy of every vector you're given as a parameter. Use vector<> const& instead.

Benoît
I'm unfamiliar with the const and address operator used in tandem. Would you explain this?
Hooked
@Benoit: to be able to avoid making at least one vector copy at each recurion level, wouldn't the parameter need to be non-const?
Michael Burr
@Michael Burr: yes, but i am actually avoiding one out of two copies at each recursion level. Since it is possible (but complicated) to apply this algorithm in-place, using iterators into the original vector would be sufficient, and const could be avoided. As-is, there is still one new vector per recursion level after my suggestion, and it is needed.
Benoît
+2  A: 

Recursion can be understood as a practical application of the principle of induction. To prove a statement P(n) for all n where P(n) means "The sum of the integers from 1 to n is n(n+1)/2" we have to prove two things:

  1. Base case: P(n) holds for a specific integer. P(1) is true because 1 = 1(1+1)/2.
  2. Inductive case: If P(n) is true, then P(n+1) must be true. 1 + ... + n + (n+1) = n(n+1)/2 + (n+1) = n(n+1)/2 + 2(n+1)/2 = (n+1)((n+1)+1)/2, which is the statment P(n+1).

Similarly, in a recursive function like mergeSort we need to handle two cases:

  1. Base case: If the array has one element, it is sorted; otherwise
  2. Recursive case: Split the array into two smaller arrays, mergeSort each of them, and merge them together.

The key is that the two arrays are smaller than the original, otherwise one of them will never hit the base case. Because the arrays are split roughly in half, the depth of the recursion will be about log2(n) in this case.

As far as the code in the question goes, there are three issues:

  1. The base case is missing.
  2. Reusing the variables farray and sarray is not strictly necessary and may be confusing.
  3. The code will be very slow because of the number of std::vectors that are created and destroyed. The best interface for mergeSort takes two vector::iterators as input, similar to the std::sort() function.

New programmers should try running mergeSort with paper and pencil.

Mark Ruzon