tags:

views:

100

answers:

3

I've seen many mergesort implementations. Here is the version in Data Structures and Algorithms in Java (2nd Edition) by Robert Lafore:

private void recMergeSort(long[] workSpace, int lowerBound,int upperBound)
  {
  if(lowerBound == upperBound)            // if range is 1,
     return;                              // no use sorting
  else
     {                                    // find midpoint
     int mid = (lowerBound+upperBound) / 2;
                                          // sort low half
     recMergeSort(workSpace, lowerBound, mid);
                                          // sort high half
     recMergeSort(workSpace, mid+1, upperBound);
                                          // merge them
     merge(workSpace, lowerBound, mid+1, upperBound);
     }  // end else
  }  // end recMergeSort()


  private void merge(long[] workSpace, int lowPtr,
                           int highPtr, int upperBound)
      {
      int j = 0;                             // workspace index
      int lowerBound = lowPtr;
      int mid = highPtr-1;
      int n = upperBound-lowerBound+1;       // # of items

      while(lowPtr <= mid && highPtr <= upperBound)
         if( theArray[lowPtr] < theArray[highPtr] )
            workSpace[j++] = theArray[lowPtr++];
         else
            workSpace[j++] = theArray[highPtr++];

      while(lowPtr <= mid)
         workSpace[j++] = theArray[lowPtr++];

      while(highPtr <= upperBound)
         workSpace[j++] = theArray[highPtr++];

      for(j=0; j<n; j++)
         theArray[lowerBound+j] = workSpace[j];
      }  // end merge()

One interesting thing about the merge method is that almost all the implementations didn't pass the mid parameter to the merge method. mid is calculated in the merge. This is strange, since highPtr is assigned to mid + 1 from the calling method.

Why did the author not pass mid to merge like merge(workSpace, lowerBound,mid, mid+1, upperBound);? If we write it like , we can easily see that [lowerBound,mid] is the lower range ,[mid+1,upperBound] is higher range. I think there must be a reason, otherwise I can't understand why all implementations of an algorithm older than half a century are coincident in the such a little detail.

A: 

Passing mid (or n) as arguments would save some trivial computation, but it would also suggest that these values are somehow interestingly distinct from the other arguments.

Passing lowPtr, highPtr, and upperBound (or any equivalent set, such as lowPtr, mid, and n) passes the minimal amount of required information to the merge method.

So the stylistic choice here is to pass minimal information rather than eliminating (trivial) redundant computations. In a practical implementation, you might make the other choice. (Or ideally, you would let your profiler tell you which is better.)

Eric
+1  A: 

Basically we're talking about two intervals that are adjacent [a..b-1] and [b..n] (bounds are inclusive), and you're asking why it's presented as (a, b, n) instead of (a, b-1, b, n)?

You said it yourself: it's "such a little detail".

It doesn't affect correctness, and any performance gained in not recomputing b-1 may be offset by the cost of passing it as an additional parameter. Is it worth the effort of profiling to investigate one way or another? No. It has no effect on the algorithm's asymptotic behavior, and any difference in performance is negligible; it's not worth fussing about little things like this.

By the way, it is worth noting that the way of computing average of two number as above is now considered broken (see: Google Research blog: Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken). Josh Bloch recommends:

int mid = (low + high) >>> 1;

Going back to the original question, generally speaking, if a parameter can be derived from another, then often it's best to omit them; it simplifies calling it, and it also simplifies analysis due to obviously smaller domain.

A function that takes 10 parameters, 5 them of derivable, is much harder to analyze than one that simply takes 5. Of course, if the derivable parameters are expensive and/or non-trivial to compute, and you've already computed their values before you call the function, then you may consider passing them on.

In fact, in the merge sort example, (a, n) suffices, since b is derivable from both. That computation is not trivial, however (the bug Bloch mentions escapes detection for 2 decades), so it's decided to simply pass it on.

b-1, by contrast, is way too trivial and is best omitted.

polygenelubricants
@polygenelubricants : well , it's not about efficiency , but clarity . (a, b-1, b, n) means [a,b-1] is the lower sorting range , [b,n] is the higher range. If I think in this way , I could write this out more easily.
ZZcat
I added more remarks, let me know if you need further input.
polygenelubricants
A: 

Unless I'm misreading this, lowPtr in merge() will initially be lowerBound from recMergeSort(), so everything is okay: lowPtr is the beginning of the first half to be merged, highPtr is the beginning of the second half to be merged. These pointers are advanced as the merge progresses.

As for why mid wasn't passed to merge(), it's a choice, and not the only way to do it. What you suggested would also work. Perhaps it's to enforce that mid is always initially one less than highPtr within merge(), though the author isn't terribly concerned with input validation (e.g. the lower bound being below the upper bound). Granted, since merge() is private and therefore only called from "trusted" code, validation isn't required.

miorel