ansaurus

Question

Merge algorithm in C: how does this work?

Answer 1

+29 A:

The code

The code uses what is called the post-increment operator and the ternary/conditional operator (see the appendices for more details).

A more verbose version may look something like this:

if (in1[i1] < in2[i2]) {
    out[i] = in1[i1];
    i++;
    i1++;
} else {
    out[i] = in2[i2];
    i++;
    i2++;
}
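To see that the two forms really do the same thing, here is a small sketch that wraps each in a helper function. The one-liner is assumed to have the shape shown in step_compact; the function names and the pointer-to-index parameters are mine, for illustration only, and are not part of the original code.

```c
#include <stddef.h>

/* Hypothetical helper: one merge step, written compactly.
   Assumes *i1 and *i2 are valid indices into in1 and in2. */
void step_compact(const int *in1, size_t *i1,
                  const int *in2, size_t *i2,
                  int *out, size_t *i)
{
    out[(*i)++] = (in1[*i1] < in2[*i2]) ? in1[(*i1)++] : in2[(*i2)++];
}

/* Hypothetical helper: the same step, written out in full. */
void step_verbose(const int *in1, size_t *i1,
                  const int *in2, size_t *i2,
                  int *out, size_t *i)
{
    if (in1[*i1] < in2[*i2]) {
        out[*i] = in1[*i1];
        (*i)++;
        (*i1)++;
    } else {
        out[*i] = in2[*i2];
        (*i)++;
        (*i2)++;
    }
}
```

Called with the same arrays and starting indices, both helpers should write the same value and leave all three indices in the same state.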

The algorithm

If the elements in in1 and in2 are in sorted order, then the snippet serves as the main part of a merge algorithm to merge the two sorted input buffers into one sorted output buffer.

Care must be taken to ensure that i1 and i2 are in bounds for in1 and in2 respectively before comparing in1[i1] against in2[i2]. Given that, in1[i1] is the smallest element of in1 that has not yet been consumed, and similarly in2[i2] is the smallest unconsumed element of in2.

Without loss of generality, let's assume in1[i1] < in2[i2] (the other case is a near mirror scenario; note that ties go to in2). Then the next smallest element from in1 is smaller than the next smallest element from in2, and with in1[i1++] on the right-hand side of the assignment, we fetch the next smallest value from in1 and advance its index to the next available value (if any). With out[i++] on the left-hand side of the assignment, we assign the fetched value to a slot in the output buffer and advance its index to the next available slot (if any).

A higher-level pseudocode of the overall merge algorithm, using a Queue-like abstract data structure instead of arrays with corresponding pointer indices (for clarity!), may look something like this:

procedure MERGE(Queue in1, in2) : Queue
// given sorted queues in1, in2, return a merged sorted queue

   INIT out IS Empty-Queue

   WHILE in1.notEmpty() AND in2.notEmpty()
      IF in1.peek() < in2.peek()
         out.enqueue(in1.dequeue())
      ELSE
         out.enqueue(in2.dequeue())

   // at this point, at least one of the queues is empty

   // dump in1 to out in case it's not empty
   WHILE in1.notEmpty()
      out.enqueue(in1.dequeue())

   // dump in2 to out in case it's not empty
   WHILE in2.notEmpty()
      out.enqueue(in2.dequeue())

   RETURN out
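Translated back to arrays and indices, the pseudocode might look like the sketch below. The function name merge, the explicit length parameters n1 and n2, and the choice of int elements are my assumptions; the original snippet does not show how the bounds are tracked.

```c
#include <stddef.h>

/* Merge sorted arrays in1[0..n1) and in2[0..n2) into out.
   Assumes out has room for n1 + n2 elements and does not
   overlap either input. */
void merge(const int *in1, size_t n1,
           const int *in2, size_t n2,
           int *out)
{
    size_t i1 = 0, i2 = 0, i = 0;

    /* Both inputs still have elements: take the smaller head. */
    while (i1 < n1 && i2 < n2)
        out[i++] = (in1[i1] < in2[i2]) ? in1[i1++] : in2[i2++];

    /* At most one of these loops does any work: copy the leftovers. */
    while (i1 < n1)
        out[i++] = in1[i1++];
    while (i2 < n2)
        out[i++] = in2[i2++];
}
```

The first loop is the only place the two inputs are compared, and its condition is exactly the bounds check that makes the comparison safe.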

Appendix A: Ternary/conditional operator

Essentially, an expression such as this:

condition ? trueExpr : falseExpr

first evaluates condition. If it's true, the operator evaluates trueExpr, whose value becomes the value of the entire expression. If condition is false, the operator evaluates falseExpr instead, whose value becomes the value of the entire expression. Only one of the two branches is ever evaluated.
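A minimal sketch of both points, using two hypothetical helpers (min_int and pick are names I made up for illustration): the first uses the operator to select a value, the second shows that only the chosen branch has its side effect.

```c
/* Hypothetical helper: returns the smaller of a and b. */
int min_int(int a, int b)
{
    return (a < b) ? a : b;
}

/* Hypothetical helper: increments exactly one of the two counters,
   depending on condition, and returns that counter's new value. */
int pick(int condition, int *true_count, int *false_count)
{
    return condition ? ++(*true_count) : ++(*false_count);
}
```

This is why the merge line advances only one of i1 and i2 per step: the increment in the branch that was not selected never runs.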

Appendix B: post-increment operator

An expression such as i++ uses what is called a post-increment operator. The operator increments i, but the value of this expression is the value of i before the increment. By contrast, the value of a pre-increment expression (e.g. ++i) is the value of i after the increment.

There are also pre-decrement (e.g. --i) and post-decrement (e.g. i--) operators, which work the same way but subtract one.
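The difference is easiest to see by recording both the value of the expression and the value of the variable afterwards. The struct and function names below are hypothetical, chosen only for this sketch.

```c
/* Value of an increment/decrement expression, and the value of
   the variable once the expression has been evaluated. */
struct step {
    int expr_value;
    int var_after;
};

struct step post_inc(int i)
{
    struct step s;
    s.expr_value = i++;   /* old value of i */
    s.var_after = i;
    return s;
}

struct step pre_inc(int i)
{
    struct step s;
    s.expr_value = ++i;   /* new value of i */
    s.var_after = i;
    return s;
}

struct step post_dec(int i)
{
    struct step s;
    s.expr_value = i--;   /* old value of i */
    s.var_after = i;
    return s;
}

struct step pre_dec(int i)
{
    struct step s;
    s.expr_value = --i;   /* new value of i */
    s.var_after = i;
    return s;
}
```

Starting from 5, post_inc yields the pair (5, 6) and pre_inc yields (6, 6); post_dec yields (5, 4) and pre_dec yields (4, 4).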

Related questions

On pitfalls like i = i++; (most of these are Java, but applicable to other languages as well):

polygenelubricants 2010-07-18 13:04:39

Excellent answer. You may also want to mention that the particular code snippet is a kind of merge sort, selecting the next lowest value from the head of two "queues".

paxdiablo 2010-07-18 13:23:26

@paxdiablo: comment incorporated, even added pseudocode using "queues" as per your suggestion since it does make the concept more clear.

polygenelubricants 2010-07-19 17:15:37

Answer 2

+1 A:

http://en.wikipedia.org/wiki/%3F%3A

BobTurbo 2010-07-18 13:06:23

I URL encoded the `:` for you so the correct link is created.

BoltClock 2010-07-18 13:18:33

Answer 3

+11 A:

You've already got an excellent answer explaining the syntax but so far no-one has told you what the code actually does.

If you have two input arrays, in1 and in2, and an index into each, then this line of code finds the smaller of the two current items and puts it into the output array. It then advances the index for that input array and also the index into the output array.

If the two inputs are sorted arrays and this line is run in a loop (with bounds checks on both indices), it merges the two inputs in O(n) time, where n is the total number of elements. This operation is used repeatedly when performing a merge sort.
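To illustrate the last point, here is a rough sketch of a top-down merge sort built around the same line. The function name, the caller-supplied scratch buffer tmp, and the use of memcpy to copy the result back are my choices for the example, not something taken from the question.

```c
#include <stddef.h>
#include <string.h>

/* Sort a[0..n) in ascending order.
   Assumes tmp points to scratch space for at least n ints
   that does not overlap a. */
void merge_sort(int *a, int *tmp, size_t n)
{
    if (n < 2)
        return;               /* 0 or 1 elements: already sorted */

    size_t mid = n / 2;
    merge_sort(a, tmp, mid);            /* sort left half  */
    merge_sort(a + mid, tmp, n - mid);  /* sort right half */

    /* Merge a[0..mid) and a[mid..n) into tmp. */
    size_t i1 = 0, i2 = mid, i = 0;
    while (i1 < mid && i2 < n)
        tmp[i++] = (a[i1] < a[i2]) ? a[i1++] : a[i2++];
    while (i1 < mid)
        tmp[i++] = a[i1++];
    while (i2 < n)
        tmp[i++] = a[i2++];

    memcpy(a, tmp, n * sizeof *a);      /* copy merged result back */
}
```

With a strict < comparison, equal elements are taken from the right half first. For plain ints that makes no visible difference; if stability matters, comparing with <= in the other direction would be the usual adjustment.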

Mark Byers 2010-07-18 13:23:23

+1; I should've gone there instead of focusing on syntax. I may edit my answer eventually (also as per paxdiablo's suggestion) to incorporate this aspect.

polygenelubricants 2010-07-18 13:29:01

Answer 4

+2 A:

As you've seen, another thing it does is confuse new (or new to the language) developers. C folks especially like getting things down into a single line. It feels elegant. There are certain idioms or turns of phrase in both C and C++ that we just recognize without hesitation. When you come across one of these, there's nothing wrong with writing out (on paper or in a scratch file) the long version of it (like in @polygenelubricants' answer) so that you then have a chance to work out the big-picture meaning of it (like in @Mark Byers' answer). But leave it the short way once you understand it.

Kate Gregory 2010-07-18 13:37:45

Excellent _comment_!

polygenelubricants 2010-07-18 13:56:55

+1 for acknowledging both the elegance and confusion of "one-liners" and especially for "leave it the short way once you understand it". :-)

R.. 2010-07-18 16:30:42
