views:

549

answers:

4

Can you suggest an algorithm that find all pairs of nodes in a link list that add up to 10. I came up with the following.

Algorithm: Compare each node, starting with the second node, with each node starting from the head node till the previous node (previous to the current node being compared) and report all such pairs.

I think this algorithm should work however its certainly not the most efficient one having a complexity of O(n2).

Can anyone hint at a solution which is more efficient (perhaps takes linear time). Additional or temporary nodes can be used by such a solution.

+4  A: 

If their range is limited (say between -100 and 100), it's easy.

Create an array quant[-100..100] then just cycle through your linked list, executing:

quant[value] = quant[value] + 1

Then the following loop will do the trick.

for i = -100 to 100:
    j = 10 - i
        for k = 1 to quant[i] * quant[j]
            output i, " ", j

Even if their range isn't limited, you can have a more efficient method than what you proposed, by sorting the values first and then just keeping counts rather than individual values (same as the above solution).

This is achieved by running two pointers, one at the start of the list and one at the end. When the numbers at those pointers add up to 10, output them and move the end pointer down and the start pointer up.

When they're greater than 10, move the end pointer down. When they're less, move the start pointer up.

This relies on the sorted nature. Less than 10 means you need to make the sum higher (move start pointer up). Greater than 10 means you need to make the sum less (end pointer down). Since they're are no duplicates in the list (because of the counts), being equal to 10 means you move both pointers.

Stop when the pointers pass each other.

There's one more tricky bit and that's when the pointers are equal and the value sums to 10 (this can only happen when the value is 5, obviously).

You don't output the number of pairs based on the product, rather it's based on the product of the value minus 1. That's because a value 5 with count of 1 doesn't actually sum to 10 (since there's only one 5).

So, for the list:

2 3 1 3 5 7 10 -1 11

you get:

Index    a  b  c  d  e  f  g  h
Value   -1  1  2  3  5  7 10 11
Count    1  1  1  2  1  1  1  1
  • You start pointer p1 at a and p2 at h. Since -1 + 11 = 10, you output those two numbers (as above, you do it N times where N is the product of the counts). Thats one copy of (-1,11). Then you move p1 to b and p2 to g.
  • 1 + 10 > 10 so leave p1 at b, move p2 down to f.
  • 1 + 7 < 10 so move p1 to c, leave p2 at f.
  • 2 + 7 < 10 so move p1 to d, leave p2 at f.
  • 3 + 7 = 10, output two copies of (3,7) since the count of d is 2, move p1 to e, p2 to e.
  • 5 + 5 = 10 but p1 = p2 so the product is 0 times 0 or 0. Output nothing, move p1 to f, p2 to d.
  • Loop ends since p1 > p2.

Hence the overall output was:

(-1,11)
( 3, 7)
( 3, 7)

which is correct.

Here's some test code. You'll notice that I've forced 7 (the midpoint) to a specific value for testing. Obviously, you wouldn't do this.

#include <stdio.h>

#define SZSRC 30
#define SZSORTED 20
#define SUM 14

int main (void) {
    int i, s, e, prod;
    int srcData[SZSRC];
    int sortedVal[SZSORTED];
    int sortedCnt[SZSORTED];

    // Make some random data.

    srand (time (0));
    for (i = 0; i < SZSRC; i++) {
        srcData[i] = rand() % SZSORTED;
        printf ("srcData[%2d] = %5d\n", i, srcData[i]);
    }

    // Convert to value/size array.

    for (i = 0; i < SZSORTED; i++) {
        sortedVal[i] = i;
        sortedCnt[i] = 0;
    }
    for (i = 0; i < SZSRC; i++)
        sortedCnt[srcData[i]]++;

    // Force 7+7 to specific count for testing.

    sortedCnt[7] = 2;
    for (i = 0; i < SZSORTED; i++)
        if (sortedCnt[i] != 0)
            printf ("Sorted [%3d], count = %3d\n", i, sortedCnt[i]);

    // Start and end pointers.

    s = 0;
    e = SZSORTED - 1;

    // Loop until they overlap.

    while (s <= e) {
        // Equal to desired value?

        if (sortedVal[s] + sortedVal[e] == SUM) {
            // Get product (note special case at midpoint).

            prod = (s == e)
                ? (sortedCnt[s] - 1) * (sortedCnt[e] - 1)
                : sortedCnt[s] * sortedCnt[e];

            // Output the right count.

            for (i = 0; i < prod; i++)
                printf ("(%3d,%3d)\n", sortedVal[s], sortedVal[e]);

            // Move both pointers and continue.

            s++;
            e--;
            continue;
        }

        // Less than desired, move start pointer.

        if (sortedVal[s] + sortedVal[e] < SUM) {
            s++;
            continue;
        }

        // Greater than desired, move end pointer.

        e--;
    }

    return 0;
}

You'll see that the code above is all O(n) since I'm not sorting in this version, just intelligently using the values as indexes.

If the minimum is below zero (or very high to the point where it would waste too much memory), you can just use a minVal to adjust the indexes (another O(n) scan to find the minimum value and then just use i-minVal instead of i for array indexes).

And, even if the range from low to high is too expensive on memory, you can use a sparse array. You'll have to sort it, O(n log n), and search it for updating counts, also O(n log n), but that's still better than the original O(n2). The reason the binary search is O(n log n) is because a single search would be O(log n) but you have to do it for each value.

And here's the output from a test run, which shows you the various stages of calculation.

srcData[ 0] =    13
srcData[ 1] =    16
srcData[ 2] =     9
srcData[ 3] =    14
srcData[ 4] =     0
srcData[ 5] =     8
srcData[ 6] =     9
srcData[ 7] =     8
srcData[ 8] =     5
srcData[ 9] =     9
srcData[10] =    12
srcData[11] =    18
srcData[12] =     3
srcData[13] =    14
srcData[14] =     7
srcData[15] =    16
srcData[16] =    12
srcData[17] =     8
srcData[18] =    17
srcData[19] =    11
srcData[20] =    13
srcData[21] =     3
srcData[22] =    16
srcData[23] =     9
srcData[24] =    10
srcData[25] =     3
srcData[26] =    16
srcData[27] =     9
srcData[28] =    13
srcData[29] =     5
Sorted [  0], count =   1
Sorted [  3], count =   3
Sorted [  5], count =   2
Sorted [  7], count =   2
Sorted [  8], count =   3
Sorted [  9], count =   5
Sorted [ 10], count =   1
Sorted [ 11], count =   1
Sorted [ 12], count =   2
Sorted [ 13], count =   3
Sorted [ 14], count =   2
Sorted [ 16], count =   4
Sorted [ 17], count =   1
Sorted [ 18], count =   1
(  0, 14)
(  0, 14)
(  3, 11)
(  3, 11)
(  3, 11)
(  5,  9)
(  5,  9)
(  5,  9)
(  5,  9)
(  5,  9)
(  5,  9)
(  5,  9)
(  5,  9)
(  5,  9)
(  5,  9)
(  7,  7)

paxdiablo
Could you please elaborate on the second part "you can just run two indexes, one from the start and one from the end, looking for the ones that add up to 10".
Ankur
Good solution. Thanks.
Ankur
Is this solution has linear complexity O(n)?
Ankur
Yes, you can generally tell by the lack of nested loops. The solution I gave was O(n) because it allocates a counter for all possible values. If you wanted to save memory, you'd need to sort and search, which would make it O(n log n), but the solution as presented is definitely O(n) time complexity.
paxdiablo
As an aside, you can sometimes trade storage complexity for time complexity depending on whether you want blinding speed or low memory footprint. I say sometimes because the cost of a time-aggressive algorithm may be more storage space than you can fit on the planet :-)
paxdiablo
A: 

Create a hash set (HashSet in Java) (could use a sparse array if your numbers are well-bounded, i.e. you know they fall into +/- 100)

For each node, first check if 10-n is in the set. If so, you have found a pair. Either way, then add n to the set and continue.

So for example you have 1 - 6 - 3 - 4 - 9

1 - is 9 in the set? Nope

6 - 4? No.

3 - 7? No.

4 - 6? Yup! Print (6,4)

9 - 1? Yup! Print (9,1)

Steven Schlansker
+1  A: 

This is a mini subset sum problem, which is NP complete.

Dave
A: 

If you were to first sort the set, it would eliminate the pairs of numbers that needed to be evaluated.

Dave
Yes, a better algorithm. Would you say this solution has linear complexity?
Ankur
I think the worst case would still be O(n^2). I'd have to think about it.
Dave

related questions