views: 216
answers: 8

The goal here is to merge multiple arrays, which are already sorted, into a single resultant array.

I've written the following solution and am wondering if there is a way to improve it.

#include <iostream>
#include <vector>
using namespace std;

/*
    Goal is to merge all sorted arrays
*/
void mergeAll(const vector< vector<int> >& listOfIntegers, vector<int>& result)
{

    int totalNumbers = listOfIntegers.size();
    vector<int> curpos;
    int currow = 0 , minElement , foundMinAt = 0;
    curpos.reserve(totalNumbers);

    // Set the traversal position to 0 for each of the arrays
    for ( int i = 0; i < totalNumbers; ++i)
    {
        curpos.push_back(0);
    }

    for ( ; ; )
    {
        /*  Seed the minimum with the first element of the
            first array that hasn't been fully traversed
        */

        for ( currow = 0 ; currow < totalNumbers ; ++currow)
        {
            if ( curpos[currow] < listOfIntegers[currow].size() )
            {
                minElement = listOfIntegers[currow][curpos[currow] ];
                foundMinAt = currow;
                break;
            }
        }
        /* If all the elements were traversed in all the arrays, then no further work needs to be done */
        if ( !(currow < totalNumbers ) )
            break;
        /*
            Traverse the remaining arrays and find the smallest available value
        */
        for ( ; currow < totalNumbers; ++currow)
        {
            /* Skip any array that has already been fully traversed */
            if ( curpos[currow] < listOfIntegers[currow].size() &&
                 listOfIntegers[currow][curpos[currow]] < minElement )
            {
                minElement = listOfIntegers[currow][curpos[currow]];
                foundMinAt = currow;
            }
        }
        /*
            Store the minimum in the resultant array
            and advance the position in the array it came from
        */
        result.push_back(minElement);
        ++curpos[foundMinAt];
    }
}

The corresponding main goes like this.

int main()
{
    vector< vector<int> > myInt;
    vector<int> result;

    myInt.push_back(vector<int>() );
    myInt.push_back(vector<int>() );
    myInt.push_back(vector<int>() );

    myInt[0].push_back(10);
    myInt[0].push_back(12);
    myInt[0].push_back(15);


    myInt[1].push_back(20);
    myInt[1].push_back(21);
    myInt[1].push_back(22);

    myInt[2].push_back(14);
    myInt[2].push_back(17);
    myInt[2].push_back(30);

    mergeAll(myInt,result);

    for ( int i = 0; i < result.size() ; ++i)
    {
        cout << result[i] << endl;
    }
}
A: 

All you need is two pointers (or just int index counters), checking for the minimum between arrays A and B, copying the value over to the resultant list, and incrementing the pointer of the array the minimum came from. If you run out of elements in one source array, copy the remainder of the other to the resultant and you're done.
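For the two-array case, a minimal sketch of that idea (mergeTwo is a hypothetical helper, not the OP's function):

#include <vector>

// Merge two already-sorted vectors: advance two index counters, always
// taking the smaller current element, then append whatever is left over.
std::vector<int> mergeTwo(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())
        out.push_back(a[i] <= b[j] ? a[i++] : b[j++]);

    // At most one of these inserts copies anything; the other range is empty.
    out.insert(out.end(), a.begin() + i, a.end());
    out.insert(out.end(), b.begin() + j, b.end());
    return out;
}

std::merge from <algorithm> implements the same two-sequence logic if you'd rather not hand-roll it.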

Edit: You can trivially expand this to N arrays.

Edit: Don't trivially expand this to N arrays :-). Do two at a time. Silly me.

Gary
He has more than two arrays.
IVlad
Like IVlad said, he's merging an arbitrary number of arrays.
Odrade
The method is trivially expanded to any number of arrays. "All you need is n pointers, checking for minimum between all n arrays, ..."
Mark Ransom
That makes the complexity O(NM²), where N is the length of the longest array and M is the number of arrays. Goz's solution runs in only O(NM log M).
Nabb
Nabb, you're the man, but can you elaborate how you saw that?
Gary
A: 

You could just stick them all into a multiset. That will handle the sorting for you.
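A minimal sketch of that idea (the mergeAllMultiset name and the signature mirroring the OP's are assumptions); a std::multiset keeps its elements ordered, so reading it back out yields the merged, sorted result:

#include <set>
#include <vector>

void mergeAllMultiset(const std::vector< std::vector<int> >& lists, std::vector<int>& result)
{
    std::multiset<int> bag;
    for (size_t i = 0; i < lists.size(); ++i)
        bag.insert(lists[i].begin(), lists[i].end());   // each element inserted in O(log total)
    result.assign(bag.begin(), bag.end());              // in-order traversal gives the sorted output
}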

frankc
This is a very bad idea. First of all multisets are slow, so you might as well copy all the lists in another vector and std::sort that vector, which will be much faster. Second, this is only better than merging if `log totalElements < numberOfLists`.
IVlad
Who cares if it's "slow"? Why would you assume it needs to be blazing fast? Nothing in the question said anything about this being for high-frequency trading, or the input vectors being many billions of elements. That is blatant premature optimization.
frankc
@user275455 I believe the point is that simply concatenating all the vectors into a single vector and then sorting that would be just as simple and faster than the multiset.
Mark B
And that would be just as good? So what?
frankc
@userN: The user asks for an efficient implementation, we try to give him one. If he wanted a naive implementation he wouldn't have to ask us.
Carl Smotricz
A: 

If you are merging very many vectors together, then you could speed up performance by using a sort of tree to determine which vector contains the smallest element. This is probably not necessary for your application, but comment if it is and I'll try to work it out.

Thom Smith
Actually, a heap is what you're looking for.
Nick Johnson
Explained much better in kunigami's answer.
Carl Smotricz
A: 

Perhaps I'm misunderstanding the question...and I feel like I'm misunderstanding your solution.

That said, maybe this answer is totally off-base and not helpful.

But, especially with the number of vectors and push_back's you're already using, why do you not just use std::sort?

#include <algorithm>
#include <vector>
using namespace std;

void mergeAll(const vector< vector<int> > &origList, vector<int> &resultList)
{
    for (size_t i = 0; i < origList.size(); ++i)
    {
        resultList.insert(resultList.end(), origList[i].begin(), origList[i].end());
    }
    std::sort(resultList.begin(), resultList.end());
}

I apologize if this is totally off from what you're looking for. But it's how I understood the problem and the solution.

std::sort runs in O(N log N): http://www.cppreference.com/wiki/stl/algorithm/sort

KevenK
I'm not keen on sorting; merging is what I'm keen on. Wondering if there is a faster way to merge the elements.
Merge Sort runs an average `O(N log N)` as well: http://en.wikipedia.org/wiki/Merge_sort#Analysis I don't see either solution being, performance-wise, much different. However, complexity, stability and maintenance may be affected. I would simplify this step by using the `std` functions and continue looking elsewhere for my performance gains.
KevenK
A: 

If you want to take advantage of multi-threading then a fairly good solution would be to just merge 2 lists at a time.

i.e. suppose you have 9 lists.

Merge list 0 with 1.
Merge list 2 with 3.
Merge list 4 with 5.
Merge list 6 with 7.

These can be performed concurrently.

Then:

Merge list 0&1 with 2&3.
Merge list 4&5 with 6&7.

Again these can be performed concurrently.

Then merge list 0,1,2&3 with list 4,5,6&7.

Finally, merge list 0,1,2,3,4,5,6&7 with list 8.

Job done.

I'm not sure of the complexity of that, but it seems the obvious solution and DOES have the bonus of being multi-threadable to some extent.
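A minimal (sequential) sketch of that scheme, using std::merge for each pairwise step; the merges within one round are independent of each other, so they are the natural pieces to hand off to separate threads:

#include <algorithm>
#include <iterator>
#include <vector>

std::vector<int> mergePairwise(std::vector< std::vector<int> > lists)
{
    if (lists.empty()) return std::vector<int>();
    while (lists.size() > 1)
    {
        std::vector< std::vector<int> > next;
        for (size_t i = 0; i + 1 < lists.size(); i += 2)
        {
            // Merge lists[i] and lists[i + 1] into one sorted list.
            std::vector<int> merged;
            std::merge(lists[i].begin(), lists[i].end(),
                       lists[i + 1].begin(), lists[i + 1].end(),
                       std::back_inserter(merged));
            next.push_back(merged);
        }
        if (lists.size() % 2 == 1)          // the odd list out moves up to the next round
            next.push_back(lists.back());
        lists.swap(next);
    }
    return lists[0];
}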

Goz
+2  A: 

You can generalize the merge step of Merge Sort and work with multiple pointers. Initially, all of them point to the beginning of each array. You maintain these pointers, sorted by the values they point to, in a priority queue. In each step, you remove the smallest element from the heap in O(log n) (n is the number of arrays) and output the element pointed to by the extracted pointer. You then advance this pointer one position and, if you haven't reached the end of its array, reinsert it into the priority queue in O(log n). Proceed this way until the heap is empty. If there are a total of m elements, the complexity is O(m log n), and the elements are output in sorted order.
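A minimal sketch of that approach (the mergeAllHeap name and the signature mirroring the OP's mergeAll are assumptions):

#include <functional>
#include <queue>
#include <utility>
#include <vector>

void mergeAllHeap(const std::vector< std::vector<int> >& lists, std::vector<int>& result)
{
    // Each heap entry is (current value, index of the list it came from);
    // std::greater turns the priority_queue into a min-heap.
    typedef std::pair<int, size_t> Entry;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > heap;

    std::vector<size_t> pos(lists.size(), 0);
    for (size_t i = 0; i < lists.size(); ++i)
        if (!lists[i].empty())
            heap.push(Entry(lists[i][0], i));

    while (!heap.empty())
    {
        Entry smallest = heap.top();
        heap.pop();
        result.push_back(smallest.first);

        size_t list = smallest.second;
        if (++pos[list] < lists[list].size())                // advance this list's pointer
            heap.push(Entry(lists[list][pos[list]], list));  // and reinsert it in O(log n)
    }
}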

kunigami
Sounds very good to me! I just verified that the performance of priority queues backed by heaps is really just log n.
Carl Smotricz
Good one. Thanks
A: 

Consider the priority-queue implementation in this answer linked in a comment above: http://stackoverflow.com/questions/886178/merging-8-sorted-lists-in-c-which-algorithm-should-i-use/886203#886203

It's O(n lg m) time (where n = total number of items and m = number of lists).

Mark B
A: 

It's not entirely related to your problem, but I would suggest that instead of loading the vector size into a separate variable

int totalNumbers = listOfIntegers.size();

you just access it directly through the .size() function - from my understanding of the STL, it just returns a member anyway.
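For example, the initialization loop from the question could then read (just a sketch):

for ( size_t i = 0; i < listOfIntegers.size(); ++i)
{
    curpos.push_back(0);
}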

Tomas Cokis