views:

196

answers:

3

Greetings,

Any input on a way to divide a std::vector into two parts whose sums are as equal as possible? I need to find the split with the smallest possible |sum(part1) - sum(part2)|.

This is how I'm doing it now, but as you can probably tell it will yield a non-optimal split in some cases.

// prim and ultim are the begin/end iterators of the sorted vector,
// sum starts as the total of all elements and tempsum starts at 0.
auto mid = std::find_if(prim, ultim, [&](double temp) -> bool
{
    // stop once the running front sum has caught up with the remaining sum
    if(tempsum >= sum)
        return true;

    tempsum += temp;
    sum -= temp;
    return false;
});

The vector is sorted from highest to lowest, and values can indeed appear twice. I'm not expecting part1 and part2 to have the same number of elements, but sum(part1) should be as close as possible to sum(part2).

For example, if we had { 2.4, 0.12, 1.26, 0.51, 0.70 }, the best split would be { 2.4, 0.12 } and { 1.26, 0.51, 0.70 }.
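To make the goal concrete, here is a rough brute-force sketch of the kind of split I'm after (bestPrefixSplit is just an illustrative name, not code I'm actually using): it tries every split point of the sorted vector and keeps the one with the smallest |sum(part1) - sum(part2)|.

#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Illustrative helper only: try every split point of the (already sorted)
// vector and return the index where the left part ends, minimising
// |sum(left) - sum(right)|. Assumes v has at least two elements.
std::size_t bestPrefixSplit(const std::vector<double>& v)
{
    double total = std::accumulate(v.begin(), v.end(), 0.0);
    double left = 0.0;
    double bestDiff = total;
    std::size_t bestIndex = 1;

    // try every split point that leaves both parts non-empty
    for (std::size_t i = 0; i + 1 < v.size(); ++i)
    {
        left += v[i];
        double diff = std::fabs(left - (total - left));
        if (diff < bestDiff)
        {
            bestDiff = diff;
            bestIndex = i + 1;   // left part is v[0], ..., v[i]
        }
    }
    return bestIndex;            // split into v[0..bestIndex) and v[bestIndex..end)
}

For the example above this picks the split after the second element, i.e. { 2.4, 0.12 } and { 1.26, 0.51, 0.70 }.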

If it helps, I'm trying to implement the splitting step of Shannon–Fano encoding.

Maybe this will help you guys understand my question better: http://en.wikipedia.org/wiki/Shannon%E2%80%93Fano_coding#Example

Any input is appreciated, thanks!

+2  A: 
andand
This seems incorrect if one looks at the Wikipedia entry: there the list is sorted in decreasing order and the partitioning consists of finding a split point within that list, putting the highest elements in one set and the lowest in the other... though the Wikipedia example seems suboptimal in this regard.
Matthieu M.
This is the greedy algorithm that they mention and briefly describe.
andand
@Matthieu: The split isn't a pivot split but rather a partitioning of the data; elements 1 and 3 can go to sublist 1 while elements 2 and 4 can go to sublist 2.
Jamie Cook
+3  A: 

This is the partition problem, which is known to be NP-complete, so no polynomial-time exact algorithm is known in general. However, the problem becomes easier when the sizes of the elements in the set are bounded. The Wikipedia article on it (http://en.wikipedia.org/wiki/Partition_problem) has quite a nice section on approximation algorithms, for when you need a "good-enough" solution.
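For illustration, a minimal sketch of the simplest greedy heuristic described there (the function name is mine, just for this example): go through the numbers largest first and always add the next one to the subset with the smaller running sum.

#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Greedy heuristic for the two-way partition problem (approximate only):
// process the values largest-first and add each one to the lighter subset.
std::pair<std::vector<double>, std::vector<double>>
greedyPartition(std::vector<double> values)
{
    std::sort(values.begin(), values.end(), std::greater<double>());

    std::vector<double> a, b;
    double sumA = 0.0, sumB = 0.0;

    for (double v : values)
    {
        if (sumA <= sumB) { a.push_back(v); sumA += v; }
        else              { b.push_back(v); sumB += v; }
    }
    return std::make_pair(a, b);
}

It will not always find the optimal split, but it runs in O(n log n) and is usually close.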

Krystian
A: 

If you wanted to use std algorithms and lambdas you could do the following

#include <algorithm>
#include <cmath>
#include <vector>

// Greedily routes each element to whichever side currently leaves the
// smaller imbalance, then moves the right-hand side into its own vector.
void splitProbabilityVector(std::vector<double>& data, std::vector<double>& rightHandSplit)
{
    double s1 = 0.0, s2 = 0.0;
    auto bound = std::stable_partition(data.begin(), data.end(), [&](double e) -> bool
    {
        // std::abs from <cmath>; plain abs would truncate the doubles to int
        if (std::abs(e + s1 - s2) < std::abs(e + s2 - s1))
        { s1 += e; return true; }
        else
        { s2 += e; return false; }
    });

    rightHandSplit.assign(bound, data.end());
    data.resize(bound - data.begin());
}
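
For example, a quick way to call it (just a sketch, assuming the function above is visible to main, and reusing the sample values from the question) would be:

#include <iostream>
#include <numeric>
#include <vector>

int main()
{
    // sample values from the question, assumed to already be in the order you want
    std::vector<double> data = { 2.4, 0.12, 1.26, 0.51, 0.70 };
    std::vector<double> right;

    splitProbabilityVector(data, right);

    std::cout << "left sum:  " << std::accumulate(data.begin(), data.end(), 0.0) << '\n'
              << "right sum: " << std::accumulate(right.begin(), right.end(), 0.0) << '\n';
    return 0;
}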

splitProbabilityVector should be quite performant. Just out of curiosity, why are you using this algorithm when the wiki page you linked states:

For this reason, Shannon–Fano is almost never used; Huffman coding is almost as computationally simple and produces prefix codes that always achieve the lowest expected code word length.

Jamie Cook
Well, the assignment was to analyze a text and compare the results yielded by both methods, i.e. to compute the average code length for each one.
Cosmin
Ah! Comparison... great stuff - I loved your use of C++0x in your question's example code, btw.
Jamie Cook