Edit: to clarify, the problem is with the second algorithm.

I have a bit of C++ code that samples cards from a 52 card deck, which works just fine:

void sample_allcards(int table[5], int holes[], int players) {
    int temp[5 + 2 * players];
    bool try_again;
    int c, n, i;

    for (i = 0; i < 5 + 2 * players; i++) {
        try_again = true;
        while (try_again == true) {
            try_again = false;
            c = fast_rand52();
            // reject collisions
            for (n = 0; n < i + 1; n++) {
                try_again = (temp[n] == c) || try_again;
            }
            temp[i] = c;
        }
    }
    copy_cards(table, temp, 5);
    copy_cards(holes, temp + 5, 2 * players);
}

I am implementing code to sample the hole cards according to a known distribution (stored as a 2d table). My code for this looks like:

void sample_allcards_weighted(double weights[][HOLE_CARDS], int table[5], int holes[], int players) {
    // weights are distribution over hole cards
    int temp[5 + 2 * players];
    int n, i;

    // table cards
    for (i = 0; i < 5; i++) {
        bool try_again = true;
        while (try_again == true) {
            try_again = false;
            int c = fast_rand52();
            // reject collisions
            for (n = 0; n < i + 1; n++) {
                try_again = (temp[n] == c) || try_again;
            }
            temp[i] = c;
        }
    }

    for (int player = 0; player < players; player++) {
        // hole cards according to distribution
        i = 5 + 2 * player;
        bool try_again = true;
        while (try_again == true) {
            try_again = false;
            // weighted-sample c1 and c2 at once
            // h is a number < 1325
            int h = weighted_randi(&weights[player][0], HOLE_CARDS);
            // i2h uses h and sets temp[i] to the 2 cards implied by h
            i2h(&temp[i], h);
            // reject collisions
            for (n = 0; n < i; n++) {
                try_again = (temp[n] == temp[i]) || (temp[n] == temp[i+1]) || try_again;
            }
        }
    }

    copy_cards(table, temp, 5);
    copy_cards(holes, temp + 5, 2 * players);
}

My problem? The weighted sampling algorithm is a factor of 10 slower. Speed is very important for my application.

Is there a way to improve the speed of my algorithm to something more reasonable? Am I doing something wrong in my implementation?

Thanks.

Edit: I was asked about this function, which I should have posted, since it is key:

inline int weighted_randi(double *w, int num_choices) {
    double r = fast_randd();
    double threshold = 0;
    int n;

    for (n = 0; n < num_choices; n++) {
        threshold += *w;
        if (r <= threshold) return n;
        w++;
    }
    // shouldn't get this far
    cerr << n << "\t" << threshold << "\t" << r << endl;
    assert(n < num_choices);
    return -1;
}

...and i2h() is basically just an array lookup.
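
For illustration, a lookup of that kind could be built roughly like this (a sketch only, not my exact code; it assumes cards are numbered 0-51 and that HOLE_CARDS is the number of distinct two-card combinations, C(52,2) = 1326):

// Illustrative only: a precomputed table mapping a hole index h to its two cards.
static int hole_table[1326][2];

void init_hole_table() {
    int h = 0;
    for (int c1 = 0; c1 < 52; c1++)
        for (int c2 = c1 + 1; c2 < 52; c2++) {
            hole_table[h][0] = c1;
            hole_table[h][1] = c2;
            h++;
        }
}

// i2h: write the two cards implied by index h into dest[0] and dest[1].
inline void i2h(int *dest, int h) {
    dest[0] = hole_table[h][0];
    dest[1] = hole_table[h][1];
}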

+1  A: 

Your collision-rejection loops are turning an O(n) algorithm into (I think) an O(n^2) operation.

There are two ways to select cards from a deck: shuffle and pop, or pick repeatedly until the elements of the set are unique; you are doing the latter, which requires a considerable amount of backtracking.

I didn't look at the details of the code, just a quick scan.
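
A rough sketch of the shuffle-and-pop idea, in case it helps (untested; it draws only as many cards as it needs via a partial Fisher-Yates shuffle, and uses std::mt19937 as a stand-in for whatever RNG you prefer):

#include <algorithm>
#include <random>
#include <utility>

void sample_allcards_shuffled(int table[5], int holes[], int players) {
    static int deck[52];
    static bool deck_init = false;
    if (!deck_init) {
        for (int i = 0; i < 52; i++) deck[i] = i;
        deck_init = true;
    }

    static std::mt19937 rng(12345);   // stand-in RNG; seed/replace as appropriate
    const int k = 5 + 2 * players;

    // Partial Fisher-Yates: after step i, deck[0..i] holds i+1 distinct cards.
    for (int i = 0; i < k; i++) {
        std::uniform_int_distribution<int> pick(i, 51);
        std::swap(deck[i], deck[pick(rng)]);
    }

    std::copy(deck, deck + 5, table);
    std::copy(deck + 5, deck + k, holes);
}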

msw
Indeed, `std::random_shuffle` should be both simpler and unbiased (insofar as the underlying random number generator - `rand() % x` by default? - is unbiased).
visitor
Thanks; this is a good observation about the first algorithm, and I may well incorporate your insight. However, it entirely avoids the question: how do I weight the selection of cards?
supert
Just to clarify, I'm using a Mersenne twister RNG, not rand(), but I could implement my own shuffle of course.
supert
Using random_shuffle resulted in a ~30% improvement for the uniformly sampled code (the first routine).
supert
A: 

My guess would be the memcpy(1326*sizeof(double)) within the retry loop. It doesn't seem to change, so does it need to be copied each time?

stefaanv
Good spot, thanks, not sure how I missed that. I just tried moving it out of the loop, it didn't help much.
supert
Did you move it out of both nested loops (for and while) or is that not possible?
stefaanv
I eliminated the memcpy by simply doing `int h = weighted_randi(&weights[player][0], HOLE_CARDS);`, as in the code above.
supert
Too bad that didn't make a difference (but it did simplify the code a bit). My best bet would be to also have a look at weighted_randi() and i2h(). When nothing apparent can be changed, measure (profile, or do as Mike suggested).
stefaanv
+1  A: 

You could gain some speed by replacing all the loops that check whether a card is taken with a bit mask, e.g. for a pool of 52 cards, we prevent collisions like so:

DWORD dwMask[2] = {0}; // 64 bits, one bit per card
//...
int nCard;
while(true)
{
    nCard = rand_52();
    // nCard >> 5 selects the DWORD, 1 << (nCard & 31) the bit within it
    if(!(dwMask[nCard >> 5] & 1 << (nCard & 31)))
    {
        dwMask[nCard >> 5] |= 1 << (nCard & 31);
        break;
    }
}
//...
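
The same mask can also replace the inner collision loop of the weighted routine. A hypothetical adaptation, reusing the names from the question (an untested sketch, not drop-in code):

// mark the 5 table cards in dwMask first, then for each player at offset i:
while(true)
{
    int h = weighted_randi(&weights[player][0], HOLE_CARDS);
    i2h(&temp[i], h);
    int c1 = temp[i], c2 = temp[i + 1];
    // accept only if neither card has been dealt yet
    if(!(dwMask[c1 >> 5] & 1 << (c1 & 31)) && !(dwMask[c2 >> 5] & 1 << (c2 & 31)))
    {
        dwMask[c1 >> 5] |= 1 << (c1 & 31);
        dwMask[c2 >> 5] |= 1 << (c2 & 31);
        break;
    }
}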
Necrolis
A: 

Rather than tell you what the problem is, let me suggest how you can find it. Either 1) single-step it in the IDE, or 2) randomly halt it to see what it's doing.

That said, sampling by rejection, as you are doing, can take an unreasonably long time if you are rejecting most samples.

Mike Dunlavey
A: 

Your inner "try_again" for loop should stop as soon as it sets try_again to true - there's no point in doing more work after you know you need to try again.

for (n = 0; n < i && !try_again; n++) {
    try_again = (temp[n] == temp[i]) || (temp[n] == temp[i+1]);
}
Mark B
A: 

The second question, about picking from a weighted set, also has an algorithmic replacement that should be less time complex. It is based on the principle that whatever is pre-computed does not need to be re-computed.

In an ordinary (uniform) selection you have equally sized bins, which makes picking a bin an O(1) operation. Your weighted_randi function has bins of real-valued width, so selection in your current version runs in O(n) time. You don't say (but do imply) that the vector of weights w is constant, so I'll assume that it is.

You aren't interested in the widths of the bins per se; you are interested in the locations of their edges, which you re-compute on every call to weighted_randi via the running variable threshold. If w really is constant, pre-computing the list of edges (that is, the value of threshold at each element of w) is your O(n) step, and it need only be done once. Since the results form a naturally ordered (non-decreasing) list, a binary search on all future calls yields O(log n) time complexity, at an increase in space of only sizeof w / sizeof w[0] extra values.
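
A minimal sketch of that idea (it assumes the per-player weight vector really is constant between calls; std::lower_bound does the binary search, and fast_randd is the questioner's own RNG):

#include <algorithm>
#include <vector>

// Pre-compute the bin edges (cumulative sums) once per constant weight vector.
std::vector<double> make_edges(const double *w, int num_choices) {
    std::vector<double> edges(num_choices);
    double threshold = 0;
    for (int n = 0; n < num_choices; n++) {
        threshold += w[n];
        edges[n] = threshold;
    }
    return edges;
}

// O(log n) replacement for weighted_randi: index of the first edge >= r.
inline int weighted_randi_edges(const std::vector<double> &edges, double r) {
    return static_cast<int>(std::lower_bound(edges.begin(), edges.end(), r) - edges.begin());
}

// usage, with edges built once up front for each player:
//   int h = weighted_randi_edges(player_edges, fast_randd());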

msw