views:

449

answers:

7

This problem sounds simple at first glance, but turns out to be a lot more complicated than it seems. It's got me stumped for the moment.

There are 52c5 = 2,598,960 ways to choose 5 cards from a 52 card deck. However, since suits are interchangeable in poker, many of these are equivalent - the hand 2H 2C 3H 3S 4D is equivalent to 2D 2S 3D 3C 4H - simply swap the suits around. According to wikipedia, there are 134,459 distinct 5 card hands once you account for possible suit recolorings.

The question is, how do we efficiently generate all these possible hands? I don't want to generate all hands, then eliminate duplicates, as I want to apply the problem to larger numbers of cards, and the number of hands to evaluate fast spirals out of control. My current attempts have centered around either generating depth-first, and keeping track of the currently generated cards to determine what suits and ranks are valid for the next card, or breadth-first, generating all possible next cards, then removing duplicates by converting each hand to a 'canonical' version by recoloring. Here's my attempt at a breadth-first solution, in Python:

# A card is represented by an integer. The low 2 bits represent the suit, while
# the remainder represent the rank.
suits = 'CDHS'
ranks = '23456789TJQKA'

def make_canonical(hand):
  suit_map = [None] * 4
  next_suit = 0
  for i in range(len(hand)):
    suit = hand[i] & 3
    if suit_map[suit] is None:
      suit_map[suit] = next_suit
      next_suit += 1
    hand[i] = hand[i] & ~3 | suit_map[suit]
  return hand

def expand_hand(hand, min_card):
  used_map = 0
  for card in hand:
    used_map |= 1 << card

  hands = set()
  for card in range(min_card, 52):
    if (1 << card) & used_map:
      continue
    new_hand = list(hand)
    new_hand.append(card)
    make_canonical(new_hand)
    hands.add(tuple(new_hand))
  return hands

def expand_hands(hands, num_cards):
  for i in range(num_cards):
    new_hands = set()
    for j, hand in enumerate(hands):
      min_card = hand[-1] + 1 if i > 0 else 0
      new_hands.update(expand_hand(hand, min_card))
    hands = new_hands
  return hands

Unfortunately, this generates too many hands:

>>> len(expand_hands(set([()]), 5))
160537

Can anyone suggest a better way to generate just the distinct hands, or point out where I've gone wrong in my attempt?

A: 

Look at Pokersource. The problem gets even worse when you're considering completing hands given some cards already drawn.

The guy behind PokerStove did a great job in this direction, but the source is disclosed.

Alexandre C.
Can you be more specific? I've looked all over the place, and have yet to find anyone enumerating only distinct hands. Pokersource has a lot of material.
Nick Johnson
@Nick: I'm at work and the proxy categorizes pokersource as "gambling"... more info as soon as I get home.
Alexandre C.
+1  A: 

I'm not a poker player, so the details of hand precedence are beyond me. But it seems like the problem is that you are traversing the space of "sets of 5 cards" by generating sets from the deck, when you should be traversing the space of "distinct poker hands".

The space of distinct hands will require a new grammar. The important thing is to capture exactly the information that is relevant to hand precedence. For example, there are only 4 hands that are royal flushes, so those hands can be described as the symbol "RF" plus a suit designator, like "RFC" for royal flush in clubs. A 10-high heart flush could be "FLH10" (not sure if there are other precedence characteristics of flushes, but I think that's all you need to know). A hand that is "2C 2S AH 10C 5D" would be a longer expression, something like "PR2 A 10 5" if I undestand your initial problem statement.

Once you have defined the grammar of distinct hands, you can express it as regular expressions and that will tell you how to generate the entire space of distinct hands. Sounds like fun!

Bill Gribble
You misunderstand - I'm not trying to enumerate outcomes (pair, two pair, etc etc), I'm trying to enumerate all distinct sets of 5 cards, modulo reassignment of suits. Ranking the hands is a completely separate issue. AC AD 2C 3C 4C and AC AD 2H 3H 4H rank the same, but they're different hands, because you can't convert one to the other by swapping suits.
Nick Johnson
I see. In any case, the same technique applies, but my examples are no good. First, identify the grammar of unique items in the space you want to traverse; write it as regular expressions; then traverse the space by expanding the regular expressions.
Bill Gribble
Certainly - but finding out a way to enumerate the unique elements is the whole point of my question.
Nick Johnson
A: 

Generating equivalence classes for 5 card hands is not an easy task. When I need this I usually use the http://www.vpgenius.com/ webpage. At http://www.vpgenius.com/video-poker/games/ you can choose which variety of poker game you need, and in the "Programming tab" you have an section on "Unique Suit Patterns". So just copying that and loading into program might be easier than trying to generate your own.

Rok
This is all for video poker, though - quite distinct from simply generating unique hands.
Nick Johnson
If you use Jacks-or-better variety you have the same distinct hands.
Rok
So, this seems like a reasonable approach. I'd far rather see a description of how to generate all the suit equivalence classes, though - after that, generating all the hands in each class should be fairly straightforward.
Nick Johnson
+1  A: 

You could simply give all hands a canonical ordering of values (A to K), then assign abstract suit letters according to their order of first appearance in that order.

Example: JH 4C QD 9C 3D would convert to 3a 4b 9b Jc Qa.

Generation should work best as dynamic programming:

  • start with a set of a single hand that is empty,
  • make a new set:
    • for each hand in the old set, generate each possible hand by adding one of the remaining cards
    • canonicalize all new hands
    • remove duplicates
Svante
The procedure you describe is exactly what I'm doing in my snippet - but it still generates more hands than should exist, so I'm clearly doing something wrong.
Nick Johnson
I did not see any ordering procedure in your snippet, and the decision to do everything through bit-fiddling really does not benefit the code. My guess would be that you do not canonicalize cards of equal value properly, so that `3D 4H 5D 5H` can become both `3a 4b 5a 5b` and `3a 4b 5b 5a`.
Svante
The ordering procedure is expressed in the 'min_card' parameter to expand_hand - it won't generate ranks lower than that, thus ensuring it's sorted by rank. You appear to be right about the canonicalization, though - can you suggest a way to fix it, however?
Nick Johnson
+2  A: 

Your problem sounded interesting, so i simple tried to implements it by just looping over all possible hands in a sorted way. I've not looked at your code in details, but it seems my implementation is quite different from yours. Guess what count of hands my script found: 160537

  • My hands are always sorted, e.g. 2 3 4 4 D
  • If there are 2 equal cards, the color is also sorted (colors are just called 0,1,2,3)
  • the first card has always color 0, the second color 0 or 1
  • A card can only have the color of an previous card or the next bigger number, e.g. if card 1+2 have color 0, card three can only have the colors 0 or 1

Are you sure, the number on wikipedia is correct?

count = 0
for a1 in range(13):
    c1 = 0
    for a2 in range(a1, 13):
        for c2 in range(2):
            if a1==a2 and c1==c2:
                continue
            nc3 = 2 if c1==c2 else 3
            for a3 in range(a2, 13):
                for c3 in range(nc3):
                    if (a1==a3 and c1>=c3) or (a2==a3 and c2>=c3):
                        continue
                    nc4 = nc3+1 if c3==nc3-1 else nc3
                    for a4 in range(a3, 13):
                        for c4 in range(nc4):
                            if (a1==a4 and c1>=c4) or (a2==a4 and c2>=c4) or (a3==a4 and c3>=c4):
                                continue
                            nc5 = nc4+1 if (c4==nc4-1 and nc4!=4) else nc4
                            for a5 in range(a4, 13):
                                for c5 in range(nc5):
                                    if (a1==a5 and c1>=c5) or (a2>=a5 and c2>=c5) or (a3==a5 and c3>=c5) or (a4==a5 and c4>=c5):
                                        continue
                                    #print([(a1,c1),(a2,c2),(a3,c3),(a4,c4),(a5,c5)])
                                    count += 1
print("result: ",count)
Florianx
Fairly certain. This paper, Enumerating Starting Poker Hands (http://www.math.sfu.ca/~alspach/comp42.pdf) arrives at the same figure as Wikipedia using some combinadics I don't pretend to fully understand. (See the section on Five Card Draw)
Nick Johnson
I really like this coloring scheme, well thought!
Matthieu M.
I found a case for which I produce duplicates: (B0, B1, B2, B3, D0) and (B0, B1, B2, B3, D1) are equal
Florianx
+7  A: 

Your overall approach is sound. I'm pretty sure the problem lies with your make_canonical function. You can try printing out the hands with num_cards set to 3 or 4 and look for equivalencies that you've missed.

I found one, but there may be more:

# The inputs are equivalent and should return the same value
print make_canonical([8, 12 | 1]) # returns [8, 13]
print make_canonical([12, 8 | 1]) # returns [12, 9]

For reference, below is my solution (developed prior to looking at your solution). I used a depth-first search instead of a breadth-first search. Also, instead of writing a function to transform a hand to canonical form, I wrote a function to check if a hand is canonical. If it's not canonical, I skip it. I defined rank = card % 13 and suit = card / 13. None of those differences are important.

import collections

def canonical(cards):
    """
    Rules for a canonical hand:
    1. The cards are in sorted order

    2. The i-th suit must have at least many cards as all later suits.  If a
       suit isn't present, it counts as having 0 cards.

    3. If two suits have the same number of cards, the ranks in the first suit
       must be lower or equal lexicographically (e.g., [1, 3] <= [2, 4]).

    4. Must be a valid hand (no duplicate cards)
    """

    if sorted(cards) != cards:
        return False
    by_suits = collections.defaultdict(list)
    for suit in range(0, 52, 13):
        by_suits[suit] = [card%13 for card in cards if suit <= card < suit+13]
        if len(set(by_suits[suit])) != len(by_suits[suit]):
            return False
    for suit in range(13, 52, 13):
        suit1 = by_suits[suit-13]
        suit2 = by_suits[suit]
        if not suit2: continue
        if len(suit1) < len(suit2):
            return False
        if len(suit1) == len(suit2) and suit1 > suit2:
            return False
    return True

def deal_cards(permutations, n, cards):
    if len(cards) == n:
        permutations.append(list(cards))
        return
    start = 0
    if cards:
        start = max(cards) + 1
    for card in range(start, 52):
        cards.append(card)
        if canonical(cards):
            deal_cards(permutations, n, cards)
        del cards[-1]

def generate_permutations(n):
    permutations = []
    deal_cards(permutations, n, [])
    return permutations

for cards in generate_permutations(5):
    print cards

It generates the correct number of permutations:

Cashew:~/$ python2.6 /tmp/cards.py | wc
134459
Daniel Stutzbach
Thanks! Can you provide a brief description of how the canonical test works? I think I get the basic idea, but a description would be awesome.
Nick Johnson
@Nick: Sure, I've added a docstring to the `canonical` function in my answer.
Daniel Stutzbach
Great answer, thank you! Actually, there is a reason to generate the canonical hand rather than just test for it: some hands have more isomorphisms than others, and knowing how many there are is important in order to know how many 'real' hands each corresponds to.
Nick Johnson
A: 

Initial input:

H 0 0 0 0 0 0 0 0 0 0 0 0 0
C 1 0 0 0 0 0 0 0 0 0 0 0 0
D 1 0 0 0 0 0 0 0 0 0 0 0 0
S 1 1 0 0 0 0 0 0 0 0 0 0 0
+ A 2 3 4 5 6 7 8 9 T J Q K

Step 1: for each rank greater than or equal the highest rank used, set all suits in that rank to 0. you can get away with only checking higher cards because lower combinations will be checked by the lower starting points.

H 0 0 0 0 0 0 0 0 0 0 0 0 0
C 1 0 0 0 0 0 0 0 0 0 0 0 0
D 1 0 0 0 0 0 0 0 0 0 0 0 0
S 1 0 0 0 0 0 0 0 0 0 0 0 0
+ A 2 3 4 5 6 7 8 9 T J Q K

Step 2: Collapse to distinct rows

0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
A 2 3 4 5 6 7 8 9 T J Q K

Step 3: Climb back up determining first suit that match each distinct row, and choose the suits which match the distinct rows (identified by a *)

H 0 * 0 0 0 0 0 0 0 0 0 0 0
C 1 0 0 0 0 0 0 0 0 0 0 0 0
D 1 * 0 0 0 0 0 0 0 0 0 0 0
S 1 1 0 0 0 0 0 0 0 0 0 0 0
+ A 2 3 4 5 6 7 8 9 T J Q K

Now showing the repeat for rank 3

H 0 0 0 0 0 0 0 0 0 0 0 0 0
C 1 0 0 0 0 0 0 0 0 0 0 0 0
D 1 0 0 0 0 0 0 0 0 0 0 0 0
S 1 1 0 0 0 0 0 0 0 0 0 0 0
+ A 2 3 4 5 6 7 8 9 T J Q K

0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 0 0 0 0
A 2 3 4 5 6 7 8 9 T J Q K

H 0 0 * 0 0 0 0 0 0 0 0 0 0
C 1 0 0 0 0 0 0 0 0 0 0 0 0
D 1 0 * 0 0 0 0 0 0 0 0 0 0
S 1 1 * 0 0 0 0 0 0 0 0 0 0
+ A 2 3 4 5 6 7 8 9 T J Q K

Step 4: Once there are 5 cells set to 1, increment the total possible suit abstracted hands count by 1 and recurse up.

I have posted code (albeit mangled due to my phone) in my other answer. This is the description of how it works. The total number of suit abstracted hands possible is 57,027.

NickLarsen
Your number disagrees with Wikipedia and the paper I linked to - you're definitely generating too few hands.
Nick Johnson
You're right, I was seeding it with the lowest card, once it is seeded with an empty hand, it produces 134,459.
NickLarsen