Hi,

In another question I learned how to detect a straight in a poker hand using regex (here).

Now, out of curiosity: can I use regex to compute the same thing using ASCII codes?

Something like:

regex: [C][C+1][C+2][C+3][C+4], where C is the ASCII code (or something like this)

Matches: 45678, 23456

Doesn't match: 45679 or 23459 (not in sequence)

+3  A: 

Something like regex: [C][C+1][C+2][C+3][C+4], where C is the ASCII code (or something like this)

You cannot do anything remotely close to this in most regex flavors. This is simply not the kind of pattern that regex is designed for.

No mainstream regex flavor has a construct that will succinctly match any two consecutive characters that differ by x in their ASCII encoding.
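For contrast, the "each code is one more than the previous" test is easy outside of regex. A minimal sketch in plain Java, assuming the input is already in sequence order; the class name ConsecutiveCodes and the helper isConsecutiveRun are hypothetical names used only for illustration:

```java
class ConsecutiveCodes {
    // Hypothetical helper: true if s has exactly `length` characters and
    // every character code is exactly one more than the one before it.
    static boolean isConsecutiveRun(String s, int length) {
        if (s.length() != length) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (s.charAt(i) - s.charAt(i - 1) != 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isConsecutiveRun("45678", 5)); // true
        System.out.println(isConsecutiveRun("23456", 5)); // true
        System.out.println(isConsecutiveRun("45679", 5)); // false
    }
}
```

This only works for characters whose codes really are consecutive, such as the digits 2 to 9; it says nothing about T, J, Q, K and A.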


For instructional purposes...

Here you go (see also on ideone.com):

    String alpha = "ABCDEFGHIJKLMN";
    String p = alpha.replaceAll(".(?=(.))", "$0(?=$1|\\$)|") + "$";

    System.out.println(p);
    // A(?=B|$)|B(?=C|$)|C(?=D|$)|D(?=E|$)|E(?=F|$)|F(?=G|$)|G(?=H|$)|
    // H(?=I|$)|I(?=J|$)|J(?=K|$)|K(?=L|$)|L(?=M|$)|M(?=N|$)|N$

    String p5 = String.format("(?:%s){5}", p);

    String[] tests = {
        "ABCDE",    // true
        "JKLMN",    // true
        "AAAAA",    // false
        "ABCDEFGH", // false
        "ABCD",     // false
        "ACEGI",    // false
        "FGHIJ",    // true
    };
    for (String test : tests) {
        System.out.printf("[%s] : %s%n",
            test,
            test.matches(p5)
        );
    }

This uses a meta-regexing technique: one regex generates another pattern. The generated pattern ensures, using lookahead, that each character is followed by its successor in alpha (or by the end of the string). That pattern is then wrapped in (?:…){5} so that it must match exactly 5 times in a row.

You can substitute alpha with your poker sequence as necessary.

Note that this is an ABSOLUTELY IMPRACTICAL solution. It's much more readable to e.g. just check if alpha.contains(test) && (test.length() == 5).
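The plain check mentioned above could look roughly like this. It is only a sketch: the class name ContainsCheck and the method isStraight are made-up names, and alpha is assumed to hold the full ordered sequence.

```java
class ContainsCheck {
    // Hypothetical helper: a hand is a straight if it has exactly 5
    // characters and appears as a contiguous run inside alpha.
    static boolean isStraight(String alpha, String test) {
        return test.length() == 5 && alpha.contains(test);
    }

    public static void main(String[] args) {
        String alpha = "ABCDEFGHIJKLMN";
        System.out.println(isStraight(alpha, "ABCDE"));    // true
        System.out.println(isStraight(alpha, "ABCDEFGH")); // false
        System.out.println(isStraight(alpha, "ACEGI"));    // false
    }
}
```

It gives the same answers as the generated pattern for the test strings listed earlier.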


polygenelubricants
But it would be nice if regex had it, wouldn't it?
Topera
@Topera: would be nice if regex can play chess, too. But hey, at least regex can "swap boobs" (Pascal Thivent's words, not mine) http://stackoverflow.com/questions/3349814/how-do-i-put-a-value-of-an-array-into-another-one/3350191#3350191
polygenelubricants
@polygenelubricants: 2-3-4-5-6 is a pattern (ASCII code +1). Playing chess isn't. :)
Topera
@polygenelubricants: if someday I find this in regex, would you buy me a Coke? Huhauhaa? (I accept bounties too, man!)
Topera
@Topera: You've challenged me, so see if my latest answer is up to your challenge.
polygenelubricants
@polygenelubricants: +1 - meta-regexing is cool! But it still isn't the answer. Maybe it really doesn't exist... :(
Topera
@polygenelubricants: by the way: great answer!
Topera
+4  A: 

Topera, your main problem is really going to be that you're not using ASCII encodings for your hands, you're using numerics and non-consecutive, non-ordered characters for the face cards. You need to detect, at the start of the strings, 2345A, 23456, 34567, ..., 6789T, 789TJ, 89TJQ, 9TJQK and TJQKA.

These are not consecutive ASCII codes and, even if they were, you would run into problems since both A2345 and TJQKA are valid and you won't get A being both less than and greater than the other characters in the same character set.

If it has to be done by a regex, then the regex segment I gave for your other question:

(2345A|23456|34567|45678|56789|6789T|789TJ|89TJQ|9TJQK|TJQKA)

is probably the easiest and most readable one you'll get.
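As a rough illustration of how that alternation might be applied in Java, here is a sketch. It assumes the hand has been reduced to a string of rank characters sorted in the order used above, with the ace written last. The class name StraightPattern and the method startsWithStraight are hypothetical.

```java
import java.util.regex.Pattern;

class StraightPattern {
    // The alternation from the answer, compiled once.
    static final Pattern STRAIGHT = Pattern.compile(
        "(2345A|23456|34567|45678|56789|6789T|789TJ|89TJQ|9TJQK|TJQKA)");

    // lookingAt() anchors the match at the start of the input but does
    // not require the whole input to match.
    static boolean startsWithStraight(String ranks) {
        return STRAIGHT.matcher(ranks).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(startsWithStraight("2345A")); // true
        System.out.println(startsWithStraight("TJQKA")); // true
        System.out.println(startsWithStraight("2346A")); // false
    }
}
```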

paxdiablo
@pax - tks. I know that I need consecutive, ordered chars. So A23456789TJQKA can be mapped to ABCDEFGHIJKLMN.
Topera
But your 'A' can't be two positions at once, can it?
dash-tom-bang
+3  A: 

There is no regex that will do what you want as the other answers have pointed out, but you did say that you want to learn regex, so here's another meta-regex approach that may be instructional.

Here's a Java snippet that, given a string, programmatically generates a pattern that will match any substring of that string of length 5.

    String seq = "ABCDEFGHIJKLMNOP";
    System.out.printf("^(%s)$",
        seq.replaceAll(
            "(?=(.{5}).).",
            "$1|"
        )
    );

The output is (as seen on ideone.com):

^(ABCDE|BCDEF|CDEFG|DEFGH|EFGHI|FGHIJ|GHIJK|HIJKL|IJKLM|JKLMN|KLMNO|LMNOP)$

You can use this to conveniently generate the regex pattern to match straight poker hands, by initializing seq as appropriate.
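For example, a sketch that wraps the same replaceAll call in a method and feeds it the rank sequence mentioned in the comments, A23456789TJQKA. The class name StraightPatternGenerator and the method generate are hypothetical, and seq is assumed to have at least 5 characters.

```java
class StraightPatternGenerator {
    // Same technique as the snippet above: every position that can see
    // 6 characters ahead is replaced by the next 5 characters plus "|";
    // the last 5 characters are left over as the final alternative.
    static String generate(String seq) {
        return String.format("^(%s)$",
            seq.replaceAll("(?=(.{5}).).", "$1|"));
    }

    public static void main(String[] args) {
        System.out.println(generate("A23456789TJQKA"));
        // ^(A2345|23456|34567|45678|56789|6789T|789TJ|89TJQ|9TJQK|TJQKA)$
    }
}
```

Note that with this seq the ace-low straight comes out as A2345 (ace first), so the hand would need to be written in that order for the generated pattern to match it.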


How it works

The . metacharacter matches "any" character (line separators may be an exception depending on the mode we're in).

The {5} is an exact repetition specifier, so .{5} matches exactly 5 characters.

(?=…) is positive lookahead; it asserts that a given pattern can be matched, but since it's only an assertion, it doesn't actually make (i.e. consume) the match from the input string.

A plain (…) is a capturing group. It creates a backreference that you can use later in the pattern, in substitutions, or however you see fit.

The pattern is repeated here for convenience:

     match one char
        at a time
           |
(?=(.{5}).).
\_________/
 must be able to see 6 chars ahead
 (capture the first 5)

The pattern works by matching one character . at a time. Before that character is matched, however, we assert (?=…) that we can see a total of 6 characters ahead (.{5})., capturing (…) into group 1 the first .{5}. For every such match, we replace with $1|, that is, whatever was captured by group 1, followed by the alternation metacharacter.

Let's consider what happens when we apply this to a shorter String seq = "ABCDEFG";. The ↑ denotes our current position.

=== INPUT ===                                    === OUTPUT ===

 A B C D E F G                                   ABCDE|BCDEFG
↑
We can assert (?=(.{5}).), matching ABCDEF
in the lookahead. ABCDE is captured.
We now match A, and replace with ABCDE|

 A B C D E F G                                   ABCDE|BCDEF|CDEFG
  ↑
We can assert (?=(.{5}).), matching BCDEFG
in the lookahead. BCDEF is captured.
We now match B, and replace with BCDEF|

 A B C D E F G                                   ABCDE|BCDEF|CDEFG
    ↑
Can't assert (?=(.{5}).), skip forward

 A B C D E F G                                   ABCDE|BCDEF|CDEFG
      ↑
Can't assert (?=(.{5}).), skip forward

 A B C D E F G                                   ABCDE|BCDEF|CDEFG
        ↑
Can't assert (?=(.{5}).), skip forward

       :
       :

 A B C D E F G                                   ABCDE|BCDEF|CDEFG
              ↑
Can't assert (?=(.{5}).), and we are at
the end of the string, so we're done.

So we get ABCDE|BCDEF|CDEFG, which are all the substrings of length 5 of seq.
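The walkthrough can be reproduced with a Matcher loop, which shows at each successful step what is consumed (a single character) and what the lookahead captured into group 1. This is only a sketch; the class name LookaheadTrace and the method trace are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadTrace {
    // Records "position:consumed->captured" for every match of the pattern.
    static List<String> trace(String seq) {
        List<String> steps = new ArrayList<>();
        Matcher m = Pattern.compile("(?=(.{5}).).").matcher(seq);
        while (m.find()) {
            // m.group() is the one consumed character; m.group(1) is the
            // 5-character capture made inside the lookahead.
            steps.add(m.start() + ":" + m.group() + "->" + m.group(1));
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(trace("ABCDEFG"));
        // [0:A->ABCDE, 1:B->BCDEF]
    }
}
```

Only positions 0 and 1 match, so only A and B are replaced; the untouched tail CDEFG becomes the last alternative.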


polygenelubricants
Poly, tks a lot for your explanation.
Topera
Wow, good effort put into teaching!
LarsH