So I'm working on a project that will allow users to enter poker hand histories from sites like PokerStars and then display the hand to them.

It seems that regex would be a great tool for this; however, I rank my regex knowledge at "slim to none".

I'm using PHP and looping through this block of text line by line. On lines like this:

Seat 1: fabulous29 (835 in chips)

Seat 2: Nioreh_21 (6465 in chips)

Seat 3: Big Loads (3465 in chips)

Seat 4: Sauchie (2060 in chips)

I want to extract seat number, name, & chip count so the format is

Seat [number]: [letters&numbers&characters] ([number] in chips)

I have NO IDEA where to start or what commands I should even be using to optimize this.

Any advice is greatly appreciated - even if it is just a link to a tutorial on PHP regex or the name of the command(s) I should be using.

Cheers!

+2  A: 

Look at the PCRE section in the PHP Manual. Also, http://www.regular-expressions.info/ is a great site for learning regex. Disclaimer: Regex is very addictive once you learn it.

Bill Williams
+4  A: 

I'm not entirely sure what to use for that without trying it, but a great tool I use all the time to validate my regex is RegExr. It gives you a great Flash interface for trying out your regex, including real-time matching and a library of predefined snippets to use. Definitely a great time saver :)

Adam Haile
+2  A: 

I always use the preg_ set of functions for regex in PHP because the Perl-compatible expressions have much more capability. That extra capability doesn't necessarily come into play here, but they are also supposed to be faster, so why not use them anyway, right?

For an expression, try this:

/Seat (\d+): ([^ ]+) \((\d+)/

You can use preg_match() on each line, storing the results in an array. You can then get at those results and manipulate them as you like.
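As a rough sketch of that line-by-line approach (not the exact expression above): the name group here is widened to (.+) so that names containing spaces, such as "Big Loads", also match. The $lines array and the non-seat line in it are made up for illustration.

```php
<?php
// Sample lines from the question, plus one hypothetical non-seat line.
$lines = [
    'Seat 1: fabulous29 (835 in chips)',
    'Seat 3: Big Loads (3465 in chips)',
    'some other line of the hand history', // assumption: not a seat line
];

$seats = [];
foreach ($lines as $line) {
    // $m[0] is the whole match; $m[1]..$m[3] are the three capture groups.
    if (preg_match('/^Seat (\d+): (.+) \((\d+) in chips\)/', $line, $m)) {
        $seats[] = [
            'seat'  => (int) $m[1],
            'name'  => $m[2],
            'chips' => (int) $m[3],
        ];
    }
}

print_r($seats);
```

Lines that do not match are simply skipped, because preg_match() returns 0 for them.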

EDIT:

Btw, you could also run preg_match_all on the entire block of text (instead of looping through line-by-line) and get the results that way, too.
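A minimal sketch of the preg_match_all() variant, assuming the whole hand history is already in one string ($text is a made-up variable name). The PREG_SET_ORDER flag groups the results per matched line, and the m modifier lets ^ match at the start of every line.

```php
<?php
// The four seat lines from the question, as one block of text.
$text = "Seat 1: fabulous29 (835 in chips)\n"
      . "Seat 2: Nioreh_21 (6465 in chips)\n"
      . "Seat 3: Big Loads (3465 in chips)\n"
      . "Seat 4: Sauchie (2060 in chips)\n";

// (.+) is used for the name so that names with spaces are matched too.
$pattern = '/^Seat (\d+): (.+) \((\d+) in chips\)/m';

// $count is the number of full matches; $rows holds one array per match.
$count = preg_match_all($pattern, $text, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    echo "Seat {$row[1]}: {$row[2]} has {$row[3]} chips\n";
}
```

Without PREG_SET_ORDER the results come back grouped by capture group instead (all seat numbers in one array, all names in another), which is usually less convenient for this kind of data.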

Brian Warshaw
A: 

Thanks, that RegExr link was fantastic.

Andrew G. Johnson
A: 

Seat [number]: [letters&numbers&characters] ([number] in chips)

Your regex should look something like this:

Seat (\d+): ([a-zA-Z0-9]+) \((\d+) in chips\)

The unescaped brackets let you capture the seat number, name and number of chips in groups; the escaped ones, \( and \), match the literal brackets around the chip count.

Kibbee
A: 

You'll have to split the file by line breaks, then loop through each line and apply the following logic:

// Index 0 of $matches is the whole match, so the groups start at 1.
$seat = 1;
$name = 2;
$chips = 3;

// $file is assumed to be an array of lines.
foreach ($file as $string) {
  // The pattern needs delimiters (/.../); the space in the name class allows names like "Big Loads".
  if (preg_match('/Seat ([0-9]+): ([A-Za-z_0-9 ]*) \(([0-9]*) in chips\)/', $string, $matches)) {
    echo "Seat: " . $matches[$seat] . "<br>";
    echo "Name: " . $matches[$name] . "<br>";
    echo "Chips: " . $matches[$chips] . "<br>";
  }
}

I haven't run this code, so you may have to fix some errors...

Roy Rico
A: 

Here's what I'm currently using:

preg_match("/(Seat \d+: [A-Za-z0-9 _-]+) \((\d+) in chips\)/",$line)
Andrew G. Johnson
+1  A: 

Check out preg_match. Probably looking for something like...

<?php
$str = 'Seat 1: fabulous29 (835 in chips)';
preg_match('/Seat (?<seatNo>\d+): (?<name>\w+) \((?<chipCnt>\d+) in chips\)/', $str, $matches);
print_r($matches);
?>

It's been a while since I did php, so this *could be a little or a lot off.*
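For what it's worth, a small sketch of how the named groups can be read back from $matches, using another line from the question. It also shows one limitation to be aware of: \w does not match a space, so a name like "Big Loads" is not matched by this pattern.

```php
<?php
$pattern = '/Seat (?<seatNo>\d+): (?<name>\w+) \((?<chipCnt>\d+) in chips\)/';

$str = 'Seat 2: Nioreh_21 (6465 in chips)';
if (preg_match($pattern, $str, $matches)) {
    // Named groups are available by name as well as by number.
    echo $matches['seatNo'], ' ', $matches['name'], ' ', $matches['chipCnt'], "\n";
}

// \w+ stops at the space in "Big Loads", so this line does not match at all.
$spaced = preg_match($pattern, 'Seat 3: Big Loads (3465 in chips)');
var_dump($spaced);
```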

Joel Meador
+3  A: 

Something like this might do the trick:

/Seat (\d+): ([^\(]+) \((\d+)in chips\)/

And some basic explanation on how Regex works:

  • \d = digit.

  • \<character> = escapes character, if not part of any character class or subexpression. for example:

    \t would render a tab, while \\t would render "\t" (since the backslash is escaped).

  • + = one or more of the preceding element.

  • * = zero or more of the preceding element.

  • [ ] = bracket expression. Matches any of the characters within the bracket. Also works with ranges (ex. A-Z).

  • [^ ] = Matches any character that is NOT within the bracket.

  • ( ) = Marked subexpression. The data matched within this can be recalled later.
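To see these building blocks in isolation, here is a small sketch that applies each one to a line from the question. The variable names are only for illustration.

```php
<?php
$line = 'Seat 3: Big Loads (3465 in chips)';

// \d+ : one or more digits. The first run of digits is the seat number.
preg_match('/\d+/', $line, $m);
$firstDigits = $m[0];

// [A-Z] : any single character in the range A to Z.
preg_match('/[A-Z]/', $line, $m);
$firstUpper = $m[0];

// [^(]+ : one or more characters that are NOT an opening parenthesis.
preg_match('/[^(]+/', $line, $m);
$beforeParen = $m[0];

// \( is an escaped, literal parenthesis; ( ) is a subexpression whose
// content can be recalled afterwards as $m[1].
preg_match('/\((\d+)/', $line, $m);
$chips = $m[1];

// * : zero or more. \s* also matches when there is no whitespace at all.
$star = preg_match('/Seat\s*3/', 'Seat3');

var_dump($firstDigits, $firstUpper, $beforeParen, $chips, $star);
```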

Anyway, I chose to use

([^\(]+)

since the example provides a name containing spaces (Seat 3 in the example). It matches every character up to the first opening parenthesis. Because the full pattern above has a literal space between this group and \(, the regex engine backtracks one character and the captured name does not include the trailing space. If you leave that space out of the pattern, the capture ends with a blank space, which can easily be stripped using the trim() function in PHP.
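A quick way to check how the space before the parenthesis affects the captured name is to run both forms of the pattern side by side. This is only a sketch; the chip-count part uses the version with a space before "in chips".

```php
<?php
$line = 'Seat 3: Big Loads (3465 in chips)';

// Literal space before \( : the name group gives the space back.
preg_match('/Seat (\d+): ([^\(]+) \((\d+) in chips\)/', $line, $a);

// No space before \( : the name group keeps the trailing blank.
preg_match('/Seat (\d+): ([^\(]+)\((\d+) in chips\)/', $line, $b);

// trim() removes leading and trailing whitespace from the capture.
$trimmed = trim($b[2]);

var_dump($a[2], $b[2], $trimmed);
```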

If you do not want to match spaces, only alphanumeric characters, you could do something like this:

([A-Za-z0-9-_]+)

Which would match any letter (within A-Z, both upper- & lower-case), number as well as hyphens and underscores.

Or the same variant, with spaces:

([A-Za-z0-9-_\s]+)

Where "\s" matches any whitespace character (a space, but also a tab or a line break).
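To compare the two character classes, a sketch like the following checks each player name from the question against both. The hyphen is moved to the end of the class here, which is a common way to make sure it is read as a literal hyphen rather than as part of a range; that placement is my choice, not part of the patterns above.

```php
<?php
$names = ['fabulous29', 'Nioreh_21', 'Big Loads', 'Sauchie'];

// Letters, digits, underscore, hyphen.
$noSpaces = '/^[A-Za-z0-9_-]+$/';
// The same, plus any whitespace character.
$withSpaces = '/^[A-Za-z0-9_\s-]+$/';

foreach ($names as $name) {
    printf(
        "%-12s without \\s: %d  with \\s: %d\n",
        $name,
        preg_match($noSpaces, $name),
        preg_match($withSpaces, $name)
    );
}
```

Only "Big Loads" is rejected by the first class, because of the space in the name.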

Hope this helps :)

Andy
\((\d+)in chips\) should have a space, like this: \((\d+) in chips\)
OIS
A: 

http://us3.php.net/manual/en/function.sscanf.php
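A minimal sketch of the sscanf() route, assuming the line always follows the "Seat N: name (N in chips)" layout. %d reads an integer, and the %[^(] conversion reads everything up to the opening parenthesis, so the name needs a trim() afterwards.

```php
<?php
$line = 'Seat 3: Big Loads (3465 in chips)';

// Without extra arguments, sscanf() returns the parsed values as an array.
$parts = sscanf($line, 'Seat %d: %[^(](%d in chips)');

// A null in the last slot would mean the line did not fully match the format.
if (is_array($parts) && count($parts) === 3 && $parts[2] !== null) {
    $seat  = $parts[0];
    $name  = trim($parts[1]); // strip the blank before the parenthesis
    $chips = $parts[2];
    echo "Seat $seat: $name has $chips chips\n";
}
```

Unlike the regex versions, this hands back the numbers as integers rather than strings.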

     

Kevin
A: 

To process the whole input string at once, use preg_match_all():

preg_match_all('/Seat (\d+): \w+ \((\d+) in chips\)/', $input, $matches);

For your input string (held in $input here), var_dump of $matches will look like this. Note that Seat 3 is missing, because \w+ does not match the space in "Big Loads":

array
  0 => 
    array
      0 => string 'Seat 1: fabulous29 (835 in chips)' (length=33)
      1 => string 'Seat 2: Nioreh_21 (6465 in chips)' (length=33)
      2 => string 'Seat 4: Sauchie (2060 in chips)' (length=31)
  1 => 
    array
      0 => string '1' (length=1)
      1 => string '2' (length=1)
      2 => string '4' (length=1)
  2 => 
    array
      0 => string '835' (length=3)
      1 => string '6465' (length=4)
      2 => string '2060' (length=4)

On learning regex: Get Mastering Regular Expressions, 3rd Edition. Nothing else comes close to this book if you really want to learn regex. Despite being the definitive guide to regex, the book is very beginner friendly.

Imran