Hi,

I have a regexp I'm using with sed, but now I need to make it work in PHP also. I can't use system calls as they are disabled.

$ cat uglynumbers.txt
Ticket number : 303905694, FOO:BAR:BAR: Some text
Case ID:123,456,789:Foobar - Some other text
303867970;[FOOBAR] Some text goes here
Case Ref: 303658850 - Some random text here - host.tld #78854w
$ cat uglynumbers.txt | sed "s/[, ]//g;s/.*\([0-9]\{9\}\).*/\1/g"
303905694
123456789
303867970
303658850

So, how to do the same with PHP?

I found an example like this, but I can't work out how to fit my regexp into it.

if (preg_match("/.../", $line, $matches)) {
  echo "Match was found";
  echo $matches[0];
}

Please help.

PROBLEM SOLVED! Thank you!

-Samuli

A: 

Try using preg_replace() instead of preg_match(). grep is to sed what preg_match is to preg_replace.
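
As a rough sketch of that suggestion, the two sed substitutions from the question could be chained as two preg_replace() calls per line. The function name extractTicketNumber and the inline sample array are made up for illustration; the patterns are the ones from the question, rewritten in PCRE syntax (no 'g' flag, unescaped parentheses and braces). The type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper mirroring: sed "s/[, ]//g;s/.*\([0-9]\{9\}\).*/\1/g"
function extractTicketNumber(string $line): string
{
    // Step 1: drop every comma and space (preg_replace replaces all matches by default)
    $line = preg_replace('/[, ]/', '', $line);

    // Step 2: replace the whole line with the captured nine-digit run
    return preg_replace('/.*([0-9]{9}).*/', '$1', $line);
}

// Sample lines copied from the question
$lines = [
    'Ticket number : 303905694, FOO:BAR:BAR: Some text',
    'Case ID:123,456,789:Foobar - Some other text',
    '303867970;[FOOBAR] Some text goes here',
    'Case Ref: 303658850 - Some random text here - host.tld #78854w',
];

foreach ($lines as $line) {
    echo extractTicketNumber($line), "\n";
}
```

Like the sed original, a line that contains no nine-digit run is passed through with only its commas and spaces removed, so you may want to check the result before using it.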

Adam Rosenfield
+1  A: 

Hi,

preg_replace is the function you are looking for. You can pass arrays as the pattern and replacement parameters; the patterns are then applied one after another.

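// PCRE has no 'g' modifier (preg_replace already replaces every match),
// and groups and quantifiers are written unescaped: ( ) and { }.
$pattern = array('/[, ]/', '/.*([0-9]{9}).*/');
$replace = array('', '$1');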

foreach($lines as $line) {
   $newlines[] = preg_replace($pattern, $replace, $line);
}
Czimi
+1  A: 

Your sed command is really two substitutions: the first strips commas and spaces, and the second keeps only the run of nine consecutive digits.

The first half of your sed command maps directly onto the preg_replace() function.

//`sed s/regex/replace_value/flags`

preg_replace('/regex/flags', 'replace_value', $input);

The second half of your sed command can be done with preg_match_all():

//`sed ...;s/regex/\1/flags`

$matches_array = array();
// The third parameter is already declared by reference, so no & at the call site
preg_match_all('/regex/flags', $input, $matches_array);

So your specific code will look something like:

<?php
$input = file_get_contents('uglynumbers.txt');

$input = preg_replace('/[, ]/m','', $input);

$matches = array();
//No need for the .* or groupings, just match all occurrences of [0-9]{9}
if( preg_match_all('/[0-9]{9}/m', $input, $matches) )
{
    //...
    var_dump($matches);
}


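In sed, the 'g' flag means "replace every match on the line" rather than only the first one. preg_replace() and preg_match_all() already act on every match, so no equivalent modifier is needed, and PCRE rejects 'g' as an unknown modifier. The 'm' modifier used above only changes how ^ and $ anchor in multi-line input, so it is harmless here but has no effect on these patterns; see the manual on PCRE modifiers.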
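
To illustrate the point about modifiers, here is a small sketch comparing the default behaviour of preg_replace() with its optional fourth parameter, $limit. The sample string and variable names are chosen only for illustration.

```php
<?php
$subject = 'Case ID:123,456,789';

// Default: every match is replaced, like sed's s/[, ]//g
$all = preg_replace('/[, ]/', '', $subject);

// With $limit = 1 only the first match is replaced, like sed's s/[, ]// without 'g'
$first = preg_replace('/[, ]/', '', $subject, 1);

echo $all, "\n";   // CaseID:123456789
echo $first, "\n"; // CaseID:123,456,789
```

So a sed 'g' flag simply disappears when the expression is ported to PHP, and the $limit argument is only needed in the opposite case.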

dcousineau
I have this one problem with every test: Warning: preg_replace() [function.preg-replace]: Unknown modifier 'g' in ...
boogie
@boogie: http://php.net/manual/en/reference.pcre.pattern.modifiers.php I don't know SED that well so find the appropriate modifiers there
dcousineau
@boogie: it looks like 'g' means "match all" in sed. In this case you don't need it, though you might want to add an 'm' for multiline.
dcousineau