I'm trying to build a regex that will strip out any characters that do not fit the format:

any number of digits, then optional (single decimal point, any number of digits)

i.e.
123            // 123
123.123        // 123.123
123.123.123a   // 123.123123
123a.123       // 123.123

I am using ereg_replace in PHP, and the closest to a working regex I have managed is

ereg_replace("[^.0-9]+", "", $data);

which is almost what I need (except that it allows any number of decimal points)

i.e.
123.123.123a    // 123.123.123

My next attempt was

ereg_replace("[^0-9]+([^.]?[^0-9]+)?", "", $data);

which was meant to translate as
[^0-9]+        // any number of digits, followed by
(              // start of optional segment
  [^.]?        // decimal point (0 or 1 times) followed by
  [^0-9]+      // any number of digits
)              // end of optional segment
?              // optional segment to occur 0 or 1 times

but this just seems to allow any number of digits and nothing else.

Please help

Thanks

+4  A: 

Try these steps:

  1. remove any character except 0-9 and .
  2. remove any . after the first decimal point.

Here’s an implementation with regular expressions:

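// Step 1: remove everything except digits and dots.
$str = preg_replace('/[^0-9.]+/', '', $str);
// Step 2: keep the part up to the first dot and strip the dots from the rest.
// Note: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0.
$str = preg_replace('/^([0-9]*\.)(.*)/e', '"$1".str_replace(".", "", "$2")', $str);
$val = floatval($str);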
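
On PHP versions where the `e` modifier is no longer available, the same two steps can probably be written with `preg_replace_callback`. This is only a sketch: the function name `clean_decimal` is made up for illustration and is not part of the answer above, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper; the name clean_decimal is an assumption for this sketch.
function clean_decimal(string $str): string
{
    // Step 1: remove everything except digits and dots.
    $str = preg_replace('/[^0-9.]+/', '', $str);

    // Step 2: group 1 is the text up to and including the first dot,
    // group 2 is the remainder, from which all further dots are stripped.
    // If the string contains no dot, the pattern does not match and the
    // string is returned unchanged.
    return preg_replace_callback(
        '/^([0-9]*\.)(.*)/',
        function (array $m): string {
            return $m[1] . str_replace('.', '', $m[2]);
        },
        $str
    );
}

echo clean_decimal('123.123.123a'), "\n"; // 123.123123
echo clean_decimal('123a.123'), "\n";     // 123.123
```

The callback receives the match array directly, so nothing is evaluated as code from a string.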

And another one with just one regular expression:

$str = preg_replace('/[^0-9.]+/', '', $str);
if (($pos = strpos($str, '.')) !== false) {
    $str = substr($str, 0, $pos+1).str_replace('.', '', substr($str, $pos+1));
}
$val = floatval($str);
Gumbo
Thank you, both work like a charm. Out of interest, can you explain the "$1" and "$2"? Are these backreferences from the regex? (+1 upvote)
JimmyJ
Yes, `$1` and `$2` are references to the groups matched by the `/^([0-9]*\.)(.*)/` pattern. The `e` flag causes the replacement to be treated as an expression that is executed, with its return value used as the replacement. So `'"$1".str_replace(".", "", "$2")'` is evaluated after `$1` and `$2` have been replaced with the matched values.
Gumbo
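
To see what the two groups capture, one option is to run the same pattern through `preg_match` and look at the match array. This is just an illustration using the sample input from the question after step 1; the variable names are arbitrary.

```php
<?php
// Illustrative only: inspect the groups captured by the pattern.
$matched = preg_match('/^([0-9]*\.)(.*)/', '123.123.123', $m);

// $m[0] is the whole match, $m[1] the first group, $m[2] the second group.
echo $m[1], "\n"; // 123.
echo $m[2], "\n"; // 123.123
```

In the replacement string, `$1` and `$2` stand for the same values as `$m[1]` and `$m[2]` here.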
Thanks again, I think I need to go to regex school ;)
JimmyJ
Use the second example if you don’t understand regular expressions very well. Both do the same thing.
Gumbo
+1  A: 

This should be faster, actually. And it is way more readable. ;-)

$s = preg_replace('/[^.0-9]/', '', '123.123a.123');
if (1 < substr_count($s, '.')) {
    $a = explode('.', $s);
    $s = array_shift($a) . '.' . implode('', $a);
}
Philippe Gerber
Thank you, this also works well (+1 upvote)
JimmyJ