How would I write an if statement that finds phone numbers and stores them in a variable? Here is what I have so far, but it's not working.

if (preg_match('/^(?:(?:\+?1\s*(?:[.-]\s*)?)?(?:\(\s*([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9])\s*\)|([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9]))\s*(?:[.-]\s*)?)?([2-9]1[02-9]|[2-9][02-9]1|[2-9][02-9]{2})\s*(?:[.-]\s*)?([0-9]{4})(?:\s*(?:#|x\.?|ext\.?|extension)\s*(\d+))?$
/', $buffer, $matches))
{
    $phonenumber = html_entity_decode($matches[1]);         
}
A: 

At the start, you're searching for an optional +1 followed by some greedy whitespace. Changing

(?:[.-]\s*)?

to

(?:[.-]\d*?)?

should do the trick, though there may be other problems in your regex.

Mikulas Dite
+1  A: 

Since you're using preg_match(), I'll assume you're using PHP. Phone numbers vary a lot even within N.Am. (11, 10, or 7 digits, varying or no separating characters, etc.), so you may find a function like this easier to deal with than a regex:

function validphone(&$value) { //test for N.Am. phone number and reformat in standard format
    $valid=false;
    $area=NULL;
    $working=preg_replace('/\D/', '', $value); //only digits left; every other character is stripped
    switch (strlen($working)) { //cases fall through from 11 to 7
        case 11: //e.g. 19024355764
            $working=substr($working,1); //trims off the leading digit, assumed to be the country code 1
        case 10: //e.g. 9024355764
            $area=substr($working,0,-7); //first 3 digits are the area code
            $working=substr($working,3); //trims off 1st 3
        case 7: //e.g. 4355764
            $parts=array(substr($working,0,-4),substr($working,-4)); //exchange and line number
            if ($area!==NULL) array_unshift($parts,$area); //7-digit numbers have no area code, so no leading '-'
            $value=implode('-',$parts);
            $valid=true;
            break;
        default:
            $valid=false;
            break;
    }
    return $valid;
}

ETA, in response to your questions in the comments:

You have a string that should be a phone number:

$phonish='blahblah#._foo(123)4567890 ixlybob';
if(validphone($phonish)){ //function checks if $phonish is valid & reformats it in a standard way
  //do something with $phonish, which now equals '123-456-7890'
} else {
  echo 'not a valid phone number';
}

The validphone() function is most appropriate for shortish strings that are expected to be phone numbers. If you dump an entire page into a string and then feed it to validphone($mywholepage), it will strip out and combine all the digits in the string at once. So text with multiple phone numbers will return false, and text that happens to have exactly 11, 10, or 7 digits distributed throughout will return true.
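
For longer text, one way around that limitation is to pull out each candidate number first and normalize the candidates one at a time. The following is only a minimal sketch: the helper name find_phone_numbers() and the loose pattern are illustrative assumptions, not part of the function above, and the pattern will not cover every way a number can be written.

```php
<?php
// Hypothetical helper (an assumption for illustration): collects every
// N.Am.-style phone number candidate in $text and normalizes each one.
function find_phone_numbers($text) {
    // Loose pattern: an optional area code (optionally preceded by +1 or 1,
    // with or without parentheses), then 3 + 4 digits. Parts may be separated
    // by whitespace, dots or dashes. The lookarounds reject longer digit runs.
    $pattern = '/(?<!\d)(?:(?:\+?1[\s.-]*)?(?:\(\d{3}\)|\d{3})[\s.-]*)?\d{3}[\s.-]*\d{4}(?!\d)/';
    preg_match_all($pattern, $text, $matches);

    $found = array();
    foreach ($matches[0] as $candidate) {
        $digits = preg_replace('/\D/', '', $candidate); // keep digits only
        if (strlen($digits) === 11) {
            $digits = substr($digits, 1); // drop the leading country code 1
        }
        if (strlen($digits) === 10) {
            // area code, exchange, line number
            $found[] = substr($digits, 0, 3) . '-' . substr($digits, 3, 3) . '-' . substr($digits, 6);
        } else {
            // 7 digits: exchange and line number only
            $found[] = substr($digits, 0, 3) . '-' . substr($digits, 3);
        }
    }
    return $found;
}

$numbers = find_phone_numbers('Call (902) 435-5764 or 1-800-555-0199 today.');
// $numbers is now array('902-435-5764', '800-555-0199')
```

Because each match is handled separately, text with several numbers yields several results instead of one merged digit string.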

dnagirl
Still no idea. If I add that to my code, how do I return what it gets to a variable?
Kirk
I'm scraping HTML files.
Kirk
@Kirk: the function takes its argument by reference, so the original variable is overwritten with the reformatted number when one is found, and a boolean is returned so you know whether it worked. Call it like this: `if(validphone($phonish)) //do something`. As for how you determine what should be in `$phonish`, it sounds like you are committing the cardinal sin of parsing HTML with regexes. Don't. There are better and easier options. Just search SO for HTML and regex; it's one of the most popular questions out there.
dnagirl
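
As one illustration of the "better options" mentioned above, PHP's bundled DOM extension can turn a scraped page into plain text before any phone-number check runs. This is only a sketch under assumptions: the helper name html_to_text() is hypothetical, the DOM extension is assumed to be enabled, and $html is assumed to be a non-empty string.

```php
<?php
// Hypothetical helper (an assumption for illustration): returns the text
// content of an HTML string using DOMDocument instead of regexes.
function html_to_text($html) {
    $previous = libxml_use_internal_errors(true); // scraped HTML is rarely valid; collect warnings silently
    $doc = new DOMDocument();
    $doc->loadHTML($html);                        // assumes $html is not empty
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the earlier error-handling setting
    // textContent concatenates all text nodes, including any script or style contents
    return $doc->documentElement->textContent;
}

$text = html_to_text('<p>Call <b>(902)</b> 435-5764</p>');
// $text can now be split into short pieces and checked for phone numbers
```

The extracted text still needs to be cut into short candidate strings before each one is passed to a validator.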
I don't know how to call this function properly. Could you help me?
Kirk