My new phone does not match an incoming call to a contact unless the stored number includes the area code. I live in Idaho, where an area code is not needed for in-state calls, so many of my contacts were saved without one. With thousands of contacts stored in my phone, updating them manually is not practical, so I wrote the following PHP script to handle the problem. It seems to work well, except that some contacts end up with a duplicated area code at the beginning of the number.

 <?php
//the script can take a while to complete
set_time_limit(200);

function validate_area_code($number) {
    //digits are taken one by one out of $number and appended to $numString
    $numString = "";
    for ($i = 0; $i < strlen($number); $i++) {
        $curr = substr($number,$i,1);
        //only copy from $number to $numString when the character is numeric
        if (is_numeric($curr)) {
            $numString = $numString . $curr;
        }
    }
    //add area code "208" to the beginning of any phone number of length 7
    if (strlen($numString) == 7) {
        return "208" . $numString;
    //remove country code (none of the contacts are outside the U.S.)
    } else if (strlen($numString) == 11) {
        return preg_replace("/^1/","",$numString);
    } else {
        return $numString;
    }
}
//matches any phone number in the csv
$pattern = "/((1? ?\(?[2-9]\d\d\)? *)? ?\d\d\d-?\d\d\d\d)/";
$csv = file_get_contents("contacts2.CSV");
preg_match_all($pattern,$csv,$matches);


foreach ($matches[0] as $key1 => $value) {
    /*create a pattern that matches the specific phone number by adding slashes before possible special characters*/
    $pattern = preg_replace("/\(|\)|\-/","\\\\$0",$value);

    //create the replacement phone number
    $replacement = validate_area_code($value);

    //add delimiters
    $pattern = "/" . $pattern . "/";

    $csv = preg_replace($pattern,$replacement,$csv);
}
echo $csv;

?>

Is there a better approach to modifying the CSV? Also, is there a way to minimize the number of passes over the CSV? In the script above, preg_replace is called thousands of times on a very large string.

A: 

Ah programs... sometimes a 10-min hack is better.
If it were me... I'd import the CSV into Excel, sort it by something - maybe the length of the phone number or something. Make a new col for the fixed phone number. When you have a group of similarly-fouled numbers, make a formula to fix. Same for the next group. Should be pretty quick, no? Then export to .csv again, omitting the bad col.

Chris Thornton
Thanks for the advice Chris. I may do that; however I've made it a point to use programming as often as I can to solve real-world problems. While I'm partially interested in solving the problem, I'm more interested in doing so with code, assuming that it might help me learn.
Hurpe
I completely understand - it would be a good exercise. Being PHP, if you come up with a generic tool, you could put up a "fix my phonebook" web app. That would be pretty nice.
Chris Thornton
+2  A: 

If I understand you correctly, you just need to prepend the area code to any 7-digit phone number anywhere in this file, right? I have no idea what kind of system you're on, but if you have some decent tools, here are a couple options. And of course, the approaches they take can presumably be implemented in PHP; that's just not one of my languages.

So, how about a sed one-liner? Just look for 7-digit phone numbers, bounded by either beginning of line or comma on the left, and comma or end of line on the right.

sed -r 's/(^|,)([0-9]{3}-[0-9]{4})(,|$)/\1208-\2\3/g' contacts.csv
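
For readers who want to stay in PHP, the same idea can probably be written as a single preg_replace call over the whole file. This is only a sketch under the same assumption as the sed command (numbers look like NNN-NNNN and fill an entire comma-separated field); the function name prepend_area_code is made up for illustration.

```php
<?php
// Rough PHP counterpart of the sed one-liner: prefix the area code to any
// field that consists solely of a 7-digit number in NNN-NNNN form.
function prepend_area_code(string $csv, string $areaCode = "208"): string {
    // (^|,)       the field starts at the beginning of a line or after a comma
    // (?=,|\r?$)  the field ends before a comma or the end of a line (not consumed)
    // m flag      lets ^ and $ match at every line, not only at the string ends
    $pattern = '/(^|,)(\d{3}-\d{4})(?=,|\r?$)/m';
    // ${1} keeps the group reference from running into the digits that follow it
    return preg_replace($pattern, '${1}' . $areaCode . '-$2', $csv);
}

echo prepend_area_code("Ann,555-1234,555-9876\nBob,208-555-0000\n");
```

One difference from the sed version: the trailing comma is checked with a lookahead instead of being consumed, so two 7-digit fields that sit next to each other are both rewritten.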

Or if you want to only apply it to certain fields, perl (or awk) would be easier. Suppose it's the second field:

perl -F, -ane '$"=","; $F[1]=~s/^[0-9]{3}-[0-9]{4}$/208-$&/; print "@F";' contacts.csv

The -F, sets the field separator for -a (autosplit), the $" is the list separator, which is what gets put between the elements when "@F" is interpolated (yes, it gets assigned once per line, oh well), the arrays are zero-indexed so the second field is $F[1], there's a run-of-the-mill substitution, and you print the results.
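
The field-based approach could presumably be carried over to PHP with fgetcsv and fputcsv. The following is a rough sketch, assuming the phone number sits in the second column and uses the NNN-NNNN form; add_area_code_to_field and fix_csv_stream are hypothetical names, not existing functions.

```php
<?php
// Hypothetical helper: prepend an area code to one field of a parsed CSV row
// when that field is exactly a 7-digit number such as "555-1234".
function add_area_code_to_field(array $row, int $index, string $areaCode = "208"): array {
    if (isset($row[$index])) {
        // $0 is the whole match, so "555-1234" becomes "208-555-1234"
        $row[$index] = preg_replace('/^\d{3}-\d{4}$/', $areaCode . '-$0', $row[$index]);
    }
    return $row;
}

// Read rows from one open stream and write the fixed rows to another.
function fix_csv_stream($in, $out, int $index): void {
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv reports a blank line as a single null field
        }
        fputcsv($out, add_area_code_to_field($row, $index), ',', '"', '\\');
    }
}

// Example call (file names are placeholders):
// fix_csv_stream(fopen("contacts.csv", "r"), fopen("fixed.csv", "w"), 1);
```

Because the file is parsed row by row, quoted fields that contain commas are handled by the CSV functions rather than by the regex. Note that fputcsv may add quotes around fields that contain spaces, so the output is equivalent CSV but not necessarily byte-for-byte identical to the input.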

Jefromi
A: 

A little more digging on my own revealed the issues with the regex in my question. The problem is with duplicate contacts in the csv.

Example: (208) 555-5555, 555-5555

After the first pass this becomes:

2085555555, 2085555555

and after the second pass it becomes:

2082085555555, 2082085555555
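
The mechanism can be reproduced in isolation. The sketch below replays the loop from the question on a two-number input: one preg_replace per match, each applied to the whole string. It assumes the duplicated number is stored without a hyphen, so that the second pattern still matches the text the first pass already rewrote; replace_per_match is a made-up name.

```php
<?php
// Replays the question's strategy: for every matched number, run a
// replacement over the entire CSV text.
function replace_per_match(string $csv, array $numbers): string {
    foreach ($numbers as $number) {
        $pattern = "/" . preg_quote($number, "/") . "/";
        $csv = preg_replace($pattern, "208" . $number, $csv);
    }
    return $csv;
}

// The same 7-digit number appears twice, so it is processed twice, and the
// second pass finds it again inside the already-prefixed numbers.
echo replace_per_match("5555555, 5555555", ["5555555", "5555555"]);
// prints: 2082085555555, 2082085555555
```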

I worked around this by changing the replacement regex to:

$pattern = preg_replace("/\(|\)|\-|\./","\\\\$0",$value);//add escapes for special characters
$pattern = "/(\(?[0-9]{3}\)?)? ?" . $pattern . "/";//add delimiters, and optional area code
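
An alternative to widening the pattern is to do the whole rewrite in one call to preg_replace_callback, so that each number is replaced exactly where it was matched and already-rewritten text is never searched again. That also reduces the thousands of preg_replace calls to a single pass over the CSV. A minimal sketch, assuming the number formats from the question and PHP 7.4 or later; normalize_number and normalize_csv are hypothetical names, and the pattern is an illustration rather than a complete phone-number grammar.

```php
<?php
// Hypothetical helper: reduce a matched number to digits and normalize its length.
function normalize_number(string $raw, string $areaCode = "208"): string {
    $digits = preg_replace('/\D/', '', $raw);   // keep digits only
    if (strlen($digits) === 7) {
        return $areaCode . $digits;             // add the missing area code
    }
    if (strlen($digits) === 11 && $digits[0] === '1') {
        return substr($digits, 1);              // drop the U.S. country code
    }
    return $digits;
}

// Replace every phone number in a single pass over the CSV text.
function normalize_csv(string $csv): string {
    // Similar in shape to the question's pattern, with digit lookarounds so a
    // match cannot start or end in the middle of a longer run of digits.
    $pattern = '/(?<!\d)(?:1[ -]?)?(?:\(?[2-9]\d\d\)?[ -]?)?\d{3}-?\d{4}(?!\d)/';
    return preg_replace_callback(
        $pattern,
        fn(array $m): string => normalize_number($m[0]),
        $csv
    );
}

echo normalize_csv("(208) 555-5555, 555-5555");
```

Running the function on its own output leaves 10-digit numbers unchanged, so repeated or duplicate contacts do not accumulate extra area codes.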
Hurpe