tags:

views:

201

answers:

6

I have a character array that contains a phone number of the form: "(xxx)xxx-xxxx xxxx" and need to convert it to something of the form: "xxx-xxx-xxxx" where I would just truncate the extension. My initial pass at the function looks like this:

static void formatPhoneNum( char *phoneNum ) {
    unsigned int i;
    int numNumbers = 0;
    /* Change the closing parenthesis to a dash and truncate at 12 chars. */
    for ( i = 0; i < strlen( phoneNum ); i++ ) {
     if ( phoneNum[i] == ')' ) {
      phoneNum[i] = '-';
     }
     else if ( i == 13 ) {
      phoneNum[i] = '\0';
      break;
     }
     else if ( isdigit( phoneNum[i] ) ) {
      numNumbers++;
     }
    }

    /* If the phone number is empty or not a full phone number, 
     * i.e. just parentheses and dashes, or not 10 numbers
     * format it as an emtpy string. */
    if ( numNumbers != 10 ) {
     strcpy( phoneNum, "" );
    }
    else {
     /* Remove the first parenthesis. */
     strcpy( phoneNum, phoneNum + 1 );
    }
}

It feels kinda hokey the way I'm removing the leading paren, but I can't just increment the pointer in the function as the calling version's pointer won't get updated. I also feel like I could be "more clever" in general all throughout the function.

Any ideas/pointers?

A: 

For starters, this is wrong:

strcpy( phoneNum, phoneNum + 1 );

because ISO C standard says regarding strcpy:

If copying takes place between objects that overlap, the behavior is undefined.

"objects" here being source and destination char arrays. MSDN concurs with this, by the way, so this won't work properly on at least one popular real-world implementation.

It seems that a simpler approach would be to have the function return a new "proper" value of the pointer (into the same buffer), so it can adjust it by 1 to trim the '('.

Your validation, which simply counts digits, permits formatting such as "1-234567890" or "1234567890-" or even "12345foobar4567890" - this may or may not be a problem, depending on requirements.

Pavel Minaev
That's good to know about strcpy... I definitely didn't know that. I'm not quite sure I grasp what you mean by returning a new "proper" value. Elaboration?Also, I'm not really "validating" as the input is guaranteed to have the formatting given, so I really don't need to worry about that.
Morinar
You mean something like: phoneNum = formatPhoneNum( phoneNum ); ?
Morinar
Yes, precisely like that. Or pass a `char**` in so that you can modify the pointer.
Pavel Minaev
+2  A: 

As Pavel said, you can't strcpy a string onto itself. I'm declaring a new variable for clarity, although my approach doesn't use strcpy - with care, you could re-use the original variable. Anyway, if your input is always of the form (xxx) xxx-xxxx xxxx, and your output is always going to be xxx-xxx-xxxx why not just do:

char newPhone[14];
newPhone[0] = phoneNum[1];
newPhone[1] = phoneNum[2];
newPhone[2] = phoneNum[3];
newPhone[3] = '-';
newPhone[4] = phoneNum[6];
newPhone[5] = phoneNum[7];
newPhone[6] = phoneNum[8];
newPhone[7] = '-';
newPhone[8] = phoneNum[10];
newPhone[9] = phoneNum[11];
newPhone[10] = phoneNum[12];
newPhone[11] = phoneNum[13];
newPhone[12] = '\0';

Brute force? Sure, but - if your inputs and outputs are always going to be as you state - it should run efficiently.

PTBNL
Ha! I don't think I ever would have thought of that. In my attempt to be "clever" I missed a rather obvious and even good solution.
Morinar
Just hope you aren't running this on user input. =] In that case, I would just take all the digits out of the string then put them into a format.
strager
+6  A: 

Since you stated that your input is guaranteed to be in the proper format, how about the following:

static void formatPhoneNum( char *phoneNum )
{
    memmove(phoneNum, phoneNum + 1, 12);
    phoneNum[3]  = '-';
    phoneNum[12] = 0;
}

memmove() is guaranteed to work with overlapping buffers

mocj
I like this quite a bit. Similar to the brute force approach by PTBNL, but fewer lines and easy to follow. I'll probably go with this one. I can't believe how badly I over thought this problem :-p
Morinar
I'm declaring a winner here. Thanks for the help all, and certainly wouldn't mind hearing any other "clever" solutions anybody else has.
Morinar
Yep, this is another good way! I just haven't used C much for years, so memmove didn't pop into my head.
PTBNL
+1  A: 

Well I guess I'm just too slow. Nothing clever about this over memmove(), but it shows how you can have a loop and still take all those comparisons out of the inside:

char *formatPhoneNum(char *buffer) {
        int index = 0;
        for( index = 0; index < 12; ++index ) {
                buffer[index] = buffer[index + 1];
        }
        buffer[3] = '-';
        buffer[12] = '\0';

        return buffer;
}

You may find it helpful if you return the start of the string you modify instead of just void so you can chain commands easier. E.g.,

printf("%s\n", formatPhoneNum(buffer));
indiv
A: 

When possible (and not performance degrading) I prefer to pass data to functions as consts. In your case I see no reason not to do it, so I'd declare your function as

static void formatPhoneNum(char *dst, const char *src);

or even, returning the length of the new number:

static int formatPhoneNum(char *dst, const char *src);

Then, just copy digits from src to dst inserting dashes at the right places. The caller is responsible to provide space in dst and check the return value: if 12 (dashes included), all ok; otherwise there was an error.

You can return some negative number to indicate possible errors. For example: -1 would indicate that src is not long enough; -2 would indicate a bad format for src, etc...

Do document all the return values!

Oh! And do not forget to NUL terminate dst!

pmg
If the output is always strictly 12 chars, then a neat trick is to ask for a pointer to array: `void formatPhoneNum(char (*dst)[12], const char* src)`. This will force the caller to use an array of the correct size for target.
Pavel Minaev
@Pavel: you forgot the NUL terminator! I like that trick, thanks
pmg
A: 

If you are allowed to change the API, you could either accept a char** or return a char *, and improve the time complexity:

static void formatPhoneNum(char **phoneNum) {
  (*phoneNum)[4] = '-';
  (*phoneNum)[13] = '\0';
  (*phoneNum)++;
}

Alternately

static char *formatPhoneNum(char *phoneNum) {
  phoneNum[4] = '-';
  phoneNum[13] = '\0';
  return phoneNum + 1;
}

The advantage is that this will take constant time.

Ashwin