Hello,

gcc 4.4.3 c89

I have the following string

sip:[email protected]

How can I extract just the number? I just want the number.

12387654345443222118765

Many thanks for any advice,

+4  A: 

It sounds like you want it as a numeric type, which is going to be difficult (it's too large to fit in an int or a long). In theory you could just do:

const char* original = "sip:[email protected]";
long num = strtoul(original + 4, NULL, 10);

but the value overflows: strtoul will return ULONG_MAX and set errno to ERANGE, which typically shows up as -1 once it is stored in a long. If you want it as a string and you know it's always going to be that exact length, you can just pull out the substring with strncpy:

const char* original = "sip:[email protected]";
char num[24];
strncpy(num, original + 4, 23);
num[23] = 0;

If you don't know it's going to be 23 characters long every time, you'll need to find the @ sign in the original string first:

unsigned int num_length = strchr(original, '@') - (original + 4);
char* num = malloc(num_length + 1);
strncpy(num, original + 4, num_length);
num[num_length] = 0;
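
The variable-length snippet above assumes the '@' is always there; strchr returns NULL when it isn't. A minimal sketch with those checks added, assuming the string always starts with "sip:" (the helper name sip_user is made up for illustration, it is not part of any library):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the text between a fixed "sip:" prefix and
 * the first '@' that follows it. Returns a malloc'd string that the caller
 * must free(), or NULL if the prefix or the '@' is missing or malloc fails. */
char *sip_user(const char *uri)
{
    static const char prefix[] = "sip:";
    const size_t prefix_len = sizeof prefix - 1;
    const char *start, *at;
    size_t len;
    char *out;

    if (uri == NULL || strncmp(uri, prefix, prefix_len) != 0)
        return NULL;                 /* does not start with "sip:" */

    start = uri + prefix_len;
    at = strchr(start, '@');
    if (at == NULL)
        return NULL;                 /* no '@' after the prefix */

    len = (size_t)(at - start);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, start, len);
    out[len] = '\0';
    return out;
}
```

Calling sip_user() on the string from the question should hand back just the digits; remember to free() the result.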
Michael Mrozek
The number will always be a variable width. What is the best way to search up to the @? Thanks.
robUK
I'd just use strchr() to find ':', then again to find '@', which tells you exactly what bytes should be copied (and how long the returned string is going to be). This avoids any breakage if the starting offset is not exactly four bytes.
Tim Post
@Tim Yeah, that's why I dislike string parsing questions; it's never clear what the format is from a single example. It sounds like the string always starts with "sip:", but I'm not positive
Michael Mrozek
It always starts with sip:. thanks very much for your solution.
robUK
@Michael - I can't ever recall being fond of parsing strings with C in general :)
Tim Post
I guess you mean `strncpy`, my normal strcpy only takes two arguments! In both cases you need to explicitly null-terminate after `strncpy`
kaizer.se
@kai Ah, yes, fixed both. I managed to remember the nul-terminator in the latter snippet and not the former
Michael Mrozek
+1  A: 

Have a look into the strtok or strtok_r functions.
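
A rough sketch of the strtok route, in case it helps. The helper name is hypothetical. Two caveats: strtok modifies the buffer it is given, so it needs a writable copy rather than a string literal, and it skips empty fields and does not tell you whether the '@' was really present, so this only suits input that is known to be well-formed:

```c
#include <string.h>

/* Hypothetical helper: tokenizes buf in place and returns a pointer (into
 * buf) to the token that follows the first ':' and ends at the next '@'.
 * Returns NULL if buf is NULL or there is nothing after the first token.
 * buf must be writable: strtok overwrites the delimiters with '\0'. */
char *between_colon_and_at(char *buf)
{
    if (buf == NULL)
        return NULL;
    if (strtok(buf, ":") == NULL)    /* first token, e.g. "sip" */
        return NULL;
    return strtok(NULL, "@");        /* next token, up to '@' or end of string */
}
```

With a POSIX system and threads in play, the same two calls can be swapped for strtok_r with a local save pointer.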

Jens Gustedt
strtok is not thread safe.
robUK
You didn't ask for that in particular in your question, and that is why I also mentioned strtok_r, which is POSIX, too, and should do the trick if you have that requirement.
Jens Gustedt
I think strtok_r() is a little overkill for this. He doesn't need to tokenize the string, he just needs to extract a substring.
Tim Post
Hm, matter of taste I guess. From the example he gave it was not clear if the beginning of this string is always the same length. For example, there might be different such prefixes that can occur. I came up with strtok_r for a solution that finds a string between two boundary characters, namely ':' and '@'.
Jens Gustedt
+1  A: 

Use a regular expression :)

#include <regex.h>
regcomp() // compile your regex
regexec() // run your regex
regfree() // free your regex

:)
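
To make those three calls concrete, here is a small sketch using the POSIX <regex.h> API. The wrapper name regex_first_group and the pattern in the note below are my own assumptions, not something from the question:

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: returns a malloc'd copy of the first parenthesised
 * group of `pattern` as matched in `input`. Returns NULL if the pattern does
 * not compile, does not match, has no group, or malloc fails.
 * The caller must free() the result. */
char *regex_first_group(const char *pattern, const char *input)
{
    regex_t re;
    regmatch_t m[2];                 /* m[0] = whole match, m[1] = group 1 */
    char *out = NULL;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;

    if (regexec(&re, input, 2, m, 0) == 0 && m[1].rm_so != -1) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        out = malloc(len + 1);
        if (out != NULL) {
            memcpy(out, input + m[1].rm_so, len);
            out[len] = '\0';
        }
    }

    regfree(&re);
    return out;
}
```

For the string in the question, a pattern such as "^sip:([0-9]+)@" should pull out the digits. Note that <regex.h> is POSIX, not part of C89 itself.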

Nicolas Viennot
Technically right, but probably massively overboard for such a simple string manipulation
Michael Mrozek
@Michael, 1) You'll see how good it is when you are maintaining your software, especially 6 months after you wrote it, or if the requirements change. Code needs to be readable. 2) You can make a little function to abstract this, taking two strings (one regex, one input string) and returning a string with the first matched group, so that you can reuse it across your projects.
Nicolas Viennot
If your requirements are likely to change, C is unlikely to be the right implementation language in the first place.
Artelius
@Artelius I totally disagree !! It depends on what you are trying to accomplish.
Nicolas Viennot
@Pafy: Ok I take it back :)
Artelius
@Pafy Good point; regular expressions are well-known for their maintainability and how easy they are to read
Michael Mrozek
+5  A: 

There are lots of ways to do it. If the string is well-formatted, you could use strchr() to search for the ':', use strchr() again to search for the '@', and take everything in between.

Here is another method that looks for a continuous sequence of digits:

const char *start = sipStr + strcspn(sipStr, "0123456789"); /* skip to the first digit */
size_t len = strspn(start, "0123456789");                   /* length of the digit run */

char *copy = malloc(len + 1);

memcpy(copy, start, len);
copy[len] = '\0'; /* add null terminator */

...
/* don't forget to */
free(copy);
Artelius
+1  A: 

Here is something that deals with a variable-width substring and doesn't care about the starting position of the substring. For instance, if the string were iax2:[email protected], it would still work. It will, however, return NULL if either delimiter can't be found.

It uses strchr() to find the delimiters, which lets us know where to start copying and where to stop. It returns an allocated string; the calling function must free() the returned pointer.

I'm pretty sure this is what you want?

Note: Edited from original to be more re-usable and a bit saner.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char *extract_string(const char *str, const char s1, const char s2)
{
    char *ret = NULL, *pos1 = NULL, *pos2 = NULL;
    size_t len;

    if (str == NULL || s1 < 0 || s2 < 0)
        return NULL;

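    pos1 = strchr(str, s1);
    if (! pos1 || *pos1 == '\0')
        return NULL;

    /* look for the closing delimiter only after the opening one,
       so the length computed below can never go negative */
    pos2 = strchr(pos1 + 1, s2);
    if (! pos2)
        return NULL;

    len = (size_t) (pos2 - pos1 - 1);
    ret = (char *) malloc(len + 1);
    if (ret == NULL)
        return NULL;

    memcpy(ret, pos1 + 1, len);
    ret[len] = '\0';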

    return ret;
}

int main(void)
{
    const char *string = "sip:[email protected]";
    char *buff = NULL;

    buff = extract_string(string, ':', '@');
    if (buff == NULL)
        return 1;

    printf("The string extracted from %s is %s\n" , string, buff);

    free(buff);

    return 0;
}

You could easily modify that to not care if the second delimiter is not found and just copy everything to the right of the first. That's an exercise for the reader.
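
For anyone who would rather not do the exercise, one possible take on that modification is sketched below. The function name extract_string_lenient is hypothetical, and this is just one way to do it: when the second delimiter is missing, it copies everything to the right of the first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of extract_string(): copies the text after the first
 * s1 up to the next s2. If s2 does not occur after s1, copies to the end of
 * the string instead of failing. Returns NULL if str is NULL, s1 is '\0' or
 * not found, or malloc fails. The caller must free() the result. */
char *extract_string_lenient(const char *str, char s1, char s2)
{
    const char *pos1, *pos2;
    size_t len;
    char *ret;

    if (str == NULL || s1 == '\0')
        return NULL;

    pos1 = strchr(str, s1);
    if (pos1 == NULL)
        return NULL;
    pos1++;                          /* first byte after the opening delimiter */

    pos2 = strchr(pos1, s2);         /* search only to the right of s1 */
    len = (pos2 != NULL) ? (size_t)(pos2 - pos1) : strlen(pos1);

    ret = malloc(len + 1);
    if (ret == NULL)
        return NULL;

    memcpy(ret, pos1, len);
    ret[len] = '\0';
    return ret;
}
```

It is called the same way as extract_string() in main() above.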

Tim Post
Thanks very much.
robUK