I have the following string:

const char *str = "\"This is just some random text\" 130 28194 \"Some other string\" \"String 3\"";

I would like to get the integer 28194. The integer varies, of course, so I can't just do strstr(str, "28194").

So I was wondering what would be a good way to get that part of the string?

I was thinking of using #include <regex.h>. I already have a procedure for matching regexps, but I'm not sure what the regexp would look like in C using POSIX-style notation (something like [:alpha:]+[:digit:]), or whether performance will be an issue. Or would it be better to use strchr and strstr?

Any ideas would be appreciated.

A: 

If you want to use regex, you can use the following (it needs <regex.h>, <stdio.h>, and <stdlib.h>):

const char *str = "\"This is just some random text\" 130 28194 \"Some other string\" \"String 3\"";
regex_t re;
regmatch_t matches[2];
int comp_ret = regcomp(&re, "([[:digit:]]+) \"", REG_EXTENDED);
if(comp_ret)
{
    // Error occurred.  See regerror() in regex.h
}
if(!regexec(&re, str, 2, matches, 0))
{
    long long result = strtoll(str + matches[1].rm_so, NULL, 10);
    printf("%lld\n", result);
}
else
{
    // Didn't match
}
regfree(&re);

You're correct that there are other approaches.
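For comparison, here is a minimal sketch of the plain string-parsing route using strchr and strtol. It assumes every line has the same layout as the example (one quoted field with no escaped quotes inside it, followed by two integers), and the function name second_number_after_quote is just a placeholder for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores the second integer that follows the first
   quoted field in *out and returns 0, or returns -1 if the line does not
   have the assumed layout ("text" <int> <int> ...). */
static int second_number_after_quote(const char *line, long *out)
{
    const char *p = strchr(line, '"');   /* opening quote of the first field */
    if (p == NULL)
        return -1;
    p = strchr(p + 1, '"');              /* closing quote of the first field */
    if (p == NULL)
        return -1;
    p++;                                 /* step past the closing quote */

    char *end;
    strtol(p, &end, 10);                 /* first integer, parsed and skipped */
    if (end == p)
        return -1;                       /* no digits where expected */
    p = end;

    long value = strtol(p, &end, 10);    /* second integer, the one we want */
    if (end == p)
        return -1;

    *out = value;
    return 0;
}
```

Calling it as second_number_after_quote(str, &value) on the string from the question should leave 28194 in value. It is less flexible than the regex, since it breaks as soon as the layout changes.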

EDIT: Changed to use non-optional repetition and show more error checking.

Matthew Flaschen
Thanks, that seems to do the job pretty well. Could there be performance issues from using regexps instead of string parsing? I'm parsing ~1M lines.
David
I think regex is a valid approach here. The regex shouldn't back-track at all. Note that you only have to compile the regex once, and you can also reuse the `matches` array. As always, profile to be sure.
Matthew Flaschen
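To illustrate the compile-once point, here is a rough sketch that reuses one compiled regex and one matches array across many lines. The function name extract_all and the array-of-lines interface are assumptions made for the example; in practice the lines would more likely come from fgets or similar:

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: for each line, extracts the integer that is
   immediately followed by a space and a double quote. The pattern is
   compiled once and reused for every line. Returns the number of lines
   that matched, or -1 if the pattern fails to compile. */
static int extract_all(const char *const *lines, size_t count, long long *out)
{
    regex_t re;
    regmatch_t matches[2];               /* reused for every regexec call */
    int found = 0;

    if (regcomp(&re, "([[:digit:]]+) \"", REG_EXTENDED) != 0)
        return -1;

    for (size_t i = 0; i < count; i++) {
        if (regexec(&re, lines[i], 2, matches, 0) == 0)
            out[found++] = strtoll(lines[i] + matches[1].rm_so, NULL, 10);
    }

    regfree(&re);                        /* free once, after the last line */
    return found;
}
```

The out array must have room for at least count values. Whether this is fast enough for ~1M lines is something only profiling will tell.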