It's been 10 years since I looked at C. I need to write a little program in C that parses a string. I wanted to use regular expressions since I've been using them for years, but I have no idea how to do that in C. I've spent the morning Googling and I can't find any straightforward examples (i.e. use this library, this is the methodology). Can someone give me a simple example?

Thanks!

+23  A: 

You can use PCRE:

The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX regular expression API. The PCRE library is free, even for building commercial software.

See pcredemo.c for a PCRE example.

If you cannot use PCRE, POSIX regular expression support is probably available on your system (as @tinkertim pointed out). For Windows, you can use the GnuWin32 Regex for Windows package.

The regcomp documentation includes the following example:

#include <regex.h>

/*
 * Match string against the extended regular expression in
 * pattern, treating errors as no match.
 *
 * Return 1 for match, 0 for no match.
 */

int
match(const char *string, char *pattern)
{
    int    status;
    regex_t    re;

    if (regcomp(&re, pattern, REG_EXTENDED|REG_NOSUB) != 0) {
        return(0);      /* Report error. */
    }
    status = regexec(&re, string, (size_t) 0, NULL, 0);
    regfree(&re);
    if (status != 0) {
        return(0);      /* Report error. */
    }
    return(1);
}
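
Since the question is about parsing rather than just testing for a match, here is a minimal sketch of pulling out a captured group with the same POSIX API. It is not part of the regcomp documentation example: match_group is a hypothetical helper name, and it assumes you only care about the first parenthesized group.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper (not from the regcomp documentation): copy the
 * text matched by the first parenthesized group of pattern into out.
 *
 * Return 1 on success; 0 for no match, a bad pattern, or a buffer
 * that is too small.
 */
int
match_group(const char *string, const char *pattern, char *out, size_t outsz)
{
    regex_t    re;
    regmatch_t m[2];    /* m[0] = whole match, m[1] = first group */
    size_t     len;
    int        status;

    /* No REG_NOSUB here: we want the match offsets filled in. */
    if (regcomp(&re, pattern, REG_EXTENDED) != 0) {
        return(0);
    }
    status = regexec(&re, string, (size_t) 2, m, 0);
    regfree(&re);
    if (status != 0 || m[1].rm_so == -1) {
        return(0);      /* No match, or the group did not participate. */
    }
    len = (size_t) (m[1].rm_eo - m[1].rm_so);
    if (len + 1 > outsz) {
        return(0);      /* Not enough room for the text plus '\0'. */
    }
    memcpy(out, string + m[1].rm_so, len);
    out[len] = '\0';
    return(1);
}
```

Calling match_group("/foo/123", "/foo/([0-9]+)$", buf, sizeof buf) with a 16-byte buf should leave "123" in buf. The offsets in rm_so and rm_eo are byte positions into the original string, so nothing is copied until you do it yourself.
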
Sinan Ünür
+1, though POSIX does exist, pcre is available on almost all modern systems.
Tim Post
@tinkertim Thank you. Added a link to the docs.
Sinan Ünür
Thanks! Gives me a nice starting point.
jeffkolez
This is a nice, useful and cooperative answer. I hope this question receives more up votes, as regex in C can be tricky depending on the platform.
Tim Post
@tinkertim Thank you. Especially for reminding me about the POSIX regex specs.
Sinan Ünür
A: 

Another option besides a native C library is to use an interface to another language like Python or Perl. Not having to deal with C's string handling, plus the better language support for regexes, should make things much easier for you. You can also use a tool like SWIG to generate the wrappers that connect the C code to the scripting language.

Dana the Sane
+3  A: 

If forced into POSIX only (no PCRE), here's a tidbit of a fallback:

#include <regex.h>
#include <stdbool.h>

bool reg_matches(const char *str, const char *pattern)
{
    regex_t re;
    int ret;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return false;

    ret = regexec(&re, str, (size_t) 0, NULL, 0);
    regfree(&re);

    if (ret == 0)
        return true;

    return false;
}

You might call it like this:

int main(void)
{
   static const char *pattern = "/foo/[0-9]+$";

   /* Going to return 1 always, since pattern wants the last part of the
    * path to be an unsigned integer */
   if (! reg_matches("/foo/abc", pattern))
       return 1;

   return 0;
}

I highly recommend making use of PCRE if it's available. But it's nice to check for it and have some sort of fallback.

I pulled the snippets from a project currently in my editor. It's just a very basic example, but it gives you types and functions to look up should you need them. This answer more or less augments Sinan's answer.
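
Both examples so far treat a bad pattern the same as "no match", which can be confusing while you are still debugging the pattern itself. As a rough sketch (reg_matches_verbose is a hypothetical name, not something from the project mentioned above), you can ask the library for a readable message with regerror():

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical variant of reg_matches(): on a bad pattern, write the
 * library's error message into errbuf instead of failing silently.
 * errbuf is left as an empty string when the pattern compiles.
 */
bool reg_matches_verbose(const char *str, const char *pattern,
                         char *errbuf, size_t errbuf_size)
{
    regex_t re;
    int ret;

    if (errbuf_size > 0)
        errbuf[0] = '\0';

    ret = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    if (ret != 0) {
        /* regerror() turns the regcomp() error code into text. */
        regerror(ret, &re, errbuf, errbuf_size);
        return false;
    }

    ret = regexec(&re, str, (size_t) 0, NULL, 0);
    regfree(&re);

    return ret == 0;
}
```

The exact wording of the message depends on the C library, so print it for a human rather than comparing it against a fixed string.
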

Tim Post
Thanks - nice examples. I'm not sure of the environment yet, so it's handy to have a backup in case PCRE isn't available.
jeffkolez
@jeffkolez Good luck on your project. 10 years since you have touched C .. and here you are in one of its dark corners. Should you elect to shoot your computer, please consider purchasing a NERF gun first.
Tim Post
A: 

You should also take a look at the regex library. It uses regular expressions like the ones you can write in the Linux shell.

For more information, under Linux, type "man regcomp" (without the quotes).
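
One thing the man page covers that is easy to trip over: regcomp() defaults to basic syntax (as in plain grep), and you only get extended syntax (as in grep -E) when you pass REG_EXTENDED. A small sketch to illustrate, where matches_with_flags is a hypothetical helper name:

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if pattern matches string, else 0.
 * cflags goes straight to regcomp(): 0 selects basic syntax,
 * REG_EXTENDED selects extended syntax, REG_ICASE ignores case.
 */
int matches_with_flags(const char *string, const char *pattern, int cflags)
{
    regex_t re;
    int status;

    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return 0;

    status = regexec(&re, string, (size_t) 0, NULL, 0);
    regfree(&re);

    return status == 0;
}
```

With the pattern "a+b", passing REG_EXTENDED makes + mean "one or more", so "aab" matches. Passing 0 makes + an ordinary character, so only a literal "a+b" in the input matches.
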

Daniele Bosi