questions about pcre | ansaurus

pcre

Match uneven number of escape symbols

I need to match C++ preprocessor statements. Now, preprocessor statements may span multiple lines: #define foobar \ "something glorious" This final backslash may be escaped so the following results in two separate lines: #define foobar \\ No longer in preprocessor. The question is how I can match the explicit line continuation ...
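One possible approach, sketched here in Python's `re` syntax (which shares this notation with PCRE), is to treat a line as continued only when it ends in an odd number of backslashes. The helper name `is_continued` is made up for illustration, and the sketch assumes each line is tested on its own.

```python
import re

# Assumption: a line continues when it ends in an ODD number of backslashes.
#   (?<!\\)    the run of backslashes is not preceded by another backslash
#   (?:\\\\)*  zero or more escaped pairs (each pair is a literal backslash)
#   \\$        one unpaired backslash directly before the end of the line
CONTINUATION = re.compile(r'(?<!\\)(?:\\\\)*\\$')

def is_continued(line: str) -> bool:
    """Return True if the line ends with an unescaped backslash."""
    return CONTINUATION.search(line) is not None
```

The lookbehind stops the engine from starting the match in the middle of a run of backslashes, which would otherwise make an even-length run look odd.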

Wondering how to do PHP explode over every other word

Let's say I have a string - $string = "This is my test case for an example." If I do explode based on ' ' I get an Array('This','is','my','test','case','for','an','example.'); What I want is an explode for every other space: Array('This is','my test','case for','an example.'). The string may have an odd number of words, so the last item in ...
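Rather than splitting, one option is to match pairs of words directly. The sketch below uses Python's `re` for illustration; the function name `pair_words` is hypothetical, and "word" is assumed to mean any run of non-whitespace characters.

```python
import re

def pair_words(text: str) -> list:
    # One word, optionally followed by whitespace and a second word.
    # With an odd word count the optional part fails on the last word,
    # so the final item holds a single word.
    return re.findall(r'\S+(?:\s+\S+)?', text)
```

The same pattern should also work with a match-all function in PCRE-based APIs, since it relies only on `\S`, `\s`, and a non-capturing group.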

What do these regex Patterns Match?

I am new to regex in PHP and understand the basic patterns; however, the ones below are a bit complex, and I don't understand what they match: $ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#... "<a href='' rel='nofollow'></a>", $ret); $ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[...

Matching rest of string in perl-style regex

How do I match something that's "A known part (unknown word) (the rest of the string)" in a perl-style regex (PCRE)? ...

regex for parsing request parameters

Hello! I am looking for a single Perl-compatible regex that would parse strings of the form param1=value1&...&param2=value2&... and extract the values for param1 and param2 only. However: param2 may precede param1; there may be no param1 or param2; and param1 or param2 (or both) may have empty values, i.e. param1=&... ...
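One way to make a single pattern order-independent is to put each parameter in its own lookahead anchored at the start of the string, with the body of each lookahead optional so that a missing parameter does not fail the match. This is a sketch in Python's `re`, not a tested PCRE pattern; the function name `extract` is hypothetical.

```python
import re

# Each lookahead scans from the start of the string, so order does not matter.
# The inner (?: ... )? makes the parameter optional: when it is absent the
# capture group is simply left unset.
# (?:^|&) ensures we match a whole parameter name, not a suffix like "xparam1".
PARAMS = re.compile(
    r'^(?=(?:.*?(?:^|&)param1=([^&]*))?)'
    r'(?=(?:.*?(?:^|&)param2=([^&]*))?)'
)

def extract(query: str):
    """Return (param1, param2); a missing parameter is None, an empty one is ''."""
    m = PARAMS.match(query)
    return m.group(1), m.group(2)
```

For example, `extract('param2=b&param1=a')` yields `('a', 'b')`, while `extract('param1=&z=3')` yields `('', None)`.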

Matching arbitrary number of words in PCRE regex into strings

Hello, I am using PCRE for some regex parsing and I need to search a string for words in a specific pattern (let's say all words in a string of words separated by commas) and put them into a string vector. How would I go about doing that? ...

Tilde operator in Regular expressions

Hi guys, I want to know what's the meaning of tilde operator in regular expressions. I have this statement: if (!preg_match('~^\d{10}$~', $_POST['isbn'])) { $warnings[] = 'ISBN should be 10 digits'; } I found this document explaining what tilde means: ~ It said that =~ is a perl operator that means run this variable against this...

Matching order in PCRE

Hello, How can I set which order to match things in a PCRE regular expression? I have a dynamic regular expression that a user can supply that is used to extract two values from a string and stores them in two strings. However, there are cases where the two values can be in the string in reverse order, so the first (\w+) or whatever ne...

Invert match with regexp

With PCRE, how can you construct an expression that will only match if a string is not found? If I were using grep (which I'm not) I would want the -v option. A more concrete example: I want my regexp to match iff string foo is not in the string. So it would match bar but not foobar. ...
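A common technique for this is a negative lookahead anchored at the start of the string. The sketch below uses Python's `re`, whose lookahead syntax is the same as PCRE's; the name `lacks_foo` is invented for the example.

```python
import re

# ^(?!.*foo) asserts, from the very start, that "foo" appears nowhere ahead.
# DOTALL lets "." cross newlines so multi-line input is checked as a whole.
NOT_FOO = re.compile(r'^(?!.*foo).*$', re.DOTALL)

def lacks_foo(text: str) -> bool:
    """Return True only when the text does not contain 'foo'."""
    return NOT_FOO.match(text) is not None
```

When the calling code is under your control, negating an ordinary search for `foo` is usually simpler; the lookahead form is mainly useful when only a pattern can be supplied.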

Is there a utility that will convert POSIX to PCRE for PHP?

Is there a utility that will convert POSIX to PCRE for PHP? I'm somewhat confused by the PHP manual on PCRE, and while I'll try to find more info on PCRE, I was wondering if anyone had designed such a utility. Or, if anyone would explain how to convert the following, that would also be fine: ereg("^#[01-9A-F]{6}$", $sColor) But pleas...
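For a pattern like the one quoted, the regex itself needs little change: the class `[01-9A-F]` is equivalent to `[0-9A-F]`, and the main difference on the PHP side is that `preg_match` expects delimiters around the pattern, for example `'/^#[0-9A-F]{6}$/'`. The sketch below checks the same pattern using Python's `re`, purely as an illustration; `is_hex_color` is a hypothetical name.

```python
import re

# Same pattern as the ereg() call, with the character class simplified.
# In PHP's preg_* functions the pattern would additionally be wrapped
# in delimiters such as /.../.
HEX_COLOR = re.compile(r'^#[0-9A-F]{6}$')

def is_hex_color(value: str) -> bool:
    """Return True for '#' followed by exactly six uppercase hex digits."""
    return HEX_COLOR.match(value) is not None
```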

Using preg_split to go from 'hi how are you' to [hi, how are you]

I want to split a string into two parts, the string is almost free text, for example: $string = 'hi how are you'; and I want the split to look like this: array( [0] => hi [1] => how are you ) I tried using this regex: /(\S*)\s*(\.*)/ but even when the array returned is the correct size, the values come back empty. What should ...
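Note that `\.*` in the attempted pattern matches literal dots, not arbitrary characters. A simpler route is to split on whitespace with a limit of two pieces; in PHP, the limit argument of `preg_split` plays that role. The sketch below shows the idea with Python's `re.split`, and the name `split_first_word` is made up for the example.

```python
import re

def split_first_word(text: str) -> list:
    # Split on the first run of whitespace only; everything after it
    # stays together as the second element.
    return re.split(r'\s+', text.strip(), maxsplit=1)
```

A string with a single word comes back as a one-element list, so callers may want to check the length before indexing the second part.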

Stack overflow in IIRF (a C program, ISAPI)

I am using IIRF - an ISAPI rewrite filter for pretty URLs. I haven't been able to get much help from the developer on these issues. I'm hoping to make some sense of this dump so I can find the problematic area in the code and rebuild it myself. I am not super familiar with C, but can get around. Do I need to build this with debug sym...

What flavour of regular expression is grep?

I'm guessing it's not a Perl compatible regular expression, since there's a special kind of grep which is specifically PCRE. What's grep most similar to? Are there any special quirks of grep that I need to know about? (I'm used to Perl and the preg functions in PHP) ...

Is there a token to capture line breaks in multiline regex?

I've run into this problem several times before when trying to do some HTML scraping with PHP and the preg* functions. Most of the time I have to capture structures like this: <tag1>lorem ipsum</tag> <p>just more text with several html tags in it, sometimes CDATA encapsulated…</p> In particular I wa...

Non greedy regex matching in sed?

I'm trying to use sed to clean up lines of URLs to extract just the domain, e.g., from: http://www.suepearson.co.uk/product/174/71/3816/ I want: http://www.suepearson.co.uk/ (either with or without the trailing slash, it doesn't matter) I have tried: sed 's|\(http:\/\/.*?\/\).*|\1|' and (escaping the non greedy quantifier) ...
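POSIX basic and extended regular expressions have no lazy quantifiers, so `.*?` is not available in standard sed. The usual workaround is a negated character class: match everything up to the next slash with `[^/]*`. The sketch below demonstrates that idea in Python's `re` only so it can be run and checked; the class-based pattern itself does not depend on lazy matching. The name `site_root` is hypothetical.

```python
import re

def site_root(url: str) -> str:
    # [^/]* cannot run past the first "/" after the host, so no lazy
    # quantifier is needed; the trailing .* swallows the rest of the path.
    return re.sub(r'(http://[^/]*/).*', r'\1', url)
```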

Can someone explain Possessive Quantifiers to me? (Regular Expressions)

I am reading the PCRE doc, and it refers to possessive quantifiers, but does not explicitly or specifically define them. I know what a greedy quantifier is, and I know what a lazy quantifier is. But possessive? The PCRE man page seems to be cheating when it uses the term without defining it. The man page specifically states that the ...

Help with Regex - Wordpress (search-regex)

My first attempt using RE has me stuck. I'm using Regex on a Wordpress website via the Search-Regex Plugin and need to match on a specific " buried within a bunch of html code. HTML example: provide brand-strengthening efforts for the 10-school conference.  </p> <p> <a href="http://www.learfield.com/oldblog/.a/6a00d8345233fa6...

Matching parenthesis content in PCRE without outermost parens

I need to extract the content of an unbalanced paren construction. In the manual for PCRE I found a solution for matching balanced parens. <\[ ( (?>[^(<\[|\]>)]+) | (?R) )* \]> For my test <[<[ab<[cd]>]><[ef]> It extracts 0.0: <[ab<[cd]>]> 0.1: <[ef]> But I want to extract the same content without the outermost parens: 0.0: ab<[cd]> 0.1: ef Could...

PCRE to replace #334455 hex with #345

I'm writing a function that replaces a long hex-coded color (#334455) with a short one (#345). This can only be done when each color component in hex is a multiple of 17 (each hex pair consists of the same characters). e.g. #EEFFCC is replaced with #EFC, but #EDFFCC isn't replaced with anything. I want to make this with single preg_replace() call with...
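Backreferences can express the "each pair repeats the same digit" condition inside a single substitution. The sketch below uses Python's `re.sub`; the `\1`-style backreferences work the same way in PCRE patterns. The names `LONG_HEX` and `shorten` are invented for the example, and the trailing lookahead is an added assumption to avoid touching longer hex strings.

```python
import re

HEX = r'[0-9A-Fa-f]'

# Each captured digit must be followed immediately by the same digit (\1, \2, \3).
# The final negative lookahead keeps 8-digit values such as #33445566 untouched.
LONG_HEX = re.compile(rf'#({HEX})\1({HEX})\2({HEX})\3(?!{HEX})')

def shorten(css: str) -> str:
    """Replace #RRGGBB with #RGB wherever every pair is a doubled digit."""
    return LONG_HEX.sub(r'#\1\2\3', css)
```

Because a backreference matches the captured text exactly, a mixed-case pair such as `aA` is left alone; a case-insensitive flag would change that.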

Deploying C app that uses the PCRE library

Hello, I wrote a C app that uses the PCRE library. Everything works on my own computer. However, when I copy the binary over to another computer and run it, it gives the following error: /libexec/ld-elf.so.1: Shared object "libpcre.so.0" not found, required by "myapp" I know I can probably get it to work by installing the PCRE lib on ...
