I'm trying to use ColdFusion's reFind() function with a regex to extract the content inside the angle brackets in this example:

 joe smith <[email protected]>

The resulting text should be

 [email protected]

Using this

<cfset reg = refind(
 "/(?<=\<).*?(?=\>)/s","Joe <[email protected]>") />

I'm not having any luck. Any suggestions?

It may be a syntax issue, since the pattern works in an online regex tester I use.

A: 
/\<([^>]+)\>$/

Something like that. I didn't test it though, so that part is yours ;)

Peter Kruithof
A: 

I've never been happy with the regular expression matching functions in CF. Hence, I wrote my own:

<cfscript>
    function reFindNoSuck(string pattern, string data, numeric startPos = 1){
        var sucky = refindNoCase(pattern, data, startPos, true);
        var i = 0;
        var awesome = [];

        if (not isArray(sucky.len) or arrayLen(sucky.len) eq 0){return [];} //handle no match at all
        for(i=1; i<= arrayLen(sucky.len); i++){
            //a pos of 0 and a len of 0 means no match (or a group that took no part in the match)
            if (sucky.len[i] gt 0 && sucky.pos[i] gt 0){
                var matchBody = mid( data, sucky.pos[i], sucky.len[i]);
                //skip a group whose text is identical to the entire input string
                if (matchBody neq arguments.data){
                    arrayAppend( awesome, matchBody );
                }
            }
        }
        return awesome;
    }
</cfscript>

Applied to your problem, here is my example:

<cfset origString = "joe smith <[email protected]>" />
<cfset regex = "<([^>]+)>" />
<cfset matches = reFindNoSuck(regex, origString) />

Dumping the "matches" variable shows an array with 2 items. The first is <[email protected]> (the text matched by the entire regex) and the second is [email protected] (the text matched by the first capture group in the pattern). Any further groups would also be captured and appended to the array.
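For comparison, here is a rough sketch of the same "whole match, then each group" result using java.util.regex, which is available because ColdFusion runs on the JVM. The class name GroupList, the method groups, and the address joe@example.com are made up for illustration and are not part of the wrapper above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
class GroupList {
    /** Returns the whole first match followed by each capture group that matched. */
    static List<String> groups(String regex, String data) {
        List<String> result = new ArrayList<>();
        // CASE_INSENSITIVE mirrors the refindNoCase call in the CFML wrapper.
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(data);
        if (m.find()) {
            // Group 0 is the entire match; groups 1..n are the parenthesised groups.
            for (int i = 0; i <= m.groupCount(); i++) {
                if (m.group(i) != null) {
                    result.add(m.group(i));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // joe@example.com is a placeholder address, not taken from the question.
        System.out.println(groups("<([^>]+)>", "joe smith <joe@example.com>"));
    }
}
```

The capture-group pattern needs no lookbehind, which is why it also works with CF's built-in regex functions.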

Adam Tuttle
Thanks Adam, I was able to use the wrapper that Peter developed, but thanks for your two cents as well.
jeff
+2  A: 

You can't use lookbehind with CF's regex engine (it uses Apache Jakarta ORO).

However, you can use Java's regex engine, which does support lookbehind, and I've created a wrapper CFC that makes this even easier. It is available from: http://www.hybridchill.com/projects/jre-utils.html

Also, the /.../s delimiters and flag aren't needed here; CF takes the bare pattern as a string.

So, from your example, but with improved regex:

<cfset jrex = createObject('component','jre-utils').init()/>

<cfset reg = jrex.match( "(?<=<)[^<>]+(?=>)" , "Joe <[email protected]>" ) />


A quick note, since I've updated that regex a few times; hopefully it's at its best now...

(?<=<) # positive lookbehind - only match if immediately preceded by `<`, without including it in the match.
[^<>]+ # one or more characters other than `<` or `>` (the `+` is greedy).
(?=>)  # positive lookahead - only succeed if a `>` follows, without including it in the match.
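
As a rough illustration of what happens underneath, here is the same pattern run directly against java.util.regex. This is only a sketch: the class name BracketContent, the method extract, and the address joe@example.com are made up for the example, and it does not show how the wrapper CFC is implemented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
class BracketContent {
    // The lookbehind and lookahead require the angle brackets but keep them out of the match.
    private static final Pattern INSIDE_BRACKETS = Pattern.compile("(?<=<)[^<>]+(?=>)");

    /** Returns the text between the first pair of angle brackets, or null if there is none. */
    static String extract(String input) {
        Matcher m = INSIDE_BRACKETS.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // joe@example.com is a placeholder address, not taken from the question.
        System.out.println(extract("Joe <joe@example.com>"));
    }
}
```

Because the brackets are only asserted by the lookarounds, m.group() returns just the address and no capture group is needed.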
Peter Boughton
You are a genius Peter. This works well.. Thanks for the help
jeff