I'm having trouble writing a regular expression that picks out all the calls to the "tr" function in the block of ASP code below. Specifically, I need to get the string passed to each "tr" call.

    if(RS.Fields("Audid").Value <> 0 ) Then
        Response.Write ("<td>" & tr("RA Assigned") & "</td>")
    else
        Response.Write ("<td>" & tr("Not Yet Assigned") & "</td>")
    End if

    if(RS.Fields("rStatus").Value = "Activated") then
        Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Edit") &"</A></td></TR>")
    Else
        If (gParLevelz_Admin = gParLevelz and RS.Fields("CustomerParid").Value <> 0) Then 
            Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Awaiting Authorization") & "</A></td></TR>")
        else                                       
            Response.Write("<td>" & tr("Awaiting Authorization") & "</td></TR>")
        End if
    End if

I believe I have a good first attempt at getting this done. The following expression extracts values for most of the cases I will run into...

tr\(\"([^%]|%[0-9]+)+\"\)

What's causing me the most confusion and stress is how to capture every kind of string that can show up in the "tr" function. Literally anything could be between the quotation marks of the "tr" call, and unfortunately my expression returns values past the closing quotation mark. So, given the snippet I posted above, one of the matches is...

tr("RA Assigned %2") & "</td>")
            else
                Response.Write ("<td>" & tr("Not Yet Assigned %4") & "</td>")
            End if

            if(RS.Fields("rStatus").Value = "Activated") then
                Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Edit") &"</A></td></TR>")
            Else
                If (gParLevelz_Admin = gParLevelz and RS.Fields("CustomerParid").Value <> 0) Then 
                    Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Awaiting Authorization") & "</A></td></TR>")
                else                                       
                    Response.Write("<td>" & tr("Awaiting Authorization") & "</td></TR>")

Which is way more than I want. I just want tr("RA Assigned %2") to be returned.

+3  A: 

It looks like your regex pattern is greedy. Try making it non-greedy by adding a ? after the second +: tr\(\"([^%]|%[0-9]+)+?\"\)

A simplified version to capture anything inside the tr(...) would be: tr\(\"(.+?)\"\)
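The difference can be sketched in Python (an assumption, since the question does not name a regex engine). The two input lines are adapted from the over-long match quoted in the question. The greedy pattern runs on to the last `")` in the text, while the lazy versions stop at the first one.

```python
import re

# Two tr(...) calls on separate lines, shaped like the question's ASP snippet.
source = (
    'Response.Write ("<td>" & tr("RA Assigned %2") & "</td>")\n'
    'Response.Write ("<td>" & tr("Not Yet Assigned %4") & "</td>")\n'
)

# Original pattern: the outer + is greedy and [^%] also matches quotes and newlines.
greedy = re.compile(r'tr\("([^%]|%[0-9]+)+"\)')
# Same pattern with +? so the group repeats as few times as possible.
lazy = re.compile(r'tr\("([^%]|%[0-9]+)+?"\)')
# Simplified pattern: any characters, lazily, up to the first '")'.
simple = re.compile(r'tr\("(.+?)"\)')

greedy_matches = [m.group(0) for m in greedy.finditer(source)]
lazy_matches = [m.group(0) for m in lazy.finditer(source)]
simple_strings = simple.findall(source)  # findall returns group 1 only

print(len(greedy_matches))  # one match spanning both calls
print(lazy_matches)
print(simple_strings)
```

Note that in the lazy version of the original pattern the capture group holds only the last repetition, so the simplified pattern is the easier one to use when the goal is the string itself.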

Ahmad Mageed
Wow, lots of great replies here. The "?" to make the search non-greedy was the key to my issue, as a few people have noted. Thanks for the response, Ahmad.
Rob Segal
@Rob my pleasure :)
Ahmad Mageed
+1  A: 

Use a question mark after the plus sign quantifier to make it non-greedy (it will match only as much as it needs to).

Also, maybe anchor against ") & " if that always follows a call to tr().

Anonymous
+1  A: 

You'll need a non-greedy pattern; just add a ?, like:

tr\(\"([^%]|%[0-9]+)+?\"\)
//                   ^--- notice this
Rubens Farias
+1  A: 

tr\((\"[^\"]*)\"\)

Christopher Bruns
It doesn't run; it fails with an error.
Rubens Farias
Thanks, dionadar. In the SO editor I need to type "\\(" to get "\(". I should pay closer attention to the preview.
Christopher Bruns
yeah, markdown does the backslash-escape-dance
dionadar
A: 

tr(\".*\")

in regex, . = anything, * = any number (including 0)

zendrums
Rubens Farias
greedy .* is rather risky here...
dionadar
This still suffers from the initial greedy pattern problem. It should be `.*?` and you also need to include the escaped opening/closing parentheses.
Ahmad Mageed
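Both points from the comments above can be checked with a short Python sketch (Python is an assumption; the input line is taken from the question's snippet, shortened). Unescaped parentheses form a group rather than matching literal parentheses, and a greedy `.*` overshoots even when the parentheses are escaped.

```python
import re

line = 'Response.Write("<td>" & tr("Edit") & "</td>")'

# Unescaped parentheses are a capture group, so this looks for the letters
# "tr" immediately followed by a quote, which does not occur in the line.
unescaped = re.search(r'tr(".*")', line)

# Escaped parentheses are literal, but greedy .* runs to the last '")'.
greedy = re.search(r'tr\(".*"\)', line)

# Escaped parentheses plus lazy .*? stop at the first '")'.
lazy = re.search(r'tr\("(.*?)"\)', line)

print(unescaped)
print(greedy.group(0))
print(lazy.group(1))
```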
A: 

I'm not sure if it's perfect, but it properly retrieved all of the entries in your sample. While testing the other expressions on this page I found that some erroneous entries were being returned. This one does not return any bad data:

tr\("([\W\w\s]+?)"\)

The result returned will contain both the entire function call, and also the strings within the function. I tested it with the following input:

Response.Write ("<td>" & tr("RA Assigned") & "</td>")
Response.Write ("<td>" & tr("Not Yet Assigned") & "</td>")
Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Edit") &"</A></td></TR>")
Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Awaiting Authorization") & "</A></td></TR>")                              
Response.Write("<td>" & tr("Awaiting Authorization") & "</td></TR>")
Response.Write ("<td>" & tr("RA Ass14151igned") & "</td>")
Response.Write ("<td>" & tr("RA %Ass_!igned") & "</td>")

And received the following output:

$matches Array:
(
    [0] => Array
        (
            [0] => tr("RA Assigned")
            [1] => tr("Not Yet Assigned")
            [2] => tr("Edit")
            [3] => tr("Awaiting Authorization")
            [4] => tr("Awaiting Authorization")
            [5] => tr("RA Ass14151igned")
            [6] => tr("RA %Ass_!igned")
        )

    [1] => Array
        (
            [0] => RA Assigned
            [1] => Not Yet Assigned
            [2] => Edit
            [3] => Awaiting Authorization
            [4] => Awaiting Authorization
            [5] => RA Ass14151igned
            [6] => RA %Ass_!igned
        )

)
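The output above looks like it comes from PHP's preg_match_all. A rough Python equivalent (an assumption, not the code used for the test above) shows the same split between the whole call (group 0) and the string inside it (group 1). It also shows one practical effect of `[\W\w\s]`: unlike `.`, it matches newlines by default.

```python
import re

pattern = re.compile(r'tr\("([\W\w\s]+?)"\)')

# Two of the test lines from the input above.
sample = (
    'Response.Write ("<td>" & tr("RA Assigned") & "</td>")\n'
    'Response.Write ("<td>" & tr("RA %Ass_!igned") & "</td>")\n'
)

full_calls = [m.group(0) for m in pattern.finditer(sample)]  # like $matches[0]
strings = [m.group(1) for m in pattern.finditer(sample)]     # like $matches[1]

# Hypothetical string spanning two lines, to compare [\W\w\s] with a plain dot.
multiline = 'tr("two\nlines")'
with_class = pattern.search(multiline)
with_dot = re.search(r'tr\("(.+?)"\)', multiline)

print(full_calls)
print(strings)
print(with_class is not None, with_dot is not None)
```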

On a related note, check out My Regex Tester. It's a super useful tool for testing regular expressions in your browser.

Nathan Taylor
A: 

Just don't match the quotation mark inside the string.

tr\(\"([^\"]+)\"\)
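A small Python sketch of this negated-character-class approach (Python and the sample line are assumptions for illustration). Because `[^"]` can never cross a quotation mark, the match cannot run past the end of the string even though `+` is greedy.

```python
import re

pattern = re.compile(r'tr\("([^"]+)"\)')

# Hypothetical line with two tr(...) calls, to show each match stops at its own quote.
line = 'Response.Write(tr("Edit") & " / " & tr("Awaiting Authorization"))'
strings = pattern.findall(line)

# + requires at least one character, so an empty string argument is skipped.
empty = pattern.findall('tr("")')

# VBScript escapes a quote inside a string by doubling it; this pattern
# does not handle that case and finds nothing here.
doubled = pattern.findall('tr("say ""hi""")')

print(strings)
print(empty, doubled)
```

If empty strings should be captured too, `*` in place of `+` would allow them.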

Mike Nelson
A: 

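This should do it; use the non-greedy modifier (?) after * or +:

    using System;
    using System.Text.RegularExpressions;

    // .*? is lazy, so each match stops at the first closing quote and parenthesis.
    const string pattern = "tr\\(\".*?\"\\)";
    const string text = "tr(\"RA Assigned %2\") & \"</td>\")";
    Regex r = new Regex(pattern, RegexOptions.Compiled);
    Match m = r.Match(text);
    while (m.Success)
    {
        // m.Value is the whole matched call, e.g. tr("RA Assigned %2")
        Console.WriteLine(m.Value);
        m = m.NextMatch();
    }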

(Here is a good C# regex cheat sheet.)

Ariel