Q: 

Regex Help

I have text files that all contain fragments of code like the following:

Lines.Strings = (
  '[email protected]'
  '[email protected]'
  '[email protected]'
  '[email protected]')

The e-mail addresses will change, as will the Lines part of Lines.Strings; it can be called anything, e.g. Test.Strings or ListBox.Strings.

I want to match any and all text inside this Strings = ( ) block.

Here is the pattern I'm using:

[a-zA-Z]+\.Strings\s+=\s+[\(]{1}(.+)[\)]{1}

But this doesn't catch all of the matches.

I want the group to stop after the first ) is found. It looks like it's matching every character after the first ( all the way to the last ) at the end of the file.

Any help would be greatly appreciated.

EDIT:

I'm not asking to match "emails"; I want any text between the opening ( and the closing ).

I think it's called "non-greedy" matching. I only want the match to end after the first ) is found.

+3  A: 

This may work better.

[a-zA-Z]+\.Strings\s*=\s*\(([^)]+)\)

Also, how are you dealing with line breaks? That may be your issue. From the regexp you gave, it looks as if you are concatenating all the lines together with spaces. If not, you need to be thinking in terms of multi-line regular expressions.
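To make the line-break point concrete, here is a minimal C# sketch comparing how . behaves with and without RegexOptions.Singleline, and how greedy .+ differs from lazy .+? once . can cross line breaks. The input string, class name, and Show helper are made up for illustration; only Regex.Matches and RegexOptions come from .NET itself.

```csharp
using System;
using System.Text.RegularExpressions;

class DotAndNewlineDemo
{
    // Hypothetical helper: prints the match count and each capture, with newlines made visible.
    static void Show(string label, string pattern, RegexOptions options, string input)
    {
        MatchCollection matches = Regex.Matches(input, pattern, options);
        Console.WriteLine(label + ": " + matches.Count + " match(es)");
        foreach (Match m in matches)
            Console.WriteLine("  " + m.Groups[1].Value.Replace("\n", "\\n"));
    }

    static void Main()
    {
        // Made-up input: two blocks, the first one spanning two lines.
        string input = "A.Strings = (\n  'one')\nB.Strings = ('two')";

        // Default options: '.' does not match '\n', so the two-line block is skipped.
        Show("default, greedy", "\\((.+)\\)", RegexOptions.None, input);

        // Singleline: '.' also matches '\n'; greedy .+ runs from the first '(' to the last ')'.
        Show("Singleline, greedy", "\\((.+)\\)", RegexOptions.Singleline, input);

        // Singleline with lazy .+? stops at the first ')' after each '('.
        Show("Singleline, lazy", "\\((.+?)\\)", RegexOptions.Singleline, input);
    }
}
```

With this input, only the last variant should report each block as its own match, which is the behavior the question asks for.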

In response to your answer to your own question:

"[a-zA-Z]+\\.Strings\\s+=\\s+[\\(]{1}(.+?)\\){1}?",

is needlessly cluttered, and may not work as you expect for several cases.

  • you don't need to write {1} since that's the default
  • [\)] is the same as just \)
  • you won't match cases where the = isn't surrounded by spaces, even though such spaces are generally optional
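  • The ? after the final {1} does not make the closing ) optional; it only makes that quantifier lazy, which changes nothing for an exact count, so it can go. Excluding ) from the inner pattern with [^)]+ also removes the need for the lazy .+? and for relying on . to match line breaks.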

Fixing these leads to:

"[a-zA-Z]+\\.Strings\\s*=\\s*\\(([^)]+)\\)"

Which is just what I posted, but with the backslashes doubled for use in a string.
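As a rough check of that pattern, the sketch below runs it over a small made-up input in C#. The sample text, placeholder addresses, and class name are assumptions for illustration, not taken from the asker's files.

```csharp
using System;
using System.Text.RegularExpressions;

class StringsBlockDemo
{
    static void Main()
    {
        // Hypothetical sample input with two blocks; the addresses are placeholders.
        string input =
            "Lines.Strings = (\n" +
            "  'a@example.com'\n" +
            "  'b@example.com')\n" +
            "ListBox.Strings=(\n" +
            "  'c@example.com')\n";

        // [^)]+ stops before the first ')' and, being a negated character class,
        // also matches newline characters, so no Singleline option is needed here.
        var regex = new Regex("[a-zA-Z]+\\.Strings\\s*=\\s*\\(([^)]+)\\)");

        foreach (Match m in regex.Matches(input))
        {
            // Groups[1] holds the text between the parentheses, line breaks included.
            Console.WriteLine("[" + m.Groups[1].Value + "]");
        }
    }
}
```

Both blocks should be found, including the second one, where the = has no spaces around it. One limitation of this approach: a ) inside one of the quoted strings would end the capture early.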

MarkusQ
I use the "SingleLine" RegexOption
Michael G
Then you'll need to glue all the lines together into one long line (with spaces between them) to match multiline cases like your example.
MarkusQ
Thanks for the help! +1 and accepted
Michael G
+1  A: 

Try

/\(([^\)]+?)\)/

NODE                     EXPLANATION
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^\)]+?                  any character except: '\)' (1 or more
                             times (matching the least amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
Ed Guiness
A: 
public static Regex regex = new Regex(
      "[a-zA-Z]+\\.Strings\\s+=\\s+[\\(]{1}(.+?)\\){1}?",
    RegexOptions.IgnoreCase
    | RegexOptions.Singleline
    | RegexOptions.CultureInvariant
    | RegexOptions.IgnorePatternWhitespace
    | RegexOptions.Compiled
    );

I seem to have gotten it.

If you can improve this, post your own answer with the updated version.

-- Note -- the above works.

Michael G
You've gotten it, but you are doing a lot of unnecessary things; see my answer for details.
MarkusQ