It has been a while since I have used regular expressions, and I'm hoping what I'm trying to do is possible. I have a program that sends an automated response regarding a particular file, and I'd like to grab the text between two words that I know will never change. In this example those words are "regarding" and "sent".

Dim subject As String = "Information regarding John Doe sent."
Dim name As String = Regex.IsMatch(subject, "")

So in this case I'd like to be able to get just "John Doe". Every regexp I'm coming up with includes the words "regarding" and "sent". How can I use those words as the boundaries but not include them in the match?

+3  A: 

Assuming "Information regarding " and "sent." never change, you can use a capturing group to get "John Doe":

^Information regarding (.+) sent\.$

(The final period is escaped as \. so that it matches a literal "." rather than any character.)

And you use it this way:

Dim regex As New Regex("^Information regarding (.+) sent\.$")
Dim matches As MatchCollection = regex.Matches(subject)

Now, it should only match once, and you can get the group from the Groups property of the match:

For Each match As Match In matches  
  Dim groups As GroupCollection = match.Groups
  Console.WriteLine(groups.Item(1).Value) ' prints John Doe
Next
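
Because the anchored pattern can match at most once, the loop above could also be written with a single Regex.Match call. The following is only a minimal sketch, not part of the original answer: ExtractName is a hypothetical helper name, and it assumes the subject always has the exact "Information regarding ... sent." shape, returning Nothing otherwise.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ExtractNameExample
    ' Hypothetical helper: returns the text between the fixed prefix and
    ' suffix, or Nothing when the subject does not have the expected shape.
    Function ExtractName(subject As String) As String
        Dim m As Match = Regex.Match(subject, "^Information regarding (.+) sent\.$")
        If Not m.Success Then Return Nothing
        ' Groups(0) is the whole match; Groups(1) is the first parenthesized group.
        Return m.Groups(1).Value
    End Function

    Sub Main()
        Console.WriteLine(ExtractName("Information regarding John Doe sent."))
    End Sub
End Module
```

Checking Success before reading the group avoids silently printing an empty string when the subject line is not in the expected format.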
Welbog
I am using the sample code and the results being returned are "Information regarding John Doe sent." The whole string is being returned... Any idea what I'm doing incorrectly?
swolff1978
That last line should have been `Console.WriteLine(groups.Item(1).Value)` - group #0 is the whole match, while group #1 is the first capturing (parenthesized) group.
Alan Moore
@Alan M: Good catch. I've updated my answer. Thanks a lot.
Welbog
A: 

Your regular expression should essentially look like this:

.*regarding (.+) sent.*

And the data you're looking for will be in the first capture variable ($1 in Perl).
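
In VB.NET the equivalent of Perl's $1 is Groups(1) on the Match object. The sketch below is an illustration under assumptions rather than part of the original answer: the sample subject line is made up, and it contains "sent" twice to show that the greedy (.+) runs to the last " sent", whereas a lazy (.+?) stops at the first one.

```vb
Imports System
Imports System.Text.RegularExpressions

Module UnanchoredExample
    ' Hypothetical helper: returns capture group 1 of the first match,
    ' or Nothing when the pattern does not match.
    Function FirstGroup(subject As String, pattern As String) As String
        Dim m As Match = Regex.Match(subject, pattern)
        If Not m.Success Then Return Nothing
        Return m.Groups(1).Value
    End Function

    Sub Main()
        ' Made-up subject line in which " sent" appears twice.
        Dim subject As String = "Re: Information regarding John Doe sent. Reply sent today."

        ' Greedy .+ extends to the last " sent" in the string.
        Console.WriteLine(FirstGroup(subject, ".*regarding (.+) sent.*"))

        ' Lazy .+? stops at the first " sent".
        Console.WriteLine(FirstGroup(subject, ".*regarding (.+?) sent.*"))
    End Sub
End Module
```

If the word "sent" can only ever appear once in the subject, the greedy and lazy forms return the same text.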

Keith B
A: 

While capturing every group is one way of doing it, I would use two non-capturing groups and one named group, so that the only group returned is the one you want. This gives you the regex:

(?:regarding )(?<filename>.*)(?: sent)

This lets you retrieve the captured text from the groups by name, for example:

Dim rx As New Regex("(?:regarding )(?<filename>.*)(?: sent)", _
           RegexOptions.Compiled )
Dim text As String = "Information regarding John Doe sent."
Dim matches As MatchCollection = rx.Matches(text)
' The lazy way to get the match; should print "John Doe"
Console.WriteLine(matches(0).Groups.Item("filename").Value)

A good resource for Regex is the regular expression documentation on the MSDN site.

Andrew Cox