I'm using the Java matcher to try and match the following:

@tag TYPE_WITH_POSSIBLE_SUBTYPE -PARNAME1=PARVALUE1 -PARNAME2=PARVALUE2: MESSAGE

The TYPE_WITH_POSSIBLE_SUBTYPE consists of letters and periods.

Every parameter name has to consist of letters, and every value has to consist of letters and digits. There can be zero or more parameters. Immediately after the last parameter value comes a colon, then a space, and the remainder is considered the message.

Everything needs to be grouped.

My current regexp (as a Java literal) is:

(@tag)[\\s]+?([\\w\\.]*?)[\\s]*?(-.*=.*)*?[\\s]*?[:](.*)

However, I keep getting all the parameters as one group. How do I get each as a separate group, if it is even possible?

I don't work that much with regexps, so I always mess something up.

+1  A: 

Try this out (you will need to double each backslash to make it work within a Java string literal):

(@tag)\s*([\w.]*)\s*(-\w*=\w*\s*)*:(.*)

By the way, I highly recommend this site to help you build regular expressions: RegexPal. Even better is RegexBuddy; it's well worth the $40 if you plan on doing a lot of regular expressions in the future.

Mike
+2  A: 

If you want to capture each parameter separately, you have to have a capture group for each one. Of course, you can't do that because you don't know how many parameters there will be. I recommend a different approach:

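// Needs java.util.regex.Matcher and java.util.regex.Pattern; s is the input line.
Pattern p = Pattern.compile("@tag\\s+([^:]++):\\s*(.*)");
Matcher m = p.matcher(s);
if (m.find())
{
  // group(1) is everything between "@tag" and the colon
  String[] parts = m.group(1).split("\\s+");
  for (String part : parts)
  {
    System.out.println(part);
  }
  // group(2) may only be read after a successful find()
  System.out.printf("message: %s%n", m.group(2));
}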

The first element in the array is your TYPE name and the rest (if there are any more) are the parameters.
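If the parameter names and values are needed separately, one possible variation on the same idea is to validate the whole line with one pattern and then walk the parameter section with a second pattern and Matcher.find(). This is only a sketch: the class name TagLineParser, the Parsed holder, and the sample input are made up for illustration, and the character classes assume "letters" means ASCII letters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagLineParser {
    // Whole line: type, the raw run of parameters, then the message after the colon.
    private static final Pattern LINE = Pattern.compile(
            "@tag\\s+([A-Za-z.]+)((?:\\s+-[A-Za-z]+=[A-Za-z0-9]+)*):\\s*(.*)");
    // A single parameter inside the raw run captured by group 2 of LINE.
    private static final Pattern PARAM = Pattern.compile("-([A-Za-z]+)=([A-Za-z0-9]+)");

    /** Hypothetical holder for the parsed pieces. */
    static final class Parsed {
        final String type;
        final Map<String, String> params;
        final String message;

        Parsed(String type, Map<String, String> params, String message) {
            this.type = type;
            this.params = params;
            this.message = message;
        }
    }

    /** Returns the parsed line, or null if the input does not have the expected shape. */
    static Parsed parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // LinkedHashMap keeps the parameters in the order they appeared.
        Map<String, String> params = new LinkedHashMap<>();
        Matcher pm = PARAM.matcher(m.group(2));
        while (pm.find()) {
            params.put(pm.group(1), pm.group(2));
        }
        return new Parsed(m.group(1), params, m.group(3));
    }

    public static void main(String[] args) {
        Parsed p = parse("@tag build.error -file=Main -line=42: something broke");
        System.out.println("type: " + p.type);
        for (Map.Entry<String, String> e : p.params.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        System.out.println("message: " + p.message);
    }
}
```

The first pattern rejects malformed lines as a whole, so the find() loop only ever sees parameters that already passed validation.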

Alan Moore
Thank you. I actually have something like that right now (where I split the parameters), but I was hoping that there was a way to end up with variable-length groups.
Uri
As far as I know, .NET is the only regex flavor that lets you access all of the matches for a capture group as opposed to just the last one.
Alan Moore
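To illustrate the point about Java keeping only the last match of a repeated group, here is a small sketch. The class name RepeatedGroupDemo and the sample line are invented for the example; the pattern is a simplified stand-in, not the one from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Group 1 sits inside a repeated non-capturing group, so it is overwritten on every repetition.
    private static final Pattern REPEATED =
            Pattern.compile("@tag\\s+[\\w.]+(?:\\s+(-\\w+=\\w+))*:.*");

    /** Returns what group 1 holds after a full match, or null if there is no match or no parameter. */
    static String lastRepetition(String line) {
        Matcher m = REPEATED.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Two parameters are matched, but only the final one is still available in group 1.
        System.out.println(lastRepetition("@tag a.b -x=1 -y=2: hi"));
    }
}
```

However many parameters the line contains, groupCount() stays at 1, which is why a second pass (a split or a find() loop) is needed to get at each parameter.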