From a string, I need to pull out groups that match a given pattern.

An example string: <XmlLrvs>FIRST</XmlLrvs><XmlLrvs>SECOND</XmlLrvs><XmlLrvs>Third</XmlLrvs>

Each group should begin with <XmlLrvs> and end with </XmlLrvs>. Here is a snippet of my code:

String patternStr = "(<XmlLrvs>.+?</XmlLrvs>)+";

// Compile and use regular expression
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(text);
matcher.matches();

// Get all groups for this match
for (int i = 1; i<=matcher.groupCount(); i++) {
   System.out.println(matcher.group(i));
}

The only output is <XmlLrvs>Third</XmlLrvs>. I expected the first and second groups as well, but they aren't being captured. Can anyone assist?

+3  A: 

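You are iterating over the groups when you should be iterating over matches. The matches() method tests whether the entire input matches the pattern as a single match, and a capturing group that sits inside a repetition keeps only the text of its last iteration, which is why you only see the third element. What you want is the find() method, which locates each successive match.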

Change

matcher.matches();

for (int i = 1; i<=matcher.groupCount(); i++) {
    System.out.println(matcher.group(i));
}

to

while (matcher.find()) {
    System.out.println(matcher.group(1));
}
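
For reference, here is a minimal self-contained sketch of the loop above, combined with the pattern from the question with its trailing + dropped (as discussed in the comments below). The class name XmlLrvsGroups and the helper extract are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlLrvsGroups {
    // Pattern from the question without the trailing '+', so that each
    // find() call matches exactly one <XmlLrvs>...</XmlLrvs> element.
    private static final Pattern ELEMENT =
            Pattern.compile("(<XmlLrvs>.+?</XmlLrvs>)");

    // Hypothetical helper: collects group 1 of every successive match.
    static List<String> extract(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = ELEMENT.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "<XmlLrvs>FIRST</XmlLrvs><XmlLrvs>SECOND</XmlLrvs><XmlLrvs>Third</XmlLrvs>";
        // Prints the three elements, one per line.
        for (String element : extract(text)) {
            System.out.println(element);
        }
    }
}
```

With the trailing + left in place, the first find() would consume all three elements in one match and group(1) would again hold only the last one.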
waxwing
Note that the + in the regex needs to be removed, or everything will be matched at once, and not in three iterations.
molf
I don't agree with that; the .+? is a non-greedy quantifier. But I haven't tested it.
waxwing
I mean the final +.
molf
Removing the + at the tail of the expression and using the suggested while loop did the job. Thanks.
@molf: Right you are, didn't see that!
waxwing
A: