Hi,

Suppose I have a regex with a capturing group, e.g. foo(_+f). I want to match it against a string and replace the first capturing group in every match with baz, so that

foo___f blah foo________f

is converted to:

foobaz blah foobaz

There doesn't appear to be any easy way to do this using the standard libraries. If I use Matcher.replaceAll(), it replaces every match of the entire pattern and converts the string to

baz blah baz

Obviously I can just iterate through the matches, store the start and end index of each capturing group, then go back and replace them, but is there an easier way?

Thanks, Don
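
The manual approach mentioned in the question (iterate over the matches and use the start and end index of the capturing group) could look roughly like the sketch below. The class name `GroupReplacer` and the helper `replaceGroup1` are made up for illustration, and the code assumes group 1 participates in every match, which holds for foo(_+f).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupReplacer {
    // Hypothetical helper: replaces only the text captured by group 1
    // in every match and leaves the rest of each match untouched.
    static String replaceGroup1(String input, Pattern pattern, String replacement) {
        Matcher m = pattern.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0; // index just past the previously replaced group
        while (m.find()) {
            out.append(input, last, m.start(1)); // copy text up to group 1
            out.append(replacement);             // substitute group 1
            last = m.end(1);
        }
        out.append(input, last, input.length()); // copy the tail
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("foo(_+f)");
        System.out.println(replaceGroup1("foo___f blah foo________f", p, "baz"));
        // prints "foobaz blah foobaz"
    }
}
```

Because the replacement is appended directly rather than passed to replaceAll(), characters such as $ in it are not treated specially.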

+1  A: 
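import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("foo(g.*?f)"); // lazy quantifier: stop at the first f
Matcher m = p.matcher("foog___f blah foog________f");
String s = m.replaceAll("foobaz"); // replace with foobaz instead of just baz
System.out.println(s); // foobaz blah foobaz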
Amarghosh
No, I'm trying to replace the capturing groups in all matches
Don
That is what Amarghosh's snippet does. Although "foo" is part of the match, it is also included in the replacement string, so any instance like foo_f, foo____f, foo__f, etc. becomes foobaz.
JAB
@Don I've updated the code for you to test. As @JAB mentioned, I've included foo in the replacement string too. The original regex you posted was greedy and your question was not clear enough, which is why I asked whether you were looking for the lazy quantifier.
Amarghosh
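
To illustrate the greedy versus lazy point from the comment above, here is a small sketch. The greedy pattern foo(g.*f) is an assumption about what the original question contained (the thread only says it was greedy), and the class name `GreedyVsLazy` is made up.

```java
class GreedyVsLazy {
    // Greedy: .* runs to the last f in the input, so one match swallows everything.
    static String greedy(String s) {
        return s.replaceAll("foo(g.*f)", "foobaz");
    }

    // Lazy: .*? stops at the first f it can, so each foo... token is a separate match.
    static String lazy(String s) {
        return s.replaceAll("foo(g.*?f)", "foobaz");
    }

    public static void main(String[] args) {
        String input = "foog___f blah foog________f";
        System.out.println(greedy(input)); // prints "foobaz"
        System.out.println(lazy(input));   // prints "foobaz blah foobaz"
    }
}
```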
+3  A: 

I think you want something like this?

    System.out.println(
        "foo__f blah foo___f boo___f".replaceAll("(?<=foo)_+f", "baz")
    ); // prints "foobaz blah foobaz boo___f"

Here you simply replace the entire match with "baz", but the match uses lookbehind to ensure that _+f is preceded by foo.
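
To see that the lookbehind really keeps foo out of the match, the following sketch lists what each match covers. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindDemo {
    // Collects the text of every match of (?<=foo)_+f in the input.
    static List<String> matchedParts(String input) {
        Matcher m = Pattern.compile("(?<=foo)_+f").matcher(input);
        List<String> parts = new ArrayList<>();
        while (m.find()) {
            parts.add(m.group()); // the lookbehind text is not part of the match
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(matchedParts("foo__f blah foo___f boo___f"));
        // prints "[__f, ___f]"
    }
}
```

The trailing boo___f produces no match at all, because the lookbehind fails there.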

If lookbehind is not possible (perhaps because the prefix has no bounded length, which Java's lookbehind requires), then simply capture even what you're NOT replacing, and refer back to it in the replacement string.

    System.out.println(
        "fooooo_f boooo_f xxx_f".replaceAll("(fo+|bo+)(_+f)", "$1baz")
    ); // prints "fooooobaz boooobaz xxx_f"

So here we're effectively only replacing what \2 matches.
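
One thing to watch with this approach: in a Java replacement string, $ and \ are special. If the replacement text is not a fixed literal like baz, a hedged sketch would be to keep the `$1` reference and quote the rest with `Matcher.quoteReplacement`. The class and method names below are made up for illustration.

```java
import java.util.regex.Matcher;

class LiteralReplacement {
    // Keeps group 1 and replaces group 2 with the given text taken literally.
    static String replaceSuffix(String input, String literal) {
        return input.replaceAll("(fo+|bo+)(_+f)",
                "$1" + Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(replaceSuffix("fooooo_f boooo_f xxx_f", "baz"));
        // prints "fooooobaz boooobaz xxx_f"
        System.out.println(replaceSuffix("fooooo_f boooo_f xxx_f", "$5"));
        // prints "fooooo$5 boooo$5 xxx_f"
    }
}
```

Without the quoting, a replacement such as "$1$5" would be read as a reference to a group 5 that does not exist and would fail at runtime.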

polygenelubricants
Nice answer, but the OP seems to have edited the matching pattern (the `g` has been removed), which changes the problem quite a bit. I suggest updating your answer accordingly.
BalusC
The second suggestion is simple, effective, something I should have thought of myself, and doesn't require me to learn about lookarounds :)
Don
Yes, I updated the pattern in the question in an attempt to clarify. Sorry if it messed up your response.
Don
@Don: lookarounds are awesome, see e.g: http://stackoverflow.com/questions/2559759/how-do-i-convert-camelcase-into-human-readable-names-in-java
polygenelubricants
@PolyGene You can rest assured that I will be voting for you in the forthcoming "Stackoverflow Regex Emperor" election
Don
A: 

Is this anywhere close?

    String[] s = {"foo___f blah foo________f",
                  "foo___f blah goo________f"};
    for (String ss : s) {
        // $1 puts the captured "foo" back; only the underscores and the final f are replaced
        System.out.println(ss.replaceAll("(foo)(_+)f", "$1baz"));
    }

I.e., add a capturing group for 'foo' as well. Otherwise, if the prefix doesn't matter, a simple replacement would be

    "foo___f blah foo________f".replaceAll("(_+)f", "baz")
Kennet