I want to find (in Java) any substrings of a string that start with, say, aba and end with, say, aca, with one or more non-whitespace characters between them.

For example, blingblangabablahacablingblang would yield the substring abablahaca.

Then I want to replace each of those substrings, modifying the start to just b and the end to ca, but leaving the internal blah as it was.

For example, blingblangabablahacablingblang would be changed to blingblangbblahcablingblang.

Is there some way I can do this using String.replaceAll()? There will be many instances within the original string to change.

Thanks for your help.

+2  A: 

I don't know the Java specifics, but the regex should simply replace 'aba(.+?)aca' with 'b$1ca'.

Thom Smith
Thanks Thom - great suggestion too - yours and Mikael's suggestion of the group and referencing it again with $1 was what I needed.
Drew
This regex also matches whitespace between the aba and aca, whereas the question specifies "where there is one or more non-whitespace chars between them".
Stephen P
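To make the point about whitespace concrete, here is a minimal sketch comparing the `.+?` pattern from the answer with the `\S+?` variant that the asker reports using. The class and method names (`WhitespaceDemo`, `withDot`, `withNonSpace`) are made up for illustration.

```java
class WhitespaceDemo {
    // Pattern from the answer: the lazy dot also accepts spaces in the middle.
    static String withDot(String s) {
        return s.replaceAll("aba(.+?)aca", "b$1ca");
    }

    // Variant restricted to non-whitespace characters, as the question requires.
    static String withNonSpace(String s) {
        return s.replaceAll("aba(\\S+?)aca", "b$1ca");
    }

    public static void main(String[] args) {
        String spaced = "aba blah aca";
        System.out.println(withDot(spaced));      // b blah ca
        System.out.println(withNonSpace(spaced)); // aba blah aca (unchanged)

        String original = "blingblangabablahacablingblang";
        System.out.println(withNonSpace(original)); // blingblangbblahcablingblang
    }
}
```

Both patterns give the same result on the question's own example; they only differ once whitespace appears between aba and aca.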
+2  A: 

You could try:

myString.replaceAll("aba(\\w*?)aca", "b$1ca") // would also match abaaca, without blah in the middle

or

myString.replaceAll("aba(\\w+?)aca", "b$1ca") // matches only if there is at least one char between aba and aca
Mikael Svenson
Of course, in actual Java you have to double the backslashes.
musiKk
Thanks Mikael (and musikk). Quick, elegant, and almost perfect answer, and it gave me what I was after: I just had to use \\S+? instead of \\w+? and it worked like a dream. +1 and tick just as soon as I get enough reputation points!
Drew
@musikk Added the double slosh. That's what I get for never having compiled a Java program in my life ;)
Mikael Svenson
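Since the question mentions many instances, the following sketch applies the `\\S+?` variant from the comments through a precompiled `Pattern`, which avoids recompiling the regex on every call. The class name `AbaAcaRewriter` and the sample input are assumptions for illustration; `String.replaceAll` with the same regex gives the same result.

```java
import java.util.regex.Pattern;

class AbaAcaRewriter {
    // Compiled once and reused; group 1 captures the non-whitespace middle part.
    private static final Pattern ABA_ACA = Pattern.compile("aba(\\S+?)aca");

    static String rewrite(String input) {
        // $1 in the replacement refers back to the captured middle part.
        return ABA_ACA.matcher(input).replaceAll("b$1ca");
    }

    public static void main(String[] args) {
        // Every occurrence is rewritten in one pass.
        System.out.println(rewrite("x abafooaca y ababaraca z")); // x bfooca y bbarca z
        // With + there must be at least one character in between.
        System.out.println(rewrite("abaaca")); // abaaca (unchanged)
    }
}
```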
A: 

Try this pattern: aba[a-z]*aca

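import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("aba[a-z]*aca");
Matcher m = p.matcher("blingblangabablahacablingblang");
// Print every match, wrapped in > and < so its boundaries are visible
while (m.find()) {
    System.out.println(">" + m.group() + "<");
}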
Truong Ha
This will (might) match the interesting sequence but the answer misses a solution for replacing prefix and postfix.
Andreas_D
Also, since it is greedy, it matches the largest sequence starting with `aba` and ending with `aca`. This may or may not be desired by the OP.
musiKk
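The greediness remark can be checked with a small sketch that collects all matches for a greedy and a lazy version of the pattern. The helper `findAll` and the input string are hypothetical and chosen only to contain two candidate substrings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Collects every match of regex in input, in order of appearance.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "abaxacayabazaca";
        // Greedy: one match spanning from the first aba to the last aca.
        System.out.println(findAll("aba[a-z]*aca", input));  // [abaxacayabazaca]
        // Lazy: each aba is paired with the nearest following aca.
        System.out.println(findAll("aba[a-z]*?aca", input)); // [abaxaca, abazaca]
    }
}
```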
A: 

If it is guaranteed that both aba and aca are unique within the given character sequence, then you can use the good old String#replace instead of regular expressions:

String result = original.replace("aba", "b").replace("aca", "ca");
Andreas_D
On top of that it must be guaranteed that they only appear in the desired order.
musiKk
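A short sketch of why the ordering guarantee matters: chained `String#replace` calls rewrite every aba and aca independently, while the regex only touches an aba that is followed by an aca. The class name `PlainReplaceDemo` and the second input are assumptions for illustration.

```java
class PlainReplaceDemo {
    // The chained plain-text replacement from the answer.
    static String chained(String s) {
        return s.replace("aba", "b").replace("aca", "ca");
    }

    // Regex version: only rewrites aba ... aca pairs in that order.
    static String paired(String s) {
        return s.replaceAll("aba(\\S+?)aca", "b$1ca");
    }

    public static void main(String[] args) {
        String ok = "blingblangabablahacablingblang";
        System.out.println(chained(ok)); // blingblangbblahcablingblang
        System.out.println(paired(ok));  // blingblangbblahcablingblang

        // Here aca comes before aba, so there is no pair to rewrite.
        String reversed = "acaxaba";
        System.out.println(chained(reversed)); // caxb
        System.out.println(paired(reversed));  // acaxaba (unchanged)
    }
}
```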
A: 

What's wrong with a simple check using string.startsWith() and string.endsWith(), then taking the substring without the start and end, and rebuilding the new string from it?

mP
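One possible reading of this suggestion, as a rough sketch: it only applies when the whole string starts with aba and ends with aca, so it does not find substrings embedded in a longer string as the question asks, and it does not check for whitespace in the middle. The class and method names are hypothetical.

```java
class StartsEndsDemo {
    static String rewrite(String s) {
        String start = "aba";
        String end = "aca";
        // Require at least one character between the prefix and the suffix.
        boolean longEnough = s.length() > start.length() + end.length();
        if (longEnough && s.startsWith(start) && s.endsWith(end)) {
            String middle = s.substring(start.length(), s.length() - end.length());
            return "b" + middle + "ca";
        }
        return s; // no change when the string is not wrapped in aba ... aca
    }

    public static void main(String[] args) {
        System.out.println(rewrite("abablahaca")); // bblahca
        // Embedded occurrence is not detected by this approach.
        System.out.println(rewrite("blingblangabablahacablingblang"));
    }
}
```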