I have the string ababa and I want to replace every "aba" sequence, including overlapping ones, with "c". How do I do this?

Replacing the regular expression "aba" with "c" doesn't work: the result is "cba", but I want "cc". I'm guessing this is because the second "a" in the input string is consumed by the first match. Any idea how to do this?

A: 

Does "c" appear in the original string?

If not, use a loop to repeatedly replace strings: replace "aba" with "c", and also replace "cba" with "cc".

Edit: If "c" does appear, is there some character that doesn't appear in the original string? Say, "z"?

Use a loop to replace "aba" with "z", and also replace "zba" with "zz". When the loop finishes, replace every "z" with "c".
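
A minimal Python sketch of this loop, under a few assumptions that are not in the answer itself: the name `replace_with_placeholder` is hypothetical, the placeholder is a NUL character assumed to be absent from the input, and each iteration makes a single replacement, trying the placeholder-plus-"ba" rule first. The ordering seems to matter: replacing every "aba" at once turns abababa into zbz, which neither rule can finish.

```python
def replace_with_placeholder(text, placeholder="\x00"):
    """Replace every "aba" in text, overlapping ones included, with "c"."""
    if placeholder in text:
        raise ValueError("placeholder must not occur in the input")
    # A placeholder followed by "ba" is an "aba" whose first "a"
    # was already used up by the previous replacement.
    extend = placeholder + "ba"
    while True:
        if extend in text:
            text = text.replace(extend, placeholder * 2, 1)
        elif "aba" in text:
            text = text.replace("aba", placeholder, 1)
        else:
            break
    # Only now turn the placeholders into the real replacement.
    return text.replace(placeholder, "c")


if __name__ == "__main__":
    for sample in ("ababa", "abababa", "abazaba"):
        print(sample, "->", replace_with_placeholder(sample))
```

Because the placeholder is swapped for "c" only at the end, the same function also copes with input that already contains "c".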

UncleO
A: 

ab(?=a) A zero-width positive lookahead assertion.
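
To see what the lookahead does, here is a short sketch using Python's `re` module (the choice of Python is an assumption; the thread does not name a tool). The lookahead checks for the following "a" without consuming it, so each match covers only "ab".

```python
import re

pattern = re.compile(r"ab(?=a)")

# Each match is just the two characters "ab"; the "a" that follows
# is checked but left in the string for the next match to use.
spans = [m.span() for m in pattern.finditer("ababa")]

# Substituting "c" for every match leaves the final "a" untouched.
result = pattern.sub("c", "ababa")

if __name__ == "__main__":
    print(spans)
    print(result)
```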

Antony Hatchkins
This is close, but it's still not a complete answer: it yields cca in this example.
Paul Creasey
Won't work - `ababa` will come out as `cca` when they want `cc`.
Amber
Fixed, posted as another answer.
Antony Hatchkins
A: 

I'm not sure you can do this in a single-pass regex replacement - most regex engines only replace non-overlapping matches of a pattern.

Instead, you might want to write some simple code that scans through the string and looks for overlapping occurrences, then replaces runs of occurrences with the appropriate number of repetitions of the replacement, before moving on to the next run.
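
One way such a scan could look in Python, as a sketch rather than a definitive implementation. The function name `replace_overlapping` and its parameters are hypothetical. It finds every start position with a zero-width lookahead, emits one replacement per occurrence, and copies only the characters that no occurrence covers.

```python
import re


def replace_overlapping(text, needle, replacement):
    """Replace each occurrence of needle, overlapping ones included."""
    if not needle:
        raise ValueError("needle must not be empty")
    # A bare lookahead matches at every position where needle starts,
    # without consuming anything, so overlapping starts are all found.
    finder = re.compile("(?=" + re.escape(needle) + ")")
    out = []
    covered = 0  # first index not yet covered by an occurrence
    for match in finder.finditer(text):
        start = match.start()
        # Copy the uncovered gap; the slice is empty when start < covered.
        out.append(text[covered:start])
        out.append(replacement)
        covered = max(covered, start + len(needle))
    out.append(text[covered:])
    return "".join(out)


if __name__ == "__main__":
    for sample in ("ababa", "abababa", "abazaba"):
        print(sample, "->", replace_overlapping(sample, "aba", "c"))
```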

Amber
A: 

I don't think this is possible in a single step, since the first match will invalidate the second. You could achieve it using two steps and lookaround. What tool are you doing this in?

First, match the b's which are surrounded by a's:

s/(?<=a)b(?=a)/~c~/g

This matches those b's and replaces each one with a marked c. You can then use another step to remove the surrounding a's:

s/(~a~|a~|~a)//g

This is an example using Perl-like syntax. The ~ characters are inserted just to mark the a's which should be removed in the second step.
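
A hedged Python rendering of the two steps, assuming the first step is meant to leave ~c~ where each matched b was and that ~ does not occur in the input. The name `two_step` is hypothetical.

```python
import re


def two_step(text):
    # Step 1: put a marked "c" in place of each "b" that sits
    # between two "a"s. Lookarounds leave the "a"s in place.
    marked = re.sub(r"(?<=a)b(?=a)", "~c~", text)
    # Step 2: drop every "a" that touches a marker, along with the
    # marker itself. "~a~" comes first so a shared "a" is removed once.
    return re.sub(r"~a~|a~|~a", "", marked)


if __name__ == "__main__":
    for sample in ("ababa", "abababa", "abazaba"):
        print(sample, "->", two_step(sample))
```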

Paul Creasey
A: 

It's got to be multi-pass, something like:

s/ab(?=a)/c/g followed by s/a//g
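
The same two passes as a Python sketch (`multi_pass` is a hypothetical name). Note that the second pass deletes every remaining "a", including ones that were never part of an "aba", so this is only safe when the input has no other a's worth keeping.

```python
import re


def multi_pass(text):
    # Pass 1: replace each "ab" that is followed by an "a";
    # the lookahead leaves that "a" in the string.
    step1 = re.sub(r"ab(?=a)", "c", text)
    # Pass 2: remove every "a" that is left over.
    return re.sub(r"a", "", step1)


if __name__ == "__main__":
    for sample in ("ababa", "abazaba", "banana"):
        print(sample, "->", multi_pass(sample))
```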

In Perl you can also play around with the pos function, which lets you read and reset the position where the last match ended (that is, you would do something like pos = pos - 1). Mastering Perl is a good reference if you want to go down this path.

ennuikiller
+2  A: 

One pass!

s/ab(?=aba)|aba/c/g;

This in fact is the solution!

aba -> c
ababa -> cc
abazaba -> czc
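
For anyone who wants to check the pattern outside Perl, here is a small Python sketch (`one_pass` is a hypothetical name). The first alternative consumes only "ab" when another "aba" starts at the shared "a"; otherwise the second alternative consumes the whole "aba".

```python
import re


def one_pass(text):
    # "ab(?=aba)" handles an occurrence that overlaps the next one;
    # plain "aba" handles the last occurrence in a run.
    return re.sub(r"ab(?=aba)|aba", "c", text)


if __name__ == "__main__":
    for sample in ("aba", "ababa", "abababa", "abazaba"):
        print(sample, "->", one_pass(sample))
```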
Antony Hatchkins
Good answer, although the capturing parentheses serve no purpose! ab(?=aba)|aba is fine :)
Paul Creasey
Thanks, fixed )
Antony Hatchkins
Fails on abababa: it returns cbc, but I'm assuming he would want ccc.
Zac Thompson
check once again. abababa returns ccc.
Antony Hatchkins