Hi all

How can I replace 2 known strings separated by an unknown string using a regexp?

For example, I might have

known_string_1 blah_random text blah known_string_2

I know I need some sort of wildcard expression between the two known strings in the regexp, but being a regexp nooblet I have no idea what to use. The unknown string in the middle of the two known strings could be any length.

I'm using this to replace some old code with new stuff, but the fact that known blocks are indented with varying tabs doesn't help.

Thanks a lot,

James

+2  A: 

Very simply, .* will match any character (except, by default in most regex engines, a newline), any number of times, including zero.

So for your situation here, the regex

known_string_1.*known_string_2

should work (so long as none of the characters in your known strings are themselves metacharacters such as ?, +, etc.).
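As a rough illustration, here is a minimal Python sketch of the same idea, assuming the replacement were done in a script. The helper name `replace_between`, the sample text, and the replacement string "NEW" are made up for the example. `re.escape` takes care of any metacharacters in the known strings.

```python
import re

def replace_between(text, start, end, replacement):
    """Replace start, end, and everything between them with replacement."""
    # re.escape backslash-escapes metacharacters so the known strings match literally.
    pattern = re.escape(start) + ".*" + re.escape(end)
    # A callable replacement keeps backslashes in `replacement` from being interpreted.
    return re.sub(pattern, lambda m: replacement, text)

# Hypothetical sample line, indented with tabs like the old code in the question.
sample = "\t\tknown_string_1 blah_random text blah known_string_2 tail"
result = replace_between(sample, "known_string_1", "known_string_2", "NEW")
print(repr(result))
```

In Python, . does not match a newline by default, so if the unknown text can span several lines the pattern would need `flags=re.DOTALL` passed to `re.sub`.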

Andrzej Doyle
If they are metacharacters, would I be right in saying they need to be escaped with a backslash?
JamWaffles
Yes, escaping with a backslash should work for any metacharacter (including a literal backslash itself).
Andrzej Doyle
@Andrzej Doyle thank you :)
JamWaffles
+1  A: 

Using .* as the pattern for the unknown text between the two known strings will get you most of the way. However, what if you have a string that's like known_string_1 unknown_text_1 known_string_2 unknown_text_2 known_string_2?

If you just use .*, it will match greedily: the .* consumes as much as it can, so it matches unknown_text_1 known_string_2 unknown_text_2 and the overall match only ends at the last known_string_2. Is this what you want?

If that's not what you want (i.e. you just want to remove unknown_text_1) then you need to use the non-greedy modifier: .*?.
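The difference between the two can be seen in a short Python sketch run on the example string above. The capture groups and variable names are only there for illustration.

```python
import re

text = "known_string_1 unknown_text_1 known_string_2 unknown_text_2 known_string_2"

# Greedy: .* grabs as much as possible, so the match ends at the LAST known_string_2.
greedy = re.search(r"known_string_1(.*)known_string_2", text)

# Non-greedy: .*? grabs as little as possible, so the match ends at the FIRST known_string_2.
lazy = re.search(r"known_string_1(.*?)known_string_2", text)

greedy_middle = greedy.group(1).strip()
lazy_middle = lazy.group(1).strip()
print(greedy_middle)
print(lazy_middle)
```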

And as an aside, I hope that your known_string_1 and known_string_2 strings aren't opening and closing [X]HTML elements. Everybody knows you shouldn't parse [X]HTML with a regular expression.

CanSpice
What I'm doing is replacing old HTML in some files with updated markup (ugly table code with nice div code). The regex is only going to be used in a GUI regex text-replacement tool, not in a script, which I think is what your aside assumes. As for the greediness, it doesn't matter: known_string_2 only appears once. Thanks all the same!
JamWaffles