ansaurus

Question

Is there a single Python regex that can change all "foo" to "bar" on lines starting with "#"?

Answer 1

+3 A:

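lines = mystring.split('\n')
for i, line in enumerate(lines):
    if line.startswith('#'):
        # Rebinding `line` alone would not change the list, so assign by index.
        lines[i] = line.replace('foo', 'bar')
mystring = '\n'.join(lines)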

No need for a regex.

Harley 2009-02-09 23:32:44

Yes, but as I specifically said in the last line of the question, I'd like to do this without having to split the string and sift through it line by line.

mike 2009-02-09 23:33:40

Why not split the string? I see Mat's provided a regex solution, but I find this one much easier to read.

John Fouhy 2009-02-09 23:44:22

There's an existing function that takes a series of regexes and applies them to an input string, and it's politically infeasible to change this function since quite a lot depends upon it.

mike 2009-02-09 23:48:54

Sorry, missed that last line. I'm genuinely curious why splitting is not an option, though; I think both methods load the entire string into memory.

Harley 2009-02-10 00:11:56

Unfortunately, using regexes for solutions like this in Python is not ... well ... Pythonic. Text replacement using regexes is not as well supported in Python as it is in Perl, since Python is much more generic in focus. The for loop may be your best bet for a simple, concise implementation.

Robert P 2009-02-10 01:07:05

Answer 2

+1 A:

It looked pretty easy to do with a regular expression:

>>> import re
>>> text = """line 1
... line 2
... Barney Rubble Cutherbert Dribble and foo
... line 4
... # Flobalob, bing, bong, foo and brian
... line 6"""
>>> regexp = re.compile(r'^(#.+)foo', re.MULTILINE)
>>> print(re.sub(regexp, r'\g<1>bar', text))
line 1
line 2
Barney Rubble Cutherbert Dribble and foo
line 4
# Flobalob, bing, bong, bar and brian
line 6

But then trying your example text is not so good:

>>> text = """# foo
... foo
... # foo foo"""
>>> regexp = re.compile(r'^(#.+)foo', re.MULTILINE)
>>> print(re.sub(regexp, r'\g<1>bar', text))
# bar
foo
# foo bar

So, try this:

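>>> # Note: \g is not a regex pattern escape. Python 2 reads it as a literal "g"; Python 3.6+ raises re.error.
>>> regexp = re.compile(r'(^#|\g.+)foo', re.MULTILINE)
>>> print(re.sub(regexp, r'\g<1>bar', text))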
# foo
foo
# foo foo

That seemed to work, but I can't find \g in the documentation!

Moral: don't try to code after a couple of beers.

Mat 2009-02-09 23:38:04

Wait, Python has a \g sigil that works like Perl's \G? I didn't notice that in the docs.

mike 2009-02-09 23:44:37

Anyway, this doesn't work. Try feeding it "# foo foo"

mike 2009-02-09 23:45:52
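The reason only one "foo" per line changes is that each match consumes the line from "#" up to and including a single "foo", so one pass can rewrite at most one occurrence per comment line. A minimal sketch of one workaround, assuming it is acceptable to apply the same regex repeatedly until nothing changes (the helper name `replace_until_stable` is hypothetical, and the pattern uses a lazy `.*?` instead of the `.+` above so that "#foo" is also handled):

```python
import re

# One pass rewrites only one "foo" per comment line, because each match
# consumes the line from "#" up to and including that "foo".
ONE_FOO = re.compile(r'^(#.*?)foo', re.MULTILINE)


def replace_until_stable(text):
    """Hypothetical helper: rerun the substitution until no match is left.

    The loop terminates because the replacement "bar" never introduces
    a new "foo".
    """
    count = 1
    while count:
        # subn returns (new_text, number_of_replacements_made).
        text, count = ONE_FOO.subn(r'\g<1>bar', text)
    return text


stable = replace_until_stable("# foo foo\nfoo\n#foo")
print(stable)
```

This is still more than a single substitution, so it only helps if the regex can be listed or applied several times.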

Yeah, I just realised that when I saw your example text. Darn!

Mat 2009-02-09 23:46:37

That last one doesn't seem to work at all -- it's all foos and no bars! :) Anyway, I think I'm going to give up on this feature. It's probably not possible.

mike 2009-02-10 00:27:13
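For what it's worth, a single `re.sub` call can do this with only the standard library if the replacement is allowed to be a function instead of a string: match each whole "#" line, then replace inside the matched text. A minimal sketch, assuming a callable replacement is acceptable (the helper name `replace_in_comment_lines` is hypothetical, and whether this fits the existing regex-applying function mentioned above is unknown):

```python
import re

# Matches a whole line that starts with "#".
# re.MULTILINE makes ^ and $ match at the start and end of every line.
COMMENT_LINE = re.compile(r'^#.*$', re.MULTILINE)


def replace_in_comment_lines(text, old='foo', new='bar'):
    """Hypothetical helper: replace `old` with `new` only on '#' lines."""
    # re.sub accepts a function as the replacement; it is called once per
    # match and its return value is substituted for the matched text.
    return COMMENT_LINE.sub(lambda m: m.group(0).replace(old, new), text)


result = replace_in_comment_lines("# foo\nfoo\n# foo foo")
print(result)
```

The regex only selects the lines; the actual replacement is plain `str.replace`, so every occurrence on a matching line is changed in one pass.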

Answer 3

A:

\g works in Python, though as \g<name> or \g<number> in the replacement string of re.sub rather than as Perl's \G anchor, and it is in the docs:

"In addition to character escapes and backreferences as described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE."

Algorias 2009-02-10 00:12:42
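The quoted passage describes replacement strings, not patterns. A short sketch of the three forms it mentions; the sample strings and variable names are made up for illustration:

```python
import re

# \g<number>: group 2 followed by a literal "0".
numbered = re.sub(r'(a)(b)', r'\g<2>0', 'ab')

# \g<name>: refers to a group defined with (?P<name>...).
named = re.sub(r'(?P<word>\w+)', r'[\g<word>]', 'foo bar')

# \g<0>: the entire matched substring.
whole = re.sub(r'\w+', r'\g<0>!', 'foo')

# \20 is read as a reference to group 20, which does not exist here.
try:
    re.sub(r'(a)(b)', r'\20', 'ab')
    ambiguous_failed = False
except re.error:
    ambiguous_failed = True

print(numbered, named, whole, ambiguous_failed)
```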


Edit: Yes, I know that it's possible to split the string into individual lines and test each line and then decide whether to apply the transformation, but please take my word that doing so would be non-trivial in this case. I really do need to do it with a single regular expression.
