Hello

I'm trying to write a pattern for tags, but the sub method only replaces the first char and 3 at the end of the line. I'm trying to replace all the tags on the line, and across multiple lines.

import re
p = re.compile('<img=([^}]*)>([^}]*)</img>', re.S)
p.sub(r'[img=\1]\2[/img]', '<img="test">dsad</img> <img="test2">dsad2</img>')
Output:
'[img="test">dsad</img> <img="test2"]dsad2[/img]'
+1  A: 

Towards the start of your re's pattern, you're using:

<img=([^}]*)>

This will gobble up (as group 1) all characters after the leading <img=, including other tags, up to the last > it can possibly reach: * is GREEDY, so it matches as much as it possibly can. I'm not sure why you're specifically excluding closing braces (}); maybe you meant to exclude closing angle brackets (>) instead.

For NON-greedy matching, instead of *, you need *?; with that, you'll be gobbling up as little as you can, instead of as much as you can. So, I think you mean:

p = re.compile(r'<img=([^>]*?)>(.*?)</img>', re.S)

this matches one img tag (and all tags inside it), and appears to be performing exactly the substitutions you mean.
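
To see the difference concretely, here is a minimal sketch that runs both patterns on the sample string from the question. The variable names (`text`, `greedy`, `lazy`) are illustrative, and Python 3 syntax is an assumption; the `re` calls themselves work the same way in Python 2.

```python
import re

text = '<img="test">dsad</img> <img="test2">dsad2</img>'
replacement = r'[img=\1]\2[/img]'

# Pattern from the question: greedy, and it excludes } rather than >
greedy = re.compile(r'<img=([^}]*)>([^}]*)</img>', re.S)
# Pattern suggested above: non-greedy, attribute stops at the first >
lazy = re.compile(r'<img=([^>]*?)>(.*?)</img>', re.S)

greedy_result = greedy.sub(replacement, text)
lazy_result = lazy.sub(replacement, text)

print(greedy_result)  # [img="test">dsad</img> <img="test2"]dsad2[/img]
print(lazy_result)    # [img="test"]dsad[/img] [img="test2"]dsad2[/img]
```

The greedy version finds a single match spanning both tags, which is why only the outermost brackets change; the non-greedy version finds two separate matches.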

Alex Martelli
Thanks, this solved the problem. I didn't find good documentation/manuals about patterns; I even understood your explanation better than some sites...
Do you know how to do the inverse?
p = re.compile(r'[img=([^]]*?)](.*?)[/img]', re.S)
  File "C:\Python25\lib\re.py", line 241, in _compile
    raise error, v # invalid expression
sre_constants.error: unbalanced parenthesis
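
A likely cause of this error, offered as a sketch rather than an answer from the thread: an unescaped [ opens a character class, so `[img=([^]` is read as a single class and the later `)` has no opening parenthesis to match. Escaping each literal bracket avoids that. The example below assumes Python 3 and assumes the goal is to turn `[img=...]...[/img]` back into `<img=...>...</img>`; the names `inverse`, `bbcode` and `html` are illustrative.

```python
import re

# Escape [ and ] so they match literally instead of opening a character class
inverse = re.compile(r'\[img=([^\]]*?)\](.*?)\[/img\]', re.S)

bbcode = '[img="test"]dsad[/img] [img="test2"]dsad2[/img]'
html = inverse.sub(r'<img=\1>\2</img>', bbcode)

print(html)  # <img="test">dsad</img> <img="test2">dsad2</img>
```

The same non-greedy quantifiers are kept here so that each tag on the line is matched separately.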