When using regular expressions we generally, if not always, use them to extract some kind of information. What I need instead is to replace the matched value with some other value.

Right now I'm doing this...

def getExpandedText(pattern, text, replaceValue):
    """
        One liner... really ugly but it's only used in here.
    """

    return text.replace(text[text.find(re.findall(pattern, text)[0]):], replaceValue) + \
            text[text.find(re.findall(pattern, text)[0]) + len(replaceValue):]

So if I do something like

>>> getExpandedText("aaa(...)bbb", "hola aaaiiibbb como estas?", "ooo")
'hola aaaooobbb como estas?'

It replaces whatever the (...) group matched with 'ooo'.

Does anyone know whether Python regular expressions can do this directly?

thanks a lot guys!!

+1  A: 

Of course. See the 'sub' and 'subn' methods of compiled regular expressions, or the 're.sub' and 're.subn' functions. You can either make it replace the matches with a string argument you give, or you can pass a callable (such as a function) which will be called to supply the replacement. See http://docs.python.org/lib/module-re.html
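
As a rough sketch of the options mentioned above (the sample text and the helper name `shout` are made up for illustration), here is a fixed string replacement, a callable replacement, and `subn`, which also reports how many substitutions were made:

```python
import re

text = "blue socks and red shoes"
pattern = re.compile(r"blue|red")

# 1. Replace every match with a fixed string.
fixed = pattern.sub("colour", text)

# 2. Replace every match with the result of a callable.
#    The callable receives the match object and must return a string.
def shout(match):  # hypothetical helper, not part of the re module
    return match.group(0).upper()

shouted = pattern.sub(shout, text)

# 3. subn returns a tuple: (new_string, number_of_substitutions).
result, count = re.subn(r"blue|red", "colour", text)
```

The callable form is the more flexible one, since the replacement can depend on what was actually matched.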

Thomas Wouters
+7  A: 
sub (replacement, string[, count = 0])

sub returns the string obtained by replacing the leftmost non-overlapping occurrences of the RE in string with replacement. If the pattern isn't found, string is returned unchanged.

    >>> import re
    >>> p = re.compile('(blue|white|red)')
    >>> p.sub('colour', 'blue socks and red shoes')
    'colour socks and colour shoes'
    >>> p.sub('colour', 'blue socks and red shoes', count=1)
    'colour socks and red shoes'
Swati
+2  A: 

You want to use re.sub:

>>> import re
>>> re.sub(r'aaa...bbb', 'aaaooobbb', "hola aaaiiibbb como estas?")
'hola aaaooobbb como estas?'

To re-use variable parts from the pattern, use \g<n> in the replacement string to access the n-th () group:

>>> re.sub(r"(svcOrdNbr +)..", r"\g<1>XX", "svcOrdNbr               IASZ0080")
'svcOrdNbr               XXSZ0080'
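
A short sketch of why the \g<n> form is handy (the sample strings here are invented): \1 works as well, but when the group reference is directly followed by a digit, \g<1> stays unambiguous.

```python
import re

# \1 and \g<1> are equivalent when nothing ambiguous follows them.
a = re.sub(r"(foo)bar", r"\1BAZ", "foobar")
b = re.sub(r"(foo)bar", r"\g<1>BAZ", "foobar")

# Followed by a digit, \g<1>0 means "group 1, then a literal 0",
# whereas \10 is read as a reference to group 10.
c = re.sub(r"(foo)bar", r"\g<1>0", "foobar")

try:
    re.sub(r"(foo)bar", r"\10", "foobar")
    ambiguous_failed = False
except re.error:
    # The pattern has no group 10, so the reference is rejected.
    ambiguous_failed = True
```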
David Schmitt
A: 

re.sub() does replacing based on regular expressions.

Nouveau
A: 

But sub doesn't seem to work in this case, for example:

>>> re.sub( "svcOrdNbr +(..)", "svcOrdNbr               IASZ0080", "XX")
'XX'

I would need to replace IA with XX, so that the return string would be

'svcOrdNbr               XXSZ0080'

Thanks!

miya
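
The 'XX' result in the post above is most likely an argument-order issue: the signature is re.sub(pattern, repl, string), so in that call 'XX' is treated as the text to search. A small sketch of the difference, assuming the same input line (the variable names are illustrative):

```python
import re

line = "svcOrdNbr" + " " * 15 + "IASZ0080"

# Arguments out of order: "XX" is taken as the string to search,
# nothing in it matches, so it comes back unchanged.
wrong_order = re.sub(r"svcOrdNbr +(..)", line, "XX")

# Correct order (pattern, replacement, string): the WHOLE match is
# replaced, including the "svcOrdNbr" prefix and the spaces.
whole_match = re.sub(r"svcOrdNbr +(..)", "XX", line)

# Capturing the prefix and re-inserting it keeps everything except "IA".
keep_prefix = re.sub(r"(svcOrdNbr +)..", r"\g<1>XX", line)
```

Note that even with the arguments in the right order, sub replaces the entire match rather than just the group, which is why the prefix has to be captured and put back.
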
A: 

If you want to keep the interface you described (replacing only the captured group rather than the whole match), and you will only ever have one group, you could use the code below.

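import re

def getExpandedText(pattern, text, replaceValue):
    m = re.search(pattern, text)
    if m is None:
        return text  # no match: leave the text unchanged
    # splice the replacement in place of group 1 of the first match
    return text[:m.start(1)] + replaceValue + text[m.end(1):]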
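
If every match should be handled rather than only the first, one possible variation is to let re.sub do the iteration and pass it a function. This is only a sketch; the names `replace_group` and `splice` are hypothetical, and the replacement is inserted literally (no backreferences are expanded):

```python
import re

def replace_group(pattern, text, replacement, group=1):
    """Replace only the given group inside every match of pattern."""
    def splice(m):
        whole = m.group(0)
        if m.start(group) == -1:
            # The group did not take part in this match; keep it as is.
            return whole
        # Convert the group's span to offsets relative to the whole match.
        start = m.start(group) - m.start(0)
        end = m.end(group) - m.start(0)
        return whole[:start] + replacement + whole[end:]
    return re.sub(pattern, splice, text)

example = replace_group("aaa(...)bbb", "hola aaaiiibbb como estas?", "ooo")
```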
Bruno Gomes
A: 
import re

def getExpandedText(pattern, text, *group):
    r""" Searches for pattern in the text and replaces
    all captures with the values in group.

    Tag renaming:
    >>> html = '<div> abc <span id="x"> def </span> ghi </div>'
    >>> getExpandedText(r'</?(span\b)[^>]*>', html, 'div')
    '<div> abc <div id="x"> def </div> ghi </div>'

    Nested groups, capture-references:
    >>> getExpandedText(r'A(.*?Z(.*?))B', "abAcdZefBgh", r'<\2>')
    'abA<ef>Bgh'
    """
    pattern = re.compile(pattern)
    ret = []
    last = 0
    for m in pattern.finditer(text):
        for i in range(len(m.groups())):  # xrange exists only in Python 2
            start, end = m.span(i + 1)

            # nested or skipped group (or no replacement supplied for it)
            if start < last or i >= len(group) or group[i] is None:
                continue

            # text between the previous and current match
            if last < start:
                ret.append(text[last:start])

            last = end
            ret.append(m.expand(group[i]))

    ret.append(text[last:])
    return ''.join(ret)

Edit: Allow capture-references in the replacement strings.
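
The capture-references rely on the match object's expand() method, which fills a template from one particular match. A tiny sketch with made-up input:

```python
import re

m = re.search(r"(\w+) (\w+)", "hello world")

# expand() substitutes backreferences in the template using this match.
swapped = m.expand(r"\2 \1")
wrapped = m.expand(r"<\g<1>>")
```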

MizardX