In PHP, you have preg_replace($patterns, $replacements, $string), where you can make all your substitutions at once by passing in an array of patterns and replacements.

What is the equivalent in Python?

I noticed that the string and re functions replace() and sub() don't take dictionaries...

Edited to clarify based on a comment by rick: the idea is to have a dict whose keys are to be taken as regular expression patterns, such as '\d+S', and whose values are (hopefully) constant strings without backreferences. Now editing my answer accordingly (i.e. to answer the actual question).

+2  A: 

closest is probably:

somere.sub(lambda m: replacements[m.group()], text)

for example:

>>> import re
>>> za = re.compile(r'z\w')
>>> za.sub(lambda m: dict(za='BLU', zo='BLA')[m.group()], 'fa za zo bu')
'fa BLU BLA bu'

with a .get instead of []-indexing if you want to supply a default for matches that are missing in replacements.
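As a minimal sketch of the .get variant (the names replacements, pattern and replace_known are illustrative, not part of the original answer), falling back to the matched text leaves unknown matches untouched:

```python
import re

# Illustrative data: 'zi' is deliberately missing from the dict.
replacements = {'za': 'BLU', 'zo': 'BLA'}
pattern = re.compile(r'z\w')

def replace_known(text):
    # .get(key, default): use the matched text itself as the default,
    # so matches that have no entry in the dict are left as they are.
    return pattern.sub(lambda m: replacements.get(m.group(), m.group()), text)

print(replace_known('fa za zi zo bu'))  # fa BLU zi BLA bu
```

Passing a constant such as 'UNKNOWN' as the default instead would replace every unmatched key with that marker.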

Edit: what rick really wants is to have a dict with keys to be taken as regular expression patterns, such as '\d+S', and (hopefully) constant string values (hopefully w/o backreferences). The cookbook recipe can be adapted for this purpose:

import re

def dict_sub(d, text):
  """ Replace in 'text' non-overlapping occurrences of REs whose patterns are keys
  in dictionary 'd' by corresponding values (which must be constant strings: may
  have named backreferences but not numeric ones). The keys must not contain
  anonymous matching-groups.
  Returns the new string.""" 

  # Create a regular expression  from the dictionary keys
  regex = re.compile("|".join("(%s)" % k for k in d))
  # Facilitate lookup from group number to value
  lookup = dict((i+1, v) for i, v in enumerate(d.values()))

  # For each match, find which group matched and expand its value
  return regex.sub(lambda mo: mo.expand(lookup[mo.lastindex]), text)

Example use:

  d={r'\d+S': 'wot', r'\d+T': 'zap'}
  t='And 23S, and 45T, and 66T but always 029S!'
  print(dict_sub(d, t))

emits:

And wot, and zap, and zap but always wot!

You could avoid building lookup and just use mo.expand(list(d.values())[mo.lastindex-1]) (the list() call is needed on Python 3, where d.values() is not indexable), but that might be a tad slow if d is very large and there are many matches (sorry, I haven't precisely measured/benchmarked both approaches, so this is just a guess ;-).
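For comparison, here is a sketch of that variant, assuming Python 3. The name dict_sub_nolookup is hypothetical, and the list of values is built once up front rather than once per match:

```python
import re

def dict_sub_nolookup(d, text):
    # One capturing group per key, in the dict's iteration order.
    regex = re.compile("|".join("(%s)" % k for k in d))
    # Values in the same iteration order as the keys above.
    values = list(d.values())
    # mo.lastindex is the 1-based number of the group that matched.
    return regex.sub(lambda mo: mo.expand(values[mo.lastindex - 1]), text)

d = {r'\d+S': 'wot', r'\d+T': 'zap'}
t = 'And 23S, and 45T, and 66T but always 029S!'
print(dict_sub_nolookup(d, t))  # And wot, and zap, and zap but always wot!
```

As with dict_sub, the keys must not contain anonymous groups of their own, or the group numbers no longer line up with the values.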

Alex Martelli
This only supports one regex. I think you can't get simpler than the function I grabbed from ActiveState if you actually want both a replacements and a patterns dict. Can you?
Vinko Vrsalovic
Alex Martelli
I think the original question is better answered by the recipe, because PHP's preg_replace accepts both many regexps and replacements.
Vinko Vrsalovic
The recipe doesn't accept _any_ re -- it builds one from the dict; a pretty different task. Lemme check out preg_replace precisely and see how to rewrite that exact spec...
Alex Martelli
...I see: in each place it accepts a string _or_ an array of strings, and no dicts (or named groups), just positional (numeric) correspondences. However, the question specifically mentioned dictionaries, so it can't be asking about an exact equivalent of preg_replace. Maybe @rick can clarify exactly what he wants with an edit to his question!
Alex Martelli
Won't this throw a KeyError on the string "za zi ze"? Seems a little fragile for general use.
Triptych
If the RE may match items that aren't keys in the dict, and you want to replace those with (say) 'UNKNOWN', instead of [m.group(0)] you use .get(m.group(0),'UNKNOWN'); if you want to NOT replace such items, use .get(m.group(0),m.group(0)). In other words, you use completely normal Python techniques for dict access: [] if a missing key is an error, .get with a default if it's an OK case.
Alex Martelli
Thanks Alex. The problem I'm having is, as Triptych suggested, a KeyError. The replacement dictionary I have uses regexes as keys, but the sub() call in this recipe looks up the key in the dict using the matched text, which is a string such as "12S". The dict doesn't contain the key "12S", so it throws a KeyError (the dict has something like '\d*S'). I don't think using .get would solve this particular problem, right?
rick
@rick, no you're right it wouldn't -- so @Vinko's read your mind better than I had, but not perfectly (as he also uses mo.group() as the key). OK, let me edit to clarify the Q and then give the A;-).
Alex Martelli
I know I'm a good mind reader... :) I deleted my answer to avoid further confusion. I'll just leave the recipe link here http://code.activestate.com/recipes/81330/ which replaces using regular strings rather than regexps.
Vinko Vrsalovic
+1 for good mindreading and excellent stack overflow "best practice"!-)
Alex Martelli
A: 

It's easy enough to do this:

replacements = dict(hello='goodbye', good='bad')
s = "hello, good morning"
for old, new in replacements.items():
    s = s.replace(old, new)

You will find many places where PHP functions accept an array of values and there is no direct Python equivalent, but it is much easier to work with arrays (lists) in Python so it is less of an issue.
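Because the loop above feeds each replacement's output into the next one, the result can depend on the order in which the dict is iterated. One way around that, sketched here for literal (non-regex) keys, is to do all substitutions in a single pass; multi_replace is a hypothetical helper name:

```python
import re

def multi_replace(replacements, text):
    # Longest keys first, so a longer key wins over its own prefix.
    keys = sorted(replacements, key=len, reverse=True)
    # re.escape makes each key match literally, not as a pattern.
    regex = re.compile("|".join(re.escape(k) for k in keys))
    # Matching runs over the original text only, so the output of one
    # replacement is never replaced again by another.
    return regex.sub(lambda m: replacements[m.group()], text)

replacements = {'hello': 'goodbye', 'good': 'bad'}
print(multi_replace(replacements, "hello, good morning"))  # goodbye, bad morning
```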

too much php
you might want to use dict.iteritems instead of dict.items, per PEP290 http://www.python.org/dev/peps/pep-0290/#looping-over-dictionaries
NicDumZ
Don't like it. dict.items() is not guaranteed to be in any particular order, so the resulting replacement is unpredictable. For instance, in your example, if "hello" is processed first, the resultant string is "badbye, bad morning"; otherwise, it's "goodbye, bad morning".
Triptych
@Triptych: Nice catch.
too much php