This is in reference to a question I asked before here

I received a solution to the problem in that question but ended up needing to go with regex for this particular part.

I need a regular expression that searches a string for two identical vowels in a row, such as the "oo" in "took" or the "ee" in "bees", and replaces each pair with a single copy of that vowel followed by a colon (:).

Some examples of expected behavior:

"took" should become "to:k"

"waaeek" should become "wa:e:k"

"raaag" should become "ra:ag"

Thank you for the help.

+1  A: 

Search for ([aeiou])\1 and replace it with \1:

I don't know Python, but you should be able to make the regex case-insensitive and global with something like /([aeiou])\1/gi
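For what it's worth, a rough Python equivalent of the /gi form might look like the sketch below. The helper name mark_double_vowels_ci is made up for illustration; re.sub already replaces every match by default (the "g" part), and flags=re.IGNORECASE stands in for the "i" flag.

```python
import re

# Hypothetical helper name, used only for this illustration.
def mark_double_vowels_ci(text):
    # re.sub replaces all non-overlapping matches by default ("global").
    # re.IGNORECASE makes both the character class and the backreference
    # case-insensitive, so "OO" and "Oo" are treated as doubled vowels too.
    return re.sub(r'([aeiou])\1', r'\1:', text, flags=re.IGNORECASE)

print(mark_double_vowels_ci("TOOK"))
print(mark_double_vowels_ci("bees"))
```

Note that the replacement keeps the case of the first letter of the pair, since that is what the group captures.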

Amarghosh
Thanks SilentGhost for the edit. I wasn't aware of the Python syntax; I just went by the usual convention.
Amarghosh
+7  A: 

Try this:

re.sub(r'([aeiou])\1', r'\1:', s)
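As a quick check, the one-liner above can be run against the examples from the question. This is only a sketch; the function name shorten_double_vowels is an assumption introduced here, not something from the question.

```python
import re

# Hypothetical wrapper around the backreference pattern.
def shorten_double_vowels(text):
    # ([aeiou]) captures one vowel; \1 requires the very same vowel to follow.
    # The pair is replaced by the captured vowel plus a colon.
    return re.sub(r'([aeiou])\1', r'\1:', text)

for word in ["took", "waaeek", "raaag", "teak"]:
    print(word, "->", shorten_double_vowels(word))
```

"teak" is included to show that two different vowels in a row are left alone.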
Gumbo
A: 

You'll need to use a backreference in your search expression. Try something like ([a-z])\1+ for a run of the same letter (or ([a-z])\1 for just a double). Use [aeiou] instead of [a-z] to restrict the match to vowels.

Seth
A: 

What NOT to do: As noted, this will match any two vowels together. Leaving this answer as an example of what NOT to do. The correct answer (in this case) is to use backreferences as mentioned in numerous other answers.

import re

data = ["took", "waaeek", "raaag"]

for s in data:
    # Flawed on purpose: {2} repeats the character class, not the captured vowel
    print(re.sub(r'([aeiou]){2}', r'\1:', s))

This matches exactly two occurrences ({2}) of any member of the set [aeiou] and replaces them with the vowel captured by the parentheses (), which \1 inserts into the replacement string, followed by a ':'. Because the group is repeated, \1 holds the last vowel matched.

Output:

to:k
wa:e:k
ra:ag
Mark Peters
This will change `teak` to `ta:k`.
Amarghosh
The repetition applies to the character class, so any two consecutive vowels will match, and only the last matched character is captured (tested with Expresso).
Amarghosh
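The difference described in the comments can be seen side by side in Python. This is a small sketch; the constant names ANY_TWO_VOWELS and SAME_VOWEL_TWICE are made up for illustration.

```python
import re

# {2} repeats the character class: any two vowels in a row match,
# and the group keeps only the last vowel matched.
ANY_TWO_VOWELS = re.compile(r'([aeiou]){2}')

# \1 repeats the captured text: only the same vowel twice matches.
SAME_VOWEL_TWICE = re.compile(r'([aeiou])\1')

for word in ["took", "teak"]:
    print(word,
          ANY_TWO_VOWELS.sub(r'\1:', word),
          SAME_VOWEL_TWICE.sub(r'\1:', word))
```

Both patterns agree on "took", which is why the three examples in the question do not expose the problem; only a word like "teak" does.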
You learn something new every day on this site. You are quite correct.
Mark Peters