I have:

Rutsch is for rutterman ramping his roe

which is a phrase from Finnegans Wake. The epic riddle book is full of leitmotifs like this, such as 'take off that white hat' and 'tip', all of which get mutated into similar-sounding words depending on where you are in the book. All I want is a way to find obvious occurrences of this particular leitmotif, i.e.

[word1] is for [word2] [word-part1]ing his [word3]

+3  A: 

You can do this with regular expressions in Python:

import re
pattern = re.compile(r'(?P<word>.*) is for (?P=word) (?P=word)ing his (?P=word)')
words = pattern.findall(text)

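That won't match your example, because each (?P=word) backreference has to repeat exactly the text captured by the first group. It will match [word] is for [word] [word]ing his [word], with the same word in every slot. Add seasoning to taste. You can find more details in the re module docs.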

Nathon
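To make the difference concrete, here is a rough sketch comparing the backreference pattern above with a loosened variant that gives each slot its own group. The group names word1, word2, stem and word3 are made up for illustration, and \w+ is only an assumption about what counts as a "word" in the Wake.

```python
import re

# Pattern from the answer above: every (?P=word) must repeat group 'word' exactly.
same_word = re.compile(r'(?P<word>.*) is for (?P=word) (?P=word)ing his (?P=word)')

# Hypothetical loosened variant: one independent named group per slot.
any_words = re.compile(
    r'(?P<word1>\w+) is for (?P<word2>\w+) (?P<stem>\w+)ing his (?P<word3>\w+)'
)

example = "Rutsch is for rutterman ramping his roe"

# The backreference version finds nothing, because the four words differ.
no_match = same_word.search(example)

# The loosened version captures each slot separately.
match = any_words.search(example)
slots = match.groupdict()
print(no_match)
print(slots)
```

The stem group backtracks off the trailing "ing", so for the example it holds "ramp" rather than "ramping".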
+1: Which is the same way you'd do it in AWK.
S.Lott
+2  A: 
import re
# read the book into a variable 'text'
matches = re.findall(r'\w+ is for \w+ \w+ing his \w+', text)
imgx64
No, the point is that you want the _same_ word in all of these places. Yours will match "Alex is for Bob Charlieing his Dan.".
katrielalex
@katrielalex: The example is "Rutsch is for rutterman ramping his roe".
Jeff
@Jeff: Oops, I fail. Thanks.
katrielalex
For robustness, you could replace the spaces in the string with \s+.
Jeff
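A minimal sketch of the \s+ suggestion, assuming the phrase may be split across line breaks or padded with extra spaces in your copy of the text. The names TEMPLATE, LEITMOTIF and find_leitmotif are invented for this example, and re.IGNORECASE is an extra assumption in case the fixed words are capitalised differently somewhere.

```python
import re

# Pattern from the answer above, written with plain spaces for readability.
TEMPLATE = r"\w+ is for \w+ \w+ing his \w+"

# Swap each literal space for \s+ so runs of spaces, tabs and newlines all match.
LEITMOTIF = re.compile(TEMPLATE.replace(" ", r"\s+"), re.IGNORECASE)

def find_leitmotif(text):
    """Return each match with its internal whitespace collapsed to single spaces."""
    return [" ".join(m.group(0).split()) for m in LEITMOTIF.finditer(text)]

# Sample with a line break and extra spaces inside the phrase.
sample = "and Rutsch is for\nrutterman   ramping his roe, tip"
found = find_leitmotif(sample)
print(found)
```

Collapsing the whitespace afterwards is optional. It just makes hits easier to compare when the same phrase wraps differently in different places.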