Hi Everyone,

I have a sentence. I want to find all the words in that sentence that start with a specific character. I am very new to programming and Python, but from the little I know, this sounds like a regex question.

What pattern-matching code will let me find all the words that match?

Many thanks in advance,

Brock

+1  A: 
import re

s = "Your sentence that contains the word ROAD"
s = re.sub(r'\bROAD', 'RD.', s)

print(s)

Read: http://diveintopython3.org/regular-expressions.html

Nimbuz
Why the replacement, and end-of-string anchor?
Matthew Flaschen
+1  A: 
>>> sentence="a quick brown fox for you"
>>> pattern="fo"
>>> for word in sentence.split():
...     if word.startswith(pattern):
...         print word
...
fox
for

Split the sentence on whitespace, then loop over the words and print each one that starts with the pattern.

ghostdog74
+1  A: 

I second the Dive Into Python recommendation. But it's basically:

m = re.findall(r'\bf.*?\b', 'a fast and friendly dog')
print(m)

\b means word boundary, and .*? matches the rest of the word but stops at the first word boundary, so it doesn't go too far (technically, the ? makes the quantifier lazy).
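
To see why the ? matters, here is a minimal sketch comparing the lazy pattern with its greedy counterpart on the same sentence. The variable names are only for illustration.

```python
import re

text = 'a fast and friendly dog'

# Lazy: .*? grows one character at a time and stops at the first word boundary.
lazy = re.findall(r'\bf.*?\b', text)

# Greedy: .* grabs as much as it can, then backs off only to the LAST word boundary.
greedy = re.findall(r'\bf.*\b', text)

print(lazy)    # ['fast', 'friendly']
print(greedy)  # ['fast and friendly dog']
```

Without the ?, the first match swallows everything from "fast" to the end of the sentence.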

Matthew Flaschen
+1  A: 

You could do (doesn't use re though):

matching_words = [x for x in sentence.split() if x.startswith(CHAR_TO_FIND)]

Regular expressions work too (see the other answers), but I think this solution is a little more readable, and as a beginner learning Python, you'll find list comprehensions (like the one above) important to get comfortable with.
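
As a runnable sketch of the one-liner above: the sample sentence and the value of CHAR_TO_FIND are assumptions chosen only for illustration.

```python
sentence = "a quick brown fox for you"  # sample input, assumed for illustration
CHAR_TO_FIND = "f"                      # hypothetical value for illustration

# Keep every whitespace-separated chunk that begins with CHAR_TO_FIND.
matching_words = [x for x in sentence.split() if x.startswith(CHAR_TO_FIND)]

print(matching_words)  # ['fox', 'for']
```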

Scott Stafford
+1  A: 
import re
print(re.findall(r'\bv\w+', thesentence))

will print every word in the sentence that starts with 'v', for example.

Using the split method of strings, as another answer suggests, would not identify words, but space-separated chunks that may include punctuation. This re-based solution does identify words (runs of letters, digits, and underscores, without the surrounding punctuation).
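
A small sketch of that difference, using a made-up value for thesentence (the sample text is an assumption, not from the question):

```python
import re

thesentence = "A very vivid view, via the valley."  # sample input, assumed

# split() keeps punctuation attached to the chunk it touches.
chunks = [w for w in thesentence.split() if w.startswith('v')]

# \w+ stops at the first non-word character, so punctuation is left out.
words = re.findall(r'\bv\w+', thesentence)

print(chunks)  # ['very', 'vivid', 'view,', 'via', 'valley.']
print(words)   # ['very', 'vivid', 'view', 'via', 'valley']
```

Note that `\w+` needs at least one character after the `v`; `\w*` would also accept a bare `v`.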

Alex Martelli
This seems like a great approach, as I want to avoid commas directly after a word. My issue is that I can't get this to work if I want the word to start with the @ symbol. Any ideas? Thanks.
Btibert3
Add a `@?` -- an optional occurrence of `'@'` -- at the start of the regex pattern, before the `\b`. (If you make the `@` mandatory by dropping the `?`, you can also omit that `\b`, since `@` then `v` will _always_ mean a word-boundary, no need to check ;-).)
Alex Martelli
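
A minimal sketch of both variants discussed in the comments above. The sample text and variable names are assumptions for illustration.

```python
import re

text = "thanks @victor, and hi to @vera and valerie"  # sample input, assumed

# '@' is not a word character, so it has to appear in the pattern explicitly.
# Mandatory '@': no \b needed, and any word after the '@' is accepted.
mentions = re.findall(r'@\w+', text)

# Optional '@': keep the \b so a 'v' in the middle of a word is not matched.
v_words = re.findall(r'@?\bv\w+', text)

print(mentions)  # ['@victor', '@vera']
print(v_words)   # ['@victor', '@vera', 'valerie']
```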