ansaurus

Question

Answer 1

+1 A:

Works fine for me:

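import re

l = "one two 9three 52eight four".split()
c = re.compile(r"(?!\d)\w+")  # raw string, so \d and \w reach the regex engine unchanged

m = [w for w in l if c.match(w)]  # match() only tries the pattern at the start of each word
print(m)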

Prints:

['one', 'two', 'four']

miles82 2010-04-28 14:08:27

Strange, check this out http://tinyurl.com/2ctzevm

pocoa 2010-04-28 14:19:59

@pocoa, because he split the string into words first and then checked each word with `re.match`, which only matches at the start of the string. That's why 9three and 52eight are not matched.

S.Mark 2010-04-28 14:29:42
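
To illustrate the comment above, here is a minimal sketch comparing per-word `re.match` with scanning the whole string using `re.findall`. The variable names are made up for the example, and whether the online tester behind the tinyurl link scans the whole string this way is only an assumption.

```python
import re

pattern = re.compile(r"(?!\d)\w+")
text = "one two 9three 52eight four"

# match() tries the pattern only at position 0 of each word,
# so a word with a leading digit is rejected outright.
matched_words = [w for w in text.split() if pattern.match(w)]

# findall() scans the whole string and may start a match mid-word,
# right after the leading digits.
scanned = pattern.findall(text)

print(matched_words)  # ['one', 'two', 'four']
print(scanned)        # ['one', 'two', 'three', 'eight', 'four']
```

Without a `\b` anchor, the scan simply skips the digits and picks up the rest of the word.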

@S.Mark I don't have Python installed on this public computer, so I'm relying on online tools. Sorry @miles82!

pocoa 2010-04-28 14:36:16

Answer 2

+2 A:

That's because `\w` includes digits. What you need to do is:

>>> s = "one two 9three 52eight four"
>>> import re
>>> re.findall(r'\b[a-z]+\b', s, re.I)
['one', 'two', 'four']

Also, what you're using, (?!...), is called a negative look-ahead, while you probably meant a negative look-behind, (?<!...), which would of course still fail because of the above-mentioned issue.

ETA: then you just need a single word boundary:

>>> re.findall(r'\b(?!\d)\w+', s)
['one', 'two', 'four']

SilentGhost 2010-04-28 14:10:51

Thanks. Sorry, I didn't provide enough information. I don't want to match if it starts with a number, but "four8" is okay.

pocoa 2010-04-28 14:17:44
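
A small sketch of how the two patterns in this answer differ once a word like "four8" is involved. The sample string is extended with "four8" purely for illustration.

```python
import re

s = "one two 9three 52eight four four8"

# Letters only, bounded on both sides: "four8" is rejected because
# there is no word boundary between "four" and "8".
letters_only = re.findall(r"\b[a-z]+\b", s, re.I)

# Word boundary plus negative look-ahead: only the first character
# is required to be a non-digit, so "four8" is accepted.
no_leading_digit = re.findall(r"\b(?!\d)\w+", s)

print(letters_only)      # ['one', 'two', 'four']
print(no_leading_digit)  # ['one', 'two', 'four', 'four8']
```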

Thanks, second example is working too.

pocoa 2010-04-28 14:24:22

Answer 3

+3 A:

Try

\b[a-zA-Z]\w*

S.Mark 2010-04-28 14:11:04

Thanks. This one is working.

pocoa 2010-04-28 14:18:14

This is definitely the right answer, +1. It includes both lowercase and uppercase chars.

c0mrade 2010-04-28 14:19:10

@c0mrade: and which answer doesn't?

SilentGhost 2010-04-28 14:21:31

Note: `\w` includes the underscore. If the underscore is not needed, `[a-zA-Z0-9]*` should be used instead of `\w*`.

S.Mark 2010-04-28 14:22:40
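
A quick sketch of the underscore note above, using a made-up sample string. One thing to watch: since `_` still counts as a word character for `\b`, the stricter character class stops in front of the underscore and returns only the leading part of such a word.

```python
import re

text = "one two_2 9three five6"

# \w* accepts letters, digits and the underscore.
with_underscore = re.findall(r"\b[a-zA-Z]\w*", text)

# [a-zA-Z0-9]* stops at the underscore, so "two_2" is cut down to "two".
without_underscore = re.findall(r"\b[a-zA-Z][a-zA-Z0-9]*", text)

print(with_underscore)     # ['one', 'two_2', 'five6']
print(without_underscore)  # ['one', 'two', 'five6']
```

If words containing an underscore should be dropped entirely rather than truncated, a trailing `\b` would probably be needed as well.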

SilentGhost's answer also handles both cases: it uses the `re.I` (ignore case) flag.

S.Mark 2010-04-28 14:23:57

Thanks for all the comments. S.Mark was first, so I marked that one as the answer.

pocoa 2010-04-28 14:26:19

@SilentGhost yours. Maybe you do it with re.I; I didn't know that part, I just read the regex as not language-specific. Ah, I see now it's tagged python. Sorry, my bad.

c0mrade 2010-04-28 14:55:12

Answer 4

A:

regexp might be overkill.

In [3]: [word for word in eg.split(' ') if not word[0].isdigit()]
Out[3]: ['one', 'two', 'four']

Reagle 2010-04-28 19:46:37
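
A self-contained sketch of the non-regex approach above. It assumes `eg` holds the sample sentence used elsewhere in the thread, with a double space added here to show one pitfall: `split(' ')` yields empty strings for repeated spaces, and `word[0]` would then raise an IndexError, while `split()` with no argument avoids that.

```python
# Assumed sample input; the double space after "two" is deliberate.
eg = "one two  9three 52eight four"

# split(' ') keeps an empty string between the two consecutive spaces.
pieces = eg.split(' ')

# split() with no argument collapses runs of whitespace,
# so every word has at least one character and word[0] is safe.
words = [word for word in eg.split() if not word[0].isdigit()]

print(pieces)  # ['one', 'two', '', '9three', '52eight', 'four']
print(words)   # ['one', 'two', 'four']
```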

Words that don't start with numbers