I need a regular expression that will parse something like:

"2 * 240pin"

where the * can be either the regular star, the Unicode character \u00d7, or just an x. This is what I have, but it's not working:

multiple= r'^(\d+)\s?x|*|\\u00d7\s?(\d+)(\w{2,4})$'
multiplepat= re.compile(multiple, re.I)
print multiplepat.search(u'1 X 240pin').groups()

It raises:

multiplepat= re.compile(multiple, re.I)
File "C:\Python26\lib\re.py", line 188, in compile
return _compile(pattern, flags)
File "C:\Python26\lib\re.py", line 243, in _compile
raise error, v # invalid expression
error: nothing to repeat
+2  A: 
multiple = ur'^(\d+)\s?[xX\*\u00d7]\s?(\d+)(\w{2,4})$'
Francis
You don’t need to escape the `*` inside a character class.
Gumbo
Oh, that's right, I forgot. But it does no harm to escape it :-)
Francis
+2  A: 

You need to escape the * because, where you use it, it is a quantifier with nothing before it to repeat. You could also use a character class instead. Try this:

ur'^(\d+)\s?[x*\u00d7]\s?(\d+)(\w{2,4})$'
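To make the two options concrete, here is a minimal sketch. It assumes Python 3.3 or later, where `re` itself understands `\u00d7` in a raw pattern, whereas the thread uses Python 2 and the `ur''` prefix. The names `broken`, `grouped`, `char_class` and `samples` are only illustrative. It shows the original pattern failing to compile, then the escaped alternation and the character class matching the same inputs.

```python
import re

# The pattern from the question: the bare * right after | has nothing to repeat.
broken = r'^(\d+)\s?x|*|\\u00d7\s?(\d+)(\w{2,4})$'
try:
    re.compile(broken)
    compiled_ok = True
except re.error:
    compiled_ok = False

# Option 1: escape the star and wrap the alternatives in a non-capturing group,
# so that | only chooses between the three separators.
grouped = re.compile(r'^(\d+)\s?(?:x|\*|\u00d7)\s?(\d+)(\w{2,4})$', re.I)

# Option 2: a character class, where the star needs no escaping.
char_class = re.compile(r'^(\d+)\s?[x*\u00d7]\s?(\d+)(\w{2,4})$', re.I)

samples = ['2 * 240pin', '1 X 240pin', '2 \u00d7 240pin', '4x184pin']
results = [(grouped.search(s).groups(), char_class.search(s).groups())
           for s in samples]

for sample, (a, b) in zip(samples, results):
    print(sample, a, b)
```

Without the group, `|` has the lowest precedence and would split the whole pattern into three unrelated alternatives, which is why the group (or the character class) is needed even once the star is escaped.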
Gumbo
Thanks for explaining!
+1  A: 

Use character sets ([]):

[]

Used to indicate a set of characters. Characters can be listed individually, or a range of characters can be indicated by giving two characters and separating them by a '-'. Special characters are not active inside sets.

>>> m= u'^(\\d+)\\s?[x*\u00d7]\\s?(\\d+)(\\w{2,4})$'
>>> mpat=re.compile(m)
>>> mpat.search(u'1 * 240pin').groups()
(u'1', u'240', u'pin')
>>>
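As a small illustration of the quoted rule, the following sketch (Python 3 syntax assumed, variable names made up for the example) checks that quantifier characters are plain literals inside a set, that escaping the star there changes nothing, and that a hyphen between two characters still forms a range.

```python
import re

# Quantifiers and the dot are ordinary characters inside [...].
literal_hits = re.findall(r'[*+?.]', 'a*b+c?d.')

# Escaping the star inside a set is allowed and matches the same text.
same = re.findall(r'[x*]', '2x*') == re.findall(r'[x\*]', '2x*')

# A hyphen between two characters still defines a range, so '-' and 'd'
# in the input are not matched here.
range_hits = re.findall(r'[a-c]', 'abcd-')

print(literal_hits, same, range_hits)
```

So "not active" applies to characters such as `*`, `+`, `?` and `.`; the hyphen (and the backslash) keep a meaning inside the brackets.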
gimel