ansaurus

Question

Does anyone see why the first part of my regex isn't working in Python?

Answer 1

+3 A:

The first part of your regex doesn't have capturing parentheses around it. Try the regex:

,([A-Z\s]+?),(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?
 #^^ This was [A-Z\s]+?; needs to be ([A-Z\s]+?)

In Python, that would be written as a raw string:

r",([A-Z\s]+?),(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?"

Example from the interpreter:

>>> import re
>>> r = re.compile(r",[A-Z\s]+?,(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?")
>>> r.match(",POWDER,RO,ML,8/19/2002").groups()
('RO', 'ML', '8/19/2002')
>>> r = re.compile(r",([A-Z\s]+?),(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?")
>>> r.match(",POWDER,RO,ML,8/19/2002").groups()
('POWDER', 'RO', 'ML', '8/19/2002')

eldarerathis 2010-10-29 18:19:56

Answer 2

A:

I don't use Python much, but you just forgot the parentheses that mark that part as a capture group:

,([A-Z\s]+)?,(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?

That should do what you want.
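Note that in this variant the `?` sits outside the parentheses, so it makes the whole first group optional rather than making the `+` lazy as in the other answers. A minimal sketch of the difference, assuming Python 3; the second input string (with an empty first field) is a hypothetical example, not taken from the question:

```python
import re

# "?" outside the group: the whole first group is optional.
optional_group = re.compile(
    r",([A-Z\s]+)?,(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?"
)
# "?" inside the group: the "+" is lazy, but at least one character is required.
lazy_group = re.compile(
    r",([A-Z\s]+?),(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?"
)

full = ",POWDER,RO,ML,8/19/2002"   # the string used elsewhere on this page
empty_first = ",,RO,ML,8/19/2002"  # hypothetical input with an empty first field

# Both patterns capture the same groups when the first field is present.
full_optional = optional_group.match(full).groups()
full_lazy = lazy_group.match(full).groups()

# With an empty first field, only the optional-group pattern matches,
# and its first group is None.
empty_optional = optional_group.match(empty_first).groups()
empty_lazy = lazy_group.match(empty_first)

print(full_optional)
print(full_lazy)
print(empty_optional)
print(empty_lazy)
```

Which behaviour is wanted depends on whether the first field is allowed to be empty; the question does not say.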

balu 2010-10-29 18:20:04

Answer 3

+6 A:

Yes. You did not capture the first group.

r",([A-Z\s]+),(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?"
#  ^        ^

BTW, it seems that you are parsing a CSV file with regex. In Python, there is already a csv module.
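As a rough sketch of what the csv route might look like (assuming Python 3, and assuming each record is one comma-separated line like the sample string used on this page; the variable names for the fields are hypothetical, since the question does not name them):

```python
import csv
import io

# Sample record modelled on the string used in the answers on this page.
data = ",POWDER,RO,ML,8/19/2002\n"

# csv.reader accepts any iterable of lines; io.StringIO stands in for a real file.
rows = list(csv.reader(io.StringIO(data)))

# The leading comma produces an empty first field, so discard it when unpacking.
# The field names below are hypothetical labels for illustration only.
_, name, code_a, code_b, date = rows[0]

print(rows[0])
print(name, code_a, code_b, date)
```

Unlike the regex, this does not validate the field values (for example, that the second code is one of LA, RO, MU, FE, AV, CA), so a separate check would still be needed if that matters.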

KennyTM 2010-10-29 18:20:18

Thanks, I'll check out the csv module to see if I can leverage it for what I need to do. Sadly, I'm only using Python to prove to myself that my script will work. I'll need to actually implement it in Java or Groovy so no one at work freaks out.

jonny 2010-10-29 18:30:43

Answer 4

A:

Yes, you missed the grouping parentheses:

>>> import re
>>> s = ",POWDER,RO,ML,8/19/2002"
>>> pat = r",([A-Z\s]+?),(LA|RO|MU|FE|AV|CA),(ML|FE|MN|FS|UN)?,(\d+/\d+/\d{4})?"
>>> re.match(pat, s).groups()
('POWDER', 'RO', 'ML', '8/19/2002')

Lie Ryan 2010-10-29 18:22:01
