ansaurus

Question

Answer 1

+2 A:

First of all, it's redundant to run the match twice - instead, run it, store the result, and branch off of that:

import re

m = re.match(regex.new_round, hh_line)
if m:
    pass  # ... work with the stored match object m here

Next, if you have a bunch of regex -> processing combinations, you might instead make a dict of regex -> function mappings, and then just iterate over it:

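def process_a(data):
    pass  # ... handle lines that match regex_a

def process_b(data):
    pass  # ... handle lines that match regex_b

regex_to_process = {
    'regex_a': process_a,
    'regex_b': process_b,
}

for hh_line in hh_file:  # hh_file: placeholder for your open file object
    # .items() works in both Python 2 and 3 (iteritems() is Python 2 only)
    for regex, process in regex_to_process.items():
        m = re.match(regex, hh_line)
        if m:
            process(hh_line)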
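
To make the dispatch idea concrete, here is a minimal runnable sketch. The patterns, the handler names (process_round, process_title), the parse_lines helper and the sample lines are all made up for illustration, and it assumes at most one pattern applies to a given line. Unlike the snippet above, it hands the match object to the handler, so each handler only touches the groups its own regex defines.

```python
import re

def process_round(m):
    # Hypothetical handler: only reads the group defined by its own regex.
    return ('round', m.group('number'))

def process_title(m):
    # Hypothetical handler for a title line.
    return ('title', m.group('title'))

# Hypothetical patterns, compiled once and mapped to their handlers.
regex_to_process = {
    re.compile(r'Round #(?P<number>\d+)'): process_round,
    re.compile(r'Title: (?P<title>.+)'): process_title,
}

def parse_lines(lines):
    results = []
    for line in lines:
        for pattern, process in regex_to_process.items():
            m = pattern.match(line)  # run the match once and keep the result
            if m:
                results.append(process(m))
                break  # assumption: at most one pattern applies per line
    return results

sample = ['Round #12', 'Title: Final table', 'unrelated line']
print(parse_lines(sample))
```

Lines that match none of the patterns are simply skipped.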

Amber 2010-08-24 02:52:31

Yes, I reckoned it was. =) Thanks!

laka 2010-08-24 02:55:22

Thanks, that looks great - but just one follow-up: why can't I access e.g. m.group('title') in that loop? I have defined labels in the regex, and I can see them all using groupdict().

laka 2010-08-24 03:31:16

You're using `(?P<name>expression)` syntax, correct? Not sure - could you show more code?

Amber 2010-08-24 03:42:59

That's correct. There is really nothing more to show, but the grouping is freaky. The first regex contains 6-7 groups, all with labels. The second regex contains 3 groups, and when I try to print any group higher than 3, it fails. Why?

laka 2010-08-24 03:50:26

Well, do keep in mind that the loop contents are running for every regex - so if you try to look at a group that exists in one regex but not in another, it'll fail on the iteration that is for the second regex.

Amber 2010-08-24 04:13:17
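
A small sketch of that failure mode, using two made-up patterns and a hypothetical title_of helper: asking a match for a named group its regex never defined raises IndexError, while groupdict() only contains the groups of the regex that actually matched, so .get() is a safe way to probe for one.

```python
import re

# Hypothetical patterns: only the first one defines a 'title' group.
patterns = [
    re.compile(r'(?P<title>\w+) (?P<year>\d{4})'),
    re.compile(r'(?P<name>\w+)'),
]

def title_of(m):
    # groupdict() holds only the named groups of the regex that matched,
    # so .get() returns None instead of raising for a missing name.
    return m.groupdict().get('title')

line = 'Casino 1995'
titles = []
for pattern in patterns:
    m = pattern.match(line)
    if m:
        titles.append(title_of(m))

# Asking the second regex's match for 'title' directly raises IndexError.
raised = False
try:
    patterns[1].match(line).group('title')
except IndexError:
    raised = True

print(titles, raised)
```

The same applies to numbered groups: m.group(4) fails on a match whose regex has only three groups.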


Python parser script layout
