Perl best practices: file parser using regexes and database storage | ansaurus

+1 Q:

Perl best practices: file parser using regexes and database storage

Hi, I'm writing a log file parser in Perl, using regexes that I've stored in a database. My workflow is basically like this:

Loop over the file, search for text matching my regexes, and extract the matches
Do something with these matches
Store them accordingly in a database
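
The three steps above could be sketched roughly as follows. This is only an illustration of looping over stored regexes, not code from the original post: the pattern names, the sample log lines, and the `parse_lines` helper are all hypothetical, and the database is replaced by an in-memory list so the sketch runs with core Perl alone.

```perl
use strict;
use warnings;

# Hypothetical pattern table. In the real parser these rows would be
# fetched from the database; that part is deliberately left out.
my @pattern_rows = (
    { name => 'ssh_failed', regex => 'Failed password for (\S+) from (\S+)' },
    { name => 'disk_full',  regex => 'No space left on device \((\w+)\)' },
);

# Compile each stored regex once, outside the line loop.
my @compiled = map { +{ name => $_->{name}, re => qr/$_->{regex}/ } } @pattern_rows;

# Steps 1 and 2: return one record per (line, pattern) match.
# Assumes every pattern has at least one capture group; a pattern
# without groups would yield the single value 1 in list context.
sub parse_lines {
    my ($patterns, @lines) = @_;
    my @records;
    for my $line (@lines) {
        for my $p (@$patterns) {
            my @captures = $line =~ $p->{re};
            next unless @captures;
            push @records, { pattern => $p->{name}, fields => \@captures };
        }
    }
    return @records;
}

# Made-up sample input standing in for the log file.
my @lines = (
    'Aug 31 20:37:21 host sshd[42]: Failed password for root from 10.0.0.5 port 22',
    'Aug 31 20:38:02 host kernel: write error: No space left on device (sda1)',
    'Aug 31 20:39:10 host cron[7]: job finished',
);

# Step 3 stand-in: print each record where a real parser would INSERT it.
for my $rec (parse_lines(\@compiled, @lines)) {
    printf "%s: %s\n", $rec->{pattern}, join(', ', @{ $rec->{fields} });
}
```

Compiling with `qr//` up front means each stored regex is compiled once rather than once per line, which matters when the pattern table grows.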

Last time I did this I explicitly wrote each regex (not looping through each regex in the database), like this.

Now that I'm doing this again, I was wondering whether there are better solutions out there or, better yet, whether anyone has comments on what I've already done.

Thanks! =)

+2 A:

You might want to check out Regexp::Assemble.

It will let you compose one regex that matches all of your regexes. It also claims it can track which of the original patterns a match corresponds to. I have not used this package before, though.
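
A minimal sketch of that idea, going by the module's documented `track`, `add`, `match` and `matched` interface. The patterns and the `classify` helper are hypothetical, and the module has to be installed from CPAN first. The documentation notes that the pattern reported by `matched` may be a normalised form of what was added, so treat the returned string as an identifier rather than an exact copy.

```perl
use strict;
use warnings;
use Regexp::Assemble;

# Hypothetical patterns; in the parser these would be loaded from the database.
my @patterns = (
    'Failed password for \S+',
    'No space left on device',
    'session opened for user \w+',
);

# track => 1 asks the module to remember which original pattern matched.
my $ra = Regexp::Assemble->new( track => 1 );
$ra->add($_) for @patterns;

# Return the original pattern that matched the line, or undef if none did.
sub classify {
    my ($line) = @_;
    return $ra->match($line) ? $ra->matched : undef;
}

# Made-up sample lines standing in for the log file.
for my $line (
    'sshd[42]: Failed password for root from 10.0.0.5',
    'cron[7]: job finished',
) {
    my $hit = classify($line);
    print defined $hit ? "matched by: $hit\n" : "no match\n";
}
```

Each line is then tested against a single assembled regex instead of once per stored pattern, and the tracked pattern can serve as the key for deciding how to store the match.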

frankc 2010-08-31 20:37:21

Looks great, I'll check it out. =)

Lenny Benny 2010-08-31 20:52:29

I have used it; it's great. See also the command-line tool [`assemble`](http://search.cpan.org/dist/Regexp-Assemble/eg/assemble) (not installed by default) and the improved [`Regexp::Assemble::Compressed`](http://p3rl.org/Regexp::Assemble::Compressed).

daxim 2010-08-31 21:50:25

@Lenny Benny I also use Regexp::Assemble in my project Octopussy (which is a log management solution :) ). It's a really good module for speeding up your parsing, except in some cases like this [one](http://stackoverflow.com/q/1739285/24820).

sebthebert 2010-09-01 16:28:01
