What are some good Perl modules to process files based on configurations?

Basically, I am working on taking data files, splitting them into columns, removing some rows based on the values in certain columns, removing unnecessary columns, comparing them to a baseline (writing out where changes have occurred), and saving a CSV of the data with the comments as metadata.

Sample file is:

001SMSL22009032020090321024936
002XXXXX20090320102436               010000337 00051     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               020000333 00090     
009000000009000000000271422122

Each file will be compared row by row with another file (the baseline), and the differing rows will be highlighted (I am using Tk::DiffText).

Here is the pipeline, where [ ] marks a pipe stage:

file -> [split] -> [remove production] -> [sort] -> [compare] -> {user jumps in and writes comments, edits file as needed} -> [save csv] -> [save comments]

The real question is: which Perl module helps to model and build a pipeline flow like this? After more research I found this: http://en.wikipedia.org/wiki/Flow-based_programming

+1  A: 

Hmmm, this seems like nothing Perl cannot handle almost by itself:

taking data files

while (<>)

split them into columns,

my @row = split(/,/);

remove some rows based on some columns,

next if $row[5] =~ m/black_list_data/;

remove unnecessary columns

@row = ($row[1], $row[4]);
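
The sample rows in the question look fixed-width rather than comma-separated, so unpack may fit better than split(/,/) for that data. Here is a minimal sketch, assuming field widths guessed from the sample (type 3, id 5, timestamp 14, 15 blanks, code 9, 1 blank, value 5). The names parse_line and select_rows and the blacklisted code are made up for illustration:

```perl
use strict;
use warnings;

# ASSUMPTION: layout inferred from the sample rows, not confirmed by the question:
# type(3) id(5) timestamp(14) filler(15) code(9) filler(1) value(5)
my $TEMPLATE = 'A3 A5 A14 A15 A9 A1 A5';

# Parse one fixed-width detail line into a hash of named fields.
sub parse_line {
    my ($line) = @_;
    my ($type, $id, $stamp, undef, $code, undef, $value) = unpack $TEMPLATE, $line;
    return { type => $type, id => $id, stamp => $stamp, code => $code, value => $value };
}

# Keep detail rows ("002"), drop rows whose code matches $skip_code,
# and keep only the columns of interest.
sub select_rows {
    my ($lines, $skip_code) = @_;
    my @rows;
    for my $line (@$lines) {
        next unless $line =~ /^002/;            # skip header (001) and trailer (009)
        my $rec = parse_line($line);
        next if $rec->{code} eq $skip_code;     # remove rows based on a column
        push @rows, [ @{$rec}{qw(stamp code value)} ];   # drop unneeded columns
    }
    return \@rows;
}

# Sample lines modelled on the question; the 15 blanks are spelled out with x.
my @sample = (
    '001SMSL22009032020090321024936',
    '002XXXXX20090320102436' . ' ' x 15 . '010000337 00051',
    '002XXXXX20090320103525' . ' ' x 15 . '010000333 00090',
    '002XXXXX20090320103525' . ' ' x 15 . '020000333 00090',
    '009000000009000000000271422122',
);

my $rows = select_rows(\@sample, '020000333');
print join(',', @$_), "\n" for @$rows;
```

If the real widths differ, only the template string should need changing.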

compare them to baseline (writes where changes have occurred)

Ok, here you might use Algorithm::Diff
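
If the module is not installed, a plain positional comparison can stand in for a first pass. This is a sketch only: it is not Algorithm::Diff's API or its LCS-based algorithm, and compare_rows is a hypothetical helper name.

```perl
use strict;
use warnings;

# Compare two lists of rows position by position and collect the differences.
# Unlike an LCS-based diff, an inserted or deleted row makes every later
# row show up as changed.
sub compare_rows {
    my ($current, $baseline) = @_;
    my ($n, $m) = (scalar @$current, scalar @$baseline);
    my $last = ($n > $m ? $n : $m) - 1;
    my @changes;
    for my $i (0 .. $last) {
        my $new = $i < $n ? $current->[$i]  : undef;
        my $old = $i < $m ? $baseline->[$i] : undef;
        next if defined $new && defined $old && $new eq $old;
        push @changes, { row => $i + 1, baseline => $old, current => $new };
    }
    return \@changes;
}

# Made-up data for illustration.
my @baseline = ('a,1', 'b,2', 'c,3');
my @current  = ('a,1', 'b,9', 'c,3', 'd,4');

my $changes = compare_rows(\@current, \@baseline);
for my $c (@$changes) {
    printf "row %d: baseline=%s current=%s\n",
        $c->{row},
        defined $c->{baseline} ? $c->{baseline} : '(missing)',
        defined $c->{current}  ? $c->{current}  : '(missing)';
}
```

For files where rows can be inserted or deleted, a real diff such as Algorithm::Diff is likely the better choice.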

and save a csv of the data and the comments as metadata.

Class::CSV or DBD::CSV ?
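
For a quick test without either module, the CSV quoting can be hand-rolled in core Perl. This is a rough sketch, not the API of Class::CSV or DBD::CSV. Storing the comment as an extra last column is an assumption (the question does not say how the comments should be kept), and csv_field, csv_line and rows_to_csv are hypothetical names:

```perl
use strict;
use warnings;

# Quote one field using the common CSV convention: wrap it in double quotes
# if it contains a comma, quote, or line break, and double embedded quotes.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    if ($value =~ /[",\r\n]/) {
        (my $escaped = $value) =~ s/"/""/g;
        return qq{"$escaped"};
    }
    return $value;
}

# Join a list of fields into one CSV line.
sub csv_line {
    return join(',', map { csv_field($_) } @_) . "\n";
}

# ASSUMPTION: the comment for row $i (0-based) goes in an extra last column.
sub rows_to_csv {
    my ($rows, $comments) = @_;
    my $out = '';
    for my $i (0 .. $#$rows) {
        my $comment = defined $comments->{$i} ? $comments->{$i} : '';
        $out .= csv_line(@{ $rows->[$i] }, $comment);
    }
    return $out;
}

# Made-up rows and comment for illustration.
my $csv = rows_to_csv(
    [ [ '20090320102436', '010000337', '00051' ],
      [ '20090320103525', '010000333', '00090' ] ],
    { 1 => 'value differs from baseline, "expected"' },
);
print $csv;
```

A dedicated CSV module handles more edge cases (encodings, alternative separators), so this is only meant for trying the idea out.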

Steve Schnepp
Hmm.. the quote CSS style is quite similar to the code style :-(
Steve Schnepp
I normally leave the quotes as normal text or make them comments in the code.
Chas. Owens
A: 

I am not aware of any Perl implementations of Flow-Based Programming, but I believe Perl 5.8 made interpreter threads available to Perl coders (someone correct me if I'm wrong!), so it should be relatively straightforward to build an FBP implementation in Perl. See http://perldoc.perl.org/threads.html

Paul Morrison
A: 

This is what I was looking for:

Text::Pipe

Text::Pipe::Stackable
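
To illustrate the general idea of stacking stages (this is not Text::Pipe's API, just plain code references in core Perl), here is a minimal sketch. make_pipeline and the two stages are hypothetical placeholders for the [split], [sort], etc. steps in the question:

```perl
use strict;
use warnings;

# Build a pipeline from a list of stages. Each stage is a code reference
# that takes an array ref of lines and returns a new array ref.
sub make_pipeline {
    my @stages = @_;
    return sub {
        my ($data) = @_;
        for my $stage (@stages) {
            $data = $stage->($data);
        }
        return $data;
    };
}

# Placeholder stages; real ones would do the splitting, filtering, etc.
my $keep_detail_rows = sub { return [ grep { /^002/ } @{ $_[0] } ] };
my $sort_rows        = sub { return [ sort @{ $_[0] } ] };

my $pipeline = make_pipeline($keep_detail_rows, $sort_rows);

# Made-up input lines for illustration.
my $result = $pipeline->([ '001HEAD', '002B', '002A', '009TRAIL' ]);
print "$_\n" for @$result;
```

Stages can then be added, removed, or reordered by changing the list passed to make_pipeline.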

Thank you for helping me clarify my ideas!

kthakore