Hello all,

I have a text file in the following format:

211B1 CUSTOMER|UPDATE|  
211B2 CUSTOMER|UPDATE|  
211B3 CUSTOMER|UPDATE|  
211B4 CUSTOMER|UPDATE|  
211B5 CUSTOMER|UPDATE|  
567FR CUSTOMER|DELETE|  
647GI CUSTOMER|DELETE|

I want a script that processes the text file and reports the following:

"UPDATE" for column CUSTOMER found for Acct's: 211B1,211B2,211B3,211B4,211B5

"DELETE" for column CUSTOMER found for Acct's: 5675FR,6470GI

I can script simple solutions, but this seems a little complex to me, and I would appreciate assistance or guidance.

Thanks! David

+6  A: 

collate.pl

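#!/usr/bin/perl

use strict;
use warnings;

# $actions{ACTION}{COLUMN} holds the list of account keys seen for that pair.
my %actions;
while (<>) {
    # Each line looks like "211B1 CUSTOMER|UPDATE|": key, column, action.
    my ($key, $fld, $action) = /^(\w+) (.+?)\|(.+?)\|/ or die "Failed on line $.!";
    push @{$actions{$action}{$fld}}, $key;
}

# Print one summary line per action/column pair (hash order is not guaranteed).
foreach my $action (keys %actions) {
    foreach my $fld (keys %{$actions{$action}}) {
        print "\"$action\" for column $fld found for Acct's: " . join(",", @{$actions{$action}{$fld}}), "\n";
    }
}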

Use like so:

perl collate.pl < input.txt > output.txt
j_random_hacker
Wow, you guys are fantastic! This worked exactly as I wanted. Thanks! This is a great site and the speed of response is unbelievable...
You're welcome David. Response times actually vary quite a bit, but everyone races to solve "cute little problems" like this. :)
j_random_hacker
A: 

Based on your question, you could do this:

perl -i.bak -pe'if(/^211B[1-5]/){s/CUSTOMER/UPDATE/}elsif(/^(5675FR|6470GI)/){s/CUSTOMER/DELETE/}' filename

Though I notice now that the last two account numbers differ in the example, and also that the second column already has those values...

Anonymous
Huh? That just reproduces the input, changing the "CUSTOMER" to "UPDATE" for the first few lines and to "DELETE" for the last few. -1.
j_random_hacker
From rereading the question, I now see what happened -- the OP used the word "reformat", which you interpreted to mean that he wanted a kind of search-and-replace operation. But actually it was bad wording on his part -- he wanted to summarise the information, not "reformat" it. (I've now reworded it.) I'm gonna leave the -1 sorry (harsh I know) because your solution doesn't solve the right problem.
j_random_hacker
What an idiot I've been. I've been answering what the OP asks. This should teach a lesson.
Anonymous
@Anonymous: It's generous to say that the original question was ambiguous -- the other 3 responders had no trouble inferring what the OP actually wanted. But have a good cry if it makes you feel better.
j_random_hacker
Thank you for that advice. I need direction, and now I have someone to emulate.
Anonymous
+1  A: 

With awk:

echo '211B1 CUSTOMER|UPDATE|  
211B2 CUSTOMER|UPDATE|  
211B3 CUSTOMER|UPDATE|  
211B4 CUSTOMER|UPDATE|  
211B5 CUSTOMER|UPDATE|  
567FR CUSTOMER|DELETE|  
647GI CUSTOMER|DELETE|' | awk -F '[ |]' '
    BEGIN {
        upd="";del=""
    } {
      if ($3 == "UPDATE") {upd = upd" "$1};
      if ($3 == "DELETE") {del = del" "$1};
    } END {
        print "Updates:"upd; print "Deletes:"del
    }'

produces:

Updates: 211B1 211B2 211B3 211B4 211B5
Deletes: 567FR 647GI

It basically just breaks each line into three fields (with the -F option) and maintains a list of updates and deletes that it appends to, depending on the "command".

The BEGIN and END blocks run before and after all line processing, so they handle the initialization and the final output.

I'd put it into a script to make it easier to reuse. I left it as a command-line one-liner only because that's how I usually debug my awk scripts.
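
As a rough sketch of the "put it into a script" idea, the same awk program could be wrapped in a small shell function that reads a file instead of an echoed string. The function name `summarise` and the file name `input.txt` are placeholders for illustration, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical wrapper: "summarise FILE" prints the accounts grouped by action.
summarise() {
    awk -F '[ |]' '
        $3 == "UPDATE" { upd = upd " " $1 }   # collect accounts marked UPDATE
        $3 == "DELETE" { del = del " " $1 }   # collect accounts marked DELETE
        END {
            print "Updates:" upd
            print "Deletes:" del
        }' "$1"
}
```

Calling `summarise input.txt` on the sample data should print the same two lines as shown above. The explicit BEGIN block is dropped here because awk variables already start out empty.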

paxdiablo
+1  A: 

#!/usr/bin/perl

use strict;
use warnings;

my %data;

while ( my $line = <DATA> ) {
    next unless $line =~ /\S/;
    my ($acct, $col, $action) = split /\s|\|/, $line;
    push @{ $data{$action}->{$col} }, $acct;
}

for my $action ( keys %data ) {
    for my $col ( keys %{ $data{$action} } ) {
        # Parenthesise join so the trailing "\n" is not joined in as an extra element
        print qq{"$action" for column $col found for Acct's: },
              join( q{,}, @{ $data{$action}->{$col} } ), "\n";
    }
}
__DATA__
211B1 CUSTOMER|UPDATE|  
211B2 CUSTOMER|UPDATE|  
211B3 CUSTOMER|UPDATE|  
211B4 CUSTOMER|UPDATE|  
211B5 CUSTOMER|UPDATE|  
567FR CUSTOMER|DELETE|  
647GI CUSTOMER|DELETE|
Sinan Ünür
+1. Snap! :)
j_random_hacker
A: 

Another awk version, though it reverses the order of the account codes and leaves an extra "," at the end of each line:


BEGIN { FS="[ |]" }

{
        key = $3 " for column " $2
        MAP[ key ] = $1 "," MAP[ key ]
}

END {
        for ( item in MAP ) {
                print item " found for Acct's: " MAP[ item ]
        }
}
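
One possible way around both quirks, sketched here as a shell function around a similar awk program: append to the list rather than prepend, and add the comma only when the group already has an entry. The function name `collate`, the `order` array and the `label` variable are illustrative choices, and the quotes around the action are added to match the format requested in the question.

```shell
#!/bin/sh
# Hypothetical variant: keeps input order and avoids the trailing comma.
collate() {
    awk -F '[ |]' -v label="found for Acct's: " '
        NF >= 3 {
            key = "\"" $3 "\" for column " $2
            if (key in map) {
                map[key] = map[key] "," $1    # later accounts: append with a separator
            } else {
                order[++n] = key              # remember first-seen order of the groups
                map[key] = $1                 # first account: no separator
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i] " " label map[order[i]]
        }' "$@"
}
```

With the sample data, `collate < input.txt` should print the UPDATE line followed by the DELETE line, each with its accounts in input order. The label is passed with `-v` only because an apostrophe cannot appear inside the single-quoted awk program.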
Straff