Hi - I have one text file (fileA), approx 200 lines, each line with the format

afield1 afield2 afield3 afield4 afield5

and another text file (fileB), approx 300 lines, same format

bfield1 bfield2 bfield3 bfield4 bfield5

I want to create another text file, where if afield1 & bfield1 match, it writes a line like:

"some text" bfield4 "some text" afield3 "some text" afield1

I think this would be very easy to do in Perl, or even awk, if I just knew how. A simple shell script is proving very difficult.

Very grateful for any help received.

Thanks

+1  A: 

Well, this might be even easier with some modules, but since you seem to need something quick and dirty, here's what I can come up with. (This assumes that your file is delimited by commas. Change the delimiter in the split calls if you're using something else.)

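open(my $fh1, "<", "fileA.txt") or die $!;
open(my $fh2, "<", "fileB.txt") or die $!;
open(my $out, ">", "outfile.txt") or die $!;

# Index fileA by its first field, so matching lines need not sit at the
# same position in both files.
my %a_field3;
while( my $a_line = <$fh1> ) {
     chomp($a_line);
     my @columns_1 = split(/,/, $a_line);
     $a_field3{$columns_1[0]} = $columns_1[2];
}

while( my $line = <$fh2> ) {
     chomp($line);
     my @columns_2 = split(/,/, $line);

     # Print only when bfield1 equals some afield1.
     if( exists $a_field3{$columns_2[0]} ) {
          print $out "text $columns_2[3] more text $a_field3{$columns_2[0]} more text $columns_2[0]\n";
     }
}
close($fh1);
close($fh2);
close($out);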
Cfreak
thanks cfreak - that's perfect
paul44
except doesn't work for me - hopefully I can debug - keep getting Can't use an undefined value as filehandle reference at line 2
paul44
hmm ... what version of perl?
Cfreak
perl5 (5.0 patchlevel 5 subversion 3)
paul44
+1  A: 
awk 'FNR==NR{a[$1];next}($1 in a) {print "sometext "$4" some text blah"} ' file1 file2

Give a more concrete example of your data file and your expected output next time.

ghostdog74
+1  A: 

In Bash:

join <(sort fileA) <(sort fileB) | awk '{print $8, "some text", $3, "some text", $1}'

If you're not using Bash (or another shell that supports the <(...) process substitution syntax), sort the files into temporary files first.

sort fileA > temp1
sort fileB > temp2
join temp1 temp2 | awk '{print $8, "some text", $3, "some text", $1}'
Dennis Williamson
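For anyone wondering where $8, $3 and $1 come from: join prints the shared first field once, then the remaining fields of the first file, then the remaining fields of the second. A minimal sketch with made-up sample data (the keys, the field values and the temporary directory are all hypothetical) shows the layout:

```shell
# Hypothetical sample data, whitespace-delimited as in the question.
dir=$(mktemp -d)
printf '%s\n' 'k2 a2 a3 a4 a5' 'k1 c2 c3 c4 c5' > "$dir/fileA"
printf '%s\n' 'k1 p2 p3 p4 p5' 'k3 r2 r3 r4 r5' > "$dir/fileB"

# join needs both inputs sorted on the join field.
sort "$dir/fileA" > "$dir/temp1"
sort "$dir/fileB" > "$dir/temp2"

# Only k1 is present in both files, so one joined line comes out.
joined=$(join "$dir/temp1" "$dir/temp2")
echo "$joined"
# prints: k1 c2 c3 c4 c5 p2 p3 p4 p5

# $1 = shared key, $2..$5 = afield2..afield5, $6..$9 = bfield2..bfield5
result=$(echo "$joined" | awk '{print $8, "some text", $3, "some text", $1}')
echo "$result"
# prints: p4 some text c3 some text k1
```

So $8 is bfield4, $3 is afield3, and $1 is the field the two files have in common.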
A: 

Building on ghostdog74's answer

awk '
    # read file1 first
    FNR == NR {
        # store afield3 for later
        a[$1] = $3 
        next
    }
    ($1 in a) {
        # bfield1 == some afield1
        print "some text " $4 " some text " a[$1] " some text " $1
    } 
' file1 file2
glenn jackman
thanks - supposing afield1 occurs only once, but bfield1 may occur a few times in fileB - and I only want to print the line on the first match?
paul44
@paul44, then `delete a[$1]` after the print statement
glenn jackman
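Putting the answer and the last comment together, a first-match-only version could look roughly like the sketch below. The sample data, the temporary directory and the output file name outfile are made up for illustration; fileB deliberately repeats the key k1.

```shell
# Hypothetical sample data, whitespace-delimited as in the question.
tmpdir=$(mktemp -d)
printf '%s\n' 'k1 a2 a3 a4 a5' 'k2 b2 b3 b4 b5' > "$tmpdir/fileA"
printf '%s\n' 'k1 p2 p3 p4 p5' 'k1 q2 q3 q4 q5' 'k9 r2 r3 r4 r5' > "$tmpdir/fileB"

awk '
    # First file (fileA): remember afield3 under afield1.
    FNR == NR { a[$1] = $3; next }
    # Second file (fileB): print on the first match, then forget the key
    # so that later lines with the same bfield1 are skipped.
    ($1 in a) {
        print "some text " $4 " some text " a[$1] " some text " $1
        delete a[$1]
    }
' "$tmpdir/fileA" "$tmpdir/fileB" > "$tmpdir/outfile"

cat "$tmpdir/outfile"
# prints: some text p4 some text a3 some text k1
```

The second k1 line in fileB produces nothing, because the key has already been deleted from the array by the time awk reaches it.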