I have a dataset and would like to run a simple while loop over it with a Perl script. Here is a small extract from the dataset:

"number","code","country","gamma","X1","X2","X3","X4","X5","X6" 1,"DZA","Algeria","0.01",7.44,47.3,0.46,0,0,0.13 2,"AGO","Angola","0.00",6.79,"NULL",0.21,1,0,0.28 3,"BEN","Benin","-0.01",7.02,38.9,0.27,1,0,0.05 4,"BWA","Botswana","0.06",6.28,45.7,0.42,1,0,0.07 5,"HVO","Burkina Faso","0.00",6.15,36.3,0.08,1,0,0.05 6,"BDI","Burundi","0.00",6.38,41.8,0.18,1,0,0

The script should compute the length of every comma-separated field and store the largest value for each column in an array.

However, the saving doesn't work properly. Here is a part of the code:

@maxl = map length, @terms;

while(<INFILE>) {
$_ =~ s/[\"\n]//g ;
@terms = split/$sep/, $_;
@lengths = map length, @terms;
for($k = 0, $k <= $#terms, $k++) { 
    if($lengths[$k] > $maxl[$k]) {
    $maxl[$k] = $lenghts[$k];
    }
}
print "@lengths\n";
}

@maxl is initialized in an earlier part of the code, from the second line of the dataset. When I print @maxl at that point just to see its values, I get:

1 3 7 4 4 4 4 1 1 5

In the while loop I used another print statement just to see the other values, I get:

1 3 6 4 4 4 4 1 1 4
1 3 5 5 4 4 4 1 1 4
1 3 8 4 4 4 4 1 1 4
1 3 12 4 4 4 4 1 1 4
1 3 7 4 4 4 4 1 1 1
1 3 8 4 4 4 4 1 1 4
1 3 10 4 4 4 4 1 1 4
1 3 16 5 4 4 4 1 1 4
2 3 4 5 3 4 4 1 1 4
2 3 7 4 4 4 4 1 1 4
2 3 5 4 4 4 4 1 1 4
2 3 5 4 4 4 4 1 1 4
2 3 8 4 4 4 4 1 1 4
2 3 5 4 4 4 1 1 1 4

The fourth column, for example, obviously has values greater than 3. The while loop was supposed to keep the greatest value for each column and store it in @maxl.

What went wrong?

A: 

...in the for loop the commas are wrong; the three parts of a C-style for loop must be separated by semicolons

for($k = 0, $k <= $#terms, $k++)

however, after cleaning that up there still seems to be a problem...
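
For reference, with commas Perl reads the parenthesised expression as a three-element list and runs a foreach-style loop over it, so $k never walks through the column indices. A minimal sketch of the loop with semicolons follows. It assumes $sep is a comma, uses three rows copied from the question instead of INFILE, and starts @maxl empty rather than seeding it from an earlier line, so treat it as an illustration rather than the asker's full script:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows copied from the question; $sep is assumed to be a comma.
my $sep  = ',';
my @rows = (
    '1,"DZA","Algeria","0.01",7.44,47.3,0.46,0,0,0.13',
    '2,"AGO","Angola","0.00",6.79,"NULL",0.21,1,0,0.28',
    '5,"HVO","Burkina Faso","0.00",6.15,36.3,0.08,1,0,0.05',
);

my @maxl;    # running maximum length per column
for my $row (@rows) {
    (my $line = $row) =~ s/["\n]//g;    # strip quotes and newlines
    my @terms   = split /$sep/, $line;
    my @lengths = map { length } @terms;

    # C-style loop: the three parts are separated by semicolons
    for (my $k = 0; $k <= $#terms; $k++) {
        if (!defined $maxl[$k] || $lengths[$k] > $maxl[$k]) {
            $maxl[$k] = $lengths[$k];
        }
    }
}
print "@maxl\n";
```

The defined check lets @maxl start out empty, so no separate initialization pass over the first data line is needed.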

mropa
Can you update the code?
Ivan Nevostruev
Don't post non-answers as answers. Update your original post instead.
Sinan Ünür
+7  A: 

There's a typo here, for starters: $maxl[$k] = $lenghts[$k]; (@lenghts instead of @lengths), which 'use strict' would have caught.
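
To illustrate what strict does with such a typo, here is a small sketch that compiles a snippet containing the misspelled name inside a string eval and prints the resulting error. The snippet and its variable values are made up for the demonstration; in a real script the same error would simply stop compilation:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A snippet with the misspelled array name, compiled under strict.
my $snippet = q{
    use strict;
    my @lengths = (1, 3, 7);
    my @maxl    = (0, 0, 0);
    $maxl[0] = $lenghts[0];    # typo: @lenghts was never declared
    1;
};

my $ok    = eval $snippet;    # undef, because compilation fails
my $error = $@;               # holds the compile-time error message
print "compile failed: $error" unless $ok;
```

Without strict, the misspelled array is silently treated as an empty global, so every assignment stores undef and the comparison logic appears to "not save" anything.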

Consider using Text::CSV for more reliable parsing of comma-separated data (it can also handle other separators):

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new();
my @max_lengths;

while ( my $line = <INFILE> ) {

    die "Unable to parse '$line'" unless $csv->parse($line);

    my @column_lengths = map { length } $csv->fields();

    for my $i ( 0 .. $#column_lengths ) {
        if ( $column_lengths[$i] > ($max_lengths[$i] || 0) ) {
            $max_lengths[$i] = $column_lengths[$i];
        }
    }
}

print "MAX LENGTHS OF EACH FIELD: @max_lengths\n";
plusplus
forgot to remove the <DATA> from my example - needs to be replaced with the <INFILE> as before...
plusplus
@plusplus: replaced. BTW, you can edit your own posts
Ivan Nevostruev