tags:

views:

243

answers:

4

Hi,

I have a csv file with following sample data.

o-option(alphabetical)
v-value(numerical)

number1,o1,v1,o2,v2,o3,v3,o4,v4,o5,v5,o6,v6
number2,o1,v11,o2,v22,o3,v33,o44,v44,o5,v55,o6,v66

and so on....

Required output.

NUM,o1,o2,o3,o4,o44,o5,o6
number1,v1,v2,v3,v4,,v5,v6
number2,v11,v22,v33,,v44,v55,v66

and so on...

In this data, all the options are same i.e. o1,o2,etc through out the file but option 4 value is changing, i.e. o4,o44, etc. In total there are about 9 different option values at o4 field. Can anyone please help me with the perl code to get the required output.

I have written the below code but still not getting the required output.

my @values;
my @options;
my %hash;

while (<STDIN>) {
chomp;
my ($srn,$o1,$v1,$o2,$v2,$o3,$v3,$o4,$v4,$o5,$v5,$o6,$v6) = split /[,\n]/, $_;
push @values, [$srn,$v1,$v2,$v3,$v4,$v5,$v6];
push @options, $o1,$o2,$o3,$o4,$o5,$o6;
}

#printing the header values
my @out = grep(!$hash{$_}++,@options);
print 'ID,', join(',', sort @out), "\n";

#printing the values.
for my $i ( 0 .. $#values) {
        print @{$values[$i]}, "\n";
}

Output:

ID,o1,o2,o3,o4,o44,o5,o6
number1,v1,v2,v3,v4,v5,v6
number2,v1,v2,v3,v44,v5,v6

As from the above output, when the value 44 comes, it comes under option4 and hence the other values are shifting to left. The values are not mapping with the options. Please suggest.

A: 

Make one pass through the file identifying all the different option values, build an array of those values.

Make second pass through the file:

for each record
    initialise an associative array from your list of option value
    parse the assigning values for the options you have
    use your list of option values to iterate the associative array printing the values
djna
@Virus How big are the files? Any performance requirements? (i.e. doing it in real-time)?
AndreaG
@Andrea: file size is about 70 to 80 mb. I am not facing any issues with reading data from file.
Space
+1  A: 

If you want to line the numeric values up in columns based on the value of the preceding options values, store your data rows as hashes, using the options as the keys to the hash.

use strict;
use warnings;

my (@data, %all_opts);

while (<DATA>) {
    chomp;
    my %h = ('NUM', split /,/, $_);
    push @data, \%h;
    @all_opts{keys %h} = 1;
}

my @header = sort keys %all_opts;
print join(",", @header), "\n";

for my $d (@data){
    my @vals = map { defined $d->{$_} ? $d->{$_} : '' } @header;
    print join(",", @vals), "\n";    
}


__DATA__
number1,o1,v1,o2,v2,o3,v3,o4,v4,o5,v5,o6,v6
number2,o1,v11,o2,v22,o3,v33,o44,v44,o5,v55,o6,v66
FM
A: 

Is this what your after?

use strict;
use warnings;
use 5.010;

my %header;
my @store;

while (<DATA>) {
    chomp;
    my ($srn, %f) = split /,/;
    @header{ keys %f } = 1;
    push @store, [ $srn, { %f } ];
}

# header
my @cols = sort keys %header;
say join q{,} => 'NUM', @cols;

# rows
for my $row (@store) {
    say join q{,} => $row->[0], 
                     map { $row->[1]->{ $_ } || q{} } @cols;
}

__DATA__
number1,o1,v1,o2,v2,o3,v3,o4,v4,o5,v5,o6,v6
number2,o1,v11,o2,v22,o3,v33,o44,v44,o5,v55,o6,v66

Which outputs:

NUM,o1,o2,o3,o4,o44,o5,o6 
number1,v1,v2,v3,v4,,v5,v6
number2,v11,v22,v33,,v44,v55,v66

/I3az/

draegtun
A: 

You might look at the CPAN module DBD::AnyData. One of the neater modules out there. It allows you to manipulate a CSV file like it was a database. And much more.

xcramps