views:

308

answers:

4

I am trying to read values from an input file in Perl. Input file looks like:

1-sampledata1 This is a sample test
              and data for this continues
2-sampledata2 This is sample test 2
              Data for this also is on second line

I want to read the above data so that data for 1-sampledata1 goes into @array1 and data for 2-sampledata2 goes in @array2 and so on. I will have about 50 sections like this. like 50-sampledata50.

EDIT: The names wont always be X-sampledataX. I just did that for example. So names cant be in a loop. I think I'll have to type them manually

I so far have the following (which works). But I am looking for a more efficient way to do this..

foreach my $line(@body){
        if ($line=~ /^1-sampledata1\s/){
                $line=~ s/1-ENST0000//g;
                $line=~ s/\s+//g;
                push (@array1, $line);
          #using splitarray because i want to store data as one character each
          #for ex: i wana store 'This' as T H I S in different elements of array
                @splitarray1= split ('',$line);
        last if ($line=~ /2-sampledata2/);
        }
}
foreach my $line(@body){
        if ($line=~ /^2-sampledata2\s/){
                $line=~ s/2-ENSBTAP0//g;
                $line=~ s/\s+//g;
                @splitarray2= split ('',$line);
        last if ($line=~ /3-sampledata3/);
        }
}

As you can see I have different arrays for each section and different for loops for each section. If I go with approach I have so far then I will end up with 50 for loops and 50 arrays.

Is there another better way to do this? In the end I do want to end up with 50 arrays but do not want to write 50 for loops. And since I will be looping through the 50 arrays later on in the program, maybe store them in an array? I am new to Perl so its kinda overwhelming ...

A: 

You should, instead, use a hash map to arrays.

Use this regex pattern to get the index:

/^(\d+)-sampledata(\d+)/

And then, with my %arrays, do:

push($arrays{$index}), $line;

You can then access the arrays with $arrays{$index}.

Daniel
+3  A: 

The first thing to notice is that you are trying to use variable names with integer suffixes: Don't. Use an array whenever you find your self wanting to do that. Second, you only need to read to go over the file contents once, not multiple times. Third, there is usually no good reason in Perl to treat a string as an array of characters.

Update: This version of the code uses existence of leading spaces to decide what to do. I am leaving the previous version up as well for reference.

#!/usr/bin/perl

use strict;
use warnings;

my @data;

while ( my $line = <DATA> ) {
    chomp $line;
    if ( $line =~ s/^ +/ / ) {
        push @{ $data[-1] }, split //, $line;
    }
    else {
        push @data, [ split //, $line ];
    }
}

use Data::Dumper;
print Dumper \@data;

__DATA__
1-sampledata1 This is a sample test
              and data for this continues
2-sampledata2 This is sample test 2
              Data for this also is on second line

Previous version:

#!/usr/bin/perl

use strict;
use warnings;

my @data;

while ( my $line = <DATA> ) {
    chomp $line;
    $line =~ s/\s+/ /g;
    if ( $line =~ /^[0-9]+-/ ) {
        push @data, [ split //, $line ];
    }
    else {
        push @{ $data[-1] }, split //, $line;
    }
}

use Data::Dumper;
print Dumper \@data;

__DATA__
1-sampledata1 This is a sample test
              and data for this continues
2-sampledata2 This is sample test 2
              Data for this also is on second line
Sinan Ünür
+1  A: 
#! /usr/bin/env perl
use strict;
use warnings;

my %data;
{
  my( $key, $rest );
  while( my $line = <> ){
    unless( ($rest) = $line =~ /^     \s+(.*)/x ){
      ($key, $rest) = $line =~ /^(.*?)\s+(.*)/;
    }
    push @{ $data{$key} }, $rest;
  }
}
Brad Gilbert
just add a "split''," before $rest in the push and you've got the elegant solution
bubaker
+1  A: 
J.F. Sebastian