Suppose I have a table in a text file like this:

  • A B 1
  • A C 2
  • A D 1
  • B A 3
  • C D 2
  • A E 1
  • E D 2
  • C B 2
  • . . .
  • . . .
  • . . .

I also have a list of symbols in another text file. I want to transform this table into a Perl data structure like:

  • _ A D E . . .
  • A 0 1 1 . . .
  • D 1 0 2 . . .
  • E 1 2 0 . . .
  • . . . . . . .

But I only need the selected symbols. For example, A, D and E are selected in the symbol file, but B and C are not.

A: 

Use an array for the first one and a two-dimensional hash for the second one. The first one should look roughly like:

$list[0] # row 1 - the value is "A B 1"

And the hash like:

$hash{A}{A} # the intersection of A and A - the value is 0

Figuring out how to implement a problem is about 75% of the mental battle for me. I'm not going to go into specifics about how to print the hash or the array, because that's easy and I'm also not entirely clear on how you want it printed or how much you want printed. But converting the array to the hash should look a bit like this:

foreach (@list) {
  my ($letter1, $letter2, $value) = split(/ /);
  $hash{$letter1}{$letter2} = $value;
}

At least, I think that's what you're looking for. If you really want you could use a regular expression, but that's probably overkill for just extracting 3 values out of a string.

EDIT: Of course, you could forgo the @list and just assemble the hash straight from the file. But that's your job to figure out, not mine.
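As a minimal sketch of assembling the hash straight from a filehandle, the following reads the rows, keeps only the selected symbols, and prints the matrix. The sample data, the `build_matrix` name, and the in-memory filehandle are all stand-ins for illustration. It also assumes the distances are symmetric, as the desired output in the question suggests; drop that line if they are not. The `//` operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two text files.
my $table_text = "A B 1\nA C 2\nA D 1\nB A 3\nC D 2\nA E 1\nE D 2\nC B 2\n";
my @selected   = qw(A D E);

# Build $hash{row}{col} from a filehandle, keeping selected symbols only.
sub build_matrix {
    my ($fh, @symbols) = @_;
    my %wanted = map { $_ => 1 } @symbols;
    my %hash;
    while (my $line = <$fh>) {
        my ($from, $to, $value) = split ' ', $line;
        next unless defined $value;                   # skip blank or short lines
        next unless $wanted{$from} && $wanted{$to};   # skip unselected symbols
        $hash{$from}{$to} = $value;
        # Assumption: the matrix is symmetric, so mirror the entry if unset.
        $hash{$to}{$from} = $value unless exists $hash{$to}{$from};
    }
    $hash{$_}{$_} = 0 for @symbols;                   # a symbol vs. itself is 0
    return \%hash;
}

# Open the string as a file; a real script would open the table file instead.
open my $fh, '<', \$table_text or die "cannot open in-memory table: $!";
my $matrix = build_matrix($fh, @selected);
close $fh;

print join(' ', '_', @selected), "\n";
for my $row (@selected) {
    print join(' ', $row, map { $matrix->{$row}{$_} // '_' } @selected), "\n";
}
```

With the sample rows above, this prints the A/D/E matrix from the question; pairs missing from the table show up as `_`.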

Chris Lutz
If the hash keys are simple enough, you can use a single hash if you concatenate the keys. So $hash{AD} holds 1. Used appropriately, this approach can simplify your code. Used inappropriately, it gets you weird bugs and ugly code. This technique should only be used for very simple hash keys.
daotoad
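The concatenated-key idea from the comment above could look roughly like this. The sample rows are made up for illustration. The second half shows one way plain concatenation can go wrong once symbol names are longer than one character, and how joining on Perl's subscript separator `$;` avoids that particular collision.

```perl
use strict;
use warnings;

# Single hash keyed by the two symbols glued together: "AD" => 1.
my %dist;
for my $row ('A D 1', 'A E 1', 'E D 2') {
    my ($from, $to, $value) = split ' ', $row;
    $dist{"$from$to"} = $value;
}
print "AD = $dist{AD}\n";

# With multi-character names, different pairs can produce the same key.
my $plain_clash = ('AB' . 'C') eq ('A' . 'BC') ? 1 : 0;

# Joining on $; (the separator Perl uses for $h{$x,$y}) keeps them apart.
my $safe_clash = join($;, 'AB', 'C') eq join($;, 'A', 'BC') ? 1 : 0;

print "plain keys collide: $plain_clash, separated keys collide: $safe_clash\n";
```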
I tried to use a hash, but there is a problem when I print out the values. My code is posted below (Answer 4). Would you please take a look? Thanks! -Debbie
A: 

Another way to do this would be to make a two-dimensional array -

my @fArray = ();
## Set the 0,0th element to "_"
push @{$fArray[0]}, '_';

## Assuming that the first line is a pattern for the characters to skip, e.g. [BC]
chomp(my $skipExpr = <>);

while(<>) {
    my ($xVar, $yVar, $val) = split;

    ## Skip this line if expression matches
    next if (/$skipExpr/);

    ## Check if these elements have already been added in your array
    checkExists($xVar);
    checkExists($yVar);

    ## Find their position 
    my ($xPos, $yPos);
    for my $i (1..$#fArray) {
     $xPos = $i if ($fArray[0][$i] eq $xVar);
     $yPos = $i if ($fArray[0][$i] eq $yVar);
    }

    ## Set the value 
    $fArray[$xPos][$yPos] = $fArray[$yPos][$xPos] = $val;
}

## Print array
for my $i (0..$#fArray) {
    for my $j (0..$#{$fArray[$i]}) {
     print "$fArray[$i][$j]", " ";
    }
    print "\n";
}

sub checkExists {
    ## Checks if the corresponding array element exists,
    ## else creates and initialises it.
    my $nElem = shift;
    ## Count matches across the whole header row, not just its last cell
    my $found = grep { $_ eq $nElem } @{$fArray[0]};

    if( $found == 0 ) {
     ## Create its corresponding column
     push @{$fArray[0]}, $nElem;

     ## and row entry.
     push @fArray, [$nElem];

     ## Get its array index
     my $newIndex = $#fArray;

     ## Initialise its corresponding column and rows with '_'
     ## this is done to enable easy output when printing the array
     for my $i (1..$#fArray) {
      $fArray[$newIndex][$i] = $fArray[$i][$newIndex] = '_';
     }

     ## Set the intersection cell value to 0
     $fArray[$newIndex][$newIndex] = 0;
    }
}

I am not too proud regarding the way I have handled references but bear with a beginner here (please leave your suggestions/changes in comments). The above mentioned hash method by Chris sounds a lot easier (not to mention a lot less typing).

muteW
If I save all the required characters in an array @all_node, can I change "next if (/$skipExpr/);" into "next unless (/@all_node/);"?
The array @all_node will be interpolated using the list separator variable $", which is set to a space ' ' by default. So it could work, but you'd need to set the separator to '|' before using the array in the regular expression, so that the pattern becomes an alternation such as A|D|E rather than the literal string "A D E". local $" = '|'; next unless(/@all_node/);
muteW
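A small experiment can show what the list separator does to an interpolated array. The symbols and rows below are made-up sample data. The last part is an alternative that is not from the thread: a lookup hash that checks both symbol columns, since a pattern match on the whole line cannot tell which column matched.

```perl
use strict;
use warnings;

my @all_node = qw(A D E);

my ($joined_empty, $joined_alt, $hit);
{
    local $" = '';
    $joined_empty = "@all_node";                  # elements run together
    $hit = ('A D 1' =~ /@all_node/) ? 1 : 0;      # pattern is the literal run
}
{
    local $" = '|';
    $joined_alt = "@all_node";                    # elements joined as alternation
}

# Alternative (an assumption, not the thread's code): filter with a hash,
# so a row is kept only when both of its symbols are selected.
my %wanted = map { $_ => 1 } @all_node;
my @kept = grep {
    my ($from, $to) = split ' ';
    $wanted{$from} && $wanted{$to};
} ('A B 1', 'A D 1', 'E D 2', 'C B 2');

print "empty separator: $joined_empty\n";
print "pipe separator:  $joined_alt\n";
print "row 'A D 1' matched with empty separator: $hit\n";
print "kept rows: ", join(', ', @kept), "\n";
```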
A: 

CPAN has a lot of potentially useful stuff. I use Data::Table for many purposes. Data::Pivot also looks promising, but I have never used it.

bgbg
A: 

My source code is listed below.

But the output file just looks like

  • _ A D E....
  • A
  • D
  • E

I think there is a problem with returning the matrix value. Can anyone help me?

Thanks!

=============================

require "nodes.dump"; # it's a required symbol list. ex: A, D, E

my $ppi_file = "PPI_files.txt"; # it's the table. ex: A B 1

my @all_node = sort keys %nodes;

open FILE,">distance_matrix.txt";

print FILE "\t";

foreach ( @all_node ){

    print FILE "$_\t";

} #print first line of distance matrix>

print FILE "\n";

foreach my $a ( @all_node ){

    print FILE "$a\t";

    foreach my $b ( @all_node ){
            my $value = &search_distance_value($a,$b);
            print FILE "$value\t";
    }
    print FILE "\n";

}

close FILE;

sub search_distance_value
{
    my $a = $_[0];
    my $b = $_[1];
    my %ppi;

    open TABLE, "<$ppi_file"; #table: A B 1, A D 2,...

    while ( $line = <TABLE> ){
        chomp $line;
        my ( $node1, $node2, $dist ) = split /\s+/, $line;
        $ppi{$node1}{$node2} = $dist;
    }
    if ( ( $node1 = $a ) && ( $node2 = $b ) ){
        return $dist;
    }
}
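The empty cells in the output likely come from three things in the sub above. First, $node1, $node2 and $dist are declared with my inside the while loop, so the if after the loop sees different, unset variables. Second, = assigns rather than compares, where eq would compare strings. Third, the %ppi hash is filled but never consulted. A possible rework, as a sketch only: read the table once, then look each pair up in the hash. The in-memory data below stands in for nodes.dump and PPI_files.txt, and the fallback to the reversed pair and the '_' placeholder are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for nodes.dump and PPI_files.txt.
my %nodes    = map { $_ => 1 } qw(A D E);
my $ppi_text = "A B 1\nA D 1\nA E 1\nE D 2\n";

# Read the table once, instead of reopening the file for every cell.
my %ppi;
open my $table, '<', \$ppi_text or die "cannot read table: $!";
while (my $line = <$table>) {
    chomp $line;
    my ($node1, $node2, $dist) = split /\s+/, $line;
    next unless defined $dist;            # skip blank or short lines
    $ppi{$node1}{$node2} = $dist;
}
close $table;

# Look the pair up in either direction; a symbol is at distance 0 from itself.
sub search_distance_value {
    my ($from, $to) = @_;
    return 0 if $from eq $to;
    return $ppi{$from}{$to} if exists $ppi{$from} && exists $ppi{$from}{$to};
    return $ppi{$to}{$from} if exists $ppi{$to}   && exists $ppi{$to}{$from};
    return '_';                           # assumed placeholder for missing pairs
}

# Build the tab-separated rows; a real script would print them to the output file.
my @all_node = sort keys %nodes;
my @rows = (join("\t", '', @all_node));
for my $from (@all_node) {
    push @rows, join("\t", $from, map { search_distance_value($from, $_) } @all_node);
}
print "$_\n" for @rows;
```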

Chris Lutz