tags:

views:

105

answers:

4

Hi,

I have 2 MYSQL tables viz. main_table and query1. main_table contains the columns position and chr whilst query1 contains position, chr and symbol. The table query1 is derived by querying the main_table. I am wanting to match up both these tables using Perl such that the output would have the entire list of positions from the main_table in the first column and 2nd column would be symbols corresponding to that position. There could be no symbols at all or just one symbol or multiple symbols for each positions.

I am not very certain how to write the code up for this, currently I have

#!/usr/bin/perl

use strict;
use DBI;

my %ucsc;

my $dbh  = DBI->connect('DBI:mysql:disc1pathway;user=home;password=home');
my $dbs  = DBI->connect('DBI:mysql:results;user=home;password=home');
my $main = $dbh->prepare("select chr, position from main_table");        
my $q1   = $dbs->prepare("select position, symbol, chrom from query1");

$main->execute();
$q1->execute();    

while (my $main_ref = $main->fetchrow_hashref()) {
    $ucsc{$main_ref->{chr}}{$main_ref->{position}} = 1;
}

while (my $gene_ref = $q1->fetchrow_hashref()) {
    my $q1position = $gene_ref->{position};
    my $q1symbol   = $gene_ref->{symbol};
    my $q1chr      = $gene_ref->{chr};

    foreach my $ucsc (keys %{$ucsc{$q1chr}}) {
        print "$ucsc $q1symbol\n";
    }
}

$dbh->disconnect();
$dbs->disconnect();   

exit (0);

The following are examples of the of the main_table and query1. The desired output is what I am expecting and I worked it out using the VLOOKUP function in excel.

main_table     
CHR Position  
chr1    229830537  
chr1    229723373   
chr1    229723385   
chr1    229723393   
chr1    229723420   
chr1    229829627  
chr1    229723430   
chr1    229829926  
chr1    229723483   
chr1    229723490   
chr1    229723499   
chr1    229723501   
chr1    229830343  
chr1    229723534   
chr1    229723540   
chr1    230039934  
chr1    229723576   
chr1    229830537  
chr1    229830469   
chr1    229725982   
chr1    229726209  
chr1    229966154  
chr1    229726439   
chr1    229726726   
chr1    229726755   
chr1    229726973  
chr1    229967564  
chr1    229727249   
chr1    229727408   
chr1    229727612   
chr1    229728018   
chr1    229728050   
chr1    229728435       
chr1    229728513       
chr1    229966327       

Query1     
symbol  CHR Position  
C1  chr1 229829230  
C1  chr1 229829278   
C1  chr1 229829442  
C1  chr1 229829627  
C1  chr1 229829653  
C1  chr1 229829683  
C1  chr1 229829810   
C1  chr1 229829926  
C1  chr1 229829961   
C1  chr1 229830085   
C1  chr1 229830086   
C1  chr1 229830087   
C1  chr1 229830088  
C1  chr1 229830141   
C1  chr1 229830343  
C1  chr1 229830469  
C1  chr1 229830534  
C1  chr1 229830537  
C2  chr1 230039932  
C2  chr1 230039934   
C2  chr1 230039939  
C2  chr1 230039944  
457 chr1 229966154   
457 chr1 229966327  
457 chr1 229966500   
457 chr1 229966552   
457 chr1 229966748  
457 chr1 229966998   
457 chr1 229967327   
457 chr1 229967564   
457 chr1 229967594   
457 chr1 229829627  



Desired Output    
Position    symbol  
229830537   C1  
229723373     
229723385     
229723393     
229723420     
229829627   C1, 457  
229723430     
229829926   C1  
229723483     
229723490     
229723499     
229723501     
229830343   C1  
229723534     
229723540     
230039934   c2  
229723576     
229830537   C1  
229830469     
229725982     
229726209     
229966154   457  
229726439     
229726726     
229726755     
229726973     
229967564   457  
229727249     
229727408     
229727612     
229728018     
229728050     
229728435     
229728513     
229966327

Thanks in advance

Caren

A: 

IF you have all the data already and you're just wondering how to output it in columns you should look at sprintf and printf which allow you to format output strings.

Dave Webb
I have the data in two separate tables within MYSQL, i need to the match up the positions of both the tables and print the symbols for the corresponding position. I also want to print out the entire list of positions which is a column in main_table.
So I guess that means that you know neither Perl nor SQL?
innaM
+1  A: 

It sounds like you need to do a join operation in your SQL query, but you'll need some kind of relationship in order for this to work properly. You might be able to figure out what you need using the MySQL reference manual's section on JOIN syntax.

On the Perl side you'll need to write the logic for your output. I would recommend making a hash, using the "position" as your key and then any symbols as values. Fill the hash first, then do your output. It would simplify your process for outputting your query the way you would like.

Weegee
A: 
use strict;
use DBI;

my %ucsc;

my $dbh  = DBI->connect('DBI:mysql:disc1pathway;user=home;password=home');
my $dbs  = DBI->connect('DBI:mysql:results;user=home;password=home');

my $main = $dbh->prepare("select chr, position from main_table");
$main->execute();

my $q1 = $dbs->prepare("select position, symbol, chrom from query1");
$q1->execute();


while (my $main_ref = $main->fetchrow_hashref()) {
    $ucsc{$main_ref->{chr}}{$main_ref->{position}} = 1;
}

while (my $gene_ref = $q1->fetchrow_hashref()) {
    my $q1position = $gene_ref->{position};
    my $q1symbol   = $gene_ref->{symbol};
    my $q1chr      = $gene_ref->{chr};

    foreach my $ucsc (keys %{$ucsc{$q1chr}}) {
        print "$ucsc $q1symbol\n";
    }
}

$dbh->disconnect();
$dbs->disconnect();   

exit (0);

=====================================================================================

The above code just lists the position and the symbol, but does not match them up. I cant seem to get my head around on how to match them up. Any suggestions.

Thanks. Caren

A: 

Weegee has the right answer, you can specify the location of a table like this: ipaddress.database.table. If you are on the same machine you can drop the ipaddress portion, and if you are in the same database you can drop the database portion. So your code should wind up looking like:

#!/usr/bin/perl

use strict;
use warnings;

use DBI;

my $dbh = DBI->connect(
    'DBI:mysql:disc1pathway',
    "home",
    "home",
    {
        ChopBlanks       => 1,
        AutoCommit       => 1,
        PrintError       => 0,
        RaiseError       => 1,
        FetchHashKeyName => 'NAME_lc',
    }
) or die "could not connect to database: ", DBI->errstr;

my $sth = $dbh->prepare("
    SELECT
        disc1pathway.main_table.chr,
        disc1pathway.main.position,
        results.query1.symbol,
        results.query1.chrom
    FROM disc1pathway.main_table, results.query1
    JOIN results.query1 ON (
        disc1pathway.main_table.position = results.query1.position
    )
");        

$sth->execute;

while (my $col = $sth->fetchrow_hashref) {
    print join(" ", @{$col}{qw/chr position symbol chrom/}), "\n";        
}

$sth->finish;

$dbh->disconnect;
Chas. Owens