Hi, I have two tables that are related in the following way:

Table1
gene goterms
ape   1a
ape   1b
ape   1c
boy   2a
boy   1a
cat   1a
cat   1c

Table2
gene gene count
ape  ape   3
ape  boy   1
ape  cat   2
boy  ape   1
boy  boy   2
boy  cat   1  
cat  ape   2
cat  boy   1
cat  cat   2

Now I need a value for each row of Table2, computed as count / ((count of the first gene in Table1 + count of the second gene in Table1) - count).

For example, for the second row [ape boy 1], value = 1 / ((3 + 2) - 1) = 0.25

The same calculation applies to all the other rows in Table2.

The output must look like this:

Table3
gene gene count calc
ape  ape   3    1
ape  boy   1    .25
ape  cat   2    .5
boy  ape   1    .25
boy  boy   2    1
boy  cat   1    .2
cat  ape   2    .5
cat  boy   1    .2
cat  cat   2    1

I need a SQL query for this. I have tried many approaches, but in vain.

Thank you...

A: 

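This will give you the count per gene in Table1:

    SELECT gene, count(*) AS value FROM Table1 GROUP BY gene;

This will give you the value for a single gene from the sub-select (note that there is no semicolon inside the sub-select, and that most databases require an alias for it):

    SELECT value FROM ( SELECT gene, count(*) AS value FROM Table1 GROUP BY gene ) AS per_gene WHERE gene = 'boy';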

This sounds like homework so I'll just give this answer as a hint.
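The two hint queries can be tried against the sample data from the question. The following is only a sketch: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for whatever database the asker actually uses, and the sub-select alias `per_gene` is a name made up for this example.

```python
import sqlite3

# Sample rows copied from Table1 in the question.
TABLE1_ROWS = [
    ("ape", "1a"), ("ape", "1b"), ("ape", "1c"),
    ("boy", "2a"), ("boy", "1a"),
    ("cat", "1a"), ("cat", "1c"),
]

# In-memory SQLite database (an assumption; the thread does not name a database here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (gene TEXT, goterms TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?)", TABLE1_ROWS)

# First hint: number of rows per gene.
counts = dict(conn.execute(
    "SELECT gene, count(*) AS value FROM Table1 GROUP BY gene"
))

# Second hint: read one gene's count back through an aliased sub-select.
boy_value = conn.execute(
    """
    SELECT value
    FROM (SELECT gene, count(*) AS value FROM Table1 GROUP BY gene) AS per_gene
    WHERE gene = ?
    """,
    ("boy",),
).fetchone()[0]

print(counts)
print(boy_value)
```

These per-gene counts are the two terms that get added together in the denominator of the requested calculation.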

Scott
I am working on a data mining project and got stuck on this query. I have 2000 distinct genes and need a query that dynamically generates the calc value of Table3.
In that case, RC's answer looks pretty good.
Scott
A: 

You have duplicate column names in Table2. Assuming that the first column of Table2 is "gene" and the second column is "gene2":

Select *
    -- Cast to float so an integer [Count] is not truncated by integer division
    , Cast([Count] As float)
        / ( Coalesce(Gene1.Total,0) + Coalesce(Gene2.Total,0) - Coalesce(Table2.Count,0)) As calc
From Table2
    Left Join (
                Select T1.gene, Count(*) As Total
                From Table1 As T1
                Group By T1.gene
                ) As Gene1
            On Gene1.gene = Table2.gene
    Left Join (
                Select T2.gene, Count(*) As Total
                From Table1 As T2
                Group By T2.gene
                ) As Gene2
            On Gene2.gene = Table2.gene2
Where ( Coalesce(Gene1.Total,0) + Coalesce(Gene2.Total,0) - Coalesce(Table2.Count,0)) > 0

Granted, if you have a lot of data, this isn't going to be the fastest query in the world.

With SQL Server 2005 or later, you can write it a bit more cleanly using a common table expression, like so:

With CountByGene As
    (
    Select T1.gene, Count(*) As Total
    From Table1 As T1
    Group By T1.gene
    )
Select *
    -- Cast to float so an integer [Count] is not truncated by integer division
    , Cast([Count] As float)
        / ( Coalesce(Gene1.Total,0) + Coalesce(Gene2.Total,0) - Coalesce(Table2.Count,0)) As calc
From Table2
    Left Join CountByGene As Gene1
        On Gene1.gene = Table2.gene
    Left Join CountByGene As Gene2
        On Gene2.gene = Table2.gene2
Where ( Coalesce(Gene1.Total,0) + Coalesce(Gene2.Total,0) - Coalesce(Table2.Count,0)) > 0
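To see what the common table expression version produces on the question's sample data, here is a rough sketch that runs an equivalent query through Python's built-in `sqlite3` module. SQLite is only a stand-in for SQL Server here, so the syntax is adapted: the aliases `t2`, `g1`, `g2` and the output column name `calc` are choices made for this example, and the `CAST(... AS REAL)` is there because SQLite divides two integers as integers.

```python
import sqlite3

# Sample rows copied from Table1 and Table2 in the question.
TABLE1_ROWS = [
    ("ape", "1a"), ("ape", "1b"), ("ape", "1c"),
    ("boy", "2a"), ("boy", "1a"),
    ("cat", "1a"), ("cat", "1c"),
]
TABLE2_ROWS = [
    ("ape", "ape", 3), ("ape", "boy", 1), ("ape", "cat", 2),
    ("boy", "ape", 1), ("boy", "boy", 2), ("boy", "cat", 1),
    ("cat", "ape", 2), ("cat", "boy", 1), ("cat", "cat", 2),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (gene TEXT, goterms TEXT)")
# The second gene column is named gene2, as assumed in the answer.
conn.execute('CREATE TABLE Table2 (gene TEXT, gene2 TEXT, "count" INTEGER)')
conn.executemany("INSERT INTO Table1 VALUES (?, ?)", TABLE1_ROWS)
conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?)", TABLE2_ROWS)

QUERY = """
WITH CountByGene AS (
    SELECT gene, count(*) AS Total
    FROM Table1
    GROUP BY gene
)
SELECT t2.gene, t2.gene2, t2."count",
       CAST(t2."count" AS REAL)
           / (COALESCE(g1.Total, 0) + COALESCE(g2.Total, 0) - t2."count") AS calc
FROM Table2 AS t2
    LEFT JOIN CountByGene AS g1 ON g1.gene = t2.gene
    LEFT JOIN CountByGene AS g2 ON g2.gene = t2.gene2
WHERE (COALESCE(g1.Total, 0) + COALESCE(g2.Total, 0) - t2."count") > 0
ORDER BY t2.gene, t2.gene2
"""

rows = conn.execute(QUERY).fetchall()
for gene, gene2, count, calc in rows:
    print(gene, gene2, count, round(calc, 4))
```

The `WHERE` clause drops any row whose denominator is zero or negative, so such rows would be missing from the result rather than causing a division error.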
Thomas
Thank you so much, this worked too.
+2  A: 

SELECT gene, gene2, count,
       count / ( ( SELECT count(*) FROM Table1 t1 WHERE t2.gene = t1.gene )
               + ( SELECT count(*) FROM Table1 t1 WHERE t2.gene2 = t1.gene )
               - count ) AS value
FROM Table2 t2

outputs:

gene    gene2   count   value
ape     ape     3       1.0000
ape     boy     1       0.2500
ape     cat     2       0.6667
boy     ape     1       0.2500
boy     boy     2       1.0000
boy     cat     1       0.3333
cat     ape     2       0.6667
cat     boy     1       0.3333
cat     cat     2       1.0000
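As a cross-check of the numbers above, the same values can be recomputed in plain Python from Table1 alone. This sketch rests on an assumption the question never states: that the count in Table2 is the number of goterms the two genes share, which does hold for every row of the sample data. Under that reading the formula is the Jaccard index of the two goterm sets. The names `terms`, `calc`, and `table3` are made up for this example.

```python
from collections import defaultdict
from itertools import product

# Sample rows copied from Table1 in the question.
TABLE1_ROWS = [
    ("ape", "1a"), ("ape", "1b"), ("ape", "1c"),
    ("boy", "2a"), ("boy", "1a"),
    ("cat", "1a"), ("cat", "1c"),
]

# Collect the set of goterms for each gene.
terms = defaultdict(set)
for gene, goterm in TABLE1_ROWS:
    terms[gene].add(goterm)


def calc(gene, gene2):
    """shared / (count(gene) + count(gene2) - shared), as in the question."""
    # Assumption: Table2's count is the number of goterms both genes share.
    shared = len(terms[gene] & terms[gene2])
    return shared / (len(terms[gene]) + len(terms[gene2]) - shared)


genes = sorted(terms)
table3 = {(a, b): round(calc(a, b), 4) for a, b in product(genes, repeat=2)}
for (a, b), value in table3.items():
    print(a, b, value)
```

For the sample data this gives the same values as the query output, for example 0.6667 for ape/cat and 0.3333 for boy/cat.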
RC
Thank you so much, this worked perfectly.